LANGUAGE-AGNOSTIC SOURCE CODE RETRIEVAL USING KEYWORD & IDENTIFIER LEXICAL PATTERN

Abstract

Although source code retrieval is a promising mechanism to support software reuse, it faces a growing problem as new programming languages are developed. Most retrieval systems rely on programming-language-dependent features to extract source code lexicons. Thus, each time a new programming language is developed, such a retrieval system must be updated manually to handle that language. This update may take a considerable amount of time, especially when the parsing mechanism of the language is uncommon (e.g., Python's parsing mechanism). To address this issue, this paper proposes a source code retrieval approach that does not rely on programming-language-dependent features. Instead, it relies on the Keyword & Identifier lexical pattern, which is typically similar across programming languages. This pattern is adapted to four components, namely tokenization, retrieval model, query expansion, and document enrichment. According to our evaluation, these components are effective at retrieving relevant source code in a language-agnostic manner, even though the improvement contributed by each component varies.
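As a rough illustration of the tokenization component, the sketch below extracts keyword- and identifier-like tokens with a single regular expression instead of a language-specific parser. The abstract does not state the exact pattern, so the pattern used here (a letter or underscore followed by letters, digits, or underscores) and the names `IDENTIFIER_PATTERN` and `tokenize` are assumptions made for illustration, not the paper's implementation.

```python
import re
from typing import List

# Assumed lexical pattern (not taken from the paper): a letter or underscore
# followed by any number of letters, digits, or underscores. Keywords and
# identifiers in many programming languages share this shape.
IDENTIFIER_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def tokenize(source: str) -> List[str]:
    """Return lowercase keyword/identifier-like tokens without parsing the language."""
    return [match.group(0).lower() for match in IDENTIFIER_PATTERN.finditer(source)]


if __name__ == "__main__":
    # The same function is applied to snippets in two different languages.
    print(tokenize("int maxValue = 10;"))
    print(tokenize("def max_value(x): return x + 1"))
```

Because the function never consults a grammar, a snippet in a newly introduced language can be tokenized without updating the retrieval system, which is the property the abstract argues for.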

Authors and Affiliations

Oscar Karnalim

How To Cite

Oscar Karnalim (2018). LANGUAGE-AGNOSTIC SOURCE CODE RETRIEVAL USING KEYWORD & IDENTIFIER LEXICAL PATTERN. International Journal of Software Engineering and Computer Systems, 4(1), 29-47. https://europub.co.uk/articles/-A-597364