LANGUAGE-AGNOSTIC SOURCE CODE RETRIEVAL USING KEYWORD & IDENTIFIER LEXICAL PATTERN

Abstract

Although source code retrieval is a promising mechanism for supporting software reuse, it faces an emerging issue as programming languages continue to develop. Most retrieval systems rely on programming-language-dependent features to extract source code lexicons. Thus, each time a new programming language is developed, such a system must be updated manually to handle that language. This update may take a considerable amount of time, especially when the parsing mechanism of the language is uncommon (e.g., the Python parsing mechanism). To address this issue, this paper proposes a source code retrieval approach that does not rely on programming-language-dependent features. Instead, it relies on the Keyword & Identifier lexical pattern, which is typically similar across programming languages. This pattern is adapted to four components: tokenization, retrieval model, query expansion, and document enrichment. According to our evaluation, these components are effective in retrieving relevant source code in a language-agnostic manner, although the improvement contributed by each component varies.
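As a rough illustration of the tokenization component, the sketch below extracts keyword- and identifier-like tokens without any language-specific parser. It is not the paper's implementation. The regular expressions, the function name `tokenize`, and the splitting of identifiers into lowercase sub-words are all assumptions made for this example, based on the common convention that keywords and identifiers start with a letter or underscore and continue with letters, digits, or underscores.

```python
import re

# Assumption: a keyword or identifier starts with a letter or underscore and
# continues with letters, digits, or underscores. This holds for many
# languages, but not for all of them.
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# Assumption: identifiers are split further at camelCase, acronym, and digit
# boundaries (e.g. "readHTTPFile" -> "read", "HTTP", "File").
PART = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def tokenize(source: str) -> list[str]:
    """Return lowercase keyword/identifier sub-words found in source text."""
    tokens = []
    for word in WORD.findall(source):
        # Split snake_case first, then camelCase within each chunk.
        for chunk in word.split("_"):
            tokens.extend(part.lower() for part in PART.findall(chunk))
    return tokens


if __name__ == "__main__":
    # The same function is applied to snippets from different languages.
    print(tokenize("int maxValue = get_max(arr);"))
    print(tokenize("def read_file(path): return open(path).read()"))
```

Because the function only looks at lexical shape, keywords such as `int` or `def` and user-defined identifiers are treated alike, and operators, punctuation, and bare numeric literals are dropped. Whether that matches the paper's handling of such tokens is not stated in the abstract.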

Authors and Affiliations

Oscar Karnalim

How To Cite

Oscar Karnalim (2018). LANGUAGE-AGNOSTIC SOURCE CODE RETRIEVAL USING KEYWORD & IDENTIFIER LEXICAL PATTERN. International Journal of Software Engineering and Computer Systems, 4(1), 29-47. https://europub.co.uk/articles/-A-597364