LANGUAGE-AGNOSTIC SOURCE CODE RETRIEVAL USING KEYWORD & IDENTIFIER LEXICAL PATTERN

Abstract

Although source code retrieval is a promising mechanism for supporting software reuse, it faces a growing problem as programming languages continue to develop. Most retrieval systems rely on programming-language-dependent features to extract source code lexicons. Thus, each time a new programming language is developed, such a system must be updated manually to handle that language. The update may take a considerable amount of time, especially when the parsing mechanism of the language is uncommon (e.g. the Python parsing mechanism). To address this issue, this paper proposes a source code retrieval approach that does not rely on programming-language-dependent features. Instead, it relies on the Keyword & Identifier lexical pattern, which is typically similar across programming languages. The pattern is applied to four components: tokenization, retrieval model, query expansion, and document enrichment. According to our evaluation, these components are effective at retrieving relevant source code in a language-agnostic manner, although the degree of improvement varies by component.
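As an illustration of the tokenization component, the following minimal sketch extracts keyword and identifier tokens with a single lexical pattern instead of a language-specific parser. The abstract does not give the paper's exact pattern, so the regular expressions, the sub-word splitting, and the function name `tokenize` are assumptions made for illustration only.

```python
import re

# Assumed lexical pattern: a letter or underscore followed by letters,
# digits, or underscores. Keywords and identifiers share this shape in
# many programming languages.
IDENTIFIER_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# Assumed sub-word pattern: an acronym run, a (capitalized) word, or digits.
SUBWORD_PATTERN = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def tokenize(source: str) -> list[str]:
    """Return lowercase keyword/identifier sub-words found in source text."""
    tokens = []
    for word in IDENTIFIER_PATTERN.findall(source):
        # Split snake_case on underscores, then camelCase on case changes.
        for part in word.split("_"):
            tokens.extend(m.lower() for m in SUBWORD_PATTERN.findall(part))
    return tokens


# The same function is applied to snippets from different languages.
print(tokenize("int maxValue = get_max(arr);"))
print(tokenize("def get_max(arr): return max(arr)"))
```

Because no grammar is involved, the same call works on any text that follows this lexical shape; the trade-off is that words inside comments and string literals are tokenized as well.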

Authors and Affiliations

Oscar Karnalim

How To Cite

Oscar Karnalim (2018). LANGUAGE-AGNOSTIC SOURCE CODE RETRIEVAL USING KEYWORD & IDENTIFIER LEXICAL PATTERN. International Journal of Software Engineering and Computer Systems, 4(1), 29-47. https://europub.co.uk/articles/-A-597364