Software Code Clone Detection Using AST
Journal Title: International Journal of P2P Network Trends and Technology (IJPTT) - Year 2014, Vol 9, Issue 1
Abstract
Existing research suggests that a considerable portion (10-15%) of the source code of large-scale computer programs is duplicate code. Detecting and removing such clones promises to reduce software maintenance costs by possibly the same magnitude. Previous work was limited to detecting either near misses that differ only in single lexemes, or near misses between complete functions only. This paper presents simple and practical methods for detecting exact and near-miss clones over arbitrary program fragments in program source code by using abstract syntax trees. Previous work also did not suggest practical means for removing detected clones. Since our methods operate in terms of the program structure, clones could be removed by mechanical methods that produce in-lined procedures or standard preprocessor macros. A tool using these techniques is applied to a C production software system of some 500K source lines, and the results confirm the levels of duplication found by previous work. The tool produces the macro bodies needed for clone removal, and the macro invocations to replace the clones. The tool uses a variation of the well-known compiler method for detecting common sub-expressions. This method determines exact tree matches; a number of adjustments are needed to detect equivalent statement sequences, commutative operands, and nearly exact matches. We additionally suggest that clone detection could be useful in producing more structured code, and in reverse engineering to discover domain concepts and their implementations.
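The exact-match step described above can be illustrated with a minimal sketch. It is not the authors' tool: the paper targets C source, whereas this sketch assumes Python source and the standard-library `ast` module purely for convenience. The function name `find_exact_clones`, the `min_nodes` threshold, and the `SAMPLE` input are hypothetical. Subtrees are grouped by a structural fingerprint, much as a compiler groups candidate common sub-expressions; near-miss matching, statement sequences, and commutative operands are not handled.

```python
import ast
from collections import defaultdict


def find_exact_clones(source, min_nodes=8):
    """Group structurally identical AST subtrees (exact clones only).

    Assumption: min_nodes is an arbitrary cut-off that filters out
    trivially small subtrees such as single names or constants.
    """
    tree = ast.parse(source)
    buckets = defaultdict(list)
    for node in ast.walk(tree):
        # Only statements and expressions carry line numbers.
        if not isinstance(node, (ast.stmt, ast.expr)):
            continue
        if sum(1 for _ in ast.walk(node)) < min_nodes:
            continue
        # ast.dump omits line/column attributes by default, so two
        # subtrees get the same key exactly when their structure,
        # identifiers, and constants all match.
        buckets[ast.dump(node)].append(node)
    # Report each group of two or more matches as (node kind, line numbers).
    return [
        (type(nodes[0]).__name__, [n.lineno for n in nodes])
        for nodes in buckets.values()
        if len(nodes) > 1
    ]


# Hypothetical input: the same assignment appears in two functions.
SAMPLE = """
def a(price, qty, tax):
    total = price * qty + tax
    return total

def b(price, qty, tax):
    total = price * qty + tax
    print(total)
"""

if __name__ == "__main__":
    for kind, lines in find_exact_clones(SAMPLE):
        print(kind, "cloned at lines", lines)
```

Note that a cloned subtree's large children are reported as clones too (here, the right-hand expression of the duplicated assignment); a fuller implementation would presumably suppress matches that are contained in a larger reported clone.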
Authors and Affiliations
G. Anil kumar, Dr. C. R. K. Reddy, Dr. A. Govardhan