ECLogger: Cross-Project Catch-Block Logging Prediction Using Ensemble of Classifiers
Journal Title: e-Informatica Software Engineering Journal - Year 2017, Vol 11, Issue 1
Abstract
Background: Software developers insert log statements in the source code to record program execution information. However, optimizing the number of log statements in the source code is challenging. Machine learning based within-project logging prediction tools, proposed in previous studies, may not be suitable for new or small software projects. For such software projects, we can use cross-project logging prediction. Aim: The aim of the study presented here is to investigate cross-project logging prediction methods and techniques. Method: The proposed method is ECLogger, a novel, ensemble-based, cross-project, catch-block logging prediction model. Nine base classifiers were used and combined using ensemble techniques. The performance of ECLogger was evaluated on three open-source Java projects: Tomcat, CloudStack and Hadoop. Results: ECLogger Bagging, ECLogger AverageVote, and ECLogger MajorityVote show a considerable improvement in the average Logged F-measure ($LF$) on 3, 5, and 4 source$\rightarrow$target project pairs, respectively, compared to the baseline classifiers. ECLogger AverageVote performs best and shows improvements of 3.12% (average $LF$) and 6.08% (average $ACC$, i.e., accuracy). Conclusion: Classifiers based on ensemble techniques such as bagging, average vote, and majority vote outperform the baseline classifier. Overall, the ECLogger AverageVote model performs best. The results show that the CloudStack project is more generalizable than the other projects.
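To make the two voting schemes named in the abstract concrete, the following is a minimal Python sketch rather than the authors' implementation. It assumes each base classifier outputs a probability that a catch block should be logged; the function names, the 0.5 threshold, and the sample values are hypothetical and chosen only for illustration. It also shows one common way to compute an F-measure on the logged class, which is assumed here to correspond to $LF$.

```python
from typing import List


def average_vote(probabilities: List[float], threshold: float = 0.5) -> bool:
    """Predict 'logged' when the mean base-classifier probability reaches the threshold."""
    return sum(probabilities) / len(probabilities) >= threshold


def majority_vote(probabilities: List[float], threshold: float = 0.5) -> bool:
    """Predict 'logged' when more than half of the base classifiers vote 'logged'."""
    votes = sum(1 for p in probabilities if p >= threshold)
    return votes > len(probabilities) / 2


def logged_f_measure(actual: List[bool], predicted: List[bool]) -> float:
    """F-measure computed on the 'logged' (True) class only."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical probabilities from five base classifiers for one catch block.
sample = [0.9, 0.9, 0.4, 0.4, 0.4]
avg_decision = average_vote(sample)    # two confident classifiers pull the mean up
maj_decision = majority_vote(sample)   # only two of five classifiers vote 'logged'

# Hypothetical labels for four catch blocks.
lf = logged_f_measure([True, True, False, False], [True, False, True, False])
```

In this sample the two schemes disagree: the average vote follows the two highly confident classifiers, while the majority vote counts each classifier equally regardless of confidence.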
Authors and Affiliations
Sangeeta Lal, Neetu Sardana, Ashish Sureka
Resolving Conflict and Dependency in Refactoring to a Desired Design
Refactoring is performed to improve software quality while leaving the behaviour of the system unchanged. In practice there are many opportunities for refactoring; however, due to conflicts and dependencies between refac...
Knowledge Management in Software Testing: A Systematic Snowball Literature Review
Software testing benefits from the usage of Knowledge Management (KM) methods and principles. Thus, there is a need to adopt KM to the software testing core processes and attain the benefits that it provide...
Cross-Project Defect Prediction with Respect to Code Ownership Model: An Empirical Study
The paper presents an analysis of 83 versions of industrial, open-source and academic projects. We have empirically evaluated whether those project types constitute separate classes of projects with regard to defect pred...
Data Flow Approach to Testing Java Programs Supported with DFC
The code-based ("white box") approach to testing can be divided into two main types: control flow coverage and data flow coverage. The data flow testing was introduced to structural programming languages and later adopted...
Efficiency of Software Testing Techniques: A Controlled Experiment Replication and Network Meta-analysis
Background. Common approaches to software verification include static testing techniques, such as code reading, and dynamic testing techniques, such as black-box and white-box testing. Objective. With the aim of gaining...