Empirical Assessment of Ensemble based Approaches to Classify Imbalanced Data in Binary Classification

Abstract

Classifying imbalanced data with traditional classifiers remains a major challenge. Data are imbalanced when the classes contain unequal numbers of instances. Many real-life problems have this property, e.g. Web spam detection, credit card fraud, and fraudulent telephone calls. The problem arises whenever the objective is to identify exceptional cases. Researchers handle it either by modifying existing classification methods or by developing new ones. This paper reviews ensemble-based approaches (Boosting-based and Bagging-based) designed to address class imbalance, focusing on binary classification. We compared 6 Boosting-based, 7 Bagging-based, and 2 hybrid ensembles for their performance in the imbalanced domain. We used the KEEL tool to evaluate these methods on seven imbalanced data sets with class imbalance ratios ranging from 1.82 to as high as 129.44. Area under the curve (AUC) is recorded as the performance metric. We also analyzed the methods statistically using the Friedman rank test and the Wilcoxon matched-pairs signed-rank test to support the visual interpretations. The analysis shows that the RUSBoost ensemble outperformed every other ensemble on the imbalanced data sets studied.
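
As a rough illustration of two quantities the abstract relies on, the sketch below computes a class imbalance ratio and an AUC value. It assumes the usual definition of the imbalance ratio (majority-class count divided by minority-class count) and assumes that AUC refers to the area under the ROC curve, computed here through its pairwise-ranking formulation. The function names and the toy labels and scores are hypothetical; they are not taken from the paper or from KEEL.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count (binary labels)."""
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    return max(counts.values()) / min(counts.values())


def auc(labels, scores, positive=1):
    """Area under the ROC curve via pairwise ranking.

    Equals the fraction of (positive, negative) pairs in which the positive
    instance receives the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == positive]
    neg = [s for y, s in zip(labels, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one instance of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, made up for illustration: 2 positives (minority), 8 negatives.
y = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.8, 0.3, 0.2, 0.1, 0.35, 0.05, 0.15, 0.25]

print(imbalance_ratio(y))  # 4.0
print(auc(y, scores))      # 0.9375
```

On such skewed data a classifier that always predicts the majority class would reach 80% accuracy while detecting no positives, which is one reason a ranking-based metric such as AUC is commonly preferred in this setting.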

Authors and Affiliations

Prabhjot Kaur, Anjana Gosain

  • EP ID EP498374
  • DOI 10.14569/IJACSA.2019.0100307

How To Cite

Prabhjot Kaur, Anjana Gosain (2019). Empirical Assessment of Ensemble based Approaches to Classify Imbalanced Data in Binary Classification. International Journal of Advanced Computer Science & Applications, 10(3), 48-58. https://europub.co.uk/articles/-A-498374