Empirical Assessment of Ensemble based Approaches to Classify Imbalanced Data in Binary Classification

Abstract

Classifying imbalanced data with traditional classifiers remains a major challenge. Data are imbalanced when the classes are not represented in equal proportions. Many real-life applications face this problem, e.g. Web spam detection, credit card fraud detection, and detection of fraudulent telephone calls. The problem arises whenever the objective is to identify exceptional cases. Researchers handle it either by modifying existing classification methods or by developing new ones. This paper reviews ensemble-based approaches (Boosting-based and Bagging-based) designed to address class imbalance, focusing on binary classification. We compared 6 Boosting-based, 7 Bagging-based and 2 hybrid ensembles for their performance in the imbalanced domain. We used the KEEL tool to evaluate these methods on seven imbalanced datasets with class imbalance ratios ranging from 1.82 to as high as 129.44. The area under the curve (AUC) was recorded as the performance metric. We also analyzed the methods statistically, using the Friedman rank test and the Wilcoxon matched-pairs signed-rank test, to support the visual interpretation of the results. The analysis shows that the RUSBoost ensemble outperformed every other ensemble on the imbalanced datasets.
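
The two quantities the abstract relies on, the class imbalance ratio and the AUC, can be illustrated with a minimal Python sketch. It assumes the imbalance ratio is the majority-class count divided by the minority-class count, and it computes AUC as the fraction of positive/negative pairs that are ranked correctly. The function names and the toy labels and scores are hypothetical, chosen only for illustration; they are not taken from the paper, and this is not how KEEL computes the metric.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count (binary labels)."""
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    return max(counts.values()) / min(counts.values())


def auc(labels, scores, positive=1):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    A pair counts 1 if the positive example has the higher score,
    0.5 on a tie, and 0 otherwise.
    """
    pos = [s for y, s in zip(labels, scores) if y == positive]
    neg = [s for y, s in zip(labels, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (made up): 9 majority examples (label 0), 3 minority examples (label 1).
toy_labels = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
toy_scores = [0.1, 0.2, 0.15, 0.3, 0.05, 0.4, 0.6, 0.25, 0.35, 0.7, 0.5, 0.9]

toy_ir = imbalance_ratio(toy_labels)
toy_auc = auc(toy_labels, toy_scores)
print(f"imbalance ratio = {toy_ir:.2f}, AUC = {toy_auc:.3f}")
```

On this toy data a classifier that always predicts the majority class would reach 75% accuracy while never identifying a minority example, which is one reason a ranking-based metric such as AUC is commonly preferred for imbalanced problems.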

Authors and Affiliations

Prabhjot Kaur, Anjana Gosain

  • EP ID EP498374
  • DOI 10.14569/IJACSA.2019.0100307

How To Cite

Prabhjot Kaur, Anjana Gosain (2019). Empirical Assessment of Ensemble based Approaches to Classify Imbalanced Data in Binary Classification. International Journal of Advanced Computer Science & Applications, 10(3), 48-58. https://europub.co.uk/articles/-A-498374