Empirical Assessment of Ensemble based Approaches to Classify Imbalanced Data in Binary Classification

Abstract

Classifying imbalanced data with traditional classifiers remains a major challenge. Data is imbalanced when the classes are not represented in equal proportions. Many real-life problems have this property, e.g. Web spam detection, credit card fraud, and fraudulent telephone calls. The problem arises whenever the objective is to identify exceptional cases. Researchers have handled it either by modifying existing classification methods or by developing new ones. This paper reviews ensemble-based approaches (Boosting-based and Bagging-based) designed to address class imbalance, focusing on binary classification. We compared 6 Boosting-based, 7 Bagging-based and 2 hybrid ensembles for their performance in the imbalanced domain. We used the KEEL tool to evaluate these methods on seven imbalanced datasets with class imbalance ratios ranging from 1.82 to as high as 129.44. The area under the curve (AUC) was recorded as the performance metric. We also analyzed the methods statistically using the Friedman rank test and the Wilcoxon matched-pairs signed-rank test to support the visual interpretations. The analysis shows that the RUSBoost ensemble outperformed every other ensemble on the imbalanced datasets.
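As a small illustration of the two quantities the abstract relies on, the following sketch computes a class imbalance ratio and an AUC on toy data. It is not the authors' code or the KEEL implementation. It assumes the common convention that the imbalance ratio is the majority-class count divided by the minority-class count, and it computes AUC through its rank interpretation (the probability that a randomly chosen positive example scores higher than a randomly chosen negative one). The function names and the data are hypothetical.

```python
from typing import Sequence


def imbalance_ratio(labels: Sequence[int]) -> float:
    """Majority-class count divided by minority-class count (binary labels 0/1)."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    minority, majority = sorted((positives, negatives))
    if minority == 0:
        raise ValueError("both classes must be present")
    return majority / minority


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration): 2 minority examples, 6 majority examples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.4, 0.5, 0.3, 0.2, 0.2, 0.1, 0.1]

ir = imbalance_ratio(y_true)
area = auc(y_true, y_score)
print(f"imbalance ratio = {ir:.2f}, AUC = {area:.4f}")
```

Unlike plain accuracy, AUC does not reward a classifier for simply predicting the majority class, which is one reason it is a common choice when classes are heavily skewed.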

Authors and Affiliations

Prabhjot Kaur, Anjana Gosain

  • EP ID EP498374
  • DOI 10.14569/IJACSA.2019.0100307

How To Cite

Prabhjot Kaur, Anjana Gosain (2019). Empirical Assessment of Ensemble based Approaches to Classify Imbalanced Data in Binary Classification. International Journal of Advanced Computer Science & Applications, 10(3), 48-58. https://europub.co.uk/articles/-A-498374