Empirical Assessment of Ensemble based Approaches to Classify Imbalanced Data in Binary Classification
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2019, Vol 10, Issue 3
Abstract
Classifying imbalanced data with traditional classifiers remains a major challenge. Data is imbalanced when the classes are not equally represented. Many real-life problems have this property, e.g. Web spam detection, credit card fraud, and fraudulent telephone calls. The problem arises whenever the objective is to identify exceptional cases. Researchers handle it either by modifying existing classification methods or by developing new ones. This paper reviews ensemble-based approaches (Boosting and Bagging based) designed to address class imbalance, focusing on binary classification. We compared 6 Boosting-based, 7 Bagging-based and 2 hybrid ensembles for their performance in the imbalanced domain. We used the KEEL tool to evaluate these methods on seven imbalanced data sets with class imbalance ratios from 1.82 to as high as 129.44. Area Under the Curve (AUC) is recorded as the performance metric. We also analyzed the methods statistically using the Friedman rank test and the Wilcoxon matched-pairs signed-rank test to support the visual interpretations. The analysis indicates that the RusBoost ensemble outperformed every other ensemble on the imbalanced data sets considered.
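The two quantities the abstract relies on, the class imbalance ratio and AUC, can be illustrated with a minimal sketch. It assumes the imbalance ratio is the majority-class count divided by the minority-class count (the abstract does not spell out the definition) and computes AUC as the fraction of positive/negative pairs ranked correctly, with ties counted as half. The function names and the toy data are hypothetical and are not taken from the paper or from KEEL.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count (assumed definition)."""
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    return max(counts.values()) / min(counts.values())


def auc(labels, scores, positive=1):
    """AUC as the share of (positive, negative) pairs where the positive
    example receives the higher score; ties count as half a correct pair."""
    pos = [s for y, s in zip(labels, scores) if y == positive]
    neg = [s for y, s in zip(labels, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy data (hypothetical): four negatives, two positives.
labels = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.8, 0.7, 0.9]

print(imbalance_ratio(labels))  # 4 majority / 2 minority
print(auc(labels, scores))      # 7 of 8 pairs ranked correctly
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; a rank-based formulation would be preferable for large data sets.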
Authors and Affiliations
Prabhjot Kaur, Anjana Gosain