Impacts of Unbalanced Test Data on the Evaluation of Classification Methods

Abstract

The performance of a classifier in a supervised machine learning problem is commonly evaluated using accuracy, precision, recall, and F1-score. These measures evaluate classifiers well when the numbers of positive and negative samples in the test set are balanced or nearly balanced. However, they may misjudge a classifier when the positive and negative samples in the test set are unbalanced. This paper proposes updated versions of these measures that take into account an unbalance factor, which represents the unbalance ratio of positive and negative samples in the test set. The updated measures are then evaluated experimentally and compared with the traditional ones.
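The effect described in the abstract can be illustrated with a minimal sketch. It uses only the standard definitions of accuracy, precision, recall, and F1-score, and does not reproduce the updated measures proposed in the paper, which are not given here. The helper names (`metrics`, `confusion_counts`), the sample counts, and the assumed true-positive and false-positive rates are hypothetical values chosen for illustration.

```python
def metrics(tp, fp, tn, fn):
    """Standard accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def confusion_counts(n_pos, n_neg, tpr, fpr):
    """Expected counts for a classifier with fixed per-class behaviour.

    tpr and fpr are assumed (hypothetical) true-positive and
    false-positive rates; they do not depend on the class ratio.
    """
    tp = round(n_pos * tpr)
    fn = n_pos - tp
    fp = round(n_neg * fpr)
    tn = n_neg - fp
    return tp, fp, tn, fn


# The same classifier (TPR = 0.9, FPR = 0.1) scored on two test sets.
balanced = metrics(*confusion_counts(500, 500, tpr=0.9, fpr=0.1))
skewed = metrics(*confusion_counts(50, 950, tpr=0.9, fpr=0.1))

# A trivial classifier that labels every sample negative,
# scored on the skewed test set (50 positives, 950 negatives).
always_negative = metrics(tp=0, fp=0, tn=950, fn=50)

for name, result in [("balanced", balanced), ("skewed", skewed),
                     ("always negative", always_negative)]:
    print(name, {k: round(v, 3) for k, v in result.items()})
```

Under these assumptions, accuracy and recall are identical on both test sets, while precision and F1-score fall sharply on the skewed set even though the classifier's per-class behaviour has not changed. The always-negative classifier, which detects no positive sample at all, still reaches high accuracy on the skewed set.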

Authors and Affiliations

Manh Hung Nguyen

  • EP ID EP499607
  • DOI 10.14569/IJACSA.2019.0100364

How To Cite

Manh Hung Nguyen (2019). Impacts of Unbalanced Test Data on the Evaluation of Classification Methods. International Journal of Advanced Computer Science & Applications, 10(3), 497-502. https://europub.co.uk/articles/-A-499607