Research of Imbalanced Data Classification in Data Mining

Journal Title: Scholars Journal of Physics, Mathematics and Statistics - Year 2016, Vol 3, Issue 3

Abstract

Classification is one of the most important research topics in data mining. Traditional classification methods are relatively mature and perform well on well-balanced data. In the real world, however, data are usually imbalanced: most samples belong to the majority class and only a few belong to the minority class. When an imbalanced data set is classified by a traditional algorithm, which tends to favor the majority class, the precision on minority-class samples is reduced. Research on imbalanced data sets is therefore quite important. To give readers a clear picture of current progress and future work in imbalanced data classification, this paper reviews developments in three areas: data-level methods, algorithm-level methods, and performance evaluation of imbalanced data classification. We hope it provides a valuable reference for researchers interested in this field.
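The bias toward the majority class described in the abstract can be illustrated with a small sketch. It is not taken from the paper: the 95:5 class ratio, the synthetic labels, and the helper names (confusion_counts, accuracy, minority_recall) are assumptions chosen only for illustration. It shows why overall accuracy can look good while the minority class is missed entirely, which is the motivation for dedicated performance evaluation on imbalanced data.

```python
# Illustrative sketch only: synthetic labels, not data from the paper.
# Class 1 is assumed to be the minority class, class 0 the majority class.

def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn), treating `positive` as the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    """Fraction of all samples that are labeled correctly."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)

def minority_recall(y_true, y_pred):
    """Fraction of minority-class samples that are found (0.0 if none exist)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical data set with a 95:5 class ratio.
y_true = [0] * 95 + [1] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
rec = minority_recall(y_true, y_pred)
print(f"accuracy={acc:.2f}, minority recall={rec:.2f}")
# prints "accuracy=0.95, minority recall=0.00"
```

In this constructed case the classifier is 95% accurate yet finds none of the minority samples, so a per-class measure such as recall exposes a failure that accuracy hides.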

Authors and Affiliations

Xin Hua, ZhouShao Hua, Hu Jin Yan

How To Cite

Xin Hua, ZhouShao Hua, Hu Jin Yan (2016). Research of Imbalanced Data Classification in Data Mining. Scholars Journal of Physics, Mathematics and Statistics, 3(3), 117-122. https://europub.co.uk/articles/-A-385468