Research of Imbalanced Data Classification in Data Mining

Journal Title: Scholars Journal of Physics, Mathematics and Statistics - Year 2016, Vol 3, Issue 3

Abstract

Classification is one of the most important research topics in data mining. Traditional classification methods are relatively mature and perform well on well-balanced data. In the real world, however, data is usually imbalanced: most samples belong to the majority class and only a few belong to the minority class. When an imbalanced data set is classified by a traditional algorithm, which tends to favor the majority class, the classification precision for minority class samples is reduced. Research on imbalanced data sets is therefore quite important. To give readers a clear picture of current and future work on imbalanced data classification, this paper reviews three areas of progress: data-level methods, algorithm-level methods, and performance evaluation measures for imbalanced data classification. We hope it provides a useful reference for researchers interested in this field.
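The point about traditional classifiers favoring the majority class can be illustrated with a small sketch. The example below is not taken from the paper: the 95/5 class split, the always-majority "classifier", and the choice of recall, specificity, and G-mean as evaluation measures are all assumptions made for illustration.

```python
from math import sqrt

def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, minority-class recall, specificity, and G-mean.

    `positive` marks the minority class (an assumption of this sketch).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # minority-class hit rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # majority-class hit rate
    g_mean = sqrt(recall * specificity)                  # balances both classes
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "g_mean": g_mean,
    }

# Hypothetical data: 95 majority samples (label 0), 5 minority samples (label 1).
y_true = [0] * 95 + [1] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

metrics = evaluate(y_true, y_pred)
print(metrics)
```

Under these assumptions the overall accuracy looks high even though no minority sample is recognized, which is why evaluation measures that account for both classes matter for imbalanced data.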

Authors and Affiliations

Xin Hua, ZhouShao Hua, Hu Jin Yan

How To Cite

Xin Hua, ZhouShao Hua, Hu Jin Yan (2016). Research of Imbalanced Data Classification in Data Mining. Scholars Journal of Physics, Mathematics and Statistics, 3(3), 117-122. https://europub.co.uk/articles/-A-385468