Performance Analysis of Classification Learning Methods on Large Dataset using two Data Mining Tools

Journal Title: Journal of Independent Studies and Research - Computing - Year 2015, Vol 13, Issue 2

Abstract

Data volumes grow daily, so processing this data and selecting the right method and tool is a real challenge. Computer scientists process and analyse data with different machine learning methods and various data mining tools, aiming for high accuracy and minimal model-building time. Several data analysis and processing tools, such as WEKA, RapidMiner, and Keel, are available for processing, analysis, and modelling, yet no single tool is perfect or the established choice for data processing and analysis. In this context, the authors present a comparative and analytical study of the performance of different classification algorithms (Naïve Bayes, KNN, IBK, Random Forest, C4.5, and J48) and two data mining tools (WEKA and RapidMiner) on a large dataset, evaluating their performance and analytical results with a low cost of error. The Adult Income dataset from the UCI data repository is used for this study. The aim and significance of the study is to evaluate and assess the range of performance of different machine learning methods and two diverse data mining tools on dissimilar datasets. The results for each classification method and data mining tool are analysed and presented at the end.
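The kind of comparison the abstract describes, classification accuracy against model-building time, can be illustrated with a minimal sketch. This is not the authors' WEKA or RapidMiner setup: the two classifiers are small standard-library re-implementations, the data is synthetic rather than the Adult Income dataset, and every name (`make_data`, `GaussianNB`, `KNN`, `evaluate`) is hypothetical.

```python
import math
import random
import time
from collections import defaultdict


def make_data(n, seed=0):
    """Synthetic two-class data with two numeric features (a stand-in, not Adult Income)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 2.0 * label  # class 0 is centred at 0, class 1 at 2
        rows.append(([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)], label))
    return rows


class GaussianNB:
    """Naive Bayes with one Gaussian per feature and class."""

    def fit(self, rows):
        by_class = defaultdict(list)
        for x, y in rows:
            by_class[y].append(x)
        self.stats = {}
        for y, xs in by_class.items():
            prior = len(xs) / len(rows)
            params = []
            for col in zip(*xs):
                mean = sum(col) / len(col)
                var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
                params.append((mean, var))
            self.stats[y] = (prior, params)
        return self

    def predict(self, x):
        def log_posterior(y):
            prior, params = self.stats[y]
            score = math.log(prior)
            for v, (mean, var) in zip(x, params):
                score += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
            return score

        return max(self.stats, key=log_posterior)


class KNN:
    """k-nearest neighbours with Euclidean distance and majority vote."""

    def __init__(self, k=5):
        self.k = k

    def fit(self, rows):
        self.rows = list(rows)  # lazy learner: "building" only stores the data
        return self

    def predict(self, x):
        nearest = sorted(self.rows, key=lambda r: math.dist(r[0], x))[: self.k]
        votes = defaultdict(int)
        for _, y in nearest:
            votes[y] += 1
        return max(votes, key=votes.get)


def evaluate(model, train, test):
    """Time the model build, then measure accuracy on held-out rows."""
    start = time.perf_counter()
    model.fit(train)
    build_seconds = time.perf_counter() - start
    correct = sum(model.predict(x) == y for x, y in test)
    return {"accuracy": correct / len(test), "build_seconds": build_seconds}


data = make_data(600)
train, test = data[:400], data[400:]
results = {
    name: evaluate(model, train, test)
    for name, model in [("NaiveBayes", GaussianNB()), ("KNN", KNN(k=5))]
}
for name, r in results.items():
    print(f"{name}: accuracy={r['accuracy']:.3f}, build={r['build_seconds']:.6f}s")
```

Timing only the `fit` call mirrors the abstract's "time for building of Model"; a lazy learner such as KNN builds almost instantly and pays its cost at prediction time, which is one reason build time alone can mislead when comparing methods.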

  • EP ID EP643810
  • DOI 10.31645/jisrc/(2015).13.2.0005

How To Cite

(2015). Performance Analysis of Classification Learning Methods on Large Dataset using two Data Mining Tools. Journal of Independent Studies and Research - Computing, 13(2), 8-14. https://europub.co.uk/articles/-A-643810