Performance Analysis of Classification Learning Methods on Large Dataset using two Data Mining Tools

Journal Title: Journal of Independent Studies and Research - Computing - Year 2015, Vol 13, Issue 2

Abstract

Data volumes grow daily, so processing the data and selecting the right method and tool is a significant challenge. Computer scientists process and analyse data with different machine learning methods in various data mining tools, aiming for high accuracy and minimal model-building time. Several data analysis and processing tools, such as WEKA, RapidMiner, and KEEL, are available for processing, analysis, and modelling, yet no single tool is established as the best choice for data processing and analysis. To address this, the authors present a comparative and analytical study of the performance of different classification algorithms (Naïve Bayes, KNN, IBK, Random Forest, C4.5, and J48) and of two data mining tools, WEKA and RapidMiner, on a large dataset, evaluating their performance and analytical results with a low error cost. The Adult Income dataset from the UCI data repository is used for this study. The aim of the study is to evaluate and assess the range of performance of different machine learning methods and two diverse data mining tools on dissimilar datasets. The results for each classification method and data mining tool are analysed and presented at the end.
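
The comparison described in the abstract, recording model-building time and accuracy for each classifier on the same data, might be sketched roughly as follows. This is an illustrative sketch only, not the authors' WEKA or RapidMiner setup: the small synthetic two-class dataset (`make_data`), the `evaluate` helper, and the two hand-written classifiers (a majority-class baseline and a plain k-nearest-neighbour) are all assumptions introduced here. The study itself uses the Adult Income dataset and the tools' own implementations.

```python
import random
import time
from collections import Counter


class MajorityClass:
    """Baseline: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class KNearestNeighbours:
    """Plain k-NN with squared Euclidean distance and majority vote."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # k-NN is a lazy learner: "building" the model only stores the data.
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        preds = []
        for row in X:
            dists = sorted(
                (sum((a - b) ** 2 for a, b in zip(row, other)), label)
                for other, label in zip(self.X_, self.y_)
            )
            votes = Counter(label for _, label in dists[: self.k])
            preds.append(votes.most_common(1)[0][0])
        return preds


def make_data(n, seed=0):
    """Hypothetical stand-in data: two Gaussian blobs in 2-D, one per class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        centre = 0.0 if label == 0 else 3.0
        X.append((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)))
        y.append(label)
    return X, y


def evaluate(model, X_train, y_train, X_test, y_test):
    """Return (build_seconds, accuracy) for one classifier."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    build_seconds = time.perf_counter() - start
    preds = model.predict(X_test)
    accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
    return build_seconds, accuracy


# Simple hold-out split: first 300 rows for training, last 100 for testing.
X, y = make_data(400)
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

results = {}
for name, model in [("Majority", MajorityClass()),
                    ("kNN (k=3)", KNearestNeighbours(3))]:
    results[name] = evaluate(model, X_train, y_train, X_test, y_test)
    build, acc = results[name]
    print(f"{name}: build={build:.6f}s accuracy={acc:.3f}")
```

Timing only the `fit` call separates build time from prediction time, which matters for lazy learners such as k-NN, where nearly all of the cost falls at prediction.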

Download PDF file
  • EP ID EP643810
  • DOI 10.31645/jisrc/(2015).13.2.0005

How To Cite

(2015). Performance Analysis of Classification Learning Methods on Large Dataset using two Data Mining Tools. Journal of Independent Studies and Research - Computing, 13(2), 8-14. https://europub.co.uk/articles/-A-643810