A Comparative Study of Centroid-Based and Naïve Bayes Classifiers for Document Categorization

Abstract

Assigning documents to relevant categories is a critical task for effective document retrieval. Automatic text classification is the process of assigning a new text document to predefined categories based on its content. In this paper, we implement and compare Naïve Bayes (NB) and centroid-based algorithms for categorizing English-language text. For the centroid-based algorithm, we use the Arithmetical Average Centroid (AAC) and Cumuli Geometric Centroid (CGC) methods to calculate the centroid of each class. Experiments are performed on the R-52 dataset of the Reuters-21578 corpus, and the micro-averaged F1 measure is used to evaluate classifier performance. The results show that NB achieves the highest micro-averaged F1, followed by CGC, which in turn outperforms AAC. These results are useful for future research.
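To make the evaluation setup concrete, the following is a minimal Python sketch of a centroid-based classifier with micro-averaged F1 scoring. It is not the authors' implementation. It assumes that the AAC centroid is the arithmetic mean of the term-weight vectors in a class and that documents are assigned to the centroid with the highest cosine similarity; the abstract does not state the similarity measure, so cosine is an assumption. CGC and Naïve Bayes are not shown. The function names and the toy documents are hypothetical and are not taken from Reuters-21578.

```python
import math
from collections import Counter


def aac_centroids(docs, labels):
    """Assumed AAC: the mean term-weight vector of each class."""
    sums = {}
    counts = Counter(labels)
    for vec, label in zip(docs, labels):
        acc = sums.setdefault(label, Counter())
        for term, weight in vec.items():
            acc[term] += weight
    return {
        label: {term: w / counts[label] for term, w in acc.items()}
        for label, acc in sums.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(vec, centroids):
    """Assign the document to the class with the most similar centroid."""
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


def micro_f1(true_labels, predicted_labels):
    """Micro-averaged F1: pool TP, FP and FN over all classes first."""
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp += 1
        else:
            fp += 1  # a false positive for the predicted class
            fn += 1  # a false negative for the true class
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: raw term counts stand in for term weights.
train_docs = [
    {"wheat": 2, "grain": 1},
    {"grain": 2, "crop": 1},
    {"oil": 2, "barrel": 1},
    {"oil": 1, "crude": 2},
]
train_labels = ["grain", "grain", "crude", "crude"]
test_docs = [
    {"wheat": 1, "crop": 1},
    {"crude": 1, "oil": 1},
    {"grain": 1, "oil": 3},
]
test_labels = ["grain", "crude", "grain"]

centroids = aac_centroids(train_docs, train_labels)
predictions = [classify(doc, centroids) for doc in test_docs]
score = micro_f1(test_labels, predictions)
print(predictions, round(score, 3))
```

In a single-label setting such as this sketch, every misclassification counts once as a false positive and once as a false negative, so micro-averaged precision, recall and F1 all coincide with accuracy.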

Authors and Affiliations

Rupali P. Patil, R. P. Bhavsar, B. V. Pawar

  • EP ID EP390838
  • DOI 10.9790/9622-0703045963

How To Cite

Rupali P. Patil, R. P. Bhavsar, B. V. Pawar (2017). A Comparative Study of Centroid-Based and Naïve Bayes Classifiers for Document Categorization. International Journal of Engineering Research and Applications, 7(3), 59-63. https://europub.co.uk/articles/-A-390838