A Comparative Study of Centroid-Based and Naïve Bayes Classifiers for Document Categorization

Abstract

Assigning documents to relevant categories is a critical task for effective document retrieval. Automatic text classification is the process of assigning a new text document to predefined categories based on its content. In this paper, we implement and compare Naïve Bayes (NB) and centroid-based algorithms for categorizing English-language text. For the centroid-based algorithm, we use the Arithmetical Average Centroid (AAC) and Cumuli Geometric Centroid (CGC) methods to calculate the centroid of each class. Experiments are performed on the R-52 dataset of the Reuters-21578 corpus, and the micro-averaged F1 measure is used to evaluate classifier performance. The results show that NB achieves the highest micro-averaged F1, followed by CGC and then AAC. These results may inform future research.
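As a rough illustration of the quantities named in the abstract, the sketch below computes two class centroids and a micro-averaged F1 score. The abstract gives no formulas, so the definitions are assumptions based on how the terms are commonly used: AAC is taken as the element-wise mean of a class's document vectors, and CGC as their element-wise sum. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
def aac_centroid(vectors):
    """Assumed AAC: element-wise mean of the class's document vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cgc_centroid(vectors):
    """Assumed CGC: element-wise sum of the class's document vectors."""
    return [sum(col) for col in zip(*vectors)]


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1."""
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label:
                fp += 1
            elif truth == label:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): two term-weight vectors for one class, four predictions.
docs = [[1.0, 0.0], [0.0, 1.0]]
y_true = ["earn", "acq", "earn", "crude"]
y_pred = ["earn", "earn", "earn", "crude"]

print(aac_centroid(docs))
print(cgc_centroid(docs))
print(micro_f1(y_true, y_pred))
```

Under these assumed definitions the two centroids differ only by a scale factor, so any difference in classification results would depend on the similarity measure and normalization used, which the abstract does not specify.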

Authors and Affiliations

Rupali P. Patil, R. P. Bhavsar, B. V. Pawar

  • EP ID EP390838
  • DOI 10.9790/9622-0703045963

How To Cite

Rupali P. Patil, R. P. Bhavsar, B. V. Pawar (2017). A Comparative Study of Centroid-Based and Naïve Bayes Classifiers for Document Categorization. International Journal of Engineering Research and Applications, 7(3), 59-63. https://europub.co.uk/articles/-A-390838