An Efficient Algorithm for Document Clustering in Information Retrieval

Abstract

Document clustering divides a set of documents into groups, called clusters, such that documents within a cluster are more similar to one another than to documents in other clusters. It has been studied intensively because of its wide applicability in areas such as information retrieval, web mining, and search engines like Google. Clustering determines the similarity between documents and groups them on that basis. It offers an efficient representation and visualization of the documents and thus makes navigation more convenient. The main objective of this research is to cluster documents into similar groups based on their content. To this end, the work uses two existing document clustering algorithms, K-means and DBSCAN, and proposes a new clustering algorithm, E-DBSCAN. The experimental results show that E-DBSCAN gives better clustering accuracy than the other two algorithms.
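
To make the similarity-then-grouping idea concrete, the following is a minimal sketch of content-based clustering with standard DBSCAN, one of the two baseline algorithms named in the abstract. It is not the proposed E-DBSCAN, whose details the abstract does not give. The TF-IDF weighting, the whitespace tokenizer, the cosine distance, the sample documents, and the values of `eps` and `min_pts` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn raw strings into sparse TF-IDF dicts (assumed weighting scheme)."""
    tokenized = [doc.lower().split() for doc in docs]  # naive whitespace tokenizer
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed inverse document frequency, so no term gets zero weight.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def dbscan(vectors, eps, min_pts):
    """Standard DBSCAN over cosine distance (1 - similarity); label -1 is noise."""
    n = len(vectors)
    labels = [None] * n

    def neighbours(i):
        # Brute-force region query; includes the point itself.
        return [
            j for j in range(n)
            if 1.0 - cosine_similarity(vectors[i], vectors[j]) <= eps
        ]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point, keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical sample documents, for illustration only.
docs = [
    "clustering groups similar documents",
    "document clustering groups similar text documents",
    "search engines rank web pages",
    "web search engines rank pages quickly",
    "concrete aggregate strength",
]
labels = dbscan(tfidf_vectors(docs), eps=0.7, min_pts=2)
print(labels)
```

With these assumed settings, the two documents about clustering fall into one cluster, the two about search engines into another, and the unrelated fifth document is labelled as noise. Unlike K-means, DBSCAN does not need the number of clusters in advance, but its result depends on the choice of `eps` and `min_pts`.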

Authors and Affiliations

Ms. R. Janani, Dr. S. Vijayarani

How To Cite

Ms. R. Janani, Dr. S. Vijayarani (2016). An Efficient Algorithm for Document Clustering in Information Retrieval. International Journal for Research in Applied Science and Engineering Technology (IJRASET), 4(12), -. https://europub.co.uk/articles/-A-22903