Correlation Preserved Indexing Based Approach For Document Clustering

Journal Title: International Journal of Advanced Research in Computer Engineering & Technology (IJARCET) - Year 2013, Vol 2, Issue 2

Abstract

Document clustering is the act of collecting similar documents into clusters, where similarity is some function defined on the documents. The document clustering method (1) achieves high accuracy, (2) allows document frequency to be calculated, and (3) calculates term weights from the term frequency vector. Document clustering is closely related to the concept of data clustering; it is a more specific technique for unsupervised document organization, automatic topic extraction, and fast information retrieval or filtering. Clustering methods can be used to automatically group retrieved documents into a list of meaningful categories. The correlation preserving indexing method is used to find the correlation between documents. The Term Frequency-Inverse Document Frequency (TF-IDF) method is used to weight how often words occur in each document. The disadvantage of this method is its computational complexity. This paper introduces a Significant Score Calculation method, in which the similarity between words is calculated using the WordNet tool so that related words can be identified. An accuracy of 98% is obtained when significant score calculation is used for correlation preserving indexing.
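
To make the TF-IDF weighting and the idea of correlation between documents concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names `tf_idf_vectors` and `correlation`, the whitespace tokenizer, the particular TF-IDF variant (length-normalized term frequency times log(N/df)), and the sample documents are all assumptions made for illustration. Plain Pearson correlation stands in for the correlation measure; the correlation preserving indexing step and the WordNet-based significant score are not reproduced.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for whitespace-tokenized documents."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


def correlation(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    norm = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    return sum(a * b for a, b in zip(dx, dy)) / norm if norm else 0.0


# Hypothetical sample documents, chosen only for illustration.
docs = [
    "document clustering groups similar documents",
    "document clustering groups related documents",
    "fingerprint matching identifies suspects",
    "shortest path routing in road networks",
]
vocab, vectors = tf_idf_vectors(docs)
sim_01 = correlation(vectors[0], vectors[1])  # two documents on the same topic
sim_02 = correlation(vectors[0], vectors[2])  # two documents on different topics
print(f"corr(doc0, doc1) = {sim_01:.3f}")
print(f"corr(doc0, doc2) = {sim_02:.3f}")
```

In this sketch the two documents about clustering correlate more strongly with each other than with the fingerprint document, which is the property a correlation-based clustering step would rely on. Note that "similar" and "related" are treated as unrelated terms here; identifying such related words is the gap the abstract says the WordNet-based score is meant to close.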

Authors and Affiliations

Meena. S. U, P. Parthasarathi

EP ID EP120560