Usage of Cosine Similarity and term Frequency count for Textual document Clustering

Abstract

This paper presents textual document clustering using two approaches: cosine similarity and term frequency-inverse document frequency (TF-IDF). Combining these approaches yields similarity values both between keywords within the documents and between the documents themselves. From these values, the most closely related documents can be identified using a clustering method called correlation preserving index, in which related documents are stored in an index format.
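As an illustration of how the two measures named in the abstract fit together, the following is a minimal sketch, not the authors' implementation. It assumes whitespace tokenization, raw term counts, and a logarithmic inverse document frequency; the function names `tf_idf_vectors` and `cosine_similarity` and the sample documents are hypothetical. The correlation preserving index step is not described in enough detail here to sketch, so it is left out.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document (raw TF times log IDF).

    Assumption: simple lowercase whitespace tokenization; the paper's
    actual preprocessing and weighting scheme are not given in the abstract.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {term: count * math.log(n_docs / df[term]) for term, count in tf.items()}
        )
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical sample documents, chosen only for illustration.
docs = [
    "cosine similarity for document clustering",
    "document clustering with term frequency",
    "concrete fibre properties",
]
vectors = tf_idf_vectors(docs)
related = cosine_similarity(vectors[0], vectors[1])
unrelated = cosine_similarity(vectors[0], vectors[2])
```

In this sketch the first two documents share the terms "document" and "clustering", so `related` is positive, while the third document shares no terms with the first, so `unrelated` is zero. A clustering step would then group documents whose pairwise similarity is high.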

Authors and Affiliations

B. Sindhuja, Mrs. VeenaTrivedi

How To Cite

B. Sindhuja, Mrs. VeenaTrivedi (2014). Usage of Cosine Similarity and term Frequency count for Textual document Clustering. International Journal of Innovative Research in Computer Science and Technology, 2(5), -. https://europub.co.uk/articles/-A-749039