An Efficient Text Clustering Approach using Affinity Propagation with weight modification

Journal Title: International Journal on Computer Science and Engineering - Year 2014, Vol 6, Issue 5

Abstract

Text mining has recently emerged as one of the most important fields of data mining, because most web searching is done on the basis of supplied text and because social networks, whose use is growing, rely on text as their major component. Extracting useful information from this text, directly or indirectly, requires an efficient clustering algorithm. The most widely used techniques rely on the vector space model to obtain an equivalent vector for each text. The vector space model represents a text as an n-tuple numeric array (vector) in which each dimension corresponds to a unique word and the value is the weight of that word, computed from term frequency-inverse document frequency (tf-idf). This technique has two problems. First, the number of unique words in a document may be very large, which produces similarly long vectors whose processing requires large amounts of memory and processing power. Second, an analysis may require a grouping biased toward particular categories, which the technique does not address. This paper therefore presents an efficient clustering approach that uses one dimension for each group of words representing a similar area of interest, and that also applies uneven weighting to the dimensions according to the categorical bias desired during clustering. After the vectors are created, clustering is performed using the seeds affinity clustering technique. Finally, to study its performance, the presented algorithm is applied to the benchmark data set Reuters-21578 and compared with the k-means algorithm and the original affinity propagation (AP) algorithm in terms of F-measure, entropy, and execution time. The results show that the presented algorithm outperforms the others by an acceptable margin.
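
The representation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the word groups, the per-dimension weights, and the function names are hypothetical, a normalized term frequency stands in for the tf-idf weighting, and the negative squared Euclidean distance is assumed as the pairwise similarity because it is the usual input to affinity propagation.

```python
from collections import Counter

# Hypothetical word groups: every word in a group shares ONE vector dimension,
# so the vector length equals the number of groups, not the vocabulary size.
WORD_GROUPS = {
    "finance": {"bank", "loan", "profit", "shares"},
    "trade": {"export", "import", "tariff", "wheat"},
    "energy": {"oil", "crude", "barrel", "gas"},
}

# Hypothetical uneven weights: a larger weight biases clustering toward
# differences along that category.
GROUP_WEIGHTS = {"finance": 1.0, "trade": 1.0, "energy": 2.0}


def grouped_vector(text, groups=WORD_GROUPS, weights=GROUP_WEIGHTS):
    """Map a document to one weighted dimension per word group.

    Assumption: normalized term frequency is used instead of tf-idf
    to keep the sketch free of corpus statistics.
    """
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for empty text
    vector = []
    for name in sorted(groups):  # fixed dimension order: energy, finance, trade
        hits = sum(counts[word] for word in groups[name])
        vector.append(weights[name] * hits / total)
    return vector


def similarity(u, v):
    """Negative squared Euclidean distance between two vectors."""
    return -sum((a - b) ** 2 for a, b in zip(u, v))


docs = [
    "oil crude barrel prices",
    "bank loan profit rise",
    "crude oil gas output",
]
vectors = [grouped_vector(d) for d in docs]
# Pairwise similarity matrix that a clustering step would consume.
matrix = [[similarity(u, v) for v in vectors] for u in vectors]
```

In this toy input the two energy documents map to the same vector, so their similarity is the maximum possible value of zero, while the finance document lies farther from both. Raising the hypothetical weight for "energy" widens that gap, which is the kind of categorical bias the abstract refers to.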

Authors and Affiliations

Isha Sharma, Prof. Mahak Motwani

How To Cite

Isha Sharma, Prof. Mahak Motwani (2014). An Efficient Text Clustering Approach using Affinity Propagation with weight modification. International Journal on Computer Science and Engineering, 6(5), 175-180. https://europub.co.uk/articles/-A-99804