Computational Intelligence Methods for Clustering of SenseTagged Nepali Documents

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2015, Vol 17, Issue 1

Abstract

This paper presents a method that hybridizes the self-organizing map (SOM), particle swarm optimization (PSO), and the k-means clustering algorithm for document clustering. Document representation is an important step for clustering purposes. The common way to represent a text is the bag-of-words approach. This approach is simple but has two drawbacks, namely synonymy and polysemy, which arise from the ambiguity of words and the lack of information about the relations between them. To avoid these drawbacks, words are tagged with their WordNet senses in this paper. Sense tagging provides the exact senses of words. Feature vectors are generated from the sense-tagged documents, and clustering is carried out using the proposed hybrid SOM+PSO+K-means algorithm. In the proposed algorithm, SOM is first applied to the feature vectors to produce prototypes, and the K-means clustering algorithm is then applied to cluster the prototypes. The particle swarm optimization algorithm is used to find the initial centroids for the K-means algorithm. Text documents in the Nepali language are used to test the hybrid SOM+PSO+K-means clustering algorithm.
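The pipeline described in the abstract (SOM prototypes, then PSO-chosen initial centroids, then K-means over the prototypes) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The grid size, learning-rate schedule, PSO coefficients, the use of sum of squared errors as the PSO fitness, and all function names are assumptions made for illustration, and the input `X` stands in for feature vectors that would come from sense-tagged documents.

```python
import numpy as np

def train_som(X, grid=(3, 3), epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM; return its prototype (weight) vectors."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    W = X[rng.integers(0, len(X), rows * cols)].astype(float)  # init from samples
    total, step = epochs * len(X), 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3     # shrinking neighbourhood
            bmu = np.argmin(((W - X[i]) ** 2).sum(axis=1))  # best matching unit
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))     # Gaussian neighbourhood
            W += lr * h[:, None] * (X[i] - W)
            step += 1
    return W

def sse(P, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    d2 = ((P[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

def pso_centroids(P, k, n_particles=10, iters=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for k initial centroids; each particle encodes one centroid set."""
    rng = np.random.default_rng(seed)
    pos = P[rng.integers(0, len(P), (n_particles, k))].astype(float)
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_cost = np.array([sse(P, p) for p in pos])
    g = pbest[pbest_cost.argmin()].copy()            # global best centroid set
    g_cost = pbest_cost.min()
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = pos + vel
        cost = np.array([sse(P, p) for p in pos])
        better = cost < pbest_cost
        pbest[better] = pos[better]
        pbest_cost[better] = cost[better]
        if pbest_cost.min() < g_cost:
            g_cost = pbest_cost.min()
            g = pbest[pbest_cost.argmin()].copy()
    return g

def kmeans(P, centroids, iters=50):
    """Lloyd's K-means started from the given centroids."""
    centroids = centroids.copy()
    for _ in range(iters):
        d2 = ((P[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new = centroids.copy()
        for j in range(len(centroids)):
            if np.any(labels == j):                  # keep old centroid if empty
                new[j] = P[labels == j].mean(axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    labels = ((P[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
    return centroids, labels

def hybrid_cluster(X, k, seed=0):
    """SOM prototypes -> PSO initial centroids -> K-means -> document labels."""
    prototypes = train_som(X, seed=seed)
    init = pso_centroids(prototypes, k, seed=seed)
    centroids, _ = kmeans(prototypes, init)
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d2.argmin(axis=1)
```

Calling `hybrid_cluster(X, k)` on a document-by-feature matrix returns the final centroids and one cluster label per document. Clustering the SOM prototypes rather than the raw vectors keeps the PSO and K-means stages small, since they operate on a handful of prototypes instead of every document.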

Authors and Affiliations

Sunita Sarkar, Arindam Roy, Bipul Syam Purkayastha

How To Cite

Sunita Sarkar, Arindam Roy, Bipul Syam Purkayastha (2015). Computational Intelligence Methods for Clustering of SenseTagged Nepali Documents. IOSR Journals (IOSR Journal of Computer Engineering), 17(1), 83-89. https://europub.co.uk/articles/-A-127150