A New Link Based Approach for Categorical Data Clustering

Journal Title: UNKNOWN - Year 2012, Vol 1, Issue 3

Abstract

Data clustering is one of the fundamental tools for understanding the structure of a data set. It plays a crucial, foundational role in machine learning, data mining, information retrieval and pattern recognition. Conventional categorical data clustering produces incomplete results because the information it works from is itself incomplete. This paper presents a new link-based approach that improves categorical clustering by discovering unknown entries through the similarity between clusters in an ensemble. It proposes an algorithm called Weighted Triple-Quality (WTQ), which uses the k-means algorithm for the base clustering. Once the base clusterings have been produced, consensus functions are applied to obtain cluster ensembles of the categorical data. This categorical data is then converted into a refined matrix, and a graph partitioning technique is applied to a weighted bipartite graph to obtain the final clustering result. Experimental results on multiple real data sets suggest that the proposed link-based method almost always outperforms both conventional clustering algorithms for categorical data and well-known cluster ensemble techniques.
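The abstract gives no formulas, so the following is only a minimal sketch of how a refined matrix might be built from a cluster ensemble using link-based similarity between clusters. Everything concrete here is an assumption for illustration: the function name `refined_matrix`, the `decay` parameter, the use of Jaccard overlap as the edge weight between clusters, and the particular WTQ scoring rule (each shared neighbour contributes the inverse of its total edge weight). The base clusterings are passed in as plain label lists rather than produced by k-means, and the final bipartite graph partitioning step is not shown.

```python
from itertools import combinations


def refined_matrix(ensemble, decay=0.9):
    """Build an object-by-cluster matrix from an ensemble of base clusterings.

    ensemble: list of label lists, one per base clustering, all of length n.
    decay:    hypothetical confidence factor in (0, 1] for estimated entries.
    Returns (cluster_ids, rows), where rows[i][j] relates object i to cluster j.
    """
    n = len(ensemble[0])

    # Identify each cluster by (clustering index, label); store its members.
    clusters = {}
    for m, labels in enumerate(ensemble):
        for i, lab in enumerate(labels):
            clusters.setdefault((m, lab), set()).add(i)
    ids = list(clusters)

    # Assumed edge weight between two clusters: Jaccard overlap of members.
    weight = {c: {} for c in ids}
    for a, b in combinations(ids, 2):
        shared_members = len(clusters[a] & clusters[b])
        if shared_members:
            w = shared_members / len(clusters[a] | clusters[b])
            weight[a][b] = w
            weight[b][a] = w

    # Assumed WTQ score: every neighbour shared by two clusters adds
    # 1 / (total edge weight of that neighbour).
    total = {c: sum(weight[c].values()) for c in ids}
    wtq = {}
    for a, b in combinations(ids, 2):
        common = weight[a].keys() & weight[b].keys()
        wtq[(a, b)] = wtq[(b, a)] = sum(1.0 / total[k] for k in common)
    wtq_max = max(wtq.values(), default=0.0)

    def similarity(a, b):
        if a == b:
            return 1.0
        if wtq_max == 0.0:
            return 0.0
        return decay * wtq[(a, b)] / wtq_max

    # Known memberships stay 1; every other entry is estimated from the
    # similarity between the object's own cluster (in the same base
    # clustering) and the cluster in question.
    rows = []
    for i in range(n):
        row = []
        for c in ids:
            own = (c[0], ensemble[c[0]][i])
            row.append(1.0 if c == own else similarity(own, c))
        rows.append(row)
    return ids, rows
```

Calling `refined_matrix([[0, 0, 1, 1], [0, 0, 0, 1]])` returns four cluster identifiers and a 4 x 4 matrix in which known memberships are 1 and the remaining entries are similarity estimates between 0 and 1, rather than the zeros a plain binary membership matrix would contain.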

How To Cite

(2012). A New Link Based Approach for Categorical Data Clustering. UNKNOWN, 1(3), -. https://europub.co.uk/articles/-A-333852