Implementation Of ROCK Clustering Algorithm For The Optimization Of Query Searching Time
Journal Title: International Journal on Computer Science and Engineering - Year 2012, Vol 4, Issue 5
Abstract
Clustering is a data mining technique that groups similar data or queries together, which helps in identifying similar subject areas. The major problem is to identify heterogeneous subject areas where queries are frequently asked. A number of agglomerative clustering algorithms can be used to cluster the data, but they rely on distance measures to calculate similarity. The Robust Clustering Using Links (ROCK) [1] algorithm is therefore better suited to clustering categorical data, because it uses the Jaccard coefficient rather than a distance measure to find the similarity between data items or documents when forming clusters. This similarity-based mechanism for forming clusters is applied to a given set of data. The method forms clusters corresponding to different subject areas, so that prior knowledge about similarity is maintained; this helps to discover accurate and consistent clusters and reduces query response time. The main objective of our work is to implement ROCK [1] and to decrease query response time by searching for documents in the resulting clusters instead of in the whole database. This technique reduces the time needed to search for documents in the database.
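The similarity step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sample documents, the threshold name `theta`, and the helper names `jaccard`, `neighbors`, and `links` are assumptions made for illustration. The sketch treats each document as a set of keywords, marks two documents as neighbors when their Jaccard coefficient reaches `theta`, and counts links as the number of neighbors two documents share. It also assumes a document is not counted as its own neighbor, and it omits the agglomerative merging step of ROCK.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard coefficient |a & b| / |a | b| of two sets (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def neighbors(docs, theta):
    """Map each document index to the other indices with similarity >= theta."""
    nbrs = {i: set() for i in range(len(docs))}
    for i, j in combinations(range(len(docs)), 2):
        if jaccard(docs[i], docs[j]) >= theta:
            nbrs[i].add(j)
            nbrs[j].add(i)
    return nbrs


def links(nbrs):
    """Count the common neighbors of every pair of documents."""
    return {(i, j): len(nbrs[i] & nbrs[j])
            for i, j in combinations(sorted(nbrs), 2)}


# Hypothetical keyword sets standing in for documents from two subject areas.
docs = [
    {"sql", "index", "query"},
    {"sql", "query", "join"},
    {"sql", "index", "join"},
    {"tcp", "socket", "packet"},
    {"tcp", "socket", "router"},
]

nbrs = neighbors(docs, theta=0.5)  # theta is an assumed threshold
link_counts = links(nbrs)
```

In this toy data the three database documents are pairwise neighbors and share links, while no links connect them to the two networking documents. A query matched to one subject area would then only need to be compared against the documents in that cluster.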
Authors and Affiliations
Ashwina Tyagi, Sheetal Sharma