ENHANCED NEIGHBORHOOD NORMALIZED POINTWISE MUTUAL INFORMATION ALGORITHM FOR CONSTRAINT AWARE DATA CLUSTERING
Journal Title: ICTACT Journal on Soft Computing - Year 2016, Vol 6, Issue 4
Abstract
Clustering similar data items is an important technique for mining useful patterns, and training or learning is an important step in improving clustering performance. A constraint-learning, semi-supervised methodology is proposed that incorporates an SVM and a Normalized Pointwise Mutual Information computation strategy to increase both the relevance and the efficiency of clustering. A hard-margin SVM classifier roughly classifies the initial set. A recursive re-clustering approach that incorporates the ENNPI algorithm is then proposed to achieve a higher degree of relevance in the final clustered set. An overall F-Measure of 94.09% is achieved, which is higher than that of the existing algorithms used for comparison.
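For readers unfamiliar with the measure named in the abstract, the following is a minimal sketch of standard normalized pointwise mutual information computed from co-occurrence counts. It is not the ENNPI algorithm itself, whose neighborhood enhancement is not described in the abstract; the function name `npmi` and its count-based parameters are assumptions made for illustration.

```python
import math


def npmi(count_xy, count_x, count_y, total):
    """Standard NPMI from raw counts (illustrative, not the paper's ENNPI).

    count_xy: observations containing both x and y
    count_x, count_y: observations containing x, respectively y
    total: total number of observations

    The result lies in [-1, 1]: -1 when x and y never co-occur,
    0 when they are independent, 1 when they always co-occur.
    """
    if count_xy == 0:
        return -1.0  # limiting value when the pair never co-occurs
    p_xy = count_xy / total
    p_x = count_x / total
    p_y = count_y / total
    if p_xy == 1.0:
        return 1.0  # avoid dividing by -log(1) = 0
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    print(npmi(count_xy=10, count_x=10, count_y=10, total=100))
    print(npmi(count_xy=25, count_x=50, count_y=50, total=100))
```

Normalizing by -log p(x, y) bounds the score, which makes values comparable across item pairs with very different frequencies; that property is the usual reason to prefer NPMI over raw PMI as a relevance measure.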
Authors and Affiliations
Pushpa C N, Gerard Deepak, Mohammed Zakir, Thriveni J, Venugopal K R
ONTOLOGY EXTRACTION FOR AGRICULTURE DOMAIN IN MARATHI LANGUAGE USING NLP TECHNIQUES
An ontology is defined as a shared specification of the conceptual vocabulary used for formulating knowledge-level theories about a domain of discourse. A dataset is created by manually collecting information about different diseas...
RELIABLE COGNITIVE DIMENSIONAL DOCUMENT RANKING BY WEIGHTED STANDARD CAUCHY DISTRIBUTION
Categorization of cognitively uniform and consistent documents, such as University question papers, is in demand among e-learners. The literature indicates that the Standard Cauchy distribution and the derived values are extensively...
MISSING VALUE IMPUTATION AND NORMALIZATION TECHNIQUES IN MYOCARDIAL INFARCTION
Missing data imputation is an important research topic in data mining. In general, real data contains missing values. The presence of missing values in a data set poses a major problem for precise prediction. The obje...
OPTIMIZATION OF GRID RESOURCE SCHEDULING USING PARTICLE SWARM OPTIMIZATION ALGORITHM
The job allocation process is one of the major issues in a grid environment and is one of the research areas in Grid Computing. Hence a new area of research has developed to design optimal methods. It focuses on new heuristic...
PRESENTING SEARCH RESULT WITH REDUCED UNWANTED WEB ADDRESSES USING FUZZY BASED APPROACH
Big Data is now the most talked-about research subject. Over the years, with the expansion of the internet and of storage space, vast swaths of data have become available to would-be searchers. About a decade ago, when content was searched...