Enhancing MEDLINE Document Clustering using SSNCUT With MS and GC Constraints
Journal Title: International Journal of Engineering Sciences & Research Technology - Year 30, Vol 3, Issue 3
Abstract
Global content (GC) information from the whole MEDLINE collection and MeSH semantic (MS) information are both considered for clustering biomedical documents. Earlier approaches based on semi-supervised non-negative matrix factorization are inefficient at integrating additional information, because their limited representation space makes it hard to combine different similarities. To overcome this limitation, semi-supervised normalized cut (SSNCUT) and MPCKmeans algorithms are proposed over these similarities with two types of constraints, ML (must-link) and CL (cannot-link). The performance of these algorithms is demonstrated on MEDLINE document clustering. Another interesting finding is that ML constraints work more effectively than CL constraints. We evaluate the proposed method on benchmark datasets, and the results demonstrate consistent and substantial improvements over the current state of the art. Experimental results show that integrating the semantic and content similarities outperforms using only one of the two, and the difference is statistically significant. We further identify the best parameter setting, which is consistent across all experimental conditions. Finally, we show a typical example of the resulting clusters, confirming the effectiveness of our strategy in improving MEDLINE document clustering.
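The idea of combining the two similarities and applying ML and CL constraints before a normalized cut can be illustrated with a minimal sketch. This is not the authors' SSNCUT implementation: the function names, the mixing weight `alpha`, the toy similarity matrices, and the simple strategy of overwriting constrained entries of the similarity matrix are all assumptions made for illustration, and the cut is limited to two clusters.

```python
import numpy as np

def combine_similarities(content_sim, semantic_sim, alpha=0.5):
    """Linearly mix content and semantic similarity (alpha is a hypothetical weight)."""
    return alpha * content_sim + (1.0 - alpha) * semantic_sim

def apply_constraints(W, must_link=(), cannot_link=()):
    """Encode pairwise constraints by overwriting entries of the similarity matrix.

    Assumption: ML pairs get maximal similarity (1.0), CL pairs get none (0.0).
    """
    W = W.copy()
    for i, j in must_link:
        W[i, j] = W[j, i] = 1.0
    for i, j in cannot_link:
        W[i, j] = W[j, i] = 0.0
    return W

def normalized_cut_bipartition(W):
    """Two-way normalized cut via the second eigenvector of the normalized Laplacian."""
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Map back to the generalized eigenvector and split by sign.
    fiedler = d_inv_sqrt * eigvecs[:, 1]
    return (fiedler > 0).astype(int)

# Toy data: four documents, invented similarity values.
content_sim = np.array([[1.0, 0.8, 0.1, 0.2],
                        [0.8, 1.0, 0.3, 0.1],
                        [0.1, 0.3, 1.0, 0.7],
                        [0.2, 0.1, 0.7, 1.0]])
semantic_sim = np.array([[1.0, 0.6, 0.2, 0.1],
                         [0.6, 1.0, 0.2, 0.2],
                         [0.2, 0.2, 1.0, 0.9],
                         [0.1, 0.2, 0.9, 1.0]])

W = combine_similarities(content_sim, semantic_sim, alpha=0.5)
W = apply_constraints(W, must_link=[(0, 1)], cannot_link=[(1, 2)])
labels = normalized_cut_bipartition(W)
```

On this toy input, documents 0 and 1 fall in one cluster and documents 2 and 3 in the other; which cluster receives label 0 or 1 depends on the arbitrary sign of the eigenvector. A multi-way clustering would instead use several eigenvectors followed by a k-means step.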
Authors and Affiliations
V. Aishwarya