A Survey on Improving the Clustering Performance in Text Mining for Efficient Information Retrieval
Journal Title: INTERNATIONAL JOURNAL OF ENGINEERING TRENDS AND TECHNOLOGY - Year 2014, Vol 8, Issue 5
Abstract
In recent years, the growth of information systems in fields such as business, academia and medicine has steadily increased the amount of stored data. The vast majority of this data is stored in documents that are virtually unstructured. Text mining helps people process large volumes of information by imposing structure on text. Clustering is a popular technique for automatically organizing a large collection of text. In real application domains, however, the experimenter often possesses background knowledge that can help in clustering the data. Traditional clustering techniques are poorly suited to multiple data types and cannot handle sparse, high-dimensional data. Co-clustering techniques overcome these limitations by clustering documents and words simultaneously, which addresses both deficiencies. Semantic understanding has become an essential ingredient of information extraction, and it is obtained by adopting constraints as a semi-supervised learning strategy. This survey reviews the constrained co-clustering strategies adopted by researchers to boost clustering performance. Experimental results on the 20-Newsgroups dataset show that the proposed method is effective for clustering textual documents. Furthermore, the proposed algorithm consistently outperformed all the existing constrained clustering and co-clustering methods under different conditions.
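To make the idea of constraints as background knowledge concrete, the following is a minimal sketch and not the method surveyed in the paper. It assumes the constraints take the common pairwise form of must-link and cannot-link pairs, which the abstract does not specify, and it assumes they can be stated on both documents and words because co-clustering labels both. The function name count_violations and all toy labels and pairs are hypothetical.

```python
def count_violations(labels, must_link, cannot_link):
    """Count the pairwise constraints that a cluster assignment breaks.

    labels      -- dict mapping an item (document or word) to its cluster id
    must_link   -- pairs that background knowledge says belong together
    cannot_link -- pairs that background knowledge says belong apart
    """
    broken_must = sum(1 for a, b in must_link if labels[a] != labels[b])
    broken_cannot = sum(1 for a, b in cannot_link if labels[a] == labels[b])
    return broken_must + broken_cannot


# Hypothetical co-clustering result: one labelling for documents, one for words.
doc_labels = {"doc_a": 0, "doc_b": 0, "doc_c": 1, "doc_d": 1}
word_labels = {"goal": 0, "match": 0, "cpu": 1, "kernel": 0}

# Hypothetical background knowledge supplied by the experimenter.
doc_must = [("doc_a", "doc_b")]
doc_cannot = [("doc_a", "doc_c")]
word_must = [("cpu", "kernel")]
word_cannot = [("goal", "cpu")]

doc_violations = count_violations(doc_labels, doc_must, doc_cannot)
word_violations = count_violations(word_labels, word_must, word_cannot)

print("document constraint violations:", doc_violations)
print("word constraint violations:", word_violations)
```

In this toy assignment the document labels satisfy both document constraints, while the word labels break the must-link between "cpu" and "kernel". A constrained co-clustering method would typically penalize or forbid such violations while it searches for the document and word clusters.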
Authors and Affiliations
S. Saranya, R. Munieswari