Challenging Issues and Similarity Measures for Web Document Clustering
Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2015, Vol 17, Issue 1
Abstract
The Web contains a large number of documents in electronic form. These documents come in various formats, and the information in them is not organized. This lack of organization on the WWW motivates the automatic management of the huge amount of information available. Text mining generally refers to the process of extracting interesting and non-trivial information and knowledge from unstructured text. A text mining framework comprises Information Retrieval, Information Extraction, Information Mining, and Interpretation. During Information Retrieval, a large number of web documents are retrieved. How can we find the similar documents among those retrieved? This paper deals with the challenging issues and similarity measures for web document clustering.
Authors and Affiliations
S. Mahalakshmi
Development of Virtual Computing Lab Using Private Cloud
Abstract: Virtual Computing Lab (VCL) is a very effective answer for the educational institution to meet the increasing demand of physical machines, different computational laboratories and large number of users in alimi...
A Novel Rebroadcast Technique for Reducing Routing Overhead In Mobile Ad Hoc Networks
In mobile ad hoc networks (MANETs), the network topology changes frequently and unpredictably due to the arbitrary mobility of nodes. This feature leads to frequent path failures and route reconstructions, ...
An Enhanced Scheme for Hiding Text in Wave Files
Abstract: Steganography is a term applied to any number of processes that embed an object into another object in order to deceive any observer or adversary. An embedding algorithm for hiding messages into wave files or a...
Microcontroller-Based Remote Temperature Monitoring System
Abstract: The death rate in hospitals is increasing due to inadequate attention to patients; an insufficient number of doctors and the poor state of equipment make it difficult for the patients to receive proper...
Segmentation of the Blood Vessel and Optic Disc in Retinal Images Using EM Algorithm
Abstract: Diabetic retinopathy (DR), glaucoma and hypertension are harmful eye diseases that cause pressure on the eye nerve and, finally, blindness. With the invention of new systems and the development of new technolog...