AN EFFICIENT APPROACH FOR TEXT MINING USING SIDE INFORMATION

Abstract

In many text mining applications, information is available in the form of text documents. Such documents often carry side information, or metadata, alongside the text itself: document provenance, the title of the document, links in the document, user-access behavior from web logs, or other non-textual attributes. These attributes can contain a large amount of information that is useful for clustering. However, it is difficult to estimate the importance of this side information when some of it is noisy. In such cases, a principled way to perform the mining is needed, one that maximizes the benefit of the side information without lowering the quality of the mining process. Accordingly, this paper presents a solution that uses side information for clustering with a hierarchical algorithm, and then extends it to the classification problem on real data sets.
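The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the paper's method: hierarchical (average-linkage agglomerative) clustering over a distance that blends text dissimilarity with side-information dissimilarity. The document fields `text` and `side`, the weight `alpha`, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_distance(a, b):
    """1 - cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def jaccard_distance(a, b):
    """1 - Jaccard similarity between two sets of side attributes."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def combined_distance(d1, d2, alpha=0.7):
    """Blend text and side-information distances.

    alpha is a hypothetical weight: 1.0 ignores side information,
    lower values let (possibly noisy) side attributes count for more.
    """
    text = cosine_distance(
        Counter(d1["text"].lower().split()), Counter(d2["text"].lower().split())
    )
    side = jaccard_distance(set(d1["side"]), set(d2["side"]))
    return alpha * text + (1 - alpha) * side


def agglomerative(docs, k, alpha=0.7):
    """Average-linkage agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(docs))]
    dist = {
        (i, j): combined_distance(docs[i], docs[j], alpha)
        for i, j in combinations(range(len(docs)), 2)
    }

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist[min(i, j), max(i, j)] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        # Merge the closest pair of clusters.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy documents (invented for this sketch).
docs = [
    {"text": "clustering text documents", "side": ["siteA", "authorX"]},
    {"text": "text documents clustering algorithm", "side": ["siteA"]},
    {"text": "cmos adc pipeline design", "side": ["siteB"]},
    {"text": "pipeline adc in cmos", "side": ["siteB", "authorY"]},
]
result = agglomerative(docs, k=2)
print(result)
```

Keeping `alpha` explicit is one simple way to limit the influence of noisy side information; a principled approach such as the one the abstract refers to would presumably estimate that influence from the data rather than fix it by hand.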

Authors and Affiliations

Kiran V. Gaidhane

  • EP ID EP112504
  • DOI 10.5281/zenodo.58632

How To Cite

Kiran V. Gaidhane (30).  AN EFFICIENT APPROACH FOR TEXT MINING USING SIDE INFORMATION. International Journal of Engineering Sciences & Research Technology, 5(7), 1137-1148. https://europub.co.uk/articles/-A-112504