AN EFFICIENT APPROACH FOR TEXT MINING USING SIDE INFORMATION

Abstract

 In many text mining applications, information is available in the form of text documents. Alongside the text itself, such documents often carry side information, or metadata: document provenance, the title of the document, links in the document, user-access behavior from web logs, or other non-textual attributes. These attributes can contain a large amount of information that is useful for clustering. However, the importance of this side information is difficult to estimate, particularly when some of it is noisy. In such cases, a principled way of performing the mining process is needed, so that the advantages of using the side information are maximized and the quality of the mining process does not degrade. Accordingly, this paper presents an approach that uses side information for clustering with a hierarchical algorithm, and then extends the approach to the classification problem on real data sets.
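The idea of letting side information influence a hierarchical clustering can be illustrated with a small sketch. This is not the algorithm from the paper, which the abstract does not specify. It is a generic average-linkage agglomerative clustering under stated assumptions: the document fields `text` and `side`, the weight `alpha`, the Jaccard measure for side attributes, and the toy data are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def jaccard_distance(a, b):
    """Return 1 - Jaccard similarity between two sets of side attributes."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def combined_distance(d1, d2, alpha):
    """Blend text distance and side-information distance.

    alpha is an assumed weight: 0 ignores side information,
    1 ignores the text. Lowering it limits the effect of noisy attributes.
    """
    text = cosine_distance(Counter(d1["text"].lower().split()),
                           Counter(d2["text"].lower().split()))
    side = jaccard_distance(set(d1["side"]), set(d2["side"]))
    return (1 - alpha) * text + alpha * side


def cluster(docs, k, alpha=0.3):
    """Average-linkage agglomerative clustering into k groups of doc indices."""
    n = len(docs)
    dist = [[combined_distance(docs[i], docs[j], alpha) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                pairs = len(clusters[x]) * len(clusters[y])
                link = sum(dist[i][j]
                           for i in clusters[x] for j in clusters[y]) / pairs
                if best is None or link < best[0]:
                    best = (link, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge the closest pair
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical toy documents: text plus a provenance attribute as side information.
docs = [
    {"text": "stock market prices fall", "side": {"source:finance-site"}},
    {"text": "market shares and stock trading", "side": {"source:finance-site"}},
    {"text": "football match final score", "side": {"source:sports-site"}},
    {"text": "team wins football cup", "side": {"source:sports-site"}},
]
result = cluster(docs, k=2)
```

In this sketch a single weight controls how much the side information counts, which is one simple way to keep noisy attributes from dominating the text signal. A principled method would need to estimate that importance from the data rather than fix it by hand.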

Authors and Affiliations

Kiran V. Gaidhane

Download PDF file
  • EP ID EP112504
  • DOI 10.5281/zenodo.58632

How To Cite

Kiran V. Gaidhane (30).  AN EFFICIENT APPROACH FOR TEXT MINING USING SIDE INFORMATION. International Journal of Engineering Sciences & Research Technology, 5(7), 1137-1148. https://europub.co.uk/articles/-A-112504