Comparison of Performance in Text Mining Using Text Categorization of Semi Structured Data

Abstract

Text mining, or knowledge discovery from text, is a subprocess of data mining that is widely used to discover hidden patterns and significant information in large amounts of unstructured data. The enormous amount of information stored in unstructured or semi-structured data cannot simply be used for further processing by computers, which typically handle text as simple sequences of characters. Therefore, specific pre-processing methods and algorithms are required in order to extract useful patterns. In this study, we compare the classification performance of Bayesian methods, k-NN, decision trees, SVM, and a neural network on the well-known 20_newsgroup dataset from the CMU Text Learning Group Data Archives, a collection of 20,000 messages gathered from 20 different netnews newsgroups. The messages are classified according to their content.
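
As an illustration of the kind of comparison described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a tiny hand-made corpus in place of the 20_newsgroup data, a simple lowercase word tokenizer as the pre-processing step, and only two of the listed classifiers (multinomial Naive Bayes and k-NN with cosine similarity). All names (`tokenize`, `NaiveBayes`, `KNN`, `TRAIN`, `TEST`, `results`) are hypothetical and chosen for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Assumed pre-processing: lowercase, keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(doc):
                if word in self.vocab:  # words unseen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class KNN:
    """k-nearest neighbours with majority vote over cosine similarity."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, docs, labels):
        self.vectors = [Counter(tokenize(d)) for d in docs]
        self.labels = list(labels)
        return self

    def predict(self, doc):
        query = Counter(tokenize(doc))
        ranked = sorted(
            zip(self.vectors, self.labels),
            key=lambda pair: cosine(query, pair[0]),
            reverse=True,
        )
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the true label."""
    return sum(model.predict(d) == y for d, y in zip(docs, labels)) / len(docs)


# Toy stand-in for newsgroup messages (assumption, not the real dataset).
TRAIN = [
    ("the team won the hockey game last night", "sport"),
    ("the goalie made a great save in the game", "sport"),
    ("players scored twice in the final period", "sport"),
    ("the graphics card renders the image quickly", "tech"),
    ("install the driver for the new video card", "tech"),
    ("the software update fixed the image bug", "tech"),
]
TEST = [
    ("the hockey players won the final", "sport"),
    ("the driver update broke the video card", "tech"),
]

train_docs, train_labels = zip(*TRAIN)
test_docs, test_labels = zip(*TEST)

models = {"naive_bayes": NaiveBayes(), "knn": KNN(k=3)}
results = {
    name: accuracy(model.fit(train_docs, train_labels), test_docs, test_labels)
    for name, model in models.items()
}
for name, acc in results.items():
    print(f"{name}: accuracy = {acc:.2f}")
```

On a corpus this small the accuracy figures say nothing about the relative merit of the methods; the sketch only shows the shape of the experiment: one shared pre-processing step, several classifiers trained on the same labelled messages, and a common accuracy measure on held-out messages.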

Authors and Affiliations

M. Nandhiya, Ms. M. Sakthi

How To Cite

M. Nandhiya, Ms. M. Sakthi (2016). Comparison of Performance in Text Mining Using Text Categorization of Semi Structured Data. International Journal for Research in Applied Science and Engineering Technology (IJRASET), 4(9), -. https://europub.co.uk/articles/-A-22659