Comparison of Performance in Text Mining Using Text Categorization of Semi Structured Data

Abstract

Text mining, or knowledge discovery, is the subprocess of data mining that is widely used to discover hidden patterns and significant information in large volumes of unstructured data. The enormous amount of information stored in unstructured and semi-structured data cannot be processed directly by computers, which typically handle text as simple sequences of characters. Specific pre-processing methods and algorithms are therefore required to extract useful patterns. In this study, we compare the classification performance of Bayesian methods, k-NN, decision trees, SVM, and a neural network on the well-known 20_newsgroup dataset from the CMU Text Learning Group Data Archives, a collection of 20,000 messages gathered from 20 different netnews newsgroups. The messages are classified according to their content.
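The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only two of the five classifier families named above (a multinomial naive Bayes model and k-NN with cosine similarity), uses a bag-of-words pre-processing step chosen here for illustration, and runs on a four-message toy corpus invented for this example rather than on the 20_newsgroup data. All function and class names (`tokenize`, `NaiveBayes`, `KNN`, `accuracy`) are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Pre-processing step: lowercase the text and keep alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(doc):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class KNN:
    """k-nearest-neighbour classifier with majority vote over cosine similarity."""

    def __init__(self, k=1):
        self.k = k

    def fit(self, docs, labels):
        self.train = [(Counter(tokenize(d)), l) for d, l in zip(docs, labels)]
        return self

    def predict(self, doc):
        vec = Counter(tokenize(doc))
        nearest = sorted(self.train, key=lambda t: cosine(vec, t[0]), reverse=True)
        votes = Counter(label for _, label in nearest[: self.k])
        return votes.most_common(1)[0][0]


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the true label."""
    hits = sum(model.predict(d) == l for d, l in zip(docs, labels))
    return hits / len(docs)


# Toy corpus (invented for illustration; labels borrow two newsgroup names).
train_docs = [
    "The shuttle launch was delayed by the orbit window",
    "NASA plans a new orbit for the satellite",
    "My car engine needs new brakes and tires",
    "The dealer sold me a car with a bad engine",
]
train_labels = ["sci.space", "sci.space", "rec.autos", "rec.autos"]
test_docs = ["satellite launch into orbit", "car brakes and engine repair"]
test_labels = ["sci.space", "rec.autos"]

models = {"naive Bayes": NaiveBayes(), "k-NN (k=1)": KNN(k=1)}
results = {
    name: accuracy(model.fit(train_docs, train_labels), test_docs, test_labels)
    for name, model in models.items()
}
for name, score in results.items():
    print(f"{name}: accuracy = {score:.2f}")
```

On a corpus this small the two accuracy figures say nothing about the relative merit of the methods; the sketch only shows the shape of the experiment: convert raw character strings into word counts, train each classifier on the same labelled messages, and score each on the same held-out messages.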

Authors and Affiliations

M. Nandhiya, Ms. M. Sakthi

How To Cite

M. Nandhiya, Ms. M. Sakthi (2016). Comparison of Performance in Text Mining Using Text Categorization of Semi Structured Data. International Journal for Research in Applied Science and Engineering Technology (IJRASET), 4(9), -. https://europub.co.uk/articles/-A-22659