Text Mining For Retrieving The Vital Information

Journal Title: International Journal of Research in Computer and Communication Technology - Year 2014, Vol 3, Issue 1

Abstract

A huge amount of data is collected in data repositories today. Typically there is an enormous gap between the stored data and the information that could be derived from it. This transition does not occur automatically, and that is where Data Mining (DM) comes into the picture. In Data Analysis (DA), some initial knowledge about the data is already known, but DM can provide more in-depth knowledge about the data. Discovering knowledge in enormous volumes of data is one of the most desired capabilities of DM. Manual DA has been around for some time, but it becomes a bottleneck for large-scale analysis. Rapidly emerging computer science techniques and methodologies generate new demands to mine complex data types. A number of DM methods, such as Association Rule mining, Clustering and Classification, have been developed to mine this huge amount of data. Earlier studies on DM focused on structured data, such as relational and transactional data. In reality, however, a considerable portion of the available data is stored in text databases or document databases, which consist of large collections of documents from various sources, such as articles, books, web pages and digital libraries. Text databases (TDs) are growing rapidly due to the increasing amount of information available in electronic form, such as e-publications, e-mail and the World Wide Web. Data stored in TDs is mostly semi-structured, i.e., it is neither completely unstructured nor completely structured. For example, a document may contain a few structured fields, such as title, authors, publication date and category, but also some largely unstructured text components, such as the abstract and contents.
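
The semi-structured document described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation; the `Document` class, the `tokenize` helper and the sample record are all hypothetical names and values introduced here only to show how structured fields and free text might sit side by side in one record.

```python
from dataclasses import dataclass
from datetime import date
from typing import List
import re


@dataclass
class Document:
    """Hypothetical semi-structured record (illustrative only)."""
    # Structured fields: fixed names and types, directly comparable.
    title: str
    authors: List[str]
    publication_date: date
    category: str
    # Unstructured fields: free-form text that needs text mining.
    abstract: str
    contents: str


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


# Sample record invented purely for illustration.
doc = Document(
    title="An Example Article",
    authors=["A. Author", "B. Writer"],
    publication_date=date(2014, 1, 15),
    category="text mining",
    abstract="Text databases are growing rapidly.",
    contents="Most stored data is semi-structured text.",
)

# Structured fields support exact comparisons without any text processing.
is_from_2014 = doc.publication_date.year == 2014

# Unstructured fields must first be broken into terms before they can be mined.
terms = tokenize(doc.abstract + " " + doc.contents)
```

Under these assumptions, a query on `publication_date` or `category` is an ordinary field comparison, while the abstract and contents only become searchable after a step such as tokenization.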

Authors and Affiliations

K. Sreerama Murthy, Dr G. Samuel Varaprasad Raju, Dr C. Sunil Kumar

Keywords

IR; FL; Query processing; Inverted Index file; LSAs

EP ID EP27812