Text Mining For Retrieving The Vital Information
Journal Title: International Journal of Research in Computer and Communication Technology - Year 2014, Vol 3, Issue 1
Abstract
A huge amount of data is being collected in data repositories today. Typically there is an enormous gap between the stored data and the information that could be derived from it. This transformation does not occur automatically, and that is where Data Mining (DM) comes into the picture. In Data Analysis (DA), some initial knowledge about the data is already available, but DM can provide more in-depth knowledge about it. Discovering knowledge in enormous data sets is one of the most desired capabilities of DM. Manual DA has been around for some time, but it becomes a bottleneck for large-scale analysis. Fast-emerging computer science techniques and methodologies generate new demands to mine complex data types. A number of DM methods, such as association rule mining, clustering and classification, have been developed to mine this huge amount of data. Earlier studies on DM focused on structured data, such as relational and transactional data. In reality, however, a considerable portion of the available data is stored in text databases (TDs), or document databases, which consist of large collections of documents from various sources, such as articles, books, web pages and digital libraries. TDs are growing rapidly due to the increasing amount of information available in electronic form, such as e-publications, e-mail and the World Wide Web. Data stored in TDs is mostly semi-structured, i.e., it is neither completely unstructured nor completely structured. For example, a document may contain a few structured fields, such as title, authors, publication date and category, but also some largely unstructured text components, such as the abstract and contents.
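The semi-structured document described at the end of the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' representation: the class name `TextDocument`, the `tokens` helper and all sample values are hypothetical, and only the field names (title, authors, publication date, category, abstract, contents) come from the abstract.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass
class TextDocument:
    """Hypothetical record for one entry in a text database."""

    # Structured fields named in the abstract
    title: str
    authors: list[str]
    publication_date: date
    category: str
    # Largely unstructured text components
    abstract: str = ""
    contents: str = ""

    def tokens(self) -> list[str]:
        """Return lowercased word tokens from the unstructured parts only."""
        text = f"{self.abstract} {self.contents}".lower()
        return re.findall(r"[a-z0-9]+", text)


# Placeholder values for illustration; none of them come from the paper.
doc = TextDocument(
    title="Example title",
    authors=["A. Author"],
    publication_date=date(2014, 1, 1),
    category="text mining",
    abstract="Text databases are growing rapidly.",
    contents="Most stored data is semi-structured.",
)

print(doc.category)   # structured field, directly queryable
print(doc.tokens())   # unstructured text must be tokenized before mining
```

The structured fields can be filtered like columns in a relational table, while the abstract and contents need a preprocessing step such as the simple tokenizer above before clustering, classification or association rule mining can be applied.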
Authors and Affiliations
K. Sreerama Murthy, Dr G. Samuel Varaprasad Raju, Dr C. Sunil Kumar