Text Summarization and Discovery of Frames and Relationship from Natural Language Text - A R&D Methodology
Journal Title: International Journal on Computer Science and Engineering - Year 2010, Vol 2, Issue 3
Abstract
This paper deals with data mining techniques by which data resources can be fetched and accessed with reduced time complexity. Resource sharing is an important aspect of information science. The retrieval techniques presented are based on the ideas of the binary search tree, the Gantt chart, and text summarization. A theorem is cited regarding the sum of the code lengths of all leaf search terms. Summarization is a hard problem in Natural Language Processing because, to do it properly, a system has to really understand the point of a text. This requires semantic analysis, discourse processing, and inferential interpretation (grouping of the content using world knowledge). The last step is especially complex, because systems without a great deal of world knowledge simply cannot do it. Attempts so far at true abstraction (creating abstracts as summaries) have therefore not been very successful. Fortunately, an approximation called extraction is more feasible today. To create an extract, a system simply needs to identify the most important or central topic(s) of the text and return them to the reader. Although the resulting summary is not necessarily coherent, the reader can form an opinion of the content of the original. Most automated summarization systems today produce extracts only. Another purpose of this paper is to address the problem of information discovery in large collections of text. For users, one of the key problems in working with such collections is determining where to focus their attention. Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if it is available as a relational table that can be used for answering precise queries or for running data mining tasks. We explore a technique for extracting such tables from document collections that requires only a handful of training examples from users.
In this paper we also try to explain how to extract the different kinds of relationships between words with the help of a frame net analysis diagram from annotation layer software.
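The extraction idea described in the abstract (pick the most central sentences and return them unchanged) can be illustrated with a minimal sketch. This is not the authors' method: the word-frequency scoring, the tiny stop-word list, the naive sentence splitter, and the names `split_sentences`, `content_words`, and `extract_summary` are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list (an assumption, not from the paper).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are",
              "it", "on", "for", "that", "this", "with", "as", "by"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence):
    """Lowercased words of a sentence with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOP_WORDS]


def extract_summary(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order.

    A sentence's score is the average document-wide frequency of its
    content words, a simple stand-in for "importance" or "centrality".
    """
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore document order
    return [sentences[i] for i in chosen]
```

Because the selected sentences are copied verbatim, the output may not read as a coherent paragraph, which matches the limitation of extracts noted in the abstract.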
Authors and Affiliations
P. Chakrabarti, J. K. Basu