A Survey of Unstructured Text Summarization Techniques
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2014, Vol 5, Issue 4
Abstract
Due to the explosive amounts of text data being created and organizations' increased desire to leverage their data corpora, especially with the availability of Big Data platforms, there is usually not enough time to read and understand each document and make decisions based on document contents. Hence, there is a great demand for summarizing text documents to provide a representative substitute for the original documents. With improved summarization techniques, the precision of document retrieval through search queries against summarized documents is expected to improve in comparison to querying against the full spectrum of original documents. Several generic text summarization algorithms have been developed, each with its own advantages and disadvantages. For example, some algorithms are particularly good at summarizing short documents but not long ones. Others perform well in identifying and summarizing single-topic documents, but their precision degrades sharply with multi-topic documents. In this article we present a survey of the literature in text summarization. We also survey some of the most common methods for evaluating the quality of automated text summarization techniques. Last, we identify some of the challenging problems that are still open, in particular the need for a universal approach that yields good results for mixed types of documents.
Authors and Affiliations
Sherif Elfayoumy, Jenny Thoppil