Comparing PMI-based to Cluster-based Arabic Single Document Summarization Approaches
Journal Title: INTERNATIONAL JOURNAL OF ENGINEERING TRENDS AND TECHNOLOGY - Year 2014, Vol 11, Issue 8
Abstract
In this paper, two extractive techniques are applied to the Arabic single-document summarization (SDS) problem: the first uses K-Means clustering, and the second uses mutual information (MI), which is widely used in text mining to measure the co-occurrence of two words. A successful Arabic document summarization algorithm should identify the noteworthy sentences in a document as accurately as possible. The terms used in the document (its distinct words) represent the document's identity, and a Term-Sentence Matrix (TSM) is used instead of a Bag of Words (BoW) representation. In the first approach, the text themes are extracted using K-Means, and then one sentence per cluster is chosen for the summary using TF-IDF weights. In the second approach, pointwise mutual information (PMI) is used to assign a weight to each cell of the TSM, and the resulting weighted matrix is used to extract a summary of the document. Experiments show that the cluster-based approach performs slightly better than the PMI-based one, but if the end user can tune the summary percentage to an appropriate level, the PMI-based approach performs slightly better.
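As a rough illustration of PMI weighting on a term-sentence matrix, the sketch below computes PMI(t, s) = log2(p(t, s) / (p(t) p(s))) from raw term counts. The abstract does not give the exact formulation, so the probability estimates, the whitespace tokenizer, the choice to leave zero-count cells at 0, and the function name `pmi_term_sentence_matrix` are all assumptions made for illustration, not the author's method.

```python
import math


def pmi_term_sentence_matrix(sentences):
    """Return (terms, pmi) where pmi[i][j] weights term i in sentence j.

    Assumption: probabilities are estimated from raw counts in the TSM,
    and cells with a zero count are left at 0.0.
    """
    # Whitespace tokenization is a placeholder; real Arabic text would
    # need proper normalization and stemming.
    tokenized = [s.split() for s in sentences]
    terms = sorted({t for toks in tokenized for t in toks})
    index = {t: i for i, t in enumerate(terms)}

    # Build the raw term-sentence count matrix (rows: terms, columns: sentences).
    counts = [[0] * len(sentences) for _ in terms]
    for j, toks in enumerate(tokenized):
        for t in toks:
            counts[index[t]][j] += 1

    total = sum(sum(row) for row in counts)
    term_totals = [sum(row) for row in counts]
    sent_totals = [sum(row[j] for row in counts) for j in range(len(sentences))]

    # PMI(t, s) = log2( p(t, s) / (p(t) * p(s)) ), with counts as estimates.
    pmi = [[0.0] * len(sentences) for _ in terms]
    for i in range(len(terms)):
        for j in range(len(sentences)):
            c = counts[i][j]
            if c > 0:
                pmi[i][j] = math.log2(c * total / (term_totals[i] * sent_totals[j]))
    return terms, pmi


terms, pmi = pmi_term_sentence_matrix(["a b", "a c"])
```

In this toy input, the term shared by both sentences gets a weight of 0, while the terms unique to one sentence get a positive weight. How the weighted matrix is then turned into a summary is not described in the abstract, so that step is left out here.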
Authors and Affiliations
Madeeh Nayer El-Gedawy