Automatic Title Generation in Scientific Articles for Authorship Assistance: A Summarization Approach
Journal Title: Journal of ICT Research and Applications - Year 2017, Vol 11, Issue 3
Abstract
This paper presents a study on automatic title generation for scientific articles that takes into account sentence information types known as rhetorical categories. A title can be seen as a high-compression summary of a document. A rhetorical category is the type of information the author conveys in a given textual unit, for example the background, method, or result of the research. The experiment in this study focused on extracting research purpose and research method information for inclusion in a computer-generated title. Sentences are first classified into rhetorical categories and then filtered using three methods. Three title candidates whose contents reflect the filtered sentences are then generated using either a template-based or an adaptive K-nearest neighbor approach. The experiment was conducted on datasets from two domains: computational linguistics and chemistry. Compared to the original titles, the computer-generated titles obtained average F1-measure scores between 0.109 and 0.255. In a human evaluation, the automatically generated titles were deemed ‘relatively acceptable’ in the computational linguistics domain and ‘not acceptable’ in the chemistry domain. It can be concluded that rhetorical categories have unexplored potential to improve the performance of summarization tasks in general.
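To make the reported F1-measure concrete, the following is a minimal sketch of how a generated title could be scored against an original title. The abstract does not specify how the paper computes the score, so the unigram-overlap definition, the simple tokenization, and the function names `tokenize` and `title_f1` are assumptions made only for illustration.

```python
from collections import Counter

# Punctuation stripped from the edges of each token (an illustrative choice).
_PUNCT = ".,:;!?()[]\"'"


def tokenize(title):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(_PUNCT) for tok in title.lower().split())
    return [tok for tok in tokens if tok]


def title_f1(generated, reference):
    """Unigram-overlap F1 between a generated title and a reference title.

    Assumption: precision and recall are computed over word counts,
    which may differ from the exact measure used in the paper.
    """
    gen = Counter(tokenize(generated))
    ref = Counter(tokenize(reference))
    # Multiset intersection counts each shared word at most min(count) times.
    overlap = sum((gen & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(gen.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(title_f1("Automatic title generation",
                   "Title generation for scientific articles"))
```

In this sketch, a generated title that shares two of its three words with a five-word reference has precision 2/3 and recall 2/5, which is why short, partially overlapping titles tend to receive modest scores.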
Authors and Affiliations
Masayu Leylia Khodra