DBpedia based Ontological Concepts Driven Information Extraction from Unstructured Text
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2017, Vol 8, Issue 9
Abstract
This paper presents a knowledge-base concept-driven named entity recognition (NER) approach. The technique extracts information from news articles and links it with background concepts in a knowledge base, focusing specifically on extracting entity mentions from unstructured articles. Extraction is based on existing concepts from the DBpedia ontology, which represents the knowledge associated with concepts in the Wikipedia knowledge base. A collection of Wikipedia concepts has been extracted through the structured DBpedia ontology. For the unstructured text, Dawn news articles were scraped and preprocessed to build a corpus. Given an article, the proposed system identifies the entity mentions in the text and automatically links them to the corresponding concepts, which represent their respective pages on Wikipedia. The system is evaluated on three test collections of news articles from the politics, sports and entertainment domains, and the results for entity mentions are reported as precision, recall and F-measure. Precision in extracting relevant entity mentions yields the best results, with a little variation in recall and F-measure. Additionally, facts associated with the extracted entity mentions are presented both as sentences and as Resource Description Framework (RDF) triples, to enhance the user’s understanding of the related facts in the article.
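The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give implementation details, so this is not the authors' method: it assumes a tiny hand-written dictionary (the hypothetical `GAZETTEER`) in place of labels loaded from the DBpedia ontology, uses plain case-insensitive string matching to find mentions, and computes the standard set-based precision, recall and F-measure. The function names `link_mentions` and `precision_recall_f1` are illustrative only.

```python
import re

# Hypothetical mini gazetteer: surface label -> DBpedia resource URI.
# Assumption: a real system would load labels from the DBpedia ontology.
GAZETTEER = {
    "Imran Khan": "http://dbpedia.org/resource/Imran_Khan",
    "Lahore": "http://dbpedia.org/resource/Lahore",
    "Pakistan": "http://dbpedia.org/resource/Pakistan",
}


def link_mentions(text, gazetteer=GAZETTEER):
    """Return (mention, uri) pairs for gazetteer labels found in text,
    ordered by their position in the text."""
    found = []
    for label, uri in gazetteer.items():
        # Whole-word, case-insensitive match of the label.
        pattern = r"\b" + re.escape(label) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            found.append((match.start(), match.group(0), uri))
    found.sort()
    return [(mention, uri) for _, mention, uri in found]


def precision_recall_f1(predicted, gold):
    """Set-based precision, recall and F-measure (F1) over entity mentions."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Calling `link_mentions("Imran Khan spoke in Lahore.")` yields the two mentions paired with their resource URIs, and `precision_recall_f1` can then compare the extracted mentions against a manually annotated set for one article.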
Authors and Affiliations
Adeel Ahmed, Syed Saif ur Rahman