Identifying and Extracting Named Entities from Wikipedia Database Using Entity Infoboxes
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2014, Vol 5, Issue 7
Abstract
This paper describes an approach to named entity classification based on Wikipedia article infoboxes. It identifies the three fundamental named entity types: Person, Location, and Organization. An entity is classified by matching the attributes extracted from its article infobox against core entity attributes built from Wikipedia Infobox Templates. Experimental results show that the classifier achieves accuracy and F-measure scores of 97%. Based on this approach, a database of around 1.6 million named entities of these three types was created from the 20140203 Wikipedia dump. Experiments on the CoNLL-2003 shared task named entity recognition (NER) dataset showed the system performing strongly in comparison with three state-of-the-art systems.
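The matching step described in the abstract can be illustrated with a minimal sketch. It assumes the classifier picks the type whose core attribute set shares the most attributes with the article's infobox; the abstract does not state the exact matching rule, so this overlap count is an assumption. The attribute names in `CORE_ATTRIBUTES` and the function `classify_entity` are hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional, Set

# Hypothetical core attribute sets, for illustration only. The paper builds
# its own sets from Wikipedia Infobox Templates; these names are assumptions.
CORE_ATTRIBUTES: Dict[str, Set[str]] = {
    "Person": {"birth_date", "birth_place", "occupation", "nationality", "spouse"},
    "Location": {"coordinates", "population_total", "area_total_km2", "elevation_m", "timezone"},
    "Organization": {"founded", "headquarters", "industry", "num_employees", "key_people"},
}


def classify_entity(infobox_attributes: Set[str]) -> Optional[str]:
    """Return the entity type whose core attributes overlap most with the
    infobox attributes, or None when no core attribute matches at all."""
    # Normalize attribute names so that "Birth_Date " matches "birth_date".
    normalized = {attr.strip().lower() for attr in infobox_attributes}

    best_type: Optional[str] = None
    best_overlap = 0
    for entity_type, core in CORE_ATTRIBUTES.items():
        overlap = len(normalized & core)
        # Strict comparison: on a tie, the type listed first is kept.
        if overlap > best_overlap:
            best_type, best_overlap = entity_type, overlap
    return best_type


if __name__ == "__main__":
    print(classify_entity({"name", "birth_date", "occupation"}))
    print(classify_entity({"coordinates", "population_total"}))
    print(classify_entity({"unrelated_field"}))
```

A real system would likely need a minimum-overlap threshold and a tie-breaking policy; both are omitted here to keep the sketch short.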
Authors and Affiliations
Muhidin Mohamed, Mourad Oussalah