Named entities identification

Journal Title: Romanian Journal of Human - Computer Interaction - Year 2014, Vol 7, Issue 4

Abstract

Recognizing named entities in text is an important topic in natural language processing. This article describes a novel approach to detecting named entities that aims to improve on the results obtained with the named entity recognition module of the Stanford NLP library. To detect and classify named entities, the new model uses a Naive Bayes classifier. Our method focuses on named entities of type person and organization, but it can easily be extended to other entity types. The training data consists of manually annotated text, text annotated with the Stanford NLP toolkit, and a set of XML files containing rules that describe different patterns. After training, we use the Naive Bayes classifier to classify new entities. As test data we use a Reuters collection of approximately 25,000 articles, of which 150 were manually annotated and used as training data. To evaluate the method, we compute precision, recall, and the F1 score.
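The classification and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature set, so the feature strings (`prev=mr`, `suffix=inc`, and so on), the toy training samples, and all function names below are assumptions made purely for illustration. The sketch uses a multinomial Naive Bayes with add-one smoothing and computes precision, recall, and F1 for a single entity type.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """Count labels and per-label features from (features, label) pairs."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for features, label in samples:
        label_counts[label] += 1
        for feature in features:
            feature_counts[label][feature] += 1
            vocabulary.add(feature)
    return label_counts, feature_counts, vocabulary


def predict(model, features):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    label_counts, feature_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)  # log prior
        denom = sum(feature_counts[label].values()) + len(vocabulary)
        for feature in features:
            score += math.log((feature_counts[label][feature] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def precision_recall_f1(gold, predicted, target):
    """Precision, recall, and F1 for one entity type."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == target and p == target)
    fp = sum(1 for g, p in pairs if g != target and p == target)
    fn = sum(1 for g, p in pairs if g == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical toy training data: context features for entity candidates.
TRAIN = [
    (["prev=mr", "capitalized"], "PERSON"),
    (["prev=said", "capitalized"], "PERSON"),
    (["suffix=inc", "capitalized"], "ORGANIZATION"),
    (["suffix=corp", "capitalized"], "ORGANIZATION"),
]

model = train_naive_bayes(TRAIN)
print(predict(model, ["prev=mr", "capitalized"]))
print(predict(model, ["suffix=corp"]))
```

In a setup like the one the abstract describes, the training pairs would come from the manually annotated articles, the Stanford NLP annotations, and the XML pattern rules, and `precision_recall_f1` would be called once per entity type on the held-out annotations.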

Authors and Affiliations

Liviu Sebastian Matei, Ştefan Trăuşan-Matu

How To Cite

Liviu Sebastian Matei, Ştefan Trăuşan-Matu (2014). Named entities identification. Romanian Journal of Human - Computer Interaction, 7(4), -. https://europub.co.uk/articles/-A-28955