Named entities identification

Journal Title: Romanian Journal of Human - Computer Interaction - Year 2014, Vol 7, Issue 4

Abstract

Named entity recognition in text is an important topic in natural language processing. This article describes a novel approach to detecting named entities that tries to improve on the results obtained with the named entity recognition module of the Stanford NLP library. To identify and classify named entities, the new model uses a naive Bayes classifier. Our method focuses on named entities of type person and organization, but it can easily be extended to other types of named entities. As training data we use manually annotated text, text annotated with the Stanford NLP toolkit, and a set of XML files containing rules that describe different patterns. After training, we use the naive Bayes classifier to classify new entities. As test data we use a Reuters collection of approximately 25,000 articles, of which 150 were manually annotated and used as training data. To evaluate the method we compute the precision, the recall and the F1 score.
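
The classification and evaluation steps mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which features or smoothing the model uses, so the feature strings (`prev=mr`, `suffix=inc`, `cap`), the add-one smoothing, and the names `NaiveBayes` and `precision_recall_f1` are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (an assumed choice)."""

    def __init__(self):
        self.label_counts = Counter()               # how often each label was seen
        self.feature_counts = defaultdict(Counter)  # label -> feature -> count
        self.vocab = set()                          # all features seen in training

    def fit(self, samples):
        """samples: iterable of (list_of_feature_strings, label) pairs."""
        for features, label in samples:
            self.label_counts[label] += 1
            for feature in features:
                self.feature_counts[label][feature] += 1
                self.vocab.add(feature)

    def predict(self, features):
        """Return the label with the highest log posterior."""
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for feature in features:
                # Smoothed log likelihood of the feature given the label.
                score += math.log((self.feature_counts[label][feature] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def precision_recall_f1(gold, predicted, label):
    """Precision, recall and F1 for one entity type."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy training data with hypothetical context features for each candidate entity.
train = [
    (["prev=mr", "cap"], "PERSON"),
    (["prev=dr", "cap"], "PERSON"),
    (["suffix=inc", "cap"], "ORGANIZATION"),
    (["suffix=corp", "cap"], "ORGANIZATION"),
]
model = NaiveBayes()
model.fit(train)
print(model.predict(["prev=mr", "cap"]))
print(model.predict(["suffix=inc", "cap"]))
```

In a setup like the one the abstract describes, the training pairs would come from the manually annotated articles, the Stanford NLP annotations and the XML pattern rules, and `precision_recall_f1` would be computed per entity type against the gold annotations.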

Authors and Affiliations

Liviu Sebastian Matei, Ştefan Trăuşan-Matu

How To Cite

Liviu Sebastian Matei, Ştefan Trăuşan-Matu (2014). Named entities identification. Romanian Journal of Human - Computer Interaction, 7(4), -. https://europub.co.uk/articles/-A-28955