Named Entity Recognition System for Postpositional Languages: Urdu as a Case Study 

Abstract

Named Entity Recognition and Classification is the process of identifying named entities and classifying them into classes such as person name, organization name, and location name. In this paper, we propose a tagging scheme, Begin Inside Last -2 (BIL2), for Subject Object Verb (SOV) languages that use postpositions. We use the Urdu language as a case study. We compare the F-measure values obtained for the tagging schemes IO, BIO2, BILOU, and BIL2 using a Hidden Markov Model (HMM) and a Conditional Random Field (CRF). With the same parameters, including bigrams and the context window, BIL2 yields better results than the other three tagging schemes. With HMM, the F-measure values for IO, BIO2, BILOU, and BIL2 are 44.87%, 44.88%, 45.14%, and 45.88%, respectively. With CRF, the F-measure values for IO, BIO2, BILOU, and BIL2 are 35.13%, 35.90%, 37.85%, and 38.39%, respectively. The F-measure values for BIL2 are better than those of previously reported techniques.
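To make the three baseline tagging schemes concrete, the following minimal sketch encodes the same entity spans under IO, BIO2, and BILOU, using their commonly cited definitions. The span format `(start, end_exclusive, label)`, the function names, and the English-gloss example tokens are assumptions made for illustration and do not come from the paper. BIL2 is not defined in this abstract, so it is deliberately left out rather than guessed.

```python
from typing import List, Tuple

# Hypothetical span format: (start index, exclusive end index, entity label).
Span = Tuple[int, int, str]


def tag_io(n_tokens: int, spans: List[Span]) -> List[str]:
    """IO: every entity token is I-<label>; all other tokens are O."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        for i in range(start, end):
            tags[i] = f"I-{label}"
    return tags


def tag_bio2(n_tokens: int, spans: List[Span]) -> List[str]:
    """BIO2: the first token of every entity is B-<label>, the rest I-<label>."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def tag_bilou(n_tokens: int, spans: List[Span]) -> List[str]:
    """BILOU: Begin / Inside / Last for multi-token entities, Unit for single tokens."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        if end - start == 1:
            tags[start] = f"U-{label}"
        else:
            tags[start] = f"B-{label}"
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"
            tags[end - 1] = f"L-{label}"
    return tags


# Illustrative tokens only (an English gloss, not real Urdu corpus data).
tokens = ["Syed", "Sarwar", "visited", "Lahore"]
spans = [(0, 2, "PER"), (3, 4, "LOC")]

io_tags = tag_io(len(tokens), spans)
bio2_tags = tag_bio2(len(tokens), spans)
bilou_tags = tag_bilou(len(tokens), spans)

for name, tags in [("IO", io_tags), ("BIO2", bio2_tags), ("BILOU", bilou_tags)]:
    print(name, list(zip(tokens, tags)))
```

The schemes differ only in how much boundary information each tag carries: IO marks membership alone, BIO2 adds an explicit entity start, and BILOU additionally marks entity ends and single-token entities.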

Authors and Affiliations

Muhammad Malik, Syed Sarwar

  • EP ID EP107327
  • DOI 10.14569/IJACSA.2016.071019

How To Cite

Muhammad Malik, Syed Sarwar (2016). Named Entity Recognition System for Postpositional Languages: Urdu as a Case Study. International Journal of Advanced Computer Science & Applications, 7(10), 141-147. https://europub.co.uk/articles/-A-107327