Tagging Urdu Sentences from English POS Taggers

Abstract

As a global language, English has attracted most of the research and academic effort on Natural Language Processing (NLP) applications; other languages have received far less attention. Part-of-speech (POS) tagging is a necessary component of several NLP applications, yet an accurate POS tagger for a particular language is not easy to construct because of the diversity of that language. POS taggers for English are the most developed and the most widely used by researchers and academia. This paper proposes reusing English POS taggers to tag non-English sentences. As an example, Urdu sentences are tagged with 11 well-known English POS taggers, selected after a survey of state-of-the-art English POS taggers in the literature. Google Translate is used to translate the sentences between the languages. Data extracted from twitter.com is used for evaluation. A confusion matrix with the kappa statistic is used to measure the accuracy of actual versus predicted tags. The two English POS taggers that performed best on Urdu sentences were the Stanford POS Tagger and the MBSP POS Tagger, with accuracies of 96.4% and 95.7%, respectively. The system can be generalized to multilingual sentence tagging.
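The evaluation step named in the abstract (a confusion matrix with the kappa statistic over actual versus predicted tags) can be illustrated with a minimal sketch. It assumes Cohen's kappa, which the abstract does not specify, and uses a small made-up tag sequence; the function names and the sample data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count (actual_tag, predicted_tag) pairs over two aligned tag sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("tag sequences must not be empty")
    return Counter(zip(actual, predicted))


def accuracy_and_kappa(actual, predicted):
    """Return (observed accuracy, Cohen's kappa) for two aligned tag sequences."""
    cm = confusion_matrix(actual, predicted)
    n = sum(cm.values())
    labels = set(actual) | set(predicted)

    # Observed agreement: share of tokens on the diagonal of the matrix.
    observed = sum(cm[(tag, tag)] for tag in labels) / n

    # Chance agreement: product of the row and column marginals per tag.
    actual_totals = Counter(actual)
    predicted_totals = Counter(predicted)
    expected = sum(actual_totals[t] * predicted_totals[t] for t in labels) / (n * n)

    if expected == 1.0:
        # Only one tag occurs in both sequences; kappa is undefined.
        return observed, math.nan
    return observed, (observed - expected) / (1.0 - expected)


# Hypothetical gold and predicted tags for ten tokens (illustration only).
actual_tags = ["NN", "NN", "NN", "NN", "VB", "VB", "VB", "JJ", "JJ", "JJ"]
predicted_tags = ["NN", "NN", "NN", "VB", "VB", "VB", "VB", "JJ", "JJ", "NN"]

accuracy, kappa = accuracy_and_kappa(actual_tags, predicted_tags)
print(f"accuracy={accuracy:.3f} kappa={kappa:.3f}")
```

Kappa discounts the agreement that two taggings would reach by chance given their tag frequencies, so it is typically lower than raw accuracy when a few tags dominate the data.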

Authors and Affiliations

Adnan Naseem, Muazzama Anwar, Salman Ahmed, Qadeem Akhtar Satti, Faizan Rasul Hashmi, Tahira Malik

  • EP ID EP262241
  • DOI 10.14569/IJACSA.2017.081030

How To Cite

Adnan Naseem, Muazzama Anwar, Salman Ahmed, Qadeem Akhtar Satti, Faizan Rasul Hashmi, Tahira Malik (2017). Tagging Urdu Sentences from English POS Taggers. International Journal of Advanced Computer Science & Applications, 8(10), 231-238. https://europub.co.uk/articles/-A-262241