Tagging Urdu Sentences from English POS Taggers

Abstract

As a global language, English has attracted most of the research effort in Natural Language Processing (NLP), and other languages have received far less attention. Part-of-speech (POS) tagging is a necessary component of several NLP applications. An accurate POS tagger for a particular language is not easy to construct because of the diversity of that language. For English, POS taggers are well studied and widely used by researchers. This paper proposes reusing English POS taggers to tag non-English sentences. As an example, Urdu sentences are tagged with 11 well-known English POS taggers, chosen after exploring state-of-the-art English POS taggers in the literature. Google Translate is used to translate sentences between the two languages. Data extracted from twitter.com is used for evaluation. A confusion matrix with the kappa statistic is used to measure the accuracy of predicted tags against actual tags. The two English POS taggers that performed best on Urdu sentences were the Stanford POS Tagger and the MBSP POS Tagger, with accuracies of 96.4% and 95.7%, respectively. The approach can be generalized to multilingual sentence tagging.
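The evaluation step mentioned in the abstract (a confusion matrix with the kappa statistic over actual versus predicted tags) can be illustrated with a minimal sketch. The abstract does not say which kappa variant was used, so Cohen's kappa is assumed here. The function names and the toy tag sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count (actual_tag, predicted_tag) pairs over aligned token sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return Counter(zip(actual, predicted))


def accuracy_and_kappa(actual, predicted):
    """Return (observed accuracy, Cohen's kappa) for two tag sequences.

    Assumption: Cohen's kappa; the abstract only says "kappa statistic".
    """
    n = len(actual)
    if n == 0:
        raise ValueError("need at least one tagged token")
    matrix = confusion_matrix(actual, predicted)
    labels = set(actual) | set(predicted)

    # Observed agreement: the diagonal of the confusion matrix over n.
    observed = sum(matrix[(tag, tag)] for tag in labels) / n

    # Chance agreement: product of the marginal totals for each tag.
    actual_totals = Counter(actual)
    predicted_totals = Counter(predicted)
    expected = sum(
        actual_totals[tag] * predicted_totals[tag] for tag in labels
    ) / (n * n)

    if expected == 1.0:
        # Degenerate case: both sequences use one identical tag throughout.
        return observed, 1.0
    return observed, (observed - expected) / (1.0 - expected)


# Toy data (hypothetical): gold tags versus tags from one tagger.
gold_tags = ["NN", "VB", "NN", "JJ", "NN", "VB"]
tagger_tags = ["NN", "VB", "NN", "NN", "NN", "VB"]
acc, kappa = accuracy_and_kappa(gold_tags, tagger_tags)
print(f"accuracy={acc:.3f} kappa={kappa:.3f}")  # accuracy=0.833 kappa=0.700
```

Kappa discounts the agreement expected by chance, so it is lower than raw accuracy whenever a few tags dominate the data, which is a common reason to report both.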

Authors and Affiliations

Adnan Naseem, Muazzama Anwar, Salman Ahmed, Qadeem Akhtar Satti, Faizan Rasul Hashmi, Tahira Malik

  • EP ID EP262241
  • DOI 10.14569/IJACSA.2017.081030

How To Cite

Adnan Naseem, Muazzama Anwar, Salman Ahmed, Qadeem Akhtar Satti, Faizan Rasul Hashmi, Tahira Malik (2017). Tagging Urdu Sentences from English POS Taggers. International Journal of Advanced Computer Science & Applications, 8(10), 231-238. https://europub.co.uk/articles/-A-262241