Glyph Identification and Character Recognition for Sindhi OCR

Abstract

Today's computers can read and write text in many human languages. A computer can be given instructions through various input methods; OCR (Optical Character Recognition) and handwritten character recognition are the input methods in which a scanned page containing text is converted into editable text. A different language on the scanned page demands a different recognition algorithm, because every language and script poses its own set of challenges. Recognition of Latin-script languages poses fewer difficulties than recognition of Arabic script and the languages written in it, and OCR systems for Latin-script languages are close to perfect. Very little work has been done on the regional languages of Pakistan. In this paper, the glyphs of Sindhi, one such regional language, are identified, along with the number of characters and connected components. A graphical user interface has been created to perform the identification of Sindhi glyphs and characters. The glyphs of the characters are successfully identified from a scanned page, and this information can be used to recognize characters. Glyph identification can be used to select a suitable algorithm for identifying the language as well as to achieve a higher recognition rate.
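
The abstract does not say how connected components are found. As a minimal sketch, assuming the scanned page has already been binarized into a grid of 0s (background) and 1s (ink) and that pixels are joined under 8-connectivity (both are assumptions, not details taken from the paper), a breadth-first flood fill can label and count them. The function name `label_components` and the toy bitmap `TOY_GLYPH` are hypothetical and only illustrate the idea.

```python
from collections import deque


def label_components(grid):
    """Label 8-connected foreground regions (value 1) in a binary grid.

    Returns (labels, count): labels has the same shape as grid and holds
    0 for background or a component id starting at 1.
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            # Skip background pixels and pixels that already have a label.
            if grid[r][c] != 1 or labels[r][c] != 0:
                continue
            # Found an unlabeled ink pixel: start a new component here.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # Visit all 8 neighbours (the centre pixel is already labeled).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Hypothetical toy bitmap: one stroke with two detached dots beneath it.
TOY_GLYPH = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

labels, count = label_components(TOY_GLYPH)
print(count)  # 3
```

On the toy bitmap the count is 3: the stroke plus two dots. In Arabic-script writing, dots are separate from the main stroke of a letter, which is why a component count can differ from a character count.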

Authors and Affiliations

N. A. Memon, F. Abassi, S. Zardari

  • EP ID EP226281
  • DOI 10.22581/muet1982.1704.18

How To Cite

N. A. Memon, F. Abassi, S. Zardari (2017). Glyph Identification and Character Recognition for Sindhi OCR. Mehran University Research Journal of Engineering and Technology, 36(4), 933-940. https://europub.co.uk/articles/-A-226281