LUCIDAH Ligative and Unligative Characters in a Dataset for Arabic Handwriting

Abstract

Arabic script is inherently cursive, even when machine-printed. When connected to other characters, some Arabic characters may optionally be written in compact aesthetic forms known as ligatures. Distinguishing ligatures from ordinary characters is useful for several applications, especially automatic text recognition. Datasets that do not annotate ligatures may confuse the training of recognition systems. Some popular datasets annotate ligatures manually, but no dataset prior to this work took ligatures into consideration from the design phase. This paper presents a detailed study of Arabic ligatures and a design for a dataset that considers the representation of ligative and unligative characters. Pilot data collection and recognition experiments are then conducted on the presented dataset and on another popular dataset of handwritten Arabic words. These experiments show that annotating ligatures in datasets reduces error rates in character recognition tasks.
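As a small illustration of why a ligature differs from an ordinary character, the sketch below uses the Unicode presentation form of the lam-alef ligature, which stands for two underlying letters. This is only an analogy drawn from machine-printed text encoding; it is not the annotation scheme of the dataset described in the abstract, and the helper name `decompose` is hypothetical.

```python
import unicodedata

# Unicode presentation-form code point for the isolated lam-alef ligature.
# Chosen here purely as an example; the paper's ligature inventory is not shown.
LAM_ALEF_LIGATURE = "\uFEFB"


def decompose(text: str) -> list[str]:
    """Return the Unicode names of the characters left after NFKC normalization.

    NFKC applies compatibility decomposition, which splits a ligature
    presentation form into the letters it is built from.
    """
    return [unicodedata.name(ch) for ch in unicodedata.normalize("NFKC", text)]


ligature_name = unicodedata.name(LAM_ALEF_LIGATURE)
components = decompose(LAM_ALEF_LIGATURE)

print(ligature_name)  # one glyph on the page
print(components)     # two letters in the underlying text
```

A recognizer that sees one compact shape but is trained against a two-letter label has to learn that mismatch implicitly, which is the kind of confusion that explicit ligature annotation is meant to avoid.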

Authors and Affiliations

Yousef Elarian, Irfan Ahmad, Abdelmalek Zidouri, Wasfi G. Al-Khatib

  • EP ID EP626788
  • DOI 10.14569/IJACSA.2019.0100855

How To Cite

Yousef Elarian, Irfan Ahmad, Abdelmalek Zidouri, Wasfi G. Al-Khatib (2019). LUCIDAH Ligative and Unligative Characters in a Dataset for Arabic Handwriting. International Journal of Advanced Computer Science & Applications, 10(8), 406-415. https://europub.co.uk/articles/-A-626788