Entropy Based Texture Features Useful for Automatic Script Identification

Journal Title: International Journal on Computer Science and Engineering - Year 2010, Vol 2, Issue 2

Abstract

In a multi-script environment, collections of documents printed in different scripts are common. To process such documents automatically through Optical Character Recognition, the script type of each document must first be identified. This paper presents a novel texture-based approach for identifying the script type of documents printed in three prioritized scripts, Kannada, Hindi and English, which are prevalent in Karnataka, an Indian state. The document images are decomposed up to level two by Wavelet Packet Decomposition using the Haar basis function. Texture features are extracted from the sub-bands of the decomposition: the Shannon entropy value is computed for the set of sub-bands, and these entropy values are combined to obtain the texture features. The experiments involved 1500 text images for learning and 1200 text images for testing. Script classification performance is analyzed using the K-nearest neighbor classifier. The average success rate is found to be 99.33%.
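The pipeline outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the entropy is normalized or how the per-band values are "combined", so the sketch assumes Shannon entropy over normalized coefficient energies and simple concatenation of the 16 level-two sub-band entropies. The function names (`haar_step`, `wavelet_packet`, `entropy_features`, `knn_predict`) and the choice of `k=3` are hypothetical.

```python
import numpy as np

def haar_step(img):
    """One level of 2D Haar analysis; returns sub-bands [LL, LH, HL, HH]."""
    # Crop to even dimensions so rows and columns pair up.
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2].astype(float)
    # Low-pass / high-pass along rows (pairs of adjacent columns).
    lo = (img[:, 0::2] + img[:, 1::2]) / np.sqrt(2)
    hi = (img[:, 0::2] - img[:, 1::2]) / np.sqrt(2)
    # Low-pass / high-pass along columns (pairs of adjacent rows).
    ll = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2)
    lh = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2)
    hl = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2)
    hh = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2)
    return [ll, lh, hl, hh]

def wavelet_packet(img, level=2):
    """Full wavelet packet tree: every sub-band is split again at each level."""
    bands = [np.asarray(img, dtype=float)]
    for _ in range(level):
        bands = [sub for band in bands for sub in haar_step(band)]
    return bands  # 4**level sub-bands (16 at level two)

def shannon_entropy(band, eps=1e-12):
    """Shannon entropy of normalized coefficient energies (an assumption)."""
    energy = band.ravel() ** 2
    total = energy.sum()
    if total <= eps:
        return 0.0  # an all-zero band carries no information
    p = energy / total
    p = p[p > 0]  # skip zero terms to avoid log2(0)
    return float(-(p * np.log2(p)).sum())

def entropy_features(img, level=2):
    """Feature vector: one entropy per sub-band, concatenated (an assumption)."""
    return np.array([shannon_entropy(b) for b in wavelet_packet(img, level)])

def knn_predict(train_x, train_y, x, k=3):
    """Plain K-nearest-neighbor vote using Euclidean distance."""
    dist = np.linalg.norm(train_x - x, axis=1)
    nearest = train_y[np.argsort(dist)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]
```

In use, `entropy_features` would be applied to each grayscale training image to build `train_x`, with the script names as `train_y`, and `knn_predict` would then label the feature vector of a test image. A library such as PyWavelets or scikit-learn could replace the hand-written steps; they are spelled out here only to keep the sketch self-contained.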

Authors and Affiliations

M. C. Padma, P. A. Vijaya

How To Cite

M. C. Padma, P. A. Vijaya (2010). Entropy Based Texture Features Useful for Automatic Script Identification. International Journal on Computer Science and Engineering, 2(2), 115-120. https://europub.co.uk/articles/-A-139661