A Zone Classification Approach for Arabic Documents using Hybrid Features
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2016, Vol 7, Issue 7
Abstract
Zone segmentation and classification is an important step in document layout analysis. It decomposes a scanned document into zones, which must then be classified as text or non-text so that only text zones are passed to a recognition engine. This eliminates the garbage output that results from sending non-text zones to the engine. This paper proposes a framework for zone segmentation and classification. Zones are segmented using morphological operations and connected component analysis. Features are then extracted from each zone to classify it as text or non-text. The feature set is a hybrid of texture-based and connected-component-based features. Effective features are selected using a genetic algorithm, and the selected features are fed into a linear SVM classifier for zone classification. System evaluation shows that the proposed zone classification works well on multi-font and multi-size documents with a variety of layouts, including historical documents.
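The segmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rectangular structuring element, its size, the use of 4-connectivity, and the function names `dilate`, `zone_boxes`, and `segment_zones` are all assumptions made for illustration. The page is a binary image given as a list of rows, with 1 for ink and 0 for background.

```python
from collections import deque


def dilate(img, half_h, half_w):
    """Binary dilation with a (2*half_h+1) x (2*half_w+1) rectangle (assumed shape)."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if img[r][c]:
                # Paint the whole rectangle centred on this ink pixel.
                for rr in range(max(0, r - half_h), min(rows, r + half_h + 1)):
                    for cc in range(max(0, c - half_w), min(cols, c + half_w + 1)):
                        out[rr][cc] = 1
    return out


def zone_boxes(img):
    """Bounding boxes (top, left, bottom, right) of 4-connected components."""
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue = deque([(r, c)])
                top, left, bottom, right = r, c, r, c
                # Breadth-first flood fill, growing the box as pixels are visited.
                while queue:
                    y, x = queue.popleft()
                    top, bottom = min(top, y), max(bottom, y)
                    left, right = min(left, x), max(right, x)
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return boxes


def segment_zones(img, half_h=0, half_w=1):
    """Merge nearby ink by dilation, then report one box per merged component."""
    return zone_boxes(dilate(img, half_h, half_w))
```

Dilation merges neighbouring glyphs into a single blob, so each connected component of the dilated image approximates one zone. The returned boxes describe the dilated image, so they can extend slightly beyond the original ink. Feature extraction, genetic-algorithm feature selection, and the linear SVM would then operate on the pixels inside each box.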
Authors and Affiliations
Amany M. Hesham, Sherif Abdou, Amr Badr, Mohsen Rashwan, Hassanin M. Al-Barhamtoshy
Sentiment Analysis of Arabic Jordanian Dialect Tweets
Sentiment Analysis (SA) of social media content has become one of the growing areas of research in data mining. SA makes it possible to mine subjective public opinion from text in real time. This paper...
Energy Saving EDF Scheduling for Wireless Sensors on Variable Voltage Processors
Advances in micro technology have led to the development of miniaturized sensor nodes with wireless communication to perform several real-time computations. These systems are deployed wherever it is not possible to mainta...
TokenVote: Secured Electronic Voting System in the Cloud
With the spread of democracy around the world, voting is considered a way to collectively make decisions. Recently, many government offices and private organizations use voting to make decisions when the opinions of mult...
Analysis of a Braking System on the Basis of Structured Analysis Methods
In this paper, we present the general context of research in the domain of analysis and modeling of mechatronic systems. In fact, we present a bibliographic review of some research works on the systemic analysi...
Optimized Routing Information Exchange in Hybrid IPv4-IPv6 Network using OSPFV3 & EIGRPv6
IPv6 is the next-generation internet protocol, which is gradually replacing IPv4. IPv6 offers a larger address space, a simpler header format, efficient routing, better QoS and built-in security mechanisms. The migration...