Word Extraction Using X-Y Cut Algorithm

Abstract

Digitization of printed documents is a major motivation for current work on the text of scanned documents. Converting scanned handwritten or printed documents into an electronically readable form makes it possible to store, exchange and process the valuable information they contain. Text recognition aims to convert the text of a printed or handwritten document into a desired format. Its steps include preprocessing, segmentation, feature extraction, classification and post-processing. Preprocessing converts the grayscale image into a binary image and removes noise from it. Segmentation splits the document image into lines and extracts each word from every segmented line. Feature extraction computes the characteristics of each character. Classification involves a database of features and their further processing. This paper proposes an approach that extracts words based on a set of properties computed for each connected component in the binary image of the whole document, and that is independent of language.
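The segmentation step described above can be illustrated with a minimal sketch of a projection-profile X-Y cut. This is not the paper's implementation: it assumes a binary image given as rows of 0/1 values (1 = ink), and the names `runs`, `extract_words` and the `word_gap` threshold are hypothetical choices made for illustration. The image is first cut into text lines at blank rows, then each line is cut into words at blank column gaps at least `word_gap` pixels wide.

```python
def runs(profile, min_gap=1):
    """Return half-open (start, end) spans of non-zero entries in a
    projection profile, merging spans separated by fewer than
    min_gap zero entries."""
    spans = []
    start = None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of ink begins
        elif not value and start is not None:
            spans.append((start, i))       # the run ends at a blank entry
            start = None
    if start is not None:
        spans.append((start, len(profile)))

    merged = []
    for s, e in spans:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)  # gap too small: same unit
        else:
            merged.append((s, e))
    return merged


def extract_words(image, word_gap=2):
    """Return word bounding boxes as (top, bottom, left, right),
    all half-open, for a binary image (list of rows of 0/1)."""
    boxes = []
    # Horizontal cut: rows containing ink form text lines.
    row_profile = [sum(row) for row in image]
    for top, bottom in runs(row_profile):
        # Vertical cut inside the line: columns containing ink form words.
        width = len(image[top])
        col_profile = [
            sum(image[r][c] for r in range(top, bottom))
            for c in range(width)
        ]
        for left, right in runs(col_profile, min_gap=word_gap):
            boxes.append((top, bottom, left, right))
    return boxes


# Tiny hand-made page: two "words" on the first line, one on the second.
page = [
    [1, 1, 0, 1, 0, 0, 0, 1, 1, 0],
    [1, 1, 0, 1, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0, 0, 0, 0],
]
word_boxes = extract_words(page, word_gap=2)
```

In this sketch a one-pixel gap between columns is treated as spacing between characters of the same word, while wider gaps separate words. A real document would need a threshold derived from the observed gap distribution, and the connected-component properties mentioned in the abstract are not modeled here.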

Authors and Affiliations

Simple Batra

  • EP ID EP423561
  • DOI 10.9790/9622-0812010104.

How To Cite

Simple Batra (2018). Word Extraction Using X-Y Cut Algorithm. International Journal of Engineering Research and Applications, 8(12), 1-4. https://europub.co.uk/articles/-A-423561