Extracting Text from Image Document and Displaying Its Related Information

Abstract

Image text is text embedded or written in images of various forms. It appears in captured images, scanned documents, magazines, newspapers, posters, and similar sources. Such text is now widely available and plays an important role in representing, describing, and transferring information, which supports communication, problem solving, availability, the creation of new types of jobs, cost effectiveness, productivity, globalization, and bridging cultural gaps. The information in these image documents is more efficient to use and easier to access once it is converted to text form. The process of converting image text into plain text is called text extraction. Text extraction is useful for retrieving, searching, editing, documenting, archiving, and reporting image text. However, text varies in size, orientation, style, and alignment, and it is often embedded in complex colored document images, degraded document images, or low-quality images with low contrast and complex backgrounds, all of which make text extraction extremely difficult and challenging. Techniques such as the Connected Component Method, the Mathematical Morphology Method, the Edge Based Method, and the Texture Based Method have been used previously, but each has its own limitations when measured by parameters such as precision, recall, and F-score. This paper discusses text extraction from image documents using a combination of two powerful methods, the Connected Component Method and the Edge Based Method, in order to enhance the performance and accuracy of text extraction. The implementation integrates MATLAB code with the MATLAB/Simulink tool, and the proposed system is tested on the Document Image Binarization Competition (DIBCO) 2017 dataset. Finally, the extracted and recognized text is converted to speech for use by visually impaired people.
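As a rough illustration of how an edge-based step and a connected-component step can be combined, the following is a minimal Python sketch. It is not the authors' MATLAB/Simulink implementation: the forward-difference gradient, the threshold, the 8-connectivity, and all function names (`edge_mask`, `component_boxes`, `precision_recall_f`) are assumptions made for illustration only. The last function shows how precision, recall, and F-score are conventionally computed from counts of true positives, false positives, and false negatives.

```python
from collections import deque


def edge_mask(img, thresh):
    """Mark pixels whose forward-difference gradient magnitude exceeds thresh.

    img is a grayscale image given as a list of rows of integers.
    """
    h, w = len(img), len(img[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = img[y][x + 1] - img[y][x] if x + 1 < w else 0
            dy = img[y + 1][x] - img[y][x] if y + 1 < h else 0
            mask[y][x] = abs(dx) + abs(dy) > thresh
    return mask


def component_boxes(mask, min_pixels=1):
    """Return bounding boxes (top, left, bottom, right) of 8-connected components."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            # Breadth-first search over one component of edge pixels.
            seen[sy][sx] = True
            queue = deque([(sy, sx)])
            top, left, bottom, right, count = sy, sx, sy, sx, 0
            while queue:
                y, x = queue.popleft()
                count += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny in range(max(0, y - 1), min(h, y + 2)):
                    for nx in range(max(0, x - 1), min(w, x + 2)):
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Small components are treated as noise (assumed heuristic).
            if count >= min_pixels:
                boxes.append((top, left, bottom, right))
    return boxes


def precision_recall_f(tp, fp, fn):
    """Compute precision, recall, and F-score from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


if __name__ == "__main__":
    # Tiny synthetic image: two dark blobs on a white background.
    image = [[255] * 9 for _ in range(6)]
    for row in (1, 2):
        for col in (1, 2, 6, 7):
            image[row][col] = 0
    print(component_boxes(edge_mask(image, thresh=100)))
```

Each returned box is a candidate text region that a later recognition stage could examine. A real system would add filtering by size, aspect ratio, or alignment, which this sketch omits.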

Authors and Affiliations

K. N. Natei, J. Viradiya, S. Sasi kumar

  • EP ID EP394275
  • DOI 10.9790/9622-0805052733.

How To Cite

K. N. Natei, J. Viradiya, S. Sasi kumar (2018). Extracting Text from Image Document and Displaying Its Related Information. International Journal of Engineering Research and Applications, 8(5), 27-33. https://europub.co.uk/articles/-A-394275