Translating Images into Text Descriptions and Speech Synthesis for Learning Purpose

Abstract

An image-to-text and speech conversion system can improve the accessibility of images for visually impaired and physically challenged people by helping them understand the scene an image depicts; the system is also trained in a manner the authors liken to the human brain. Image segmentation and edge detection play an important role in implementing the proposed system. The system generates text descriptions for an input image supplied by the user. Generating sentences object by object, and mapping prepositions and conjunctions between them, is a challenging task. The framework formulates the interaction between image segmentation and object recognition around the Canny algorithm. The system proceeds through several phases: pre-processing, feature extraction, object recognition, edge detection, image segmentation and text-to-speech (TTS) conversion. The system's database consists of a large set of sample images spanning several categories, which are used for training. The accuracy of the proposed system depends on correct recognition of objects, and sentences are formed from the recognized objects. The system consists of two main modules: image to text and text to speech. The image-to-text module generates natural-language text descriptions based on an understanding of the image. The text-to-speech module synthesizes English speech from the natural-language description.
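The sentence-formation step mentioned in the abstract (objects joined by prepositions and conjunctions) can be illustrated with a minimal sketch. This is not the authors' method: the `DetectedObject` type, the `preposition` and `describe` functions, and the rule of comparing box centres are all hypothetical, chosen only to show how recognized objects might be mapped to a sentence.

```python
from dataclasses import dataclass


@dataclass
class DetectedObject:
    """Hypothetical output of the object-recognition phase."""
    label: str
    x: float  # centre of the object's bounding box, origin at top-left
    y: float


def preposition(a: DetectedObject, b: DetectedObject) -> str:
    """Assumed rule: pick a preposition from the dominant centre offset."""
    dx = a.x - b.x
    dy = a.y - b.y
    if abs(dx) >= abs(dy):
        return "to the right of" if dx > 0 else "to the left of"
    # Image y grows downward, so a larger y means lower in the picture.
    return "below" if dy > 0 else "above"


def describe(objects: list[DetectedObject]) -> str:
    """Form one sentence relating every object to the first one."""
    if not objects:
        return "No objects were recognized."
    if len(objects) == 1:
        return f"The image shows one object: {objects[0].label}."
    anchor = objects[0]
    clauses = [
        f"the {o.label} is {preposition(o, anchor)} the {anchor.label}"
        for o in objects[1:]
    ]
    # Conjunction mapping: commas between clauses, "and" before the last.
    if len(clauses) == 1:
        body = clauses[0]
    else:
        body = ", ".join(clauses[:-1]) + " and " + clauses[-1]
    return body[0].upper() + body[1:] + "."


scene = [
    DetectedObject("table", 50, 50),
    DetectedObject("lamp", 50, 10),
    DetectedObject("chair", 90, 55),
]
sentence = describe(scene)
print(sentence)
```

The resulting string is what a text-to-speech module would receive as input. A real system would need a richer spatial model (overlap, containment, depth) than the centre comparison assumed here.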

Authors and Affiliations

Yogesh N. Shinde, Mrunmayee Patil

How To Cite

Yogesh N. Shinde, Mrunmayee Patil (2016). Translating Images into Text Descriptions and Speech Synthesis for Learning Purpose. International Journal for Research in Applied Science and Engineering Technology (IJRASET), 4(6), -. https://europub.co.uk/articles/-A-22315