INDONESIAN TEXT-TO-SPEECH SYSTEM USING DIPHONE CONCATENATIVE SYNTHESIS

Abstract

In this paper, we describe the design and development of an Indonesian diphone database and a synthesis system that uses segments of recorded speech to convert text to speech and save the result as an audio file such as WAV or MP3. The work proceeds in two stages. First, the diphone database is developed: a list of sample words covering the diphones is compiled, giving priority to diphones located in the middle of a word and using those at the beginning or end only when no mid-word occurrence exists; the sample words are recorded and segmented; and the diphones are created with the tool Diphone Studio 1.3. Second, the system is developed using Delphi 6.0, including the conversion of input numbers, acronyms, words, and sentences into diphone representations. The Indonesian text-to-speech system involves two kinds of conversion: the first converts the text to be spoken into phonemes, and the second converts the phonemes into speech. The method used in this research is diphone concatenative synthesis, in which recorded sound segments are collected and each segment consists of one diphone (two phonemes). This synthesizer may produce a voice with a high level of naturalness. The Indonesian text-to-speech system can differentiate special phonemes, as in ‘Beda’ and ‘Bedak’, but samples of other specific words still need to be added to the system. The system can also handle texts containing abbreviations, and it provides a facility for adding such words.
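The text-to-diphone stage described in the abstract can be illustrated with a minimal Python sketch. This is not the author's Delphi implementation: the function names, the `_` silence marker, the `@` symbol for a schwa, and the entries in the abbreviation and exception tables are all hypothetical, and the letter-to-phoneme rule is a deliberate simplification that treats each letter (or common digraph) as one phoneme.

```python
# Hypothetical tables; the paper's actual lexicon is not shown in the source.
DIGRAPHS = ("ng", "ny", "kh", "sy")          # spellings treated as one phoneme
ABBREVIATIONS = {"dll": "dan lain lain"}     # illustrative abbreviation entry
EXCEPTIONS = {"bedak": ["b", "@", "d", "a", "k"]}  # assumed schwa case, "@" = schwa
SILENCE = "_"                                # assumed marker for word boundaries


def to_phonemes(word):
    """Split a word into phonemes, one per letter or known digraph (simplified)."""
    word = word.lower()
    phonemes = []
    i = 0
    while i < len(word):
        if word[i:i + 2] in DIGRAPHS:
            phonemes.append(word[i:i + 2])
            i += 2
        else:
            if word[i].isalpha():            # skip digits and punctuation
                phonemes.append(word[i])
            i += 1
    return phonemes


def to_diphones(phonemes):
    """Pair each phoneme with its successor, padding both ends with silence."""
    padded = [SILENCE] + list(phonemes) + [SILENCE]
    return [a + "-" + b for a, b in zip(padded, padded[1:])]


def text_to_diphones(text):
    """Expand abbreviations, apply exceptions, and return a flat diphone list."""
    diphones = []
    for token in text.lower().split():
        token = token.strip(".,;:!?")
        for word in ABBREVIATIONS.get(token, token).split():
            phonemes = EXCEPTIONS.get(word) or to_phonemes(word)
            diphones.extend(to_diphones(phonemes))
    return diphones


if __name__ == "__main__":
    print(text_to_diphones("beda"))
    print(text_to_diphones("bedak"))
```

In a full system, each diphone name would then be looked up in the recorded database and the matching audio segments joined in order; that step is omitted here because the database format is not described in the source.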

Authors and Affiliations

Sutarman Sutarman

How To Cite

Sutarman Sutarman (2015). INDONESIAN TEXT-TO-SPEECH SYSTEM USING DIPHONE CONCATENATIVE SYNTHESIS. International Journal of Software Engineering and Computer Systems, 1(1), 85-93. https://europub.co.uk/articles/-A-254084