Modified Grapheme Encoding and Phonemic Rule to Improve PNNR-Based Indonesian G2P

Abstract

Grapheme-to-phoneme (G2P) conversion is essential in both speech recognition and speech synthesis. The existing Indonesian G2P based on the pseudo nearest neighbour rule (PNNR) has two drawbacks: its grapheme encoding does not accommodate all Indonesian phonemic rules, and the PNNR must select the best phoneme from all possible conversions even though some of them could be filtered out by phonemic rules. In this paper, a modified partial orthogonal binary grapheme encoding and a phonemic-based rule are proposed to improve the performance of PNNR-based Indonesian G2P. An evaluation using 5-fold cross-validation, in which each fold uses 40K words to develop the model and 10K words to evaluate it, shows that the two proposed concepts reduce the phoneme error rate (PER) by a relative 13.07%. A more detailed analysis shows that most errors come from the grapheme ⟨e⟩, which can be converted into either /E/ or /ə/ depending on context: four prefixes, 'ber', 'me', 'per', and 'ter', produce many conversions that are ambiguous with basic words, and some similar compound words pronounce the grapheme ⟨e⟩ differently. A stemming procedure can be applied to reduce those errors.
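
The evaluation protocol and the relative-reduction metric named in the abstract can be illustrated with a minimal sketch. It assumes a 50K-word lexicon (implied by 40K development words plus 10K evaluation words per fold) split into contiguous folds. The function names `k_fold_splits` and `relative_reduction` are hypothetical and not taken from the paper, and the sketch does not reproduce the authors' implementation.

```python
def k_fold_splits(n_items, k=5):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    fold_size = n_items // k
    for i in range(k):
        start = i * fold_size
        # The last fold absorbs any remainder when n_items is not divisible by k.
        stop = n_items if i == k - 1 else start + fold_size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_items))
        yield train, test


def relative_reduction(baseline_per, new_per):
    """Relative PER reduction: the drop in PER as a fraction of the baseline PER."""
    if baseline_per <= 0:
        raise ValueError("baseline_per must be positive")
    return (baseline_per - new_per) / baseline_per


if __name__ == "__main__":
    # Assumed lexicon size: 50K words, giving 40K/10K per fold.
    for train, test in k_fold_splits(50_000, k=5):
        print(len(train), len(test))
```

A relative reduction is measured against the baseline error rate rather than in absolute percentage points, so a reported 13.07% means the new PER is about 0.87 times the baseline PER.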

Authors and Affiliations

Suyanto, Sri Hartati, Agus Harjoko

  • EP ID EP143769
  • DOI 10.14569/IJACSA.2016.070358

How To Cite

Suyanto, Sri Hartati, Agus Harjoko (2016). Modified Grapheme Encoding and Phonemic Rule to Improve PNNR-Based Indonesian G2P. International Journal of Advanced Computer Science & Applications, 7(3), 430-435. https://europub.co.uk/articles/-A-143769