Assessing the efficacy of benchmarks for automatic speech accent recognition

Journal Title: EAI Endorsed Transactions on Creative Technologies - Year 2015, Vol 2, Issue 4

Abstract

Speech accents can carry valuable information about the speaker, and can be used in intelligent multimedia-based human-computer interfaces. The performance of algorithms for automatic accent classification is often evaluated using audio datasets that contain recordings of different people representing different accents. Here we describe a method that can detect bias in accent datasets, and apply it to two accent identification datasets to reveal the existence of dataset bias, meaning that the datasets can be classified with accuracy higher than random even if the tested algorithm has no ability to analyze speech accent. We tested the datasets by extracting the first second of each audio sample, which contained silence and no voice, and therefore no information about the accent. An audio classification method was then applied to the resulting datasets of silent audio samples, and provided classification accuracy significantly higher than random. These results indicate that the performance of accent classification algorithms measured using some accent classification benchmarks can be biased, and can be driven by differences in the background noise rather than by the auditory features of the accents.
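The silence-only test described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the audio classification method, so a nearest-centroid classifier over two simple noise descriptors (RMS level and zero-crossing rate) stands in for it, and the recordings are synthetic. All function names, the class labels `accent_a` and `accent_b`, the noise levels, and the sample rate are assumptions made for illustration.

```python
import math
import random
from collections import Counter, defaultdict


def leading_segment(samples, sample_rate, seconds=1.0):
    """Return the first `seconds` of a recording (assumed to contain no voice)."""
    return samples[: int(sample_rate * seconds)]


def features(segment):
    """Two simple noise descriptors: RMS level and zero-crossing rate."""
    rms = math.sqrt(sum(x * x for x in segment) / len(segment))
    crossings = sum(1 for a, b in zip(segment, segment[1:]) if (a < 0) != (b < 0))
    return (rms, crossings / (len(segment) - 1))


def leave_one_out_accuracy(vectors, labels):
    """Nearest-centroid classification, holding out one sample at a time."""
    correct = 0
    for i, (vec, label) in enumerate(zip(vectors, labels)):
        groups = defaultdict(list)
        for j, (other, other_label) in enumerate(zip(vectors, labels)):
            if j != i:
                groups[other_label].append(other)
        centroids = {
            lab: tuple(sum(col) / len(col) for col in zip(*vecs))
            for lab, vecs in groups.items()
        }
        predicted = min(centroids, key=lambda lab: math.dist(vec, centroids[lab]))
        correct += predicted == label
    return correct / len(labels)


def chance_accuracy(labels):
    """Accuracy of always guessing the most frequent class."""
    return max(Counter(labels).values()) / len(labels)


def synthetic_recording(noise_level, sample_rate, rng):
    """One second of background noise, then one second of a tone standing in for voice."""
    silence = [rng.gauss(0.0, noise_level) for _ in range(sample_rate)]
    voiced = [
        math.sin(2 * math.pi * 220 * t / sample_rate) + rng.gauss(0.0, noise_level)
        for t in range(sample_rate)
    ]
    return silence + voiced


# Hypothetical dataset: each class was "recorded" with a different noise floor.
rng = random.Random(0)
sample_rate = 8000
noise_by_label = {"accent_a": 0.01, "accent_b": 0.05}

vectors, labels = [], []
for label, noise_level in noise_by_label.items():
    for _ in range(10):
        recording = synthetic_recording(noise_level, sample_rate, rng)
        vectors.append(features(leading_segment(recording, sample_rate)))
        labels.append(label)

accuracy = leave_one_out_accuracy(vectors, labels)
baseline = chance_accuracy(labels)
print(f"silence-only accuracy: {accuracy:.2f}, chance: {baseline:.2f}")
```

If the silence-only accuracy clearly exceeds the chance baseline, the class labels are predictable from recording conditions alone, which is the kind of dataset bias the abstract describes. A real evaluation would also need a statistical test of that difference, which this sketch omits.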

Authors and Affiliations

Benjamin Bock, Lior Shamir

  • EP ID EP45835
  • DOI http://dx.doi.org/10.4108/icst.mobimedia.2015.259033

How To Cite

Benjamin Bock, Lior Shamir (2015). Assessing the efficacy of benchmarks for automatic speech accent recognition. EAI Endorsed Transactions on Creative Technologies, 2(4). https://europub.co.uk/articles/-A-45835