A Theeraphol

Authored Publications
    Burmese Speech Corpus, Finite-State Text Normalization and Pronunciation Grammars with an Application to Text-to-Speech
    Yin May Oo
    Chen Fang Li
    Pasindu De Silva
    Supheakmungkol Sarin
    Knot Pipatsrisawat
    Martin Jansche
    Proc. 12th Language Resources and Evaluation Conference (LREC 2020), European Language Resources Association (ELRA), 11--16 May, Marseille, France, pp. 6328-6339
    This paper introduces an open-source, crowd-sourced multi-speaker speech corpus along with a comprehensive set of finite-state transducer (FST) grammars for performing text normalization for the Burmese (Myanmar) language. We also introduce open-source finite-state grammars for performing grapheme-to-phoneme (G2P) conversion for Burmese. These three components are necessary (but not sufficient) for building a high-quality text-to-speech (TTS) system for Burmese, a tonal Southeast Asian language from the Sino-Tibetan family which presents several linguistic challenges. We describe the corpus acquisition process and provide the details of our finite-state-based approach to Burmese text normalization and G2P. Our experiments involve building a multi-speaker TTS system based on long short-term memory (LSTM) recurrent neural network (RNN) models, which were previously shown to perform well for other languages in a low-resource setting. Our results indicate that the data and grammars that we are announcing are sufficient to build reasonably high-quality models comparable to other systems. We hope these resources will facilitate speech and language research on the Burmese language, which is considered by many to be low-resource due to the limited availability of free linguistic data.
    Google Crowdsourced Speech Corpora and Related Open-Source Resources for Low-Resource Languages and Dialects: An Overview
    Alena Butryna
    Shan Hui Cathy Chu
    Linne Ha
    Fei He
    Martin Jansche
    Chen Fang Li
    Tatiana Merkulova
    Yin May Oo
    Knot Pipatsrisawat
    Clara E. Rivera
    Supheakmungkol Sarin
    Pasindu De Silva
    Keshan Sodimana
    Jaka Aris Eko Wibawa
    2019 UNESCO International Conference Language Technologies for All (LT4All): Enabling Linguistic Diversity and Multilingualism Worldwide, 4--6 December, Paris, France, pp. 91-94
    This paper presents an overview of a program designed to address the growing need for developing free speech resources for under-represented languages. At present we have released 38 datasets for building text-to-speech and automatic speech recognition applications for languages and dialects of South and Southeast Asia, Africa, Europe and South America. The paper describes the methodology used for developing such corpora and presents some of our findings that could benefit under-represented language communities.
    Voice Builder: A Tool for Building Text-To-Speech Voices
    Pasindu De Silva
    Hao Tang
    Knot Pipatsrisawat
    11th edition of the Language Resources and Evaluation Conference (LREC), 7-12 May 2018, Miyazaki, Japan
    We describe an open-source text-to-speech (TTS) voice building tool that focuses on simplicity, flexibility, and collaboration. Our tool allows anyone with basic computer skills to run voice training experiments and listen to the resulting synthesized voice. We hope that this tool will lower the barrier to creating new voices and accelerate TTS research by making experimentation faster and interdisciplinary collaboration easier. We believe that our tool can help improve TTS research, especially for low-resourced languages, where more experimentation is often needed to get the most out of the limited data.
    Text Normalization for Bangla, Khmer, Nepali, Javanese, Sinhala, and Sundanese TTS Systems
    Keshan Sodimana
    Pasindu De Silva
    Chen Fang Li
    Supheakmungkol Sarin
    Knot Pipatsrisawat
    6th International Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU-2018), International Speech Communication Association (ISCA), 29--31 August, Gurugram, India, pp. 147-151
    Text normalization is the process of converting non-standard words (NSWs) such as numbers, abbreviations, and time expressions into standard words so that their pronunciations can be derived either through lexicon lookup or by utilizing a program to predict pronunciations from spellings. Text normalization is, thus, an important component of any text-to-speech (TTS) system. Without such a component, the resulting voice, no matter how good the quality is, may sound unintelligent. Such a component is often built manually by translating language-specific knowledge into rules that can be utilized by TTS pipelines. In this paper, we describe an approach to developing a rule-based text normalization component for many low-resourced languages. We also describe our open-source repository containing text normalization grammars for Bangla, Javanese, Khmer, Nepali, Sinhala, and Sundanese, and present a recipe for utilizing them in a TTS system.