Areal and Phylogenetic Features for Multilingual Speech Synthesis

Alexander Gutkin

Richard Sproat

Proc. of Interspeech 2017, International Speech Communication Association (ISCA), August 20–24, 2017, Stockholm, Sweden, pp. 2078-2082

Abstract

We introduce phylogenetic and areal language features to the domain of multilingual text-to-speech (TTS) synthesis. Intuitively, enriching the existing universal phonetic features with such cross-language shared representations should benefit multilingual acoustic models and help address issues such as data scarcity for low-resource languages. We investigate these representations using acoustic models based on long short-term memory (LSTM) recurrent neural networks (RNNs). Subjective evaluations conducted on eight languages from diverse language families show that phylogenetic and areal representations sometimes lead to significant improvements in multilingual synthesis quality.
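As a rough illustration of what "enriching" phonetic features with language-level features could look like, the sketch below appends one-hot codes for a language family and a geographic area to a frame of phonetic features before it would be passed to an acoustic model. This is only a minimal sketch under stated assumptions: the inventories `FAMILIES` and `AREAS`, the helper names `one_hot` and `enrich`, and the one-hot encoding itself are hypothetical and are not taken from the paper, which may encode these features differently.

```python
# Hypothetical inventories for illustration only; the paper's actual
# phylogenetic and areal feature sets are not specified in this abstract.
FAMILIES = ["indo-european", "uralic", "austronesian"]
AREAS = ["europe", "south-asia", "southeast-asia"]


def one_hot(value, inventory):
    """Return a one-hot list marking the position of `value` in `inventory`."""
    if value not in inventory:
        raise ValueError(f"unknown value: {value!r}")
    return [1.0 if item == value else 0.0 for item in inventory]


def enrich(phonetic_features, family, area):
    """Append language-level features to one frame of phonetic features.

    Assumption: plain concatenation of one-hot codes. This is one simple
    way to share information across languages, not the authors' method.
    """
    return (
        list(phonetic_features)
        + one_hot(family, FAMILIES)
        + one_hot(area, AREAS)
    )


# Toy frame of binary phonetic features (values are made up).
frame = [1.0, 0.0, 0.0, 1.0]
enriched = enrich(frame, "uralic", "europe")
print(len(enriched))  # 4 phonetic + 3 family + 3 area features
```

Under this scheme, two languages from the same family or area share part of their input vector, which is the intuition behind letting a low-resource language benefit from data in related languages.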

Research Areas

Natural Language Processing
Speech Processing
