Multilingual Speech Recognition with Self-Attention Structured Parameterization

Yun Zhu

Parisa Haghani

Anshuman Tripathi

Bhuvana Ramabhadran

Brian Farris

Hainan Xu

Han Lu

Hasim Sak

Isabel Leal

Neeraj Gaur

Pedro Jose Moreno Mengibar

Qian Zhang

Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, ISCA

Abstract

Multilingual automatic speech recognition systems can transcribe utterances from different languages. These systems are attractive from different perspectives: they can provide quality improvements, especially for lower-resource languages, and simplify the training and deployment procedure. End-to-end speech recognition has further simplified multilingual modeling, as one model, instead of several components of a classical system, has to be unified. In this paper, we investigate a streamable end-to-end multilingual system based on the Transformer Transducer. We propose several techniques for adapting the self-attention architecture based on the language ID. We analyze the trade-offs of each method with regard to quality gains and the number of additional parameters introduced. We conduct experiments on a real-world task consisting of five languages. Our experimental results demonstrate ~10% and ~15% relative gain over the baseline multilingual model.

Research Areas

Speech Processing
