Convolutional dropout and wordpiece augmentation for end-to-end speech recognition

Bhuvana Ramabhadran

Hainan Xu

Kartik Audhkhasi

Yinghui Huang

Yun Zhu

ICASSP 2021 (2021)

Abstract

Regularization and data augmentation are crucial to training end-to-end automatic speech recognition systems. Dropout is a popular regularization technique that operates on each neuron independently by multiplying its activation by a Bernoulli random variable. We propose a generalization of dropout, called "convolutional dropout", in which each neuron's activation is replaced with a randomly weighted linear combination of neuron values in its neighborhood. We believe that this formulation combines the regularizing effect of dropout with the smoothing effect of the convolution operation. In addition to convolutional dropout, this paper proposes using random wordpiece segmentations as a data augmentation scheme during training, inspired by results in neural machine translation. We apply both methods when training transformer-transducer speech recognition models and show consistent improvements over strong baselines across different languages.
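
The two ideas in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify how the random weights are drawn or how segmentations are sampled, so everything below is an assumption made for illustration. The function names convolutional_dropout and random_segmentation, the parameters width and keep_prob, the Bernoulli-masked averaging weights, the edge clamping, and the uniform choice among matching vocabulary pieces are all hypothetical. Word-boundary markers used by real wordpiece models are omitted.

```python
import random


def convolutional_dropout(x, width=3, keep_prob=0.9, rng=None):
    """Replace each value with a randomly weighted sum of its neighbors.

    Assumed weighting (not taken from the paper): every neighbor in a window
    of `width` values gets an independent Bernoulli(keep_prob) mask, scaled by
    1 / (keep_prob * width) so the expected weights sum to one.
    With width=1 this reduces to ordinary (inverted) dropout.
    """
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    rng = rng or random.Random()
    half = width // 2
    last = len(x) - 1
    out = []
    for t in range(len(x)):
        total = 0.0
        for offset in range(-half, half + 1):
            # Clamp at the sequence edges (repeat the boundary value).
            j = min(max(t + offset, 0), last)
            if rng.random() < keep_prob:
                total += x[j] / (keep_prob * width)
        out.append(total)
    return out


def random_segmentation(word, vocab, rng=None):
    """Sample one segmentation of `word` into pieces from `vocab`.

    Assumed sampling scheme (not taken from the paper): at each position,
    pick uniformly among all vocabulary pieces that match the remaining
    text, falling back to a single character if none match.
    """
    rng = rng or random.Random()
    pieces = []
    i = 0
    while i < len(word):
        candidates = [
            word[i:j] for j in range(i + 1, len(word) + 1) if word[i:j] in vocab
        ]
        piece = rng.choice(candidates) if candidates else word[i]
        pieces.append(piece)
        i += len(piece)
    return pieces


if __name__ == "__main__":
    rng = random.Random(0)
    activations = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(convolutional_dropout(activations, width=3, keep_prob=0.8, rng=rng))

    toy_vocab = {"s", "p", "e", "c", "h", "sp", "ee", "ch", "spee", "speech"}
    for _ in range(3):
        print(random_segmentation("speech", toy_vocab, rng=rng))
```

In a training loop, a sketch like this would be applied only during training: the dropout variant to intermediate activations, and the segmentation sampler to each transcript so that the same utterance is seen with different target sequences across epochs. Setting keep_prob to 1.0 turns the first function into a plain moving average, which makes the smoothing component easy to inspect in isolation.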
