Multi-Accent Speech Recognition with Hierarchical Grapheme Based Models

Hasim Sak; Kanishka Rao

ICASSP 2017 (to appear)

Abstract

We explore the viability of grapheme-based recognition,
specifically how it compares to phoneme-based equivalents.
We use the CTC loss to train models to directly predict
graphemes. We also train models with hierarchical CTC and
show that they improve on previous CTC models. We also
explore how the grapheme and phoneme models scale with
large data sets: we consider a single acoustic training data set
in which we combine various dialects of English from the US,
UK, India and Australia. We show that by training a single
grapheme-based model on this multi-dialect data set, we create
an accent-robust ASR system.
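
The abstract does not give the grapheme inventory or the decoding procedure. As a rough illustration of what "directly predicting graphemes" with CTC involves, here is a minimal sketch. It assumes a lowercase English inventory (space, apostrophe, a to z) plus a blank symbol written as "_". All names in it are hypothetical and none come from the paper. It shows only the standard CTC collapsing rule (merge repeated labels, then drop blanks), not the hierarchical CTC training the authors describe.

```python
# Assumed grapheme inventory (not from the paper): blank, space, apostrophe, a-z.
BLANK = "_"
GRAPHEMES = [BLANK, " ", "'"] + [chr(c) for c in range(ord("a"), ord("z") + 1)]
GRAPHEME_TO_ID = {g: i for i, g in enumerate(GRAPHEMES)}


def text_to_grapheme_ids(text):
    """Map a transcript to grapheme label ids, skipping characters outside the inventory."""
    return [
        GRAPHEME_TO_ID[ch]
        for ch in text.lower()
        if ch in GRAPHEME_TO_ID and ch != BLANK
    ]


def ctc_collapse(frame_ids, blank_id=0):
    """Collapse a frame-level label path: merge consecutive repeats, then remove blanks."""
    collapsed = []
    previous = None
    for label in frame_ids:
        if label != previous and label != blank_id:
            collapsed.append(label)
        previous = label
    return collapsed


def ids_to_text(ids):
    """Turn grapheme ids back into a string."""
    return "".join(GRAPHEMES[i] for i in ids)


# A frame-level path in which "_" marks the blank; the blank between the two
# "l" runs is what lets the double letter survive collapsing.
path = [GRAPHEME_TO_ID[ch] for ch in "hh_ee_l_lloo"]
decoded = ids_to_text(ctc_collapse(path))
```

In this sketch the targets are the characters of the transcript itself, so no pronunciation lexicon is needed. That is the practical difference from a phoneme-based setup.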
