Revisiting Graphemes with Increasing Amounts of Data

Yun-Hsuan Sung; Thad Hughes; Francoise Beaufays; Brian Strope

ICASSP, IEEE (2009)

Abstract

Letter units, or graphemes, have been reported in the literature as a surprisingly effective substitute for the more traditional phoneme units, at least in languages that enjoy a strong correspondence between pronunciation and orthography. For English, however, where letter symbols have less acoustic consistency, previously reported results fell short of systems using highly tuned pronunciation lexicons. Grapheme units simplify system design, but since graphemes map to a wider set of acoustic realizations than phonemes, we should expect grapheme-based acoustic models to require more training data to capture these variations. In this paper, we compare the rate of improvement of grapheme and phoneme systems trained with datasets ranging from 450 to 1200 hours of speech. We consider various grapheme unit configurations, including using letter-specific, onset, and coda units. We show that the grapheme systems improve faster and, depending on the lexicon, reach or surpass the phoneme baselines with the largest training set.

Index Terms— Acoustic modeling, graphemes, directory assistance, speech recognition.
