Building Transcribed Speech Corpora Quickly and Cheaply for Many Languages

Thad Hughes; Kaisuke Nakajima; Linne Ha; Atul Vasu; Pedro Moreno; Mike LeBeau

Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010), International Speech Communication Association, pp. 1914-1917

Abstract

We present a system for quickly and cheaply building transcribed speech corpora containing utterances from many speakers in a variety of acoustic conditions. The system consists of a client application running on an Android mobile device with an intermittent Internet connection to a server. The client application collects demographic information about the speaker, fetches textual prompts from the server for the speaker to read, records the speaker’s voice, and uploads the audio and associated metadata to the server. The system has so far been used to collect over 3000 hours of transcribed audio in 17 languages around the world.
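The client workflow described above (collect speaker metadata, fetch prompts, record, upload over an intermittent connection) can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the `FakeServer` stand-in, the `online` flag, and the metadata fields are all hypothetical, and the actual client is an Android application whose protocol is not described here. The sketch only shows one plausible way to keep recording while offline by caching prompts and queuing uploads.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Recording:
    prompt: str     # text the speaker was asked to read
    audio: bytes    # recorded audio for that prompt
    speaker: dict   # demographic metadata (hypothetical fields)


@dataclass
class CollectionClient:
    speaker: dict
    prompts: deque = field(default_factory=deque)  # prompts cached while online
    pending: deque = field(default_factory=deque)  # recordings awaiting upload

    def fetch_prompts(self, server, count):
        """Cache prompts locally so recording can continue offline."""
        self.prompts.extend(server.get_prompts(count))

    def record_next(self, record_audio):
        """Record the next cached prompt and queue the result for upload."""
        prompt = self.prompts.popleft()
        self.pending.append(Recording(prompt, record_audio(prompt), self.speaker))

    def sync(self, server, online):
        """Upload queued recordings; leave them queued if the link is down."""
        uploaded = 0
        while online and self.pending:
            server.upload(self.pending[0])
            self.pending.popleft()  # drop only after the upload call returns
            uploaded += 1
        return uploaded


class FakeServer:
    """In-memory stand-in for the real server, for illustration only."""

    def __init__(self, prompts):
        self._prompts = deque(prompts)
        self.received = []

    def get_prompts(self, count):
        n = min(count, len(self._prompts))
        return [self._prompts.popleft() for _ in range(n)]

    def upload(self, recording):
        self.received.append(recording)
```

In this sketch, calling `sync` with `online=False` uploads nothing and leaves the queue intact, so recordings made without connectivity are sent on a later call once the connection returns.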

Research Areas

Natural language processing
