Google Research

Managing Transcription Data for Automatic Speech Recognition with Elpis

The Open Handbook of Linguistic Data Management, The MIT Press (2022)


This chapter provides a ‘mid-level’ introduction to speech recognition technologies, with particular reference to Elpis (Foley et al., 2018), a tool designed to let people with minimal computational experience take advantage of modern speech recognition technologies in their language documentation transcription workflows. Elpis is intended to be usable even where the large quantities of previously transcribed recordings typically required for training speech recognition systems are not available. Even in language documentation contexts where only one or two hours of transcribed recordings exist, speech recognition can benefit the transcription process by providing an initial estimate that can be refined more quickly than a transcription can be typed from scratch.

