Korean Voice Input -- Have you Dictated your E-Mails in Korean lately?

October 14, 2010

Posted by Mike Schuster & Kaisuke Nakajima, Google Research

Google Voice Search has been available in various flavors of English since 2008, in Mandarin and Japanese since 2009, in French, Italian, German and Spanish since June 2010 (see also in this blog post), and shortly after that in Taiwanese. On June 16th 2010, we took the next step by launching our Korean Voice Search system.

Korean Voice Search, by focusing on finding the correct web page for a spoken query, has been quite successful since launch. We have improved the acoustic models several times which resulted in significantly higher accuracy and reduced latency, and we are committed to improving it even more over time.

While voice search significantly simplifies input for search, especially for longer queries, there are numerous applications on any smartphone that could also benefit from general voice input, such as dictating an email or an SMS. Our experience with US English has taught us that voice input is as important as voice search, as the time savings from speaking rather than typing a message are substantial. Korean is the first non-English language where we are launching general voice input. This launch extends voice input to emails, SMS messages, and more on Korean Android phones. Now every text field on the phone will accept Korean speech input.

Creating a general voice input service had different requirements and technical challenges compared to voice search. While voice search was optimized to give the user the correct web page, voice input was optimized to minimize (Hangul) character error rate. Voice inputs are usually longer than searches (short full sentences or parts of sentences), and the system had to be trained differently for this type of data. The current system’s language model was trained on millions of Korean sentences that are similar to those we expect to be spoken. In addition to the queries we used for training voice search, we also used parts of web pages, selected blogs, news articles and more. Because the system expects spoken data similar to what it was trained on, it will generally work well on normal spoken sentences, but may yet have difficulty on random or rare word sequences -- we will work to keep improving on those.

Korean voice input is part of Google’s long-term goal to make speech input an acceptable and useful form of input on any mobile device. As with voice search, our cloud computing infrastructure will help us to improve quality quickly, as we work to better support all noise conditions, all Korean dialects, and all Korean users.