Using Machine Translation to Localize Task Oriented NLG Output

Scott Roy
Cliff Brunk
Kyu-Young Kim
Justin Xu Zhao
Sidharth Mudgal
Chris Varano
CoRR, vol. abs/2107.04512 (2021)

Abstract

One of the challenges for a task-oriented NLG system like the Google Assistant is internationalizing its output to many languages. This paper explores doing so by applying machine translation to the English output. Machine translation is very scalable, since it can work with any English output and can handle dynamic text, but it is difficult to meet the required quality bar: machine translation is good, but for a commercial NLG application it often needs to be nearly perfect. Fortunately, in task-oriented NLG the quality only needs to reach this bar for the narrow range of sentences that the NLG system can actually produce. We reach this quality using a combination of semantic annotations, fine-tuning on in-domain translations, automatic error detection, and sentences from the Web. This paper shares our approach and results, together with a distillation model to serve the NMT models at scale.