Scale the Authoring of High-Quality Lexicon Knowledge
Abstract
In this document, we describe T-Lex, a tool designed to address the problem of collecting high-quality Lexicon knowledge for natural language generation (NLG). For a computer to speak naturally and fluently, it is critical to collect a large amount of high-quality Lexicon knowledge. T-Lex makes three main contributions: (1) it scales tooling support to the Lexicon models of new locales; (2) it provides a filter/annotate/arbitration workflow and an ACL system to speed up data collection and ensure data quality; and (3) it offers a way to validate and fix data issues quickly. The results show that the system can collect high-quality Lexicon knowledge with better scalability.