- Ajda Gokcen
- Hao Zhang
- Richard Sproat
Abstract
Neural text normalization systems achieve high accuracy, but the errors they do make include not only “acceptable” errors (such as reading $3 as three dollar) but also unacceptable errors (such as reading $3 as three euros). We explore ways of training dual encoder classifiers on both positive and negative data and then using them as soft constraints in neural text normalization, with the aim of reducing the number of unacceptable errors. Because error rates are already low and performance on the evaluation set is highly variable, it is difficult to determine when an improvement is significant, but qualitative analysis suggests that certain types of dual encoder constraints yield systems that make fewer unacceptable errors.