- Xikun Zhang
- Deepak Ramachandran
- Ian Tenney
- Yanai Elazar
- Dan Roth
EMNLP'20 (2020)
We show that embedding-based language models capture a significant amount of information about the scalar magnitudes of objects but fall short of the capability required for general common-sense reasoning. We identify ambiguity and numeracy as the key factors limiting their performance, and show that a simple reversible transformation of the pre-training corpus can have a significant effect on the results. We also identify the best models and metrics to use for zero-shot transfer across tasks in this domain.