Square One Bias in NLP: Towards a Multi-Dimensional Exploration of the Research Manifold

Sebastian Ruder
Ivan Vulić
Anders Søgaard
Findings of ACL 2022 (to appear)

Abstract

The first NLP experiment you did was likely training a standard architecture on labeled English data and optimizing for accuracy, e.g., without worrying about fairness, interpretability, or computational efficiency. We call this the square one of NLP research---and establish through surveys that such a square one exists. NLP research often goes beyond this experimental setup, of course, e.g., focusing not only on accuracy but also on fairness or interpretability, but typically only along a single dimension. Most work focused on multilinguality, for example, considers only accuracy; most work on fairness or interpretability considers only English; and so on. We show this through manual classification of recent NLP research papers, as well as of ACL Test-of-Time award recipients. The one-dimensionality of most research means we are only exploring a fraction of the NLP research search space. We provide historical and more recent examples of how this bias has led researchers to draw false conclusions or make unwise choices, and point to unexplored directions on the research manifold.