- Mark Díaz
- Razvan Adrian Amironesei
- Laura Weidinger
- Iason Gabriel
Tasks such as toxicity detection, hate speech detection, and online harassment detection have been developed for identifying and intervening in interactions that have the potential to cause social harms. These tasks, which identify and classify offensive or undesirable language, have gone by different names and have employed varying task definitions. However, they are united by the goal of reducing harm and breakdowns in civil discourse. Because language use varies from context to context, a major challenge to the success of these methods arises from the need to properly model and understand nuanced social context. Modeling social context has been identified as a significant obstacle that stands to limit the performance of natural language processing (NLP) systems.
In this work we articulate the need for a relational understanding of offensiveness as well as a north-star definition of this concept for NLP research. Many classification tasks implicitly treat offensiveness as a fixed property of language. However, offense emerges in the context of relationships between individuals or broader networks of social actors (including human-like actors) and the language used between them. Using examples of speech drawn from members of marginalized groups, we argue that a fuller account of offensive speech, and when it is objectionable, must focus on the ends, or impact, of language and how it is used. We also explore the degree to which NLP systems may encounter limits when modeling relational factors, for example due to technical limitations or concerns regarding privacy in data collection for training and evaluation. Nonetheless, developing a robust, translatable, relational understanding of offensiveness is key to the successful operationalization and use of this concept. Addressing this challenge, the present work considers how offensiveness has been operationalized in classification tasks, along with the affordances and weaknesses of those operationalizations. We also discuss how a more relational approach can be implemented in data collection techniques and operationalizations of offensiveness.