Google Research

Capturing Covertly Toxic Speech via Crowdsourcing

HCI, (2021) (to appear)


We study the task of extracting covert or veiled toxicity labels from user comments. Prior research has highlighted the difficulty of creating language models that recognize nuanced toxicity such as microaggressions. Our investigations further underscore the difficulty of parsing such labels reliably from raters via crowdsourcing. We introduce an initial dataset, COVERTTOXICITY, which aims to identify such comments using a refined rater template with rater-associated categories. Finally, we fine-tune a comment-domain BERT model to classify covertly offensive comments and compare it against existing baselines.
