Context-Sensitivity Estimation in Toxicity Classification
Abstract
Toxicity detection is of growing importance in social and other media, as it helps keep discussions healthy. Most previous work ignores the context of user posts, which can mislead systems and moderators into incorrectly classifying toxic posts as non-toxic, or vice versa. Recent work concluded that datasets containing many more context-sensitive posts are needed to correctly train and evaluate context-aware toxicity classifiers. We re-annotated an existing toxicity dataset, adding context-aware ground truth to the existing context-unaware ground truth. Exploiting both types of ground truth, context-aware and context-unaware, we develop and evaluate a classifier that determines whether a post is context-sensitive, i.e., whether its perceived toxicity changes when the parent post is taken into account. The classifier can be used to collect more context-sensitive posts. It can also be used to determine when a moderator needs to consult the parent post, thereby reducing moderation cost, or when a context-aware toxicity detection system should be invoked, as opposed to a simpler context-unaware one. We also discuss how the context-sensitivity classifier can help prevent malicious users from exploiting the context-unawareness of current toxicity detectors. The datasets and code of the models addressing this novel task will be made publicly available.
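To make the task concrete, the sketch below shows one possible way to set up such a context-sensitivity classifier: a target post and its parent post are fed to an encoder as a sequence pair, and the binary label is derived from the two ground truths mentioned above (1 if the context-aware and context-unaware toxicity labels disagree, 0 otherwise). This is an illustrative sketch under assumed names, not the implementation of this paper; the model checkpoint, field names (parent, target, label_with_ctx, label_without_ctx), and training details are assumptions.

```python
# Illustrative sketch (not the paper's implementation): fine-tune a binary
# classifier that predicts whether a post is context-sensitive, i.e. whether
# its perceived toxicity changes once the parent post is taken into account.
# Model name, field names, and training hyperparameters are assumptions.
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # any encoder accepting sequence-pair input

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)


class ContextSensitivityDataset(Dataset):
    """Pairs each target post with its parent post.

    The binary label is derived from the two ground truths: 1 if the
    context-aware and context-unaware toxicity labels disagree
    (the post is context-sensitive), 0 otherwise.
    """

    def __init__(self, examples):
        # examples: list of dicts with keys parent, target,
        # label_with_ctx, label_without_ctx (hypothetical field names)
        self.examples = examples

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        ex = self.examples[idx]
        enc = tokenizer(
            ex["parent"],   # parent post as the first segment
            ex["target"],   # target post as the second segment
            truncation=True,
            padding="max_length",
            max_length=256,
            return_tensors="pt",
        )
        label = int(ex["label_with_ctx"] != ex["label_without_ctx"])
        return {
            "input_ids": enc["input_ids"].squeeze(0),
            "attention_mask": enc["attention_mask"].squeeze(0),
            "labels": torch.tensor(label),
        }


def train(examples, epochs=3, lr=2e-5, batch_size=16):
    loader = DataLoader(
        ContextSensitivityDataset(examples), batch_size=batch_size, shuffle=True
    )
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in loader:
            optimizer.zero_grad()
            out = model(**batch)  # cross-entropy loss over the two classes
            out.loss.backward()
            optimizer.step()
```

At inference time, the predicted probability of the positive class could be used to decide whether to route a post to a context-aware toxicity detector, or to show the parent post to a moderator, mirroring the use cases outlined in the abstract.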