Jeffrey Sorensen
Authored Publications
Context Sensitivity Estimation in Toxicity Detection
Alexandros Xenos
Ioannis Pavlopoulos
Ion Androutsopoulos
First Monday (2022)
Context-sensitive posts are rare in toxicity detection datasets. This fact leads to models that disregard even the conversational context (e.g., the parent post) when they predict toxicity. This work introduces the task of context-sensitivity estimation in toxicity detection. We present and publicly release the first dataset that can be used to build context-sensitivity estimation systems. We further show that systems trained on our dataset can be effectively used to detect posts that depend on the parent post, regarding toxicity detection.
A New Generation of Perspective API: Efficient Multilingual Character-level Transformers
Alyssa Whitlock Lees
Yi Tay
Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2022)
On the world wide web, toxic content detectors are a crucial line of defense against potentially hateful and offensive messages. As such, building highly effective classifiers that enable a safer internet is an important research area. Moreover, the web is a highly multilingual, cross-cultural community that develops its own lingo over time. As such, developing models that can be effective across a diverse range of languages, usages, and styles is crucial. In this paper, we present Jigsaw Perspective API’s new generation of toxic content classifiers which takes a step towards this unified vision. At the heart of the approach is a single multilingual token-free Charformer model that is applicable across languages, domains, and tasks. We demonstrate that by forgoing static vocabularies, we gain flexibility across a variety of settings. We additionally outline the techniques employed to make such a byte-level model efficient and feasible for productionization. Through extensive experiments on multilingual toxic comment classification benchmarks derived from real API traffic and evaluation on an array of code-switching, covert toxicity, emoji-based hate, human-readable obfuscation, distribution shift, and bias evaluation settings, we show that our proposed approach outperforms strong baselines. Finally, we present our findings of deploying this system in production, and discuss our observed benefits over traditional approaches.
Context-Sensitivity Estimation in Toxicity Classification
Ioannis Pavlopoulos
Ion Androutsopoulos
ACL-IJCNLP 2021 (2021)
Toxicity detection is of growing importance in social and other media to allow healthy discussions. Most previous work ignores the context of user posts, which can mislead systems and moderators to incorrectly classify toxic posts as non-toxic, or vice versa. Recent work concluded that datasets containing many more context-aware posts are needed to correctly train and evaluate context-aware toxicity classifiers. We re-annotated an existing toxicity dataset, adding context-aware ground truth to the existing context-unaware ground truth. Exploiting both types of ground truth, context-aware and context-unaware, we develop and evaluate a classifier that can determine if a post is context-sensitive or not. The classifier can be used to collect more context-sensitive posts. It can also be used to determine when a moderator needs to consider the parent post (to decrease the moderation cost) or when a context-aware toxicity detection system has to be invoked, as opposed to using a simpler context-unaware system. We also discuss how the context-sensitivity classifier can help avoid a possibly malicious exploitation of the context-unawareness of current toxicity detectors. Datasets and code of models addressing this novel task will become publicly available.
Jigsaw @ AMI and HaSpeeDe2: Fine-Tuning a Pre-Trained Comment-Domain BERT Model
Alyssa Whitlock Lees
Ian Kivlichan
Proceedings of Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020), CEUR.org, Online (to appear)
The Google Jigsaw team produced submissions for two of the
EVALITA 2020 shared tasks, based in part on the
technology that powers the publicly available
Perspective API comment evaluation service.
We present a basic description of our submitted results and a review of
the types of errors that our system made in these shared tasks.
Nuanced Metrics for Measuring Unintended Bias with Real Data for Text Classification
Daniel Borkan
ACM Conference on Fairness, Accountability, and Transparency (2019) (to appear)
Unintended bias in Machine Learning can manifest as systemic differences in performance for different demographic groups, potentially compounding existing challenges to fairness in society at large. In this paper, we introduce a suite of threshold-agnostic metrics that provide a nuanced view of this unintended bias, by considering the various ways that a classifier's score distribution can vary across designated groups. We also introduce a large new test set of online comments with crowd-sourced annotations for identity references. We use this to show how our metrics can be used to find new and potentially subtle unintended bias in existing public models.
WikiConv: A Corpus of the Complete Conversational History of a Large Online Collaborative Community
Cristian Danescu
Dario Taraborelli
Yiqing Hua
ACL (2018), pp. 5
We present a corpus that encompasses the complete history of conversations between contributors of English Wikipedia, one of the largest online collaborative communities.
By recording the intermediate states of conversations (including not only comments and replies, but also their modifications, deletions, and restorations), this data offers an unprecedented view of online conversation.
This level of detail supports new research questions pertaining to the process (and challenges) of large-scale online collaboration.
We illustrate the corpus' potential with two case studies that highlight new perspectives on earlier work.
First, we explore how a person's conversational behavior depends on how they relate to the discussion venue.
Second, we show that community moderation of toxic behavior happens at a higher rate than previously estimated.
Measuring and Mitigating Unintended Bias in Text Classification
John Li
AAAI/ACM Conference on AI, Ethics, and Society (2018)
We introduce and illustrate a new approach to measuring and
mitigating unintended bias in machine learning models. Our
definition of unintended bias is parameterized by a test set
and a subset of input features. We illustrate how this can
be used to evaluate text classifiers using a synthetic test set
and a public corpus of comments annotated for toxicity from
Wikipedia Talk pages. We also demonstrate how imbalances
in training data can lead to unintended bias in the resulting
models, and therefore potentially unfair applications. We use
a set of common demographic identity terms as the subset of
input features on which we measure bias. This technique permits
analysis in the common scenario where demographic information
on authors and readers is unavailable, so that bias
mitigation must focus on the content of the text itself. The
mitigation method we introduce is an unsupervised approach
based on balancing the training dataset. We demonstrate that
this approach reduces the unintended bias without compromising
overall model quality.
The OpenGrm Open-Source Finite-State Grammar Software Libraries
Richard Sproat
Terry Tai
ACL (System Demonstrations) (2012), pp. 61-66
Unary Data Structures for Language Models
Interspeech 2011, International Speech Communication Association, pp. 1425-1428
Language models are important components of speech recognition and machine translation systems.
Trained on billions of words, and consisting of billions of parameters, language models often are the
single largest components of these systems. There have been many proposed techniques to reduce the
storage requirements for language models. A technique based upon pointer-free compact storage of
ordinal trees shows compression competitive with the best proposed systems, while retaining the full
finite state structure, and without using computationally expensive block compression schemes or
lossy quantization techniques.