Google Research

WikiConv: A Corpus of the Complete Conversational History of a Large Online Collaborative Community

ACL (2018), pp. 5

Abstract

We present a corpus that encompasses the complete history of conversations between contributors of English Wikipedia, one of the largest online collaborative communities.
By recording the intermediate states of conversations (including not only comments and replies, but also their modifications, deletions and restorations), this data offers an unprecedented view of online conversation. This level of detail supports new research questions pertaining to the process (and challenges) of large-scale online collaboration.
We illustrate the corpus' potential with two case studies that highlight new perspectives on earlier work. First, we explore how a person's conversational behavior depends on how they relate to the discussion venue. Second, we show that community moderation of toxic behavior happens at a higher rate than previously estimated.
