Google Award Program stimulates Journalism and CS collaboration

February 19, 2014

Posted by Krishna Bharat, Distinguished Research Scientist

Last fall, Google invited academic researchers to participate in a Computational Journalism awards program focused on the intersection of Computer Science and Journalism. We solicited proposals for original research projects relevant to today’s fast-evolving news industry.

As technology continues to shape and be shaped by the media landscape, applicants were asked to rethink traditional models and roles in the ecosystem, and reimagine the lifecycle of the news story in the online world. We encouraged them to develop innovative tools and open source software that could benefit readers and be game-changers for reporters and publishers. Each award includes funding of $60,000 in cash and $20,000 in computing credits on Google’s Cloud Platform.

We congratulate the recipients of these awards, whose projects are described below, and look forward to the results of their research. Stay tuned for updates on their progress.

Larry Birnbaum, Professor of Electrical Engineering and Computer Science, and Journalism, Northwestern University
Project: Thematic Characterization of News Stories
This project aims to develop computational methods for identifying abstract themes or "angles" in news stories, e.g., seeing a story as an instance of "pulling yourself up by your bootstraps," or as a "David vs. Goliath" story. In collaboration with journalism and computer science students, we will develop applications utilizing these methods in the creation, distribution, and consumption of news content.

Irfan Essa, Professor, Georgia Institute of Technology
Project: Tracing Reuse in Political Language
Our goal in this project is to research and then develop a data-mining tool that lets an online researcher find and trace language reuse. By language reuse, we specifically mean: can we determine whether language used in a current text can be traced back to some other text or script? The technical innovation in this project lies in (1) identifying linguistic reuse in documents as well as in other forms of material that can be converted to text, including political speeches and videos, and (2) tracing linguistic reuse through the web and online social networks.
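
To make the idea of language reuse concrete, here is a minimal sketch of one common way to spot it: comparing overlapping word sequences ("shingles") between two texts. This is only an illustration, not the project's actual method. The function names, the five-word window, and the sample sentences are all assumptions made up for this example.

```python
import re

def shingles(text, n=5):
    """Return the set of word n-grams (shingles) in text, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def shared_phrases(current, earlier, n=5):
    """Return the n-word phrases that appear in both texts, sorted alphabetically."""
    common = shingles(current, n) & shingles(earlier, n)
    return sorted(" ".join(gram) for gram in common)

# Hypothetical sample texts, invented for illustration only.
earlier_speech = "We will build an economy that works for every family in this state."
current_speech = "My plan is simple: build an economy that works for every family, starting now."

reused = shared_phrases(current_speech, earlier_speech, n=5)
for phrase in reused:
    print(phrase)
```

A shorter window catches more overlap but also more coincidental matches. Tracing reuse across speeches, videos, and social networks would need far more than exact matching, such as transcription and tolerance for paraphrase.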

Susan McGregor, Assistant Director, Tow Center for Digital Journalism, Columbia Journalism School
Project: InfoScribe
InfoScribe is a collaborative web platform that lets citizens participate in investigative journalism projects by digitizing select data from scanned document sets uploaded by journalists. One of InfoScribe's primary research goals is to explore how community participation in journalistic activities can help improve their accuracy, transparency and impact. Additionally, InfoScribe seeks to build and expand upon understandings of how computer vision and statistical inference can be most efficiently combined with human effort in the completion of complex tasks.

Paul Resnick, Professor, University of Michigan School of Information
Project: RumorLens
RumorLens is a tool that will aid journalists in finding posts that spread or correct a particular rumor on Twitter, and in exploring the size of the audiences that those posts have reached. In the collection phase, the user provides one or a few exemplar tweets and then manually classifies a few hundred others as spreading the rumor, correcting it, or unrelated to it. This enables automatic retrieval and classification of the remaining tweets, which are then presented in an interactive visualization that shows audience sizes.
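
As a rough illustration of the label-then-classify workflow described above, the sketch below trains a small naive Bayes classifier on a handful of hand-labeled tweets, classifies the rest, and totals an audience figure per label. It is not RumorLens's implementation: the choice of naive Bayes, the function names, the sample tweets, and the use of follower counts as a stand-in for audience size are all assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict

def tokens(text):
    """Split a tweet into lowercase word tokens."""
    return re.findall(r"[a-z0-9#@']+", text.lower())

def train(labeled):
    """Count words per label and tweets per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labeled:
        label_counts[label] += 1
        word_counts[label].update(tokens(text))
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Pick the most likely label using naive Bayes with add-one smoothing."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)
        for w in tokens(text):
            score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def audience_by_label(tweets, word_counts, label_counts):
    """Sum follower counts (an assumed proxy for audience) per predicted label."""
    totals = Counter()
    for text, followers in tweets:
        totals[classify(text, word_counts, label_counts)] += followers
    return dict(totals)

# Hypothetical hand-labeled tweets; a real run would use a few hundred.
labeled = [
    ("Breaking: the bridge has collapsed downtown", "spreading"),
    ("Heard the bridge collapsed, stay away", "spreading"),
    ("False: the bridge has not collapsed, officials confirm", "correcting"),
    ("That bridge rumor is false, officials say it is open", "correcting"),
    ("Great coffee at the new cafe downtown", "unrelated"),
    ("Traffic is slow on the highway today", "unrelated"),
]
# Hypothetical unlabeled tweets paired with follower counts.
remaining = [
    ("Officials confirm the rumor is false", 1200),
    ("The bridge collapsed, avoid downtown", 5000),
    ("New cafe has great coffee today", 300),
]

word_counts, label_counts = train(labeled)
audiences = audience_by_label(remaining, word_counts, label_counts)
print(audiences)
```

With only six training tweets the classifier is brittle. The point is the shape of the workflow: a small amount of manual labeling lets the remaining tweets be sorted automatically and their audiences compared.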

Ryan Thornburg, Associate Professor, School of Journalism and Mass Communication, University of North Carolina at Chapel Hill
Project: Public Records Dashboard for Small Newsrooms
Building on our Knight News Challenge effort to bring data-driven journalism to readers of rural newspaper websites, we are developing an internal newsroom tool that will alert reporters and editors to potential story tips found in public data. Our project aims to lower the cost of finding stories in public data sets that shine light in dark places, hold powerful people accountable, and explain our increasingly complex and interconnected world. (Public-facing site for the data acquisition element of the project at