Underspecification in Scene Description to Depiction Tasks

Ben Hutchinson; Jason Baldridge; Vinodkumar Prabhakaran

ACL Rolling Review (2022) (to appear)

Abstract

Questions regarding implicitness, ambiguity and underspecification are crucial for multimodal image+text systems, but have received little attention to date. This paper maps out a conceptual framework to address this gap for systems which generate images from text inputs, specifically for systems which generate images depicting scenes from descriptions of those scenes. In doing so, we account for how texts and images convey different forms of meaning. We then outline a set of core challenges concerning textual and visual ambiguity and specificity tasks, as well as risks that may arise from improper handling of ambiguous and underspecified elements. We propose and discuss two strategies for addressing these challenges: a) generating a visually ambiguous output image, and b) generating a set of diverse output images.
