A Dataset and Architecture for Visual Reasoning with a Working Memory

Robert Guangyu Yang

Igor Ganichev

Xiao-Jing Wang

Jonathon Shlens

David Sussillo

ECCV (2018)


Abstract

A vexing problem in artificial intelligence is reasoning about events that occur in complex, changing visual stimuli, such as in video analysis or game play. Inspired by cognitive psychology and neuroscience, which have a rich tradition of studying both visual reasoning and memory, we developed a configurable visual question-answering dataset (COG). COG is much simpler than the general problem of video analysis, yet it addresses many of the problems relating to visual and logical reasoning and memory, problems that remain challenging for modern deep learning architectures. We additionally propose a deep learning architecture that performs at a state-of-the-art level on the CLEVR dataset and performs well on easy settings of the COG dataset, but struggles on harder settings. Preliminary analyses demonstrate that the network accomplishes the task in ways that are interpretable to humans.
