Temporal Reasoning in Videos using Convolutional Gated Recurrent Units

Debidatta Dwibedi

Jonathan Tompson

Pierre Sermanet

CVPR Workshop (2018)

Abstract

Recently, deep-learning-based models have advanced the state of the art in action recognition in videos. Yet for many large-scale datasets such as Kinetics and UCF101, the correct temporal order of frames does not seem to be essential to solving the task. We find that temporal order matters more for the recently introduced 20BN Something-Something dataset, where fine-grained action recognition requires the model to perform temporal reasoning. We show that when temporal order matters, recurrent models can significantly outperform non-recurrent models. This also gives us an opportunity to inspect the recurrent units with qualitative approaches and gain more insight into what they encode about actions in videos.
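
To make the role of frame order concrete, here is a minimal NumPy sketch of a generic convolutional GRU, compared against simple averaging over frames. It is an illustration under stated assumptions, not the architecture or training setup from the paper: the gate formulation is the standard GRU with dense products replaced by 3x3 "same" convolutions, biases are omitted, and all names (`conv2d_same`, `conv_gru_step`, `run_conv_gru`, `mean_pool`), tensor shapes, and random weights are hypothetical.

```python
import numpy as np


def conv2d_same(x, w):
    """'Same'-padded 2D cross-correlation (the usual deep-learning "convolution").

    x: (H, W, C_in); w: (k, k, C_in, C_out) with odd k. Returns (H, W, C_out).
    """
    k = w.shape[0]
    p = k // 2
    xp = np.pad(x, ((p, p), (p, p), (0, 0)))
    H, W = x.shape[:2]
    out = np.zeros((H, W, w.shape[-1]))
    for i in range(k):
        for j in range(k):
            # Shifted view of the input times kernel tap (i, j), summed over C_in.
            out += xp[i:i + H, j:j + W, :] @ w[i, j]
    return out


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def conv_gru_step(x, h, params):
    """One ConvGRU update (assumed generic form, no biases).

    The gates z and r are feature maps with the same shape as h.
    """
    xh = np.concatenate([x, h], axis=-1)
    z = sigmoid(conv2d_same(xh, params["Wz"]))  # update gate
    r = sigmoid(conv2d_same(xh, params["Wr"]))  # reset gate
    xrh = np.concatenate([x, r * h], axis=-1)
    h_tilde = np.tanh(conv2d_same(xrh, params["Wh"]))  # candidate state
    return (1.0 - z) * h + z * h_tilde


def run_conv_gru(frames, params, hidden_channels):
    """Feed frames of shape (T, H, W, C_in) in order; return the final state."""
    H, W = frames.shape[1:3]
    h = np.zeros((H, W, hidden_channels))
    for x in frames:
        h = conv_gru_step(x, h, params)
    return h


def mean_pool(frames):
    """Order-invariant baseline: average the frames over time."""
    return frames.mean(axis=0)


# Hypothetical toy "video": 6 random frames of 8x8 pixels with 3 channels.
rng = np.random.default_rng(0)
T, H, W, C_in, C_h = 6, 8, 8, 3, 4
frames = rng.normal(size=(T, H, W, C_in))
params = {
    name: rng.normal(scale=0.3, size=(3, 3, C_in + C_h, C_h))
    for name in ("Wz", "Wr", "Wh")
}
reversed_frames = frames[::-1]

# Averaging ignores frame order; the recurrent state depends on it.
pooled_same = np.allclose(mean_pool(frames), mean_pool(reversed_frames))
final_state = run_conv_gru(frames, params, C_h)
gru_same = np.allclose(final_state, run_conv_gru(reversed_frames, params, C_h))
print("mean pooling unchanged by reversal:", pooled_same)
print("ConvGRU state unchanged by reversal:", gru_same)
```

Reversing the frames leaves the averaged representation unchanged but alters the final recurrent state, which is the property that lets a recurrent model exploit temporal order when a dataset requires it. Because the hidden state keeps its spatial layout, its feature maps can also be visualized directly, in the spirit of the qualitative inspection mentioned in the abstract.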

Research Areas

Machine Intelligence
Machine Perception
