High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks

Ruben Villegas; Arkanath Pathak; Harini Kannan; Dumitru Erhan; Quoc Le; Honglak Lee

NeurIPS (2019)

Abstract

Predicting future video frames is extremely challenging, as many factors of variation make up the dynamics of how frames change through time. Previously proposed solutions require complex network architectures and highly specialized computation, including segmentation masks, optical flow, and foreground and background separation. In this work, we question whether such handcrafted architectures are necessary and instead propose a different approach: maximizing the capacity of a standard convolutional neural network. We perform the first large-scale empirical study of the effect of capacity on video prediction models. In our experiments, we demonstrate our results on three different datasets: one for modeling object interactions, one for modeling human motion, and one for modeling first-person car driving.
