Video Probabilistic Diffusion Models in Projected Latent Space

Jinwoo Shin

Kihyuk Sohn

Sihyun Yu

Subin Kim

ICLR 2023 (2023)

Abstract

Despite the remarkable progress in deep generative models, synthesizing high-resolution and temporally coherent videos remains a challenge due to their high dimensionality and complex temporal dynamics, along with large spatial variations. Recent works on diffusion models have shown their potential to solve this challenge, yet they suffer from severe computational inefficiency at generation time, which limits their scalability. To handle this issue, we propose a novel generative model for videos, coined projected latent video diffusion model (PVDM), a probabilistic diffusion model that learns a video distribution in a low-dimensional latent space. Specifically, PVDM is composed of two components: (a) an autoencoder that projects a given video onto 2D-shaped latent vectors that factorize the complex cubic structure of video pixels, and (b) a diffusion model architecture specialized for our new factorized latent space, together with a training/sampling procedure that synthesizes videos of arbitrary length with a single model. Experiments on various benchmarks demonstrate the effectiveness of PVDM compared with previous video generation methods; e.g., PVDM obtains an FVD score of 548.1 on UCF-101, a 61.7% improvement over the 1431.0 of the prior state-of-the-art.
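As a rough illustration of why factorizing the cubic structure can help, the sketch below compares the number of entries in a dense 3D latent with the number in three 2D planes, and reproduces the 61.7% figure from the two FVD scores quoted above. The one-plane-per-axis-pair layout, the function names, and the 16 x 32 x 32 resolution are assumptions made for illustration; the abstract does not specify PVDM's actual latent shapes.

```python
def cubic_latent_size(t, h, w):
    """Number of entries in a dense 3D latent of shape (t, h, w)."""
    return t * h * w


def projected_latent_size(t, h, w):
    """Entries in three 2D planes of shape (h, w), (t, w) and (t, h).

    Assumption: one plane per axis pair, used here as a stand-in for
    the 2D-shaped latents; the real layout may differ.
    """
    return h * w + t * w + t * h


def relative_improvement(baseline, new):
    """Fractional reduction of a lower-is-better score such as FVD."""
    return (baseline - new) / baseline


if __name__ == "__main__":
    t, h, w = 16, 32, 32  # hypothetical latent resolution
    print(cubic_latent_size(t, h, w))      # 16384
    print(projected_latent_size(t, h, w))  # 2048
    # FVD scores quoted in the abstract: 1431.0 (prior) vs 548.1 (PVDM)
    print(round(100 * relative_improvement(1431.0, 548.1), 1))  # 61.7
```

Under these assumed shapes the projected representation grows with the sum of pairwise products of the dimensions rather than with their full product, which is the kind of saving the abstract attributes to the low-dimensional latent space.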

Research Areas

Machine Perception
