Evolving Losses for Unlabeled Video Representation Learning
Abstract
We present a new method to learn video representations from large-scale unlabeled video data. We formulate unsupervised representation learning as a multi-modal, multi-task learning problem, where the representations are also shared across different modalities via distillation. Our formulation allows for the distillation of audio, optical flow, and temporal information into a single, RGB-based convolutional neural network. We also compare the effects of using additional unlabeled video data and evaluate our representation learning on standard public video datasets.
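The multi-modal, multi-task objective described above can be sketched as a weighted sum of per-task self-supervised losses and cross-modal distillation losses. This is a minimal illustration, not the paper's exact formulation: the specific tasks, distillation terms, and weight values shown are assumptions.

```python
def combined_loss(task_losses, distill_losses, weights):
    """Weighted sum of self-supervised task losses and cross-modal
    distillation losses for the single RGB network (illustrative)."""
    losses = list(task_losses) + list(distill_losses)
    assert len(weights) == len(losses)
    return sum(w * l for w, l in zip(weights, losses))

# Hypothetical values: three per-modality task losses (RGB, flow, audio)
# and two distillation terms (flow->RGB, audio->RGB).
tasks = [0.9, 1.2, 0.7]        # self-supervised task losses
distill = [0.4, 0.6]           # distillation losses into the RGB net
w = [1.0, 0.5, 0.5, 2.0, 2.0]  # per-term weights (the quantity evolved below)
print(combined_loss(tasks, distill, w))
```

The per-term weights are exactly what the evolutionary search over loss functions is meant to tune.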
We also introduce the use of an evolutionary algorithm to obtain a better multi-modal, multi-task loss function for training the network. AutoML has successfully been applied to architecture search and data augmentation. Here we extend the concept of AutoML to unsupervised representation learning by automatically finding the optimal weighting of tasks.
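The evolutionary search over task weightings can be sketched as a simple mutation-and-selection loop over weight vectors. The fitness function below is a stand-in (it rewards closeness to a hypothetical optimum); the actual method scores representation quality without labels, which is beyond this illustration.

```python
import random

def fitness(weights):
    # Placeholder fitness: negative squared distance to a hypothetical
    # optimal weighting. The real measure is label-free and task-driven.
    target = [1.0, 0.5, 0.5, 2.0, 2.0]
    return -sum((w - t) ** 2 for w, t in zip(weights, target))

def mutate(weights, rng, scale=0.1):
    # Perturb one randomly chosen weight, keeping weights non-negative.
    child = list(weights)
    i = rng.randrange(len(child))
    child[i] = max(0.0, child[i] + rng.gauss(0.0, scale))
    return child

def evolve(pop_size=16, dims=5, steps=200, seed=0):
    rng = random.Random(seed)
    pop = [[rng.random() * 2 for _ in range(dims)] for _ in range(pop_size)]
    for _ in range(steps):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 4]  # keep the fittest quarter
        pop = parents + [mutate(rng.choice(parents), rng)
                         for _ in range(pop_size - len(parents))]
    return max(pop, key=fitness)

best = evolve()
```

Each generation keeps the best-scoring weight vectors and fills the population with mutated copies, so the search needs only fitness evaluations, not gradients of the weighting itself.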