Authored Publications
Learning Audio-Video Modalities from Image Captions
Paul Hongsuck Seo
Anja Hauth
Santiago Manen
European Conference on Computer Vision (2022)
Multiview Transformers for Video Recognition
Shen Yan
Xuehan Xiong
Anurag Arnab
Zhichao Lu
Mi Zhang
The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR) (2022)
AVATAR: Unconstrained Audiovisual Speech Recognition
Valentin Gabeur
Paul Hongsuck Seo
Karteek Alahari
Interspeech (2022)
TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency
Anna Rohrbach
Medhini Narasimhan
Trevor Darrell
European Conference on Computer Vision (2022)
Masking Modalities for Cross-modal Video Retrieval
Valentin Gabeur
Karteek Alahari
Winter Conference on Applications of Computer Vision (WACV) (2022) (to appear)
What makes for good views for contrastive representation learning?
Dilip Krishnan
Phillip Isola
Yonglong Tian
NeurIPS 2020 (to appear)