Publications
Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

Machine Perception
Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis
Winter Conference on Applications of Computer Vision 2024 (2024) (to appear)
MetaMix: Meta-state Precision Searcher for Mixed-precision Activation Quantization
Han-Byul Kim
Joo Hyung Lee
Sungjoo Yoo
Hong-Seok Kim
Proc. The 38th Annual AAAI Conference on Artificial Intelligence (AAAI) (2024)
SPHEAR: Spherical Head Registration for Complete Statistical 3D Modeling
Andrei Zanfir
Teodor Szente
Mihai Zanfir
International Conference on 3D Vision (2024)
TextMesh: Generation of Realistic 3D Meshes From Text Prompts
Christina Tsalicoglou
Fabian Manhardt
Michael Niemeyer
3DV 2024 (2024)
LFM-3D: Learnable Feature Matching Across Wide Baselines Using 3D Signals
Arjun Karpur
Guilherme Perrotta
Ricardo Martin-Brualla
Proc. 3DV'24 (2024) (to appear)
Using Early Readouts to Mediate Featural Bias in Distillation
Rishabh Tiwari
Durga Sivasubramanian
Anmol Mekala
Ganesh Ramakrishnan
WACV 2024 (2024)
VideoPoet: A Large Language Model for Zero-Shot Video Generation
Dan Kondratyuk
Xiuye Gu
Jonathan Huang
Grant Schindler
Rachel Hornung
Vighnesh Birodkar
Jimmy Yan
Ming-Chang Chiu
Hassan Akbari
Josh Dillon
Agrim Gupta
Meera Hahn
Anja Hauth
David Hendon
Alonso Martinez
Kihyuk Sohn
Xuan Yang
Huisheng Wang
Lu Jiang
ICML (2024)
Beyond SOT: Tracking Multiple Generic Objects at Once
Christoph Mayer
Martin Danelljan
Vittorio Ferrari
Luc Van Gool
WACV'24 (2024)
Large Scale Self-Supervised Pretraining for Active Speaker Detection
Alice Chuang
Keith Johnson
Olivier Siohan
Wei Xia
Yunfan Ye
ICASSP 2024 (2024) (to appear)
MarkovGen: Structured Prediction for Efficient Text-to-Image Generation
Sadeep Jayasumana
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2024)
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Gilles Baechler
Srinivas Sunkara
Maria Wang
Hassan Mansoor
Vincent Etter
Jason Lin
(2024)
Rethinking FID: Towards a Better Evaluation Metric for Image Generation
Sadeep Jayasumana
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2024)