Authored Publications
AfriMed-QA: A Pan-African Multi-Specialty Medical Question-Answering Benchmark Dataset
Tobi Olatunji
Abraham Toluwase Owodunni
Charles Nimo
Jennifer Orisakwe
Henok Biadglign Ademtew
Chris Fourie
Foutse Yuehgoh
Stephen Moore
Mardhiyah Sanni
Emmanuel Ayodele
Timothy Faniran
Bonaventure F. P. Dossou
Fola Omofoye
Wendy Kinara
Tassallah Abdullahi
Michael Best
2025
Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation
Nitesh Bharadwaj Gundavarapu
Luca Versari
Kihyuk Sohn
Agrim Gupta
Xiuye Gu
Alex Hauptmann
Boqing Gong
Lu Jiang
ICLR (2024)
VideoPoet: A Large Language Model for Zero-Shot Video Generation
Dan Kondratyuk
Xiuye Gu
Jonathan Huang
Grant Schindler
Rachel Hornung
Vighnesh Birodkar
Jimmy Yan
Ming-Chang Chiu
Hassan Akbari
Josh Dillon
Agrim Gupta
Meera Hahn
Anja Hauth
David Hendon
Alonso Martinez
Kihyuk Sohn
Xuan Yang
Huisheng Wang
Lu Jiang
ICML (2024)
Photorealistic Video Generation with Diffusion Models
Agrim Gupta
Kihyuk Sohn
Xiuye Gu
Fei-Fei Li
Lu Jiang
ECCV (2024)
SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs
Zhiruo Wang
Yonatan Bisk
Alex Hauptmann
Lu Jiang
NeurIPS (2023)
Visual Prompt Tuning for Generative Transfer Learning
Kihyuk Sohn
Huiwen Chang
Luisa Polania
Han Zhang
Lu Jiang
CVPR (2023)
MAGVIT: Masked Generative Video Transformer
Kihyuk Sohn
Han Zhang
Huiwen Chang
Alex Hauptmann
Lu Jiang
CVPR (2023)
Slide Gestalt: Automatic Structure Extraction in Slide Decks for Non-Visual Access
Yi-Hao Peng
CHI 2023: ACM Conference on Human Factors in Computing Systems (2023) (to appear)