
Lijun Yu
I am a research scientist at Google DeepMind. I obtained my Ph.D. and M.S. at Carnegie Mellon University School of Computer Science. I graduated summa cum laude from Peking University major in Computer Science as well as Economics. My research interests lie around multi-modal foundation models, especially for video generation.
Research Areas
Authored Publications
Sort By
Google
VideoPoet: A Large Language Model for Zero-Shot Video Generation
Dan Kondratyuk
Xiuye Gu
Jonathan Huang
Grant Schindler
Rachel Hornung
Vighnesh Birodkar
Jimmy Yan
Ming-Chang Chiu
Hassan Akbari
Josh Dillon
Agrim Gupta
Meera Hahn
Anja Hauth
David Hendon
Alonso Martinez
Kihyuk Sohn
Xuan Yang
Huisheng Wang
Lu Jiang
ICML (2024)
A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation
Bradley Kim
Alonso Martinez
Yu-Chuan Su
Agrim Gupta
Lu Jiang
Jacob Walker
Neural Information Processing Systems (NeurIPS) (2024) (to appear)
Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation
Nitesh Bharadwaj Gundavarapu
Luca Versari
Kihyuk Sohn
Agrim Gupta
Xiuye Gu
Alex Hauptmann
Boqing Gong
Lu Jiang
ICLR (2024)
Photorealistic Video Generation with Diffusion Models
Agrim Gupta
Kihyuk Sohn
Xiuye Gu
Fei-Fei Li
Lu Jiang
ECCV (2024)
MAGVIT: Masked Generative Video Transformer
Kihyuk Sohn
Han Zhang
Huiwen Chang
Alex Hauptmann
Lu Jiang
CVPR (2023)
SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs
Zhiruo Wang
Yonatan Bisk
Alex Hauptmann
Lu Jiang
NeurIPS (2023)