
Lijun Yu
I am a research scientist at Google DeepMind. I obtained my Ph.D. and M.S. from the School of Computer Science at Carnegie Mellon University. I graduated summa cum laude from Peking University, majoring in both Computer Science and Economics. My research interests center on multimodal foundation models, especially for video generation.
Authored Publications
A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation
Bradley Kim
Alonso Martinez
Yu-Chuan Su
Agrim Gupta
Lu Jiang
Jacob Walker
Neural Information Processing Systems (NeurIPS) (2024) (to appear)
Photorealistic Video Generation with Diffusion Models
Agrim Gupta
Kihyuk Sohn
Xiuye Gu
Fei-Fei Li
Lu Jiang
ECCV (2024)
Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation
Nitesh Bharadwaj Gundavarapu
Luca Versari
Kihyuk Sohn
Agrim Gupta
Xiuye Gu
Alex Hauptmann
Boqing Gong
Lu Jiang
ICLR (2024)
VideoPoet: A Large Language Model for Zero-Shot Video Generation
Dan Kondratyuk
Xiuye Gu
Jonathan Huang
Grant Schindler
Rachel Hornung
Vighnesh Birodkar
Jimmy Yan
Ming-Chang Chiu
Hassan Akbari
Josh Dillon
Agrim Gupta
Meera Hahn
Anja Hauth
David Hendon
Alonso Martinez
Kihyuk Sohn
Xuan Yang
Huisheng Wang
Lu Jiang
ICML (2024)
MAGVIT: Masked Generative Video Transformer
Kihyuk Sohn
Han Zhang
Huiwen Chang
Alex Hauptmann
Lu Jiang
CVPR (2023)
SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs
Zhiruo Wang
Yonatan Bisk
Alex Hauptmann
Lu Jiang
NeurIPS (2023)