
Mostafa Dehghani
I'm a Research Scientist at Google Brain, where I work on machine learning, in particular, deep learning. My areas of interest include self-supervised learning, generative models, training giant models, and sequence modeling.
Before Google, I did my PhD at the University of Amsterdam. My PhD research focused on improving the process of learning with imperfect supervision. I explored ideas around injecting inductive biases into algorithms, incorporating prior knowledge, and meta-learning the properties of the data using the data itself, in order to help learning algorithms learn better from noisy and/or limited data.
You can learn more about me at mostafadehghani.com.
Authored Publications
PaLI-X: On Scaling up a Multilingual Vision and Language Model
Josip Djolonga
Piotr Padlewski
Basil Mustafa
Carlos Riquelme
Sebastian Goodman
Yi Tay
Siamak Shakeri
Daniel Salz
Michael Tschannen
Hexiang (Frank) Hu
Mandar Joshi
Matthias Minderer
Filip Pavetić
Gang Li
Lucas Beyer
Anurag Arnab
Yuanzhong Xu
Keran Rong
Alexander Kolesnikov
Xiaohua Zhai
Neil Houlsby
Conference on Computer Vision and Pattern Recognition (CVPR) (2024)
How (not) to ensemble LVLMs for VQA
Lisa Alazraki
Lluis Castrejon
Fantine Huot
"I Can't Believe It's Not Better: Failure Modes in the Age of Foundation Models" at NeurIPS 2023 Workshops
Dual PatchNorm
Neil Houlsby
Transactions on Machine Learning Research (2023) (to appear)
DSI++: Updating Transformer Memory with New Documents
Yi Tay
Jinfeng Rao
Emma Strubell
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Scaling Vision Transformers to 22 Billion Parameters
Josip Djolonga
Basil Mustafa
Piotr Padlewski
Justin Gilmer
Mathilde Caron
Rodolphe Jenatton
Lucas Beyer
Michael Tschannen
Anurag Arnab
Carlos Riquelme
Matthias Minderer
Gamaleldin Elsayed
Fisher Yu
Avital Oliver
Fantine Huot
Mark Collier
Vighnesh Birodkar
Yi Tay
Alexander Kolesnikov
Filip Pavetić
Thomas Kipf
Xiaohua Zhai
Neil Houlsby
arXiv (2023)
UL2: Unifying Language Learning Paradigms
Yi Tay
Xavier Garcia
Jason Wei
Hyung Won Chung
Steven Zheng
Neil Houlsby
ICLR (2023)
Transformer Memory as a Differentiable Search Index
Yi Tay
Jianmo Ni
Harsh Mehta
Zhe Zhao
NeurIPS (2022)
Confident Adaptive Language Modeling
Adam Fisch
Yi Tay
NeurIPS (2022)