Lisa Lee
Authored Publications
Multi-Game Decision Transformers
Ofir Nachum
Sherry Yang
Daniel Freeman
Winnie Xu
Eric Victor Jang
Henryk Witold Michalewski
Igor Mordatch
Advances in Neural Information Processing Systems (NeurIPS) (2022)
A longstanding goal of the field of AI is a strategy for compiling diverse experience into a highly capable, generalist agent. In the subfields of vision and language, this was largely achieved by scaling up transformer-based models and training them on large, diverse datasets. Motivated by this progress, we investigate whether the same strategy can be used to produce generalist reinforcement learning agents. Specifically, we show that a single transformer-based model, with a single set of weights, trained purely offline can play a suite of up to 46 Atari games simultaneously at close-to-human performance. When trained and evaluated appropriately, we find that the same trends observed in language and vision hold, including scaling of performance with model size and rapid adaptation to new games via fine-tuning. We compare several approaches in this multi-game setting, such as online and offline RL methods and behavioral cloning, and find that our Multi-Game Decision Transformer models offer the best scalability and performance. We release the pre-trained models and code to encourage further research in this direction. Additional information, videos, and code are available at: http://sites.google.com/view/multi-game-transformers
Weakly-Supervised Reinforcement Learning for Controllable Behavior
Benjamin Eysenbach
Ruslan Salakhutdinov
Shixiang (Shane) Gu
Chelsea Finn
Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS 2020) (2020)
Reinforcement learning (RL) is a powerful framework for learning to take actions to solve tasks. However, in many settings, an agent must winnow down the inconceivably large space of all possible tasks to the single task that it is currently being asked to solve. Can we instead constrain the space of tasks to those that are semantically meaningful? In this work, we introduce a framework for using weak supervision to automatically disentangle this semantically meaningful subspace of tasks from the enormous space of nonsensical "chaff" tasks. We show that this learned subspace enables efficient exploration and provides a representation that captures distance between states. On a variety of challenging, vision-based continuous control problems, our approach leads to substantial performance gains, particularly as the complexity of the environment grows.