Piotr Mirowski

I am a Staff Research Scientist working at Google DeepMind. As a member of Dr. Raia Hadsell's and Dr. Shakir Mohamed's teams, I have been focusing on navigation-related research, on scaling up autonomous agents to real-world environments, on weather and climate forecasting and data compression, and on socio-technical studies of computational creativity and participatory AI with artists. Some of my work has been published in Nature, at ICLR and NeurIPS, and covered by The Guardian, the BBC, the Financial Times and many other press outlets.
I studied computer science in France (ENSEEIHT, Toulouse) and obtained my PhD in computer science in 2011 at New York University, with a thesis on "Time Series Modeling with Hidden Variables and Gradient-based Algorithms" supervised by Prof. Yann LeCun (Outstanding Dissertation Award, 2011).
During my theatre and improv performances (with and without robots on stage), I investigate the use of AI for artistic co-creation between humans and machines.
Authored Publications
What are the dimensions of human intent, and how do writing tools shape and augment their expression? From papyrus to auto-complete, a major turning point was when Alan Turing famously asked, “Can Machines Think?” If so, should we offload aspects of our thinking to machines, and what impact do they have in enabling the intentions we have? This paper adapts the Authorial Leverage framework, from the Intelligent Narrative Technologies literature, for evaluating recent generative model advancements. With increased widespread access to Large Language Models (LLMs), the evolution of our evaluative frameworks follows suit. To do this, we discuss previous expert studies of deep generative models for fiction writers and playwrights, and propose two future directions, (1) author-focused and (2) audience-focused, for furthering our understanding of the Authorial Leverage of LLMs, particularly in the domain of comedy writing.
The Touchdown dataset (Chen et al., 2019) provides instructions by human annotators for navigation through New York City streets and for resolving spatial descriptions at a given location. To enable the wider research community to work effectively with the Touchdown tasks, we are publicly releasing the 29k raw Street View panoramas needed for Touchdown. We follow the process used for the StreetLearn data release (Mirowski et al., 2019) to check panoramas for personally identifiable information and blur them as necessary. These have been added to the StreetLearn dataset and can be obtained via the same process as used previously for StreetLearn. We also provide a reference implementation for both of the Touchdown tasks: vision and language navigation (VLN) and spatial description resolution (SDR). We compare our model results to those given in Chen et al. (2019) and show that the panoramas we have added to StreetLearn fully support both Touchdown tasks and can be used effectively for further research and comparison.
    Vector-based Navigation using Grid-like Representations in Artificial Agents.
    Alexander Pritzel
    Andrea Banino
    Benigno Uria
    Brian C Zhang
    Caswell Barry
    Charles Blundell
    Charlie Beattie
    Demis Hassabis
    Dharshan Kumaran
    Fabio Viola
    Greg Wayne
    Helen King
    Hubert Soyer
    Joseph Modayil
    Koray Kavukcuoglu
    Martin J. Chadwick
    Neil Rabinowitz
    Raia Hadsell
    Razvan Pascanu
    Stephen Gaffney
    Stig Vilholm Petersen
    Thomas Degris
    Timothy Lillicrap
    Nature (2018)
Efficient navigation is a fundamental component of mammalian behaviour but remains challenging for artificial agents. Mammalian spatial behaviour is underpinned by grid cells in the entorhinal cortex, providing a multi-scale periodic representation that functions as a metric for coding space. Grid cells are viewed as critical for integrating self-motion (path integration) and planning direct trajectories to goals (vector-based navigation). We report, for the first time, that brain-like grid representations can emerge as the product of optimizing a recurrent network to perform the task of path integration - providing a normative perspective on the role of grid cells as a compact code for representing space. We show that grid cells provide an effective basis set to optimize the primary objective of navigation through deep reinforcement learning (RL) - the rapid discovery and exploitation of goals in complex, unfamiliar, and changeable environments. The performance of agents endowed with grid-like representations was found to surpass that of an expert human and comparison agents. Further, we demonstrate that grid-like representations enable agents to conduct shortcut behaviours reminiscent of those performed by mammals - with decoding analyses confirming that the metric quantities necessary for vector-based navigation (e.g. Euclidean distance and direction to goal) are represented within the network. Our findings show that emergent grid-like responses furnish agents with a Euclidean spatial metric and associated vector operations, providing a foundation for proficient navigation. As such, our results support neuroscientific theories that see grid cells as critical for path integration and vector-based navigation, demonstrating that the latter can be combined with path-based strategies to support navigation in complex environments.
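As a rough illustration of the training setup this abstract describes, the sketch below supervises a small recurrent network to path-integrate simulated velocity inputs. It is a minimal toy version, not the paper's architecture: the layer sizes, the dropout-regularised linear bottleneck, and the direct position targets (rather than place- and head-direction-cell activations) are simplifying assumptions.

```python
import torch
import torch.nn as nn

class PathIntegrator(nn.Module):
    """Toy recurrent path integrator: maps a sequence of ego-motion
    (velocity) inputs to an estimate of 2-D position. The paper decodes
    place- and head-direction-cell activations instead; predicting raw
    position here is a simplifying assumption."""

    def __init__(self, hidden_size=128, bottleneck=64):
        super().__init__()
        # Input per step: (speed, sin of heading, cos of heading).
        self.rnn = nn.LSTM(input_size=3, hidden_size=hidden_size, batch_first=True)
        # Regularised linear bottleneck: grid-like units are reported to
        # emerge in a layer of this kind when trained on path integration.
        self.bottleneck = nn.Linear(hidden_size, bottleneck)
        self.dropout = nn.Dropout(0.5)
        self.readout = nn.Linear(bottleneck, 2)

    def forward(self, velocities):
        h, _ = self.rnn(velocities)           # (batch, time, hidden)
        g = self.dropout(self.bottleneck(h))  # bottleneck code per time step
        return self.readout(g), g             # position estimate, code

model = PathIntegrator()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(velocities, positions):
    """One supervised step: `velocities` is a (batch, time, 3) tensor of
    ego-motion inputs and `positions` the ground-truth (batch, time, 2)
    trajectory, assumed to come from a simulated foraging agent."""
    pred, _ = model(velocities)
    loss = torch.mean((pred - positions) ** 2)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

After this kind of supervised pre-training, the paper reuses the emergent representation inside a deep RL navigation agent; that stage is omitted here.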
    Learning to Navigate in Cities Without a Map
    Matthew Grimes
    Mateusz Malinowski
    Karl Moritz Hermann
    Keith Anderson
    Denis Teplyashin
    Karen Simonyan
    Koray Kavukcuoglu
    Andrew Zisserman
    Raia Hadsell
Navigating through unstructured environments is a basic capability of intelligent creatures, and thus is of fundamental interest in the study and development of artificial intelligence. Long-range navigation is a complex cognitive task that relies on developing an internal representation of space, grounded by recognisable landmarks and robust visual processing, that can simultaneously support continuous self-localisation (“I am here”) and a representation of the goal (“I am going there”). Building upon recent research that applies deep reinforcement learning to maze navigation problems, we present an end-to-end deep reinforcement learning approach that can be applied on a city scale. Recognising that successful navigation relies on integration of general policies with locale-specific knowledge, we propose a dual pathway architecture that allows locale-specific features to be encapsulated, while still enabling transfer to multiple cities. We present an interactive navigation environment that uses Google Street View for its photographic content and worldwide coverage, and demonstrate that our learning method allows agents to learn to navigate multiple cities and to traverse to target destinations that may be kilometres away.
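The dual pathway idea mentioned in the abstract can be sketched roughly as follows: a city-specific recurrent pathway encapsulates locale-specific features, while a shared pathway and policy head are reused across cities. This is a hedged sketch under assumed layer sizes and wiring, not the paper's exact architecture, and `obs_feat` stands in for features produced by some upstream visual encoder.

```python
import torch
import torch.nn as nn

class DualPathwayAgent(nn.Module):
    """Sketch of a dual-pathway navigation agent: one locale-specific
    LSTM per city plus a shared policy LSTM, so a new city can in
    principle be added by training only a new locale module. Sizes and
    wiring are illustrative assumptions, not the published model."""

    def __init__(self, num_cities, obs_dim=512, goal_dim=256, hidden=256, num_actions=5):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        # Locale-specific pathway: one recurrent cell per city.
        self.locale = nn.ModuleList(
            [nn.LSTMCell(hidden + goal_dim, hidden) for _ in range(num_cities)])
        # Shared pathway: policy and value heads reused across all cities.
        self.policy = nn.LSTMCell(2 * hidden, hidden)
        self.pi = nn.Linear(hidden, num_actions)
        self.v = nn.Linear(hidden, 1)

    def forward(self, obs_feat, goal, city, locale_state=None, policy_state=None):
        x = self.trunk(obs_feat)                                    # (batch, hidden)
        lh, lc = self.locale[city](torch.cat([x, goal], -1), locale_state)
        ph, pc = self.policy(torch.cat([x, lh], -1), policy_state)
        return self.pi(ph), self.v(ph), (lh, lc), (ph, pc)
```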
Model-free reinforcement learning has recently been shown to be effective at learning navigation policies from complex image input. However, these algorithms tend to require large amounts of interaction with the environment, which can be prohibitively costly to obtain on robots in the real world. We present an approach for efficiently learning goal-directed navigation policies on a mobile robot, from only a single coverage traversal of recorded data. The navigation agent learns an effective policy over a diverse action space in a large heterogeneous environment consisting of more than 2km of travel, through buildings and outdoor regions that collectively exhibit large variations in visual appearance, self-similarity, and connectivity. We compare pretrained visual encoders that enable precomputation of visual embeddings to achieve a throughput of tens of thousands of transitions per second at training time on a commodity desktop computer, allowing agents to learn from millions of trajectories of experience in a matter of hours. We propose multiple forms of computationally efficient stochastic augmentation to enable the learned policy to generalise beyond these precomputed embeddings, and demonstrate successful deployment of the learned policy on the real robot without fine-tuning, despite considerable visual differences at test time. The dataset and code required to reproduce these results and apply the technique to other datasets and robots are made publicly available at https://github.com/jakebruce/deployable-rl-navigation.
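The efficiency argument in this abstract rests on encoding every frame of the traversal once with a frozen pretrained encoder and then training only on the cached embeddings, with cheap stochastic augmentation applied at training time. The helpers below sketch that idea; the specific augmentations (additive noise and feature dropout) and the `encoder` callable are assumptions made for illustration, not the released implementation (see the linked repository for that).

```python
import numpy as np

def precompute_embeddings(frames, encoder, batch_size=64):
    """Encode all frames once with a frozen, pretrained visual encoder so
    that policy training only touches small embedding vectors. `encoder`
    is assumed to map a batch of images to a (batch, dim) array."""
    chunks = [encoder(frames[i:i + batch_size])
              for i in range(0, len(frames), batch_size)]
    return np.concatenate(chunks, axis=0)

def augment(embedding, rng, noise_std=0.1, drop_prob=0.1):
    """Cheap stochastic augmentation applied to a cached embedding at
    training time, so the policy does not overfit to the exact
    precomputed vectors; these particular augmentations are illustrative."""
    noisy = embedding + rng.normal(0.0, noise_std, size=embedding.shape)
    mask = (rng.random(embedding.shape) > drop_prob).astype(embedding.dtype)
    return noisy * mask
```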