Lasse Espeholt
Authored Publications
    Abstract: This paper presents the first successful steps in designing search agents that learn meta-strategies for iterative query refinement in information-seeking tasks. Our approach uses machine reading to guide the selection of refinement terms from aggregated search results. Agents are then empowered with simple but effective search operators to exert fine-grained and transparent control over queries and search results. We develop a novel way of generating synthetic search sessions that leverages the power of transformer-based language models through (self-)supervised learning. We also present a reinforcement learning agent with dynamically constrained actions that learns interactive search strategies from scratch. Our search agents obtain retrieval and answer quality comparable to recent neural methods, using only a traditional term-based BM25 ranking function and interpretable, discrete reranking and filtering actions.
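    The agents above rerank results produced by a traditional term-based BM25 ranker. As a point of reference, a minimal Okapi BM25 scorer can be sketched in plain Python (an illustration of the standard formula, not the paper's implementation; the function and parameter names are ours):

    ```python
    import math
    from collections import Counter

    def bm25_scores(query, docs, k1=1.5, b=0.75):
        """Score each document against the query with Okapi BM25."""
        tokenized = [d.lower().split() for d in docs]
        n = len(docs)
        avgdl = sum(len(d) for d in tokenized) / n
        # Document frequency of each term across the collection.
        df = Counter()
        for d in tokenized:
            df.update(set(d))
        scores = []
        for d in tokenized:
            tf = Counter(d)
            s = 0.0
            for q in query.lower().split():
                if q not in tf:
                    continue
                idf = math.log((n - df[q] + 0.5) / (df[q] + 0.5) + 1)
                # Term-frequency saturation (k1) and length normalisation (b).
                s += idf * tf[q] * (k1 + 1) / (
                    tf[q] + k1 * (1 - b + b * len(d) / avgdl))
            scores.append(s)
        return scores
    ```

    The paper's agents operate on top of such a ranker, refining the query and filtering the ranked list rather than replacing the scoring function.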
    SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference
    Piotr Michal Stanczyk
    Marcin Michalski
    International Conference on Learning Representations (2020)
    Abstract: We present a modern scalable reinforcement learning agent called SEED (Scalable, Efficient Deep-RL). By effectively utilizing modern accelerators, we show that it is not only possible to train on millions of frames per second but also to lower the cost of experiments compared to current methods. We achieve this with a simple architecture that features centralized inference and an optimized communication layer. SEED adopts two state-of-the-art distributed algorithms, IMPALA/V-trace (policy gradients) and R2D2 (Q-learning), and is evaluated on Atari-57, DeepMind Lab and Google Research Football. We improve the state of the art on Football and reach state-of-the-art performance on Atari-57 three times faster in wall-clock time. For the scenarios we consider, a 40% to 80% cost reduction for running experiments is achieved. The implementation, along with experiments, is open-sourced so results can be reproduced and novel ideas tried out.
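    The core idea of SEED, moving inference from the actors to a central server that batches observations before a single accelerator call, can be caricatured in a few lines. This is a single-process sketch with hypothetical names; the real system uses gRPC streams and TPU/GPU inference:

    ```python
    import numpy as np

    def central_inference_worker(obs_queue, action_queues, policy, batch_size=4):
        """Collect (actor_id, observation) pairs from many actors, run one
        batched forward pass, and send each actor its action back. This is
        the SEED idea of centralized inference, reduced to queues."""
        while True:
            batch = []
            for _ in range(batch_size):
                item = obs_queue.get()
                if item is None:  # shutdown sentinel
                    return
                batch.append(item)
            actor_ids, observations = zip(*batch)
            # One batched inference call instead of one per actor.
            actions = policy(np.stack(observations))
            for actor_id, action in zip(actor_ids, actions):
                action_queues[actor_id].put(action)
    ```

    Batching inference centrally is what lets the accelerator stay busy while the (cheap) environment steps run elsewhere.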
    Abstract: Weather forecasting is a long-standing scientific challenge with direct social and economic impact. The task is suitable for deep neural networks due to vast amounts of continuously collected data and a rich spatial and temporal structure that presents long-range dependencies. We introduce MetNet, a neural network that forecasts precipitation up to 8 hours into the future at a high spatial resolution of 1 km and a temporal resolution of 2 minutes, with a latency on the order of seconds. MetNet takes radar and satellite data and the forecast lead time as input and produces a probabilistic precipitation map. The architecture uses axial self-attention to aggregate global context from a large input patch corresponding to a million square kilometers. We evaluate the performance of MetNet at various precipitation thresholds and find that MetNet outperforms Numerical Weather Prediction at forecasts of up to 7 to 8 hours on the scale of the continental United States.
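    Axial self-attention, which MetNet uses to aggregate global context cheaply, attends along one spatial axis at a time. A minimal NumPy sketch (single head, no learned projections; names are ours) illustrates why this reduces the cost from O((HW)^2) for full 2-D attention to O(HW(H+W)):

    ```python
    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def axial_attention(x):
        """Self-attention along rows, then columns, of an (H, W, C)
        feature map. Each position attends only to its own row and its
        own column, yet information propagates across the whole map."""
        h, w, c = x.shape
        out = x
        for axis in (0, 1):  # rows, then columns
            seqs = np.moveaxis(out, axis, 0)                   # (L, ..., C)
            length = seqs.shape[0]
            flat = seqs.reshape(length, -1, c).transpose(1, 0, 2)  # (B, L, C)
            # Dot-product attention within each 1-D sequence.
            attn = softmax(flat @ flat.transpose(0, 2, 1) / np.sqrt(c))
            mixed = attn @ flat
            out = np.moveaxis(
                mixed.transpose(1, 0, 2).reshape(seqs.shape), 0, axis)
        return out
    ```

    In MetNet the axial layers additionally carry learned query/key/value projections and multiple heads; the sketch only shows the axis decomposition.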
    Multi-task Deep Reinforcement Learning with PopArt
    Matteo Hessel
    Hubert Soyer
    Wojciech Czarnecki
    Simon Schmitt
    Hado van Hasselt
    DeepMind (2019)
    Abstract: The reinforcement learning community has made great strides in designing algorithms capable of exceeding human performance on specific tasks. These algorithms are mostly trained one task at a time, with each new task requiring a brand-new agent instance to be trained. In this work, we investigate algorithms capable of learning to master not one but multiple sequential-decision tasks at once. We use PopArt normalisation to derive scale-invariant policy-gradient updates, and we propose an actor-critic architecture designed for multi-task learning. In combination with the IMPALA reinforcement learning architecture, this results in state-of-the-art performance on learning to play all games in a set of 57 diverse Atari games. Excitingly, our method learns a single trained policy - with a single set of weights - that exceeds median human performance across all games. To our knowledge, this is the first time a single agent has surpassed human-level performance on this multi-task domain. The same approach demonstrates state-of-the-art results on a set of 30 tasks defined in the 3D reinforcement learning platform DeepMind Lab.
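    The heart of PopArt is that when the normalisation statistics for a task's returns are updated, the last layer of the value head is rescaled so that the unnormalised predictions are unchanged ("Preserving Outputs Precisely, while Adaptively Rescaling Targets"). A scalar sketch of that invariant, simplified from the paper (class and attribute names are ours):

    ```python
    import numpy as np

    class PopArt:
        """Scalar value normalisation that preserves outputs when the
        statistics change: a sketch of the PopArt update."""
        def __init__(self, beta=3e-4):
            self.mu, self.nu, self.beta = 0.0, 1.0, beta  # mean, 2nd moment
            self.w, self.b = 1.0, 0.0                     # last linear layer

        def sigma(self):
            return np.sqrt(max(self.nu - self.mu ** 2, 1e-6))

        def update(self, target):
            old_mu, old_sigma = self.mu, self.sigma()
            self.mu = (1 - self.beta) * self.mu + self.beta * target
            self.nu = (1 - self.beta) * self.nu + self.beta * target ** 2
            new_sigma = self.sigma()
            # Rescale the head so sigma * (w*h + b) + mu is unchanged.
            self.w *= old_sigma / new_sigma
            self.b = (old_sigma * self.b + old_mu - self.mu) / new_sigma

        def unnormalised(self, normalised_out):
            return self.sigma() * normalised_out + self.mu
    ```

    In the multi-task setting the paper keeps one (mu, sigma) pair per task, so each task contributes gradients of a comparable scale.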
    Google Research Football: A Novel Reinforcement Learning Environment
    Karol Kurach
    Piotr Michal Stanczyk
    Michał Zając
    Carlos Riquelme
    Damien Vincent
    Marcin Michalski
    Sylvain Gelly
    AAAI (2019)
    Abstract: Recent progress in the field of reinforcement learning has been accelerated by virtual learning environments such as video games, where novel algorithms and ideas can be quickly tested in a safe and reproducible manner. We introduce the Google Research Football Environment, a new reinforcement learning environment where agents are trained to play football in an advanced, physics-based 3D simulator. The resulting environment is challenging, easy to use and customize, and available under a permissive open-source license. We further propose three full-game scenarios of varying difficulty with the Football Benchmarks, report baseline results for three commonly used reinforcement learning algorithms (IMPALA, PPO, and Ape-X DQN), and also provide a diverse set of simpler scenarios with the Football Academy.
    IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
    Hubert Soyer
    Remi Munos
    Karen Simonyan
    Volodymyr Mnih
    Tom Ward
    Yotam Doron
    Vlad Firoiu
    Tim Harley
    Iain Robert Dunning
    Shane Legg
    Koray Kavukcuoglu
    arXiv (2018)
    Abstract: In this work we aim to solve a large collection of tasks using a single reinforcement learning agent with a single set of parameters. A key challenge is handling the increased amount of data and extended training time, which is already a problem in single-task learning. To tackle this challenging problem, we have developed a new distributed agent architecture, IMPALA (Importance-Weighted Actor-Learner), that can scale to thousands of machines and achieve a throughput of 250,000 frames per second. We achieve stable learning at high throughput by combining decoupled acting and learning with a novel off-policy correction method called V-trace, which was critical for achieving learning stability. We demonstrate the effectiveness of IMPALA for multi-task reinforcement learning on DMLab-30 (a set of 30 tasks from the DeepMind Lab environment) and ATARI-57 (all available ATARI games in the Arcade Learning Environment). Our results show that IMPALA achieves better performance than previous agents, uses less data, and, crucially, exhibits positive transfer between tasks as a result of its multi-task approach.
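    The V-trace correction mentioned above can be written as a short backward recursion over a trajectory, v_s = V(x_s) + delta_s + gamma * c_s * (v_{s+1} - V(x_{s+1})), with clipped importance weights rho_t = min(rho_bar, pi/mu) and c_t = min(c_bar, pi/mu). A NumPy sketch for a single trajectory (simplified; the open-sourced IMPALA code is the authoritative version):

    ```python
    import numpy as np

    def vtrace_targets(rewards, values, bootstrap, rhos, gamma=0.99,
                       rho_bar=1.0, c_bar=1.0):
        """V-trace value targets for one trajectory.

        rewards, values, rhos: arrays of length T, where rhos holds the
        importance ratios pi(a|x)/mu(a|x); bootstrap is V at the state
        after the last step."""
        T = len(rewards)
        clipped_rho = np.minimum(rhos, rho_bar)  # controls the fixed point
        cs = np.minimum(rhos, c_bar)             # controls contraction speed
        values_tp1 = np.append(values[1:], bootstrap)
        deltas = clipped_rho * (rewards + gamma * values_tp1 - values)
        vs = np.zeros(T)
        acc = 0.0  # v_{t} - V(x_t), accumulated backwards
        for t in reversed(range(T)):
            acc = deltas[t] + gamma * cs[t] * acc
            vs[t] = values[t] + acc
        return vs
    ```

    When the behaviour and target policies coincide (all ratios equal 1), the targets reduce to the usual on-policy n-step return, which is the sanity check below.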
    Conditional Image Generation with PixelCNN Decoders
    Aäron van den Oord
    Koray Kavukcuoglu
    Alexander Graves
    Advances in Neural Information Processing Systems 29, Curran Associates, Inc. (2016), pp. 4790-4798
    Abstract: This work explores conditional image generation with a new image density model based on the PixelCNN architecture. The model can be conditioned on any vector, including descriptive labels or tags, or latent embeddings created by other networks. When conditioned on class labels from the ImageNet database, the model is able to generate diverse, realistic scenes representing distinct animals, objects, landscapes and structures. When conditioned on an embedding produced by a convolutional network given a single image of an unseen face, it generates a variety of new portraits of the same person with different facial expressions, poses and lighting conditions. We also show that conditional PixelCNN can serve as a powerful decoder in an image autoencoder. Additionally, the gated convolutional layers in the proposed model improve the log-likelihood of PixelCNN to match the state-of-the-art performance of PixelRNN on ImageNet, with greatly reduced computational cost.
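    The gated convolutional layer referred to above replaces the usual ReLU with the gated activation y = tanh(W_f * x) ⊙ sigmoid(W_g * x). Ignoring the masked convolutions themselves, the gate can be sketched as follows (a simplification; the name is ours, and the input is assumed to hold the two pre-activation halves stacked along the last axis):

    ```python
    import numpy as np

    def gated_activation(features):
        """Gated activation unit: split the pre-activations into a
        filter half and a gate half, then combine them multiplicatively.
        The sigmoid gate lets the layer modulate how much of each
        feature passes through, similar to an LSTM gate."""
        f, g = np.split(features, 2, axis=-1)
        return np.tanh(f) * (1.0 / (1.0 + np.exp(-g)))
    ```

    In the conditional model, a projection of the conditioning vector is added to both halves before the nonlinearities, which is how labels or embeddings steer generation.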
    Neural Machine Translation in Linear Time
    Karen Simonyan
    Aäron van den Oord
    Alexander Graves
    Koray Kavukcuoglu
    arXiv (2016)
    Abstract: We present a neural architecture for sequences, the ByteNet, that has two core features: it runs in time that is linear in the length of the sequences and it preserves the sequences' temporal resolution. The ByteNet is a stack of two dilated convolutional neural networks, one to encode the source and one to decode the target, where the target decoder unfolds dynamically to generate variable-length outputs. We show that the ByteNet decoder attains state-of-the-art performance on character-level language modelling and outperforms recurrent neural networks. We also show that the ByteNet achieves a performance on raw character-level machine translation that approaches that of the best neural translation models that run in quadratic time. A visualization technique reveals the latent alignment structure learnt by the ByteNet.
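    ByteNet's linear running time rests on stacks of dilated convolutions: with dilations 1, 2, 4, ... the receptive field grows exponentially with depth while each layer's cost stays linear in sequence length. A minimal causal dilated 1-D convolution in plain NumPy (names ours) shows the mechanism:

    ```python
    import numpy as np

    def causal_dilated_conv1d(x, kernel, dilation):
        """Causal 1-D convolution with dilation: output position t sees
        inputs t, t-d, t-2d, ... (zero-padded on the left), so no
        position ever looks at the future, as required by the decoder."""
        k = len(kernel)
        pad = dilation * (k - 1)
        xp = np.concatenate([np.zeros(pad), np.asarray(x, dtype=float)])
        return np.array([
            sum(kernel[j] * xp[pad + t - j * dilation] for j in range(k))
            for t in range(len(x))])
    ```

    Stacking such layers with doubling dilations gives a receptive field of roughly 2^depth positions at O(depth * length) total cost, versus the O(length^2) attention of quadratic-time translation models.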
    Teaching Machines to Read and Comprehend
    Karl Moritz Hermann
    Tomas Kocisky
    Edward Grefenstette
    Will Kay
    Mustafa Suleyman
    Phil Blunsom
    NIPS (2015)
    Abstract: Teaching machines to read natural language documents remains an elusive challenge. Such models can be tested on their ability to answer questions posed about the contents of the documents that they have seen, but until now large-scale supervised training and test datasets have been missing for such tasks. In this work we introduce a new machine reading paradigm based on large-scale supervised training datasets extracted from readily available online sources. We define models for this task based on both a traditional natural language processing pipeline and on attention-based recurrent neural networks. Our results demonstrate that neural network models are able to learn to read documents and answer complex questions with minimal prior knowledge of language structure.
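    The attention-based readers described above score each document token against a query representation and pool the document accordingly. A bare-bones dot-product attention sketch in NumPy (no learned parameters; a simplification of the paper's attentive readers, with names ours):

    ```python
    import numpy as np

    def attend(doc_states, query_vec):
        """Soft attention over document token states given a query
        vector: weights are a softmax over dot-product scores, and the
        answer representation is the weighted sum of the states."""
        scores = doc_states @ query_vec
        e = np.exp(scores - scores.max())      # numerically stable softmax
        weights = e / e.sum()
        return weights @ doc_states, weights
    ```

    In the paper the document and query states come from recurrent encoders and the scores pass through a learned bilinear or MLP scorer, but the pooling step is the same.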