Ross Goroshin
I am currently a Research Scientist in the Google Brain Montreal team. My main area of expertise is Deep Learning. I'm mainly interested in building flexible and robust computer vision systems by applying ideas from self-supervised and meta-learning.
Prior to joining the Brain team I was at DeepMind (London UK) where I worked on navigation problems using reinforcement learning. I completed by PhD under the supervision of Yann LeCun at NYU.
Research Areas
Authored Publications
Sort By
Proto-Value Networks: Scaling Representation Learning with Auxiliary Tasks
Jesse Farebrother
Joshua Greaves
Charline Le Lan
Marc Bellemare
International Conference on Learning Representations (ICLR) (2023)
Preview abstract
Auxiliary tasks improve the representations learned by deep reinforcement learning agents. Analytically, their effect is reasonably well-understood; in practice, how-ever, their primary use remains in support of a main learning objective, rather than as a method for learning representations. This is perhaps surprising given that many auxiliary tasks are defined procedurally, and hence can be treated as an essentially infinite source of information about the environment. Based on this observation, we study the effectiveness of auxiliary tasks for learning rich representations, focusing on the setting where the number of tasks and the size of the agent’s network are simultaneously increased. For this purpose, we derive a new family of auxiliary tasks based on the successor measure. These tasks are easy to implement and have appealing theoretical properties. Combined with a suitable off-policy learning rule, the result is a representation learning algorithm that can be understood as extending Mahadevan & Maggioni (2007)’s proto-value functions to deep reinforcement learning – accordingly, we call the resulting object proto-value networks. Through a series of experiments on the Arcade Learning Environment, we demonstrate that proto-value networks produce rich features that may be used to obtain performance comparable to established algorithms, using only linear approximation and a small number (~4M) of interactions with the environment’s reward function.
View details
Impact of Aliasing on Generalization in Deep Convolutional Networks
Nicolas Le Roux
Rob Romijnders
International Conference on Computer Vision ICCV 2021, IEEE/CVF (2021)
Preview abstract
Traditionally image pre-processing in the frequency domain has played a vital role in computer vision and was even part of the standard pipeline in the early days of Deep Learning. However, with the advent of large datasets many practitioners concluded that this was unnecessary due to the belief that these priors can be learned from the data itself \emph{if they aid in achieving stronger performance}. Frequency aliasing is a phenomena that may occur when down-sampling (sub-sampling) any signal, such as an image or feature map. We demonstrate that substantial improvements on OOD generalization can be obtained by mitigating the effects of aliasing by placing non-trainable blur filters and using smooth activation functions at key locations in the ResNet family of architectures -- helping to achieve new state-of-the-art results on two benchmarks without any hyper-parameter sweeps.
View details
A Unified Few-Shot Classification Benchmark to Compare Transfer and Meta Learning Approaches
Neil Houlsby
Sylvain Gelly
NeurIPS Datasets and Benchmarks Track (2021)
Preview abstract
Meta and transfer learning are two successful families of approaches to few-shot learning. Despite highly related goals, state-of-the-art advances in each family are measured largely in isolation of each other. As a result of diverging evaluation norms, a direct or thorough comparison of different approaches is challenging. To bridge this gap, we introduce a few-shot classification evaluation protocol named VTAB+MD with the explicit goal of facilitating sharing of insights from each community. We demonstrate its accessibility in practice by performing a cross-family study of the best transfer and meta learners which report on both a large-scale meta-learning benchmark (Meta-Dataset, MD), and a transfer learning benchmark (Visual Task Adaptation Benchmark, VTAB). We find that, on average, large-scale transfer methods (Big Transfer, BiT) outperform competing approaches on MD, even when trained only on ImageNet. In contrast, meta-learning approaches struggle to compete on VTAB when trained and validated on MD. However, BiT is not without limitations, and pushing for scale does not improve performance on highly out-of-distribution MD tasks. We hope that this work contributes to accelerating progress on few-shot learning research.
View details
Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples
Eleni Triantafillou
Tyler Zhu
Kelvin Xu
Carles Gelada
International Conference on Learning Representations (submission) (2020)
Preview abstract
Few-shot classification refers to learning a classifier for new classes given only a few examples. While a plethora of models have emerged to tackle this recently, we find the current procedure and datasets that are used to systematically assess progress in this setting lacking. To address this, we propose META-DATASET: a new benchmark for training and evaluating few-shot classifiers that is large-scale, consists of multiple datasets, and presents more natural and realistic tasks. The aim is to measure the ability of state-of the-art models to leverage diverse sources of data to achieve higher generalization, and to evaluate that generalization ability in a more challenging setting. We additionally measure robustness of current methods to variations in the number of available examples and the number of classes. Finally our extensive empirical evaluation leads us to identify weaknesses in Prototypical Networks and MAML, two popular few-shot classification methods, and to propose a new method, ProtoMAML, which achieves improved performance on our benchmark.
View details
Preview abstract
Fully convolutional deep correlation networks are currently the state of the art approaches to single object visual tracking. It is commonly assumed that these networks perform tracking by detection by matching features of the object instance with features of the scene. Strong architectural priors and conditioning on the object representation is thought to encourage this tracking strategy. Despite these efforts, we show that deep trackers often default to “tracking by saliency” detection – without relying on the object representation. This leads us to introduce an auxiliary detection task that encourages more discriminative object representations and improves tracking performance.
View details
Vector-based Navigation using Grid-like Representations in Artificial Agents.
Alexander Pritzel
Andrea Banino
Benigno Uria
Brian C Zhang
Caswell Barry
Charles Blundell
Charlie Beattie
Demis Hassabis
Dharshan Kumaran
Greg Wayne
Helen King
Hubert Soyer
Joseph Modayil
Koray Kavukcuoglu
Martin J. Chadwick
Neil Rabinowitz
Raia Hadsell
Razvan Pascanu
Stephen Gaffney
Stig Vilholm Petersen
Thomas Degris
Timothy Lillicrap
Nature (2018)