A System for Massively Parallel Hyperparameter Tuning

Liam Li; Kevin Jamieson; Afshin Rostamizadeh; Ekaterina Gonina; Jonathan Ben-tzur; Moritz Hardt; Benjamin Recht; Ameet Talwalkar

Third Conference on Systems and Machine Learning (2020) (to appear)

Abstract

Modern learning models are characterized by large hyperparameter spaces and long training times; this, coupled
with the rise of parallel computing and the productionization of machine learning, motivates developing
production-quality hyperparameter optimization functionality for a distributed computing setting. We address this
challenge with a simple and robust hyperparameter optimization algorithm, ASHA, which exploits parallelism and
aggressive early-stopping to tackle large-scale hyperparameter optimization problems. Our extensive empirical
results show that ASHA outperforms state-of-the-art hyperparameter optimization methods; scales linearly with the
number of workers in distributed settings; and is suitable for massive parallelism, converging to a high-quality
configuration in half the time taken by Vizier (Google’s internal hyperparameter optimization service) in an
experiment with 500 workers. We end with a discussion of the systems considerations we encountered and our
associated solutions when implementing ASHA in SystemX, a production-quality service for hyperparameter tuning.
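
The abstract names ASHA but does not spell out how it combines parallelism with early-stopping. The sketch below is one plausible reading, assuming ASHA follows the usual asynchronous successive halving promotion rule: a free worker promotes a configuration from the top 1/eta of any rung if one is available, and otherwise starts a new configuration at the lowest rung. The class name `ASHA`, the methods `get_job`, `report`, and `resource`, and the parameters `eta`, `min_resource`, and `max_rung` are all illustrative assumptions, not the authors' implementation or the SystemX API.

```python
class ASHA:
    """Minimal sketch of an asynchronous successive halving scheduler.

    All names here are hypothetical; losses are assumed to be minimized.
    """

    def __init__(self, sample_config, eta=3, min_resource=1, max_rung=3):
        self.sample_config = sample_config   # callable returning a new config
        self.eta = eta                       # reduction factor between rungs
        self.min_resource = min_resource     # resource given at rung 0
        self.max_rung = max_rung             # index of the top rung
        # rungs[k] holds (loss, config_id) pairs reported at rung k
        self.rungs = [[] for _ in range(max_rung + 1)]
        # promoted[k] holds config ids already promoted out of rung k
        self.promoted = [set() for _ in range(max_rung + 1)]
        self.configs = {}

    def resource(self, rung):
        """Training resource (e.g. epochs) allotted to a config at `rung`."""
        return self.min_resource * self.eta ** rung

    def get_job(self):
        """Return (config_id, rung) for a free worker, without waiting."""
        # Check the highest rungs first so strong configs advance quickly.
        for k in reversed(range(self.max_rung)):
            results = sorted(self.rungs[k])
            top = results[: len(results) // self.eta]
            for _, cid in top:
                if cid not in self.promoted[k]:
                    self.promoted[k].add(cid)
                    return cid, k + 1
        # Nothing is promotable: grow the bottom rung with a new config.
        cid = len(self.configs)
        self.configs[cid] = self.sample_config()
        return cid, 0

    def report(self, cid, rung, loss):
        """Record the loss a worker observed for `cid` at `rung`."""
        self.rungs[rung].append((loss, cid))
```

In this sketch a worker never blocks waiting for a rung to fill: it calls `get_job()`, trains the returned configuration for `resource(rung)` units, and calls `report()`. That non-blocking loop is what would let the scheme use many workers at once, under the assumptions stated above.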

Research Areas

Machine intelligence
