Google Research

A System for Massively Parallel Hyperparameter Tuning

Third Conference on Systems and Machine Learning (2020) (to appear)

Abstract

Modern learning models are characterized by large hyperparameter spaces and long training times; this, coupled with the rise of parallel computing and the productionization of machine learning, motivates developing production-quality hyperparameter optimization functionality for a distributed computing setting. We address this challenge with a simple and robust hyperparameter optimization algorithm, ASHA, which exploits parallelism and aggressive early-stopping to tackle large-scale hyperparameter optimization problems. Our extensive empirical results show that ASHA outperforms state-of-the-art hyperparameter optimization methods; scales linearly with the number of workers in distributed settings; and is suitable for massive parallelism, converging to a high-quality configuration in half the time taken by Vizier (Google’s internal hyperparameter optimization service) in an experiment with 500 workers. We end with a discussion of the systems considerations we encountered and our associated solutions when implementing ASHA in SystemX, a production-quality service for hyperparameter tuning.
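The abstract only names the algorithm, so the following is a minimal, illustrative sketch of the general idea behind asynchronous successive halving: whenever a worker becomes free, the scheduler promotes a promising configuration to the next rung (a higher resource budget) if one is available, and otherwise starts a new configuration at the lowest budget. The class and method names here (`ASHAScheduler`, `get_job`, `report`) are hypothetical and are not taken from the paper or from SystemX.

```python
class ASHAScheduler:
    """Toy scheduler sketching ASHA-style asynchronous promotions (illustrative only)."""

    def __init__(self, min_resource=1, reduction_factor=4, max_rungs=5):
        self.eta = reduction_factor
        self.min_resource = min_resource
        self.max_rungs = max_rungs
        # rungs[k] maps config id -> loss observed at resource min_resource * eta**k
        self.rungs = [dict() for _ in range(max_rungs)]
        self.promoted = [set() for _ in range(max_rungs)]
        self._next_id = 0

    def get_job(self):
        """Return (config_id, rung) for a free worker: promote if possible, else grow the bottom rung."""
        # Scan from the highest promotable rung downward for a configuration that is
        # in the top 1/eta of its rung and has not already been promoted.
        for k in reversed(range(self.max_rungs - 1)):
            rung = self.rungs[k]
            n_promotable = len(rung) // self.eta
            if n_promotable == 0:
                continue
            top = sorted(rung, key=rung.get)[:n_promotable]  # lowest losses first
            for cfg in top:
                if cfg not in self.promoted[k]:
                    self.promoted[k].add(cfg)
                    return cfg, k + 1
        # No promotion available: start a new configuration at the bottom rung,
        # so workers never sit idle waiting for a rung to fill up.
        cfg = self._next_id
        self._next_id += 1
        return cfg, 0

    def report(self, config_id, rung, loss):
        """Record the loss a worker observed for config_id at the given rung."""
        self.rungs[rung][config_id] = loss

    def resource(self, rung):
        """Training budget (e.g., epochs) allotted at a given rung."""
        return self.min_resource * self.eta ** rung
```

In this sketch, each worker repeatedly calls `get_job()`, trains the returned configuration for `resource(rung)` units, and calls `report()` with the result; because promotions are decided per-worker rather than after a rung is complete, workers are never blocked, which is the property that lets this style of scheme scale with the number of workers.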
