How to Train Navigation Agents (on a Sample and Compute) Budget

Erik Wijmans
Dhruv Batra
International Conference on Autonomous Agents and Multi-Agent Systems, 2022

Abstract

PointGoal Navigation has seen significant recent interest and progress, spurred on by the Habitat platform and associated challenge [1]. In this paper, we study PointGoal Navigation under both a sample budget (75 million frames) and a compute budget (1 GPU for 1 day). We conduct an extensive set of experiments, cumulatively totaling over 50,000 GPU-hours, that let us identify and discuss a number of ostensibly minor but significant design choices: the advantage estimation procedure (a key component in training), the visual encoder architecture, and a seemingly minor hyper-parameter change. Overall, these design choices lead to considerable and consistent improvements over the baselines presented in Savva et al. [1]. Under a sample budget, performance for RGB-D agents improves by 8 SPL on Gibson (14% relative improvement) and 20 SPL on Matterport3D (38% relative improvement). Under a compute budget, performance for RGB-D agents improves by 19 SPL on Gibson (32% relative improvement) and 35 SPL on Matterport3D (220% relative improvement). We hope our findings and recommendations will serve to make the community's experiments more efficient.
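The results above are reported in SPL and as relative improvements, neither of which the abstract defines. The following is a minimal sketch, assuming the commonly used definition of SPL (Success weighted by Path Length): the mean over episodes of S_i * l_i / max(p_i, l_i), where S_i is binary success, l_i is the shortest-path length, and p_i is the length of the path the agent took, scaled by 100 when reported. The function names and the episode values are hypothetical and chosen only for illustration; they are not taken from the paper's experiments or code.

```python
from typing import Sequence


def spl(successes: Sequence[bool],
        shortest_lengths: Sequence[float],
        path_lengths: Sequence[float]) -> float:
    """Mean of S_i * l_i / max(p_i, l_i) over episodes (value in [0, 1]).

    Assumes every shortest-path length is strictly positive.
    """
    n = len(successes)
    if n == 0 or n != len(shortest_lengths) or n != len(path_lengths):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for success, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        # A failed episode contributes 0; a successful one is penalized
        # by how much longer the agent's path is than the shortest path.
        total += float(success) * shortest / max(taken, shortest)
    return total / n


def relative_improvement(baseline: float, improved: float) -> float:
    """Fractional gain of `improved` over `baseline` (0.2 means 20%)."""
    return (improved - baseline) / baseline


# Hypothetical episodes, for illustration only.
successes = [True, True, False, True]
shortest = [5.0, 10.0, 8.0, 4.0]
taken = [5.0, 12.5, 20.0, 8.0]

score = spl(successes, shortest, taken)
print(f"SPL (x100): {100 * score:.1f}")

# Hypothetical baseline and improved scores, for illustration only.
print(f"Relative improvement: {relative_improvement(50.0, 60.0):.0%}")
```

Under this reading, an absolute gain and a relative improvement together imply the baseline score, since the relative improvement is the absolute gain divided by the baseline. A large relative improvement paired with a moderate absolute gain therefore indicates a low baseline.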