The Auto-Arborist Dataset: A Large-Scale Benchmark for Generalizable, Multimodal Urban Forest Monitoring

Sara Meghan Beery
Guanhang Wu
Trevor Edwards
Filip Pavetić
Bo Majewski
Stan Chan
John Morgan
Vivek Mansing Rathod
CVPR 2022 (2022)

Abstract

Urban forests provide significant benefits to urban societies (e.g., cleaner air and water, carbon sequestration, and energy savings). However, planning and maintaining these forests is expensive. One particularly costly aspect of urban forest management is monitoring the existing trees in a city, i.e., tracking tree locations, species, and health. Monitoring efforts are currently based on tree censuses built by human experts, collected at a rate of once every five years or less and costing cities millions of dollars. In this paper, we explore the use of computer vision to automatically find, label, and monitor individual trees at a large scale using a combination of street-level and aerial imagery.

Previous investigations into automating this process focused on small datasets from single cities, covering only common species \cite{Branson2018, sumbul2017fine}. Such datasets fail to capture the complexity of the problem, which is both fine-grained and significantly long-tailed, and they result in methods that are not applicable to new cities.
To address this shortcoming, we introduce a new large-scale dataset that joins public tree inventories (maintained by cities) with a large collection of street-level and aerial imagery. Our Auto-Arborist dataset contains over 2.5 million trees covering more than 340 genus-level categories from North America and is currently at least two orders of magnitude larger than the closest comparable dataset in the literature. Uniquely, we cover multiple cities (to our knowledge, prior works have restricted their focus to single-city datasets), which allows for an analysis of generalization under geographic distribution shifts that was not previously possible.

We propose a set of metrics to evaluate performance, especially with respect to these geographic distribution shifts, and show the strengths and weaknesses of typical deep learning models when applied to the Auto-Arborist dataset. We hope our dataset can be an important and exciting new scientific benchmark that will spur progress on the application of computer vision to urban ecology and sustainability.

Research Areas