Scaling Spherical CNNs

Jean-Jacques Slotine
International Conference on Machine Learning (ICML) (2023)

Abstract

Spherical CNNs generalize CNNs to functions on the sphere by using spherical convolutions as the main linear operation. The most accurate and efficient way to compute spherical convolutions is in the spectral domain (via the convolution theorem), but this is still much more costly than the usual planar convolutions. For this reason, applications of spherical CNNs have so far been limited to small problems that can be approached with low model capacity. In this work, we show how spherical CNNs can be scaled for much larger problems. To achieve this, we make critical improvements: an implementation of core operations that exploits hardware accelerator characteristics, novel variants of common model components, and application-specific input representations that exploit the properties of our model. Experiments show that our larger spherical CNNs reach state-of-the-art performance on several targets of the QM9 molecular benchmark, which was previously dominated by equivariant graph neural networks, and achieve competitive performance on multiple weather forecasting tasks.
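
To make the spectral-domain computation mentioned in the abstract concrete, here is a minimal sketch of the classical convolution theorem for a zonal (rotationally symmetric) filter on the sphere: each degree-l block of signal coefficients is scaled by the filter's order-0 coefficient for that degree, times 2π·sqrt(4π/(2l+1)). This is an illustration under stated assumptions, not the authors' implementation. The function name `zonal_spherical_conv`, the coefficient layout, and the toy values are hypothetical, and the forward and inverse spherical harmonic transforms, which account for most of the cost in practice, are left out.

```python
import numpy as np


def zonal_spherical_conv(f_hat, h_hat):
    """Convolve a spherical signal with a zonal filter in the spectral domain.

    f_hat: list where f_hat[l] is an array of shape (2l+1,) holding the
           degree-l spherical harmonic coefficients (orders m = -l..l).
    h_hat: array of shape (L+1,) holding the order-0 coefficient of the
           filter for each degree l (a zonal filter has no other orders).
    Returns a list with the same layout as f_hat.
    """
    out = []
    for l, coeffs in enumerate(f_hat):
        # Per-degree normalization from the convolution theorem for zonal filters.
        scale = 2.0 * np.pi * np.sqrt(4.0 * np.pi / (2 * l + 1))
        out.append(scale * h_hat[l] * np.asarray(coeffs))
    return out


# Toy band-limited signal (degrees 0..2) and a hypothetical low-pass filter.
rng = np.random.default_rng(0)
f_hat = [rng.normal(size=2 * l + 1) + 1j * rng.normal(size=2 * l + 1)
         for l in range(3)]
h_hat = np.array([1.0, 0.5, 0.25])
g_hat = zonal_spherical_conv(f_hat, h_hat)
```

The product itself is a cheap pointwise scaling per degree. The expense relative to planar convolutions comes from moving between the sphere and its spectral coefficients.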