Autotuning Convolutions is Easier Than You Think

Nicolas Tollenaere; Guillaume Iooss; Stéphane Pouget; Hugo Brunie; Christophe Guillon; Albert Cohen; P. Sadayappan; Fabrice Rastello

ACM TACO (2022)

Abstract

A wide range of scientific and machine learning applications depend on highly optimized implementations of tensor computations. Exploiting the full capacity of a given processor architecture remains a challenging task, due to the complexity of the microarchitectural features that come into play when seeking near-peak performance. Among the state-of-the-art techniques for loop transformations for performance optimization, AutoScheduler tends to outperform other systems. It often yields higher performance as compared to vendor libraries, but takes a large number of runs to converge, while also involving a complex training environment.

In this paper, we define a structured configuration space that enables much faster convergence to high-performance code versions, using only random sampling of candidates. We focus on two-dimensional convolutions on CPUs. Compared to state-of-the-art libraries, our structured search space enables higher performance for typical tensor shapes encountered in convolution stages in deep learning pipelines. Compared to autotuning code generators like AutoScheduler, it prunes the search space while increasing the density of efficient implementations. We analyze the impact on convergence speed and performance distribution, on two Intel x86 processors and one ARM AArch64 processor. We match or outperform the performance of the state-of-the-art oneDNN library and TVM’s AutoScheduler, while reducing the autotuning effort by at least an order of magnitude.
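
The idea of random sampling over a structured configuration space can be illustrated with a minimal sketch. This is not the paper's search space or code generator: the restriction of tile sizes to divisors of each loop extent, the function names (`sample_config`, `toy_cost`, `random_search`), the layer shape, and the cost function are all hypothetical and chosen only for illustration. In a real autotuner, the cost would be a measured run time of generated code rather than a formula.

```python
import random

def divisors(n):
    """All positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]

def sample_config(shape, rng):
    """Draw one tiling candidate: one tile size per loop dimension.

    Restricting tile sizes to divisors is an assumed, illustrative way of
    structuring the space; it is not the definition used in the paper.
    """
    return {dim: rng.choice(divisors(extent)) for dim, extent in shape.items()}

def toy_cost(config, vector_width=16):
    """Hypothetical stand-in for a measured run time (lower is better)."""
    # Penalize output-channel tiles that are not a multiple of the vector width.
    penalty = 0 if config["k"] % vector_width == 0 else 1
    # Penalize tiles whose footprint is far from a nominal budget of 4096 elements.
    footprint = config["k"] * config["c"] * config["h"] * config["w"]
    return penalty + abs(footprint - 4096) / 4096

def random_search(shape, cost, trials, seed=0):
    """Evaluate `trials` uniformly sampled candidates and keep the best one."""
    rng = random.Random(seed)
    best_config, best_cost = None, float("inf")
    for _ in range(trials):
        candidate = sample_config(shape, rng)
        candidate_cost = cost(candidate)
        if candidate_cost < best_cost:
            best_config, best_cost = candidate, candidate_cost
    return best_config, best_cost

# Assumed convolution layer: 64 output channels, 64 input channels, 56x56 image.
shape = {"k": 64, "c": 64, "h": 56, "w": 56}
best, score = random_search(shape, toy_cost, trials=200)
print(best, score)
```

The sketch shows why the density of good candidates matters: with plain random sampling, the number of trials needed to find an efficient configuration depends on what fraction of the space is efficient, so pruning poor candidates up front shortens the search.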
