Google Research

MAO - an Extensible Micro-Architectural Optimizer

  • Robert Hundt
  • Easwaran Raman
  • Martin Thuresson
  • Neil Vachharajani
Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization, ACM (2011)


Performance matters, and so does repeatability and predictability. Today's processors' micro-architectures have become so complex as to now contain many undocumented, not understood, and even puzzling performance cliffs. Small changes in the instruction stream, such as the insertion of a single NOP instruction, can lead to significant performance deltas, with the effect of exposing compiler and performance optimization efforts to perceived unwanted randomness.

This paper presents MAO, an extensible micro-architectural assembly to assembly optimizer, which seeks to address this problem for x86/64 processors. In essence, MAO is a thin wrapper around a common open source assembler infrastructure. It offers basic operations, such as creation or modification of instructions, simple data-flow analysis, and advanced infra-structure, such as loop recognition, and a repeated relaxation algorithm to compute instruction addresses and lengths. This infrastructure enables a plethora of passes for pattern matching, alignment specific optimizations, peep-holes, experiments (such as random insertion of NOPs), and fast prototyping of more sophisticated optimizations. MAO can be integrated into any compiler that emits assembly code, or can be used standalone. MAO can be used to discover micro-architectural details semi-automatically. Initial performance results are encouraging.

Research Areas

Learn more about how we do research

We maintain a portfolio of research projects, providing individuals and teams the freedom to emphasize specific types of work