Google Research

Google Workload Traces 2022


With the rapid expansion of the internet and cloud computing, warehouse-scale computing (WSC) workloads (search, email, video sharing, online maps, online shopping, etc.) have reached planetary scale and are among the fastest growing in computing demand. These workloads differ from others in their requirements for on-demand scalability, elasticity, and availability.

Google workloads differ from traditional benchmarks in many ways. For example, their data and instruction footprints exceed the capacity of modern CPU caches, so the CPU spends a significant portion of its time waiting for code and data. Simply increasing memory bandwidth would not solve the problem, because many accesses are on the critical path of application request processing; reducing memory access latency is therefore just as important as increasing memory bandwidth.

The Google Workload Traces capture the addresses of instruction and memory accesses made during workload execution. These traces will help system designers better understand how a WSC workload performs as it interacts with the underlying components, and help them develop new solutions for front-end and data-access bottlenecks.
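As a rough illustration of the kind of analysis an address trace makes possible, the sketch below measures a footprint at cache-line granularity and estimates the miss ratio of a simple cache. It does not read the actual trace format: the addresses are synthetic, the helper names `footprint_bytes` and `lru_miss_ratio` are hypothetical, and the 64-byte line size and fully associative LRU policy are simplifying assumptions.

```python
from collections import OrderedDict

LINE_BYTES = 64  # assumed cache-line size; real hardware may differ


def footprint_bytes(addresses, line_bytes=LINE_BYTES):
    """Bytes covered by the distinct cache lines that the addresses touch."""
    lines = {addr // line_bytes for addr in addresses}
    return len(lines) * line_bytes


def lru_miss_ratio(addresses, capacity_bytes, line_bytes=LINE_BYTES):
    """Miss ratio of a fully associative LRU cache (a deliberate simplification)."""
    capacity_lines = capacity_bytes // line_bytes
    cache = OrderedDict()  # keys are line numbers, ordered oldest to newest
    misses = 0
    total = 0
    for addr in addresses:
        total += 1
        line = addr // line_bytes
        if line in cache:
            cache.move_to_end(line)        # hit: mark as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > capacity_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return misses / total if total else 0.0


if __name__ == "__main__":
    # Synthetic trace (not real data): two passes over 1024 distinct lines.
    trace = [i * LINE_BYTES for i in range(1024)] * 2
    print(footprint_bytes(trace))            # 65536
    print(lru_miss_ratio(trace, 32 * 1024))  # 1.0
    print(lru_miss_ratio(trace, 64 * 1024))  # 0.5
```

In this toy trace the 64 KiB footprint is twice the size of the 32 KiB cache, so the cyclic access pattern misses on every access. Once the cache is large enough to hold the whole footprint, only the first pass misses.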

Over the past few years, we have found these traces useful for understanding WSC workloads and for seeding internal research on processor front-ends, on-die interconnects, caches, and memory subsystems, all areas that greatly impact WSC workloads. One example of their use is "AsmDB: Understanding and Mitigating Front-End Stalls in Warehouse-Scale Computers." We hope these traces will enable the computer architecture community to develop new ideas that improve the performance and efficiency of WSC workloads.

The Google Workload Traces are captured using DynamoRIO on servers running Google workloads. For more information about these traces, including their format, please visit: