Software-defined far memory in warehouse-scale computers

Andres Lagar-Cavilla; Junwhan Ahn; Suleiman Souhlal; Neha Agarwal; Radoslaw Burny; Shakeel Butt; Jichuan Chang; Ashwin Chaugule; Nan Deng; Junaid Shahid; Greg Thelen; Kamil Adam Yurtsever; Yu Zhao; Parthasarathy Ranganathan

International Conference on Architectural Support for Programming Languages and Operating Systems (2019)

Abstract

Increasing memory demand and a slowdown in technology scaling pose important challenges to the total cost of ownership (TCO) of warehouse-scale computers (WSCs). One promising idea to reduce the memory TCO is to add a cheaper, but slower, "far memory" tier and use it to store infrequently accessed (or cold) data. However, introducing a far memory tier brings new challenges around dynamically responding to workload diversity and churn, minimizing stranding of capacity, and addressing brownfield (legacy) deployments.

We present a novel software-defined approach to far memory that proactively compresses cold memory pages, effectively creating a far memory tier in software. Our end-to-end system design encompasses new methods to define performance service-level objectives (SLOs), a mechanism to identify cold memory pages while meeting the SLO, and our implementation in the OS kernel and node agent. Additionally, we design learning-based autotuning to periodically adapt our design to fleet-wide changes without a human in the loop. Our system has been successfully deployed across Google's WSCs since 2016, serving thousands of production services. Our software-defined far memory is significantly cheaper (67% or higher memory cost reduction) at relatively good access speeds (6 µs) and allows us to store a significant fraction of infrequently accessed data (on average, 20%), translating to significant TCO savings at warehouse scale.
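The core idea of the abstract, treating pages that have not been accessed for some time as cold and keeping them compressed, can be illustrated with a small user-space sketch. This is not the kernel and node-agent implementation described above: the `Page` class, the `cold_age_s` threshold, the use of `zlib`, and the `cost_reduction` helper are all hypothetical choices made for illustration only.

```python
import zlib
from dataclasses import dataclass

PAGE_SIZE = 4096  # bytes; a common page size, assumed for this sketch


@dataclass
class Page:
    data: bytes
    last_access_s: float  # time of the last access, in seconds


def compress_cold_pages(pages, now_s, cold_age_s):
    """Split pages into (near, far).

    Pages idle for at least cold_age_s seconds are treated as cold and
    stored zlib-compressed in `far`; the rest stay uncompressed in `near`.
    The fixed threshold is an assumption of this sketch.
    """
    near, far = [], []
    for page in pages:
        if now_s - page.last_access_s >= cold_age_s:
            far.append(zlib.compress(page.data))
        else:
            near.append(page.data)
    return near, far


def cost_reduction(cold_fraction, far_cost_reduction):
    """Back-of-envelope overall saving, assuming cost scales with capacity.

    This simple product is illustrative arithmetic, not a result
    reported by the authors.
    """
    return cold_fraction * far_cost_reduction


pages = [
    Page(bytes(PAGE_SIZE), last_access_s=0.0),          # idle for 120 s
    Page(bytes(range(256)) * 16, last_access_s=100.0),  # idle for 20 s
]
near, far = compress_cold_pages(pages, now_s=120.0, cold_age_s=60.0)
saving = cost_reduction(cold_fraction=0.20, far_cost_reduction=0.67)
```

With the abstract's figures (20% of data cold on average, 67% cost reduction for that data), the product in `cost_reduction` comes to roughly 13% of memory cost under the stated proportional-cost assumption. In a real system the threshold would be chosen to satisfy the performance SLO rather than fixed by hand.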
