Filtering Translation Bandwidth with Virtual Caching

Hongil Yoon

Jason Lowe-Power

Gurindar S. Sohi

Architectural Support for Programming Languages and Operating Systems (ASPLOS), ACM, Williamsburg, Virginia, USA(2018)

Google Scholar

Abstract

Heterogeneous computing with GPUs integrated on the same chip as CPUs is ubiquitous, and to increase programmability many of these systems support virtual address accesses from GPU hardware. However, this entails address translations on every memory access. We observe that future GPUs and workloads show very high bandwidth demands (up to 4 accesses per cycle in some cases) of shared address translation hardware due to frequent private TLB misses. This greatly impacts performance (1.47× slowdown over an ideal MMU). To mitigate this overhead, we propose a software-agnostic, practical, GPU virtual cache hierarchy. We use the virtual cache hierarchy as an effective address translation bandwidth filter. We observe many requests that miss in private TLBs find corresponding valid data in the GPU cache hierarchy. With a GPU virtual cache hierarchy, these TLB misses can be filtered (i.e., virtual cache hits), significantly reducing bandwidth demands for the shared address translation hardware. In addition, accelerator-specific attributes (e.g., less likelihood of synonyms) of GPUs reduce the design complexity of virtual caches, making a whole virtual cache hierarchy (including a shared L2 cache) practical for GPUs. Our evaluation shows that the entire GPU virtual cache hierarchy effectively filters the high address translation bandwidth, achieving the same performance as an ideal MMU. We also evaluate L1-only virtual cache designs and show that using a whole virtual cache hierarchy obtains additional performance benefits (1.31× speedup on average).

Defining the technology of today and tomorrow.

Philosophy

People

Teams

AI/ML Foundations  & Capabilities

Algorithms & Optimization

Computing Paradigms

Responsible Human-Centric Technology

Science & Societal Impact

Projects

Publications

Resources

Shaping the future, together.

Student programs

Faculty programs

Conferences & events

Filtering Translation Bandwidth with Virtual Caching

Abstract

Research Areas

Meet the teams driving innovation

Defining the technology of today and tomorrow.

Philosophy

People

Teams

AI/ML Foundations & Capabilities

Algorithms & Optimization

Computing Paradigms

Responsible Human-Centric Technology

Science & Societal Impact

Projects

Publications

Resources

Shaping the future, together.

Student programs

Faculty programs

Conferences & events

Filtering Translation Bandwidth with Virtual Caching

Abstract

Research Areas

Meet the teams driving innovation

AI/ML Foundations  & Capabilities