In-Memory Performance for Big Data

Goetz Graefe; Haris Volos; Hideaki Kimura; Harumi Kuno; Joseph Tucek; Mark Lillibridge; Alistair Veitch

Proceedings of the VLDB Endowment, 8 (2014), pp. 37-48

Abstract

When a working set fits into memory, the overhead imposed by the buffer pool renders traditional databases non-competitive with in-memory designs that sacrifice the benefits of a buffer pool. However, despite the large memory available with modern hardware, data skew, shifting workloads, and complex mixed workloads make it difficult to guarantee that a working set will fit in memory. Hence, some recent work has focused on enabling in-memory databases to protect performance when the working data set almost fits in memory. Contrary to those prior efforts, we enable buffer pool designs to match in-memory performance while supporting the "big data" workloads that continue to require secondary storage, thus providing the best of both worlds. We introduce here a novel buffer pool design that adapts pointer swizzling for references between system objects (as opposed to application objects), and uses it to practically eliminate buffer pool overheads for memory-resident data. Our implementation and experimental evaluation demonstrate that we achieve graceful performance degradation when the working set grows to exceed the buffer pool size, and graceful improvement when the working set shrinks towards and below the memory and buffer pool sizes.
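To make the idea of pointer swizzling inside a buffer pool concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes each page has at most one in-memory parent, models secondary storage as a plain dictionary, and uses hypothetical names throughout (`Page`, `BufferPool`, `fix`, `child`, `evict`). A child slot holds either an on-disk page id (unswizzled) or a direct reference to the in-memory page (swizzled). Following a swizzled reference skips the buffer pool's hash-table lookup, and eviction restores the page id in the parent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple, Union


@dataclass(eq=False)  # identity comparison; pages reference each other
class Page:
    page_id: int
    payload: str
    # Each slot is an int (unswizzled page id) or a Page (swizzled reference).
    children: List[Union[int, "Page"]] = field(default_factory=list)
    # In-memory parent, kept so the reference can be unswizzled on eviction.
    parent: Optional["Page"] = None


class BufferPool:
    """Hypothetical buffer pool; `disk` stands in for secondary storage."""

    def __init__(self, disk: Dict[int, Tuple[str, List[int]]]):
        self.disk = disk
        self.frames: Dict[int, Page] = {}  # page id -> resident page
        self.lookups = 0                   # counts hash-table lookups

    def fix(self, page_id: int) -> Page:
        """Conventional path: hash-table lookup, loading the page on a miss."""
        self.lookups += 1
        page = self.frames.get(page_id)
        if page is None:
            payload, child_ids = self.disk[page_id]
            page = Page(page_id, payload, list(child_ids))
            self.frames[page_id] = page
        return page

    def child(self, parent: Page, slot: int) -> Page:
        """Follow a parent-to-child reference, swizzling it on first use."""
        ref = parent.children[slot]
        if isinstance(ref, Page):
            return ref                     # swizzled: no lookup needed
        page = self.fix(ref)               # unswizzled: go through the pool
        page.parent = parent
        parent.children[slot] = page       # swizzle for later accesses
        return page

    def evict(self, page: Page) -> None:
        """Unswizzle the parent's reference, then drop the page from memory."""
        if any(isinstance(c, Page) for c in page.children):
            raise ValueError("evict swizzled children first")
        if page.parent is not None:
            slot = page.parent.children.index(page)
            page.parent.children[slot] = page.page_id
            page.parent = None
        del self.frames[page.page_id]


if __name__ == "__main__":
    pool = BufferPool({1: ("root", [2, 3]), 2: ("left", []), 3: ("right", [])})
    root = pool.fix(1)
    pool.child(root, 0)   # first access goes through the hash table
    pool.child(root, 0)   # second access follows the swizzled reference
    print("hash-table lookups:", pool.lookups)
```

In this sketch the repeated access to the same child costs no additional lookup, which illustrates how swizzled references can remove buffer pool overhead for memory-resident pages while eviction keeps the on-disk representation usable.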
