Michael Lippautz

Authored Publications
    Concurrent Marking of Shape-Changing Objects
    Ulan Degenbaev
    Proceedings of the 2019 ACM SIGPLAN International Symposium on Memory Management, ACM, New York, NY, USA, pp. 89-102
    Abstract: Efficient garbage collection is a key goal in engineering high-performance runtime systems. To reduce pause times, many collector designs traverse the object graph concurrently with the application, an optimization known as concurrent marking. Traditional concurrent marking imposes strict invariants on object shapes: 1) static type layout of objects, 2) static object memory locations, and 3) static object sizes. High-performance virtual machines for dynamic languages, such as the V8 JavaScript virtual machine used in the Google Chrome web browser, generally violate these constraints in pursuit of high single-threaded throughput. Taking V8 as an example, we show that some object shape changes are safe and can be handled by traditional concurrent marking algorithms. For unsafe shape changes, we introduce novel wait-free object snapshotting and lock-based concurrent marking algorithms and prove that they preserve key invariants. We implemented both algorithms in V8 and achieved performance improvements on various JavaScript benchmark suites and real-world web workloads. Concurrent marking of shape-changing objects using the wait-free object snapshotting algorithm has been enabled by default in Chrome since version 64.
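To make the object-snapshotting idea concrete, here is a deliberately simplified C++ sketch: before a mutator applies an unsafe shape change, it copies the object's current slots into a snapshot that the concurrent marker traces instead of the changing object. The paper's algorithm is wait-free; this sketch uses a mutex-protected worklist for brevity, and all names (Object, Snapshot, MarkingWorklist) are illustrative rather than V8's actual types.

```cpp
// Simplified sketch, not V8's implementation: the mutator publishes a snapshot
// of an object's outgoing references before changing its shape, so the marker
// never inspects an object whose layout is in flux.
#include <atomic>
#include <mutex>
#include <vector>

struct Object;

// A frozen copy of an object's outgoing references at snapshot time.
struct Snapshot {
  const Object* origin;
  std::vector<Object*> slots;
};

// Shared worklist drained by the concurrent marker. A mutex keeps the sketch
// short; a real collector would use a wait-free or per-thread structure.
struct MarkingWorklist {
  std::mutex mu;
  std::vector<Snapshot> snapshots;
  void Push(Snapshot s) {
    std::lock_guard<std::mutex> lock(mu);
    snapshots.push_back(std::move(s));
  }
};

struct Object {
  std::atomic<bool> marked{false};
  std::vector<Object*> slots;  // outgoing references under the current shape
};

// Called by the mutator immediately before an unsafe shape change
// (e.g. a layout transition that moves or reinterprets slots).
void SnapshotBeforeShapeChange(Object* obj, MarkingWorklist* worklist) {
  Snapshot snap{obj, obj->slots};  // copy the slot values as they are now
  worklist->Push(std::move(snap));
  // ... the mutator may now change obj's shape safely ...
}

// Concurrent marker: marks targets recorded in snapshots.
void DrainSnapshots(MarkingWorklist* worklist) {
  std::lock_guard<std::mutex> lock(worklist->mu);
  for (const Snapshot& snap : worklist->snapshots) {
    for (Object* target : snap.slots) {
      if (target) target->marked.store(true, std::memory_order_relaxed);
    }
  }
  worklist->snapshots.clear();
}

int main() {
  MarkingWorklist worklist;
  Object a, b;
  a.slots = {&b};
  SnapshotBeforeShapeChange(&a, &worklist);
  a.slots.clear();  // the shape change drops the slot; the snapshot keeps it
  DrainSnapshots(&worklist);
  return b.marked.load() ? 0 : 1;
}
```

The property being illustrated is that the marker only ever reads the frozen snapshot, so it cannot observe an object halfway through a shape change.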
    Garbage Collection as a Joint Venture
    Ulan Degenbaev
    Communications of the ACM, Volume 62 (2019)
    Abstract: A collaborative approach to reclaiming memory in heterogeneous software systems.
    Cross-Component Garbage Collection
    Ulan Degenbaev
    Proceedings of the ACM on Programming Languages, Volume 2, Issue OOPSLA (2018), pp. 151:1-151:24
    Abstract: Embedding a modern language runtime as a component in a larger software system is popular these days. Communication between these systems often requires keeping references to each other's objects. In this paper we present and discuss the problem of cross-component memory management, where reference cycles across component boundaries may lead to memory leaks and premature reclamation of objects may lead to dangling cross-component references. We provide a generic algorithm for effective, efficient, and safe garbage collection over component boundaries, which we call cross-component tracing. We designed and implemented cross-component tracing in the Chrome web browser, where the JavaScript virtual machine V8 is embedded into the rendering engine Blink. Cross-component tracing from V8's JavaScript heap to Blink's C++ heap improves garbage collection latency and eliminates long-standing memory leaks for real websites in Chrome. We show how cross-component tracing can help web developers reason about reachability and retention of objects spanning both V8 and Blink components, based on Chrome's heap snapshot memory tool. Cross-component tracing was enabled by default for all websites in Chrome version 57 and is also deployed in other widely used software systems such as Opera, Cobalt, and Electron.
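As a rough illustration of cross-component tracing, the sketch below models two heaps, one "JS-like" and one "C++-like", whose marker crosses the component boundary through an embedder-supplied tracer, so that objects reachable only via a cross-component cycle are still marked consistently. This is not Chrome's actual V8/Blink API; all types and names (JsObject, CppObject, EmbedderTracer) are hypothetical.

```cpp
// Minimal cross-component tracing sketch: the JS-side marker hands wrapper
// targets to the embedder's tracer, which marks the C++ side and reports any
// edges pointing back into the JS heap.
#include <vector>

struct CppObject;

struct JsObject {
  bool marked = false;
  std::vector<JsObject*> js_refs;
  CppObject* wrapped = nullptr;  // cross-component edge into the C++ heap
};

struct CppObject {
  bool marked = false;
  std::vector<CppObject*> cpp_refs;
  JsObject* js_ref = nullptr;  // cross-component edge back into the JS heap
};

// Embedder-side tracer invoked by the JS marker for wrapper objects.
struct EmbedderTracer {
  std::vector<JsObject*> back_edges;  // JS objects discovered via the C++ heap
  void Trace(CppObject* obj) {
    if (!obj || obj->marked) return;
    obj->marked = true;
    if (obj->js_ref) back_edges.push_back(obj->js_ref);
    for (CppObject* ref : obj->cpp_refs) Trace(ref);
  }
};

// JS-side marker that crosses the component boundary through the tracer.
void MarkJs(JsObject* root, EmbedderTracer* tracer) {
  std::vector<JsObject*> worklist{root};
  while (!worklist.empty()) {
    JsObject* obj = worklist.back();
    worklist.pop_back();
    if (!obj || obj->marked) continue;
    obj->marked = true;
    if (obj->wrapped) tracer->Trace(obj->wrapped);  // cross into the C++ heap
    for (JsObject* ref : obj->js_refs) worklist.push_back(ref);
    // Pull in JS objects that became reachable only via the C++ heap.
    for (JsObject* back : tracer->back_edges) worklist.push_back(back);
    tracer->back_edges.clear();
  }
  // Objects left unmarked on either heap are garbage, even if they sit on a
  // JS <-> C++ reference cycle.
}

int main() {
  JsObject js_root, js_leaf;
  CppObject cpp_node;
  js_root.wrapped = &cpp_node;  // JS -> C++
  cpp_node.js_ref = &js_leaf;   // C++ -> JS, closing a cross-component path
  EmbedderTracer tracer;
  MarkJs(&js_root, &tracer);
  return (js_leaf.marked && cpp_node.marked) ? 0 : 1;
}
```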
    Fast, Multicore-scalable, Low-fragmentation Memory Allocation Through Large Virtual Memory and Global Data Structures
    Martin Aigner
    Christoph M. Kirsch
    Ana Sokolova
    Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications, ACM, New York, NY, USA, pp. 451-469
    Abstract: We demonstrate that general-purpose memory allocation involving many threads on many cores can be done with high performance, multicore scalability, and low memory consumption. For this purpose, we have designed and implemented scalloc, a concurrent allocator that, in our experiments, generally performs and scales better than other allocators while using less memory, and is otherwise still competitive. The main ideas behind the design of scalloc are: uniform treatment of small and big objects through so-called virtual spans, efficient and effective reclamation of free memory through fast and scalable global data structures, and constant-time (modulo synchronization) allocation and deallocation operations that trade off memory reuse and spatial locality without being subject to false sharing.
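The virtual-span idea can be sketched as follows: every span reserves the same large amount of virtual address space, so small and big objects go through the same mechanism, while physical memory is committed only for the portion actually in use. The sketch below is POSIX-only, uses a plain bump pointer rather than scalloc's size-class free lists and global data structures, and its constants (a 2 MiB virtual span, 4 KiB pages) are illustrative assumptions.

```cpp
// Simplified virtual-span sketch: reserve a large virtual region per span with
// PROT_NONE, then commit pages lazily as the used prefix ("real span") grows.
#include <sys/mman.h>
#include <cstddef>
#include <cstdio>

constexpr size_t kVirtualSpanSize = 2u << 20;  // 2 MiB of address space per span
constexpr size_t kPageSize = 4096;             // assumed page size for the sketch

struct VirtualSpan {
  char* base = nullptr;  // start of the reserved virtual region
  size_t committed = 0;  // bytes currently backed by physical memory
  size_t bump = 0;       // simple bump pointer for this sketch
};

// Reserve address space without committing physical memory.
bool ReserveSpan(VirtualSpan* span) {
  void* mem = mmap(nullptr, kVirtualSpanSize, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (mem == MAP_FAILED) return false;
  span->base = static_cast<char*>(mem);
  return true;
}

// Allocate from the span, committing pages lazily as the real span grows.
void* SpanAllocate(VirtualSpan* span, size_t size) {
  if (span->bump + size > kVirtualSpanSize) return nullptr;  // does not fit
  size_t needed = span->bump + size;
  if (needed > span->committed) {
    size_t new_committed = ((needed + kPageSize - 1) / kPageSize) * kPageSize;
    if (mprotect(span->base, new_committed, PROT_READ | PROT_WRITE) != 0)
      return nullptr;
    span->committed = new_committed;
  }
  void* result = span->base + span->bump;
  span->bump += size;
  return result;
}

int main() {
  VirtualSpan span;
  if (!ReserveSpan(&span)) return 1;
  void* small = SpanAllocate(&span, 64);        // small object
  void* big = SpanAllocate(&span, 512 * 1024);  // big object, same mechanism
  std::printf("small=%p big=%p committed=%zu bytes\n", small, big,
              span.committed);
  munmap(span.base, kVirtualSpanSize);
  return (small && big) ? 0 : 1;
}
```

The point of the uniform treatment is that a "big" allocation does not need a separate code path: it simply commits more of the span's already reserved address space.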
    Distributed Queues in Shared Memory - Multicore Performance and Scalability through Quantitative Relaxation
    Andreas Haas
    Thomas A. Henzinger
    Ana Sokolova
    Christoph M. Kirsch
    Ali Sezgin
    Proceedings of the ACM International Conference on Computing Frontiers, ACM, New York, NY, USA (2013), pp. 17:1-17:9
    Abstract: A prominent remedy to multicore scalability issues in concurrent data structure implementations is to relax the sequential specification of the data structure. We present distributed queues (DQ), a new family of relaxed concurrent queue implementations. DQs implement relaxed queues with a linearizable emptiness check and either configurable or bounded out-of-order behavior, or pool behavior. Our experiments show that, in micro- and macro-benchmarks, DQs outperform and outscale all strict and relaxed queue as well as pool implementations that we considered.
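A minimal sketch of the distributed-queue idea: several independent FIFO "partial" queues behind one interface, with enqueues and dequeues spread across them round-robin, trading strict FIFO order for scalability. This simplified version uses a lock per partial queue and a naive scan when a queue looks empty; it does not reproduce the paper's linearizable emptiness check or its load-balancing variants, and all names and parameters are illustrative.

```cpp
// Relaxed distributed queue sketch: k partial FIFO queues, round-robin access.
#include <atomic>
#include <deque>
#include <mutex>
#include <optional>
#include <vector>

template <typename T>
class DistributedQueue {
 public:
  explicit DistributedQueue(size_t num_partial) : partial_(num_partial) {}

  void Enqueue(T value) {
    size_t i = enqueue_index_.fetch_add(1, std::memory_order_relaxed) %
               partial_.size();
    std::lock_guard<std::mutex> lock(partial_[i].mu);
    partial_[i].items.push_back(std::move(value));
  }

  // Dequeues from the "next" partial queue; if it is empty, scans the others.
  std::optional<T> Dequeue() {
    size_t start = dequeue_index_.fetch_add(1, std::memory_order_relaxed) %
                   partial_.size();
    for (size_t d = 0; d < partial_.size(); ++d) {
      Partial& p = partial_[(start + d) % partial_.size()];
      std::lock_guard<std::mutex> lock(p.mu);
      if (!p.items.empty()) {
        T value = std::move(p.items.front());
        p.items.pop_front();
        return value;
      }
    }
    return std::nullopt;  // every partial queue looked empty during the scan
  }

 private:
  struct Partial {
    std::mutex mu;
    std::deque<T> items;
  };
  std::vector<Partial> partial_;
  std::atomic<size_t> enqueue_index_{0};
  std::atomic<size_t> dequeue_index_{0};
};

int main() {
  DistributedQueue<int> q(/*num_partial=*/4);
  for (int i = 0; i < 8; ++i) q.Enqueue(i);
  int sum = 0;
  while (auto v = q.Dequeue()) sum += *v;  // elements may come back out of order
  return sum == 28 ? 0 : 1;
}
```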
    Short-term Memory for Self-collecting Mutators
    Martin Aigner
    Andreas Haas
    Christoph M. Kirsch
    Ana Sokolova
    Stephanie Stroka
    Andreas Unterweger
    Proceedings of the International Symposium on Memory Management, ACM, New York, NY, USA (2011), pp. 99-108
    Abstract: We propose a new memory model called short-term memory for managing objects on the heap. In contrast to the traditional persistent memory model for heap management, objects in short-term memory expire after a finite amount of time, which makes deallocation unnecessary. Instead, the expiration of objects may be extended, if necessary, by refreshing. We have developed a concurrent, incremental, and non-moving implementation of short-term memory for explicit refreshing, called self-collecting mutators, that is based on programmer-controlled time and integrated into state-of-the-art runtimes of three programming languages: C, Java, and Go. All memory management operations run in constant time without acquiring any locks, modulo the underlying allocators. Our implementation does not require any additional heap management threads, hence the name. Expired objects may be collected anywhere between one at a time, for maximal incrementality, and all at once, for maximal throughput and minimal memory consumption. The integrated systems are heap management hybrids with persistent memory as the default and short-term memory as an option. Our approach is fully backwards compatible: legacy code runs without any modifications, with negligible runtime overhead and constant per-object space overhead. Legacy code can be modified to take advantage of short-term memory by having some but not all objects allocated in short-term memory and managed by explicit refreshing. We study single- and multi-threaded use cases in all three languages, macro-benchmarking C and Java and micro-benchmarking Go. Our results show that using short-term memory (1) simplifies heap management in a state-of-the-art H.264 encoder written in C with no additional time overhead and only minor space overhead, and (2) improves, at the expense of safety, memory management throughput, latency, and space consumption by reducing the number of garbage collection runs, often even to zero, for a number of Java and Go programs.
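To illustrate the programming model (not the paper's implementation), the single-threaded C++ sketch below allocates objects with an expiration time against a logical clock that the program advances explicitly; objects that are still needed must be refreshed, and everything whose expiration time has passed is reclaimed when the clock ticks. All names (ShortTermHeap, Allocate, Refresh, Tick) and the linear refresh lookup are hypothetical simplifications.

```cpp
// Short-term memory sketch: allocation with expiration, explicit refreshing,
// and reclamation driven by a programmer-advanced logical clock.
#include <cstdint>
#include <cstdio>
#include <list>
#include <memory>
#include <string>

struct ShortTermHeap {
  struct Cell {
    uint64_t expires;               // logical time at which the object expires
    std::shared_ptr<void> payload;  // keeps the object alive until reclaimed
  };
  uint64_t now = 0;
  std::list<Cell> cells;

  // Allocate an object that expires `extent` ticks from now.
  template <typename T, typename... Args>
  T* Allocate(uint64_t extent, Args&&... args) {
    auto obj = std::make_shared<T>(std::forward<Args>(args)...);
    T* raw = obj.get();
    cells.push_back(Cell{now + extent, std::move(obj)});
    return raw;
  }

  // Extend the lifetime of an object that is still in use.
  void Refresh(const void* ptr, uint64_t extent) {
    for (Cell& cell : cells) {
      if (cell.payload.get() == ptr) {
        cell.expires = now + extent;
        return;
      }
    }
  }

  // Advance the logical clock and reclaim every expired object.
  void Tick() {
    ++now;
    cells.remove_if([this](const Cell& c) { return c.expires < now; });
  }
};

int main() {
  ShortTermHeap heap;
  std::string* keep = heap.Allocate<std::string>(/*extent=*/1, "kept");
  heap.Allocate<std::string>(/*extent=*/1, "dropped");
  heap.Tick();                       // both objects still alive (expire at tick 1)
  heap.Refresh(keep, /*extent=*/1);  // only `keep` gets its lifetime extended
  heap.Tick();                       // the unrefreshed string is reclaimed
  std::printf("live objects: %zu (%s)\n", heap.cells.size(), keep->c_str());
  return heap.cells.size() == 1 ? 0 : 1;
}
```

The sketch shows why deallocation becomes unnecessary in this model: forgetting to refresh an object is what frees it, which is also where the paper's trade-off against safety comes from.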