- Tianwei Sheng
- Neil Vachharajani
- Stephane Eranian
- Robert Hundt
Concurrency bugs, particularly data races, are notoriously difficult to debug and are a significant source of unreliability in multithreaded applications. Many tools to catch data races rely on program instrumentation to obtain memory instruction traces. Unfortunately, this instrumentation introduces significant runtime overhead, is extremely invasive, or has a limited domain of applicability making these tools unsuitable for many production systems. Consequently, these tools are typically used during application testing where many data races go undetected.
This paper proposes RACEZ, a novel race detection mechanism which uses a sampled memory trace collected by the hardware performance monitoring unit rather than invasive instrumentation. The approach introduces only a modest overhead making it usable in production environments. We validate RACEZ using two open source server applications and the PARSEC benchmarks. Our experiments show that RACEZ catches a set of known bugs with reasonable probability while introducing only 2.8% runtime slow down on average.