DeduBB: Binary Code Size Reduction via Post-Link Basic Block De-duplication

Chaitanya Mamatha Ananda
Rajiv Gupta
Mahbod Afarin
Han Shen
LCTES (Languages, Compilers, Tools and Theory of Embedded Systems) (2026) (to appear)
Google Scholar

Abstract

Binary sizes of newer versions of software applications tend to be larger, primarily due to feature bloat. This poses various challenges, particularly for mobile applications. It affects
upgrade rates directly impacting revenues, increases maintenance costs of supporting multiple versions, and prevents some users from getting critical security fixes. Code bloat also poses a problem for large warehouse-scale applications. Such applications experience performance degradation when their code size exceeds what smaller and more efficient code
models can handle.

In this paper, we introduce a post-link optimization tech nique called DeduBB, which deduplicates basic blocks of an application across procedure boundaries. While prior tech-
niques used function outlining to de-duplicate redundant code sequences, it missed out on many opportunities as it cannot handle code that manipulates the program stack. In addition, previous techniques were either limited to the scope of a module or lacked scalable implementations required to handle large warehouse-scale applications. Our technique,
DeduBB, handles all types of code duplication as we use a novel save-and-jump code pattern to execute de-duplicated code blocks. In addition, DeduBB has been designed to work on scalable post-link optimizers and can even be applied to large warehouse-scale datacenter applications. Finally, DeduBB is profile-guided and can be applied selectively to infrequently executed cold basic blocks to not affect application performance. In fact, in several cases, the performance
of the smaller application binary improves due to reductions in its hot working set size. We have implemented our technique on the state-of-the-art post link optimizers, BOLT and Propeller. Experiments show that we can significantly reduce the code size of several benchmarks by 1.55% to 18.63%, on both Arm and x86 platforms, and on binaries that have already been heavily optimized for size using existing code size reduction features. Furthermore, aided by profiles, our technique can retain more than 80% of the maximal code size savings without affecting performance.
×