Jump to Content
Sriraman Tallam

Sriraman Tallam

Sriraman Tallam is a member of the GCC compiler optimization team at Google, Mountain View, CA where he works on compiler techniques to improve the performance and reduce the code size of applications. He obtained a PhD in Computer Science from the University of Arizona.
Authored Publications
Google Publications
Other Publications
Sort By
  • Title
  • Title, desc
  • Year
  • Year, desc
    Preview abstract While profile guided optimizations (PGO) and link time optimiza-tions (LTO) have been widely adopted, post link optimizations (PLO)have languished until recently when researchers demonstrated that late injection of profiles can yield significant performance improvements. However, the disassembly-driven, monolithic design of post link optimizers face scaling challenges with large binaries andis at odds with distributed build systems. To reconcile and enable post link optimizations within a distributed build environment, we propose Propeller, a relinking optimizer for warehouse scale work-loads. To enable flexible code layout optimizations, we introduce basic block sections, a novel linker abstraction. Propeller uses basic block sections to enable a new approach to PLO without disassembly. Propeller achieves scalability by relinking the binary using precise profiles instead of rewriting the binary. The overhead of relinking is lowered by caching and leveraging distributed compiler actions during code generation. Propeller has been deployed to production at Google with over tens of millions of cores executing Propeller optimized code at any time. An evaluation of internal warehouse-scale applications show Propeller improves performance by 1.1% to 8% beyond PGO and ThinLTO. Compiler tools such as Clang improve by 7% while MySQL improves by 1%. Compared to the state of the art binary optimizer, Propeller achieves comparable performance while lowering memory overheads by 30%-70% on large benchmarks. View details
    Preview abstract We have found that large C++ applications and shared libraries tend to have many functions whose code is identical with another function. As much as 10% of the code could theoretically be eliminated by merging such identical functions into a single copy. This optimization, Identical Code Folding (ICF), has been implemented in the gold linker. At link time, ICF detects functions with identical object code and merges them into a single copy. ICF can be unsafe, however, as it can change the run-time behaviour of code that relies on each function having a unique address. To address this, ICF can be used in a safe mode where it identifies and folds functions whose addresses are guaranteed not to have been used in comparison operations. Further, profiling and debugging binaries with merged functions can be confusing, as the PC values of merged functions cannot be always disambiguated to point to the correct function. To address this, we propose a new call table format for the DWARF debugging information to allow tools like the debugger and profiler to disambiguate PC values of merged functions correctly by examining the call chain. Detailed experiments on the x86 platform show that ICF can reduce the text size of a selection of Google binaries, whose average text size is 64 MB, by about 6%. Also, the code size savings of ICF with the safe option is almost as good as the code savings obtained without the safe option. Further, experiments also show that the run-time performance of the optimized binaries on the x86 platform does not change. View details
    Dynamic Recognition of Synchronization Operations for Improved Data Race Detection
    Chen Tian
    Vijay Nagarajan
    Rajiv Gupta
    Proc. International Symposium on Software Testing and Analysis, ACM, Seattle (2008), pp. 143-154
    Preview
    No Results Found