Maintaining large codebases can be a challenging endeavour. As new libraries, APIs and standards are introduced, old code is migrated to use them. To provide as clean and succinct an interface as possible for developers, old APIs are ideally removed as new ones are introduced. In practice, this becomes difficult as automatically finding and transforming code in a semantically correct way can be challenging, particularly as the size of a codebase increases.
In this paper, we present a real-world implementation of a system to refactor large C++ codebases efficiently. A combination of the Clang compiler framework and the MapReduce parallel processor, ClangMR enables code maintainers to easily and correctly transform large collections of code. We describe the motivation behind such a tool, its implementation and then present our experiences using it in a recent API update with Google’s C++ codebase.