Type Migration in Ultra-Large-Scale Codebases

Ali Mesbah
Ameya Ketkar
Danny Dig
Davood Mazinanian
Eddie Aftandilian
International Conference on Software Engineering (ICSE) (2019) (to appear)

Abstract

Type migration is a refactoring activity in which an existing type is replaced with another one throughout the source code. Manually performing type migration is tedious, as programmers need to find all instances of the type to be migrated, along with its dependencies that propagate over assignment operations, method hierarchies, and subtypes. Existing automated approaches for type migration are not adequate for ultra-large-scale codebases because they perform an intensive whole-program analysis that does not scale. If we could represent the type structure of the program as graphs, then we could employ a MapReduce parallel and distributed process that scales to hundreds of millions of LOC. We implemented this approach as an IDE-independent tool called T2R, which integrates with most build systems. We evaluated T2R’s accuracy, usefulness, and scalability on seven open-source projects and one proprietary codebase of 300M LOC. T2R generated 130 type migration patches, of which the original developers accepted 98%.
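
To make the idea of dependencies that propagate over assignments and method hierarchies more concrete, the following is a minimal sketch, not T2R's actual algorithm. It assumes that program elements (fields, parameters, locals, return types) are nodes in a graph and that an edge links two elements whose types must change together. The function name `migration_set`, the node labels, and the edge list are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque


def migration_set(edges, seeds):
    """Return every program element that must be migrated along with the seeds.

    edges: pairs of elements whose types are tied together (hypothetical
           examples: an assignment, an argument pass, a method override).
    seeds: elements whose type the developer wants to migrate.
    """
    graph = defaultdict(set)
    for a, b in edges:
        # Assumption: a type dependency constrains both ends, so the
        # graph is treated as undirected.
        graph[a].add(b)
        graph[b].add(a)

    seen = set(seeds)
    queue = deque(seeds)
    while queue:  # breadth-first traversal from the seed elements
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical type-dependency edges extracted from a small program.
edges = [
    ("field:Cache.loader", "param:Cache.<init>#0"),  # assignment in constructor
    ("param:Cache.<init>#0", "local:Main.main.f"),   # argument passing
    ("return:Base.get", "return:Derived.get"),       # method override
    ("local:Other.run.x", "field:Other.y"),          # unrelated assignment
]

affected = migration_set(edges, ["field:Cache.loader"])
```

In this sketch, migrating the type of `field:Cache.loader` also pulls in the constructor parameter and the local variable passed to it, while the unrelated elements stay untouched. Because each connected group of elements can be processed independently, a representation of this kind lends itself to the parallel, distributed processing the abstract describes.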