September 20, 2024
Haihao Lu, Research Scientist, Google Research, and Assistant Professor, MIT, and David Applegate, Principal Scientist, Google Research
This post describes the award winning product called PDLP, a new first-order method based solver for large-scale linear programming.
Classic linear programming (LP) problems are one of the most foundational problems in computer science and operations research. With extensive applications across numerous sectors of the global economy, such as manufacturing, networking, and other fields, LP has been the cornerstone of mathematical programming and has significantly influenced the development of today’s sophisticated modeling and algorithmic frameworks for data-driven decision making. If there's something to optimize, there's a good chance LP is involved.
Since the late 1940s, LP solving has evolved significantly, with the simplex method by Dantzig and various interior-point methods being the most prevalent techniques. Today's advanced commercial LP solvers utilize these methods but face challenges in scaling to very large instances due to computational demands. In response to this limitation, first-order methods (FOMs) have gained traction for large-scale LP problems.
With the above in mind, we introduce our solver PDLP (Primal-dual hybrid gradient enhanced for LP), a new FOM–based LP solver that significantly scales up our LP solving capabilities. Utilizing matrix-vector multiplication rather than matrix factorization, PDLP requires less memory and is more compatible with modern computational technologies like GPUs and distributed systems, offering a scalable alternative that mitigates the memory and computational inefficiencies of traditional LP methods. PDLP is open-sourced in Google’s OR-Tools. This project has been in development since 2018 [1, 2, 3], and we are proud to announce that it was co-awarded the prestigious Beale — Orchard-Hays Prize at the International Symposium of Mathematical Programming in July 2024. This accolade is one of the highest honors in the field of computational optimization, awarded every three years by the Mathematical Optimization Society.
Scaling the methods used in today’s state of the art LP solvers presents significant challenges. The primary computational limitations for both methods relate to matrix factorization required for solving linear equations, introducing two key challenges as problem sizes grow:
Given the above limitations associated with traditional LP methods, FOMs have emerged as a promising alternative for tackling large-scale LP problems. Unlike methods that rely on matrix factorization, FOMs utilize gradient information to iteratively update their solutions, with the primary computational requirement being matrix-vector multiplication. This distinction means that FOMs require only the storage of the LP instance itself, without needing additional memory to store factorized forms. Additionally, advances in FOMs for machine learning and deep learning have enhanced their scalability, making them highly efficient on modern computing platforms such as GPUs and distributed computing. This scalability and reduced memory dependency make FOMs particularly suitable for large and complex LP tasks where traditional methods may falter.
Primal-dual hybrid gradient (PDHG) is widely recognized for its application in image processing. When applied to LP, PDHG's primary computational demand involves matrix-vector multiplication, eliminating the need for matrix factorizations. This makes PDHG particularly efficient for large-scale computational tasks, but it is not reliable in solving LP. For example, in a benchmark of 383 instances, PDHG can only solve 113 instances to moderate accuracy.
To enhance PDHG’s reliability in solving LP problems, we have developed a modified approach called restarted PDHG. This version uses a two-loop structure where PDHG is run until a restarting condition is triggered, after which the average of the PDHG iterations is computed. The algorithm then restarts from this average point. This approach is visualized below where the trajectory of the standard PDHG is depicted with a blue line, the average iteration with a red line, and the restarted PDHG with a green line. Notably, the restarted PDHG shows a quicker convergence to the optimal solution, marked by a star on the plot.
The convergence behaviors of the PDHG and restarted PDHG, where the x-axis is the current solution of the LP, and the y-axis is the current solution of the dual LP.
The intuition behind this faster convergence is that by restarting from the computed average at the end of each spiral phase, the restarted PDHG effectively shortens the path to convergence. This strategy leverages the cyclical nature of the PDHG spirals to expedite the solution process.
We show in our research that this restarting technique can significantly speed up the convergence behaviors of PDHG for LP both in theory and in practice. This establishes restarted PDHG as a highly efficient and theoretically sound method for tackling LP challenges, reinforcing its utility and effectiveness in computational optimization.
We designed PDLP as a software package that can solve linear programming problems efficiently. The core algorithm of PDLP is based on the restarted PDHG, which we have enhanced significantly through five improvements:
PDLP is open-sourced as part of Google’s OR-Tools, an open-source software suite for optimization. The solver is easy to use and it has interfaces in Python, C++, Java and C#. More details and examples on how to use PDLP can be found in the OR-Tools documentation.
Scaling up and speeding up LP enables new applications — here, we briefly mention three:
Since its initial release, PDLP has attracted significant interest, leading to further enhancements. Here are some notable developments: cuPDLP.jl is an open-sourced GPU implementation of PDLP, written in Julia. The commercial solver company, Cardinal Optimizer, has incorporated PDLP into their software in Version 7.1 in January 2024. The open-source solver, HiGHS, has incorporated a version of PDLP in their software in V1.7.0 in March 2024. In addition, the academic community has continued to explore and expand upon the theoretical foundations of PDLP. Recent studies have focused on areas such as new analysis on PDHG, condition number theory, trajectory-based analysis, extensions to quadratic programming and semi-definite programming, etc. These efforts not only deepen the understanding of PDLP's underlying mechanics but also explore its potential applications to more complex problems. These developments reflect PDLP's significant impact on the field of optimization, bridging the gap between theoretical research and practical application. As PDLP continues to evolve, its influence is expected to grow, pushing the boundaries of what can be achieved in computational optimization.
We are grateful to our co-authors Mateo Diaz, Oliver Hinder, Miles Lubin, Brendan O’Donoghue, and Warren Schudy for their exceptional support and contributions. We would also like to thank our managers, Vahab Mirrokni, Jon Orwant and Aaron Archer, and our collaborators in the Data Center Networking team, the Algorithm and Optimization team and the Operations Research team.