Operations Research

Operations Research groups solve the toughest optimization problems both inside and outside Google.

About the team

Operations Research groups are involved in many areas throughout Google, running the gamut from fundamental research to enterprise-grade engineering. We are software engineers, research scientists, and data scientists who use integer programming, linear programming, constraint programming, and graph algorithms to solve problems at scale.

Across Google, Operations Research tackles challenges in areas as diverse as transportation, search, natural language understanding, machine vision, datacenter design, and robotics. With a strong commitment to open source, we're actively involved in helping solve problems outside Google as well, in areas such as aviation and health care.

Team focus summaries

Vehicle Routing

Given a fleet of vehicles and a set of tasks to be performed, what are good routes to take? Such problems crop up frequently in government, logistics, manufacturing, and retail.
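
As a rough, self-contained sketch (not a production setup), a tiny routing problem can be modeled with the open-source OR-Tools routing library; the distance matrix, single vehicle, and depot below are invented toy data:

```python
# Toy single-vehicle routing (TSP) sketch with OR-Tools; data is illustrative.
from ortools.constraint_solver import pywrapcp, routing_enums_pb2

distance_matrix = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
num_vehicles, depot = 1, 0

manager = pywrapcp.RoutingIndexManager(len(distance_matrix), num_vehicles, depot)
routing = pywrapcp.RoutingModel(manager)

def distance_callback(from_index, to_index):
    # Map internal routing indices back to node indices before the lookup.
    return distance_matrix[manager.IndexToNode(from_index)][manager.IndexToNode(to_index)]

transit = routing.RegisterTransitCallback(distance_callback)
routing.SetArcCostEvaluatorOfAllVehicles(transit)

params = pywrapcp.DefaultRoutingSearchParameters()
params.first_solution_strategy = (
    routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC)

solution = routing.SolveWithParameters(params)
if solution:
    index, route = routing.Start(0), []
    while not routing.IsEnd(index):
        route.append(manager.IndexToNode(index))
        index = solution.Value(routing.NextVar(index))
    route.append(manager.IndexToNode(index))
    print("route:", route, "cost:", solution.ObjectiveValue())
```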

Flow and Graph Algorithms

Networks are at the core of many engineering problems. How can you maximize the throughput of a computer network? How can you load-balance scarce resources? Algorithms like max flow and min-cost flow provide the starting points for systems that need to move items through a complex network.
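
As an illustrative starting point, the snippet below builds a small max-flow instance with OR-Tools; the nodes, arcs, and capacities are made up, and module paths can vary between OR-Tools releases:

```python
# Toy max-flow instance; node 0 is the source, node 3 the sink.
from ortools.graph.python import max_flow

smf = max_flow.SimpleMaxFlow()
for tail, head, capacity in [(0, 1, 20), (0, 2, 30), (1, 2, 40), (1, 3, 30), (2, 3, 20)]:
    smf.add_arc_with_capacity(tail, head, capacity)

status = smf.solve(0, 3)
if status == smf.OPTIMAL:
    print("max flow:", smf.optimal_flow())
```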

Linear Programming

Operations Research began with a seemingly simple question: how can you solve a large set of linear inequalities as efficiently as possible? Research continues to this day.
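
For example, a handful of linear inequalities can be handed to Glop, the linear programming solver shipped with OR-Tools; the constraints and objective below are arbitrary toy data, not drawn from any real workload:

```python
# Toy linear program solved with Glop, the LP solver bundled in OR-Tools.
from ortools.linear_solver import pywraplp

solver = pywraplp.Solver.CreateSolver("GLOP")
x = solver.NumVar(0, solver.infinity(), "x")
y = solver.NumVar(0, solver.infinity(), "y")

# A small system of linear inequalities (illustrative numbers).
solver.Add(x + 2 * y <= 14)
solver.Add(3 * x - y >= 0)
solver.Add(x - y <= 2)
solver.Maximize(3 * x + 4 * y)

if solver.Solve() == pywraplp.Solver.OPTIMAL:
    print("objective:", solver.Objective().Value())
    print("x =", x.solution_value(), "y =", y.solution_value())
```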

Mixed-Integer Programming

Many optimization problems involve discrete variables: people, places, and things. A mixed-integer programming problem generalizes linear programming to include discrete variables, with applications to supply chain management, scheduling, bin-packing problems, and much more.
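
A minimal sketch of such a model, using the OR-Tools linear-solver wrapper with one of its bundled MIP backends; the variables, bounds, and coefficients are invented for illustration:

```python
# Toy mixed-integer program: integer decision variables, linear constraints.
from ortools.linear_solver import pywraplp

solver = pywraplp.Solver.CreateSolver("SCIP")  # a MIP backend bundled with OR-Tools
x = solver.IntVar(0, 10, "x")  # e.g. units of product A (illustrative)
y = solver.IntVar(0, 10, "y")  # e.g. units of product B (illustrative)

solver.Add(2 * x + 7 * y <= 17)  # a shared-resource constraint
solver.Add(x + y <= 5)           # a capacity constraint
solver.Maximize(3 * x + 10 * y)

if solver.Solve() == pywraplp.Solver.OPTIMAL:
    print("objective:", solver.Objective().Value())
    print("x =", x.solution_value(), "y =", y.solution_value())
```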

Constraint Programming

It's often convenient to express optimization problems simply as a set of constraints, variables, and a function to be minimized or maximized. Constraint propagation, aided by heuristics and local search, is used to manage exponentially large search trees.
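
The CP-SAT solver in OR-Tools follows exactly this recipe: declare variables, post constraints, and hand the model to the solver. The sketch below uses made-up domains, a classic AllDifferent constraint, and a toy objective:

```python
# Toy constraint programming model with CP-SAT: variables, constraints, objective.
from ortools.sat.python import cp_model

model = cp_model.CpModel()
x = model.NewIntVar(0, 10, "x")
y = model.NewIntVar(0, 10, "y")
z = model.NewIntVar(0, 10, "z")

model.AddAllDifferent([x, y, z])      # a classic global constraint
model.Add(x + 2 * y + 3 * z <= 25)    # an illustrative linear constraint
model.Maximize(x + y + z)

solver = cp_model.CpSolver()
status = solver.Solve(model)
if status in (cp_model.OPTIMAL, cp_model.FEASIBLE):
    print("x, y, z =", solver.Value(x), solver.Value(y), solver.Value(z))
    print("objective:", solver.ObjectiveValue())
```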

Featured publications

Fast Routing in Very Large Public Transportation Networks Using Transfer Patterns
Hannah Bast
Erik Carlsson
Veselin Raychev
Algorithms - ESA 2010, 18th Annual European Symposium. Proceedings, Part I, Springer, pp. 290-301
We show how to route on very large public transportation networks (up to half a billion arcs) with average query times of a few milliseconds. We take into account many realistic features like: traffic days, walking between stations, queries between geographic locations instead of a source and a target station, and multi-criteria cost functions. Our algorithm is based on two key observations: (1) many shortest paths share the same transfer pattern, i.e., the sequence of stations where a change of vehicle occurs; (2) direct connections without change of vehicle can be looked up quickly. We precompute the respective data; in practice, this can be done in time linear in the network size, at the expense of a small fraction of non-optimal results. We have accelerated public transportation routing on Google Maps with a system based on our ideas. We report experimental results for three data sets of various kinds and sizes.
Overcommitment in Cloud Services – Bin Packing with Chance Constraints
Maxime Cohen
Phil Keller
ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems 2017
This paper considers the traditional problem of resource allocation, i.e., scheduling jobs on machines. One such recent application is cloud computing, where jobs arrive in an online fashion with capacity requirements and need to be allocated to physical machines. It is often observed that the requested capacities are not fully utilized, hence offering an opportunity to employ an overcommitment policy, i.e., selling resources beyond capacity. Setting the right overcommitment level can induce a significant cost reduction for the cloud provider, while taking a very low risk of violating capacity constraints. We introduce and study a model that quantifies the value of overcommitment by modeling the problem as bin packing with chance constraints. We then propose an alternative formulation that transforms each chance constraint into a submodular function. We show that our model captures the risk-pooling effect and can guide scheduling and overcommitment decisions. We also develop a family of online algorithms that are intuitive, easy to implement, and provide a constant-factor guarantee relative to the optimum. Finally, we calibrate our model using realistic workload data and test our approach in a practical setting. Our analysis and experiments illustrate the benefit of overcommitting in cloud services.
In this paper we introduce the semi-online model that generalizes the classical online computational model. The semi-online model postulates that the unknown future has a predictable part and an adversarial part; these parts can be arbitrarily interleaved. An algorithm in this model operates as in the standard online model, i.e., makes an irrevocable decision at each step. We consider bipartite matching in the semi-online model. Our main contributions are competitive algorithms for this problem and a near-matching hardness bound. The competitive ratio of the algorithms nicely interpolates between the truly offline setting (i.e., no adversarial part) and the truly online setting (i.e., no predictable part).
Optimal Content Placement for a Large-Scale VoD System
Vijay Gopalakrishnan
K.K. Ramakrishnan
IEEE/ACM Transactions on Networking, 24 (2016), pp. 2114-2127
IPTV service providers offering Video-on-Demand currently use servers at each metropolitan office to store all the videos in their library. With the rapid increase in library sizes, it will soon become infeasible to replicate the entire library at each office. We present an approach for intelligent content placement that scales to large library sizes (e.g., 100Ks of videos). We formulate the problem as a mixed integer program (MIP) that takes into account constraints such as disk space, link bandwidth, and content popularity. To overcome the challenges of scale, we employ a Lagrangian relaxation-based decomposition technique combined with integer rounding. Our technique finds a near-optimal solution (e.g., within 1-2%) with orders of magnitude speedup relative to solving even the LP relaxation via standard software. We also present simple strategies to address practical issues such as popularity estimation, content updates, short-term popularity fluctuation, and frequency of placement updates. Using traces from an operational system, we show that our approach significantly outperforms simpler placement strategies. For instance, our MIP-based solution can serve all requests using only half the link bandwidth used by LRU or LFU cache replacement policies. We also investigate the trade-off between disk space and network bandwidth.
Partitioning Orders in Online Shopping Services
Debmalya Panigrahi
Conf. on Information and Knowledge Management (CIKM) (2017)
The rapid growth of the sharing economy has led to the widespread use of newer and richer models of online shopping and delivery services. The race to deliver fast has transformed such services into complex networks of shoppers, stores, and consumers. Needless to say, the efficiency of store order management is critical to the business. Motivated by this setting, we consider the following problem: given a set of online shopping orders, each consisting of a few items, how do we best partition the orders among a given number of pickers? Owing to logistical constraints, the orders are typically unsplittable in the partition. This partitioning, which must take into account the physical location of the items in the store, has to optimize the utilization of, and the amount of work done by, the shoppers in the store. Formulating this as a combinatorial optimization problem, we propose a family of simple and efficient algorithms that admit the natural constraints arising in this setting. In addition to showing provable guarantees for the algorithms, we also demonstrate their efficiency in practice on real-world data from Google Express [1], outperforming natural baselines.
In this paper, we develop a new variant of $k$-means++ seeding that in expectation achieves a constant approximation guarantee. We obtain this result by a simple combination of $k$-means++ sampling with a local search strategy. We evaluate our algorithm empirically and show that it also improves the quality of a solution in practice.
Expect the Unexpected: Sub-Second Optimization for Segment Routing
Renaud Hartert
Stefano Vissicchio
Steven Gay
INFOCOM 2017
We study how to perform traffic engineering at extremely small time scales with segment routing, addressing a critical need for modern wide area networks. Prior work has shown that segment routing enables better traffic engineering at scale, thanks to its ability to create detours in forwarding paths. Two main approaches have been explored so far, based respectively on integer linear programming and constraint programming. However, none of the previous works deeply investigated how well the proposed algorithms allow network controllers to quickly react to unexpected traffic changes and failures. We highlight limitations of previous algorithms in terms of required execution time. We therefore propose a new approach, based on local search, that supports recomputation of segment-routing paths complying with load requirements. Our experiments show that our solution is able to respond to critical network situations in (sub-)seconds on large IP networks.
Capacity planning for the Google backbone network
Ajay Kumar Bangla
Ben Preskill
Christoph Albrecht
Emilie Danna
Xiaoxue Zhao
ISMP 2015 (International Symposium on Mathematical Programming) (to appear)
Google operates one of the largest backbone networks in the world. In this talk, we present optimization and simulation techniques we use to design the network topology and provision its capacity to achieve conflicting objectives such as scale, cost, availability, and latency.
Maximizing submodular functions under cardinality constraints lies at the core of numerous data mining and machine learning applications, including data diversification, data summarization, and coverage problems. In this work, we study this question in the context of data streams, where elements arrive one at a time, and we want to design low-memory and fast update-time algorithms that maintain a good solution. Specifically, we focus on the sliding window model, where we are asked to maintain a solution that considers only the last $W$ items. In this context, we provide the first non-trivial algorithm that maintains a provable approximation of the optimum using space sublinear in the size of the window. In particular, we give a $1/3 - \epsilon$ approximation algorithm that uses space polylogarithmic in the spread of the values of the elements and linear in the solution size $k$, for any constant $\epsilon > 0$. At the same time, processing each element only requires a polylogarithmic number of evaluations of the function itself. When a better approximation is desired, we show a different algorithm that, at the cost of using more memory, provides a $1/2 - \epsilon$ approximation and allows a tunable trade-off between average update time and space. This algorithm matches the best known approximation guarantees for submodular optimization in insertion-only streams, a less general formulation of the problem. We demonstrate the efficacy of the algorithms on a number of real-world datasets, showing that their practical performance far exceeds the theoretical bounds. The algorithms preserve high-quality solutions in streams with millions of items, while storing a negligible fraction of them.
Cache-aware load balancing of data center applications
Aaron Schild
Ray Yang
Richard Zhuang
Proceedings of the VLDB Endowment, 12 (2019), pp. 709-723
Our deployment of cache-aware load balancing in the Google web search backend cut cache misses roughly in half (~0.5x), contributing to a double-digit percentage increase in the throughput of our serving clusters by relieving a bottleneck. This innovation has benefited all production workloads since 2015, serving billions of queries daily. A load balancer forwards each query to one of several identical serving replicas. The replica pulls each term's postings list into RAM from flash, either locally or over the network. Flash bandwidth is a critical bottleneck, motivating an application-directed RAM cache on each replica. Sending the same term reliably to the same replica would increase the chance it hits the cache and avoid polluting the other replicas' caches. However, most queries contain multiple terms and we have to send the whole query to one replica, so it is not possible to achieve a perfect partitioning of terms to replicas. We solve this via a voting scheme, whereby the load balancer conducts a weighted vote by the terms in each query and sends the query to the winning replica. We develop a multi-stage scalable algorithm to learn these weights. We first construct a large-scale term-query graph from logs and apply a distributed balanced graph partitioning algorithm to cluster each term to a preferred replica. This yields a good but simplistic initial voting table, which we then iteratively refine via cache simulation to capture feedback effects.

Some of our people