# Aaron Archer

Aaron is a Research Scientist at Google NYC, where he leads the Large-Scale Optimization research team, which is part of the broader NYC Algorithms and Optimization team. He came to Google in 2013 after 9+ years in the Algorithms and Optimization department at the AT&T Shannon Research Laboratory. Aaron earned degrees in Operations Research (M.S. 2002, Ph.D. 2004) from Cornell University, where he was advised by Eva Tardos and supported by a Hertz Fellowship. Prior to that, he studied Mathematics (B.S. 1998) at Harvey Mudd College, where he graduated first in his class.
Aaron has pursued theoretical, experimental and applied research in diverse areas including mathematical programming (e.g., linear, integer, and convex programming), approximation algorithms, graph algorithms, algorithmic game theory, online algorithms, network design, load balancing, and unsupervised machine learning (e.g., clustering). He particularly enjoys finding effective ways to model complex real-world optimization problems. His primary mission at Google has been to apply these and other techniques to improve the efficiency of Google's computational infrastructure, such as the backend for websearch.
Aaron has also made small forays into graphics, machine vision, combinatorics, graph theory, computational ecology, and mathematical rap.

Authored Publications

Google Publications

Other Publications

Sort By

Practical Performance Guarantees for Pipelined DNN Inference

Kuikui Liu

Prakash Prabhu

Proceedings of the 41st International Conference on Machine Learning(2024) (to appear)

Preview abstract
This work optimizes pipeline parallelism of machine learning model inference by
partitioning computation graphs into $k$ stages and minimizing the running time of the bottleneck stage.
We design practical algorithms for this NP-complete problem
and prove they are nearly optimal in practice by comparing against lower bounds
obtained from solving novel mixed-integer programming (MIP) formulations.
We apply these algorithms and lower-bound techniques
to production models to achieve substantial improvements in the approximation guarantees,
compared to simple combinatorial lower bounds.
For example, our new MIP formulations improve the lower bounds enough to
drop the geometric mean approximation ratio from $2.175$ to $1.082$ across
production data with $k=16$ pipeline stages.
This work shows that while bottleneck partitioning is theoretically hard,
in practice we have a handle on the algorithmic side of the problem and
much of the remaining challenge is in developing more accurate cost models
to give to the partitioning algorithms.
View details

Cache-aware load balancing of data center applications

Aaron Schild

Ray Yang

Richard Zhuang

Proceedings of the VLDB Endowment, 12(2019), pp. 709-723

Preview abstract
Our deployment of cache-aware load balancing in the Google web search backend reduced cache misses by ~0.5x, contributing to a double-digit percentage increase in the throughput of our serving clusters by relieving a bottleneck. This innovation has benefited all production workloads since 2015, serving billions of queries daily.
A load balancer forwards each query to one of several identical serving replicas. The replica pulls each term's postings list into RAM from flash, either locally or over the network. Flash bandwidth is a critical bottleneck, motivating an application-directed RAM cache on each replica. Sending the same term reliably to the same replica would increase the chance it hits cache, and avoid polluting the other replicas' caches. However, most queries contain multiple terms and we have to send the whole query to one replica, so it is not possible to achieve a perfect partitioning of terms to replicas.
We solve this via a voting scheme, whereby the load balancer conducts a weighted vote by the terms in each query, and sends the query to the winning replica. We develop a multi-stage scalable algorithm to learn these weights. We first construct a large-scale term-query graph from logs and apply a distributed balanced graph partitioning algorithm to cluster each term to a preferred replica. This yields a good but simplistic initial voting table, which we then iteratively refine via cache simulation to capture feedback effects.
View details

Wireless coverage prediction via parametric shortest paths

David Applegate

David S. Johnson

Evdokia Nikolova

Mikkel Thorup

Proceedings of the Nineteenth ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc'18), ACM(2018), pp. 221-230

Preview abstract
When deciding where to place access points in a wireless network, it is useful to model the signal propagation loss between a proposed antenna location and the areas it may cover. The indoor dominant path (IDP) model, introduced by Wölfle et al., is shown in the literature to have good validation and generalization error, is faster to compute than competing methods, and is used in commercial software such as WinProp, iBwave Design, and CellTrace. Previously, the algorithms known for computing it involved a worst-case exponential-time tree search, with pruning heuristics to speed it up.
We prove that the IDP model can be reduced to a parametric shortest path computation on a graph derived from the walls in the floorplan. It therefore admits a quasipolynomial-time (i.e., nO(logn)) algorithm. We also give a practical approximation algorithm based on running a small constant number of shortest path computations. Its provable worst-case additive error (in dB) can be made arbitrarily small via appropriate choices of parameters, and is well below 1dB for reasonable choices. We evaluate our approximation algorithm empirically against the exact IDP model, and show that it consistently beats its theoretical worst-case bounds, solving the model exactly (i.e., no error) in the vast majority of cases.
View details

Optimal Content Placement for a Large-Scale VoD System

David Applegate

Vijay Gopalakrishnan

K.K. Ramakrishnan

IEEE/ACM Transactions on Networking, 24(2016), pp. 2114-2127

Preview abstract
IPTV service providers offering Video-on-Demand currently use servers at each metropolitan office to store all the videos in their library. With the rapid increase in library sizes, it will soon become infeasible to replicate the entire library at each office. We present an approach for intelligent content placement that scales to large library sizes (e.g., 100Ks of videos). We formulate the problem as a mixed integer program (MIP) that takes into account constraints such as disk space, link bandwidth, and content popularity. To overcome the challenges of scale, we employ a Lagrangian relaxation-based decomposition technique combined with integer rounding. Our technique finds a near-optimal solution (e.g., within 1-2%) with orders of magnitude speedup relative to solving even the LP relaxation via standard software. We also present simple strategies to address practical issues such as popularity estimation, content updates, short-term popularity fluctuation, and frequency of placement updates. Using traces from an operational system, we show that our approach significantly outperforms simpler placement strategies. For instance, our MIP-based solution can serve all requests using only half the link bandwidth used by LRU or LFU cache replacement policies. We also investigate the trade-off between disk space and network bandwidth.
View details

Content placement via the exponential potential function method

David Applegate

Vijay Gopalakrishnan

K.K. Ramakrishnan

Integer Programming and Combinatorial Optimization - 16th International Conference, IPCO 2013, Springer, pp. 49-61

Leveraging video viewing patterns for optimal content placement

Kyung-Wook Hwang

David Applegate

Vijay Gopalakrishnan

Vishal Misra

K. K. Ramakrishnan

Deborah F. Swayne

NETWORKING 2012 - 11th International IFIP TC 6 Networking Conference Proceedings, Part II, pp. 44-58

Combining predictors for recommending music: the False Positives' approach to KDD Cup track 2

Suhrid Balakrishnan

Rensheng Wang

Carlos Eduardo Scheidegger

Angus MacLellan

Yifan Hu

Shankar Krishnan

David Applegate

Guangqin Ma

S. Tom Au

JMLR Proceedings, 18(2012), pp. 199-213

Improved Approximation Algorithms for Prize-Collecting Steiner Tree and TSP

MohammadTaghi Hajiaghayi

Howard Karloff

SIAM Journal on Computing, 40(2)(2011), pp. 309-332

Voting power and target-based site prioritization

Steven J. Phillips

Robert L. Pressey

Desmond Torkornoo

David Applegate

David Johnson

Matthew E. Watts

Biological Conservation, 143(2010), pp. 1989-1997

Improved approximation algorithms for the minimum latency problem via prize-collecting strolls

Anna Blasiak

Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2010, SIAM, pp. 429-447

Importance sampling via load-balanced facility location

Shankar Krishnan

Integer Programming and Combinatorial Optimization, 13th International Conference, IPCO 2008, Springer, pp. 316-330

Characterizing truthful mechanisms with convex type spaces

A faster, better approximation algorithm for the minimum latency problem

Optimizing dispersal corridors for the Cape Proteaceae using network flow

Steven J. Phillips

Paul Williams

Guy Midgley

Ecological Applications, 18(2008), pp. 1200-1211

The 15 puzzle: How it drove the world crazy, by Jerry Slocum and Dic Sonneveld [book review]

Mathematical Intelligencer, 29(2007), pp. 83-85

Frugal path mechanisms

Approximation and collusion in multicast cost sharing

Joan Feigenbaum

Arvind Krishnamurthy

Rahul Sami

Scott Shenker

Games and Economic Behavior, 47(2004), pp. 36-71

Approximate classification via earthmover metrics

Jittat Fakcharoenphol

Robert Krauthgamer

Kunal Talwar

Eva Tardos

Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2004, SIAM, pp. 1079-1087

An approximate truthful mechanism for combinatorial auctions with single parameter agents

Christos H. Papadimitriou

Kunal Talwar

Eva Tardos

Internet Mathematics, 1(2003), pp. 129-150

Lagrangian relaxation for the k-median problem: new insights and continuity properties

Ranjithkumar Rajagopalan

David B. Shmoys

Algorithms - ESA 2003, 11th Annual European Symposium, Springer, pp. 31-42

Truthful mechanisms for one-parameter agents

Eva Tardos

42nd Annual Symposium on Foundations of Computer Science, FOCS 2001, pp. 482-491

Two O(log* k)-approximation algorithms for the asymmetric k-center problem

Integer Programming and Combinatorial Optimization, 8th International IPCO Conference(2001), pp. 1-14

March Mathness: an analysis of a nonstandard basketball pool

On the upper chromatic numbers of the reals

Discrete Mathematics, 214(2000), pp. 65-75

A modern treatment of the 15 puzzle

American Mathematical Monthly, 106(1999), pp. 793-799

A case for stricter grading