
Zhe Feng
I am a Research Scientist in the Market Algorithms group at Google Research, Mountain View. I got my PhD in Computer Science at Harvard University in 2021, where I was fortunately advised by David C. Parkes. My research broadly lies in the intersection between economics and computer science, and mainly focus on understanding how to design better mechanisms through machine learning approaches. See my personal website for more information.
Authored Publications
Sort By
Preview abstract
We consider the Coalition Structure Learning (CSL) problem in multi-agent systems, motivated by the existence of coalitions in many real-world systems, e.g., trading platforms and auction systems. In this problem, there is a hidden coalition structure within a set of $n$ agents, which affects the behavior of the agents in games. Our goal is to actively design a sequence of games
for the agents to play, such that observations in these games can be used to learn the hidden coalition structure. In particular, we consider the setting where in each round, we design and present a game together with a strategy profile to the agents, and receive a multiple-bit observation -- for each agent, we observe whether or not they would like to deviate from the specified strategy in this given game. Our contributions are three-fold: First, we show that we can learn the coalition structure in $O(\log n)$ rounds if we are allowed to choose any normal-form game in each round, matching the information-theoretical lower bound, and the result can be extended to congestion games. Second, in a more restricted setting where we can only choose a graphical game with degree limit $d$, we develop an algorithm to learn the coalition structure in $O(n/d+\log d)$ rounds. Third, when we can only learn the coalition structure through running second-price auctions with personalized reserve prices, we show that the coalition structure can be learned in $O(c\log n)$ rounds, where $c$ is the size of the largest coalition.
View details
Preview abstract
We consider the problem of auto-bidding in online advertising from the perspective of a single advertiser. The goal of the advertiser is to maximize their value under the Return-on-Spend (RoS) constraint, with performance measured in terms of \emph{regret} against the optimal offline solution that knows all queries a priori. Importantly, the value of the item is \textit{unknown} to the bidder ahead of time. The goal of the bidder is to quickly identify the optimal bid, while simultaneously satisfying budget and RoS constraints. Using a simple UCB-style algorithm, we provide the first result which achieves optimal regret and constraint violation for this problem.
View details
Preview abstract
We initiate the study of centralized algorithms for welfare-maximizing allocation of goods to buyers subject to average-value constraints. We show that this problem is NP-hard to approximate beyond a factor of e/(e-1), and provide a 4e/(e-1)-approximate offline algorithm. For the online setting, we show that no non-trivial approximations are achievable under adversarial arrivals. Under i.i.d. arrivals, we present a polytime online algorithm that provides a constant approximation of the optimal (computationally-unbounded) online algorithm. In contrast, we show that no constant approximation of the ex-post optimum is achievable by an online algorithm.
View details
Preview abstract
We study an auction setting in which bidders bid for placement of their content within
a summary generated by a large language model (LLM), e.g., an ad auction in which the
display is a summary paragraph of multiple ads. This generalizes the classic ad settings such
as position auctions to an LLM generated setting, which allows us to handle general display
formats. We propose a novel factorized framework in which an auction module and an LLM
module work together via a prediction model to provide welfare maximizing summary outputs
in an incentive compatible manner. We provide a theoretical analysis of this framework and
synthetic experiments to demonstrate the feasibility and validity of the system together with
welfare comparisons.
View details
Preview abstract
In this paper, we investigate a problem of \emph{actively} learning threshold in latent space, where the \emph{unknown} reward $g(\gamma, v)$ depends on the proposed threshold $\gamma$ and latent value $v$ and it can be \emph{only} achieved if the threshold is lower than or equal to the \emph{unknown} latent value. This problem has broad applications in practical scenarios, e.g., reserve price optimization in online auctions, online task assignments in crowdsourcing, setting recruiting bars in hiring, etc. We first characterize the query complexity of learning a threshold with the expected reward at most $\eps$ smaller than the optimum and prove that the number of queries needed can be infinitely large even when $g(\gamma, v)$ is monotone with respect to both $\gamma$ and $v$. On the positive side, we provide a tight query complexity $\Tilde{\Theta}(1/\eps^3)$ when $g$ is monotone and the CDF of value distribution is Lipschitz. Moreover, we show a tight $\Tilde{\Theta}(1/\eps^3)$ query complexity can be achieved as long as $g$ satisfies one-sided Lipschitzness, which provides a complete characterization for this problem. Finally, we extend this model to an online learning setting and demonstrate a tight $\Theta(T^{2/3})$ regret bound using continuous-arm bandit techniques and the aforementioned query complexity results.
View details
Auto-bidding and Auctions in Online Advertising: A Survey
Ashwinkumar Badanidiyuru Varadaraja
Christopher Liaw
Haihao (Sean) Lu
Andres Perlroth
Georgios Piliouras
Ariel Schvartzman
Kelly Spendlove
Hanrui Zhang
Mingfei Zhao
ACM SIGecom Exchanges, 22 (2024)
Preview abstract
In this survey, we summarize recent developments in research fueled by the growing adoption of automated bidding strategies in online advertising. We explore the challenges and opportunities that have arisen as markets embrace this autobidding and cover a range of topics in this area, including bidding algorithms, equilibrium analysis and efficiency of common auction formats, and optimal auction design.
View details
Preview abstract
We propose a new Markov Decision Process (MDP) model for ad auctions to capture the
user response to the quality of ads, with the objective of maximizing the long-term discounted
revenue. By incorporating user response, our model takes into consideration all three parties
involved in the auction (advertiser, auctioneer, and user). The state of the user is modeled as a
user-specific click-through rate (CTR) with the CTR changing in the next round according to the
set of ads shown to the user in the current round. We characterize the optimal mechanism for this MDP as a Myerson’s auction with a notion of modified virtual value, which relies on the value distribution of the advertiser, the current user state, and the future impact of showing the ad to the user. Leveraging this characterization, we design a sample-efficient and computationally-efficient algorithm which outputs an approximately optimal policy that requires only sample access to the true MDP and the value distributions of the bidders. Finally, we propose a simple mechanism built upon second price auctions with personalized reserve prices and show it can achieve a constant-factor approximation to the optimal long term discounted revenue.
View details
A Field Guide for Pacing Budget and ROS Constraints
Haihao (Sean) Lu
ICML (2024)
Preview abstract
Budget pacing has been a standard service offered by major Internet advertising platforms for quite some time now. Budget pacing systems seek to optimize advertiser returns subject to budget constraints, through smooth spending of advertiser budgets. In the past few years, autobidding products that provide value-optimizing real-time bidding subject to return-on-spend (ROS) constraints as a service to advertisers have seen a prominent rise in adoption. The algorithms that govern these two services, namely bidding and budgeting, are not necessarily always a single unified entity that optimizes a global objective. But should these algorithms jointly optimize? How do the separate and joint optimizations compare? Systematically answering these questions, with both theoretical analysis and empirical studies is the focus of this work.
We compare (a) the sequential algorithm that first constructs the advertiser's ROS-pacing bid and then lowers that bid for budget pacing, with (b) the optimal joint algorithm that optimizes advertiser returns subject to both budget and ROS constraints. We establish the superiority of joint optimization both theoretically as well as empirically based on data from a large advertising platform. In the process, we identify a third algorithm that retains the theoretical properties of the joint optimization algorithm, while performing almost as well empirically as the joint optimization algorithm. This algorithm eases the transition from a sequential to a fully joint implementation by minimizing the amount of interaction between the two services.
View details
Preview abstract
We study the design of loss functions for click-through rates (CTR) to optimize (social) welfare in ad auctions. Existing works either only focus on CTR predictions without consideration of business objectives (e.g. welfare) in auctions or assume that the distribution over the participants' expected cost-per-impression (eCPM) is known a priori and uses various additional assumptions on the parametric form of the distribution to derive loss functions for predicting CTRs. In this work, we bring back the welfare objectives of ad auctions into CTR predictions and propose a novel weighted rankloss to train CTR model. Compared with the previous literature, our approach provides a provable guarantee on welfare but without any assumptions on the eCPMs' distribution, while also avoiding the intractability of naively applying existing learning-to-rank methods. We further propose a theoretically justifiable technique for calibrating the losses using labels generated from a teacher network, only assuming that the teacher network has bounded $\ell_2$ generalization error. Finally, we demonstrate the practical advantages of the proposed loss on both synthetic and real-world data.
View details
Learning to Bid in Contextual First Price Auctions
Ashwinkumar Badanidiyuru Varadaraja
Guru Prashanth Guruganesh
The Proceedings of the ACM Web Conference 2023
Preview abstract
In this paper, we investigate the problem about how to bid in repeated contextual first price
auctions. We consider a single bidder (learner) who repeatedly bids in the first price auctions:
at each time t, the learner observes a context xt ∈ R
d and decides the bid based on historical
information and xt. We assume a structured linear model of the maximum bid of all the others
mt = α0 · xt + zt, where α0 ∈ R
d
is unknown to the learner and zt is randomly sampled from a
noise distribution F with log-concave density function f. We consider both binary feedback (the
learner can only observe whether she wins or not) and full information feedback (the learner can
observe mt) at the end of each time t. For binary feedback, when the noise distribution F is
known, we propose a bidding algorithm, by using maximum likelihood estimation (MLE) method
to achieve at most Oe(
p
log(d)T ) regret. Moreover, we generalize this algorithm to the setting
with binary feedback and the noise distribution is unknown but belongs to a parametrized family
of distributions. For the full information feedback with unknown noise distribution, we provide
an algorithm that achieves regret at most Oe(
√
dT). Our approach combines an estimator for
log-concave density functions and then MLE method to learn the noise distribution F and linear
weight α0 simultaneously. We also provide a lower bound result such that any bidding policy
in a broad class must achieve regret at least Ω(√
T), even when the learner receives the full
information feedback and F is known.
View details