Ashwinkumar Badanidiyuru Varadaraja
Authored Publications
Learning to Bid in Contextual First Price Auctions
Guru Prashanth Guruganesh
The Proceedings of the ACM Web Conference 2023
In this paper, we investigate the problem of how to bid in repeated contextual first price auctions. We consider a single bidder (learner) who repeatedly bids in first price auctions: at each time t, the learner observes a context x_t ∈ R^d and decides the bid based on historical information and x_t. We assume a structured linear model of the maximum bid of all others, m_t = α_0 · x_t + z_t, where α_0 ∈ R^d is unknown to the learner and z_t is randomly sampled from a noise distribution F with log-concave density function f. We consider both binary feedback (the learner can only observe whether she wins or not) and full information feedback (the learner can observe m_t) at the end of each time t. For binary feedback, when the noise distribution F is known, we propose a bidding algorithm that uses the maximum likelihood estimation (MLE) method to achieve regret at most Õ(√(log(d)·T)). Moreover, we generalize this algorithm to the setting with binary feedback where the noise distribution is unknown but belongs to a parametrized family of distributions. For full information feedback with unknown noise distribution, we provide an algorithm that achieves regret at most Õ(√(dT)). Our approach combines an estimator for log-concave density functions with the MLE method to learn the noise distribution F and the linear weight α_0 simultaneously. We also provide a lower bound showing that any bidding policy in a broad class must incur regret at least Ω(√T), even when the learner receives full information feedback and F is known.
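The binary-feedback setting above can be illustrated with a small MLE sketch. This is not the paper's algorithm: purely for illustration, it assumes the known noise F is the logistic distribution (which is log-concave), generates synthetic bids, and fits α by gradient ascent on the win/lose log-likelihood P(win | x, b) = F(b − α · x). The function names (`fit_alpha_mle`, `sigmoid`) are hypothetical.

```python
import math
import random

def sigmoid(u):
    # Logistic CDF, standing in for a known log-concave noise distribution F.
    return 1.0 / (1.0 + math.exp(-u))

def fit_alpha_mle(data, d, steps=250, lr=2.0):
    """Gradient ascent on the binary-feedback log-likelihood.

    data: list of (x, b, won) where x is a d-dim context, b the bid, and
    won = 1 iff b >= alpha . x + z with z ~ logistic noise, so that
    P(win | x, b) = sigmoid(b - alpha . x).
    """
    alpha = [0.0] * d
    n = len(data)
    for _ in range(steps):
        grad = [0.0] * d
        for x, b, won in data:
            p = sigmoid(b - sum(a * xi for a, xi in zip(alpha, x)))
            for i in range(d):
                # d/d alpha_i of the log-likelihood is (p - won) * x_i.
                grad[i] += (p - won) * x[i]
        for i in range(d):
            alpha[i] += lr * grad[i] / n  # ascent step on the mean gradient
    return alpha

# Synthetic bidding data with a known ground-truth alpha.
random.seed(0)
true_alpha = [0.5, -0.3]
data = []
for _ in range(4000):
    x = [random.random(), random.random()]
    u = random.random()
    z = math.log(u / (1.0 - u))  # logistic noise sample
    m = sum(a * xi for a, xi in zip(true_alpha, x)) + z  # others' max bid
    b = random.uniform(-2.0, 2.5)  # exploratory bid
    data.append((x, b, 1 if b >= m else 0))

alpha_hat = fit_alpha_mle(data, d=2)
```

Because the log-likelihood is concave under a log-concave noise density, plain gradient ascent suffices here; the paper's contribution lies in turning such estimates into a low-regret bidding policy.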
Leveraging Bias-Variance Trade-offs for Regression with Label Differential Privacy
Avinash Varadarajan
Chiyuan Zhang
Ethan Leeman
Pritish Kamath
NeurIPS 2023 (2023)
We propose a new family of label randomization mechanisms for the task of training regression models under the constraint of label differential privacy (DP). In particular, we leverage the trade-offs between bias and variance to construct better noising mechanisms depending on a privately estimated prior distribution over the labels. We demonstrate that these mechanisms achieve state-of-the-art privacy-accuracy trade-offs on several datasets, highlighting the importance of bias-reducing constraints when training neural networks with label DP. We also provide theoretical results shedding light on the structural properties of the optimal bias-reduced mechanisms.
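The paper's mechanisms depend on a privately estimated label prior; as a baseline for comparison, the classic Laplace mechanism already gives ε-label-DP for regression labels, at the cost of high variance. The sketch below (hypothetical function names, not the paper's mechanism) clips labels to a known range and adds Laplace noise with scale (hi − lo)/ε, which is unbiased for in-range labels.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. Exponential(1/scale) draws is
    # Laplace(0, scale)-distributed.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def privatize_labels(labels, lo, hi, epsilon, rng):
    """epsilon-label-DP via the Laplace mechanism: clip each label to
    [lo, hi] (sensitivity hi - lo), then add Laplace((hi - lo)/epsilon)
    noise. Unbiased for in-range labels; variance 2*((hi - lo)/epsilon)**2.
    """
    scale = (hi - lo) / epsilon
    return [min(max(y, lo), hi) + laplace_noise(scale, rng) for y in labels]

rng = random.Random(0)
labels = [1.0] * 10000  # toy regression labels in [0, 2]
priv = privatize_labels(labels, lo=0.0, hi=2.0, epsilon=1.0, rng=rng)
mean_priv = sum(priv) / len(priv)  # close to the true mean of 1.0
```

The bias-variance point of the paper is visible here: this mechanism is unbiased but very noisy per label, whereas a prior-aware mechanism can trade a small, correctable bias for much lower variance.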
Follow-ups Matter: Improving Contextual Bandits via Post-serving Features
Chaoqi Wang
Ziyu Ye
Haifeng Xu
NeurIPS 2023 (Spotlight) (2023)
In the realm of contextual bandit algorithms, the standard framework involves observing a context, selecting an arm, and then observing the reward. This approach, while functional, often falls short when dealing with the complexity of real-world scenarios where additional valuable contexts are revealed after arm-selection. We introduce a new algorithm, pLinUCB, designed to incorporate these post-serving contexts effectively, thereby achieving sublinear regret. Our key technical contribution is a robustified and generalized version of the well-known Elliptical Potential Lemma (EPL), which allows us to handle the noise in the context vectors, a crucial aspect in a practical setting. This generalized EPL is of independent interest as it has potential applications beyond the scope of this work. Through extensive empirical tests on both synthetic and real-world datasets, we demonstrate that our proposed algorithm outperforms the state-of-the-art, thus establishing its practical potential. Our work underscores the importance of post-serving contexts in the contextual bandit setting and lays the groundwork for further research in this field.
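For readers unfamiliar with the baseline the abstract builds on, here is a minimal sketch of standard LinUCB (not the paper's pLinUCB, which additionally incorporates post-serving contexts). It maintains the ridge inverse A⁻¹ with Sherman-Morrison rank-one updates, the same machinery the Elliptical Potential Lemma reasons about. Function names are hypothetical.

```python
import math
import random

def mat_vec(M, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]

def sm_update(Ainv, x):
    """Sherman-Morrison rank-one update: (A + x x^T)^{-1} from A^{-1}."""
    Ax = mat_vec(Ainv, x)
    denom = 1.0 + sum(xi * ai for xi, ai in zip(x, Ax))
    d = len(x)
    return [[Ainv[i][j] - Ax[i] * Ax[j] / denom for j in range(d)]
            for i in range(d)]

def linucb(arms_fn, reward_fn, T, d, beta=0.5, lam=1.0):
    """Minimal LinUCB: play the arm maximizing theta.x + beta*||x||_{A^-1}."""
    Ainv = [[1.0 / lam if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d
    for t in range(T):
        theta = mat_vec(Ainv, b)  # current ridge estimate of the weights
        def ucb(x):
            Ax = mat_vec(Ainv, x)
            width = math.sqrt(max(sum(xi * ai for xi, ai in zip(x, Ax)), 0.0))
            return sum(ti * xi for ti, xi in zip(theta, x)) + beta * width
        x = max(arms_fn(t), key=ucb)  # optimistic arm selection
        r = reward_fn(x)
        Ainv = sm_update(Ainv, x)  # A <- A + x x^T, maintained in inverse form
        b = [bi + r * xi for bi, xi in zip(b, x)]
    return mat_vec(Ainv, b)

# Toy linear bandit: 5 random arms per round, noisy linear rewards.
rng = random.Random(1)
true_theta = [0.8, 0.2]
arms_fn = lambda t: [[rng.random(), rng.random()] for _ in range(5)]
reward_fn = lambda x: (sum(ti * xi for ti, xi in zip(true_theta, x))
                       + rng.gauss(0.0, 0.1))
theta_hat = linucb(arms_fn, reward_fn, T=2000, d=2)
```

The paper's robustified EPL is what lets this style of analysis survive noisy context vectors; the sketch above assumes exact contexts.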
Incrementality Bidding via Reinforcement Learning under Mixed and Delayed Rewards
Tianxi Li
Haifeng Xu
Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS 2022) (2022)
Incrementality, which is used to measure the causal effect of showing an ad to a potential customer (e.g. a user on an internet platform) versus not, is a central problem for advertisers in advertising systems. In this paper, we investigate the problem of how the advertiser can decide the bid by learning conversion incrementality in an online manner. We formulate this problem as an episodic Markov Decision Process (MDP) and propose a novel reinforcement learning (RL) algorithm that achieves regret at most $O(H^2\sqrt{T})$, which doesn't rely on the number of actions (bids), where $H$ is the number of rounds in each episode and $T$ is the total number of episodes in the MDP. In sharp contrast with the standard RL framework, conversion incrementality feedback is delayed and mixed. To handle this difficulty we propose a novel pairwise moment-matching algorithm to learn conversion incrementality, which is of independent interest.
In performance-based digital advertising, one of the key technical tools is predicting the expected value of post-ad-click purchases (a.k.a. conversions). Some salient aspects of this problem, such as having a non-binary label and advertisers reporting the label on different scales, make it a much harder problem than predicting the probability of a click. In this paper we ask what a good way is to model the label and extract as much information from it as possible. A label can affect the model in multiple ways; we consider three such directions and develop new techniques for each of them. The first is that the label scale can affect how model capacity is devoted to different advertisers, the second is how labels for outliers can affect over-fitting, and the third is whether we can use information in the distribution of the label and not just its mean.
Classic mechanism design often assumes that a bidder's action is restricted to reporting a type or a signal, possibly untruthfully. In today's digital economy, bidders hold increasing amounts of private information about the auctioned items, and due to legal or ethical concerns, they may demand to reveal partial but truthful information, as opposed to reporting untrue signals or misinformation. To accommodate such bidder behaviors in auction design, we propose and study a novel mechanism design setup where each bidder holds two kinds of information: (1) a private value type, which can be misreported; and (2) a private information variable, which the bidder may want to conceal or partially reveal, but importantly, not misreport. We refer to bidders with such behaviors as strategically reticent bidders. One direct motivation for our model is the ad auction, in which many ad platforms today elicit from each bidder not only their private value per conversion but also their private information about Internet users (e.g., their conversion history) in order to improve the platform's estimation of all bidders' conversion rates. We show that in this new setup, it is still possible to design mechanisms that are both Incentive and Information Compatible (IIC). We develop two different black-box transformations, based on either outcome of expectation or expectation of outcome, which convert any mechanism M for classic bidders into a mechanism M′ for strategically reticent bidders. We identify properties of the original mechanism M under which the transformation leads to IIC mechanisms M′. Interestingly, as corollaries of these results, we show that running VCG with expected bidder values maximizes welfare, whereas the mechanism using the expected outcome of Myerson's auction maximizes revenue. Finally, we study how regulation of the auctioneer's usage of information may lead to more robust mechanisms.
Handling many conversions per click in modeling delayed feedback
Andrew Evdokimov
Vinodh Krishnan
Pan Li
Wynn Vonnegut
Jayden Wang
ADKDD 2021 (2021)
Predicting the expected number of post-click conversions (purchases or other events) is a key task in performance-based digital advertising. In training a conversion optimizer model, one of the most crucial aspects is handling the delayed feedback with respect to conversions, which can happen multiple times with various delays. This task is difficult, as the delay distribution is different for each advertiser, is long-tailed, often does not follow any particular class of parametric distributions, and can change over time. We tackle these challenges using an unbiased estimation model with the following three ideas. The first idea is to split the label as a sum of labels with different delay buckets, each of which trains only on mature labels, the second is to use thermometer encoding to increase accuracy and reduce inference cost, and the third is to use auxiliary information to increase the stability of the model and to handle drift in distribution.
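The thermometer encoding mentioned in the second idea is simple to sketch: given ordered thresholds (e.g. delay-bucket boundaries in days, chosen here purely for illustration), a value is encoded as one indicator per threshold, set to 1 for every threshold the value has reached. Unlike one-hot encoding, nearby values share most of their active bits.

```python
def thermometer_encode(value, thresholds):
    """Thermometer (cumulative) encoding: one indicator per threshold,
    set to 1 for every threshold the value has reached.
    thresholds must be sorted in increasing order."""
    return [1 if value >= t else 0 for t in thresholds]

# A delay of 3.5 days against illustrative bucket boundaries of
# 1, 2, 4, and 8 days: the first two buckets have matured.
encoding = thermometer_encode(3.5, [1, 2, 4, 8])  # [1, 1, 0, 0]
```

At inference time a model can consume such encodings with a single shared embedding per threshold, which is one way an encoding like this can reduce cost relative to finer-grained representations.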
Online Learning via Offline Greedy: Applications in Market Design and Optimization
Rad Niazadeh
Negin Golrezaei
Joshua Wang
Fransisca Susan
EC 2021, Management Science Journal (2020)
Motivated by online decision-making in time-varying combinatorial environments, we study the problem of transforming offline algorithms to their online counterparts. We focus on offline combinatorial problems that are amenable to a constant factor approximation using a greedy algorithm that is robust to local errors. For such problems, we provide a general framework that efficiently transforms offline robust greedy algorithms to online ones using Blackwell approachability. We show that the resulting online algorithms have O(T^{1/2}) (approximate) regret under the full information setting. We further introduce a bandit extension of Blackwell approachability that we call Bandit Blackwell approachability. We leverage this notion to transform robust offline greedy algorithms into online algorithms with O(T^{2/3}) (approximate) regret in the bandit setting. Demonstrating the flexibility of our framework, we apply our offline-to-online transformation to several problems at the intersection of revenue management, market design, and online optimization, including product ranking optimization in online platforms, reserve price optimization in auctions, and submodular maximization. We show that our transformation, when applied to these applications, leads to new regret bounds or improves the current known bounds.
In this paper, we introduce a novel technique for constrained submodular maximization, inspired by barrier functions in continuous optimization. This connection not only improves the running time for constrained submodular maximization but also provides the state-of-the-art guarantee. More precisely, for maximizing a monotone submodular function subject to the combination of a k-matchoid and l-knapsack constraints (for l ≤ k), we propose a potential function that can be approximately minimized. Once we minimize the potential function up to an ε error, it is guaranteed that we have found a feasible set with a 2(k+1+ε)-approximation factor, which can indeed be further improved to (k+1+ε) by an enumeration technique. We extensively evaluate the performance of our proposed algorithm over several real-world applications, including a movie recommendation system, summarization tasks for YouTube videos, Twitter feeds and Yelp business locations, and a set cover problem.
Response Prediction for Low-Regret Agents
Sadra Yazdanbod
Web and Internet Economics 2019
Companies like Google and Microsoft run billions of auctions every day to sell advertising opportunities. Any change to the rules of these auctions can have a tremendous effect on the revenue of the company and the welfare of the advertisers and the users. Therefore, any change requires careful evaluation of its potential impacts. Currently, such impacts are often evaluated by running simulations or small controlled experiments. This, however, misses the important factor that the advertisers respond to changes. Our goal is to build a theoretical framework for predicting the actions of an agent (the advertiser) that is optimizing her actions in an uncertain environment. We model this problem using a variant of the multi-armed bandit setting where playing an arm is costly. The cost of each arm changes over time and is publicly observable. The value of playing an arm is drawn stochastically from a static distribution and is observed by the agent and not by us. We, however, observe the actions of the agent. Our main result is that assuming the agent is playing a strategy with a regret of at most f(T) within the first T rounds, we can learn to play the multi-armed bandits game (without observing the rewards) in such a way that the regret of our selected actions is at most O(k^4 (f(T) + 1) log(T)).
Autobidding is becoming increasingly important in the domain of online advertising, and has become a critical tool used by many advertisers for optimizing their ad campaigns. We formulate fundamental questions around the problem of bidding for performance under very general affine cost constraints. We design optimal single-agent bidding strategies for the general bidding problem in multi-slot truthful auctions. We show that there is an intimate connection between bidding and auction design, in that the bidding formula is optimal if and only if the underlying auction is truthful. We show how an MWU algorithm can be used to learn this optimal bidding formula.
Next, we move from the single-agent view to taking a full-system view: What happens when all advertisers adopt optimal autobidding? We prove that in general settings, there exists an equilibrium between the bidding agents for all the advertisers. Further, we prove a Price of Anarchy result: For the general affine constraints, the total value (conversions) obtained by the advertisers in the bidding agent equilibrium is no less than 1/2 of what we could generate via a centralized ad allocation scheme, one which does not consider any auction incentives or provide any per-advertiser guarantee.
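To make the single-agent problem concrete, here is a textbook dual-pacing sketch, not the paper's algorithm or its general affine constraints: a single budget-constrained bidder in a truthful (second-price) auction bids a scaled value v/(1+μ), and the dual multiplier μ is updated by subgradient steps on budget violation. All distributions and parameters below are illustrative assumptions.

```python
import random

def paced_bidding(T, rho, eta=0.05, seed=0):
    """Dual pacing for a budget-constrained bidder in a second-price auction.

    Bid v/(1 + mu) each round; raise mu when spend exceeds the per-round
    budget rho, lower it (down to 0) otherwise.
    """
    rng = random.Random(seed)
    mu, spend = 0.0, 0.0
    for t in range(T):
        v = rng.random()          # bidder's value for this query (toy model)
        p = rng.random()          # highest competing bid (toy model)
        bid = v / (1.0 + mu)      # paced bid
        cost = p if bid >= p else 0.0   # second-price payment on a win
        spend += cost
        # Dual subgradient step on the budget constraint, projected to mu >= 0.
        mu = max(0.0, mu + eta * (cost - rho))
    return spend, mu

spend, mu = paced_bidding(T=5000, rho=0.1)
# Average spend per round hovers near the budget rho once mu stabilizes.
```

In this toy model the unconstrained spend rate exceeds ρ, so the multiplier settles at a positive value that throttles bids until average spend matches the budget, which is the value-maximizing behavior in a truthful auction.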
Modern ad auctions allow advertisers to target more specific segments of the user population. Unfortunately, this is not always in the best interest of the ad platform: partially hiding some information could be more beneficial for the platform's revenue. In this paper, we examine the following basic question in the context of second-price ad auctions: how should an ad platform optimally reveal information about the ad opportunity to the advertisers in order to maximize revenue? We consider a model in which bidders' valuations depend on a random state of the ad opportunity. Differently from previous work, we focus on a more practical, and challenging, situation where the space of possible realizations of ad opportunities is extremely large. We thus focus on developing algorithms whose running time is polynomial in the number of bidders but independent of the number of ad opportunity realizations.
We assume that the auctioneer can commit to a signaling scheme that reveals noisy information about the realized state of the ad opportunity, and examine the auctioneer's algorithmic question of designing the optimal signaling scheme. We first consider the case where the auctioneer is restricted to sending a public signal to all bidders. As a warm-up, we start with a basic (though less realistic) setting in which the auctioneer knows the bidders' valuations, and show that an ε-optimal scheme can be implemented in time polynomial in the number of bidders and 1/ε. We then move to a well-motivated Bayesian valuation setting in which the auctioneer and bidders both have private information, and present two results. First, we exhibit a characterization result regarding approximately optimal schemes and prove that any constant-approximate public signaling scheme must use exponentially many signals. Second, we present a "simple" public signaling scheme that serves as a constant approximation under mild assumptions.
Finally, we initiate an exploration of the power of sending different signals privately to different bidders. In the basic setting where the auctioneer knows bidders' valuations, we exhibit a polynomial-time private scheme that extracts almost full surplus even in the worst Bayes Nash equilibrium. This illustrates the surprising power of private signaling schemes in extracting revenue.
Locally adaptive optimization: adaptive seeding for monotone submodular functions
Christos Papadimitriou
Aviad Rubinstein
Lior Seeman
Yaron Singer
SODA (2016)
The Adaptive Seeding problem is an algorithmic challenge motivated by influence maximization in social networks: One seeks to select among certain accessible nodes in a network, and then select, adaptively, among neighbors of those nodes as they become accessible in order to maximize a global objective function. More generally, adaptive seeding is a stochastic optimization framework where the choices in the first stage affect the realizations in the second stage, over which we aim to optimize.
Our main result is a (1 - 1/e)^2-approximation for the adaptive seeding problem for any monotone submodular function. While adaptive policies are often approximated via non-adaptive policies, our algorithm is based on a novel method we call locally-adaptive policies. These policies combine a non-adaptive global structure, with local adaptive optimizations. This method enables the (1 - 1/e)^2-approximation for general monotone submodular functions and circumvents some of the impossibilities associated with non-adaptive policies.
We also introduce a fundamental problem in submodular optimization that may be of independent interest: given a ground set of elements where every element appears with some small probability, find a set of expected size at most k that has the highest expected value over the realization of the elements. We show a surprising result: there are classes of monotone submodular functions (including coverage) that can be approximated almost optimally as the probability vanishes. For general monotone submodular functions we show via a reduction from Planted-Clique that approximations for this problem are not likely to be obtainable. This optimization problem is an important tool for adaptive seeding via non-adaptive policies, and its hardness motivates the introduction of locally-adaptive policies we use in the main result.
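The (1 − 1/e) factor that the adaptive seeding result squares comes from the classic greedy algorithm for monotone submodular maximization under a cardinality constraint. As background (this is the standard greedy, not the paper's locally-adaptive policy), here it is for the max-coverage special case:

```python
def greedy_max_coverage(sets, k):
    """Classic greedy for max coverage (a monotone submodular objective
    under a cardinality-k constraint): repeatedly pick the set with the
    largest marginal coverage. Achieves a (1 - 1/e) approximation."""
    covered, chosen = set(), []
    for _ in range(k):
        best = max(sets, key=lambda name: len(sets[name] - covered))
        if not sets[best] - covered:
            break  # no remaining set adds anything new
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

# Toy instance: four candidate sets, budget k = 2.
sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}
chosen, covered = greedy_max_coverage(sets, k=2)  # picks "c" then "a"
```

In the adaptive seeding setting, the two-stage structure forces one (1 − 1/e)-type loss per stage, which is where the squared factor in the main result comes from.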
Can we summarize multi-category data based on user preferences in a scalable manner? Many utility functions used for data summarization satisfy submodularity, a natural diminishing returns property. We cast personalized data summarization as an instance of a general submodular maximization problem subject to multiple constraints. We develop the first practical and FAst coNsTrained submOdular Maximization algorithm, FANTOM, with strong theoretical guarantees. FANTOM maximizes a submodular function (not necessarily monotone) subject to the intersection of a p-system and l knapsack constraints. It achieves a (1+ε)(p+1)(2p+2l+1)/p approximation guarantee with only O(nrp log(n)/ε) query complexity (n and r indicate the size of the ground set and the size of the largest feasible solution, respectively). We then show how we can use FANTOM for personalized data summarization. In particular, a p-system can model different aspects of data, such as categories or time stamps, from which the users choose. In addition, knapsacks encode users' constraints, including budget or time. In our set of experiments, we consider several concrete applications: movie recommendation over 11K movies, personalized image summarization with 10K images, and revenue maximization on the YouTube social network with 5000 communities. We observe that FANTOM consistently provides the highest utility against all the baselines.
How can one find a subset, ideally as small as possible, that well represents a massive dataset? That is, its corresponding utility, measured according to a suitable utility function, should be comparable to that of the whole dataset. In this paper, we formalize this challenge as a submodular cover problem. Here, the utility is assumed to exhibit submodularity, a natural diminishing returns condition prevalent in many data summarization applications. The classical greedy algorithm is known to provide solutions with logarithmic approximation guarantees compared to the optimum solution. However, this sequential, centralized approach is impractical for truly large-scale problems. In this work, we develop the first distributed algorithm, DISCOVER, for submodular set cover that is easily implementable using MapReduce-style computations. We theoretically analyze our approach and present approximation guarantees for the solutions returned by DISCOVER. We also study a natural trade-off between the communication cost and the number of rounds required to obtain such a solution. In our extensive experiments, we demonstrate the effectiveness of our approach on several applications, including active set selection, exemplar-based clustering, and vertex cover on tens of millions of data points using Spark.
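The "classical greedy algorithm" with the logarithmic guarantee that DISCOVER distributes is the standard sequential greedy for set cover; as background, here is that centralized baseline on the set-cover special case (DISCOVER itself is the distributed, MapReduce-style version and is not shown):

```python
def greedy_set_cover(universe, sets):
    """Classic sequential greedy for set cover: repeatedly pick the set
    covering the most still-uncovered elements until the universe is
    covered. Gives a logarithmic-factor approximation."""
    uncovered = set(universe)
    solution = []
    while uncovered:
        best = max(sets, key=lambda name: len(sets[name] & uncovered))
        if not sets[best] & uncovered:
            raise ValueError("universe not coverable by the given sets")
        solution.append(best)
        uncovered -= sets[best]
    return solution

# Toy instance: cover {1..6} from four candidate sets.
universe = {1, 2, 3, 4, 5, 6}
sets = {"s1": {1, 2, 3}, "s2": {3, 4}, "s3": {4, 5, 6}, "s4": {1, 6}}
solution = greedy_set_cover(universe, sets)  # picks "s1" then "s3"
```

The sequential dependence between iterations (each pick changes every remaining marginal gain) is exactly what makes this algorithm hard to parallelize, and is the difficulty DISCOVER is designed around.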
Resourceful contextual bandits
Streaming Submodular Maximization: Massive Data Summarization on the Fly
Fast algorithms for maximizing submodular functions
Robust Multi-objective Learning with Mentor Feedback
Alekh Agarwal
Miroslav Dudik
Robert E. Schapire
Aleksandrs Slivkins
COLT 2014
Bandits with Knapsacks
Robert Kleinberg
Aleksandrs Slivkins
IEEE Symposium on Foundations of Computer Science (FOCS) (2013)
Approximating Low-Dimensional Coverage Problems
Sketching Valuation Functions
Shahar Dobzinski
Hu Fu
Robert Kleinberg
Noam Nisan
Tim Roughgarden
SODA 2012
Learning on a Budget: Posted Price Mechanisms for Online Procurement
Optimization with Demand Oracles
Buyback Problem - Approximate matroid intersection with cancellation costs
ICALP 2011
Randomized Online Algorithms for the Buyback Problem
On Tradeoff Between Network Connectivity, Phase Complexity and Communication Complexity of Reliable Communication Tolerating Mixed Adversary