# Jon Schneider

### Research Areas

Authored Publications

Google Publications

Other Publications

Sort By

Contextual Recommendations and Low-Regret Cutting-Plane Algorithms

Sreenivas Gollapudi

Guru Prashanth Guruganesh

NeurIPS(2021)

Preview abstract
We consider the following variant of contextual linear bandits motivated by routing applications in navigational engines and recommendation systems. We wish to learn a hidden $d$-dimensional value $w^*$. Every round, we are presented with a subset $\mathcal{X}_t \subseteq \mathbb{R}^d$ of possible actions. If we choose (i.e. recommend to the user) action $x_t$, we obtain utility $\langle x_t, w^* \rangle$ but only learn the identity of the best action $\arg\max_{x \in \X_t} \langle x, w^* \rangle$.
We design algorithms for this problem which achieve regret $O(d\log T)$ and $\exp(O(d \log d))$. To accomplish this, we design novel cutting-plane algorithms with low “regret” -- the total distance between the true point $w^*$ and the hyperplanes the separation oracle returns.
We also consider the variant where we are allowed to provide a list of several recommendations. In this variant, we give an algorithm with $O(d^2 \log d)$ regret and list size $\poly(d)$. Finally, we construct nearly tight algorithms for a weaker variant of this problem where the learner only learns the identity of an action that is better than the recommendation. Our results rely on new algorithmic techniques in convex geometry (including a variant of Steiner’s formula for the centroid of a convex set) which may be of independent interest.
View details

Prior-Free Dynamic Auctions with Low Regret Buyers

Advances in Neural Information Processing Systems(2019), pp. 4803-4813

Preview abstract
We study the problem of how to repeatedly sell to a buyer running a no-regret, mean-based algorithm. Previous work (Braverman et al., EC '18) shows that it is possible to design effective mechanisms in such a setting that extract almost all of the economic surplus, but these mechanisms require the buyer's values each round to be drawn iid from a fixed distribution. In this paper, we do away with this assumption and consider the {\it prior-free setting} where the buyer's value each round is chosen adversarially (possibly adaptively).
We show that even in this prior-free setting, it is possible to extract a $(1-\varepsilon)$-approximation of the full economic surplus for any $\varepsilon > 0$. The menu complexity of our mechanism (the number of options offered to a buyer in any round) scales independently of the number of rounds $T$ and polynomially in $\varepsilon$. We show that this is optimal up to a polynomial factor; any mechanism achieving this approximation factor, even when values are drawn stochastically, requires menu complexity at least $\Omega(1/\varepsilon)$.
Finally, we examine what is possible when we constrain our mechanism to a natural auction format where overbidding is dominated. Braverman et al. show that even when values are drawn from a known stochastic distribution supported on $[1/H, 1]$, it is impossible in general to extract more than $O(\log\log H / \log H)$ of the economic surplus. We show how to achieve the same approximation factor in the {\it prior-independent} setting (where the distribution is unknown to the seller), and an approximation factor of $O(1 / \log H)$ in the prior-free setting.
View details

Strategizing against No-regret Learners

Advances in Neural Information Processing Systems(2019), pp. 1579-1587

Preview abstract
How should a player who repeatedly plays a game against a no-regret learner strategize to maximize his utility? We study this question and show that under some mild assumptions, the player can always guarantee himself a utility of at least what he would get in a Stackelberg equilibrium of the game. When the no-regret learner has only two actions, we show that the player cannot get any higher utility than the Stackelberg equilibrium utility. But when the no-regret learner has more than two actions and plays a mean-based no-regret strategy, we show that the player can get strictly higher than the Stackelberg equilibrium utility. We provide a characterization of the optimal game-play for the player against a mean-based no-regret learner as a solution to a control problem. When the no-regret learner's strategy also guarantees him a no-swap regret, we show that the player cannot get anything higher than a Stackelberg equilibrium utility.
View details

Contextual Search via Intrinsic Volumes

59th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2018, Paris, France, October 7-9, 2018, pp. 268-282

Preview abstract
We study the problem of contextual search, a multidimensional generalization of bi- nary search that captures many problems in contextual decision-making. In contextual search, a learner is trying to learn the value of a hidden vector v ∈ [0, 1]d. Every round the learner is provided an adversarially-chosen context ut ∈ Rd, submits a guess pt for the value of ⟨ut, v⟩, learns whether pt < ⟨ut, v⟩, and incurs loss ⟨ut, v⟩. The learner’s goal is to minimize their total loss over the course of T rounds.
We present an algorithm for the contextual search problem for the symmetric loss function l(θ,p) = |θ−p| that achieves Od(1) total loss. We present a new algorithm for the dynamic pricing problem (which can be realized as a special case of the contextual search problem) that achieves Od(log log T ) total loss, improving on the previous best known upper bounds of Od(logT) and matching the known lower bounds (up to a polynomial dependence on d). Both algorithms make significant use of ideas from the field of integral geometry, most notably the notion of intrinsic volumes of a convex set. To the best of our knowledge this is the first application of intrinsic volumes to algorithm design.
View details

Contextual Pricing for Lipschitz Buyers

Jieming Mao

Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, 3-8 December 2018, Montr\'eal, Canada., pp. 5648-5656

Preview abstract
We investigate the problem of learning a Lipschitz function from binary feedback.
In this problem, a learner is trying to learn a Lipschitz function f : [0, 1] d → [0, 1]
over the course of T rounds. On round t, an adversary provides the learner with an
input x t , the learner submits a guess y t for f (x t ), and learns whether
P y t > f (x t )
or y t ≤ f (x t ). The learner’s goal is to minimize their total loss t `(f (x t ), y t )
(for some loss function `). The problem is motivated by contextual dynamic pric-
ing, where a firm must sell a stream of differentiated products to a collection of
buyers with non-linear valuations for the items and observes only whether the item
was sold or not at the posted price.
For the symmetric loss `(f (x t ), y t ) = |f (x t ) − y t |, we provide an algorithm for
(d−1)/d
this problem achieving total loss O(log T ) when d = 1 and O(T
) when
√
d > 1, and show that both bounds are tight (up to a factor of log T ). For the
pricing loss function `(f (x t ), y t ) = f (x t ) − y t 1{y t ≤ f (x t )} we show a regret
bound of O(T d/(d+1) ) and show that this bound is tight. We present improved
bounds in the special case of a population of linear buyers
View details

No Results Found