Jon Schneider

Authored Publications
Google Publications
Other Publications
Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
    Preview abstract Blackwell's celebrated theory measures approachability using the $\ell_2$ (Euclidean) distance. In many applications such as regret minimization, it is often more useful to study approachability under other distance metrics, most commonly the $\ell_\infty$ metric. However, the time and space complexity of the algorithms designed for $\ell_\infty$ approachability depend on the dimension of the space of the vectorial payoffs, which is often prohibitively large. We present a framework for converting high-dimensional $\ell_\infty$ approachability problems to low-dimensional \emph{pseudonorm} approachability problems, thereby resolving such issues. We first show that the $\ell_\infty$ distance between the average payoff and the approachability set can be equivalently defined as a \emph{pseudodistance} between a lower-dimensional average vector payoff and a new convex convex set we define. Next, we develop an algorithmic theory of pseudonorm approachability analogous to previous work norm approachability showing that it can be achieved via online linear optimization (OLO) over a convex set given by the Fenchel dual of the unit pseudonorm ball. We then use that to show, modulo mild normalization assumptions, that there exists an $\ell_\infty$ approachability algorithm whose convergence is independent of the dimension of the original vector payoff. We further show that that algorithm admits a polynomial-time complexity, assuming that the original $\ell_\infty$-distance can be computed efficiently. We also give an $\ell_\infty$ approachability algorithm whose convergence is logarithmic in that dimension using an FTRL algorithm with a maximum-entropy regularizer. Finally, we illustrate the benefits of our framework by applying it to several problems in regret minimization. View details
    Preview abstract We study repeated two-player games where one of the players, the learner, employs a no-regret learning strategy, while the other, the optimizer, is a rational utility maximizer. We consider general Bayesian games, where the payoffs of both the optimizer and the learner could depend on the type, which is drawn from a publicly known distribution, but revealed privately to the learner. We address the following questions: (a) what is the bare minimum that the optimizer is guaranteed to obtain regardless of the no-regret learning algorithm employed by the learner? (b) are there learning algorithms that cap the optimizer payoff at this minimum? (c) can these generalizations be implemented efficiently? While building this theory of optimizer-learner interactions, we define a new combinatorial notion of regret called polytope swap regret, that could be of independent interest in other settings. View details
    Preview abstract We consider the following variant of contextual linear bandits motivated by routing applications in navigational engines and recommendation systems. We wish to learn a hidden $d$-dimensional value $w^*$. Every round, we are presented with a subset $\mathcal{X}_t \subseteq \mathbb{R}^d$ of possible actions. If we choose (i.e. recommend to the user) action $x_t$, we obtain utility $\langle x_t, w^* \rangle$ but only learn the identity of the best action $\arg\max_{x \in \X_t} \langle x, w^* \rangle$. We design algorithms for this problem which achieve regret $O(d\log T)$ and $\exp(O(d \log d))$. To accomplish this, we design novel cutting-plane algorithms with low “regret” -- the total distance between the true point $w^*$ and the hyperplanes the separation oracle returns. We also consider the variant where we are allowed to provide a list of several recommendations. In this variant, we give an algorithm with $O(d^2 \log d)$ regret and list size $\poly(d)$. Finally, we construct nearly tight algorithms for a weaker variant of this problem where the learner only learns the identity of an action that is better than the recommendation. Our results rely on new algorithmic techniques in convex geometry (including a variant of Steiner’s formula for the centroid of a convex set) which may be of independent interest. View details
    Preview abstract In the contextual pricing problem a seller repeatedly obtains products described by an adversarially chosen feature vector in $\R^d$ and only observes the purchasing decisions of a buyer with a fixed but unknown linear valuation over the products. The regret measures the difference between the revenue the seller could have obtained knowing the buyer valuation and what can be obtained by the learning algorithm. We give a poly-time algorithm for contextual pricing with $O(d \log \log T + d \log d)$ regret which matches the $\Omega(d \log \log T)$ lower bound up to the $d \log d$ additive factor. If we replace pricing loss by the symmetric loss, we obtain an algorithm with nearly optimal regret of $O(d \log d)$ matching the $\Omega(d)$ lower bound up to $\log d$. These algorithms are based on a novel technique of bounding the value of the Steiner polynomial of a convex region at various scales. The Steiner polynomial is a degree $d$ polynomial with intrinsic volumes as the coefficients. We also study a generalized version of contextual search where the hidden linear function over the Euclidean space is replaced by a hidden function $f : \X \rightarrow \Y$ in a certain hypothesis class $\H$. We provide a generic algorithm with $O(d^2)$ regret where $d$ is the covering dimension of this class. This leads in particular to a $\tilde{O}(s^2)$ regret algorithm for linear contextual search if the linear function is guaranteed to be $s$-sparse. Finally we also extend our results to the noisy feedback model, where each round our feedback is flipped with a fixed probability $p < 1/2$. View details
    Preview abstract We study a variant of linear regression with a dis-continuous loss function that we term Myersonianregression. In this variant, we wish to find a linearfunctionf:Rd→Rthat well approximates a setof points(xi,vi)∈Rd×[0,1]in the followingsense: we receive a loss ofviwhenf(xi)> viand a loss ofvi−f(xi)whenf(xi)≤vi. Thisarises naturally in the economic application ofdesigning a pricing policy for differentiated items(where the loss is the gap between the perfor-mance of our policy and the optimal Myersonprices).We show that Myersonian regression is NP-hardto solve exactly and furthermore that no fullypolynomial-time approximation scheme exists forMyersonian regression conditioned on the Expo-nential Time Hypothesis being true. In contrast tothis, we demonstrate a polynomial-time approx-imation scheme for Myersonian regression thatobtains anmadditive approximation to the op-timal possible revenue and can be computed intimeO(exp(poly(1/))poly(m,n)). We showthat this algorithm is stable and generalizes wellover distributions of samples View details
    Preview abstract We study optimization with an approximate zero order oracle where there is a cost c() associated with querying the oracle with  accuracy. We consider the task of reconstructing a linear function: given a linear function f : X ⊆ R d → [−1, 1] defined on a not-necessarily-convex set X, the goal is to reconstruct ˆf such that |f(x) − ˆf(x)| ≤  for all x ∈ X. We show that this can be done with cost O(d · c(/d)). The algorithm is based on a (poly-time computable) John-like theorem for simplices instead of ellipsoids, which may be of independent interest. This question is motivated by optimization with threshold queries, which are common in economic applications such as pricing. With threshold queries, approximating a number up to precision  requires c() = log(1/). For this, our algorithm has cost O(d log(d/)) which matches the Ω(d log(1/)) lower bound up to log factors View details
    Prior-Free Dynamic Auctions with Low Regret Buyers
    Advances in Neural Information Processing Systems(2019), pp. 4803-4813
    Preview abstract We study the problem of how to repeatedly sell to a buyer running a no-regret, mean-based algorithm. Previous work (Braverman et al., EC '18) shows that it is possible to design effective mechanisms in such a setting that extract almost all of the economic surplus, but these mechanisms require the buyer's values each round to be drawn iid from a fixed distribution. In this paper, we do away with this assumption and consider the {\it prior-free setting} where the buyer's value each round is chosen adversarially (possibly adaptively). We show that even in this prior-free setting, it is possible to extract a $(1-\varepsilon)$-approximation of the full economic surplus for any $\varepsilon > 0$. The menu complexity of our mechanism (the number of options offered to a buyer in any round) scales independently of the number of rounds $T$ and polynomially in $\varepsilon$. We show that this is optimal up to a polynomial factor; any mechanism achieving this approximation factor, even when values are drawn stochastically, requires menu complexity at least $\Omega(1/\varepsilon)$. Finally, we examine what is possible when we constrain our mechanism to a natural auction format where overbidding is dominated. Braverman et al. show that even when values are drawn from a known stochastic distribution supported on $[1/H, 1]$, it is impossible in general to extract more than $O(\log\log H / \log H)$ of the economic surplus. We show how to achieve the same approximation factor in the {\it prior-independent} setting (where the distribution is unknown to the seller), and an approximation factor of $O(1 / \log H)$ in the prior-free setting. View details
    Strategizing against No-regret Learners
    Advances in Neural Information Processing Systems(2019), pp. 1579-1587
    Preview abstract How should a player who repeatedly plays a game against a no-regret learner strategize to maximize his utility? We study this question and show that under some mild assumptions, the player can always guarantee himself a utility of at least what he would get in a Stackelberg equilibrium of the game. When the no-regret learner has only two actions, we show that the player cannot get any higher utility than the Stackelberg equilibrium utility. But when the no-regret learner has more than two actions and plays a mean-based no-regret strategy, we show that the player can get strictly higher than the Stackelberg equilibrium utility. We provide a characterization of the optimal game-play for the player against a mean-based no-regret learner as a solution to a control problem. When the no-regret learner's strategy also guarantees him a no-swap regret, we show that the player cannot get anything higher than a Stackelberg equilibrium utility. View details
    Contextual Search via Intrinsic Volumes
    59th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2018, Paris, France, October 7-9, 2018, pp. 268-282
    Preview abstract We study the problem of contextual search, a multidimensional generalization of bi- nary search that captures many problems in contextual decision-making. In contextual search, a learner is trying to learn the value of a hidden vector v ∈ [0, 1]d. Every round the learner is provided an adversarially-chosen context ut ∈ Rd, submits a guess pt for the value of ⟨ut, v⟩, learns whether pt < ⟨ut, v⟩, and incurs loss ⟨ut, v⟩. The learner’s goal is to minimize their total loss over the course of T rounds. We present an algorithm for the contextual search problem for the symmetric loss function l(θ,p) = |θ−p| that achieves Od(1) total loss. We present a new algorithm for the dynamic pricing problem (which can be realized as a special case of the contextual search problem) that achieves Od(log log T ) total loss, improving on the previous best known upper bounds of Od(logT) and matching the known lower bounds (up to a polynomial dependence on d). Both algorithms make significant use of ideas from the field of integral geometry, most notably the notion of intrinsic volumes of a convex set. To the best of our knowledge this is the first application of intrinsic volumes to algorithm design. View details
    Contextual Pricing for Lipschitz Buyers
    Jieming Mao
    Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, 3-8 December 2018, Montr\'eal, Canada., pp. 5648-5656
    Preview abstract We investigate the problem of learning a Lipschitz function from binary feedback. In this problem, a learner is trying to learn a Lipschitz function f : [0, 1] d → [0, 1] over the course of T rounds. On round t, an adversary provides the learner with an input x t , the learner submits a guess y t for f (x t ), and learns whether P y t > f (x t ) or y t ≤ f (x t ). The learner’s goal is to minimize their total loss t `(f (x t ), y t ) (for some loss function `). The problem is motivated by contextual dynamic pric- ing, where a firm must sell a stream of differentiated products to a collection of buyers with non-linear valuations for the items and observes only whether the item was sold or not at the posted price. For the symmetric loss `(f (x t ), y t ) = |f (x t ) − y t |, we provide an algorithm for (d−1)/d this problem achieving total loss O(log T ) when d = 1 and O(T ) when √ d > 1, and show that both bounds are tight (up to a factor of log T ). For the pricing loss function `(f (x t ), y t ) = f (x t ) − y t 1{y t ≤ f (x t )} we show a regret bound of O(T d/(d+1) ) and show that this bound is tight. We present improved bounds in the special case of a population of linear buyers View details
    No Results Found