Greg Kochanski

MIT Physics 1982 S.B., 1987 Ph.D. AT&T Bell Laboratories & Lucent Technologies Bell Laboratories, 1987-2002. University of Oxford Faculty of Linguistics 2003-2011.
Authored Publications
Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
    Preview abstract Zeroth-order optimization is the process of minimizing an objective $f(x)$ given oracle access to evaluations at adaptively chosen inputs $x$. In this paper, we present two simple yet powerful GradientLess Descent (GLD) algorithms that do not rely on an underlying gradient estimate and are numerically stable. We analyze our algorithm from a novel geometric perspective and we derive two invariance properties of our algorithm: monotone and affine invariance. Specifically, for {\it any monotone transform} of a smooth and strongly convex objective with latent dimension $k$, then we present a novel analysis that shows convergence within an $\epsilon$-ball of the optimum in $O(kQ\log(n)\log(R/\epsilon))$ evaluations, where the input dimension is $n$, $R$ is the diameter of the input space and $Q$ is the condition number. Our rates are the first of its kind to be both 1) poly-logarithmically dependent on dimensionality and 2) invariant under monotone transformations. From our geometric perspective, we can further show that our analysis is optimal. We emphasize that monotone and affine invariance are key to the empirical success of gradientless algorithms, as demonstrated on BBOB and MuJoCo benchmarks. View details
    Preview abstract Any sufficiently complex system acts as a black box when it becomes easier to experiment with than to understand. Hence, black-box optimization has become increasingly important as systems have become more complex. In this paper we describe Google Vizier, a Google-internal service for performing black-box optimization that has become the de facto parameter tuning engine at Google. Google Vizier is used to optimize many of our machine learning models and other systems, and also provides core capabilities to Google's Cloud Machine Learning HyperTune subsystem. We discuss our requirements, infrastructure design, underlying algorithms, and advanced features such as transfer learning and automated early stopping that the service provides. View details
    Bayesian Optimization for a Better Dessert
    Subhodeep Moitra
    Proceedings of the 2017 NIPS Workshop on Bayesian Optimization, December 9, 2017, Long Beach, USA (to appear)
    Preview abstract We present a case study on applying Bayesian Optimization to a complex real-world system; our challenge was to optimize chocolate chip cookies. The process was a mixed-initiative system where both human chefs, human raters, and a machine optimizer participated in 144 experiments. This process resulted in highly rated cookies that deviated from expectations in some surprising ways -- much less sugar in California, and cayenne in Pittsburgh. Our experience highlights the importance of incorporating domain expertise and the value of transfer learning approaches. View details
    Preview abstract We present a simple and robust optimization algorithm related to genetic algorithms, and with analogies to the popular CMA-ES search algorithm, that serves as a cheap alternative to Bayesian Optimization. The algorithm is robust against both monotonic transforms of the objective function value and affine transformations of the feasible region. It is fast and easy to implement, and has performance comparable to CMA-ES on a suite of benchmarks while spending less CPU in the optimization algorithm, and can exhibit better overall performance than Bayesian Optimization when the objective function is cheap. View details