Greg Kochanski
MIT Physics 1982 S.B., 1987 Ph.D.
AT&T Bell Laboratories & Lucent Technologies Bell Laboratories, 1987-2002.
University of Oxford Faculty of Linguistics 2003-2011.
Research Areas
Authored Publications
Sort By
Gradientless Descent: High-Dimensional Zeroth-Order Optimization
ICLR 2020 (2020)
Preview abstract
Zeroth-order optimization is the process of minimizing an objective $f(x)$ given oracle access to evaluations at adaptively chosen inputs $x$. In this paper, we present two simple yet powerful GradientLess Descent (GLD) algorithms that do not rely on an underlying gradient estimate and are numerically stable. We analyze our algorithm from a novel geometric perspective and we derive two invariance properties of our algorithm: monotone and affine invariance. Specifically, for {\it any monotone transform} of a smooth and strongly convex objective with latent dimension $k$, then we present a novel analysis that shows convergence within an $\epsilon$-ball of the optimum in $O(kQ\log(n)\log(R/\epsilon))$ evaluations, where the input dimension is $n$, $R$ is the diameter of the input space and $Q$ is the condition number. Our rates are the first of its kind to be both 1) poly-logarithmically dependent on dimensionality and 2) invariant under monotone transformations. From our geometric perspective, we can further show that our analysis is optimal. We emphasize that monotone and affine invariance are key to the empirical success of gradientless algorithms, as demonstrated on BBOB and MuJoCo benchmarks.
View details
Bayesian Optimization for a Better Dessert
Subhodeep Moitra
Proceedings of the 2017 NIPS Workshop on Bayesian Optimization, December 9, 2017, Long Beach, USA (to appear)
Preview abstract
We present a case study on applying Bayesian Optimization to a complex real-world system; our challenge was to optimize chocolate chip cookies. The process was a mixed-initiative system where both human chefs, human raters, and a machine optimizer participated in 144 experiments. This process resulted in highly rated cookies that deviated from expectations in some surprising ways -- much less sugar in California, and cayenne in Pittsburgh. Our experience highlights the importance of incorporating domain expertise and the value of transfer learning approaches.
View details
Black Box Optimization via a Bayesian-Optimized Genetic Algorithm
Advances in Neural Information Processing Systems 30 (NIPS 2017) (to appear)
Preview abstract
We present a simple and robust optimization algorithm related to genetic algorithms, and with analogies to the popular CMA-ES search algorithm, that serves as a cheap alternative to Bayesian Optimization. The algorithm is robust against both monotonic transforms of the objective function value and affine transformations of the feasible region. It is fast and easy to implement, and has performance comparable to CMA-ES on a suite of benchmarks while spending less CPU in the optimization algorithm, and can exhibit better overall performance than Bayesian Optimization when the objective function is cheap.
View details
Google Vizier: A Service for Black-Box Optimization
Subhodeep Moitra
ACM (2017), pp. 1487-1495
Preview abstract
Any sufficiently complex system acts as a black box when it becomes easier to
experiment with than to understand. Hence, black-box optimization has become
increasingly important as systems have become more complex. In this paper we
describe Google Vizier, a Google-internal service for performing
black-box optimization that has become the de facto parameter tuning
engine at Google. Google Vizier is used to optimize many of our machine
learning models and other systems, and also provides core capabilities to
Google's Cloud Machine Learning HyperTune subsystem. We discuss our
requirements, infrastructure design, underlying algorithms, and advanced
features such as transfer learning and automated early stopping that the
service provides.
View details