Kevin Canini

Authored Publications
    Abstract: We investigate machine learning models that can provide diminishing-returns and accelerating-returns guarantees to capture prior knowledge or policies about how outputs should depend on inputs. We show that one can build flexible, nonlinear, multi-dimensional models using lattice functions with any combination of concavity/convexity and monotonicity constraints on any subsets of features, and compare to new shape-constrained neural networks. We demonstrate on real-world examples that these shape-constrained models can provide tuning-free regularization and improve model understandability.
    Abstract: We propose learning deep models that are monotonic with respect to a user-specified set of inputs by alternating layers of linear embeddings, ensembles of lattices, and calibrators (piecewise linear functions), with appropriate constraints for monotonicity, and jointly training the resulting network. We implement the layers and projections with new computational graph nodes in TensorFlow and use the Adam optimizer and batched stochastic gradients. Experiments on benchmark and real-world datasets show that six-layer monotonic deep lattice networks achieve state-of-the-art performance for classification and regression with monotonicity guarantees.
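The calibrators mentioned in this abstract are piecewise linear functions. As a rough illustration only, and not the paper's TensorFlow implementation, the sketch below evaluates a one-dimensional piecewise linear calibrator and checks the ordering condition on its output keypoints that makes it nondecreasing. The function names, the keypoint representation, and the clamping outside the keypoint range are all assumptions made for this example.

```python
from bisect import bisect_right


def calibrate(x, keypoints_in, keypoints_out):
    """Evaluate a piecewise linear function defined by (input, output) keypoints.

    Hypothetical helper: keypoints_in must be strictly increasing. Inputs
    outside the keypoint range are clamped to the end values (an assumption).
    """
    if x <= keypoints_in[0]:
        return keypoints_out[0]
    if x >= keypoints_in[-1]:
        return keypoints_out[-1]
    # Index of the segment [keypoints_in[i], keypoints_in[i + 1]) containing x.
    i = bisect_right(keypoints_in, x) - 1
    x_lo, x_hi = keypoints_in[i], keypoints_in[i + 1]
    y_lo, y_hi = keypoints_out[i], keypoints_out[i + 1]
    t = (x - x_lo) / (x_hi - x_lo)
    return y_lo + t * (y_hi - y_lo)


def is_nondecreasing(keypoints_out):
    """A piecewise linear function is nondecreasing when its keypoint outputs are."""
    return all(b >= a for a, b in zip(keypoints_out, keypoints_out[1:]))


inputs = [0.0, 10.0, 20.0]
outputs = [0.0, 1.0, 3.0]
mid_value = calibrate(5.0, inputs, outputs)
```

In a trained model the output keypoints would be learned parameters, and the ordering check would become a constraint enforced during training rather than a test applied afterwards.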
    Monotonic Calibrated Interpolated Look-Up Tables
    Maya Gupta
    Andrew Cotter
    Konstantin Voevodski
    Alexander Mangylov
    Wojciech Moczydlowski
    Alexander van Esbroeck
    Journal of Machine Learning Research (JMLR) (2016)
    Abstract: Real-world machine learning applications may require functions to be interpretable and fast to evaluate, in addition to accurate. In particular, guaranteed monotonicity of the learned function can be critical to user trust. We propose meeting these three goals for low-dimensional machine learning problems by learning flexible, monotonic functions using calibrated interpolated look-up tables. We extend the structural risk minimization framework of lattice regression to train monotonic look-up tables by solving a convex problem with appropriate linear inequality constraints. In addition, we propose jointly learning interpretable calibrations of each feature to normalize continuous features and handle categorical or missing data, though this changes the optimization problem to be non-convex. We address large-scale learning through parallelization and mini-batching, and propose random sampling of additive regularizer terms. Experiments on seven real-world problems with five to sixteen features and thousands to millions of training samples show the proposed monotonic functions can achieve state-of-the-art accuracy on practical problems while providing greater transparency to users.
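To make the idea of an interpolated look-up table with linear inequality constraints concrete, here is a minimal sketch for the smallest case: a 2x2 table over two features scaled to [0, 1], evaluated by bilinear interpolation. It is an illustration under stated assumptions, not the authors' code; the function names and the `theta[i][j]` layout are hypothetical. For this kind of interpolation, the function is nondecreasing in a feature when each pair of adjacent table entries along that feature is ordered, which is the sort of pairwise inequality the abstract refers to.

```python
def interpolate_2x2(theta, x0, x1):
    """Bilinear interpolation of a 2x2 look-up table on the unit square.

    theta[i][j] is the table value at the corner (x0=i, x1=j).
    """
    if not (0.0 <= x0 <= 1.0 and 0.0 <= x1 <= 1.0):
        raise ValueError("inputs are assumed to be pre-scaled to [0, 1]")
    return (
        (1 - x0) * (1 - x1) * theta[0][0]
        + (1 - x0) * x1 * theta[0][1]
        + x0 * (1 - x1) * theta[1][0]
        + x0 * x1 * theta[1][1]
    )


def is_monotonic_in_x0(theta):
    """Check the pairwise constraints that make the table nondecreasing in x0."""
    return theta[1][0] >= theta[0][0] and theta[1][1] >= theta[0][1]


table = [[0.0, 1.0], [2.0, 4.0]]
center_value = interpolate_2x2(table, 0.5, 0.5)
```

Because the constraints are linear in the table entries, they can be imposed directly on the parameters during training; larger tables and more features follow the same pattern with more corner terms.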
    Abstract: In many real-world machine learning problems, some inputs are known to be positively (or negatively) related to the output, and in such cases constraining the trained model to respect that monotonic relationship can provide regularization and make the model more interpretable. However, flexible monotonic functions are computationally challenging to learn beyond a few features. We break through this barrier by learning ensembles of monotonic calibrated look-up tables (lattices). A key contribution is an automated algorithm for selecting feature subsets for the ensemble base models. We demonstrate that, compared to random forests, these ensembles produce similar or better accuracy while providing guaranteed monotonicity consistent with prior knowledge, smaller model size, and faster evaluation.
    Launch and Iterate: Reducing Prediction Churn
    Quentin Cormier
    Mahdi Milani Fard
    Maya Gupta
    NIPS (2016)
    Abstract: Practical applications of machine learning often involve successive training iterations with ever-improving features and increasing training examples. Ideally, changes in the output of any new model should only be improvements (wins) over the previous iteration, but in practice the predictions may change neutrally for many examples, resulting in extra net-zero wins and losses, which we refer to as churn. These changes in the predictions are problematic for usability in some applications, and make it harder to measure whether a change is a statistically significant improvement. In this paper, we formulate the problem and present a stabilization operator to regularize a classifier towards a previous classifier. We use a Markov chain Monte Carlo stabilization operator to produce a model with more consistent predictions but without degrading accuracy. We investigate the properties of the proposal with theoretical analysis. Experiments on benchmark datasets for three different classification algorithms demonstrate the method and the range of churn reduction it can provide.
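The notion of churn in this abstract can be illustrated with a small sketch. It assumes churn is measured as the fraction of examples on which two models' predicted labels differ, and counts wins and losses against ground-truth labels; the function names and toy data are hypothetical, and this is not the paper's stabilization method.

```python
def churn(old_preds, new_preds):
    """Fraction of examples whose predicted label changed between two models."""
    if len(old_preds) != len(new_preds):
        raise ValueError("prediction lists must have the same length")
    if not old_preds:
        return 0.0
    changed = sum(1 for a, b in zip(old_preds, new_preds) if a != b)
    return changed / len(old_preds)


def wins_and_losses(labels, old_preds, new_preds):
    """Count examples the new model fixed (wins) and broke (losses)."""
    wins = sum(
        1 for y, a, b in zip(labels, old_preds, new_preds) if a != y and b == y
    )
    losses = sum(
        1 for y, a, b in zip(labels, old_preds, new_preds) if a == y and b != y
    )
    return wins, losses


labels = [0, 1, 1, 1]
old_model = [0, 1, 1, 0]
new_model = [0, 1, 0, 1]
example_churn = churn(old_model, new_model)
example_wins, example_losses = wins_and_losses(labels, old_model, new_model)
```

In this toy data the two models have identical accuracy, yet half of the predictions changed: one win cancels one loss, which is the net-zero change the abstract describes.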
    Parallel Boosting with Momentum
    Indraneel Mukherjee
    Rafael Frongillo
    Yoram Singer
    ECML PKDD 2013, Part III, LNAI 8190, Springer, Heidelberg, pp. 17-32 (to appear)
    Abstract: We describe a new, simplified, and general analysis of a fusion of Nesterov’s accelerated gradient with parallel coordinate descent. The resulting algorithm, which we call BOOM, for boosting with momentum, enjoys the merits of both techniques. Namely, BOOM retains the momentum and convergence properties of the accelerated gradient method while taking into account the curvature of the objective function. We describe a distributed implementation of BOOM which is suitable for massive high-dimensional datasets. We show experimentally that BOOM is especially effective in large-scale learning problems with rare yet informative features.