Jan Pfeifer
Authored Publications
    TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank
    Sebastian Bruch
    Rohan Anil
    Stephan Wolf
    Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) (2019), pp. 2970-2978
    Learning-to-Rank deals with maximizing the utility of a list of examples presented to the user, with items of higher relevance being prioritized. It has several practical applications, such as large-scale search, recommender systems, document summarization, and question answering. While there is widespread support for classification- and regression-based learning, support for learning-to-rank in deep learning frameworks has been limited. We propose TensorFlow Ranking, the first open-source library for solving large-scale ranking problems in a deep learning framework. It is highly configurable and provides easy-to-use APIs to support different scoring mechanisms, loss functions, and evaluation metrics in the learning-to-rank setting. Our library is developed on top of TensorFlow and can thus fully leverage the advantages of that platform. For example, it is highly scalable, both in training and in inference, and can be used to learn ranking models over massive amounts of user activity data, which can include heterogeneous dense and sparse features. We empirically demonstrate the effectiveness of our library in learning ranking functions for large-scale search and recommendation applications in Gmail and Google Drive. We also show that ranking models built with our library scale well in distributed training, without a significant impact on metrics. The library is available to the open source community, with the hope that it facilitates further academic research and industrial applications in the field of learning-to-rank.
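    For a flavor of the learning-to-rank setting the library targets, here is a minimal sketch of a listwise softmax ranking loss written in plain TensorFlow. TF-Ranking ships such losses and metrics behind configurable APIs; the function name, shapes, and masking convention below are illustrative assumptions, not the library's actual interface.

```python
# A minimal sketch of a listwise softmax ranking loss, the kind of loss
# TF-Ranking provides out of the box. Names and shapes are illustrative.
import tensorflow as tf

def softmax_ranking_loss(labels, scores, mask):
    """Listwise softmax cross-entropy over each query's candidate list.

    labels: [batch, list_size] graded relevance (e.g. 0..4), float32.
    scores: [batch, list_size] model scores, float32.
    mask:   [batch, list_size] 1.0 for real items, 0.0 for padding.
    """
    # Push padded positions to -inf so they receive ~zero probability.
    neg_inf = tf.fill(tf.shape(scores), -1e9)
    scores = tf.where(mask > 0, scores, neg_inf)
    # Normalize relevance labels into a target distribution per list.
    labels = labels * mask
    label_sum = tf.reduce_sum(labels, axis=1, keepdims=True)
    target = tf.math.divide_no_nan(labels, label_sum)
    log_prob = tf.nn.log_softmax(scores, axis=1)
    return -tf.reduce_mean(tf.reduce_sum(target * log_prob, axis=1))

# Toy usage: one query with three candidates and one padded slot.
labels = tf.constant([[3.0, 0.0, 1.0, 0.0]])
scores = tf.constant([[0.2, -0.4, 0.1, 0.0]])
mask   = tf.constant([[1.0, 1.0, 1.0, 0.0]])
print(softmax_ranking_loss(labels, scores, mask).numpy())
```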
    Deep Lattice Networks and Partial Monotonic Functions
    Advances in Neural Information Processing Systems 30 (NIPS 2017)
    We propose learning deep models that are monotonic with respect to a user-specified set of inputs by alternating layers of linear embeddings, ensembles of lattices, and calibrators (piecewise-linear functions), with appropriate constraints for monotonicity, and jointly training the resulting network. We implement the layers and projections as new computational graph nodes in TensorFlow and use the ADAM optimizer and batched stochastic gradients. Experiments on benchmark and real-world datasets show that six-layer monotonic deep lattice networks achieve state-of-the-art performance for classification and regression with monotonicity guarantees.
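    Below is a minimal sketch of one building block from this construction: a piecewise-linear calibrator that is monotonic by design. The paper enforces monotonicity with projected gradients; the non-negative-increment reparameterization here is a simpler, swapped-in trick, and all names are illustrative.

```python
# Sketch of a monotonic piecewise-linear calibrator layer. Monotonicity is
# obtained by construction: keypoint outputs are a cumulative sum of
# softplus-positive increments (a simplification of the paper's projections).
import tensorflow as tf

class MonotonicCalibrator(tf.keras.layers.Layer):
    def __init__(self, num_keypoints=10, **kwargs):
        super().__init__(**kwargs)
        self.num_keypoints = num_keypoints

    def build(self, input_shape):
        # Fixed, evenly spaced input keypoints on [0, 1].
        self.x_kp = tf.linspace(0.0, 1.0, self.num_keypoints)
        self.bias = self.add_weight(name="bias", shape=(), initializer="zeros")
        # Unconstrained deltas; softplus makes output keypoints nondecreasing.
        self.raw_deltas = self.add_weight(
            name="raw_deltas", shape=(self.num_keypoints - 1,),
            initializer="ones")

    def call(self, x):
        # y-values at keypoints: bias + cumulative sum of positive increments.
        y_kp = self.bias + tf.concat(
            [[0.0], tf.cumsum(tf.nn.softplus(self.raw_deltas))], axis=0)
        # Piecewise-linear interpolation via hat basis functions on an
        # even grid; x has shape [batch, 1].
        x = tf.clip_by_value(x, 0.0, 1.0)
        w = tf.maximum(
            0.0, 1.0 - tf.abs(x - self.x_kp) * (self.num_keypoints - 1))
        return tf.reduce_sum(w * y_kp, axis=-1, keepdims=True)

# Toy usage: calibrate two scalar inputs.
cal = MonotonicCalibrator(num_keypoints=8)
print(cal(tf.constant([[0.25], [0.75]])).numpy())
```

    In the paper's architecture, such calibrators alternate with linear embeddings and lattice layers; this sketch covers only the calibrator.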
    A Light Touch for Heavily Constrained SGD
    29th Annual Conference on Learning Theory (COLT) (2016)
    Minimizing empirical risk subject to a set of constraints can be a useful strategy for learning restricted classes of functions, such as monotonic functions, submodular functions, or classifiers that guarantee a certain class label for some subset of examples. However, these restrictions may result in a very large number of constraints. Projected stochastic gradient descent (SGD) is often the default choice for large-scale optimization in machine learning, but it requires a projection after each update. For heavily constrained objectives, we propose an efficient extension of SGD that stays close to the feasible region while applying constraints only probabilistically at each iteration. Theoretical analysis shows a compelling trade-off between per-iteration work and the number of iterations needed on problems with a large number of constraints.
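    A minimal NumPy sketch of the core idea follows: take a gradient step, then project onto a single randomly sampled constraint rather than onto the full feasible set. This is a simplification of the paper's scheme, with an assumed toy objective and random linear constraints.

```python
# SGD with probabilistic constraint enforcement: each step takes a gradient
# step, then projects onto ONE randomly sampled halfspace a_i^T x <= b_i
# instead of the full (expensive) feasible set. A simplified illustration,
# not the paper's exact algorithm.
import numpy as np

rng = np.random.default_rng(0)
d, m = 5, 1000                       # dimension, number of constraints
A = rng.normal(size=(m, d))          # constraints: A @ x <= b
b = np.abs(rng.normal(size=m)) + 0.5
target = rng.normal(size=d)          # toy objective: minimize ||x - target||^2

x = np.zeros(d)
lr = 0.05
for t in range(2000):
    grad = 2.0 * (x - target)        # gradient of the objective
    x -= lr * grad
    i = rng.integers(m)              # sample one constraint uniformly
    viol = A[i] @ x - b[i]
    if viol > 0:                     # project onto the violated halfspace only
        x -= viol * A[i] / (A[i] @ A[i])

print("max constraint violation:", (A @ x - b).max())
```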
    Monotonic Calibrated Interpolated Look-Up Tables
    Maya Gupta
    Andrew Cotter
    Konstantin Voevodski
    Alexander Mangylov
    Wojciech Moczydlowski
    Alexander van Esbroeck
    Journal of Machine Learning Research (JMLR) (2016)
    Real-world machine learning applications may require functions to be interpretable and fast to evaluate, in addition to accurate. In particular, guaranteed monotonicity of the learned function can be critical to user trust. We propose meeting these three goals for low-dimensional machine learning problems by learning flexible, monotonic functions using calibrated interpolated look-up tables. We extend the structural risk minimization framework of lattice regression to train monotonic look-up tables by solving a convex problem with appropriate linear inequality constraints. In addition, we propose jointly learning interpretable calibrations of each feature to normalize continuous features and handle categorical or missing data, though this makes the optimization problem non-convex. We address large-scale learning through parallelization and mini-batching, and we propose random sampling of additive regularizer terms. Experiments on seven real-world problems with five to sixteen features and thousands to millions of training samples show that the proposed monotonic functions can achieve state-of-the-art accuracy on practical problems while providing greater transparency to users.
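    Here is a minimal sketch of the one-dimensional case under assumed choices (evenly spaced keypoints, squared loss): an interpolated look-up table trained by projected gradient descent, where projection onto the monotonicity constraints theta[k] <= theta[k+1] is exact L2 isotonic regression via pool-adjacent-violators.

```python
# Fit a 1-D interpolated look-up table with projected gradient descent. The
# linear inequality constraints theta[k] <= theta[k+1] are enforced by
# projecting with pool-adjacent-violators (exact L2 isotonic regression).
# Keypoint spacing, loss, and step size are illustrative assumptions.
import numpy as np

def pav(y):
    """L2 projection of y onto nondecreasing sequences (pool adjacent violators)."""
    vals, wts = [], []
    for v in y:
        vals.append(float(v)); wts.append(1.0)
        while len(vals) > 1 and vals[-2] > vals[-1]:
            w = wts[-1] + wts[-2]
            merged = (vals[-1] * wts[-1] + vals[-2] * wts[-2]) / w
            vals.pop(); wts.pop()
            vals[-1], wts[-1] = merged, w
    return np.repeat(vals, np.array(wts, dtype=int))

def interp_weights(x, num_kp):
    """Hat-function weights for linear interpolation on an even grid in [0, 1]."""
    kp = np.linspace(0.0, 1.0, num_kp)
    return np.maximum(0.0, 1.0 - np.abs(x[:, None] - kp) * (num_kp - 1))

rng = np.random.default_rng(0)
x = rng.uniform(size=500)
y = np.sqrt(x) + 0.1 * rng.normal(size=500)    # noisy monotonic ground truth

num_kp, lr = 10, 0.5
theta = np.zeros(num_kp)
W = interp_weights(x, num_kp)                  # [n, num_kp]
for _ in range(200):
    resid = W @ theta - y
    theta -= lr * (W.T @ resid) / len(y)       # gradient of mean squared error
    theta = pav(theta)                         # project onto the monotone cone

print(theta)                                   # nondecreasing keypoint outputs
```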
    Fast and Flexible Monotonic Functions with Ensembles of Lattices
    Advances in Neural Information Processing Systems 29 (NIPS 2016)
    In many real-world machine learning problems, some inputs are known to be positively (or negatively) related to the output, and constraining the trained model to respect such a monotonic relationship can provide regularization and makes the model more interpretable. However, flexible monotonic functions are computationally challenging to learn beyond a few features. We break through this barrier by learning ensembles of monotonic calibrated look-up tables (lattices). A key contribution is an automated algorithm for selecting feature subsets for the ensemble base models. We demonstrate that, compared to random forests, these ensembles produce similar or better accuracy, while providing guaranteed monotonicity consistent with prior knowledge, smaller model size, and faster evaluation.
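    The sketch below shows the ensemble structure under simplifying assumptions: random two-feature subsets stand in for the paper's automated subset-selection algorithm, each base model is a tiny 2x2 monotonic lattice, and predictions are averaged.

```python
# Ensemble of tiny 2-feature monotonic lattices on random feature subsets.
# The paper selects subsets automatically; uniform random subsets here are a
# placeholder assumption, and the monotone parameterization via non-negative
# deltas is a simplification of the paper's constrained training.
import numpy as np

rng = np.random.default_rng(0)

def lattice_predict(x2, params):
    """Multilinear interpolation on a 2x2 lattice; x2 in [0,1]^2."""
    a, d1, d2, d3 = params                 # d's >= 0  =>  monotone in u and v
    t00, t10 = a, a + d1
    t01, t11 = a + d2, a + d1 + d2 + d3
    u, v = x2[:, 0], x2[:, 1]
    return (1-u)*(1-v)*t00 + u*(1-v)*t10 + (1-u)*v*t01 + u*v*t11

n, D, K = 1000, 5, 8                       # samples, features, ensemble size
X = rng.uniform(size=(n, D))
y = X @ np.array([1.0, 0.8, 0.5, 0.3, 0.2]) + 0.05 * rng.normal(size=n)

models = []
for _ in range(K):
    feats = rng.choice(D, size=2, replace=False)   # random feature subset
    p = np.zeros(4)                                # (a, d1, d2, d3)
    for _ in range(300):
        idx = rng.integers(n, size=64)             # mini-batch
        xb, yb = X[idx][:, feats], y[idx]
        resid = lattice_predict(xb, p) - yb
        u, v = xb[:, 0], xb[:, 1]
        # gradient of the multilinear output w.r.t. (a, d1, d2, d3)
        g = np.stack([np.ones_like(u), u, v, u*v], axis=1)
        p -= 0.1 * (g * resid[:, None]).mean(axis=0)
        p[1:] = np.maximum(p[1:], 0.0)             # project deltas back to >= 0
    models.append((feats, p))

# Ensemble prediction = average over base lattices.
ens_pred = np.mean([lattice_predict(X[:, f], p) for f, p in models], axis=0)
print("ensemble MSE:", np.mean((ens_pred - y) ** 2))
```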