TFX: A TensorFlow-Based Production-Scale Machine Learning Platform

Akshay Naresh Modi

Chiu Yuen Koo

Chuan Yu Foo

Clemens Mewald

Denis M. Baylor

Eric Breck

Heng-Tze Cheng

Jarek Wilkiewicz

Levent Koc

Lukasz Lew

Martin A. Zinkevich

Martin Wicke

Mustafa Ispir

Neoklis Polyzotis

Noah Fiedel

Salem Elie Haykal

Steven Whang

Sudip Roy

Sukriti Ramesh

Vihan Jain

Xin Zhang

Zakaria Haque

KDD 2017


Abstract

Creating and maintaining a platform for reliably producing and deploying machine learning models requires careful orchestration of many components: a learner for generating models based on training data, modules for analyzing and validating both data and models, and finally infrastructure for serving models in production. This becomes particularly challenging when data changes over time and fresh models need to be produced continuously. Unfortunately, such orchestration is often done ad hoc using glue code and custom scripts developed by individual teams for specific use cases, leading to duplicated effort and fragile systems with high technical debt.

We present TensorFlow Extended (TFX), a TensorFlow-based general-purpose machine learning platform implemented at Google. By integrating the aforementioned components into one platform, we were able to standardize the components, simplify the platform configuration, and reduce the time to production from the order of months to weeks, while providing platform stability that minimizes disruptions.

We present the case study of one deployment of TFX in the Google Play app store, where the machine learning models are refreshed continuously as new data arrive. Deploying TFX led to reduced custom code, faster experiment cycles, and a 2% increase in app installs resulting from improved data and model analysis.

Research Areas

Machine Intelligence
