Slice Finder: Automated Data Slicing for Model Validation

Neoklis Polyzotis; Steven Whang; Tim Klas Kraska; Yeounoh Chung

Proceedings of the IEEE Int'l Conf. on Data Engineering (ICDE), 2019 (to appear)

Abstract

As machine learning (ML) systems become democratized,
helping users easily debug their models becomes increasingly
important. Yet current data tools are still primitive when
it comes to helping users trace model performance problems
all the way to the data. We focus on the particular problem
of slicing data to identify subsets of the training data
where the model performs poorly. Unlike general techniques
(e.g., clustering) that can find arbitrary slices, our goal is to
find interpretable slices (which are easier to act on than
arbitrary subsets) that are problematic and large.
We propose Slice Finder, which is an interactive framework
for identifying such slices using statistical techniques. The
slices can be used for applications like diagnosing model
fairness and fraud detection, where describing slices that are
interpretable to humans is necessary.
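
To make the idea of "interpretable, problematic, and large" slices concrete, here is a minimal sketch in Python. It is not the Slice Finder implementation: the abstract does not say which statistical techniques are used, so this sketch assumes single-feature equality predicates (e.g. country = DE) as the interpretable slices, a standardized difference in mean loss as the "problematic" score, and a minimum slice size as the "large" criterion. The function name `find_slices`, its parameters, and the toy data are all hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def find_slices(rows, losses, min_size=2, min_effect=0.5):
    """Return (feature, value, size, effect) tuples for high-loss slices.

    Assumptions of this sketch (not taken from the paper):
    - a slice is every example where one feature equals one value;
    - "problematic" means the slice's mean loss exceeds the mean loss of
      the remaining examples by at least `min_effect` pooled std. devs;
    - "large" means the slice has at least `min_size` examples.
    """
    results = []
    for feat in rows[0].keys():
        for val in sorted({r[feat] for r in rows}):
            inside = [l for r, l in zip(rows, losses) if r[feat] == val]
            outside = [l for r, l in zip(rows, losses) if r[feat] != val]
            if len(inside) < min_size or len(outside) < min_size:
                continue  # too small to compare meaningfully
            pooled = sqrt((pvariance(inside) + pvariance(outside)) / 2)
            if pooled == 0:
                continue  # no variation, effect size is undefined
            effect = (mean(inside) - mean(outside)) / pooled
            if effect >= min_effect:
                results.append((feat, val, len(inside), effect))
    # Most problematic first; break ties in favour of larger slices.
    results.sort(key=lambda t: (-t[3], -t[2]))
    return results


# Toy data: per-example features and per-example model loss.
rows = [
    {"country": "US", "device": "mobile"},
    {"country": "US", "device": "desktop"},
    {"country": "DE", "device": "mobile"},
    {"country": "DE", "device": "desktop"},
    {"country": "US", "device": "mobile"},
    {"country": "DE", "device": "desktop"},
]
losses = [0.1, 0.2, 0.9, 1.0, 0.15, 0.95]

for feat, val, size, effect in find_slices(rows, losses):
    print(f"{feat} = {val}: size={size}, effect={effect:.2f}")
```

Each reported slice is described by a single human-readable predicate, which is what makes it easier to act on than an arbitrary cluster of examples. A fuller treatment would also test statistical significance and consider conjunctions of predicates; both are omitted here for brevity.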

Research Areas

Machine intelligence
