Google Research

The Building Blocks of Interpretability

  • Christopher Olah
  • Arvind Satyanarayan
  • Ian Johnson
  • Shan Carter
  • Ludwig Schubert
  • Katherine Ye
  • Alexander Mordvintsev
Distill (2018)

Abstract

Interpretability techniques are normally studied in isolation. We explore the powerful interfaces that arise when you combine them -- and the rich structure of this combinatorial space.
