DiPair: Fast and Accurate Distillation for Trillion-Scale Text Matching and Pair Modeling

Jiecao Chen
Jung-Jung Yeh
Yun Zhou
Ehsan Emadzadeh
Findings of EMNLP 2020

Abstract

Pre-trained models like BERT have dominated NLP/IR applications such as single-sentence classification, text pair classification, and question answering. However, deploying these models in real systems is highly non-trivial due to their exorbitant computational costs. A common remedy is knowledge distillation, which yields faster inference. However, as we show here, existing works are not optimized for dealing with pairs (or tuples) of texts. Consequently, they are either not scalable or demonstrate subpar performance. In this work, we propose DiPair, a novel framework for distilling fast and accurate models on text pair tasks. Coupled with an end-to-end training strategy, DiPair is both highly scalable and offers improved quality-speed tradeoffs. Empirical studies conducted on both academic and real-world e-commerce benchmarks demonstrate the efficacy of the proposed approach, with speedups of over 350x and minimal quality drop relative to the cross-attention teacher BERT model.
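
For context, knowledge distillation trains a compact student model to mimic a larger teacher. The snippet below is a minimal PyTorch sketch of the standard soft-label distillation objective (Hinton et al., 2015) that such frameworks typically build on; it is an illustrative sketch, not the DiPair training code, and the function name, temperature value, and toy tensors are assumptions for the example.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        # Soften both output distributions with the same temperature.
        soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
        log_student = F.log_softmax(student_logits / temperature, dim=-1)
        # KL divergence from teacher to student, scaled by T^2 so the
        # gradient magnitude stays consistent across temperatures.
        return F.kl_div(log_student, soft_teacher,
                        reduction="batchmean") * temperature ** 2

    # Toy example: a batch of 4 text pairs with binary match/no-match logits.
    student = torch.randn(4, 2)
    teacher = torch.randn(4, 2)
    loss = distillation_loss(student, teacher)

In practice this soft-label term is usually combined with the task's hard-label loss when training the student end to end.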