Small and Practical BERT models for Sequence Labeling

Henry Tsai; Jason Riesa; Melvin Johnson; Naveen Ari; Xin Li; Amelia Archer

EMNLP 2019 (to appear)

Abstract

We propose a practical scheme to train a single multilingual sequence labeling model that yields state-of-the-art results and is small and fast enough to run on a single CPU. Starting from a public multilingual BERT checkpoint, our final model is 34x smaller and 15x faster, and has higher accuracy than a state-of-the-art multilingual baseline. We show that our model outperforms the baseline especially on low-resource languages, and that it works on codemixed input text without being explicitly trained on codemixed examples. We demonstrate the effectiveness of our method by reporting on part-of-speech tagging and morphological prediction on 70 treebanks and 47 languages.

Research Areas

Natural language processing
