Semantic label smoothing for sequence to sequence problems

Michal Lukasik; Himanshu Jain; Aditya Krishna Menon; Seungyeon Kim; Srinadh Bhojanapalli; Felix Yu; Sanjiv Kumar

EMNLP (2020) (to appear)

Abstract

Label smoothing has been shown to be an effective regularization strategy in classification that prevents overfitting and helps with label de-noising. However, extending such methods directly to seq2seq settings, such as machine translation, is hindered by the large target output space, which makes it intractable to apply label smoothing over all possible outputs. Most existing approaches for seq2seq settings either smooth at the token level or smooth over sequences generated by randomly substituting tokens in the target sequence. Unlike these works, we propose a technique that smooths over well-formed, relevant sequences that not only have sufficient n-gram overlap with the target sequence but are also semantically similar to it. Our method shows a consistent and significant improvement over state-of-the-art techniques on different datasets.
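
To make the distinction in the abstract concrete, the following is a minimal Python sketch of token-level smoothing and of smoothing over a small set of related sequences. It is an illustration under stated assumptions, not the authors' implementation: the function names, the smoothing weight `alpha`, and the uniform weighting of related sequences are all hypothetical, and the step that selects related sequences (by n-gram overlap and semantic similarity) is not shown because the abstract does not specify it.

```python
import math


def token_level_smoothing(target_index, vocab_size, alpha=0.1):
    """Smoothed target distribution for one token.

    Moves a fraction `alpha` of the probability mass from the true token
    to a uniform distribution over the whole vocabulary.
    """
    dist = [alpha / vocab_size] * vocab_size
    dist[target_index] += 1.0 - alpha
    return dist


def sequence_level_smoothed_loss(log_prob_target, log_probs_related, alpha=0.1):
    """Mix the target's negative log-likelihood with that of related sequences.

    `log_probs_related` holds the model's log-probabilities for a small set
    of alternative sequences. How that set is chosen is left out here;
    uniform weighting over the set is an assumption made for illustration.
    """
    nll_target = -log_prob_target
    if not log_probs_related:
        # With no related sequences, fall back to the plain likelihood loss.
        return nll_target
    nll_related = -sum(log_probs_related) / len(log_probs_related)
    return (1.0 - alpha) * nll_target + alpha * nll_related


# Illustrative values only: a 5-token vocabulary and made-up model probabilities.
token_dist = token_level_smoothing(target_index=2, vocab_size=5, alpha=0.1)
loss = sequence_level_smoothed_loss(
    log_prob_target=math.log(0.5),
    log_probs_related=[math.log(0.2), math.log(0.1)],
    alpha=0.1,
)
```

The sequence-level variant avoids summing over the full output space by restricting the smoothing mass to a handful of candidate sequences, which is why the choice of those candidates matters.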

Research Areas

Natural language processing
