Adaptive mixing of auxiliary losses in supervised learning

Durga Sivasubramanian; Ayush Maheshwari; Prathosh AP; Pradeep Shenoy; Ganesh Ramakrishnan

AAAI 2023 (to appear)

Abstract

In several supervised learning scenarios, auxiliary losses are used to introduce additional information or constraints into the supervised learning objective. For instance, knowledge distillation aims to mimic the outputs of a powerful teacher model; similarly, in rule-based approaches, weak labeling information is provided by labeling functions, which may be noisy rule-based approximations to the true labels. We tackle the problem of learning to combine these losses in a principled manner. Our proposal, AMAL, uses a bi-level optimization criterion on validation data to learn optimal mixing weights, at the instance level, over the training data. We describe a meta-learning approach to solving this bi-level objective, and show how it can be applied to different scenarios in supervised learning. Experiments in a number of knowledge distillation and rule denoising domains show that AMAL provides noticeable gains over competitive baselines in those domains. We empirically analyze our method and share insights into the mechanisms through which it provides performance gains.
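
The idea of instance-level mixing weights learned against validation data can be illustrated with a toy sketch. The following is not the AMAL algorithm from the paper. It is a minimal illustration under several assumptions: a scalar model `theta`, squared-error losses for both the label term and the teacher term, a one-step look-ahead approximation of the bi-level objective, and hand-derived gradients. The function names (`mixed_grad`, `val_grad`, `train`) and all hyperparameter values are hypothetical.

```python
# Toy sketch (assumption, not the paper's method): each training instance i
# has a noisy label y[i], a teacher output t[i], and a mixing weight lam[i]:
#   loss_i(theta) = lam[i] * (theta - y[i])**2 + (1 - lam[i]) * (theta - t[i])**2
# The weights are tuned so that one look-ahead step on the mixed training
# loss lowers the squared error on a small clean validation set v.


def mixed_grad(theta, y, t, lam):
    """Gradient w.r.t. theta of the mean per-instance mixed loss."""
    total = 0.0
    for yi, ti, li in zip(y, t, lam):
        total += li * 2.0 * (theta - yi) + (1.0 - li) * 2.0 * (theta - ti)
    return total / len(y)


def val_grad(theta, v):
    """Gradient w.r.t. theta of the mean squared validation loss."""
    return sum(2.0 * (theta - vj) for vj in v) / len(v)


def train(y, t, v, steps=300, lr=0.1, meta_lr=5.0):
    n = len(y)
    theta = 0.0
    lam = [0.5] * n  # start with equal trust in label and teacher
    for _ in range(steps):
        # Inner problem: look-ahead step on the mixed training loss.
        theta_next = theta - lr * mixed_grad(theta, y, t, lam)

        # Outer problem: chain rule through the look-ahead step.
        # d(theta_next)/d(lam[i]) = -(lr / n) * 2 * (t[i] - y[i])
        g_val = val_grad(theta_next, v)
        for i in range(n):
            dtheta_dlam = -lr * 2.0 * (t[i] - y[i]) / n
            updated = lam[i] - meta_lr * g_val * dtheta_dlam
            lam[i] = min(1.0, max(0.0, updated))  # keep weights in [0, 1]

        # Actual model update with the refreshed mixing weights.
        theta = theta - lr * mixed_grad(theta, y, t, lam)
    return theta, lam


if __name__ == "__main__":
    y = [1.0, 1.0, 5.0]  # third label is corrupted
    t = [1.2, 1.2, 1.2]  # teacher is slightly biased but consistent
    v = [1.0, 1.0]       # small clean validation set
    theta, lam = train(y, t, v)
    print(theta, lam)
```

In this toy setup the weight on the corrupted label is driven toward zero, so that instance falls back on the teacher signal, while the cleanly labeled instances end up trusting their labels. A realistic implementation would replace the scalar model and hand-derived gradients with a neural network and automatic differentiation.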

Research Areas

Machine intelligence
