Google Research

Data Valuation using Reinforcement Learning

Abstract

Quantifying the value of data is a fundamental problem in machine learning. Besides building insights about the learning task, data valuation has applications in diverse use cases, such as domain adaptation, corrupted sample discovery, and robust learning. To adaptively learn data values jointly with the predictive model, we propose a meta-learning framework named Data Valuation using Reinforcement Learning (DVRL). We employ a data value estimator, modeled by a deep neural network, to output the likelihood that each datum is used in training the predictive model. Training of the data value estimator is guided by a reinforcement signal based on a reward obtained directly from performance on the target task. We evaluate DVRL in various applications across multiple types of datasets. DVRL yields higher-quality data value estimates than alternative methods. Its corrupted sample discovery performance is close to optimal (i.e., as if the noisy samples were known a priori) in many regimes. For domain adaptation and robust learning tasks, DVRL yields significant gains: average performance improvements of 14.6% and 10.8%, respectively.
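The training loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; it only mirrors the stated structure: an estimator outputs a selection probability per datum, a predictor is trained on the sampled subset, and the estimator is updated with a REINFORCE-style signal from performance on a target (validation) set. Everything concrete here is an assumption made for illustration: the logistic estimator over hand-picked features (the abstract specifies a deep neural network), the nearest-centroid predictor, validation accuracy as the reward, the moving-average baseline, and all function names and hyperparameters.

```python
import math
import random


def make_toy_data(n=40, flip=0.2, seed=0):
    # Hypothetical 1-D toy task: class 0 near x=-1, class 1 near x=+1.
    rng = random.Random(seed)

    def draw(k, noisy):
        out = []
        for _ in range(k):
            y = rng.randint(0, 1)
            x = rng.gauss(1.0 if y else -1.0, 0.5)
            if noisy and rng.random() < flip:
                y = 1 - y  # corrupt some training labels
            out.append((x, y))
        return out

    return draw(n, True), draw(n, False)  # noisy train set, clean validation set


def features(x, y):
    # Assumed hand-picked features of an (input, label) pair.
    return [1.0, x, float(y), x * y]


def data_values(theta, data):
    # Estimator: probability that each datum is selected for training.
    out = []
    for x, y in data:
        z = sum(t * f for t, f in zip(theta, features(x, y)))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        out.append(1.0 / (1.0 + math.exp(-z)))
    return out


def fit_centroids(data, mask):
    # Stand-in predictor: per-class mean of the selected inputs.
    cents = {}
    for label in (0, 1):
        xs = [x for (x, y), s in zip(data, mask) if s and y == label]
        cents[label] = sum(xs) / len(xs) if xs else 0.0
    return cents


def accuracy(cents, data):
    # Predict class 1 when x is closer to the class-1 centroid.
    hits = sum(1 for x, y in data
               if (abs(x - cents[1]) < abs(x - cents[0])) == (y == 1))
    return hits / len(data)


def train_dvrl(train, valid, steps=300, lr=0.5, seed=0):
    rng = random.Random(seed)
    theta = [0.0] * 4
    baseline = 0.5  # moving-average baseline (an assumed variance-reduction choice)
    for _ in range(steps):
        values = data_values(theta, train)
        mask = [1 if rng.random() < v else 0 for v in values]
        reward = accuracy(fit_centroids(train, mask), valid)
        advantage = reward - baseline
        # REINFORCE: the gradient of log P(mask) for Bernoulli(sigmoid(z))
        # with respect to theta is (s - v) * features, averaged over the data.
        for (x, y), s, v in zip(train, mask, values):
            for j, f in enumerate(features(x, y)):
                theta[j] += lr * advantage * (s - v) * f / len(train)
        baseline = 0.9 * baseline + 0.1 * reward
    return theta
```

Calling `train_dvrl(*make_toy_data())` and then `data_values(theta, train)` returns one value in [0, 1] per training datum. In this toy setup one would hope that label-flipped samples receive lower values, but the sketch makes no claim about how well that works in practice.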
