Gal Yona
Gal Yona is a Research Scientist at Google Research, Tel Aviv, where she is working on improving factuality in large language models, with an emphasis on robustness and uncertainty. Before joining Google, Gal completed her PhD in Computer Science at the Weizmann Institute of Science, developing definitions and algorithms for preventing discrimination in machine learning models. Gal received numerous awards during her PhD, including the Blavatnik Prizes for Outstanding Israeli Doctoral Students in Computer Science (2022) and the Google PhD Fellowship in Machine Learning (2021).
Authored Publications
Useful Confidence Measures: Beyond the Max Score
NeurIPS 2022 Workshop on Distribution Shifts (DistShift) (2022) (to appear)
Abstract
An important component in deploying machine learning (ML) in safety-critical applications is having a reliable measure of confidence in the ML's predictions. For a classifier $f$ producing a probability vector $f(x)$ over the candidate classes, the confidence is typically taken to be $\max_i f(x)_i$. This approach is potentially limited, as it disregards the rest of the probability vector. In this work, we derive several confidence measures that depend on information beyond the maximum score, such as margin-based and entropy-based measures, and empirically evaluate their usefulness. We focus on NLP tasks and Transformer-based models. We show that in the "out of the box" regime (where the scores of $f$ are used as is), using only the maximum score to inform the confidence measure is highly suboptimal. In the post-processing regime (where the scores of $f$ can be improved using additional held-out data), this remains true (though the differences are less pronounced), with entropy-based confidence emerging as a surprisingly useful measure.
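For illustration, here is a minimal sketch (not code from the paper) of the three kinds of confidence measures the abstract refers to, computed from a classifier's probability vector; the margin and entropy definitions below are one common choice and may differ from the exact measures studied in the work:

```python
import numpy as np

def max_score_confidence(probs: np.ndarray) -> float:
    """The standard confidence measure: the largest class probability."""
    return float(np.max(probs))

def margin_confidence(probs: np.ndarray) -> float:
    """Gap between the top two class probabilities (one common margin-based measure)."""
    top_two = np.sort(probs)[-2:]
    return float(top_two[1] - top_two[0])

def entropy_confidence(probs: np.ndarray) -> float:
    """Negative entropy of the full probability vector; higher means more confident."""
    eps = 1e-12  # avoid log(0)
    return float(np.sum(probs * np.log(probs + eps)))

# A 4-class prediction where the max score alone hides how uncertain the model is:
probs = np.array([0.40, 0.35, 0.15, 0.10])
print(max_score_confidence(probs))  # 0.40
print(margin_confidence(probs))     # 0.05 -- the top two classes are nearly tied
print(entropy_confidence(probs))    # about -1.25 (minimum is -ln(4) ~ -1.39), i.e. high entropy
```

The point of the example: the maximum score (0.40) looks moderately confident on its own, while the small margin and high entropy of the full vector tell a different story.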
Active Learning with Label Comparisons
Shay Moran
Uncertainty in Artificial Intelligence (submitted) (2022)
Abstract
Supervised learning typically relies on manual annotation of the true labels. However, when there are many potential labels, it can be time-consuming for a human annotator to search through them for the best one. On the other hand, comparing two candidate labels is often much easier. In this paper, we focus on this type of pairwise supervision, and ask how it can be used effectively in learning, and in particular active learning. We obtain several surprising results in this context. In principle, finding the best label out of $k$ can be done with $k-1$ active queries. However, we show that there is a natural class where this approach is in fact sub-optimal, and that there is a more comparison-efficient active learning scheme. A key element in our analysis is the "label neighborhood graph" of the true distribution, which has an edge between two classes if they share a decision boundary. We also show that in the PAC setting, pairwise comparisons cannot provide improved sample complexity in the worst case. We complement our theoretical results with experiments, clearly demonstrating the effect of the neighborhood graph on sample complexity.
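As a concrete illustration of the $k-1$ baseline mentioned in the abstract, below is a small sketch (hypothetical names, not code from the paper) that finds the best label with a single elimination pass of pairwise comparisons; the paper's result is that for some natural classes this simple scheme is sub-optimal compared to strategies that exploit the label neighborhood graph.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")

def best_label_by_comparisons(labels: Sequence[T], prefer: Callable[[T, T], T]) -> T:
    """Find the best of k candidate labels with exactly k-1 pairwise comparisons.

    `prefer(a, b)` stands in for an annotator who, shown two candidate labels
    for the same example, returns the better one. A single pass that keeps a
    running "champion" issues len(labels) - 1 such queries.
    """
    best = labels[0]
    for candidate in labels[1:]:
        best = prefer(best, candidate)
    return best

# Toy usage: the "annotator" is simulated by a scoring function.
scores = {"cat": 0.2, "dog": 0.9, "fox": 0.5, "owl": 0.7}
labels = list(scores)
winner = best_label_by_comparisons(labels, lambda a, b: a if scores[a] >= scores[b] else b)
print(winner)  # "dog", found with 3 comparisons for k = 4 labels
```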