Jinsung Yoon
I am a research scientist at Google Cloud AI, currently working on machine learning research topics including generative models, self- and semi-supervised learning, model interpretation, data imputation, and synthetic data generation.
Previously, I worked on machine learning for medicine with Professor Mihaela van der Schaar as a graduate student researcher in the Electrical and Computer Engineering Department at UCLA. I received my Ph.D. and M.S. in Electrical and Computer Engineering from UCLA, and my B.S. in Electrical and Computer Engineering from Seoul National University (SNU).
Google Scholar: https://scholar.google.com/citations?user=kiFd6A8AAAAJ&hl=en&oi=ao
Authored Publications
ASPEST: Bridging the Gap Between Active Learning and Selective Prediction
Somesh Jha
Transactions on Machine Learning Research (TMLR) (2024)
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain. These predictions can then be deferred to humans for further evaluation. An enduring challenge for machine learning is that in many real-world scenarios the distribution of test data differs from that of the training data. This results in more inaccurate predictions and often increased dependence on humans, which can be difficult and expensive. Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples. Selective prediction and active learning have been approached from different angles, with the connection between them largely unexplored. In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain while increasing accuracy and coverage. For this new paradigm, we propose a simple yet effective approach, ASPEST, which uses ensembles of model snapshots and self-training with their aggregated outputs as pseudo labels. Extensive experiments on numerous image, text, and structured datasets that suffer from domain shift demonstrate that ASPEST significantly outperforms prior work on selective prediction and active learning (e.g., on the MNIST→SVHN benchmark with a labeling budget of 100, ASPEST improves the AUACC metric from 79.36% to 88.84%) and achieves better utilization of humans in the loop.
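As a rough illustration of the snapshot-ensemble self-training idea above, the following sketch averages the softmax outputs of several model snapshots and keeps only confident predictions as pseudo labels. The confidence threshold, snapshot count, and toy data are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of checkpoint-ensemble pseudo-labeling, under illustrative settings.
import numpy as np

def ensemble_pseudo_labels(snapshot_probs, confidence_threshold=0.9):
    """snapshot_probs: (n_snapshots, n_samples, n_classes) softmax outputs.
    Returns indices and pseudo labels for samples whose aggregated confidence
    exceeds the threshold."""
    mean_probs = snapshot_probs.mean(axis=0)        # aggregate over snapshots
    confidence = mean_probs.max(axis=1)             # aggregated confidence
    pseudo_labels = mean_probs.argmax(axis=1)       # aggregated prediction
    keep = np.where(confidence >= confidence_threshold)[0]
    return keep, pseudo_labels[keep]

# Toy usage: 3 snapshots, 5 unlabeled target samples, 2 classes.
rng = np.random.default_rng(0)
probs = rng.dirichlet([1.0, 1.0], size=(3, 5))      # shape (3, 5, 2)
idx, labels = ensemble_pseudo_labels(probs, confidence_threshold=0.8)
print(idx, labels)
```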
We study anomaly clustering, grouping data into coherent clusters of anomaly types. This is different from anomaly detection, which aims to separate anomalies from normal data. Unlike object-centered image clustering, anomaly clustering is particularly challenging because anomalous patterns are subtle and local. We present a simple yet effective clustering framework using patch-based pretrained deep embeddings and off-the-shelf clustering methods. We define a distance function between images, each of which is represented as a bag of embeddings, by the Euclidean distance between weighted average embeddings. The weights define the importance of instances (i.e., patch embeddings) in the bag, which may highlight defective regions. We compute the weights in an unsupervised way, or in a semi-supervised way when labeled normal data is available. Extensive experimental studies show the effectiveness of the proposed clustering framework, along with the novel distance function, over existing multiple-instance or deep clustering frameworks. Overall, our framework achieves 0.451 and 0.674 normalized mutual information scores on MVTec object and texture categories, and further improves with a few labeled normal samples (0.577, 0.669), far exceeding the baselines (0.244, 0.273) and state-of-the-art deep clustering methods (0.176, 0.277).
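The distance function described above is simple to sketch: each image is a bag of patch embeddings, collapsed to a weighted average before taking a Euclidean distance. The uniform weights and random embeddings below are placeholders; the framework computes weights that can emphasize defective patches.

```python
# Minimal sketch of the bag-of-embeddings distance, assuming precomputed patch embeddings.
import numpy as np

def bag_distance(patches_a, weights_a, patches_b, weights_b):
    """Euclidean distance between two images, each represented as a bag of
    patch embeddings collapsed to a weighted average embedding."""
    avg_a = np.average(patches_a, axis=0, weights=weights_a)
    avg_b = np.average(patches_b, axis=0, weights=weights_b)
    return np.linalg.norm(avg_a - avg_b)

# Toy usage: two images with 4 patch embeddings of dimension 8 each.
rng = np.random.default_rng(0)
emb_a, emb_b = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
uniform = np.ones(4) / 4                  # uniform weights for illustration only
print(bag_distance(emb_a, uniform, emb_b, uniform))
```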
Adaptation with Self-Evaluation to Improve Selective Prediction in LLMs
Somesh Jha
Findings of the Association for Computational Linguistics: EMNLP (2023)
Large language models (LLMs) have recently shown great advances in a variety of tasks, including natural language understanding and generation. However, their use in high-stakes decision-making scenarios is still limited due to the potential for errors. Selective prediction is a technique that can be used to improve the reliability of LLMs by allowing them to abstain from making predictions when they are unsure of the answer. In this work, we propose a novel framework for adaptation with self-evaluation to improve the selective prediction performance of LLMs. Our framework is based on the idea of using parameter-efficient tuning to adapt the LLM to the specific task at hand while improving its ability to perform self-evaluation. We evaluate our method on a variety of question-answering (QA) datasets and show that it outperforms state-of-the-art selective prediction methods. For example, on the CoQA benchmark, our method improves the AUACC from 91.23% to 92.63% and the AUROC from 74.61% to 80.25%.
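At inference, selective prediction for an LLM reduces to a thresholding rule on some confidence signal, here a self-evaluation score. The sketch below is a hedged illustration: `generate_answer` and `self_evaluation_score` are hypothetical stand-ins for an adapted model's generation and its learned self-evaluation, and the threshold is illustrative.

```python
# Minimal sketch of threshold-based abstention on a self-evaluation score.
from typing import Callable, Optional

def selective_answer(question: str,
                     generate_answer: Callable[[str], str],
                     self_evaluation_score: Callable[[str, str], float],
                     threshold: float = 0.8) -> Optional[str]:
    """Return the model's answer only when its self-evaluation score is high
    enough; otherwise abstain (return None) and defer to a human."""
    answer = generate_answer(question)
    score = self_evaluation_score(question, answer)
    return answer if score >= threshold else None

# Toy usage with dummy callables standing in for the adapted LLM.
answer = selective_answer(
    "What year did the first moon landing happen?",
    generate_answer=lambda q: "1969",
    self_evaluation_score=lambda q, a: 0.93,
)
print(answer)  # "1969"; a score below the threshold would yield None (abstain)
```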
SPADE: Semi-supervised Anomaly Detection under Distribution Mismatch
Chun-Liang Li
Kihyuk Sohn
Transactions on Machine Learning Research (TMLR) (2023)
Semi-supervised anomaly detection is a common problem, as datasets containing anomalies are often only partially labeled. We propose a canonical framework, Semi-supervised Pseudo-labeler Anomaly Detection with Ensembling (SPADE), that is not limited by the assumption that labeled and unlabeled data come from the same distribution. Indeed, this assumption is violated in many applications -- for example, the labeled data may contain only anomalies, unlike the unlabeled data; the unlabeled data may contain different types of anomalies; or the labeled data may contain only 'easy-to-label' samples. SPADE utilizes an ensemble of one-class classifiers as the pseudo-labeler to improve the robustness of pseudo-labeling under distribution mismatch. Partial matching is proposed to automatically select the critical hyperparameters for pseudo-labeling without validation data, which is crucial when labeled data is limited. SPADE shows state-of-the-art semi-supervised anomaly detection performance across a wide range of scenarios with distribution mismatch in both tabular and image domains. In some common real-world settings, such as a model facing new types of unlabeled anomalies, SPADE outperforms the state-of-the-art alternatives by 5% AUC on average.
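A minimal sketch of the ensemble pseudo-labeling idea follows: several one-class classifiers are trained on labeled normal data and pseudo-label only the unlabeled points on which all members agree. The choice of IsolationForest, bootstrap resampling, and the unanimous-vote rule are illustrative assumptions; SPADE additionally selects thresholds via partial matching.

```python
# Minimal sketch of pseudo-labeling unlabeled data with an ensemble of one-class classifiers.
import numpy as np
from sklearn.ensemble import IsolationForest

def occ_ensemble_pseudo_labels(x_labeled_normal, x_unlabeled, n_members=5, seed=0):
    """Train several one-class classifiers on bootstrap samples of labeled normal
    data and pseudo-label unlabeled points on which all members agree."""
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_members):
        idx = rng.choice(len(x_labeled_normal), size=len(x_labeled_normal), replace=True)
        occ = IsolationForest(random_state=int(rng.integers(10**6)))
        occ.fit(x_labeled_normal[idx])
        votes.append(occ.predict(x_unlabeled))      # +1 = normal, -1 = anomaly
    votes = np.stack(votes)
    pseudo_normal = np.where((votes == 1).all(axis=0))[0]
    pseudo_anomaly = np.where((votes == -1).all(axis=0))[0]
    return pseudo_normal, pseudo_anomaly

# Toy usage: Gaussian normal data plus unlabeled data with a shifted anomalous cluster.
rng = np.random.default_rng(0)
normals = rng.normal(0, 1, size=(200, 4))
unlabeled = np.vstack([rng.normal(0, 1, size=(50, 4)), rng.normal(6, 1, size=(10, 4))])
print(occ_ensemble_pseudo_labels(normals, unlabeled))
```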
Self-supervise, Refine, Repeat: Improving Unsupervised Anomaly Detection
Chun-Liang Li
Kihyuk Sohn
Transactions on Machine Learning Research (TMLR) (2022)
Anomaly detection (AD), separating anomalies from normal data, has many applications across domains, from security to healthcare. While most previous works have been shown to be effective for cases with fully or partially labeled data, that setting is less common in practice because labeling is particularly tedious for this task. In this paper, we focus on fully unsupervised AD, in which the entire training dataset, containing both normal and anomalous samples, is unlabeled. To tackle this problem effectively, we propose to improve the robustness of one-class classification trained on self-supervised representations using a data refinement process. Our proposed data refinement approach is based on an ensemble of one-class classifiers (OCCs), each of which is trained on a disjoint subset of the training data. Representations learned by self-supervised learning on the refined data are iteratively updated as the data refinement improves. We demonstrate our method on various unsupervised AD tasks with image and tabular data. With a 10% anomaly ratio on CIFAR-10 image data and a 2.5% anomaly ratio on Thyroid tabular data, the proposed method outperforms the state-of-the-art one-class classifier by 6.3 AUC and 12.5 average precision, and by 22.9 F1-score, respectively.
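One round of the refinement step described above can be sketched as follows: the unlabeled training set is split into disjoint subsets, one one-class classifier is trained per subset, and only samples that every classifier scores as normal are kept. Representation learning is omitted here, and the classifier choice (OneClassSVM) and unanimous agreement rule are illustrative assumptions.

```python
# Minimal sketch of one data-refinement round with OCCs trained on disjoint subsets.
import numpy as np
from sklearn.svm import OneClassSVM

def refine_training_data(x, n_splits=5, seed=0):
    """Keep only samples that every one-class classifier in the ensemble
    (each trained on a disjoint subset) predicts as normal."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(x))
    splits = np.array_split(perm, n_splits)
    occs = [OneClassSVM(nu=0.1).fit(x[idx]) for idx in splits]
    votes = np.stack([occ.predict(x) for occ in occs])   # +1 normal, -1 anomaly
    keep = (votes == 1).all(axis=0)
    return x[keep]

# Toy usage: mostly normal points contaminated by a small anomalous cluster.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0, 1, size=(450, 4)), rng.normal(8, 1, size=(50, 4))])
refined = refine_training_data(data)
print(data.shape, "->", refined.shape)
```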
Algorithmic fairness in pandemic forecasting: lessons from COVID-19
Thomas Tsai
Benjamin Jacobson
Nate Yoder
Dario Sava
Meg Mitchell
Garth Graham
npj Digital Medicine (2022)
Racial and ethnic minorities have borne a particularly acute burden of the COVID-19 pandemic in the United States. There is a growing awareness from both researchers and public health leaders of the critical need to ensure fairness in forecast results. Without careful and deliberate bias mitigation, inequities embedded in data can be transferred to model predictions, perpetuating disparities, and exacerbating the disproportionate harms of the COVID-19 pandemic. These biases in data and forecasts can be viewed through both statistical and sociological lenses, and the challenges of both building hierarchical models with limited data availability and drawing on data that reflects structural inequities must be confronted. We present an outline of key modeling domains in which unfairness may be introduced and draw on our experience building and testing the Google-Harvard COVID-19 Public Forecasting model to illustrate these challenges and offer strategies to address them. While targeted toward pandemic forecasting, these domains of potentially biased modeling and concurrent approaches to pursuing fairness present important considerations for equitable machine-learning innovation.
Understanding black-box machine learning models is crucial for their widespread adoption. Learning globally interpretable models is one approach, but achieving high performance with them is challenging. An alternative approach is to explain individual predictions using locally interpretable models. For locally interpretable modeling, various methods have been proposed and are commonly used, but they suffer from low fidelity, i.e., their explanations do not approximate the predictions well. In this paper, our goal is to push the state of the art in high-fidelity locally interpretable modeling. We propose a novel framework, Locally Interpretable Modeling using Instance-wise Subsampling (LIMIS). LIMIS utilizes a policy gradient to select a small number of instances and distills the black-box model into a low-capacity locally interpretable model using those selected instances. Training is guided by a reward obtained directly by measuring the fidelity of the locally interpretable models. We show on multiple tabular datasets that LIMIS nearly matches the prediction accuracy of black-box models, significantly outperforming state-of-the-art locally interpretable models in terms of fidelity and prediction accuracy.
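The following simplified sketch conveys the distillation-and-fidelity part of the idea: for a query point, a small set of instances is selected, a low-capacity linear model is fitted on the black box's predictions for those instances, and fidelity is measured as the (negated) approximation error. The learned policy-gradient selector is replaced here by plain nearest-neighbor selection purely for illustration, so this is not LIMIS itself.

```python
# Simplified sketch of locally interpretable distillation with instance-wise subsampling.
import numpy as np
from sklearn.linear_model import Ridge

def local_explanation(x_query, x_pool, blackbox_predict, n_instances=20):
    """Fit a low-capacity linear model on instances selected for the query point,
    distilling the black box locally, and report its fidelity."""
    dists = np.linalg.norm(x_pool - x_query, axis=1)
    selected = x_pool[np.argsort(dists)[:n_instances]]   # stand-in for the learned policy
    targets = blackbox_predict(selected)                  # distillation targets
    local_model = Ridge(alpha=1.0).fit(selected, targets)
    fidelity = -np.mean((local_model.predict(selected) - targets) ** 2)
    return local_model, fidelity

# Toy usage with a nonlinear "black box".
rng = np.random.default_rng(0)
pool = rng.normal(size=(500, 3))
blackbox = lambda x: np.sin(x[:, 0]) + x[:, 1] ** 2
model, fidelity = local_explanation(pool[0], pool, blackbox)
print(model.coef_, fidelity)
```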
We propose a novel training method that integrates rules into deep learning in a way that the strength of the rules is controllable at inference. Deep Neural Networks with Controllable Rule Representations (DeepCTRL) incorporates a rule encoder into the model, coupled with a rule-based objective, enabling a shared representation for decision making. DeepCTRL is agnostic to data type and model architecture, and can be applied to any kind of rule defined for inputs and outputs. The key aspect of DeepCTRL is that it does not require retraining to adapt the rule strength -- at inference, the user can adjust it based on the desired operating point on accuracy vs. rule verification ratio. In real-world domains where incorporating rules is critical -- such as physics, retail, and healthcare -- we show the effectiveness of DeepCTRL in teaching rules to deep learning models. DeepCTRL improves the trust and reliability of the trained models by significantly increasing their rule verification ratio, while also providing accuracy gains on downstream tasks. Additionally, DeepCTRL enables novel use cases such as hypothesis testing of rules on data samples and unsupervised adaptation based on rules shared between datasets.
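A minimal sketch of the controllable-strength idea is given below: a data encoder and a rule encoder feed a shared decision block, and a mixing coefficient alpha chosen at inference trades rule adherence against data-driven accuracy without retraining. Layer sizes and the exact way alpha enters the model are illustrative assumptions rather than the paper's configuration.

```python
# Minimal sketch of a DeepCTRL-style controllable rule/data mixture.
import torch
import torch.nn as nn

class ControllableRuleModel(nn.Module):
    def __init__(self, input_dim, hidden_dim=32):
        super().__init__()
        self.data_encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.rule_encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.decision = nn.Linear(hidden_dim, 1)

    def forward(self, x, alpha):
        # alpha = 0 -> purely data-driven; alpha = 1 -> fully rule-driven.
        z = alpha * self.rule_encoder(x) + (1 - alpha) * self.data_encoder(x)
        return self.decision(z)

model = ControllableRuleModel(input_dim=4)
x = torch.randn(8, 4)
# The same weights serve any operating point; no retraining is needed to change
# the rule strength at inference.
y_data_driven = model(x, alpha=0.1)
y_rule_driven = model(x, alpha=0.9)
print(y_data_driven.shape, y_rule_driven.shape)
```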
We present a two-stage framework for deep one-class classification: in the first stage, we learn self-supervised deep representations from one-class data, and in the second we build a classifier using generative or discriminative models on the learned representations. In particular, we present a novel distribution-augmented contrastive learning approach that extends training distributions via data augmentation to obstruct the uniformity of vanilla contrastive representations, yielding representations more suitable for one-class classification. Moreover, we argue that classifiers inspired by the statistical perspective, in generative or discriminative ways, are more effective than existing approaches such as averaging normality scores from a surrogate classifier. In experiments, we demonstrate state-of-the-art performance on visual one-class classification benchmarks. Beyond learning a better representation, the proposed framework permits building one-class classifiers that are more faithful to the target task. Finally, we present visual explanations confirming that the decision-making process of our deep one-class classifier is human-intuitive.
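The second stage can be illustrated with a very simple generative classifier: fit a single Gaussian to the learned one-class representations and score new samples by their squared Mahalanobis distance. The representation matrix below is a random placeholder, and the single-Gaussian choice is one illustrative option among the generative or discriminative classifiers discussed above; the first-stage contrastive training is omitted.

```python
# Minimal sketch of a Gaussian classifier on learned one-class representations.
import numpy as np

def fit_gaussian(representations):
    mean = representations.mean(axis=0)
    cov = np.cov(representations, rowvar=False) + 1e-6 * np.eye(representations.shape[1])
    return mean, np.linalg.inv(cov)

def anomaly_score(z, mean, cov_inv):
    diff = z - mean
    return float(diff @ cov_inv @ diff)     # squared Mahalanobis distance

rng = np.random.default_rng(0)
train_repr = rng.normal(0, 1, size=(500, 16))   # stand-in for learned features
mean, cov_inv = fit_gaussian(train_repr)
print(anomaly_score(rng.normal(0, 1, size=16), mean, cov_inv))   # in-distribution: low score
print(anomaly_score(rng.normal(5, 1, size=16), mean, cov_inv))   # shifted sample: high score
```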
In this work, we aim to construct a high-performance model for defect detection that detects unknown anomalous patterns in an image without anomalous data. To this end, we propose a simple two-stage framework for building anomaly detectors using normal training data only, where we first learn self-supervised deep representations and then build a generative one-class classifier on the learned representations. We learn representations by classifying normal data against CutPaste-augmented data, where CutPaste is a simple data augmentation strategy that cuts an image patch and pastes it at a random location of a large image. Our empirical study on the MVTec anomaly detection dataset demonstrates that the proposed algorithm generalizes to detecting various types of real-world defects. We improve upon prior art by 3 AUC points when learning representations from scratch. By transferring representations from an ImageNet-pretrained model, we achieve a new state-of-the-art 96.6 AUC. Lastly, we extend the framework to learn and extract representations from patches, allowing localization of defective areas without the need for annotation.
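The augmentation itself is straightforward to sketch: cut a rectangular patch from a normal image and paste it back at a different random location, yielding a synthetic irregularity to classify against unmodified images. The patch-size range below is an illustrative assumption; rotation and color-jitter variants used in practice are omitted.

```python
# Minimal sketch of a CutPaste-style augmentation on a (H, W, C) uint8 image.
import numpy as np

def cutpaste(image, rng, min_frac=0.05, max_frac=0.15):
    """Cut a random rectangular patch and paste it at another random location."""
    h, w = image.shape[:2]
    ph = int(h * rng.uniform(min_frac, max_frac))
    pw = int(w * rng.uniform(min_frac, max_frac))
    sy, sx = rng.integers(0, h - ph), rng.integers(0, w - pw)   # source location
    patch = image[sy:sy + ph, sx:sx + pw].copy()
    dy, dx = rng.integers(0, h - ph), rng.integers(0, w - pw)   # paste location
    augmented = image.copy()
    augmented[dy:dy + ph, dx:dx + pw] = patch
    return augmented

rng = np.random.default_rng(0)
normal_image = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)
augmented_image = cutpaste(normal_image, rng)
print(augmented_image.shape)
```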