Conformal prediction under ambiguous ground truth
Abstract
In safety-critical classification tasks, conformal prediction enables rigorous uncertainty quantification by providing confidence sets that include the true class with a user-specified probability. This generally assumes the availability of a held-out calibration set with access to ground truth labels. Unfortunately, in many domains such labels are difficult to obtain and are usually approximated by aggregating expert opinions. In fact, this holds for almost all datasets, including well-known ones such as CIFAR and ImageNet. When expert opinions are not resolvable, the labels carry inherent ambiguity: we do not have “crisp”, definitive ground truth labels, and this uncertainty should be taken into account during calibration. In this paper, we develop a conformal prediction framework for such settings with ambiguous ground truth, which relies on an approximation of the underlying posterior distribution of labels given inputs. We demonstrate our methodology on synthetic and real datasets, including a case study of skin condition classification in dermatology.
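For context, the following is a minimal sketch of the standard split conformal calibration that the abstract contrasts with, which assumes crisp (unambiguous) calibration labels; the function name, the choice of conformity score (one minus the true-class probability), and the array shapes are illustrative assumptions, not the paper's method.

```python
import numpy as np

def split_conformal_classification(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Standard split conformal prediction, assuming crisp calibration labels.

    cal_probs:  (n, K) predicted class probabilities on the calibration set
    cal_labels: (n,)   integer ground truth labels (assumed unambiguous)
    test_probs: (m, K) predicted class probabilities on test inputs
    alpha:      target miscoverage, i.e. coverage is at least 1 - alpha
    """
    n = len(cal_labels)
    # Conformity score: one minus the probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Conformal quantile with the usual finite-sample correction.
    q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q_hat = np.quantile(scores, q_level, method="higher")
    # Prediction set: every class whose score is below the threshold,
    # i.e. whose predicted probability is at least 1 - q_hat.
    return test_probs >= 1.0 - q_hat  # boolean (m, K) set-membership matrix
```

With ambiguous ground truth, the integer labels `cal_labels` assumed above are not available, which is the gap the paper's framework addresses via an approximate posterior over labels given inputs.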