A Human-Centered Evaluation of a Deep Learning System Deployed in Clinics for the Detection of Diabetic Retinopathy

Emma Beede
Elizabeth Baylor
Fred Hersch
Lauren Wilcox
Dr. Paisan Ruamviboonsuk
Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (2020)


Deep learning algorithms promise to improve clinician workflows and patient outcomes. However, these gains have yet to be fully demonstrated in real-world clinical settings. In this paper, we describe a human-centered study of a deep learning system used in clinics for the detection of diabetic eye disease. Through observation and interviews with nurses at eleven clinics across Thailand, we characterize several socio-environmental factors that impact model performance, nursing workflows, and the patient experience. We find tensions between the model’s thresholds for data quality and the quality of data that arises from an imperfect, resource-constrained environment. We discuss several advantages to conducting human-centered evaluative research alongside prospective evaluations of model accuracy, including understanding contextual practices of clinicians and patients in order to inform system design, being able to use authentic clinical data in system evaluations, and understanding how the system operates within the context of clinical care prior to widespread deployment.