A Human-Centered Evaluation of a Deep Learning System Deployed in Clinics for the Detection of Diabetic Retinopathy
Abstract
Deep learning algorithms promise to improve clinician workflows and patient outcomes. However, these gains have yet to be fully demonstrated in real-world clinical settings. In this paper, we describe a human-centered study of a deep learning system used in clinics for the detection of diabetic eye disease. Through observation and interviews with nurses at eleven clinics across Thailand, we characterize several socio-environmental factors that impact model performance, nursing workflows, and patient experience. We find tensions between the model's thresholds for data quality and the quality of data produced in an imperfect, resource-constrained environment. We discuss several advantages of conducting human-centered evaluative research alongside prospective evaluations of model accuracy, including: understanding the contextual practices of clinicians and patients to inform system design, the ability to use authentic clinical data in system evaluations, and understanding how the system operates within the context of clinical care prior to widespread deployment.