Linear-Time User-Level DP SCO via Robust Statistics
Abstract
User-level differentially private stochastic convex optimization (DP-SCO) has garnered significant attention due to the paramount importance of safeguarding user privacy in large-scale machine learning applications. Current methods based on Differentially Private Stochastic Gradient Descent (DP-SGD) often suffer from high noise accumulation and suboptimal utility because every intermediate iterate must be privatized. In this work, we introduce a novel linear-time algorithm that leverages robust statistics, specifically the geometric median and the trimmed mean, to overcome these challenges. By estimating gradients with robust statistics, our approach bounds the sensitivity of all intermediate SGD iterates, significantly reducing gradient-estimation noise and improving the privacy-utility trade-off. Because it sidesteps the repeated privatization required by previous methods, our algorithm achieves a better theoretical privacy-utility balance while remaining computationally efficient. This work sets the stage for more robust and efficient privacy-preserving techniques in machine learning, with implications for future research and applications in the field.
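As a rough illustration of the kind of robust gradient aggregation described above (a trimmed mean or geometric median over per-user gradients inside an SGD loop), consider the following Python sketch. The function names, the trim_frac parameter, and the loop structure are illustrative assumptions rather than the paper's actual algorithm, and the sketch deliberately omits the privacy mechanism (noise addition and sensitivity accounting) that the paper builds on top of this aggregation.

```python
import numpy as np

def trimmed_mean(vectors, trim_frac=0.1):
    """Coordinate-wise trimmed mean of per-user gradient vectors.

    Sorts each coordinate across users, drops the lowest and highest
    trim_frac fraction of values, and averages the rest. This bounds
    the influence any single user has on the aggregate gradient.
    """
    v = np.sort(np.asarray(vectors, dtype=float), axis=0)
    k = int(trim_frac * v.shape[0])
    return v[k : v.shape[0] - k].mean(axis=0)

def geometric_median(vectors, iters=50, eps=1e-8):
    """Approximate geometric median via Weiszfeld's algorithm."""
    v = np.asarray(vectors, dtype=float)
    m = v.mean(axis=0)  # initialize at the ordinary mean
    for _ in range(iters):
        d = np.maximum(np.linalg.norm(v - m, axis=1), eps)  # avoid /0
        w = 1.0 / d
        m_new = (w[:, None] * v).sum(axis=0) / w.sum()
        if np.linalg.norm(m_new - m) < eps:
            break
        m = m_new
    return m

def robust_sgd_step(theta, user_grads, lr=0.1, trim_frac=0.1):
    """One SGD step driven by a robust aggregate of per-user gradients."""
    g = trimmed_mean(user_grads, trim_frac)  # or geometric_median(user_grads)
    return theta - lr * g

# Example usage: 100 users each contribute one averaged gradient in R^5.
rng = np.random.default_rng(0)
theta = np.zeros(5)
user_grads = rng.normal(size=(100, 5))
theta = robust_sgd_step(theta, user_grads)
```

The intuition matching the abstract: because the robust aggregate changes little when any single user's gradients change, the sensitivity of each iterate stays bounded without privatizing every intermediate step.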