Clustered Federated Learning: Model-Agnostic Distributed Multitask Optimization Under Privacy Constraints
Abstract
Federated learning (FL) is currently the most widely
adopted framework for collaborative training of (deep) machine
learning models under privacy constraints. Despite its popularity,
it has been observed that FL yields suboptimal results if the
local clients’ data distributions diverge. To address this issue,
we present clustered FL (CFL), a novel federated multitask
learning (FMTL) framework, which exploits geometric properties
of the FL loss surface to group the client population into clusters
with jointly trainable data distributions. In contrast to existing
FMTL approaches, CFL requires no modification of the
FL communication protocol, is applicable to general
nonconvex objectives (in particular, deep neural networks), does
not require the number of clusters to be known a priori, and
comes with strong mathematical guarantees on the clustering
quality. CFL is flexible enough to handle client populations that
vary over time and can be implemented in a privacy-preserving
way. As clustering is only performed after FL has converged to a
stationary point, CFL can be viewed as a postprocessing method
that always achieves performance greater than or equal to
that of conventional FL by allowing clients to arrive at more specialized
models. We verify our theoretical analysis in experiments with
deep convolutional and recurrent neural networks on commonly
used FL data sets.
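
To make the postprocessing idea concrete, the following is a minimal NumPy sketch of one plausible instantiation: once joint FL has reached a stationary point at which the averaged client update is small while individual client updates remain large, the client population is bipartitioned by the cosine similarity of the clients' flattened weight updates. The thresholds eps1 and eps2, the similarity criterion, and the exhaustive bipartitioning search are illustrative assumptions made for this sketch; the abstract itself only states that geometric properties of the FL loss surface are exploited.

    # Illustrative CFL-style postprocessing step (a sketch, not the paper's
    # exact algorithm): cluster clients by the cosine similarity of their
    # weight updates at a stationary point of joint federated training.
    import numpy as np
    from itertools import combinations

    def cosine_similarity_matrix(updates):
        """Pairwise cosine similarities between flattened client updates."""
        U = np.stack([u / np.linalg.norm(u) for u in updates])
        return U @ U.T

    def should_split(updates, eps1=0.1, eps2=1.0):
        # Heuristic stationarity check (eps1, eps2 are assumed thresholds):
        # the averaged update is near zero while individual updates are
        # still large, suggesting clients pull toward incongruent optima.
        mean_norm = np.linalg.norm(np.mean(updates, axis=0))
        max_norm = max(np.linalg.norm(u) for u in updates)
        return mean_norm < eps1 and max_norm > eps2

    def bipartition(updates):
        # Split clients into two groups so that the largest cross-group
        # cosine similarity is minimized. The exhaustive search is
        # exponential in the number of clients and is meant purely for
        # illustration on small populations.
        n = len(updates)
        S = cosine_similarity_matrix(updates)
        best, best_score = None, np.inf
        for r in range(1, n // 2 + 1):
            for group in combinations(range(n), r):
                c1 = set(group)
                c2 = set(range(n)) - c1
                score = max(S[i, j] for i in c1 for j in c2)
                if score < best_score:
                    best, best_score = (sorted(c1), sorted(c2)), score
        return best

    # Toy usage: two latent clusters whose updates point in opposing
    # directions, as would happen with incongruent data distributions.
    rng = np.random.default_rng(0)
    base = rng.normal(size=10)
    updates = [base + 0.05 * rng.normal(size=10) for _ in range(3)] + \
              [-base + 0.05 * rng.normal(size=10) for _ in range(3)]
    if should_split(updates):
        print("clusters:", bipartition(updates))  # ([0, 1, 2], [3, 4, 5])

Applied recursively within each resulting cluster, a split of this kind would produce the increasingly specialized models described above while leaving the underlying FL communication protocol untouched.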