# Fast and Accurate k-means++ via Rejection Sampling

• Ashkan Norouzi Fard
• Christian Sohler
• Ola Svensson
• Silvio Lattanzi
We present the first distributed approximation algorithm for the Euclidean $k$-median problem with an optimal trade-off between memory usage and the number of parallel rounds. Our algorithm even works in the setting where each machine has very limited memory $s\in \Omega(\log n)$ and it is work efficient. In the future, it would be interesting to obtain similar results for other clustering problems and to improve the approximation factor of our algorithm.