Massively Parallel Computation via Remote Memory Access

Laxman Dhulipala
Soheil Behnezhad
Warren Schudy
SPAA 2019

Abstract

We introduce the Adaptive Massively Parallel Computation (AMPC) model, an extension of the widely used Massively Parallel Computation (MPC) model. At a high level, the AMPC model strengthens the MPC model by storing all messages sent within a round in a distributed data store. In the following round, all machines are given random read access to the data store, subject to the same constraints on the total amount of communication as in the MPC model. Our model is inspired by previous empirical studies of distributed graph algorithms that use MapReduce together with a distributed hash table service. This extension allows us to give new graph algorithms with much lower round complexity than the best known solutions in the MPC model. In particular, in the AMPC model we show how to solve maximal independent set in O(1) rounds and connectivity/minimum spanning tree in O(log log_{m/n} n) rounds, an exponential improvement over the best known algorithms in the MPC model with sublinear space per machine. Our results imply that the 2-Cycle conjecture, the most popular hardness conjecture in the MPC model, does not hold in the AMPC model.
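
To make the adaptive read access concrete, the following is a minimal single-process sketch of one AMPC-style round, not the paper's formal definition. The names `ampc_round`, `make_walker`, and `budget` are hypothetical, a Python dict stands in for the distributed data store, and a per-machine read budget is used as a simplification of the model's communication constraint. The point it illustrates is that a read may depend on the result of an earlier read in the same round, so a machine can follow several pointers in one round.

```python
def ampc_round(prev_store, machine_inputs, step, budget):
    """Simulate one round: each machine reads prev_store adaptively
    (at most `budget` reads) and writes into a fresh store for the next round.

    All names here are illustrative assumptions, not part of the paper.
    """
    next_store = {}
    for local_input in machine_inputs:
        reads_left = budget

        def read(key):
            # Random read access to the previous round's data store.
            nonlocal reads_left
            if reads_left == 0:
                raise RuntimeError("communication budget exceeded")
            reads_left -= 1
            return prev_store.get(key)

        # The machine's output becomes visible only in the next round.
        for key, value in step(local_input, read):
            next_store[key] = value
    return next_store


def make_walker(hops):
    """Build a step function that follows up to `hops` successor pointers."""
    def step(start, read):
        current = start
        for _ in range(hops):
            successor = read(current)  # this key depends on the previous read
            if successor is None:
                break
            current = successor
        return [(start, current)]
    return step


if __name__ == "__main__":
    # Successor pointers of the path 0 -> 1 -> 2 -> 3 -> 4.
    store = {0: 1, 1: 2, 2: 3, 3: 4}
    print(ampc_round(store, [0, 1, 2, 3, 4], make_walker(3), budget=3))
    # prints {0: 3, 1: 4, 2: 4, 3: 4, 4: 4}
```

With `make_walker(1)` each machine can only learn its immediate successor, which roughly corresponds to the single, non-adaptive exchange available in an MPC round; with more hops under the same store, a machine advances several steps along the path in one round.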