Abstract

We consider straggler resiliency in decentralized learning using stochastic gradient descent under the notion of network differential privacy (DP). In particular, we extend the recently proposed framework of privacy amplification by decentralization by Cyffers and Bellet to include training latency—comprising both computation and communication latency. Analytical results on both the convergence speed and the DP level are derived for training over a logical ring, for both a skipping scheme (which ignores stragglers after a timeout) and a baseline scheme (which waits for each node to finish before training continues). Our analytical results reveal a tradeoff between training latency, accuracy, and privacy, parameterized by the timeout of the skipping scheme. Finally, experimental results for training a logistic regression model on a real-world dataset are presented.
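To illustrate the difference between the two schemes, the following is a minimal sketch of one token pass around a logical ring. The function name, latency model, and timeout semantics are illustrative assumptions, not the paper's actual algorithm: the baseline waits for every node, while the skipping scheme waits at most `timeout` per node and then moves on.

```python
def ring_round(latencies, timeout=None):
    """Simulate one pass of the token around a logical ring.

    latencies: per-node training latency (computation + communication).
    timeout=None  -> baseline scheme: wait for every node.
    timeout=t     -> skipping scheme: skip any node slower than t.

    Returns (participating_nodes, total_round_latency).
    """
    participants, total = [], 0.0
    for node, lat in enumerate(latencies):
        if timeout is not None and lat > timeout:
            # Waited until the timeout elapsed, then skipped the straggler.
            total += timeout
            continue
        participants.append(node)
        total += lat
    return participants, total
```

For example, with per-node latencies `[1.0, 5.0, 1.0]`, the baseline round takes 7.0 time units with all three nodes participating, while a timeout of 2.0 cuts the round to 4.0 units at the cost of node 1's contribution—the latency/accuracy side of the tradeoff the abstract describes.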