Browsing Applied Mathematics by Subject "analysis of stochastic gradient descent"
Bidirectional TopK Sparsification for Distributed Learning
(University of Waterloo, 2022-05-27) Training large neural networks requires a large amount of time. To speed up the process, distributed training is often used. One of the largest bottlenecks in distributed training is communicating gradients across different ...