- Video, 23:50, 219.9 MB, Published by ACM
FFT-based Gradient Sparsification for the Distributed Training of Deep Neural Networks
The performance and efficiency of distributed training of Deep Neural Networks (DNNs) depend heavily on the performance of gradient averaging among participating processes, a step bound by communication costs. There are two major approaches to reduce ...
- Video, 24:59, 440.6 MB, Published by ACM
DCDB Wintermute: Enabling Online and Holistic Operational Data Analytics on HPC Systems
As we approach the exascale era, the size and complexity of HPC systems continue to increase, raising concerns about their manageability and sustainability. For this reason, more and more HPC centers are experimenting with fine-grained monitoring ...