I found that `HierarchicalCopyAllReduce` is much slower than `NcclAllReduce` in multi-GPU training. See the related issue: "related issues of multi-Gpus training" (Issue #971 · google/automl · GitHub). Any ideas?