I found that `HierarchicalCopyAllReduce` is much slower than `NcclAllReduce`; see "related issues of multi-Gpus training · Issue #971 · google/automl · GitHub". Any ideas?