Hi,

Can you explain the difference between calling `Adam` from `tf.keras.optimizers` and from `tf.keras.optimizers.legacy`?
I’m using TensorFlow 2.14 with CUDA 11.2 on an RTX 3060 and 64 GB RAM. When training models like an autoencoder, my kernel crashes, even with small datasets (e.g., 100 images) and simple models. Monitoring system performance, I noticed a sudden spike in GPU usage just before the crash.
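For context, here is a minimal sketch of the kind of setup that crashes for me. The autoencoder architecture and the random dummy data below are placeholders to illustrate the pattern, not my exact model:

```python
import numpy as np
import tensorflow as tf

# Placeholder data: ~100 small grayscale images, like the datasets I tested with.
x_train = np.random.rand(100, 28, 28, 1).astype("float32")

# A simple dense autoencoder, stands in for my actual model.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
    tf.keras.layers.Dense(64, activation="relu"),          # encoder
    tf.keras.layers.Dense(28 * 28, activation="sigmoid"),  # decoder
    tf.keras.layers.Reshape((28, 28, 1)),
])

# Compiling with the new (non-legacy) optimizer is what precedes the crash on my machine.
model.compile(optimizer=tf.keras.optimizers.Adam(), loss="mse")
model.fit(x_train, x_train, epochs=5, batch_size=16)  # kernel dies during fit
```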
After trying many solutions, I found that using `Adam` from `tf.keras.optimizers.legacy` instead of `tf.keras.optimizers`, or passing `Adam` directly in `compile()`, solved the problem.
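Concretely, this is the swap that stops the crash for me (the learning rate is shown only as an example value):

```python
# Same model and data as above; only the optimizer changes.
from tensorflow.keras.optimizers.legacy import Adam as LegacyAdam

model.compile(optimizer=LegacyAdam(learning_rate=1e-3), loss="mse")
model.fit(x_train, x_train, epochs=5, batch_size=16)  # trains without crashing
```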
I’m curious why using the legacy version resolves the issue, and why TensorFlow didn’t provide any clear error output for the crash.