Memory issue when starting to fit a model

Hello.
I have a dataset with 48k training and 12k validation images, each around 1 MB.
The model is a kind of UNET.
My machine is an i5 / 64 GB RAM / NVIDIA GeForce RTX 3080 Ti.
The problem is memory: if my batch size is greater than 2, I get out-of-memory errors.
I load the datasets with tf.keras.preprocessing.image_dataset_from_directory.

I use

import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        # Memory growth must be set before the GPUs are initialized
        print(e)

Is this normal behaviour for my dataset size, or is my data pipeline bad?

I could only train my model on 4 A100 GPUs, using

mirrored_strategy = tf.distribute.MirroredStrategy()
with mirrored_strategy.scope():
    ...

In that case I can use a batch size of 32.
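For context, a minimal runnable sketch of that multi-GPU setup (the tiny model and input shape here are placeholders, not the real UNET):

```python
import tensorflow as tf

# MirroredStrategy replicates the model across all visible GPUs
# (it falls back to a single device if none are available).
mirrored_strategy = tf.distribute.MirroredStrategy()

with mirrored_strategy.scope():
    # Placeholder model standing in for the real UNET.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(64, 64, 3)),
        tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu"),
        tf.keras.layers.Conv2D(1, 1, padding="same"),
    ])
    model.compile(optimizer="adam", loss="mse")
```

With N replicas, the global batch (e.g. 32) is split into N per-GPU batches, which is why the same model that fails locally with batch 4 can run with batch 32 on 4 A100s.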

Hi @Alexander_Tov ,

Your memory issue when training a UNET with a batch size greater than 2 on your local machine is likely due to the large images (±1 MB each) combined with the UNET architecture, which is memory-intensive because its skip connections keep large activation maps alive through the whole forward and backward pass.
Here are a few suggestions addressing potential issues:

  • Optimize the data pipeline: use the tf.data.Dataset API for more control and optimization, and enable prefetching and parallel data loading.
  • Reduce the batch size: stick with a smaller batch size if memory is constrained. A batch size of 2 might be necessary given your current setup.
  • Optimize the model: reduce the number of layers or filters in your UNET, and use mixed-precision training to reduce memory usage.
  • Memory growth: ensure that TensorFlow is configured to allow memory growth, as you've already done.
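The first and third points can be sketched together. This uses an in-memory stand-in for your image_dataset_from_directory pipeline (the array shapes and batch size are illustrative, not taken from your setup):

```python
import numpy as np
import tensorflow as tf

# Stand-in for images/masks loaded from disk: 100 fake 64x64 RGB
# images and binary segmentation masks.
images = np.random.rand(100, 64, 64, 3).astype("float32")
masks = np.random.randint(0, 2, size=(100, 64, 64, 1)).astype("float32")

ds = (
    tf.data.Dataset.from_tensor_slices((images, masks))
    .shuffle(buffer_size=100)
    .batch(2)                    # small batch to stay within GPU memory
    .prefetch(tf.data.AUTOTUNE)  # overlap data loading with training
)

# Mixed precision keeps activations in float16, roughly halving
# activation memory on RTX/A100-class GPUs.
tf.keras.mixed_precision.set_global_policy("mixed_float16")
```

image_dataset_from_directory already returns a tf.data.Dataset, so you can chain .prefetch(tf.data.AUTOTUNE) directly onto its result instead of building the dataset by hand.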

Hope this helps.

Thank you.