Hi
I am trying to do semantic segmentation (binary segmentation: background:0, object: 1), and I am looking for an efficient way to load my data.
My first approach was the one below, but it usually takes five minutes or more just to load all the data before the model even starts training.
# Pre-allocate arrays for images (uint8) and binary masks (bool)
x_train = np.zeros((len(train_img_paths), img_height, img_width, img_channels), dtype=np.uint8)
y_train = np.zeros((len(train_mask_paths), img_height, img_width, 1), dtype=bool)
x_val = np.zeros((len(val_img_paths), img_height, img_width, img_channels), dtype=np.uint8)
y_val = np.zeros((len(val_mask_paths), img_height, img_width, 1), dtype=bool)

print('\nLoading training images: ', len(train_img_paths), 'images ...')
for n, file_ in tqdm(enumerate(train_img_paths), total=len(train_img_paths)):
    img = tf.keras.preprocessing.image.load_img(file_, target_size=img_size)
    x_train[n] = img

print('\nLoading training masks: ', len(train_mask_paths), 'masks ...')
for n, file_ in tqdm(enumerate(train_mask_paths), total=len(train_mask_paths)):
    img = tf.keras.preprocessing.image.load_img(file_, target_size=img_size, color_mode="grayscale")
    y_train[n] = np.expand_dims(img, axis=-1)

print('\nLoading validation images: ', len(val_img_paths), 'images ...')
for n, file_ in tqdm(enumerate(val_img_paths), total=len(val_img_paths)):
    img = tf.keras.preprocessing.image.load_img(file_, target_size=img_size)
    x_val[n] = img

print('\nLoading validation masks: ', len(val_mask_paths), 'masks ...')
for n, file_ in tqdm(enumerate(val_mask_paths), total=len(val_mask_paths)):
    img = tf.keras.preprocessing.image.load_img(file_, target_size=img_size, color_mode="grayscale")
    y_val[n] = np.expand_dims(img, axis=-1)

model = get_model(img_height, img_width, img_channels, num_classes)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=METRICS)
results = model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=epochs, batch_size=batch_size)
Everything works fine with the code above, and I get a really good IoU.
But when I switch to ImageDataGenerator with flow_from_directory, my model overfits: the training IoU keeps increasing (and the loss keeps decreasing), while the validation IoU fluctuates around 0.72 and the validation loss starts to increase after a while.
I also tried setting shuffle to both True and False in case that would solve the problem, but nothing changed. The model is exactly the same as before; the only thing I changed is the way the data is loaded. Could you please let me know what the possible problem is, or what I am missing?
Here is my code:
# Separate generators for images and masks, with the same validation split and seed
image_datagen = ImageDataGenerator(validation_split=0.2)
mask_datagen = ImageDataGenerator(validation_split=0.2)
seed = 123

image_generator = image_datagen.flow_from_directory(
    train_im_directory,
    class_mode=None,
    batch_size=batch_size,
    seed=seed,
    color_mode='rgb',
    target_size=(256, 256),
    subset='training',
    shuffle=False)

mask_generator = mask_datagen.flow_from_directory(
    train_mask_directory,
    class_mode=None,
    batch_size=batch_size,
    seed=seed,
    color_mode='grayscale',
    target_size=(256, 256),
    subset='training',
    shuffle=False)

val_image_generator = image_datagen.flow_from_directory(
    train_im_directory,
    class_mode=None,
    batch_size=batch_size,
    seed=seed,
    color_mode='rgb',
    target_size=(256, 256),
    subset='validation',
    shuffle=False)

val_mask_generator = mask_datagen.flow_from_directory(
    train_mask_directory,
    class_mode=None,
    batch_size=batch_size,
    seed=seed,
    color_mode='grayscale',
    target_size=(256, 256),
    subset='validation',
    shuffle=False)

# Pair each image batch with the corresponding mask batch
train_generator = zip(image_generator, mask_generator)
val_generator = zip(val_image_generator, val_mask_generator)

model = get_model(img_height, img_width, img_channels, num_classes)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=METRICS)

history = model.fit(train_generator,
                    verbose=1,
                    epochs=100,
                    steps_per_epoch=image_generator.samples // batch_size,
                    validation_data=val_generator,
                    validation_steps=val_image_generator.samples // batch_size)
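For completeness, here is a small sanity-check sketch (not part of the training code; check_images and check_masks are just illustrative names) that pulls one batch from the paired generator and prints the shapes and value ranges, in case that helps to spot what I am missing:

# Sanity check (sketch): take one batch from the paired generator and
# inspect shapes and value ranges of the images and masks.
check_images, check_masks = next(train_generator)
print('image batch:', check_images.shape, check_images.dtype, check_images.min(), check_images.max())
print('mask batch: ', check_masks.shape, check_masks.dtype, check_masks.min(), check_masks.max())
# Note: next() consumes a batch, so the zipped generators would need to be
# re-created before calling model.fit if this check is run first.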
Thank you.