Having issue with tf.keras.preprocessing.image_dataset_from_directory

When I read the images from a directory, the order is not preserved. Please see the screenshots below for reference.

Below is the code

train_set = tf.keras.preprocessing.image_dataset_from_directory(
    "train",
    shuffle=False,
    color_mode="grayscale",
    # class_names=class_names,
    labels=LABELS_n,
    # label_mode="int",
    seed=None,
    image_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=BATCH_SIZE,
)

plt.figure(figsize=(25, 25))
for image_batch, labels_batch in train_set.take(1):
    for i in range(32):
        ax = plt.subplot(8, 4, i + 1)
        plt.imshow(image_batch[i].numpy())
        plt.title(class_names[labels_batch[i]])
        plt.axis("off")

Because of this wrong ordering, my training set is tagged with the wrong classes.
Any help is highly appreciated.

You could use the class_names parameter to control the order of the classes.

Hi Kzyh - Thanks for your reply.

The images in the directory are not sorted into classes; instead, I have a separate .csv file with the classes. To map the images to their classes, I wrote the code below.

import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.keras import models, layers
import matplotlib.pyplot as plt
from IPython.display import HTML

IMAGE_SIZE=256
BATCH_SIZE=32
EPOCHS=50

LABELS=pd.read_csv('labels.csv')

LABELS_n = LABELS['label'].tolist()

class_names=['Top','Trouser','Pullover','Dress','Coat','Sandal','Shirt','Sneaker','Bag','Ankle boot']

train_set = tf.keras.preprocessing.image_dataset_from_directory(
    "train",
    shuffle=False,
    color_mode='grayscale',
    # class_names=class_names,
    labels=LABELS_n,
    # label_mode='int',
    seed=None,
    image_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=BATCH_SIZE,
)

plt.figure(figsize=(25, 25))
for image_batch, labels_batch in train_set.take(1):
    for i in range(32):
        ax = plt.subplot(8, 4, i + 1)
        plt.imshow(image_batch[i].numpy())
        plt.title(class_names[labels_batch[i]])
        plt.axis("off")

Can you show me your train directory structure?
I think it should be something like this

main_directory/
...class_a/
......a_image_1.jpg
......a_image_2.jpg
...class_b/
......b_image_1.jpg
......b_image_2.jpg

Please find the folder structure below.

C:\Users\Admin\Documents\Python Scripts\Kaggle\Apparel_Classification\train*train*

All 60K images are in the same folder, with no subfolders.

For some reason I couldn't upload a screenshot.

If you want to use image_dataset_from_directory, you should put every file in a subfolder named after its class.
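
A rough sketch of how the files could be sorted into class subfolders from the CSV. This is only an illustration under a few assumptions that are not confirmed in this thread: the CSV has an `id` column next to the `label` column, each image is named `<id>.png`, and the helper name `sort_into_class_folders` is made up for this example.

```python
import csv
import shutil
from pathlib import Path


def sort_into_class_folders(image_dir, csv_path, out_dir, ext=".png"):
    """Copy each image listed in the CSV into out_dir/<label>/.

    Assumes (hypothetically) that the CSV has 'id' and 'label' columns
    and that the image for a row is stored as <id><ext> in image_dir.
    Returns the number of files copied.
    """
    image_dir, out_dir = Path(image_dir), Path(out_dir)
    copied = 0
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            src = image_dir / f"{row['id']}{ext}"
            if not src.exists():
                continue  # skip CSV rows that have no matching image file
            class_dir = out_dir / str(row["label"])
            class_dir.mkdir(parents=True, exist_ok=True)
            # Copy rather than move, so the original folder stays untouched.
            shutil.copy2(src, class_dir / src.name)
            copied += 1
    return copied
```

After something like `sort_into_class_folders("train", "labels.csv", "train_sorted")`, you could point image_dataset_from_directory at `train_sorted` with the default `labels="inferred"`. Class indices then follow the alphanumeric order of the folder names, which for folders named 0 to 9 matches the numeric label.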
Or you can create the dataset using tf.data.
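
A minimal sketch of the tf.data route, pairing each file path with its label straight from the CSV so that nothing depends on how the directory is listed. As far as I know, when you pass an explicit list to `labels`, image_dataset_from_directory expects it in the alphanumeric order of the file paths (so `10.png` comes before `2.png`), which may be why the CSV order does not line up here. The `id` column, the `<id>.png` naming, and the helper names `paths_and_labels` and `make_dataset` are assumptions for illustration only.

```python
import csv
from pathlib import Path


def paths_and_labels(csv_path, image_dir, ext=".png"):
    """Return parallel lists (file paths, integer labels), one entry per CSV row.

    Assumes (hypothetically) 'id' and 'label' columns and files named <id><ext>.
    """
    paths, labels = [], []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            paths.append(str(Path(image_dir) / f"{row['id']}{ext}"))
            labels.append(int(row["label"]))
    return paths, labels


def make_dataset(paths, labels, image_size=256, batch_size=32):
    """Build an unshuffled tf.data pipeline of (grayscale image, label) batches."""
    # Imported here so the pairing helper above also works without TensorFlow.
    import tensorflow as tf

    def load(path, label):
        raw = tf.io.read_file(path)
        # channels=1 decodes to grayscale; expand_animations=False keeps a
        # rank-3 tensor so tf.image.resize can be applied.
        img = tf.io.decode_image(raw, channels=1, expand_animations=False)
        img = tf.image.resize(img, (image_size, image_size))
        return img, label

    ds = tf.data.Dataset.from_tensor_slices((paths, labels))
    # map() keeps element order by default, so images stay paired with labels.
    return ds.map(load, num_parallel_calls=tf.data.AUTOTUNE).batch(batch_size)
```

Usage would be roughly `paths, labels = paths_and_labels("labels.csv", "train")` followed by `train_set = make_dataset(paths, labels)`. Because each label travels with its own path, the order of the files on disk no longer matters.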
