Why does image_dataset_from_directory return a different array than loading images normally?

Vegard_Skagestad · June 28, 2023, 2:50pm

Referring to numpy - Why does image_dataset_from_directory return a different array than loading images normally? - Stack Overflow

I am using tf.keras.utils.image_dataset_from_directory to load images from my hard drive when training and testing my classification model. However, I also want to predict the class of a single image at a time. For single images I have tried tf.keras.utils.load_img, PIL, and cv2, and they all return an array that is slightly different from the one returned by tf.keras.utils.image_dataset_from_directory. As a result, the model's predictions on a dataset differ from its predictions on the same image loaded individually.

This is causing big problems for me. Is there a bug in tf.keras.utils.image_dataset_from_directory?

Laxma_Reddy_Patlolla · June 28, 2023, 6:39pm

Hi @Vegard_Skagestad,

I don’t think there’s a bug in tf.keras.utils.image_dataset_from_directory.

The differences you observe between the arrays returned by tf.keras.utils.image_dataset_from_directory and those returned by other methods (tf.keras.utils.load_img, PIL, cv2) are due to the different preprocessing steps applied by each method.

When using tf.keras.utils.image_dataset_from_directory, the images are decoded and resized to image_size as they are loaded, so the resulting dataset is usually already in the format expected by the model.

On the other hand, tf.keras.utils.load_img, PIL, and cv2 simply load the image file. Unless you ask for a resize, the resulting array represents the raw image data.

To ensure consistent predictions between dataset and single-image inference, you need to preprocess the single image in the same way as the images in the dataset. You can use the same preprocessing functions that are applied to the images in tf.keras.utils.image_dataset_from_directory.
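As a rough illustration of "the same preprocessing", here is a minimal sketch that loads one image with TensorFlow ops instead of PIL. The helper name load_like_dataset is hypothetical, and the decode-then-resize sequence is an assumption about what image_dataset_from_directory does internally, so it is worth checking against the Keras source for your installed version.

```python
import tensorflow as tf


def load_like_dataset(path, image_size, interpolation="bilinear", channels=3):
    """Load a single image with TensorFlow ops (hypothetical helper).

    Assumption: image_dataset_from_directory decodes with TensorFlow and
    resizes with tf.image.resize; verify this for your Keras version.
    """
    raw = tf.io.read_file(path)
    # expand_animations=False keeps the result 3-D: (height, width, channels).
    img = tf.image.decode_image(raw, channels=channels, expand_animations=False)
    # With bilinear interpolation, tf.image.resize returns float32 values
    # and does not round them back to uint8.
    img = tf.image.resize(img, image_size, method=interpolation)
    # Add a leading batch dimension so the result can go to model.predict.
    return tf.expand_dims(img, axis=0)
```

With the variables from the question, this would be called as load_like_dataset(path, imSize), and the result compared against the first batch of the dataset built with shuffle=False.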

I hope this helps!

Thanks.

Vegard_Skagestad · June 29, 2023, 8:06am

I am using the exact same preprocessing, as far as I know.

tf.keras.utils.image_dataset_from_directory(
    "C:/Users/vegardsk/SystemsDeltid/predTest",
    labels='inferred',
    label_mode='categorical',
    class_names=None,
    color_mode='rgb',
    batch_size=batchSize,
    image_size=imSize,
    shuffle=False,
    seed=None,
    validation_split=None,
    subset=None,
    interpolation='bilinear',
    follow_links=False,
    crop_to_aspect_ratio=False,
)

loadImage = tf.keras.utils.load_img(
    path,
    grayscale=False,
    color_mode='rgb',
    target_size=imSize,
    interpolation='bilinear',
    keep_aspect_ratio=False
)

Is there any other preprocessing going on in tf.keras.utils.image_dataset_from_directory that I am not aware of? I have also tried removing the resize step entirely by putting only one image in the dataset and setting image_size to the original size of the image. Even then, the pixel values in the returned arrays differ by ±1.
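To pin down how large and how widespread the ±1 differences are, a small comparison helper can be useful. The sketch below uses only NumPy; the function name describe_pixel_difference is hypothetical, and it assumes both arrays have already been brought to the same shape and channel order (cv2, for example, returns BGR rather than RGB).

```python
import numpy as np


def describe_pixel_difference(a, b):
    """Summarize how two image arrays of the same shape differ (hypothetical helper)."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    diff = a - b
    return {
        # Largest absolute per-value difference.
        "max_abs_diff": float(np.abs(diff).max()),
        # Share of values that are not exactly equal.
        "fraction_differing": float(np.mean(diff != 0)),
        # Signed mean, which shows whether one loader is systematically brighter.
        "mean_diff": float(diff.mean()),
    }
```

If the files are JPEGs, one candidate worth checking is the decoder itself: tf.io.decode_jpeg accepts a dct_method hint ("INTEGER_FAST" or "INTEGER_ACCURATE"), and a decoder that uses a different inverse DCT than PIL's could plausibly produce off-by-one values. This is an assumption to test, not something confirmed in this thread. Running the same comparison on a PNG copy of the image would show whether the difference is specific to JPEG decoding.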
