Referring to numpy - Why does image_dataset_from_directory return a different array than loading images normally? - Stack Overflow
I am using tf.keras.utils.image_dataset_from_directory to load images from my hard drive when training and testing my classification model. However, I also want to be able to predict the class of a single image at a time. For loading single images I have tried tf.keras.utils.load_img, PIL, and cv2, and they all return the image as an array that is slightly different from the array returned by tf.keras.utils.image_dataset_from_directory. As a result, the model's predictions on a dataset differ from its predictions on the same images loaded one at a time.
This is causing big problems for me. Is there a bug in tf.keras.utils.image_dataset_from_directory?
Hi @Vegard_Skagestad,
I don’t think there’s a bug in tf.keras.utils.image_dataset_from_directory.
The differences you observe between the arrays returned by tf.keras.utils.image_dataset_from_directory and those returned by the other methods (tf.keras.utils.load_img, PIL, cv2) are due to the different preprocessing steps applied by each method.
When using tf.keras.utils.image_dataset_from_directory, the images are preprocessed as they are loaded: each file is decoded, resized to image_size with the chosen interpolation, and batched. The resulting dataset is usually in the format expected by the model.
On the other hand, tf.keras.utils.load_img, PIL, and cv2 simply load the image file without any further preprocessing. The resulting array represents the raw image data.
To ensure consistent predictions between dataset and single-image inference, you need to preprocess the single image in the same way as the images in the dataset. You can use the same preprocessing functions that are applied to the images in tf.keras.utils.image_dataset_from_directory.
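A minimal sketch of what that could look like for a single file. It assumes the dataset loader reads, decodes, and resizes with TensorFlow ops (which is my understanding of recent Keras versions, not something I have verified for yours). The helper name load_like_dataset, the file name, and the image size in the usage comment are hypothetical.

```python
import tensorflow as tf


def load_like_dataset(path, image_size, interpolation="bilinear", channels=3):
    """Load one image using TensorFlow ops only: read, decode, resize.

    Assumption: this mirrors the steps image_dataset_from_directory applies
    when crop_to_aspect_ratio=False. Check your Keras version to confirm.
    """
    raw = tf.io.read_file(path)
    # expand_animations=False keeps the result 3-D even for GIF input.
    img = tf.io.decode_image(raw, channels=channels, expand_animations=False)
    # tf.image.resize returns float32 for bilinear interpolation.
    img = tf.image.resize(img, image_size, method=interpolation)
    return img


# Usage (hypothetical file name and size):
# x = load_like_dataset("some_image.jpg", (224, 224))
# batch = tf.expand_dims(x, axis=0)  # add the batch dimension before predict()
```

The idea is to keep the decoder and the resize implementation identical on both paths, rather than mixing TensorFlow decoding for the dataset with PIL or cv2 decoding for the single image.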
I hope this helps!
Thanks.
I am using the exact same preprocessing, as far as I know.
import tensorflow as tf

# batchSize and imSize are defined elsewhere in my script
tf.keras.utils.image_dataset_from_directory(
    "C:/Users/vegardsk/SystemsDeltid/predTest",
    labels='inferred',
    label_mode='categorical',
    class_names=None,
    color_mode='rgb',
    batch_size=batchSize,
    image_size=imSize,
    shuffle=False,
    seed=None,
    validation_split=None,
    subset=None,
    interpolation='bilinear',
    follow_links=False,
    crop_to_aspect_ratio=False,
)
loadImage = tf.keras.utils.load_img(
    path,
    grayscale=False,
    color_mode='rgb',
    target_size=imSize,
    interpolation='bilinear',
    keep_aspect_ratio=False
)
Is there any other preprocessing going on in tf.keras.utils.image_dataset_from_directory that I am not aware of? I have also tried removing all preprocessing by putting only one image in the dataset and setting image_size to the original size of the image. When I compare the returned arrays, the pixel values still differ by ±1.
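A minimal sketch of how such a difference can be measured, assuming both loaders produce arrays of the same shape. The helper name pixel_diff_summary is hypothetical, and the sketch only quantifies the mismatch; it does not explain it.

```python
import numpy as np


def pixel_diff_summary(a, b):
    """Summarize element-wise differences between two image arrays."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    # Cast to float first so that uint8 subtraction cannot wrap around.
    abs_diff = np.abs(a.astype(np.float64) - b.astype(np.float64))
    return {
        "max_abs_diff": float(abs_diff.max()),
        "mean_abs_diff": float(abs_diff.mean()),
        "fraction_different": float(np.mean(abs_diff > 0)),
    }


# Usage (hypothetical variable names): dataset_image is one image taken from
# the dataset batch, single_image is the array built from load_img.
# print(pixel_diff_summary(dataset_image, single_image))
```

If the files are JPEGs, one thing that may be worth checking is the decoder itself: tf.io.decode_jpeg takes a dct_method argument, and different JPEG decoders can round a pixel differently by a level or so. Nothing in this thread confirms that this is the cause here.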