First steps in Keras - Error

Hello,

I’m trying to get into Keras and tried this example: Image classification from scratch

Sadly, after ~15 min of training the model, I get this error and can’t figure out why:

Epoch 1/50
 51/586 [=>............................] - ETA: 22:58 - loss: 0.7538 - accuracy: 0.5607
Corrupt JPEG data: 1153 extraneous bytes before marker 0xd9
 54/586 [=>............................] - ETA: 22:51 - loss: 0.7506 - accuracy: 0.5625
Corrupt JPEG data: 396 extraneous bytes before marker 0xd9
121/586 [=====>........................] - ETA: 20:21 - loss: 0.7217 - accuracy: 0.5837
Corrupt JPEG data: 65 extraneous bytes before marker 0xd9
156/586 [======>.......................] - ETA: 19:08 - loss: 0.7039 - accuracy: 0.5990
Corrupt JPEG data: 239 extraneous bytes before marker 0xd9
277/586 [=============>................] - ETA: 14:13 - loss: 0.6671 - accuracy: 0.6269
Corrupt JPEG data: 2226 extraneous bytes before marker 0xd9
321/586 [===============>..............] - ETA: 12:15 - loss: 0.6515 - accuracy: 0.6395
Corrupt JPEG data: 162 extraneous bytes before marker 0xd9
323/586 [===============>..............] - ETA: 12:09 - loss: 0.6510 - accuracy: 0.6399
Warning: unknown JFIF revision number 0.00
347/586 [================>.............] - ETA: 11:04 - loss: 0.6477 - accuracy: 0.6428
Corrupt JPEG data: 128 extraneous bytes before marker 0xd9
384/586 [==================>...........] - ETA: 9:23 - loss: 0.6404 - accuracy: 0.6475
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
/tmp/ipykernel_407/871483020.py in <module>
     11     metrics=["accuracy"],
     12 )
---> 13 model.fit(
     14     train_ds, epochs=epochs, callbacks=callbacks, validation_data=val_ds,
     15 )

/usr/local/lib/python3.9/dist-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

/usr/local/lib/python3.9/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     52   try:
     53     ctx.ensure_initialized()
---> 54     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
     55                                         inputs, attrs, num_outputs)
     56   except core._NotOkStatusException as e:

InvalidArgumentError: Graph execution error:

Unknown image file format. One of JPEG, PNG, GIF, BMP required.
	 [[{{node decode_image/DecodeImage}}]]
	 [[IteratorGetNext]] [Op:__inference_train_function_6829]

I looked into the files but there are only the .jpg files provided by the tutorial.
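A .jpg extension does not guarantee JPEG content, so one way to check this is to look at the first bytes of every file. The sketch below is only an illustration: `sniff_format` and `find_unrecognized` are hypothetical helper names, and it assumes the decoder recognizes the four formats named in the error message by their usual leading signature bytes.

```python
import os

# Usual leading signature bytes of the four formats named in the error message.
SIGNATURES = {
    "JPEG": b"\xff\xd8\xff",
    "PNG": b"\x89PNG\r\n\x1a\n",
    "GIF": b"GIF8",
    "BMP": b"BM",
}


def sniff_format(path):
    """Return the format whose signature matches the start of the file, else None."""
    with open(path, "rb") as f:
        head = f.read(8)
    for name, magic in SIGNATURES.items():
        if head.startswith(magic):
            return name
    return None


def find_unrecognized(root):
    """Walk root and return the sorted paths that match none of the signatures."""
    bad = []
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            path = os.path.join(dirpath, fname)
            if sniff_format(path) is None:
                bad.append(path)
    return sorted(bad)
```

Calling `find_unrecognized("PetImages")` would list the candidates without deleting anything. A file that passes this check can still be truncated or otherwise damaged; the check only looks at the header.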

Maybe files are corrupted. Did you try downloading them again?

I tried it two times, but I can try again.
At first, I thought it was the two Thumb.db files in the two folders and deleted them, but this didn’t work.

Deleting corrupted files is mentioned in the example.
Did you try running this script? It should remove 1500 files.

I did run it, but it says zero images were deleted. This is my whole notebook; maybe there is something I overlooked: https://filebox.fhooecloud.at/index.php/s/YCXDrEZz7bYp4Zq

The first thing I would try is to debug this script.
You can try printing the file name in this loop to see if there’s something wrong with the path.

import os

import tensorflow as tf

num_skipped = 0
for folder_name in ("Cat", "Dog"):
    folder_path = os.path.join("PetImages", folder_name)
    for fname in os.listdir(folder_path):
        print(fname)  # <---- add this
        fpath = os.path.join(folder_path, fname)
        # The context manager closes the file even if reading fails
        with open(fpath, "rb") as fobj:
            is_jfif = tf.compat.as_bytes("JFIF") in fobj.peek(10)

        if not is_jfif:
            num_skipped += 1
            # Delete corrupted image
            os.remove(fpath)

print("Deleted %d images" % num_skipped)

I tried it again completely from the beginning, and now it does delete files (though not the same number as in the example), so maybe this is not the problem: https://filebox.fhooecloud.at/index.php/s/Go6maP6ZMnsqEdd
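To compare the number of flagged files against the example without changing the dataset, the same JFIF check can be run as a dry run. This is a minimal sketch, not the tutorial’s code: `files_without_jfif` is a hypothetical helper, and the `root` and `folders` defaults simply mirror the folder names used above.

```python
import os


def files_without_jfif(root="PetImages", folders=("Cat", "Dog")):
    """Return the paths whose header lacks b"JFIF". Nothing is deleted."""
    flagged = []
    for folder_name in folders:
        folder_path = os.path.join(root, folder_name)
        for fname in sorted(os.listdir(folder_path)):
            fpath = os.path.join(folder_path, fname)
            with open(fpath, "rb") as fobj:
                # peek() does not advance the stream and may return
                # more (or fewer) bytes than the 10 requested.
                head = fobj.peek(10)
            if b"JFIF" not in head:
                flagged.append(fpath)
    return flagged
```

Printing `len(files_without_jfif())` before and after a fresh download shows whether the count changes between runs, and the returned list shows exactly which files are affected.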

After much trial and error, it worked with this code to delete corrupted files (it deleted two more files than the previous one):

# Filter out corrupted images

import os

from PIL import Image

num_skipped = 0

for folder_name in ("Cat", "Dog"):
    folder_path = os.path.join("PetImages", folder_name)
    for fname in os.listdir(folder_path):
        fpath = os.path.join(folder_path, fname)
        if fname.endswith(".jpg"):
            try:
                # The with block closes the file before os.remove below
                with Image.open(fpath) as img:
                    # print(fpath)
                    img._getexif()  # private Pillow helper; any exception marks the file as bad
                    img.verify()  # verify that it is, in fact, an image
            except Exception:
                num_skipped += 1
                # Delete corrupted image
                os.remove(fpath)
print("PIL deleted %d images" % num_skipped)
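Since the two checks disagree by a couple of files, it can help to move flagged files aside instead of deleting them, so they can be inspected afterwards. The following is only a sketch: `quarantine` is a hypothetical helper, and the destination folder should sit outside PetImages so it is not picked up as an extra class folder when the dataset is loaded.

```python
import os
import shutil


def quarantine(paths, dest_dir):
    """Move each path into dest_dir and return the new locations."""
    os.makedirs(dest_dir, exist_ok=True)
    moved = []
    for path in paths:
        # Prefix the parent folder name so Cat/1.jpg and Dog/1.jpg do not clash.
        parent = os.path.basename(os.path.dirname(path))
        target = os.path.join(dest_dir, f"{parent}_{os.path.basename(path)}")
        shutil.move(path, target)
        moved.append(target)
    return moved
```

For example, replacing the `os.remove(fpath)` call with `quarantine([fpath], "quarantined")` keeps every rejected file available for a later comparison.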

@Tropaion,

You received this error because of corrupted files. Before training your model, you have to filter out corrupted images as shown here.

I was able to run the code successfully on Colab using TF v2.8. Please find the gist here for reference. Thanks!

@chunduriv I don’t want to be rude, but always giving me the same answer doesn’t help me. Since I used exactly the same code without any changes and it still doesn’t work, something has to be wrong.