Does anyone have any idea why JPEG images decoded from TFRecords using tf.io.decode_jpeg look different from the original images? For example:
Original image:
The same image after being decoded and reshaped to 600x600:
Is this normal?
I created a dataset of sharded TFRecords from a dataset of images of shape 600x600x3, using the Keras image_dataset_from_directory function. This function automatically converts each image to a tensor (of shape 600x600x3 in this case), and each tensor was then encoded to a byte string with tf.io.encode_jpeg, like this:
image = tf.image.convert_image_dtype(image_tensor, dtype=tf.uint8)
image = tf.io.encode_jpeg(image)
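For reference, here is a minimal, self-contained round trip of this encode step on a synthetic uint8 tensor (the random data is just a stand-in for a real image):

```python
import tensorflow as tf

# Synthetic stand-in for one 600x600x3 image tensor from the dataset.
image_tensor = tf.random.uniform([600, 600, 3], maxval=256, dtype=tf.int32)
image_tensor = tf.cast(image_tensor, tf.uint8)

# Encode to a JPEG byte string, then decode it back.
encoded = tf.io.encode_jpeg(image_tensor)
decoded = tf.io.decode_jpeg(encoded, channels=3)

# JPEG compression is lossy, so the round trip preserves the shape and
# dtype but not necessarily the exact pixel values.
print(decoded.shape, decoded.dtype)
```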
Each TFRecord example was created like this:
def make_example(encoded_image, label):
    image_feature = tf.train.Feature(
        bytes_list=tf.train.BytesList(value=[encoded_image])
    )
    label_feature = tf.train.Feature(
        int64_list=tf.train.Int64List(value=[label])
    )
    features = tf.train.Features(feature={
        'image': image_feature,
        'label': label_feature
    })
    example = tf.train.Example(features=features)
    return example.SerializeToString()
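For completeness, here is a self-contained sketch of how such a make_example function can be used with tf.io.TFRecordWriter to write one shard; the shard filename, the tiny 8x8 placeholder image, and the labels are made up for illustration:

```python
import os
import tempfile

import tensorflow as tf

def make_example(encoded_image, label):
    image_feature = tf.train.Feature(
        bytes_list=tf.train.BytesList(value=[encoded_image]))
    label_feature = tf.train.Feature(
        int64_list=tf.train.Int64List(value=[label]))
    features = tf.train.Features(
        feature={'image': image_feature, 'label': label_feature})
    return tf.train.Example(features=features).SerializeToString()

# Tiny placeholder image instead of a real 600x600x3 one.
image = tf.zeros([8, 8, 3], dtype=tf.uint8)
encoded = tf.io.encode_jpeg(image).numpy()

# Hypothetical shard path; real code would loop over many shards.
path = os.path.join(tempfile.gettempdir(), 'images-shard-00000.tfrecord')
with tf.io.TFRecordWriter(path) as writer:
    for label in [0, 1, 2]:
        writer.write(make_example(encoded, label))
```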
And below is the code that loads the TFRecord dataset, using tf.image.decode_jpeg to decode the images back to tensors of shape 600x600x3; one decoded image is then saved to disk using PIL:
def read_tfrecord(example):
    tfrecord = {
        "image": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    example = tf.io.parse_single_example(example, tfrecord)
    image = tf.image.decode_jpeg(example['image'], channels=3)
    label = tf.cast(example['label'], tf.int32)
    return image, label
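Here is a self-contained sketch of the loading side: it first writes a one-record TFRecord (a stand-in for the real sharded files, with a tiny 16x16 placeholder image), then parses it with read_tfrecord and saves the decoded image to disk with PIL. The file paths are made up for illustration:

```python
import os
import tempfile

import tensorflow as tf
from PIL import Image

def read_tfrecord(example):
    tfrecord = {
        "image": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    example = tf.io.parse_single_example(example, tfrecord)
    image = tf.image.decode_jpeg(example['image'], channels=3)
    label = tf.cast(example['label'], tf.int32)
    return image, label

# Build a one-record TFRecord so the sketch runs on its own.
img_bytes = tf.io.encode_jpeg(tf.zeros([16, 16, 3], dtype=tf.uint8)).numpy()
serialized = tf.train.Example(features=tf.train.Features(feature={
    'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_bytes])),
    'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[1])),
})).SerializeToString()

record_path = os.path.join(tempfile.gettempdir(), 'demo.tfrecord')
with tf.io.TFRecordWriter(record_path) as writer:
    writer.write(serialized)

# Parse the record and save the decoded image (PNG is lossless,
# so saving does not add further compression artifacts).
dataset = tf.data.TFRecordDataset([record_path]).map(read_tfrecord)
image, label = next(iter(dataset))
out_path = os.path.join(tempfile.gettempdir(), 'decoded.png')
Image.fromarray(image.numpy()).save(out_path)
```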
I have absolutely no idea what is causing this apparent loss of image information, so any help would be much appreciated!
Notes:
- I’m using TensorFlow v2.5.0 and Pillow v8.0.1.
- Find the entire source code here: Images2TFRecords.py · GitHub