I'm trying to read a TFRecord file and use it to train a model with a .fit call, but I'm getting this error:
TypeError("dataset length is unknown.")
Here's my TFRecord code:
import tensorflow as tf

FEATURE_DESCRIPTION = {
    'lr': tf.io.FixedLenFeature([], tf.string),
    'hr': tf.io.FixedLenFeature([], tf.string),
}

def parser(example_proto):
    # Parse one serialized Example and decode the JPEG-encoded image pair
    parsed_example = tf.io.parse_single_example(example_proto, FEATURE_DESCRIPTION)
    lr = tf.io.decode_jpeg(parsed_example['lr'])
    hr = tf.io.decode_jpeg(parsed_example['hr'])
    return lr, hr
train_data = tf.data.TFRecordDataset(TFRECORD_PATH)\
    .map(parser)\
    .batch(BATCH_SIZE, drop_remainder=True)\
    .prefetch(tf.data.AUTOTUNE)
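For reference, this is how I'm checking the cardinality (I believe tf.data.experimental.cardinality is the relevant call, and -2 corresponds to tf.data.experimental.UNKNOWN_CARDINALITY):

# Sanity check (my understanding): a TFRecord file doesn't store a record count,
# so the dataset reports an unknown cardinality.
print(tf.data.experimental.cardinality(train_data))
# tf.Tensor(-2, shape=(), dtype=int64)  == tf.data.experimental.UNKNOWN_CARDINALITY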
Calling len(train_data) raises TypeError("dataset length is unknown.") because the cardinality is -2; in other words, train_data can't determine the total number of samples because the dataset source is a file. Is there any way to tell train_data how many samples/batches there are?
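In case it clarifies what I'm after, here's a rough sketch of what I've been considering, assuming tf.data.experimental.assert_cardinality works the way I think it does (counting records by iterating the raw dataset once, which seems slow but should be accurate):

# Hypothetical workaround sketch, not tested: count the records once, then
# tell the dataset its cardinality so len() and .fit know the number of batches.
num_samples = sum(1 for _ in tf.data.TFRecordDataset(TFRECORD_PATH))  # full pass over the file
num_batches = num_samples // BATCH_SIZE  # drop_remainder=True discards the partial batch

train_data = train_data.apply(
    tf.data.experimental.assert_cardinality(num_batches)
)
print(len(train_data))  # should now return num_batches instead of raising

Is there a cleaner way than making an extra full pass over the file just to count the records?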