My data is saved in a list, and each data instance has a variable length. The data is shown in the following screenshot:
How can I store this list-indexed data in a tf.data.Dataset in order to build an efficient input pipeline?
Do you have a short dummy example of a few lines?
You can use
tf.data.Dataset.from_tensor_slices(data)
to keep your list in a tf.data.Dataset object. I believe it doesn’t really matter if the arrays are of variable length.
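For a dummy example along these lines, here is a minimal sketch. The toy arrays are made up for illustration and are not the poster's data. As far as I understand, from_tensor_slices needs its input to convert to a single rectangular tensor, so when the sequences really differ in length, a generator combined with padded_batch may be the safer route:

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data: three sequences of different lengths, 6 features each.
data = [np.random.rand(n, 6) for n in (3, 5, 2)]

# Stream the sequences one at a time; the first dimension is left unknown.
ds = tf.data.Dataset.from_generator(
    lambda: iter(data),
    output_signature=tf.TensorSpec(shape=(None, 6), dtype=tf.float64),
)

# padded_batch zero-pads every sequence in a batch up to the longest one.
batched = ds.padded_batch(2)
batch_shapes = [tuple(b.shape.as_list()) for b in batched]
print(batch_shapes)
```

Each batch is padded only to the longest sequence it contains, so batches can have different lengths along the time dimension.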
@Bhack
Here is my code:
train_var_len = tf.data.Dataset.from_generator(
    lambda: data,
    tf.float64,
    output_shapes=(None, 6),
)
# Shuffle, then pad each batch to its longest sequence
ds_series_batch = train_var_len.shuffle(batch_size).padded_batch(batch_size)
Now I want to combine the data with the labels. The labels have dimensions ‘(200*7)’. The full code is as follows:
train_ds_one = (
tf.data.Dataset.from_tensor_slices((ds_series_batch, train_y))
)
However, it gives me the following error:
raise ValueError("Slicing dataset elements is not supported for rank 0.")
ValueError: Slicing dataset elements is not supported for rank 0.
Any idea why it gives this error and how I can solve it?
@Abhiraam_Eranti I could use tf.data.Dataset.from_tensor_slices(data)
directly, but the input is sometimes too large for that, so I use a generator to keep GPU memory usage under control. I then use tf.data.Dataset.from_tensor_slices
to combine the data and labels. However, I get the following error:
raise ValueError("Slicing dataset elements is not supported for rank 0.")
ValueError: Slicing dataset elements is not supported for rank 0.
How should I fix this error? For further detail, see the code example in my comment above. Thanks.
You will have to use
tf.data.dataset.zip((ds_series_batch, train_y)) to combine features and labels.
Also, could you explain your GPU memory problem?
Thank you for the answer. I am getting this error: AttributeError: module 'tensorflow._api.v2.data' has no attribute 'dataset'
Sorry for the spelling mistake. It’s tf.data.Dataset.zip
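To make the zip suggestion concrete, here is a rough sketch under a few assumptions. The toy `data` and `train_y` below are hypothetical stand-ins for the arrays in this thread. As far as I can tell, from_tensor_slices expects tensors or arrays rather than a Dataset object, which seems to be what triggers the rank 0 error. Dataset.zip in turn expects Dataset objects, so the label array is wrapped with from_tensor_slices first. The sketch also zips before shuffling and batching, because shuffling only the features beforehand would break the pairing with the labels:

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-ins for the thread's `data` and `train_y`.
lengths = [3, 5, 2, 4]
data = [np.random.rand(n, 6) for n in lengths]
train_y = np.random.rand(len(data), 7)
batch_size = 2

# Features: variable-length sequences streamed from a generator.
features = tf.data.Dataset.from_generator(
    lambda: iter(data),
    output_signature=tf.TensorSpec(shape=(None, 6), dtype=tf.float64),
)
# Labels: a regular array, sliced into one label vector per example.
labels = tf.data.Dataset.from_tensor_slices(train_y)

# Zip first so each sequence stays paired with its label,
# then shuffle the pairs and pad each batch to its longest sequence.
train_ds = (
    tf.data.Dataset.zip((features, labels))
    .shuffle(len(data))
    .padded_batch(batch_size)
)

x, y = next(iter(train_ds))
print(x.shape, y.shape)
```

The resulting `train_ds` yields (features, labels) tuples, which is the structure that Keras `model.fit` accepts for a dataset input.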