I’m trying to create a text generation system. My model has sparse inputs of the form `[0] + document`, where `document` is a list of integer indices corresponding to the words in a document, and I am training it with sparse categorical cross-entropy, so the y values should be of the form `document + [0]`. I want to use an object derived from `keras.utils.Sequence` to train the model, but I’m finding that the model doesn’t understand the shape of the y values.

How should my `Sequence` structure its y values?

Some further information - added here as I’m too new to edit posts.

When I structure my y values as

```
tensorflow.ragged.constant([row + [self.stop] for row in batch])
```

The error I get is

```
ValueError: Inconsistent shapes: saw (None,) but expected ()
```
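For reference, the ragged construction described above leaves the time dimension unbounded: appending the stop token does not fix a length, so the target's second dimension is `None` rather than a concrete size, which is what the shape mismatch reflects. A minimal sketch (the document values and stop token here are made up for illustration):

```python
import tensorflow as tf

# Illustrative values only; the real documents and stop token come from
# the poster's dataset.
stop = 0
batch = [[3, 1, 4], [1, 5]]

# y values as described above: each document with the stop token appended.
y = tf.ragged.constant([row + [stop] for row in batch])

# The ragged (time) dimension has no fixed size, so the static shape is (2, None).
print(y.shape)
```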

Hi @Peter_Bleackley,

Welcome to the TensorFlow Forum.

Thank you for raising this issue. When working with ragged tensors in TensorFlow, if your sequences have consistent lengths across batches (including the stop token), you can convert them to regular tensors with the `RaggedTensor.to_tensor()` method, which pads the ragged dimension out to a uniform length. This makes the output compatible with the shape Keras expects from a `keras.utils.Sequence`.

I’ve replicated your scenario and implemented this solution in a Colab notebook, which you can find here. The notebook demonstrates how to effectively handle ragged tensors in this context.
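In outline, the approach can be sketched as a `Sequence` whose `__getitem__` returns dense, padded batches via `.to_tensor()`. The class name, batch layout, and sample data below are illustrative assumptions, not taken from the original post:

```python
import numpy as np
import tensorflow as tf

class DocumentSequence(tf.keras.utils.Sequence):
    """Yields (x, y) batches where x = [0] + document and y = document + [0]."""

    def __init__(self, documents, batch_size=2, stop=0):
        super().__init__()
        self.documents = documents
        self.batch_size = batch_size
        self.stop = stop

    def __len__(self):
        return int(np.ceil(len(self.documents) / self.batch_size))

    def __getitem__(self, idx):
        batch = self.documents[idx * self.batch_size:(idx + 1) * self.batch_size]
        # x: stop token prepended; y: stop token appended (shifted by one step).
        x = tf.ragged.constant([[self.stop] + row for row in batch])
        y = tf.ragged.constant([row + [self.stop] for row in batch])
        # Pad the ragged dimension so Keras sees a fixed per-batch shape.
        return x.to_tensor(), y.to_tensor()

docs = [[3, 1, 4], [1, 5], [9, 2, 6, 5]]
seq = DocumentSequence(docs, batch_size=2)
x0, y0 = seq[0]
```

With padding, each batch comes out as a dense `(batch_size, max_len)` tensor, which sparse categorical cross-entropy can consume directly as integer targets.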

Thank you!