I am having a hard time wrapping my head around the tf.data.Dataset functionality. I have a CSV file that looks like this:
gender,products
M,[1,2]
M,[1,7,5,8]
I normally would create a dataset by doing something like:
tf.data.Dataset.from_tensor_slices(dict(df))
But TF complains about not being able to figure out the type spec from a list or numpy array. I tried to just leave it as a string and convert it in a map function with something like this:
@tf.function
def eval_products(x):
    y = {}
    y.update(x)
    if tf.strings.length(x['products']) > 2:
        arr = tf.strings.split(tf.strings.substr(x['products'], 1, tf.strings.length(x['products'])-2), ',')
        y['products'] = tf.strings.to_number(tf.strings.strip(arr))
    else:
        y['products'] = tf.ragged.constant([-1.])
    return y
ds = ds.map(eval_products)
This works, but it feels really messy, and I can't get it to make products a RaggedTensor instead of a plain Tensor. It seems like I'm missing something really basic, but I'm not sure what it is, and I can't find any examples in the docs or anywhere else, so any ideas are appreciated.
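One possible direction, offered as a minimal sketch and not a definitive answer: when tf.strings.split is given a single scalar string inside map, it returns a plain 1-D Tensor, so each element stays dense. The ragged structure may only show up once variable-length elements are batched together, for example with tf.data.experimental.dense_to_ragged_batch. The in-memory rows dict and the parse_products helper below are hypothetical stand-ins for the CSV and the map function above, and the sketch assumes every products string is a non-empty bracketed list.

```python
import tensorflow as tf

# Hypothetical stand-in for the CSV: the products column is kept as strings.
rows = {
    "gender": ["M", "M"],
    "products": ["[1,2]", "[1,7,5,8]"],
}

def parse_products(x):
    """Parse a '[1,2]'-style string into a variable-length 1-D int tensor."""
    y = dict(x)
    s = x["products"]
    # Strip the surrounding brackets, then split the remainder on commas.
    inner = tf.strings.substr(s, 1, tf.strings.length(s) - 2)
    parts = tf.strings.split(inner, ",")
    y["products"] = tf.strings.to_number(parts, out_type=tf.int32)
    return y

ds = tf.data.Dataset.from_tensor_slices(rows).map(parse_products)

# Each element is still a dense tensor of varying length; batching them
# with dense_to_ragged_batch combines them into a RaggedTensor per batch.
batched = ds.apply(tf.data.experimental.dense_to_ragged_batch(batch_size=2))
batch = next(iter(batched))
print(type(batch["products"]))
```

Newer TensorFlow releases also provide Dataset.ragged_batch, which may be preferable where it is available. Empty lists such as "[]" are not handled in this sketch and would need a separate branch or a filtering step.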