Hi everyone,
I have a question about the comment on the categorical columns in the following code.
# For all categorical columns except the label column, we generate a
# vocabulary but do not modify the feature. This vocabulary is instead
# used in the trainer, by means of a feature column, to convert the feature
# from a string to an integer id.
for key in CATEGORICAL_FEATURE_KEYS:
    outputs[key] = tft.compute_and_apply_vocabulary(
        tf.strings.strip(inputs[key]),
        num_oov_buckets=NUM_OOV_BUCKETS,
        vocab_filename=key
    )
I do not understand why the comment says that the features are not modified. The vocabulary is created as expected, but it seems to me that outputs[key] is also set to the result of tft.compute_and_apply_vocabulary, i.e. the integer ids rather than the original strings, which amounts to modifying the feature.
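To make my reading concrete, here is a minimal pure-Python sketch of the two behaviours as I understand them. It is not the TensorFlow Transform implementation: the helper names compute_vocabulary and apply_vocabulary, the sample values, the value of NUM_OOV_BUCKETS, and the OOV hashing are all assumptions made up for illustration. My guess is that the comment describes the first pattern (vocabulary only, as with something like tft.vocabulary), while the code follows the second.

```python
from collections import Counter

NUM_OOV_BUCKETS = 1  # assumed value; the real constant is not shown in my snippet


def compute_vocabulary(values):
    """Hypothetical helper: distinct stripped values, most frequent first."""
    counts = Counter(v.strip() for v in values)
    # Sort by descending count, then alphabetically, so the order is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [token for token, _ in ordered]


def apply_vocabulary(values, vocab, num_oov_buckets):
    """Hypothetical helper: map each string to its index in the vocabulary.

    Strings not in the vocabulary are sent to one of the OOV buckets that
    follow the vocabulary ids (a simple byte-sum hash, purely illustrative).
    """
    index = {token: i for i, token in enumerate(vocab)}
    ids = []
    for v in values:
        token = v.strip()
        if token in index:
            ids.append(index[token])
        else:
            ids.append(len(vocab) + sum(token.encode()) % num_oov_buckets)
    return ids


raw = [" Private", "Private ", "State-gov", "Private"]
vocab = compute_vocabulary(raw)

# Pattern the comment seems to describe: vocabulary computed, feature kept as strings.
unmodified = list(raw)

# Pattern the code seems to follow: feature replaced by integer ids.
modified = apply_vocabulary(raw, vocab, NUM_OOV_BUCKETS)

print(vocab)
print(unmodified)
print(modified)
```

In the first pattern the output feature would still hold strings and the trainer would do the lookup, whereas in the second the output feature already holds integer ids.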
Am I missing something here?
Thanks
Best regards
Jerome