I want to do multi-label (not multi-class) text classification, and my data is in this format:
1 this is a test. 0,0,1,0
2 this is another test 0,1,1,1
3 one more test 1,0,0,1
How should I prepare my data so that the Keras preprocessing API can easily create a tf.data.Dataset from it? For single-label classification, I can use the layout from the Keras/TF tutorial (one directory per class), as shown below. But for multi-label classification, how should I arrange my data so that tf.keras.preprocessing.text_dataset_from_directory still works with it?
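import tensorflow as tf

# Assumed values; the snippet as posted does not define these names.
AUTOTUNE = tf.data.AUTOTUNE
batch_size = 32
seed = 42

raw_train_ds = tf.keras.preprocessing.text_dataset_from_directory(
    'aclImdb/train',
    batch_size=batch_size,
    validation_split=0.2,
    subset='training',
    seed=seed)
class_names = raw_train_ds.class_names
train_ds = raw_train_ds.cache().prefetch(buffer_size=AUTOTUNE)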
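For comparison, here is a minimal sketch of one possible alternative that skips the directory layout entirely. As far as I can tell, text_dataset_from_directory assigns a single class per file, so it does not map directly onto multi-hot label vectors. The sketch parses lines in the "id text labels" format shown above and builds the dataset with tf.data.Dataset.from_tensor_slices. The names SAMPLE_LINES, parse_line, parse_lines and make_dataset are hypothetical, and the assumption that every line ends with a comma-separated 0/1 vector comes only from the three sample rows.

```python
import re

# Hypothetical sample lines, copied from the format in the question.
SAMPLE_LINES = [
    "1 this is a test. 0,0,1,0",
    "2 this is another test 0,1,1,1",
    "3 one more test 1,0,0,1",
]

# Assumed layout: numeric id, free text, then a comma-separated 0/1 vector.
LINE_RE = re.compile(r"^(\d+)\s+(.*?)\s+([01](?:,[01])*)$")


def parse_line(line):
    """Split one line into (text, multi-hot label list)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised line: {line!r}")
    _, text, label_str = match.groups()
    return text, [float(v) for v in label_str.split(",")]


def parse_lines(lines):
    """Parse all lines into parallel lists of texts and label vectors."""
    texts, labels = [], []
    for line in lines:
        text, vec = parse_line(line)
        texts.append(text)
        labels.append(vec)
    return texts, labels


def make_dataset(texts, labels, batch_size=32):
    """Build a batched tf.data.Dataset of (text, multi-hot label) pairs."""
    # Imported here so the parsing helpers can be used without TensorFlow.
    import tensorflow as tf

    ds = tf.data.Dataset.from_tensor_slices((texts, labels))
    return ds.batch(batch_size).cache().prefetch(tf.data.AUTOTUNE)


texts, labels = parse_lines(SAMPLE_LINES)
```

Calling make_dataset(texts, labels) would then give a dataset that can go through the same cache/prefetch pipeline as the tutorial code. With labels in this multi-hot form, the usual pairing is a sigmoid output layer with one unit per label and a binary cross-entropy loss, though that choice is outside what the question asks.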