While trying to train the VAE, I always get this error:
TypeError: Target data is missing. Your model has loss: <function TPFVAE.train_model.<locals>.<lambda> at 0x19a69d430>, and therefore expects target data to be passed in fit().
As I am trying to do unsupervised learning with a VAE, there shouldn't be any targets to rely on. Apparently the function I referenced first takes care of that for the MNIST dataset.
But my data isn't image based. It is just a plain CSV file: lots of rows and columns of numbers, simply put. There are no target values, and the VAE should be responsible for drilling the dataset down to its most important information.
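Roughly, the pipeline behind self.data looks like this (the file name, dtype and split ratios are simplified placeholders, not the real code):
# Simplified placeholder for how self.data.X_train / X_val / X_test are produced
import numpy as np
import pandas as pd

df = pd.read_csv("sensor_data.csv")        # purely numeric columns, no label column
X = df.to_numpy(dtype=np.float32)

# 70/15/15 split into train / validation / test
rng = np.random.default_rng(42)
idx = rng.permutation(len(X))
train_end, val_end = int(0.7 * len(X)), int(0.85 * len(X))
X_train, X_val, X_test = X[idx[:train_end]], X[idx[train_end:val_end]], X[idx[val_end:]]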
My code setup is as follows right now:
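To make the snippets self-contained, these are the imports and aliases used throughout (the standard TensorFlow Probability ones):
# Imports and aliases assumed by the snippets below
import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow import keras
from tensorflow.keras import layers

tfk = tf.keras
tfkl = tf.keras.layers
tfd = tfp.distributions
tfpl = tfp.layers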
self.train_dataset = tf.data.Dataset.from_tensor_slices(self.data.X_train).batch(128)
self.val_dataset = tf.data.Dataset.from_tensor_slices(self.data.X_val).batch(128)
self.test_dataset = tf.data.Dataset.from_tensor_slices(self.data.X_test).batch(128)
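Inspecting the dataset confirms that each element is a single batch of features, with no target component:
# Every element is one tensor of shape (batch, n_features) -- there is no (input, target) pair
print(self.train_dataset.element_spec)
# e.g. TensorSpec(shape=(None, n_features), dtype=tf.float32, name=None)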
This is the model setup, following the aforementioned tutorial:
input_dimensions = self.data.inputs_dim
latent_space_dimensions = 5
activation = tf.keras.layers.ReLU()
# Prior assumption is currently a normal gaussian distribution
prior = tfd.Independent(tfd.Normal(loc=tf.zeros(latent_space_dimensions), scale=1),
                        reinterpreted_batch_ndims=1)
encoder_inputs = keras.Input(shape=(input_dimensions,))
h1 = layers.Dense(input_dimensions, activation=activation)(encoder_inputs)
h2 = layers.Dense(input_dimensions // 2, activation=activation)(h1)
h3 = layers.Dense(input_dimensions // 3, activation=activation)(h2)
# Parameterise a full-covariance Gaussian posterior over the latent space,
# with a KL divergence penalty against the prior added as an activity regularizer
h4 = tfkl.Dense(tfpl.MultivariateNormalTriL.params_size(latent_space_dimensions),
                activation=None)(h3)
h5 = tfpl.MultivariateNormalTriL(
    latent_space_dimensions,
    activity_regularizer=tfpl.KLDivergenceRegularizer(prior, weight=1.0))(h4)
self.encoder = keras.Model(encoder_inputs, h5, name="encoder")
# Build the decoder
decoder_inputs = keras.Input(shape=(latent_space_dimensions,))
h1 = layers.Dense(input_dimensions // 3, activation=activation)(decoder_inputs)
h2 = layers.Dense(input_dimensions // 2, activation=activation)(h1)
decoder_outputs = layers.Dense(input_dimensions)(h2)
self.decoder = keras.Model(decoder_inputs, decoder_outputs, name="decoder")
self.encoder.summary()
self.decoder.summary()
self.vae = tfk.Model(inputs=self.encoder.inputs,
                     outputs=self.decoder(self.encoder.outputs[0]))
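Pushing a random batch through the networks by hand (purely as a shape check, nothing from the real pipeline) behaves as I would expect:
# Sanity check with random data: the encoder yields a latent distribution,
# the full model reconstructs the input shape
dummy = tf.random.normal((4, input_dimensions))
z = self.encoder(dummy)          # MultivariateNormalTriL distribution over the latent space
print(z.sample().shape)          # (4, 5)
print(self.vae(dummy).shape)     # (4, input_dimensions)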
Compiling and fitting the model looks like this:
# Negative log-likelihood of the data under the decoded distribution (taken from the tutorial)
negloglik = lambda x, rv_x: -rv_x.log_prob(x)
self.vae.compile(optimizer=tf.optimizers.Adam(learning_rate=1e-3),
                 loss=negloglik)
_ = self.vae.fit(self.train_dataset,
                 epochs=15,
                 validation_data=self.val_dataset,
                 verbose=1)
However, as already mentioned, as soon as the model tries to learn, it errors out complaining about missing target data. I don't know what target data I am supposed to provide here or how to fix this issue. I don't have labels, and an implementation of a VAE without the probability library doesn't require labels either.
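The only workaround I can think of is to pair every batch with itself as its own reconstruction target, roughly like below, but I am not sure this is the intended way to do it with the probability layers:
# Guess: make each dataset element an (input, target) pair, using the input as its own target
self.train_dataset = tf.data.Dataset.from_tensor_slices(
    (self.data.X_train, self.data.X_train)).batch(128)
self.val_dataset = tf.data.Dataset.from_tensor_slices(
    (self.data.X_val, self.data.X_val)).batch(128)
Is that the right approach, or is there a way to tell fit() that the model has no separate targets?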