To train a keypoint detection model, I first tried a custom training loop with tf.GradientTape(). Data loading uses a tf.keras.utils.Sequence; the code is shown below:
import tensorflow as tf

epoch = 20
learning_rate = 0.001
model = MyModel()
optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)

for i in range(epoch):
    for j in range(len(my_training_batch_generator)):  # iterate over all (image, label) batches
        images, labels = my_training_batch_generator[j]  # images (4, 224, 224, 3), labels (4, 56, 56, 17)
        with tf.GradientTape() as tape:
            y_pred = model(images)  # model output (4, 56, 56, 17)
            loss = tf.square(labels - y_pred)
            loss = tf.reduce_mean(loss)
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(grads_and_vars=zip(grads, model.trainable_variables))
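For context, my_training_batch_generator is a tf.keras.utils.Sequence subclass. A minimal stand-in that only reproduces the interface and the batch shapes (the real generator loads images and builds keypoint heatmaps; the random data here is just to show the structure) would look roughly like this:

import numpy as np
import tensorflow as tf

class KeypointSequence(tf.keras.utils.Sequence):
    """Stand-in for my_training_batch_generator: yields batches with the shapes above."""

    def __init__(self, num_samples, batch_size=4):
        self.num_samples = num_samples
        self.batch_size = batch_size

    def __len__(self):
        # number of batches per epoch
        return self.num_samples // self.batch_size

    def __getitem__(self, idx):
        # In my real generator these come from image files and keypoint heatmaps;
        # random arrays here only illustrate the interface and shapes.
        images = np.random.rand(self.batch_size, 224, 224, 3).astype("float32")
        labels = np.random.rand(self.batch_size, 56, 56, 17).astype("float32")
        return images, labels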
This works very well and achieves good detection results. However, I want to make use of callbacks, so I tried training with model.fit instead:
def loss_function(y_true, y_pred):
    loss = tf.square(y_true - y_pred)
    loss = tf.reduce_mean(loss)
    return loss

model.compile(
    loss=loss_function,
    # loss=tf.keras.losses.MSE,
    optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
)

model.fit(
    x=my_training_batch_generator,  # tf.keras.utils.Sequence
    epochs=epoch,
)
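The callback usage I am aiming for would go through the callbacks argument of model.fit; for example (ModelCheckpoint is only one illustrative choice, not something from my working code):

# Illustration of the callback usage I want to add; ModelCheckpoint is just an example.
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
    filepath="keypoint_model_{epoch:02d}.h5",
    save_best_only=False,
)

model.fit(
    x=my_training_batch_generator,
    epochs=epoch,
    callbacks=[checkpoint_cb],
)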
Compared with the GradientTape version, nothing else in the setup changed, yet the accuracy after training is very poor, and I observe that the loss value stays the same in every epoch.
Is there any difference between these two training methods? This question has bothered me for days; thanks for your answer.