I’m currently training a Deep Q-Network with tf.GradientTape, as shown in the code below:
with tf.GradientTape() as tape:
    q_values_current_state_dqn = self.dqn_architecture(states)
    # One-hot encode the actions, e.g. [[0,0,1,0],[1,0,0,0],...]
    one_hot_actions = tf.keras.utils.to_categorical(actions, self.num_legal_actions, dtype=np.float32)
    q_values_current_state_dqn = tf.reduce_sum(tf.multiply(q_values_current_state_dqn, one_hot_actions), axis=1)
    error = q_values_current_state_dqn - target_q_values
    loss = tf.keras.losses.Huber()(target_q_values, q_values_current_state_dqn)
# Compute the gradients using the operations recorded in the context of this tape.
dqn_architecture_gradients = tape.gradient(loss, self.dqn_architecture.trainable_variables)
self.dqn_architecture.optimizer.apply_gradients(zip(dqn_architecture_gradients, self.dqn_architecture.trainable_variables))
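In case the masking step is unclear, here is a minimal plain-Python sketch of what the one-hot multiply followed by the sum over axis=1 is meant to compute. The Q-values and action indices are made up for illustration, and the one_hot helper is a hypothetical stand-in for to_categorical, not part of my actual code:

```python
# Made-up Q-values for a batch of 2 states with 4 legal actions each.
q_values = [[0.1, 0.5, 0.9, 0.2],
            [0.7, 0.3, 0.4, 0.6]]
actions = [2, 0]          # made-up indices of the actions taken
num_legal_actions = 4


def one_hot(index, size):
    """Hypothetical helper: 1.0 at `index`, 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(size)]


one_hot_actions = [one_hot(a, num_legal_actions) for a in actions]

# Element-wise multiply, then sum along each row (the axis=1 reduction).
# Only the Q-value of the action actually taken survives the mask.
selected_q_values = [
    sum(q * m for q, m in zip(q_row, mask_row))
    for q_row, mask_row in zip(q_values, one_hot_actions)
]
print(selected_q_values)
```

Each entry of selected_q_values is the Q-value of the action taken in that state, which is what gets compared against target_q_values in the Huber loss.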
I’d like to suppress the progress output that gets printed during training, shown below:
1/1 [==============================] - 0s 34ms/step
1/1 [==============================] - 0s 10ms/step
1/1 [==============================] - 0s 11ms/step
1/1 [==============================] - 0s 10ms/step
1/1 [==============================] - 0s 11ms/step
1/1 [==============================] - 0s 10ms/step
1/1 [==============================] - 0s 13ms/step
1/1 [==============================] - 0s 10ms/step
1/1 [==============================] - 0s 10ms/step
1/1 [==============================] - 0s 10ms/step
1/1 [==============================] - 0s 11ms/step
1/1 [==============================] - 0s 10ms/step
1/1 [==============================] - 0s 11ms/step
1/1 [==============================] - 0s 10ms/step
1/1 [==============================] - 0s 10ms/step
1/1 [==============================] - 0s 10ms/step
1/1 [==============================] - 0s 10ms/step
1/1 [==============================] - 0s 12ms/step
I understand that you can set verbose=0 when using model.fit(), but I’m unsure how to achieve the same thing when using a gradient tape.
Any help would be appreciated.