Calculating gradient from model with multiple outputs

I have a model with 16 outputs, and I want to calculate the derivative of each output with respect to time. Time is one of the 18 inputs to the model. I tried the code below:

import numpy as np
import tensorflow as tf

def ddt(model, x, t):
    u_t = np.zeros([50, 16])
    with tf.GradientTape(persistent=True) as tape:
        tape.watch(t)  # t is an ordinary tensor, so it has to be watched explicitly
        u_pred = model(tf.concat([x, t], axis=1))
        # take the per-output slices inside the tape so the slice ops are recorded
        cols = [u_pred[:, i:i+1] for i in range(16)]
    for i in range(16):
        # each gradient has shape (50, 1); drop the trailing axis before storing it
        u_t[:, i] = tape.gradient(cols[i], t)[:, 0]
    del tape
    return u_pred, u_t

u_pred has shape (50, 16). However, u_t comes back as all NaN.
tape.gradient(u_pred, t), on the other hand, does return a (50, 1) tensor of values.
Any suggestions?
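
If I understand the docs correctly, when the target passed to tape.gradient is non-scalar, the gradients of its elements are summed, so the (50, 1) result above is the derivative of the sum of all 16 outputs rather than 16 separate derivatives. A toy sketch of what I mean (two outputs standing in for my model):

import tensorflow as tf

# toy stand-in for the model: two outputs, t**2 and 3*t
t = tf.constant([[1.0], [2.0]])              # (2, 1) batch of time values
with tf.GradientTape() as tape:
    tape.watch(t)
    u = tf.concat([t**2, 3.0 * t], axis=1)   # (2, 2)

# non-scalar target: the gradients of all elements are summed, so this is
# d(t**2 + 3t)/dt = 2t + 3 per sample, not two separate derivatives
print(tape.gradient(u, t))                   # [[5.], [7.]]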

I am using TensorFlow version 2.10.0.

It looks like I need to use tape.batch_jacobian.
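Roughly what I have in mind (a sketch; I'm assuming x is (50, 17) and t is (50, 1) to match my batch, and that a single call on a non-persistent tape is enough):

import tensorflow as tf

def ddt(model, x, t):
    with tf.GradientTape() as tape:
        tape.watch(t)
        u_pred = model(tf.concat([x, t], axis=1))    # (50, 16)
    # batch_jacobian gives (50, 16, 1): for each sample, the derivative of each
    # of the 16 outputs with respect to that sample's single time input
    u_t = tape.batch_jacobian(u_pred, t)[:, :, 0]    # (50, 16)
    return u_pred, u_t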
When I compare the batch_jacobian output to np.gradient(u_pred, dt), they do differ. Is it correct to assume that batch_jacobian is closer to the true derivative, since np.gradient depends on the finite dt time step? For example, if dt got smaller and smaller, would np.gradient converge to the batch_jacobian result?
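
Here is the kind of check I was thinking of (a toy sketch with sin(t) standing in for the model, so the exact derivative is known, and a uniformly spaced time grid in place of my 50 samples; np.gradient only makes sense here if consecutive rows really are dt apart with all other inputs held fixed):

import numpy as np
import tensorflow as tf

def max_gap(dt):
    t = tf.reshape(tf.range(0.0, 1.0, dt), (-1, 1))   # uniformly spaced times, step dt
    with tf.GradientTape() as tape:
        tape.watch(t)
        u = tf.sin(t)                                  # toy "model"; exact derivative is cos(t)
    autodiff = tape.batch_jacobian(u, t)[:, :, 0]      # derivative from the tape
    finite_diff = np.gradient(u.numpy(), dt, axis=0)   # finite differences along the time axis
    return np.max(np.abs(finite_diff - autodiff.numpy()))

for dt in [0.1, 0.01, 0.001]:
    print(dt, max_gap(dt))   # the gap shrinks as dt gets smaller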