Model: 8 hidden layers, hidden size 20, input size 2, output size 1; TF 2.4.0.
I am confused: why do different ways of calculating the second derivative give different results?
@tf.function
def get_vanilla_hess(model, xs):
    with tf.GradientTape(persistent=True) as tape:
        tape.watch(xs)
        ys = model(xs)
        xbar = tape.gradient(ys, xs)
        xbarbar = tape.batch_jacobian(xbar, xs)
    return (ys, xbar, xbarbar)
print(get_vanilla_hess(vanilla_model, X_r)[-1][:, 0, 0])
returns
[-0.0004067 , -0.00038697, -0.00037729, ..., -0.00035329,
-0.00038197, -0.00038998]
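To sanity-check what the batch_jacobian variant computes, here is a minimal toy sketch (using a simple analytic function y = x1^2 * x2 in place of the actual vanilla_model and X_r): tape.batch_jacobian(xbar, xs) gives a per-sample (2, 2) Hessian, and its [:, 0, 0] slice matches the analytic d^2y/dx1^2 = 2*x2.

import tensorflow as tf

xs_toy = tf.constant([[1.0, 2.0], [3.0, 4.0]])

@tf.function
def toy_hess(xs):
    # Same structure as get_vanilla_hess, but with y = x1^2 * x2 so the
    # Hessian is known analytically.
    with tf.GradientTape(persistent=True) as tape:
        tape.watch(xs)
        ys = xs[:, 0] ** 2 * xs[:, 1]
        xbar = tape.gradient(ys, xs)              # (batch, 2) first derivatives
        xbarbar = tape.batch_jacobian(xbar, xs)   # (batch, 2, 2) per-sample Hessian
    return xbarbar

print(toy_hess(xs_toy)[:, 0, 0])  # analytic d^2y/dx1^2 = 2*x2 -> [4., 8.]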
By contrast,
@tf.function
def get_vanilla_hess_alt(model, xs):
    with tf.GradientTape(persistent=True) as tape:
        tape.watch(xs)
        ys = model(xs)
        xbar = tape.gradient(ys, xs)
        xbarbar = tape.gradient(xbar, xs)
    return (ys, xbar, xbarbar)
print(get_vanilla_hess_alt(vanilla_model, X_r)[-1][:, 0])
returns
[-0.00036503, -0.00033761, -0.00032976, ..., -0.00029553,
-0.00032992, -0.00034215]
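On the same toy function, the gradient-of-gradient variant gives different numbers: if I understand the tape semantics correctly, when the target passed to tape.gradient() is not a scalar, TF differentiates its sum, so this returns the Hessian summed over one index (d^2y/dx1^2 + d^2y/dx1dx2) rather than the pure diagonal entry, which would explain the discrepancy above. A sketch under the same toy assumptions:

@tf.function
def toy_hess_alt(xs):
    # Same toy function y = x1^2 * x2, but second derivative via a second
    # tape.gradient() call instead of batch_jacobian().
    with tf.GradientTape(persistent=True) as tape:
        tape.watch(xs)
        ys = xs[:, 0] ** 2 * xs[:, 1]
        xbar = tape.gradient(ys, xs)        # (batch, 2) first derivatives
        xbarbar = tape.gradient(xbar, xs)   # (batch, 2): d(sum of xbar)/dx per sample
    return xbarbar

print(toy_hess_alt(xs_toy)[:, 0])  # 2*x2 + 2*x1 -> [6., 14.], not [4., 8.]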
Also: a manually constructed graph for computing the Hessian returns
[-0.00040658, -0.00038687, -0.00037727, ..., -0.0003532 ,
-0.00038189, -0.00039003]
Does tape.gradient() + tape.gradient() return the same result as tape.gradient() + tape.batch_jacobian() on the diagonal (d^2f/dx^2)?