I have realized that I can take the gradient of a vector with respect to an input. In other words, I can run:
import numpy as np
import tensorflow as tf

w = tf.Variable([[1., 2.], [3., 4.]], name='w')
x = tf.Variable([[1., 2.]], name='x')

with tf.GradientTape(persistent=True) as tape:
    y = x @ w   # y has shape (1, 2): a vector, not a scalar
    loss = y

grad = tape.gradient(loss, [x])
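Printing grad, I get a single (1, 2) tensor rather than a 2x2 matrix (the exact repr may vary by TF version, but the values should be):

print(grad)
# [<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[3., 7.]], dtype=float32)>]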
Here y is a vector, so I would expect the gradient to be a matrix: I am asking for the derivative of each coordinate of y with respect to each coordinate of x. Since y_j = sum_i x_i * w_ij, each entry dy_j/dx_i is just w_ij, so the result should be the 2x2 Jacobian. What is happening under the hood, given that I am getting a vector instead?
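For comparison, here is a minimal sketch of the matrix I expected, computed with tape.jacobian (same w and x as above, repeated so the snippet stands alone):

import tensorflow as tf

w = tf.Variable([[1., 2.], [3., 4.]], name='w')
x = tf.Variable([[1., 2.]], name='x')

with tf.GradientTape() as tape:
    y = x @ w  # y has shape (1, 2)

# jacobian() keeps one partial derivative per (output coordinate, input coordinate) pair
jac = tape.jacobian(y, x)       # shape (1, 2, 1, 2)
print(tf.reshape(jac, (2, 2)))
# tf.Tensor(
# [[1. 3.]
#  [2. 4.]], shape=(2, 2), dtype=float32)   i.e. tf.transpose(w)

So tape.jacobian gives the full 2x2 matrix, while tape.gradient collapses it to a vector; it is this collapsing behavior I want explained.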