Autograd unable to find gradient

I’m trying to use autograd to find the Hessian of a cost function for a problem I want to solve, but TensorFlow doesn’t seem to be able to calculate the gradient and I’m not sure why. My code looks like this:

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

from AFutilsPhaseMag import getParams

# get shared sim parameters (these don't change w/in a run in this test)
[f0, rho, tht, x, y, fi] = getParams()

# convert to tf constants for use in calculations
a = tf.constant(np.array([1., 1.]), name='a', dtype = tf.complex128)
alpha = tf.constant(np.array([0., 0.]), name='alpha', dtype = tf.complex128)
x = tf.constant(x, name = 'x', dtype = tf.complex128)
y = tf.constant(y, name = 'y', dtype = tf.complex128)
rho = tf.constant(rho, name = 'rho', dtype = tf.complex128)
tht = tf.constant(tht, name = 'tht', dtype = tf.complex128)

# record everything on a gradient tape
with tf.GradientTape() as tape:
    # watch the optimized agent params
    tape.watch(a)
    tape.watch(alpha)
    
    # get num Tx and Rx
    Na = x.numpy().size
    Ns = rho.numpy().size
    
    # make placeholder gains and storage for rec'd AF
    # TODO: gain calculation - this is just to get working grad calculations
    k = tf.constant(tf.ones((Na,1),dtype = tf.complex128), dtype = tf.complex128)
    AF = tf.Variable(tf.zeros((Ns,1), dtype = tf.complex128), dtype=tf.complex128)
    
    # calculated rec'd AF (for each Rx find and sum the contribution from each Tx)
    for rec in range(Ns):
        nextAF = tf.Variable(np.array(0), dtype = tf.complex128)
        for agent in range(Na):
            nextAF = nextAF + tf.exp(1j*alpha[agent] + k[agent]*x[agent]*tf.math.cos(tht[rec]) + k[agent]*y[agent]*tf.math.sin(tht[rec]))
            AF[rec].assign(nextAF)
            
    # convert to dB
    AF = 20*tf.math.log(AF)
    
    # find total error
    err = tf.Variable(tf.reduce_sum(tf.abs(fi - AF)), name = 'err')
    
# get gradient of error w.r.t. optimized params
gradVars = {'a': a, 'alpha': alpha}
grad = tape.gradient(err, gradVars)

Where getParams() is a helper function that just returns a handful of numpy arrays. When I run this script I find that

grad = {'a': None, 'alpha': None}

I’ve read through the tutorials on using autograd and how tf.Variables work multiple times, but I can’t figure out what I’m doing wrong. Can anyone else spot my mistake?

TensorFlow 2.16 Repo via JARaaS Hybrid RAG

The issue you are facing with TensorFlow not being able to compute the gradients could be due to several factors. Based on your description and the provided code, here are some key points to consider:

  1. Tape Context: Every operation that links the watched tensors to the loss must run inside the GradientTape context. Your code already does this, so the tape context itself is unlikely to be the problem.

  2. Complex Numbers: TensorFlow can differentiate through complex128 tensors, and tf.exp, tf.math.cos, tf.math.sin, tf.math.log, and tf.abs all accept complex inputs. The dtype alone should therefore not produce a gradient of None.

  3. Variable Assignments: This is the most likely cause. Writing a result into a tf.Variable with assign is a stateful update, and the tape does not differentiate through it. Once the per-receiver sums are stored in AF this way, the path from err back to alpha is broken. Accumulating with ordinary tensor ops and combining the results with tf.stack keeps that path intact.

  4. Loss Conversion: Wrapping the loss in tf.Variable(...) creates a new variable initialized from the value, which also disconnects it from the watched tensors. Keep err as the plain tensor returned by tf.reduce_sum. Note as well that a never appears in the AF calculation, so its gradient will be None until a is actually used.
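To make point 3 concrete, here is a minimal sketch with toy values (not the data from the question; all names are illustrative). It compares storing a result in a tf.Variable via assign with keeping it as an ordinary tensor.

```python
import tensorflow as tf

# Toy stand-in for the watched parameters (illustrative values only).
alpha = tf.constant([0.1, 0.2], dtype=tf.float64)

# Version 1: store the result in a tf.Variable with assign().
with tf.GradientTape() as tape:
    tape.watch(alpha)
    store = tf.Variable(tf.zeros((), dtype=tf.float64))
    # assign() is a stateful update; the tape does not link it back to alpha.
    store.assign(tf.reduce_sum(tf.sin(alpha)))
    loss_assign = store * 1.0
grad_assign = tape.gradient(loss_assign, alpha)

# Version 2: keep the result as an ordinary tensor.
with tf.GradientTape() as tape:
    tape.watch(alpha)
    loss_tensor = tf.reduce_sum(tf.sin(alpha))
grad_tensor = tape.gradient(loss_tensor, alpha)

print(grad_assign)  # None: the path to alpha was broken by assign()
print(grad_tensor)  # elementwise cos(alpha)
```

The first version reproduces the None result from the question in miniature; the second returns a gradient because every step from alpha to the loss is a differentiable tensor op.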

Here’s a refined version of your code to address these points:

import numpy as np
import tensorflow as tf

from AFutilsPhaseMag import getParams

# getParams is the helper from the question; it returns NumPy arrays
[f0, rho, tht, x, y, fi] = getParams()

# Convert to tf constants for use in calculations
a = tf.constant(np.array([1., 1.]), name='a', dtype=tf.complex128)
alpha = tf.constant(np.array([0., 0.]), name='alpha', dtype=tf.complex128)
x = tf.constant(x, name='x', dtype=tf.complex128)
y = tf.constant(y, name='y', dtype=tf.complex128)
rho = tf.constant(rho, name='rho', dtype=tf.complex128)
tht = tf.constant(tht, name='tht', dtype=tf.complex128)
# Convert the target too, so the subtraction below has matching dtypes
fi = tf.constant(fi, name='fi', dtype=tf.complex128)

# Get num Tx and Rx (plain Python ints, so this can live outside the tape)
Na = x.numpy().size
Ns = rho.numpy().size

# Placeholder gains
k = tf.ones((Na, 1), dtype=tf.complex128)

# Record everything on a gradient tape
with tf.GradientTape(persistent=True) as tape:
    # Watch the optimized agent params (constants are not watched automatically)
    tape.watch(a)
    tape.watch(alpha)

    # Calculate rec'd AF (for each Rx, find and sum the contribution from each Tx).
    # Accumulate with plain tensor ops; no tf.Variable or assign() here.
    AF_rows = []
    for rec in range(Ns):
        nextAF = tf.zeros((1,), dtype=tf.complex128)
        for agent in range(Na):
            nextAF = nextAF + tf.exp(1j * alpha[agent] + k[agent] * x[agent] * tf.math.cos(tht[rec]) + k[agent] * y[agent] * tf.math.sin(tht[rec]))
        AF_rows.append(nextAF)
    AF = tf.stack(AF_rows)  # shape (Ns, 1), same as the original storage

    # Convert to dB (kept as in the question; note tf.math.log is the natural log)
    AF_dB = 20 * tf.math.log(AF)

    # Find total error; keep it a plain tensor, not a tf.Variable
    err = tf.reduce_sum(tf.abs(fi - AF_dB))

# Get gradient of error w.r.t. optimized params.
# 'a' is not used in the calculation above, so its gradient is still None.
gradVars = [a, alpha]
grad = tape.gradient(err, gradVars)

print(f"Gradient with respect to 'a': {grad[0]}")
print(f"Gradient with respect to 'alpha': {grad[1]}")

Key Adjustments:

  1. Persistent Tape: persistent=True lets you call gradient more than once on the same tape, which helps when debugging. It is not required for a single gradient call.
  2. Assign Operations: The tf.Variable storage and the assign calls are gone. The per-receiver sums are accumulated as ordinary tensors and combined with tf.stack, so the tape can trace err back to alpha.
  3. Loss Calculation: err is the tensor returned by tf.reduce_sum rather than a new tf.Variable, so it stays connected to the watched tensors.
  4. Unused Parameter: a never appears in the AF calculation, so its gradient remains None until a is used in the expression.

By double-checking the points mentioned and using the refined code, you should be able to compute the gradients and proceed with obtaining the Hessian for your cost function.
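As an optional alternative to the nested Python loops, the same sum can be written with broadcasting. The sketch below is only an illustration under assumptions: the arrays are hypothetical 1-D stand-ins for what getParams() returns, and array_factor is a made-up helper name.

```python
import numpy as np
import tensorflow as tf

# Hypothetical 1-D stand-ins for the arrays returned by getParams().
x = tf.constant([0.0, 0.5], dtype=tf.complex128)   # Tx x-positions, shape (Na,)
y = tf.constant([0.0, 0.0], dtype=tf.complex128)   # Tx y-positions, shape (Na,)
tht = tf.constant(np.linspace(0.0, np.pi, 5), dtype=tf.complex128)  # Rx angles, shape (Ns,)
k = tf.ones_like(x)                                # placeholder gains, shape (Na,)
alpha = tf.constant([0.0, 0.0], dtype=tf.complex128)

def array_factor(alpha):
    """Sum the per-Tx terms for every Rx at once; returns shape (Ns,)."""
    cos_t = tf.math.cos(tht)[:, None]              # shape (Ns, 1)
    sin_t = tf.math.sin(tht)[:, None]              # shape (Ns, 1)
    # (1, Na) per-agent terms broadcast against (Ns, 1) angles -> (Ns, Na)
    expo = 1j * alpha[None, :] + (k * x)[None, :] * cos_t + (k * y)[None, :] * sin_t
    return tf.reduce_sum(tf.exp(expo), axis=1)

with tf.GradientTape() as tape:
    tape.watch(alpha)
    AF = array_factor(alpha)
    err = tf.reduce_sum(tf.abs(AF))                # simplified stand-in for the real error
grad = tape.gradient(err, alpha)

print(AF.shape)    # (5,)
print(grad.shape)  # (2,)
```

Because no intermediate storage is involved, there is nothing to assign into, and the tape sees a single chain of differentiable ops.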

Additional Resources:

If the issue persists, consider verifying the differentiability of every operation inside the GradientTape context and consulting TensorFlow’s documentation on higher-order gradients and complex numbers.
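Once first-order gradients work, one common way to get a Hessian is to nest two tapes and take the Jacobian of the gradient. The sketch below uses a hypothetical real-valued cost and a float64 parameter vector purely to show the pattern; it is not the cost function from the question.

```python
import tensorflow as tf

def cost(p):
    # Hypothetical cost: sum(p^3) + (sum(p))^2, chosen so the Hessian is easy to check.
    return tf.reduce_sum(p ** 3) + tf.reduce_sum(p) ** 2

p = tf.Variable([1.0, 2.0], dtype=tf.float64)

with tf.GradientTape() as outer:
    with tf.GradientTape() as inner:
        c = cost(p)
    # First derivatives, computed inside the outer tape so they are recorded too.
    g = inner.gradient(c, p)       # shape (2,)
# Second derivatives: Jacobian of the gradient with respect to p.
hess = outer.jacobian(g, p)        # shape (2, 2)

print(hess.numpy())  # [[8, 2], [2, 14]] for p = [1, 2]
```

For complex-valued parameters the meaning of a second derivative needs more care. If alpha is physically a real phase, one option (an assumption about the model, not something stated in the question) is to keep it as a float64 tensor and cast it to complex128 only inside the expression.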