Calculate Hessian of loss with respect to model layer for a batch of samples

Hossein_Arjomandi · November 21, 2022, 3:47am

Hello!

I want to calculate the influence score for a model. This requires computing the average of the second derivative of the training loss over the whole dataset. The picture below shows it:

The influence score can be calculated as:

What I need to do is to calculate H once and then plug in different test/train values into the left and right terms, respectively.
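
For reference, in the notation of the influence functions paper (arXiv:1703.04730), which this post appears to follow, the two quantities would read as below. This is an assumption about what the post refers to; the symbols (z_i for training points, z_test for a test point, θ̂ for the trained parameters, L for the loss) come from that paper, not from the post itself.

```latex
% Average Hessian of the training loss over the n training points
H_{\hat\theta} = \frac{1}{n} \sum_{i=1}^{n} \nabla_\theta^2 L(z_i, \hat\theta)

% Influence of up-weighting training point z on the loss at test point z_test
\mathcal{I}_{\text{up,loss}}(z, z_{\text{test}})
  = -\nabla_\theta L(z_{\text{test}}, \hat\theta)^{\top}
    \, H_{\hat\theta}^{-1} \,
    \nabla_\theta L(z, \hat\theta)
```

Under this reading, the "left term" is the test-point gradient and the "right term" is the training-point gradient, while H depends only on the training set and can be computed once.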

First of all, it turns out that if I am to calculate the second derivative, I have to do it per layer of the model. Ok, this is acceptable.

My question is, how could I vectorize this Hessian calculation for a batch of samples? For example, given that my model layer is a fully connected layer of shape (200, 10) and a batch of 13 images, what I need is a tensor of shape (13,200,10,200,10) which is the desired H.

So far I have been able to do it element-wise, but that is super slow, and I have had a very hard time finding a way to vectorize the process.

Thanks.
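
One possible way to vectorize the per-sample Hessian asked about above is to give every sample its own copy of the layer weights and call `tf.GradientTape.batch_jacobian` on the resulting per-sample gradients. The sketch below is only an illustration under stated assumptions: the layer is a bias-free dense layer whose output goes straight into a softmax cross-entropy loss, labels are one-hot, and the function name `per_sample_layer_hessians` is made up for this example. For a layer in the middle of a network, the forward pass inside the inner tape would have to be replaced by the rest of the model.

```python
import tensorflow as tf


def per_sample_layer_hessians(w, x, y_onehot):
    """Per-sample Hessian of softmax cross-entropy w.r.t. dense weights.

    w: (d_in, d_out) weights, x: (B, d_in) inputs, y_onehot: (B, d_out).
    Returns a tensor of shape (B, d_in, d_out, d_in, d_out).
    Hypothetical helper: assumes logits = x @ w with no bias.
    """
    w = tf.convert_to_tensor(w)
    batch = x.shape[0]
    # One independent copy of the weights per sample, so that the gradient
    # of the summed loss w.r.t. copy b is exactly the gradient of sample b.
    w_tiled = tf.tile(w[tf.newaxis, ...], [batch, 1, 1])
    with tf.GradientTape() as outer:
        outer.watch(w_tiled)
        with tf.GradientTape() as inner:
            inner.watch(w_tiled)
            # logits[b, o] = sum_i x[b, i] * w_tiled[b, i, o]
            logits = tf.reduce_sum(x[:, :, tf.newaxis] * w_tiled, axis=1)
            log_probs = tf.nn.log_softmax(logits, axis=-1)
            losses = -tf.reduce_sum(y_onehot * log_probs, axis=-1)  # (B,)
            total = tf.reduce_sum(losses)
        grads = inner.gradient(total, w_tiled)  # (B, d_in, d_out)
    # Jacobian of grads[b] w.r.t. w_tiled[b], computed for all b at once.
    return outer.batch_jacobian(grads, w_tiled)


# Small demo; sizes are reduced from the (200, 10) layer in the question.
x_demo = tf.random.normal([13, 20])
y_demo = tf.one_hot(tf.random.uniform([13], maxval=10, dtype=tf.int32), 10)
w_demo = tf.random.normal([20, 10])
h_demo = per_sample_layer_hessians(w_demo, x_demo, y_demo)
print(h_demo.shape)  # (13, 20, 10, 20, 10)
```

If only the batch-averaged H is needed, `tf.reduce_mean(h_demo, axis=0)` collapses the sample dimension. At the sizes in the question, the full (13, 200, 10, 200, 10) tensor holds 52 million entries, roughly 208 MB in float32, so accumulating the average over smaller batches may be more practical.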

Bhack · November 21, 2022, 10:09pm

Do you want to replicate [1703.04730] Understanding Black-box Predictions via Influence Functions?

I have not tested the repo personally but have you already tried to look at:

Hossein_Arjomandi · November 21, 2022, 10:26pm

Hi.
Thank you for your reply.

Yes, I want to replicate it. The thing is, the formula I mentioned is the vanilla version, which is too slow. There are approximation methods, such as those based on the Hessian-vector product.

I want to implement the vanilla version on MNIST first. Since it is very expensive, I want to see how good the results of the original version are before switching to approximation methods.
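
The Hessian-vector product mentioned above avoids building H at all: differentiating the scalar (gradient · v) a second time yields H v directly. The following is a minimal sketch of that idea with nested `tf.GradientTape`s; the helper name `hessian_vector_product` and the toy quadratic loss are assumptions made for illustration, not part of any library or of the paper's code.

```python
import tensorflow as tf


def hessian_vector_product(loss_fn, w, v):
    """Return H v, where H is the Hessian of loss_fn at w.

    loss_fn: callable mapping a tensor w to a scalar loss (hypothetical).
    w, v: tensors of the same shape.
    """
    w = tf.convert_to_tensor(w)
    v = tf.convert_to_tensor(v, dtype=w.dtype)
    with tf.GradientTape() as outer:
        outer.watch(w)
        with tf.GradientTape() as inner:
            inner.watch(w)
            loss = loss_fn(w)
        grad = inner.gradient(loss, w)
        # v is treated as a constant, so d(grad . v)/dw = H v.
        grad_dot_v = tf.reduce_sum(grad * v)
    return outer.gradient(grad_dot_v, w)


# Toy check: for loss = 0.5 * w^T A w with symmetric A, the Hessian is A.
A = tf.constant([[2.0, 1.0], [1.0, 3.0]])


def quadratic_loss(w):
    return 0.5 * tf.reduce_sum(w * tf.linalg.matvec(A, w))


hv = hessian_vector_product(quadratic_loss, tf.constant([1.0, -1.0]),
                            tf.constant([1.0, 2.0]))
print(hv.numpy())  # [4. 7.], i.e. A @ v
```

Each call costs about as much as a couple of gradient evaluations, which is why iterative solvers built on such products are a common way to approximate the inverse-Hessian term without ever forming or inverting H.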
