Hello!

I want to calculate the influence score for a model. This requires the Hessian of the training loss, i.e. its second derivative with respect to the parameters, averaged over the whole dataset. The picture below shows it:
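Written out, assuming the usual empirical-risk notation (the symbols here are my own labels: $n$ training points $z_i$, parameters $\theta$, per-sample loss $L$), the averaged Hessian would be:

$$
H_{\theta} = \frac{1}{n} \sum_{i=1}^{n} \nabla_{\theta}^{2} L(z_i, \theta)
$$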

The influence score can be calculated as:
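Assuming the standard influence-function form (with $H_{\theta}$ the Hessian of the training loss averaged over the dataset, and $z$, $z_{\text{test}}$ a training and a test point; the notation is my assumption), the score would read:

$$
\mathcal{I}(z, z_{\text{test}}) = -\nabla_{\theta} L(z_{\text{test}}, \theta)^{\top} \, H_{\theta}^{-1} \, \nabla_{\theta} L(z, \theta)
$$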

What I need to do is to calculate H once and then plug in different test/train values into the left and right terms, respectively.
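To make the "calculate H once, then plug in" idea concrete, here is a minimal sketch, assuming PyTorch and the standard form of the score, minus the test gradient times the inverse Hessian times the train gradient. `H`, `g_test` and `g_train` are random placeholders standing in for the real flattened Hessian and per-sample gradients, and the `damping` term is my own assumption to keep the matrix invertible.

```python
import torch

torch.manual_seed(0)
torch.set_default_dtype(torch.float64)  # extra precision for the linear solve

d, n_test, n_train = 30, 4, 6  # hypothetical sizes; d = number of layer parameters

# Placeholder Hessian: a random symmetric positive semi-definite matrix.
A = torch.randn(d, d)
H = A @ A.T / d

# Placeholder per-sample gradients, one flattened gradient per row.
g_test = torch.randn(n_test, d)
g_train = torch.randn(n_train, d)

# Assumption: a small damping term so the factorization exists.
damping = 1e-3
H_damped = H + damping * torch.eye(d)

# Factor once, then reuse the factor for any number of right-hand sides.
L = torch.linalg.cholesky(H_damped)
H_inv_g_train = torch.cholesky_solve(g_train.T, L)  # (d, n_train)

# influence[i, j] = -g_test[i]^T H^{-1} g_train[j]
influence = -g_test @ H_inv_g_train
print(influence.shape)  # torch.Size([4, 6])
```

Solving against the factor avoids forming the inverse explicitly, and the same `L` can be reused for every new batch of test or train gradients.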

First of all, it turns out that if I am to calculate the second derivative, I have to do it per layer of the model. Ok, this is acceptable.

My question is: how can I vectorize this Hessian calculation over a batch of samples? For example, given a fully connected layer with a weight of shape (200, 10) and a batch of 13 images, what I need is a tensor of shape (13, 200, 10, 200, 10) holding the per-sample Hessians, which I can then average over the batch dimension to get the desired H.

So far I have only been able to do it element-wise, which is super slow, and I have had a very hard time finding a way to vectorize the process.
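For reference, this is the kind of thing I am after: a minimal sketch, assuming PyTorch 2.x and its `torch.func` module, that maps a per-sample Hessian over the batch. The softmax cross-entropy loss, the one-hot targets and the helper name `sample_loss` are assumptions for illustration, and I use a smaller input size than 200 to keep memory low. With the shapes above the result would be (13, 200, 10, 200, 10), which is 52,000,000 entries, roughly 208 MB in float32.

```python
import torch
from torch.func import hessian, vmap

torch.manual_seed(0)
n_in, n_out, batch = 20, 10, 13  # the question uses n_in = 200

W = 0.1 * torch.randn(n_in, n_out)       # weight of the fully connected layer
x = torch.randn(batch, n_in)             # flattened inputs, one row per sample
labels = torch.randint(0, n_out, (batch,))
y = torch.eye(n_out)[labels]             # one-hot targets, shape (batch, n_out)

def sample_loss(W, x_i, y_i):
    """Cross-entropy loss of a single sample (hypothetical example loss)."""
    logits = x_i @ W                                      # (n_out,)
    # Log-softmax from elementary ops; numerically naive, fine for a sketch.
    log_p = logits - torch.log(torch.exp(logits).sum())
    return -(y_i * log_p).sum()

# hessian differentiates twice w.r.t. the first argument (W);
# vmap maps over the batch dimension of x and y while sharing W.
per_sample_H = vmap(hessian(sample_loss), in_dims=(None, 0, 0))(W, x, y)
print(per_sample_H.shape)  # torch.Size([13, 20, 10, 20, 10])

# Averaging over the batch dimension gives H for this batch.
H = per_sample_H.mean(dim=0)
print(H.shape)  # torch.Size([20, 10, 20, 10])
```

For the full dataset, the batch means could be accumulated into a running average so that only one (n_in, n_out, n_in, n_out) tensor is kept between batches.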

Thanks.