Given a trained TensorFlow CNN model for image classification into 5 classes, I want to find the neuron in the last convolution layer that has the maximum activation (if several neurons share the highest value, any one of them can be taken). Then I want to trace this neuron back to the original image and find the patch that caused it to fire (and draw a small box around it), as shown in this YouTube video.
For example, here is the architecture of one of my CNN models:
from tensorflow import keras

USE_BIAS = True
arch1 = keras.Sequential([
    keras.layers.Conv2D(8, 11, strides=4, padding='valid', activation='relu', input_shape=(224, 224, 3), use_bias=USE_BIAS),
    keras.layers.MaxPooling2D(3, strides=2),
    keras.layers.Conv2D(16, 5, strides=1, padding='valid', activation='relu', use_bias=USE_BIAS),
    keras.layers.MaxPooling2D(3, strides=2),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu', use_bias=USE_BIAS),
    keras.layers.Dense(5, activation='softmax', use_bias=USE_BIAS)
])
After I compile and fit this model, I do one forward pass on one image. I want to find the neuron, say N, in the second Conv2D layer that has the highest output, then trace back to the MaxPooling2D layer before it and find which neurons are connected to N (i.e. which patch N is computed from), then trace that patch back to the first Conv2D layer, and finally to the original image.
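To make the trace-back concrete, the index arithmetic I have in mind could be sketched roughly as below. This is only a minimal sketch under assumptions: the names `LAYERS`, `trace_back` and `receptive_box` are hypothetical, the kernel sizes and strides are copied from the architecture above, and every layer is assumed to use padding='valid'. It returns the full block of pixels that can influence N, not the narrower path through whichever element actually won each max-pooling window.

```python
# (kernel_size, stride) of each layer between the image and the second Conv2D,
# copied from the architecture above; all are assumed to use padding='valid'.
LAYERS = [(11, 4), (3, 2), (5, 1)]  # Conv2D, MaxPooling2D, Conv2D


def trace_back(index, layers=LAYERS):
    """Map one output index along one axis to the inclusive
    (first, last) range of input pixels that can influence it."""
    lo = hi = index
    # Walk from the last layer back to the image.
    for kernel, stride in reversed(layers):
        lo = lo * stride               # first input index covered by window `lo`
        hi = hi * stride + kernel - 1  # last input index covered by window `hi`
    return lo, hi


def receptive_box(row, col, layers=LAYERS):
    """Return (top, left, bottom, right) of the image patch, inclusive,
    seen by the neuron at (row, col) in the second Conv2D output."""
    top, bottom = trace_back(row, layers)
    left, right = trace_back(col, layers)
    return top, left, bottom, right
```

With these layer settings the second Conv2D output should be 22 x 22 per filter, so `row` and `col` would range from 0 to 21, and each neuron would cover a 51 x 51 pixel patch of the 224 x 224 image. The `row` and `col` themselves would have to come from the position of the largest value in that layer's activations for the image.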
How can this be done?
I’ve tried searching online, but all I could find were ways to generate feature maps or to find inputs that maximally activate a layer, and these don’t help me.
I am a beginner in TensorFlow, so I don’t really know where to begin.