Hello there!
I am trying to quantize a custom model for image classification. The idea is to create a model with two branches: an early-exit branch and a main branch. Depending on the outcome of the early-exit branch, the model should either take that result as the output or continue through the backbone for a final prediction. I trained the model with both outputs and a weighted-sum loss; for inference, however, I did the following to make sure only one output is produced:
```python
def EE_new():
    input = Input(shape=[32, 32, 3])
    x = EE.layers[1](input)
    x = EE.layers[2](x)
    tres = EE.layers[3](x)
    x = EE.layers[4](tres)
    x = EE.layers[5](x)
    x = EE.layers[6](x)
    x = EE.layers[7](x)
    x = EE.layers[8](x)
    x = EE.layers[9]([x, tres])
    activation_7 = EE.layers[10](x)
    x = EE.layers[11](activation_7)
    x = EE.layers[12](x)
    x = EE.layers[13](x)
    x = EE.layers[14](x)
    x = EE.layers[16](x)
    activation_7 = EE.layers[15](activation_7)
    x = EE.layers[17]([x, activation_7])
    common = EE.layers[18](x)
    output1 = EE.layers[-4](common)
    output1 = EE.layers[-2](output1)
    output = ChooseBranchLayer()(common, output1)
    return Model(inputs=input, outputs=output)
```
where ChooseBranchLayer is:
```python
class ChooseBranchLayer(tf.keras.layers.Layer):
    def __init__(self):
        super(ChooseBranchLayer, self).__init__()

    def call(self, common_output, output1):
        top_values, top_indices = tf.math.top_k(output1, k=2)
        condition = (tf.tensordot(top_values, [5.39075334, -1.86204806], axes=1) - 3.78367282) > 0
        return tf.cond(condition, lambda: output1, lambda: self.branch2(common_output))

    def branch2(self, common_output):
        x = EE.layers[19](common_output)
        x = EE.layers[20](x)
        x = EE.layers[21](x)
        x = EE.layers[22](x)
        x = EE.layers[24](x)
        common = EE.layers[23](common_output)
        x = EE.layers[25]([x, common])
        activation_11 = EE.layers[26](x)
        x = EE.layers[27](activation_11)
        x = EE.layers[28](x)
        x = EE.layers[29](x)
        x = EE.layers[30](x)
        x = EE.layers[32](x)
        activation_11 = EE.layers[31](activation_11)
        x = EE.layers[33]([x, activation_11])
        x = EE.layers[34](x)
        x = EE.layers[35](x)
        x = EE.layers[37](x)
        output2 = EE.layers[39](x)
        return output2
```
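To make the gating concrete: the condition in `call` is just a linear threshold on the two largest softmax values of the early-exit output. A standalone NumPy sketch of the same rule, with illustrative probabilities (the weights and bias are the ones learned above):

```python
import numpy as np

# Gating weights and bias, taken from ChooseBranchLayer.call above
w = np.array([5.39075334, -1.86204806])
b = -3.78367282

def take_early_exit(top_values):
    # top_values: the two largest softmax probabilities, descending
    return float(np.dot(top_values, w) + b) > 0

confident = take_early_exit([0.90, 0.05])  # high top-1, small runner-up -> True
uncertain = take_early_exit([0.45, 0.40])  # two close candidates -> False
```

So a confident early-exit prediction returns immediately, while an ambiguous one falls through to `branch2`.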
This model works fine in TensorFlow, and conversion to TFLite succeeds as well. However, after providing a representative dataset to attempt post-training integer quantization, I get the following error when calling the invoke method:
```
RuntimeError: tensorflow/lite/kernels/conv.cc:374 affine_quantization->zero_point->data[i] != 0 (-11 != 0)
Node number 32 (CONV_2D) failed to prepare.
Node number 27 (IF) failed to prepare.
```
Dynamic-range quantization works fine as well, but I would love to achieve integer-only quantization.
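For reference, this is roughly the conversion setup I am using. The tiny `Sequential` model and the random `x_calib` array below are placeholders so the snippet runs on its own; in my case the model is the `EE_new()` network above and the calibration images come from my training set:

```python
import numpy as np
import tensorflow as tf

# Placeholder stand-in for my actual EE_new() model
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Placeholder calibration images; I feed real training samples here
x_calib = np.random.rand(100, 32, 32, 3).astype(np.float32)

def representative_dataset():
    for i in range(100):
        yield [x_calib[i:i + 1]]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full integer-only quantization
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()
```

With the stand-in model this converts cleanly; with my two-branch model, invoking the resulting interpreter raises the error above.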
Thank you very much!!