When converting floating-point (FP) TF models to TFLite models, how can I specify the precision of each operation? I understand how to specify the integer precision of weights and activations, but how can I find out (and set) the precision in which each operation is computed? For example, if I have an element-wise addition, how do I know whether the addition is computed in 8, 16, or 32 bits?
As a practical example, say I need to compute z = x + y for a residual connection, where x and y are both 8-bit tensors from previous conv layers. How can I compute x + y in 16 bits? It seems to me that TFLite does not offer this flexibility. Is that correct?
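
To make the question concrete, here is a minimal pure-Python sketch of the difference I have in mind. It is not TFLite code and says nothing about how TFLite actually implements the add: the helper names `saturate` and `add_with_width` are made up for illustration, and it assumes x and y share the same scale and zero point so that their raw integer values can be added directly.

```python
def saturate(value, bits):
    """Clamp an integer to the signed two's-complement range of `bits` bits."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def add_with_width(x, y, bits):
    """Element-wise add of two integer lists, saturating each sum to `bits` bits.

    Assumption: both inputs use the same scale and zero point, so no
    rescaling is needed before the addition.
    """
    return [saturate(a + b, bits) for a, b in zip(x, y)]


# Hypothetical raw int8 values coming out of two conv layers.
x = [100, -120, 27]
y = [90, -50, 3]

z8 = add_with_width(x, y, 8)    # sum kept in 8 bits: large values clip
z16 = add_with_width(x, y, 16)  # sum kept in 16 bits: no clipping here

print("8-bit add: ", z8)
print("16-bit add:", z16)
```

With these inputs the first two sums fall outside the int8 range, so the 8-bit result clips them while the 16-bit result keeps them exactly. This is the kind of per-operation control I would like to have over the residual addition.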