The general approach in TFLite Micro (quantized) for layers such as Convolution and Multiplication is that the operation is performed first, and only the resulting tensor is then scaled/saturated (SaturatingRoundingDoublingHighMul followed by RoundingDivideByPOT).
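For concreteness, here is roughly what that single requantization step looks like; this is a C sketch modelled on the gemmlowp-style helpers used by TFLite's reference kernels, not the exact upstream source:

```c
#include <stdint.h>

/* Rounding, doubling, high multiply: returns (a * b * 2) >> 31 with
 * round-to-nearest, saturating the one overflow case a == b == INT32_MIN. */
static int32_t SaturatingRoundingDoublingHighMul(int32_t a, int32_t b)
{
    const int overflow = (a == b) && (a == INT32_MIN);
    const int64_t ab_64 = (int64_t)a * (int64_t)b;
    const int32_t nudge = ab_64 >= 0 ? (1 << 30) : (1 - (1 << 30));
    const int32_t ab_x2_high32 = (int32_t)((ab_64 + nudge) / (1ll << 31));
    return overflow ? INT32_MAX : ab_x2_high32;
}

/* Rounding arithmetic right shift: divides by 2^exponent, rounding to nearest. */
static int32_t RoundingDivideByPOT(int32_t x, int exponent)
{
    const int32_t mask = (int32_t)(((int64_t)1 << exponent) - 1);
    const int32_t remainder = x & mask;
    const int32_t threshold = (mask >> 1) + (x < 0 ? 1 : 0);
    return (x >> exponent) + (remainder > threshold ? 1 : 0);
}

/* The single requantization step applied to a raw int32 accumulator:
 * a positive shift is applied as a left shift before the multiply,
 * a negative shift as a rounding right shift after it. */
static int32_t MultiplyByQuantizedMultiplier(int32_t acc, int32_t multiplier, int shift)
{
    const int left_shift = shift > 0 ? shift : 0;
    const int right_shift = shift > 0 ? 0 : -shift;
    return RoundingDivideByPOT(
        SaturatingRoundingDoublingHighMul(acc * (1 << left_shift), multiplier),
        right_shift);
}
```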
However, the Add operation seems to take a different approach: the input tensors are first scaled/saturated individually, and the resulting tensor after the addition is then scaled/saturated again.
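In code, the Add path looks roughly like the sketch below. It reuses MultiplyByQuantizedMultiplier from the sketch above; the parameter names and the left-shift headroom constant are my paraphrase of the int8 reference kernel, not verbatim source:

```c
/* Illustrative sketch of the quantized Add path: each input is first brought
 * onto a shared fixed-point scale, the add happens on that shared scale, and
 * the sum is then requantized to the output scale. */
static int8_t QuantizedAdd(int8_t x, int8_t y,
                           int32_t x_offset, int32_t x_multiplier, int x_shift,
                           int32_t y_offset, int32_t y_multiplier, int y_shift,
                           int32_t out_offset, int32_t out_multiplier, int out_shift)
{
    const int kLeftShift = 20; /* extra headroom before the per-input rescale */

    /* Step 1: rescale each input onto a common scale (the "preprocessing"). */
    const int32_t x_shifted = ((int32_t)x + x_offset) * (1 << kLeftShift);
    const int32_t y_shifted = ((int32_t)y + y_offset) * (1 << kLeftShift);
    const int32_t x_scaled = MultiplyByQuantizedMultiplier(x_shifted, x_multiplier, x_shift);
    const int32_t y_scaled = MultiplyByQuantizedMultiplier(y_shifted, y_multiplier, y_shift);

    /* Step 2: the actual add, now that both operands share one scale. */
    const int32_t raw_sum = x_scaled + y_scaled;

    /* Step 3: requantize the sum to the output scale, then clamp to int8. */
    int32_t out = MultiplyByQuantizedMultiplier(raw_sum, out_multiplier, out_shift) + out_offset;
    if (out > 127) out = 127;
    if (out < -128) out = -128;
    return (int8_t)out;
}
```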
Is there a mathematical reason why the Add layer needs this input preprocessing?
The same computational approach is also followed in CMSIS-NN for the Add layer (the arm_nn_requantize function from arm_nnsupportfunctions.h).
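If I read arm_nnsupportfunctions.h correctly, arm_nn_requantize is essentially the same doubling-high-multiply plus rounding power-of-two divide. A simplified rendering in terms of the helpers above (the shipped version uses its own arm_nn_doubling_high_mult / arm_nn_divide_by_power_of_two helpers and Arm intrinsics where available):

```c
/* Simplified rendering of CMSIS-NN's arm_nn_requantize, expressed with the
 * helpers from the first sketch; not the shipped implementation. */
static int32_t arm_nn_requantize_sketch(int32_t val, int32_t multiplier, int32_t shift)
{
    const int left_shift = shift > 0 ? shift : 0;
    const int right_shift = shift > 0 ? 0 : -shift;
    return RoundingDivideByPOT(
        SaturatingRoundingDoublingHighMul(val * (1 << left_shift), multiplier),
        right_shift);
}
```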
I am curious to know the mathematical implications behind this design.