I am currently investigating memory management in TensorFlow Lite.
Many mobile SoCs have a unified memory architecture, in which the CPU, GPU, NPU, etc. share the same main memory.
However, it seems that TFLite does not take advantage of this shared memory architecture.
I think this is crucial because some operators fall back to the CPU when they are not supported by the target device, and if the same tensors are then buffered separately for each device, it can lead to excessive memory use.
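For context, here is a minimal sketch of the kind of setup I am looking at (assuming the standard C++ API with the GPU delegate v2; the model path is just a placeholder, and the comments reflect my current understanding rather than confirmed behavior):

```cpp
#include <memory>

#include "tensorflow/lite/delegates/gpu/delegate.h"
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int main() {
  auto model = tflite::FlatBufferModel::BuildFromFile("model.tflite");
  tflite::ops::builtin::BuiltinOpResolver resolver;
  std::unique_ptr<tflite::Interpreter> interpreter;
  tflite::InterpreterBuilder(*model, resolver)(&interpreter);

  TfLiteGpuDelegateOptionsV2 options = TfLiteGpuDelegateOptionsV2Default();
  TfLiteDelegate* delegate = TfLiteGpuDelegateV2Create(&options);

  // Supported subgraphs are replaced by delegate kernels; any ops the
  // delegate rejects stay on the CPU, creating CPU<->GPU partition
  // boundaries in the graph.
  interpreter->ModifyGraphWithDelegate(delegate);

  // As far as I can tell, AllocateTensors() plans the CPU tensor arena
  // for the non-delegated partitions, while the GPU delegate allocates
  // its own device buffers -- i.e. two memory spaces, even on a
  // unified-memory SoC?
  interpreter->AllocateTensors();

  // ... Invoke(), inspect interpreter->arena_used_bytes(), etc.

  TfLiteGpuDelegateV2Delete(delegate);
  return 0;
}
```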
Does TFLite allocate a separate memory space for each device? Or is there a management policy that exploits unified memory, and if so, where can I find it in the codebase?