Hello guys,
I’d like to ask whether there is any documentation, sample code, or recent benchmarking data for XLA/tfcompile inference with float and quantized models.
I’d like to compare TFLite against XLA/AOT.
Thank you
/cc: @battery
Could you shed some light on this?
There is XLA, and there is tfcompile(). I’m looking for a way to use them on ARM-based devices: is AOT (ahead-of-time) compilation possible for that target?
From what the documentation indicates, tfcompile() can be used as a standalone tool that converts a TensorFlow graph into executable code, but apparently only for x86-64 CPUs.
I’d like to try XLA/tfcompile/AOT on an ARM-based device to improve performance. As the documentation indicates, AOT compilation can reduce binary size and avoid runtime overhead, and that is what I’m hoping to achieve.
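For context, here is roughly what I understand the tfcompile workflow to look like on x86-64, so we’re talking about the same thing. This is a minimal sketch based on the XLA AOT tutorial: it assumes a tf_library Bazel rule (from //tensorflow/compiler/aot:tfcompile.bzl) with a config.pbtxt declaring one feed and one fetch, and with cpp_class = "mynamespace::MyGraph". The header name, class name, and the element count of arg0 are placeholders for illustration, not real names.

```cpp
#define EIGEN_USE_THREADS
#define EIGEN_USE_CUSTOM_THREAD_POOL

#include <iostream>

#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
#include "my_graph.h"  // hypothetical header emitted by the tf_library rule

int main() {
  // Optional: hand the compiled computation an Eigen thread pool.
  Eigen::ThreadPool tp(2);
  Eigen::ThreadPoolDevice device(&tp, tp.NumThreads());

  mynamespace::MyGraph computation;  // cpp_class from the tf_library rule
  computation.set_thread_pool(&device);

  // Write the first feed; arg0_data() points at the argument buffer.
  float* in = computation.arg0_data();
  for (int i = 0; i < 4; ++i) in[i] = 1.0f;  // assuming arg0 holds 4 floats

  // Execute the AOT-compiled graph; no full TensorFlow runtime is linked in.
  if (!computation.Run()) {
    std::cerr << "Run() failed\n";
    return 1;
  }

  // Read the first fetch; result0_data() points at the result buffer.
  std::cout << "result[0] = " << computation.result0_data()[0] << std::endl;
  return 0;
}
```

Since tfcompile lowers through LLVM, I would have expected cross-compiling this for ARM to be a matter of pointing it at an ARM target (I believe tfcompile accepts a --target_triple flag for that, but please correct me if I’m wrong). Has anyone gotten this working?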