Asynchronous Inference Execution with TFLite Model

Hi, I’m currently using the TFLite benchmark tool to measure the performance of my model. I’m interested in executing some of the graph’s operations asynchronously. I noticed that TensorFlow Lite seems to support asynchronous operations under tensorflow/tensorflow/lite/core/async/, but I’m unsure how to use this in my benchmarking process.

Could anyone provide guidance or examples on how to implement asynchronous inference execution using TFLite? Any help or pointers would be greatly appreciated. Thank you!

Hi @rita19991020 ,

TFLite asynchronous execution is currently in an experimental phase. Once it is fully functional, we will get back to you on this thread.

However, for the time being, if you would like to proceed with asynchronous execution for your specific use case, you can modify the benchmark tool’s code to introduce delays between certain operations in the graph.

Thank You

Hi, thanks for the reply!
I am exploring methods other than inserting delays and am interested in controlling the execution of certain operations within a model. It seems that using Signatures allows for such control.
Can I generate subgraphs with Signature, create a signature runner to manage the execution of each operation, and then use std::async to execute them asynchronously?

Hi @rita19991020,

Sorry for the delayed response. You can create subgraphs with TFLite SignatureDefs, and the execution of specific parts of the model can be controlled through the signature runner.

Here is a sample model with defined signatures; adapt it to your own model.

import tensorflow as tf

class Model(tf.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.dense1 = tf.keras.layers.Dense(2)
        self.dense2 = tf.keras.layers.Dense(1)

    @tf.function(input_signature=[tf.TensorSpec(shape=[None, 2], dtype=tf.float32)])
    def subgraph1(self, x):
        return self.dense1(x)

    @tf.function(input_signature=[tf.TensorSpec(shape=[None, 2], dtype=tf.float32)])
    def subgraph2(self, x):
        return self.dense2(x)

model = Model()
tf.saved_model.save(model, "path to saved model",
                    signatures={'subgraph1': model.subgraph1,
                                'subgraph2': model.subgraph2})
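For reference, here is a self-contained sketch of the conversion-and-invocation flow in Python (assuming the TensorFlow 2.x API; `tf.lite.Interpreter.get_signature_runner` is the Python counterpart of the C++ SignatureRunner). The model here is a minimal one-signature stand-in, not your actual model:

```python
import tempfile
import tensorflow as tf

class Model(tf.Module):
    def __init__(self):
        super().__init__()
        self.dense1 = tf.keras.layers.Dense(2)

    @tf.function(input_signature=[tf.TensorSpec(shape=[None, 2], dtype=tf.float32)])
    def subgraph1(self, x):
        return self.dense1(x)

saved_dir = tempfile.mkdtemp()
model = Model()
tf.saved_model.save(model, saved_dir, signatures={'subgraph1': model.subgraph1})

# Convert the SavedModel; the named signatures are preserved in the .tflite flatbuffer.
tflite_model = tf.lite.TFLiteConverter.from_saved_model(saved_dir).convert()

# Each signature is exposed as an independently invokable runner.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
runner = interpreter.get_signature_runner('subgraph1')
out = runner(x=tf.constant([[1.0, 2.0]]))   # dict of output name -> numpy array
```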

Then convert the model to TFLite, and load and execute the subgraphs using SignatureRunner in C++. Please follow the Signatures in TensorFlow Lite documentation for complete guidance.
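The dispatch pattern you described with std::async can be sketched in Python with the standard-library concurrent.futures module (in C++ you would launch each SignatureRunner invocation with std::async and wait on the returned futures). Note that run_subgraph1 and run_subgraph2 below are hypothetical stand-ins for the real signature-runner calls, e.g. interpreter.get_signature_runner("subgraph1"):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the real signature-runner invocations.
def run_subgraph1(x):
    return [2.0 * v for v in x]   # placeholder computation

def run_subgraph2(x):
    return [sum(x)]               # placeholder computation

def run_async(x):
    # Submit both subgraphs; each executes on its own worker thread,
    # analogous to launching each runner with std::async in C++.
    with ThreadPoolExecutor(max_workers=2) as pool:
        f1 = pool.submit(run_subgraph1, x)
        f2 = pool.submit(run_subgraph2, x)
        # Block until both finish, like calling get() on each std::future.
        return f1.result(), f2.result()

r1, r2 = run_async([1.0, 2.0, 3.0])
```

One caveat: a single TFLite interpreter is not thread-safe, so concurrent invocations should use separate interpreter instances (or otherwise serialize access).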

Thank You