I have a model quantized with float32. After converting it to a TFLite model, it predicts perfectly on a single image, but when I use it inside a while loop it throws an error. I tried to follow TensorFlow's instructions here but didn't understand their approach.
CODE:
```python
import cv2
import numpy as np
import tensorflow as tf
from flask import Flask

app = Flask(__name__)

def generate_frames(frame):
    while True:
        image = cv2.resize(frame, (256, 256))
        # converting into float32 in [0, 1]
        image = tf.image.convert_image_dtype((image / 255.0), dtype=tf.float32).numpy()
        image = run_inference(np.expand_dims(image[:, :, :3], axis=0))
        final_result = (image * 255).astype(np.uint8)
        ret, buffer = cv2.imencode('.jpg', final_result)
        frame = buffer.tobytes()
        return frame  # NOTE: this returns on the first pass, so the loop runs once

# load model
def load_trained_model():
    global interpreter, input_details, output_details
    interpreter = tf.lite.Interpreter(model_path="quant_model.tflite")
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

def run_inference(image):
    # perform inference and parse the outputs
    interpreter.set_tensor(input_details[0]['index'], image)
    interpreter.invoke()
    outputs = interpreter.get_tensor(output_details[0]['index'])[0]
    return outputs

if __name__ == '__main__':
    load_trained_model()
    app.run(debug=True)
```
ERROR:
```
RuntimeError: There is at least 1 reference to internal data in the
interpreter in the form of a NumPy array or slice. Be sure to only
hold the function returned from tensor() if you are using raw data
access.
```
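For context, this guard trips whenever a live NumPy view of an interpreter-owned buffer still exists at the moment a call that may invalidate those buffers (such as `allocate_tensors()` or `invoke()`) is made. A minimal sketch of the pattern that raises it (not taken from the code above):

```python
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="quant_model.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()

# WRONG: the trailing () turns the accessor returned by tensor() into a
# NumPy view of the interpreter's internal buffer; while that view is
# alive, any call that may move the buffers refuses to run.
in0 = interpreter.tensor(input_details[0]['index'])()

interpreter.allocate_tensors()  # raises the RuntimeError shown above
```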
@Bhack Thanks for the source. As far as I understood, we need to delete the internal buffer reference after each iteration. From interpreter_test.py, it looks like we need to perform a "del in0" operation, but I am confused about how to do it. Can you give me a hint?
```python
interpreter = tf.lite.Interpreter(model_path="quant_model.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

def run_inference(image):
    # perform inference and parse the outputs
    interpreter.set_tensor(input_details[0]['index'], image)
    interpreter.invoke()
    outputs = interpreter.get_tensor(output_details[0]['index'])[0]
    # I think I need to perform the buffer delete operation here (but how?)
    return outputs
```
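As a hint, here is a minimal sketch of how the `del` from interpreter_test.py is meant to be used, assuming you switch run_inference to the raw tensor() access (with set_tensor/get_tensor, the copy APIs, there is no view to delete):

```python
def run_inference(image):
    # raw NumPy view of the input buffer (note the trailing ())
    in0 = interpreter.tensor(input_details[0]['index'])()
    in0[:] = image        # fill the input in place
    del in0               # release the view before the next guarded call
    interpreter.invoke()  # the safety check passes now
    return interpreter.get_tensor(output_details[0]['index'])[0]
```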
If you look, many of these operations have a safety guard; you can find the description of the check here:
I don't think the problem is in set_tensor and get_tensor, as they are the slow (copy) APIs rather than tensor().
Have you checked whether holding input_details and output_details ends up being similar to the WRONG pattern explained at:
This could also explain why it probably worked when you tried the whole gist in a single function: those references were confined to the function scope.
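For reference, a sketch of the safe raw-access pattern the error message points at: hold only the function returned by tensor(), and create the view fresh inside each iteration so it never outlives a single statement. `frames` here is a hypothetical iterable of preprocessed float32 batches like the one built in the code above:

```python
input_fn = interpreter.tensor(input_details[0]['index'])   # function, no trailing ()
output_fn = interpreter.tensor(output_details[0]['index'])

for image in frames:
    input_fn()[:] = image        # view created and dropped within the statement
    interpreter.invoke()         # no lingering reference, so the guard passes
    result = output_fn().copy()  # copy the output out before the next iteration
```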
P.S. If it is still slow because you have to load and recreate the interpreter on each request (its lifecycle ends with the request), you could try running a TF Serving instance and consuming it with Flask:
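A sketch of that setup, assuming the model is exported in a form TF Serving can load and is served under the (hypothetical) name quant_model on the default REST port 8501; the Flask route only forwards the preprocessed batch:

```python
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
SERVING_URL = "http://localhost:8501/v1/models/quant_model:predict"  # assumption

@app.route('/predict', methods=['POST'])
def predict():
    # nested lists with shape (1, 256, 256, 3), matching the model input
    batch = request.get_json()['instances']
    resp = requests.post(SERVING_URL, json={"instances": batch})
    return jsonify(resp.json()["predictions"])
```

This keeps the interpreter's lifecycle out of the request handler entirely: TF Serving owns the model, and Flask only does I/O.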