Hi all. I am trying to run inference on a single image using a QKeras-quantized YOLOv4 model. The original code can be found here. However, I have changed the model-loading line to load my quantized model instead of the original one:
print("Load model...")
saved_model_loaded = qkeras_utils.load_qmodel('yolo_quantized.h5')
print("Model loaded!")
batch_data = tf.constant(images_data)
infer = saved_model_loaded.signatures['serving_default']
pred_bbox = infer(batch_data) #SHOWS AN ERROR
for key, value in pred_bbox.items():
boxes = value[:, :, 0:4]
pred_conf = value[:, :, 4:]
Currently, I am getting this error:
Load model...
Model loaded!
Traceback (most recent call last):
File "/mnt/beegfs/gap/laumecha/conda-qkeras/tensorflow-yolov4-tflite/detect.py", line 117, in <module>
app.run(main)
File "/mnt/beegfs/gap/laumecha/miniconda3/envs/qkeras_env/lib/python3.9/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/mnt/beegfs/gap/laumecha/miniconda3/envs/qkeras_env/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/mnt/beegfs/gap/laumecha/conda-qkeras/tensorflow-yolov4-tflite/detect.py", line 85, in main
infer = saved_model_loaded.signatures['serving_default']
AttributeError: 'Functional' object has no attribute 'signatures'
I have been trying to learn about TF serving, and I understand that I need to convert my model to a TF SavedModel to use the serving signatures. However, I assume that converting my QKeras model to a TensorFlow SavedModel will raise a QConv layer exception or something similar.
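For reference, this is roughly what I understand that conversion step would look like (untested on my side; the export path is a placeholder, and I expect it to fail on the QConv2D layers):

import tensorflow as tf

# Export the loaded QKeras model as a TF SavedModel; loading it back with
# tf.saved_model.load is what produces an object that has .signatures.
tf.saved_model.save(saved_model_loaded, './yolo_quantized_savedmodel')

reloaded = tf.saved_model.load('./yolo_quantized_savedmodel')
infer = reloaded.signatures['serving_default']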
Is there a way to translate this TF serving inference call into one that does not need TF serving? Also, any explanation of how the original code runs inference through this serving signature is welcome.
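Concretely, I was hoping something along these lines would be equivalent, since load_qmodel seems to return a plain Keras Functional model (though I am not sure whether a direct call gives back the same dict of named outputs that the serving signature does):

# Hypothetical replacement: call the loaded Keras model directly,
# without going through a SavedModel signature.
pred_bbox = saved_model_loaded(batch_data, training=False)
boxes = pred_bbox[:, :, 0:4]
pred_conf = pred_bbox[:, :, 4:]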