TensorFlow GPU prediction concurrency in Python

manu1 · June 12, 2023, 10:48am

I’m running a localhost API in Python. The /predict endpoint takes one base64-encoded image, which is then preprocessed and passed to a previously loaded SavedModel. I’m predicting on my GPU using CUDA. The problem is that the model only predicts one image at a time (due to the GIL, I think), but I’d like to predict multiple images concurrently when several requests arrive at the same time. At the moment my GPU utilization is only 1-2%. Is there any way to accomplish this in Python, similar to the TensorFlow Serving API, which sadly doesn’t work properly with GPU on Windows?

This is my code:

import base64
import io
import time

import numpy as np
import tensorflow as tf
import uvicorn
from fastapi import FastAPI
from PIL import Image
from pydantic import BaseModel

model_path = './saved_model'
model = tf.saved_model.load(model_path)

class PredictionRequest(BaseModel):
    image: str

app = FastAPI()

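@app.post("/predict")
async def predict_objects(request: PredictionRequest):
    # Decode the base64 payload into an image array.
    image_data = base64.b64decode(request.image)
    image = Image.open(io.BytesIO(image_data))
    image_np = np.array(image)

    # Add a leading batch dimension: (H, W, C) -> (1, H, W, C).
    input_tensor = tf.convert_to_tensor(image_np)
    input_tensor = input_tensor[tf.newaxis, ...]

    # Time the forward pass. This call is synchronous, so it blocks the
    # event loop of this async endpoint until it returns.
    ts = time.perf_counter()
    detections = model(input_tensor)
    ts2 = time.perf_counter()
    print(int((ts2 - ts) * 1000))  # inference time in milliseconds

    detections = detections['detection_boxes'][0].numpy()
    data = []

    # processing return data

    return data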

if __name__ == "__main__":
    uvicorn.run(app, host="127.0.0.1", port=8000)
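
One possible direction, offered as a sketch rather than a confirmed fix: because model(input_tensor) is a blocking call inside an async def endpoint, the event loop cannot accept the next request until the call returns, whatever role the GIL plays. A common pattern for raising GPU utilization is to queue incoming requests, group the ones that arrive within a short window, and run them through the model as a single batch on a worker thread. The example below shows only that pattern using the standard library. MicroBatcher, fake_model_batch, max_batch and max_wait_s are hypothetical names, and fake_model_batch is a stand-in for the real model call. It also assumes the images can be brought to a common shape and that the SavedModel accepts a batch dimension larger than 1, which the post does not state.

```python
import asyncio
import time

batch_sizes = []  # records how many requests each simulated forward pass handled


def fake_model_batch(inputs):
    """Hypothetical stand-in for one batched forward pass of the real model."""
    batch_sizes.append(len(inputs))
    time.sleep(0.05)  # pretend the pass takes 50 ms regardless of batch size
    return [x * 2 for x in inputs]


class MicroBatcher:
    """Groups concurrent requests and runs them through batch_fn together."""

    def __init__(self, batch_fn, max_batch=8, max_wait_s=0.01):
        self.batch_fn = batch_fn
        self.max_batch = max_batch
        self.max_wait_s = max_wait_s
        self.queue = asyncio.Queue()

    async def submit(self, item):
        # Called once per request: enqueue the input and wait for its result.
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((item, fut))
        return await fut

    async def run(self):
        loop = asyncio.get_running_loop()
        while True:
            # Wait for the first request, then collect more until the batch
            # is full or the waiting window has elapsed.
            pending = [await self.queue.get()]
            deadline = loop.time() + self.max_wait_s
            while len(pending) < self.max_batch:
                timeout = deadline - loop.time()
                if timeout <= 0:
                    break
                try:
                    pending.append(await asyncio.wait_for(self.queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            inputs = [item for item, _ in pending]
            try:
                # Run the blocking call on a worker thread so the event loop
                # stays free to accept new requests in the meantime.
                results = await loop.run_in_executor(None, self.batch_fn, inputs)
            except Exception as exc:
                for _, fut in pending:
                    if not fut.done():
                        fut.set_exception(exc)
                continue
            for (_, fut), result in zip(pending, results):
                if not fut.done():
                    fut.set_result(result)


async def demo(n_requests=8):
    # The batcher is created inside the running loop on purpose.
    batcher = MicroBatcher(fake_model_batch)
    worker = asyncio.create_task(batcher.run())
    try:
        # Simulate n_requests arriving at the same time.
        return await asyncio.gather(*(batcher.submit(i) for i in range(n_requests)))
    finally:
        worker.cancel()


if __name__ == "__main__":
    print(asyncio.run(demo()))
    print("batch sizes:", batch_sizes)
```

In the API above, the worker task would be started once at application startup, and the endpoint would await batcher.submit(...) with the preprocessed image instead of calling the model directly. The batch function would then stack the queued tensors and call the model once per batch. Whether this actually improves throughput depends on the model and on how well it handles batched input, so it needs measuring.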
