In a webcam stream, model has a lag problem

Abhiraam_Eranti · March 16, 2022, 2:25am

I am working on an app to teach kids animal names.

The app uses a camera stream to look at whatever the child is pointing the phone at, then announces the animal by saying its name followed by the sound it makes. For example, if a kid points the phone at a dog, the app says “dog, woof”.

The model that I am using has a latency of around 2 seconds per prediction. I can’t show the code, but here is how it (kind of) works:

process_image: resizing, normalization
prediction
speak prediction
If the prediction came back quickly, wait for the rest of a 3-second window (3 s minus the prediction latency).
Get the most recent frame available at that point, not simply the next frame in the sequence.
Repeat.
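
As a rough sketch of the pacing step in the list above (not the real app code, which isn't shown): the wait is whatever remains of the 3-second window after the prediction latency, clamped at zero. The function name `remaining_wait` and its parameter names are made up for illustration.

```python
def remaining_wait(latency_s: float, window_s: float = 3.0) -> float:
    """Seconds to sleep so that one predict-and-speak cycle lasts at least window_s."""
    # If the prediction took longer than the window, do not wait at all.
    return max(0.0, window_s - latency_s)


print(remaining_wait(0.5))  # fast prediction: 2.5 s left to wait
print(remaining_wait(5.0))  # slow cycle: 0.0, start the next one immediately
```

Note that with this rule a cycle never ends early, so the frame picked up afterwards is always at least one window newer than the previous one, provided the frame source really hands over the newest frame.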

One problem I found: I point the camera at a dog and it says the prediction correctly, then I move the camera. The app says “dog” again before it says the correct prediction.

Why does this happen, and how can I fix it?

Thanks!

lgusm · March 16, 2022, 11:18am

Hi,

But when it says dog again, did the model predict dog on a random image, or is it a delay in your pipeline?

Are you using TFLite?
It would help to give more information on what you’re using.
For image classification, 2 seconds per prediction is usually too much. Which model are you using? If it is custom, how did you customize it?
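
One way to tell those two cases apart is to timestamp each frame when the camera delivers it and log the frame's age at the moment its label is spoken. This is only a minimal sketch: `Frame`, `frame_age_s` and `is_stale` are hypothetical names, and the 1-second threshold is an arbitrary assumption.

```python
import time
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Frame:
    pixels: Any         # whatever the camera callback hands over
    captured_at: float  # time.monotonic() reading taken in the camera callback


def frame_age_s(frame: Frame, now: Optional[float] = None) -> float:
    """Seconds between frame delivery and `now` (defaults to the current time)."""
    if now is None:
        now = time.monotonic()
    return now - frame.captured_at


def is_stale(frame: Frame, max_age_s: float = 1.0, now: Optional[float] = None) -> bool:
    """True if the frame is older than max_age_s and may predate a camera move."""
    return frame_age_s(frame, now) > max_age_s


# Example with fixed timestamps so the numbers are reproducible.
frame = Frame(pixels=None, captured_at=10.0)
print(frame_age_s(frame, now=12.5))  # 2.5
print(is_stale(frame, now=12.5))     # True
```

If the second “dog” comes from a frame with a large age, the frame was captured before the camera moved and the delay is in the pipeline. If the age is small, the model really did predict dog on the new view.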

Abhiraam_Eranti · March 16, 2022, 4:53pm

Wait, sorry. 2 seconds isn’t the prediction time. The prediction itself takes 200-500 ms, but extra time is needed to say the prediction.

I’m using a TFLite model based on the YOLOv5s architecture. I am detecting around 20 animals, and I found that YOLO worked a lot better than SSD MobileNet or simple image classification with EfficientNet.

Not sure what you mean by random image, but no, it’s the next available frame.

Here’s a more detailed pipeline (pseudocode):

predLoaded = true
camera.onNextFrame = predict

async predict(image) {
   delay = 3000ms
   // skip this frame while an earlier prediction is still in progress
   if not predLoaded then return
   predLoaded = false

   image = await preprocess(image)
   prediction = await model.predict(image)
   prediction = await nms(prediction)
   speak(prediction)

   // speaking time is not included here
   combinedTime = time(preprocess) + time(prediction) + time(nms)

   if combinedTime < delay {
      await wait(delay - combinedTime)
   }

   predLoaded = true
}
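
If the repeated label turns out to come from an older frame (an assumption, nothing in this thread confirms it), one common pattern is to let the camera callback only store the newest frame in a one-slot buffer and have the prediction loop take from that slot right before inference. A minimal sketch in Python rather than the app's language; `LatestFrameSlot` and its methods are hypothetical names, not part of any camera or TFLite API.

```python
import threading
from typing import Any, Optional


class LatestFrameSlot:
    """Holds at most one frame; a newer frame overwrites an unread older one."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._frame: Optional[Any] = None
        self.dropped = 0  # frames overwritten before anyone read them

    def put(self, frame: Any) -> None:
        """Called from the camera callback for every frame."""
        with self._lock:
            if self._frame is not None:
                self.dropped += 1
            self._frame = frame

    def take(self) -> Optional[Any]:
        """Called by the prediction loop; returns the newest frame or None."""
        with self._lock:
            frame, self._frame = self._frame, None
            return frame


slot = LatestFrameSlot()
slot.put("frame with dog")
slot.put("frame after moving the camera")
print(slot.take())   # frame after moving the camera
print(slot.dropped)  # 1
```

The point of the design is that a slow predict-and-speak cycle can never build up a backlog: however many frames arrive in the meantime, the loop only ever sees the last one.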

Abhiraam_Eranti · March 16, 2022, 4:54pm

Sorry if it is a little confusing

Abhiraam_Eranti · March 16, 2022, 4:55pm

Actually, I just checked: the combined prediction time, including saying the animal name, is around 5 seconds, but on extremely fast devices it can get below the 3 s threshold.
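
To make those numbers concrete, here is a small simulation of one possible cause. It assumes, without confirmation from the app, that the speak call returns immediately and the text-to-speech engine queues utterances. With a new prediction every 3 s and roughly 4.5 s of speech per label (about 5 s total minus about 0.5 s of prediction), the queue falls further behind on every cycle. `speech_start_lags` is a hypothetical helper, not part of any TTS library.

```python
from typing import List


def speech_start_lags(cycle_s: float, speech_s: float, n_cycles: int) -> List[float]:
    """Delay between requesting each utterance and the moment it starts playing."""
    lags = []
    tts_free_at = 0.0  # time at which the speech engine finishes its current utterance
    for i in range(n_cycles):
        requested_at = i * cycle_s                   # one prediction per cycle
        starts_at = max(requested_at, tts_free_at)   # wait for the engine if it is busy
        lags.append(starts_at - requested_at)
        tts_free_at = starts_at + speech_s
    return lags


print(speech_start_lags(cycle_s=3.0, speech_s=4.5, n_cycles=4))  # [0.0, 1.5, 3.0, 4.5]
print(speech_start_lags(cycle_s=3.0, speech_s=2.0, n_cycles=3))  # [0.0, 0.0, 0.0]
```

If the real app behaves like the first case, a label can be spoken several seconds after the frame it describes. Waiting for speech to finish before starting the next cycle, or including speech time in the combined time, would keep the lag at zero in this model.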
