Missed detection between consecutive frames with Fast RCNN object detection model

Hi,

I trained a Fast RCNN model to detect water puddles, and the model predicts well overall. However, there is an issue when it runs on a video stream at 30 fps. As shown in the attached images:

frame #476 - detected a puddle at 100% confidence level
frame #477 - did not detect anything
frame #478 - detected the same puddle again at 100% confidence level

I would like to know if anyone has had a similar experience with the Fast RCNN model, and if so, what you did to fix it.

FYI, I also trained two other models, MobileNet v2 SSD and ResNet. These two models gave gradual prediction results (the confidence level fluctuates) as the camera is panned over the subject. Fast RCNN behaves erratically: for the most part, the confidence level of the detected object is either > 98% or close to zero. Please share if there is a way to fix this!
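
To make the pattern concrete, here is a minimal sketch of how such isolated dropouts could be flagged from a list of per-frame confidences. The function name `find_single_frame_dropouts`, the `low` threshold of 0.05, and the example scores are assumptions for illustration only; the `high` threshold of 0.98 mirrors the "> 98%" behavior described above. It only locates the problem frames and is not a fix for the model.

```python
def find_single_frame_dropouts(confidences, high=0.98, low=0.05):
    """Return indices of frames whose confidence is near zero
    while both neighbouring frames are confidently detected."""
    dropouts = []
    for i in range(1, len(confidences) - 1):
        prev_c = confidences[i - 1]
        cur_c = confidences[i]
        next_c = confidences[i + 1]
        # A dropout: high confidence before and after, almost none in between.
        if prev_c >= high and next_c >= high and cur_c <= low:
            dropouts.append(i)
    return dropouts


# Hypothetical confidences for frames 476, 477, 478 (0.0 = nothing detected).
first_frame = 476
scores = [1.0, 0.0, 1.0]
flagged = [first_frame + i for i in find_single_frame_dropouts(scores)]
print(flagged)  # [477]
```

Running this over a whole clip would show how often the detector drops a single frame between two confident detections, which may help when comparing it against the other two models.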

frame 476: https://drive.google.com/file/d/1yFWXxfCMspJWSqd2sPAKIVfGWgvnwS02/view?usp=sharing

frame 477: https://drive.google.com/file/d/11aMaK_qZc5Z0FKw_DdvnDMBlHjywmB6O/view?usp=sharing

frame 478: https://drive.google.com/file/d/1Ke75RQgG_ISpFDbkK60epeBiy63jwQ1-/view?usp=sharing