Decoding tflite custom object detector output from a model trained with MediaPipe (MobileNetV2)

Hi all, I trained a custom object detector with MediaPipe. Exporting it to tflite worked fine, but when I try to make predictions I get two output tensors, one presumably for bboxes and one for scores. For one image, the outputs are obtained as follows:

output_details = interpreter.get_output_details()

which gives:

[{'name': 'StatefulPartitionedCall:0',
  'index': 425,
  'shape': array([    1, 12276,     4], dtype=int32),
  'shape_signature': array([    1, 12276,     4], dtype=int32),
  'dtype': numpy.float32,
  'quantization': (0.0, 0),
  'quantization_parameters': {'scales': array([], dtype=float32),
   'zero_points': array([], dtype=int32),
   'quantized_dimension': 0},
  'sparsity_parameters': {}},
 {'name': 'StatefulPartitionedCall:1',
  'index': 423,
  'shape': array([    1, 12276,     4], dtype=int32),
  'shape_signature': array([    1, 12276,     4], dtype=int32),
  'dtype': numpy.float32,
  'quantization': (0.0, 0),
  'quantization_parameters': {'scales': array([], dtype=float32),
   'zero_points': array([], dtype=int32),
   'quantized_dimension': 0},
  'sparsity_parameters': {}}]

code for prediction:

interpreter.set_tensor(input_details[0]['index'], input_data)  # input_data preprocessed to the model's input shape/dtype
interpreter.invoke()
boxes  = interpreter.get_tensor(output_details[0]['index'])    # (1, 12276, 4) raw box values
scores = interpreter.get_tensor(output_details[1]['index'])    # (1, 12276, 4) per-class scores

boxes:

[[[ 0.01087701 -0.27369365 -0.53198564 -0.8404835 ]
  [ 0.05485853  0.02915781 -1.390534   -1.670182  ]
  [-0.12034623  0.00819616 -0.9961058  -0.395994  ]
  ...
  [-0.3435838  -0.35941318 -1.0712042  -0.43489447]
  [-0.4016505  -0.03572614 -0.67902136 -0.7194235 ]
  [-0.47916242  0.01016152  0.13207799 -0.7979872 ]]]

scores:

[[[0.005811   0.00431303 0.00324296 0.01789892]
  [0.00658012 0.01305784 0.00548336 0.01855727]
  [0.01610166 0.00838473 0.01678689 0.01819396]
  ...
  [0.00505611 0.02350343 0.01970816 0.00919266]
  [0.00427777 0.01386124 0.00888682 0.01396356]
  [0.00742702 0.00696907 0.00702236 0.00696763]]]

My question is: how do you decode this output into a format that can be used for inference and visualization? I don't understand the structure of the box values (there are negative values), nor how the scores relate to the classes; in my case I'm trying to predict objects belonging to one of 3 classes (+ 1 background). I've seen examples where tflite object detectors also provide the labels and the number of detections in two additional output tensors, but here I only get 2. I trained the detector on MobileNetV2 following a MediaPipe example (as recommended by Google developers, since tflite-model-maker has been facing issues and is being migrated to MediaPipe).
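For what it's worth, below is a minimal sketch of what decoding SSD-style raw outputs usually looks like. It rests on assumptions you would need to verify against your training config, not on anything from the MediaPipe docs: that each of the 12276 rows corresponds to one anchor box, that the 4 box values are (ty, tx, th, tw) offsets encoded relative to that anchor (which would explain the negative values) with the common scale factors (10, 10, 5, 5), and that the 4 score columns are per-class probabilities with column 0 = background (your printed scores already look like values in [0, 1], so no sigmoid is applied here). The anchors array must be generated with the same strides/scales/aspect ratios used in training.

import numpy as np

def nms(boxes, scores, classes, iou_thresh):
    """Greedy non-max suppression over (ymin, xmin, ymax, xmax) boxes."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of the top box with all remaining boxes.
        yx1 = np.maximum(boxes[i, :2], boxes[order[1:], :2])
        yx2 = np.minimum(boxes[i, 2:], boxes[order[1:], 2:])
        inter = np.prod(np.clip(yx2 - yx1, 0, None), axis=1)
        area_i = np.prod(boxes[i, 2:] - boxes[i, :2])
        areas = np.prod(boxes[order[1:], 2:] - boxes[order[1:], :2], axis=1)
        iou = inter / (area_i + areas - inter + 1e-9)
        order = order[1:][iou <= iou_thresh]
    keep = np.array(keep, dtype=int)
    return boxes[keep], scores[keep], classes[keep]

def decode_detections(raw_boxes, raw_scores, anchors,
                      scales=(10.0, 10.0, 5.0, 5.0),
                      score_thresh=0.5, iou_thresh=0.5):
    """Sketch of standard SSD decoding. `anchors` is (N, 4) as
    (y_center, x_center, height, width) in normalized [0, 1] coords;
    the encoding and scale factors are assumptions to verify."""
    raw_boxes = raw_boxes[0]    # drop batch dim -> (N, 4)
    raw_scores = raw_scores[0]  # -> (N, num_classes)

    # Undo the SSD box encoding relative to each anchor.
    ycenter = raw_boxes[:, 0] / scales[0] * anchors[:, 2] + anchors[:, 0]
    xcenter = raw_boxes[:, 1] / scales[1] * anchors[:, 3] + anchors[:, 1]
    h = np.exp(raw_boxes[:, 2] / scales[2]) * anchors[:, 2]
    w = np.exp(raw_boxes[:, 3] / scales[3]) * anchors[:, 3]
    boxes = np.stack([ycenter - h / 2, xcenter - w / 2,
                      ycenter + h / 2, xcenter + w / 2], axis=-1)

    # Assume column 0 is background; take the best foreground class per anchor.
    class_ids = raw_scores[:, 1:].argmax(axis=-1) + 1
    class_scores = raw_scores[np.arange(len(raw_scores)), class_ids]

    keep = class_scores >= score_thresh
    return nms(boxes[keep], class_scores[keep], class_ids[keep], iou_thresh)

Usage would be something like boxes, scores, classes = decode_detections(boxes, scores, anchors). If the decoded boxes come out nonsensical, the most likely culprits are the anchor config or the (ty, tx, th, tw) ordering, both of which vary between exporters.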

Has anyone used this model before and decoded the outputs to make predictions with the exported tflite model? Is there a way to include more information when exporting to tflite, so that the two additional outputs mentioned above are produced, already decoded?
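If the goal is just to get decoded, labelled detections out of the exported model, one option (assuming the exported .tflite keeps the metadata that MediaPipe Model Maker writes into it) is to run it through the MediaPipe Tasks ObjectDetector instead of the raw interpreter; it performs the anchor decoding, score thresholding, and NMS internally. 'model.tflite' and 'test.jpg' below are placeholder paths:

import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision

options = vision.ObjectDetectorOptions(
    base_options=python.BaseOptions(model_asset_path='model.tflite'),
    score_threshold=0.3,
    max_results=10)
detector = vision.ObjectDetector.create_from_options(options)

image = mp.Image.create_from_file('test.jpg')
result = detector.detect(image)
for det in result.detections:
    cat = det.categories[0]
    bbox = det.bounding_box  # origin_x, origin_y, width, height in pixels
    print(cat.category_name, cat.score, bbox)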

Thanks,
any help will be very useful,
best regards
Carlos


I am looking for a way to make sense of this as well; it would really be nice if someone could share some info.
In my case I tried to import the model into an Android application to run inference with the TFLite vision tasks; however, it expects 4 output tensors and only finds the 2 mentioned in the post. tflite-model-maker has been out of the picture for quite a while now due to version conflicts, and mediapipe-model-maker seems to be dropping output tensors.
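For context, the 4-tensor layout the Task Library expects is, to my understanding, the one produced by the TFLite_Detection_PostProcess op that TF Object Detection API exports append; a model exported without that postprocessing op only exposes the 2 raw tensors. A quick way to check what a given .tflite actually exposes:

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='model.tflite')  # placeholder path
interpreter.allocate_tensors()
for d in interpreter.get_output_details():
    print(d['name'], d['shape'])
# A Task-Library-compatible model prints 4 outputs:
#   locations (1, max_detections, 4) as ymin, xmin, ymax, xmax (normalized),
#   classes (1, max_detections), scores (1, max_detections), num_detections (1,)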