TFLite inference on a single image

I have gone through the image classification example with TFLite Model Maker: Image classification with TensorFlow Lite Model Maker, and I have generated the custom model.

So now I want to run inference with this model, and when I look at the docs (Image classification with TensorFlow Lite Model Maker) it says the image should be [224, 224, 3] and in the range [0, 1].

The [224, 224, 3] part is fine, but I am puzzled by the [0, 1] normalisation. Currently, I do this:

import tensorflow as tf

def preprocess_image(image_path, input_size):
    """Preprocess the input image to feed to the TFLite model."""
    img = tf.io.read_file(image_path)
    img = tf.io.decode_image(img, channels=3)
    # decode_image already returns uint8, so this cast is a no-op
    img = tf.image.convert_image_dtype(img, tf.uint8)
    print('min max img value', tf.reduce_min(img), tf.reduce_max(img))
    original_image = img
    # note: tf.image.resize returns float32, still in the [0, 255] range
    resized_img = tf.image.resize(img, input_size)
    resized_img = resized_img[tf.newaxis, :]
    return resized_img, original_image

as the preprocessing step, but then my outputs are uint8 values in the [0, 255] range. However, I was hoping to see these confidence scores in the [0, 1] range.
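
For reference, this is roughly how I run the interpreter (a minimal sketch; the model path and test image are placeholders from my setup):

import numpy as np
import tensorflow as tf

# Load the exported TFLite model (placeholder path).
interpreter = tf.lite.Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

# Preprocess a test image and run inference.
preprocessed_image, _ = preprocess_image('test.jpg', (224, 224))
interpreter.set_tensor(input_details['index'],
                       np.asarray(preprocessed_image, dtype=input_details['dtype']))
interpreter.invoke()

# Raw output; for me this comes back as uint8 values in [0, 255].
output = interpreter.get_tensor(output_details['index'])[0]
print(output)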

Am I missing something here?

Hi @John_J_Watson ,

I am a little confused. You are writing about pre-processing but in the end you mention confidence scores.
You can normalize the values by dividing the image pixel RGB values by 255.
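For example (a one-line sketch):

img = tf.cast(img, tf.float32) / 255.0  # pixel values now in [0.0, 1.0]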

Best

@George_Soloupis Thank you for your comment. I apologize - my question was not very clear. So, what I did was follow the tutorial for image classification and produced a tflite model from it. When I run inference on a single image (with preprocessing copied from the tflite webpages), I get a result: an array (of dim = number_of_classes) with values in the range [0, 256] (the sum of the array is always 256). I am somewhat perplexed as to why this is, and I am trying to understand a bit more.

To this end, my first thought was that maybe I need to somehow normalise the input image (I know, now, this does not make sense).

When I try dividing the result array contents by 256, it kinda makes sense as a probability score - but I am unsure. I am unsure because there is a small variation between these scores and the ones output by predict_top_k.

I hope my question is a bit clearer now.

@John_J_Watson If you used a preprocessing step for training, then you have to use the same one for inference, even for a single image. But first check what model has been used as the backbone for training. By checking this you will be able to retrieve the preprocessing steps.

@George_Soloupis thank you again for taking the time to reply to this.

So, I have literally just followed the official tflite classification tutorial to generate the model.
The tutorial simply loads data from folders, from what I understand. The default model is EfficientNet-Lite0. It says the following on the page:

Preprocess the raw input data. Currently, preprocessing steps including normalizing the value of each image pixel to model input scale and resizing it to model input size. EfficientNet-Lite0 have the input scale [0, 1] and the input image size [224, 224, 3].

Now, I am unsure how to scale the input into the [0, 1] range mentioned here, hence my original question, since my input seems to be in the [0, 255] range.

So, I either do this:

def preprocess_image(image_path, input_size):
    """Preprocess the input image to feed to the TFLite model."""
    img = tf.io.read_file(image_path)
    img = tf.io.decode_image(img, channels=3)
    # no normalisation: stays uint8 in [0, 255]
    img = tf.image.convert_image_dtype(img, tf.uint8)
    print('min max img value', tf.reduce_min(img), tf.reduce_max(img))
    original_image = img
    resized_img = tf.image.resize(img, input_size)
    resized_img = resized_img[tf.newaxis, :]
    return resized_img, original_image

or I do:

def preprocess_image(image_path, input_size):
    """Preprocess the input image to feed to the TFLite model."""
    img = tf.io.read_file(image_path)
    img = tf.io.decode_image(img, channels=3)
    # normalise to float32 in [0, 1]
    img = tf.cast(img, dtype=tf.float32) / tf.constant(255, dtype=tf.float32)
    print('min max img value', tf.reduce_min(img), tf.reduce_max(img))
    original_image = img
    resized_img = tf.image.resize(img, input_size)
    resized_img = resized_img[tf.newaxis, :]
    return resized_img, original_image

the difference is:

img = tf.cast(img, dtype=tf.float32) / tf.constant(255, dtype=tf.float32)

which normalises the values to [0, 1]. But then the class predictions are totally wrong.

However, if I do NOT normalise the values to [0, 1], the results look perfect (although the results are in the [0, 255] range), and taking the argmax of this gives me the correct class. So I think I can conclude from these little experiments that the normalisation line is not required. This, however, seems to go against the docs.

So, I am puzzled about what is going on, and I am looking for an official inference example for tflite Model Maker classification :frowning:

I am now wondering if the inference interpreter automatically does some normalisation I am unaware of, and that I probably do not need to do any of these normalisations myself.

When I look at some random posts, such as:

I see that the inference output is in the [0, 255] range for a single image, which is also the case for me.

The other thought I have is that, since I quantise the model via model.export(export_dir='.'), the output range ends up being [0, 255]. I got this idea from here: https://github.com/tensorflow/examples/blob/master/lite/examples/image_classification/android/EXPLORE_THE_CODE.md
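
One way I thought of to check this (a sketch, assuming the standard tf.lite.Interpreter Python API) is to print the input and output details of the exported model; a quantised model should report a uint8 dtype together with its (scale, zero_point) quantisation parameters:

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='model.tflite')  # placeholder path
interpreter.allocate_tensors()

# A quantised Model Maker export should show dtype uint8 here, together
# with non-trivial (scale, zero_point) quantisation parameters.
for detail in interpreter.get_input_details() + interpreter.get_output_details():
    print(detail['name'], detail['dtype'], detail['quantization'])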

So, yeah, I am just a bit confused as to what the correct thing to do here is.

Thank you.

Seems like normalization is already in the model, so you should just resize the image.

Check here for the documentation of Model Maker image classification.

It seems that it is using (image - mean) / std.
You can follow the instructions here to install and use the specific class.
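
In other words, something roughly like this (just a sketch; the normalize name and the mean/std values are placeholders, the real values come from the model spec):

def normalize(image, mean=127.5, std=127.5):  # placeholder mean/std
    """(image - mean) / std normalization, as mentioned above."""
    image = tf.cast(image, tf.float32)
    return (image - mean) / std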

I am tagging also @Yuqi_Li as they are mentioned in the code.

Best

@George_Soloupis so it looks like I do not need to do any normalisation, right (in inference mode, when loading the quantised tflite model)? As @Kzyh points out, just resize the image and feed it in?

The natural question now is:

  • should I pass the image through this class, or just resize and send the image to the tflite model (the one generated by model.export(export_dir='.'))?

I think so. You can save the tflite model file, then open it in Netron and check if there is any preprocessing.

@Kzyh thanks for the Netron tip. So when I load the model, I see the first few steps as:

- input 1_0
- Quantize
- Mul B=127
- Add B=-128
- Conv...
...

The Mul and Add seem like some form of normalization? Is this right?

Yeah, seems it is normalization.

@Kzyh it seems a bit bizarre. Let me explain. So, when I train the model and run predict_top_k like so on a single image from the test set:

    print(model.evaluate_tflite('./src_classify/model.tflite', test_data))
    predicts = model.predict_top_k(test_data)

I get (it is a 2-class problem):

[[('class0', 0.91340613)]]

So, now I export the model, run the same image through it, and I get:

[241  15]
>>> 241/255
0.9450980392156862

so, the scores don't match up… really bizarre.
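
One thing I noticed while digging: as far as I understand, the TFLite convention for dequantising an output tensor is real_value = scale * (quantised_value - zero_point), and for a uint8 softmax output the scale is typically 1/256 rather than 1/255. So a more principled conversion (a sketch, reusing the interpreter object from my inference script above) would be:

# Dequantise the raw uint8 output using the tensor's own quantisation
# parameters instead of a hand-coded divide-by-255.
output_details = interpreter.get_output_details()[0]
scale, zero_point = output_details['quantization']  # e.g. (1/256, 0) for softmax

raw = interpreter.get_tensor(output_details['index'])[0]  # e.g. [241, 15]
scores = scale * (raw.astype('float32') - zero_point)
print(scores)  # with scale = 1/256: 241/256 = 0.94140625

That still would not match predict_top_k exactly, so I suspect the remaining difference comes from the preprocessing.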

My input prep is:

def preprocess_image(image_path, input_size):
    """Preprocess the input image to feed to the TFLite model"""
    img = tf.io.read_file(image_path)
    img = tf.io.decode_image(img, channels=3)
    img = tf.cast(img, tf.float32) * (1. / 255)  # keep on, divide result by 256
    print('min max img value',tf.reduce_min(img),tf.reduce_max(img))
    original_image = img
    resized_img = tf.image.resize(img, input_size)
    resized_img = resized_img[tf.newaxis, :]
    return resized_img, original_image

what could be the problem :cry:

Not sure, but can you try moving this after resizing the image?

* (1. / 255)
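
i.e. something roughly like this (a sketch):

def preprocess_image(image_path, input_size):
    """Preprocess the input image to feed to the TFLite model."""
    img = tf.io.read_file(image_path)
    img = tf.io.decode_image(img, channels=3)
    original_image = img
    resized_img = tf.image.resize(img, input_size)
    resized_img = resized_img * (1. / 255)  # normalise after resizing
    resized_img = resized_img[tf.newaxis, :]
    return resized_img, original_image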