I’m not familiar with TF Serving, so I would appreciate any information or direction on the problem I’m facing.
I’ve trained a classification AutoML model on Vertex AI. Then I downloaded the model locally, and now I am trying to load it and run inference with it directly. To clarify, I have successfully run the model inside the AutoML Docker image, but that is not what I want: I need to load the saved model directly and run inference with it, following the same workflow one would use with a Keras model.
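For reference, this is the kind of workflow I am hoping to reproduce. The path and input shape below are made up; it is just a sketch of the Keras-style usage I have in mind:

import numpy as np
import tensorflow as tf

# Keras-style workflow I would like to have with the AutoML export:
# load the model once, then call predict() on plain arrays.
model = tf.keras.models.load_model('my_keras_model')  # made-up path
sample = np.random.rand(1, 10).astype('float32')      # made-up input shape
print(model.predict(sample))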
I have been struggling with this for several weeks now. I even made a post on StackOverflow, but I got no replies: protocol buffers - Get predictions from Tensorflow Serve SavedModel - Stack Overflow
I have managed to get past that error. I went into the AutoML Docker container, figured out (or so I think) the protobuf functions used to transform the input, and I have been able to feed that input to the model. However, the model now always returns the same output, no matter what I give it as input.
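I am not 100% sure what those transformed inputs actually are; my guess is serialized tf.Example protos, with every column encoded as a bytes feature. If that guess is right, building them by hand would look roughly like this (purely a sketch of my assumption, not code from the container):

import tensorflow as tf

# My assumption: each row becomes a serialized tf.train.Example where every
# column value is stored as a UTF-8 encoded bytes feature.
def row_to_serialized_example(row):
    feature = {
        col: tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[str(val).encode('utf-8')]))
        for col, val in row.items()
    }
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    return example.SerializeToString()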
I have been wondering whether the problem is that I’m loading the model with TF2 functions, so I tried loading it the TF1 way instead. I process the data in the same way and still get the same output every time. I’m at my wits’ end here, so any feedback is appreciated. The relevant part of the code is below.
import numpy as np
# struct2tensor and tensorflow_addons are imported so that the custom ops
# used by the exported AutoML graph are registered before the model is loaded
from struct2tensor import *
import tensorflow_addons as tfa
tfa.register_all()
import tensorflow as tf
import tensorflow.compat.v1 as tf1
tf1.disable_eager_execution()
from sklearn.model_selection import train_test_split
import pandas as pd
import json
# The `translate` module is the one I found in the AutoML Docker container
import prediction.translate as translate
# Read every column as a string
df = pd.read_csv('data/sample_data.csv', converters={i: str for i in range(0, 500)})
target = ['Objective']
train_x = df.drop(target, axis=1)
train_y = df[target]
variables = list(train_x.columns)
(train_x, test_x, train_y, test_y) = train_test_split(train_x, train_y, test_size=0.3, random_state=1)
with tf1.Session() as sess:
    # Load the SavedModel (TF1 style) from the '001' directory with the 'serve' tag
    model = tf1.saved_model.load(sess, ["serve"], '001')
    sig_def = model.signature_def[tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY]
    input_name = sig_def.inputs['inputs'].name
    output_name = sig_def.outputs['scores'].name

    def predict(data):
        res = sess.run(output_name, feed_dict={input_name: data})
        print('Output with input', data, ': ', res)

    for k in range(100):
        # Build a JSON request for a single row, the way the AutoML server would receive it
        data = json.dumps({"instances": [json.loads(train_x.iloc[k].to_json(orient='index'))]})
        # Convert the JSON request into a TF Serving PredictRequest and extract the string tensor
        req, batch_size, err = translate.user_request_to_tf_serving_request(data)
        model_input = req.inputs['inputs'].string_val
        predict(model_input)
This snippet prints the exact same output 100 times, which in my case is [[0.41441688 0.5855831 ]].
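One thing I plan to check, to rule out the translation step, is whether the serialized inputs actually differ from row to row. Something along these lines, reusing the objects from the snippet above (this is just my own debugging idea, not anything from the AutoML code):

import hashlib

# Sanity check: count how many distinct serialized inputs the 100 rows produce
seen = set()
for k in range(100):
    data = json.dumps({"instances": [json.loads(train_x.iloc[k].to_json(orient='index'))]})
    req, batch_size, err = translate.user_request_to_tf_serving_request(data)
    serialized = req.inputs['inputs'].string_val[0]
    seen.add(hashlib.md5(serialized).hexdigest())
print('distinct serialized inputs:', len(seen))

If this prints 1, the problem is in how I build the requests; if it prints 100, the model (or the tensor I am reading) is ignoring the input.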
What I am thinking about trying next is loading the TF1 model and re-saving it as a TF2 SavedModel. I am not sure this will solve my problem, though. As I mentioned, I am really not familiar with TF Serving, so this is quite confusing for me.
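For completeness, this is roughly how I imagine the TF2-style loading would look, in a separate script without tf1.disable_eager_execution(), and assuming the signature key is 'serving_default' and the input key is 'inputs' as in the TF1 code above:

import tensorflow as tf

# Sketch of the TF2-style loading I am considering (same '001' directory as above)
loaded = tf.saved_model.load('001', tags=['serve'])
infer = loaded.signatures['serving_default']

# model_input would be the same list of serialized strings produced by translate above
result = infer(inputs=tf.constant(list(model_input)))
print(result)

I do not know whether this would behave any differently, so please correct me if this is the wrong direction.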
Any help is appreciated.