I have created a TensorFlow SavedModel using tf.keras.models.save_model as shown below:
import tensorflow as tf

# text_model and export_path are defined earlier (the Keras model and the export directory)
tf.keras.models.save_model(
    text_model,
    export_path,
    overwrite=True,
    include_optimizer=True,
    save_format=None,
    signatures=None,
    options=None
)
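For context, the export can be reloaded and its serving signature inspected to confirm it is loadable and accepts raw string inputs; a minimal sketch, assuming export_path is the same directory used above:

import tensorflow as tf

# Reload the SavedModel and print its serving signature; this only verifies
# that the export itself loads and shows the expected input/output structure.
loaded = tf.saved_model.load(export_path)
infer = loaded.signatures["serving_default"]
print(infer.structured_input_signature)
print(infer.structured_outputs)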
I then deployed the model to a Kubernetes cluster using the TensorFlow Serving image. The deployment YAML looks like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: tensorflow-serving
  name: news-classifier
spec:
  selector:
    matchLabels:
      app: news-classifier-server
  replicas: 3
  template:
    metadata:
      labels:
        app: news-classifier-server
    spec:
      containers:
        - name: news-classifier-container
          image: tensorflow/serving:latest
          ports:
            - containerPort: 8501
          volumeMounts:
            - name: news-classifier-vol
              mountPath: "/models/model"
          env:
            - name: MODEL_NAME
              value: "model"
            - name: MODEL_BASE_PATH
              value: "/models"
      volumes:
        - name: news-classifier-vol
          persistentVolumeClaim:
            claimName: news-classifier
---
apiVersion: v1
kind: Service
metadata:
  namespace: tensorflow-serving
  name: news-classifier-service
spec:
  ports:
    - port: 8501
      targetPort: 8501
  selector:
    app: news-classifier-server
  type: ClusterIP
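For reference, tensorflow_model_server expects the model base path to contain numbered version subdirectories (e.g. /models/model/1/saved_model.pb), so the files placed on the PVC need to follow that layout. A minimal sketch of preparing such a layout locally; export_path and the target paths here are illustrative, not the actual paths used in my setup:

import shutil
from pathlib import Path

# Copy the Keras export into a numbered version directory; the server scans
# --model_base_path/<model_name> for integer version subdirectories.
export_path = Path("export")            # assumed: directory written by save_model above
serving_dir = Path("models/model/1")    # version "1" under the model name
serving_dir.parent.mkdir(parents=True, exist_ok=True)
shutil.copytree(export_path, serving_dir)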
I was able to mount the SavedModel inside the pod using the PVC, and the serving logs show no errors.
But when I try to call the model's predict endpoint with the code below, it does not work.
import requests
import json

sample_news = ["In the last weeks, there have been many transfer surprises in football. Ronaldo went back to Old Trafford,",
               "while Messi went to Paris Saint Germain to join his former colleague Neymar.",
               "We can't wait to see how these two clubs will perform in the upcoming leagues."]

data = json.dumps({"instances": sample_news})

# Define headers with content-type set to JSON
headers = {"content-type": "application/json"}

# Capture the response by making a request to the predict endpoint
json_response = requests.post('http://localhost:8501/v1/models/model:predict', data=data, headers=headers)

# Parse the predictions out of the response
predictions = json.loads(json_response.text)['predictions']
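For completeness, the same REST API also exposes model status and metadata endpoints, which can be used to check whether the model version is actually loaded before calling predict; a minimal sketch against the same host and port (responses not shown here):

import requests

# Model status: reports whether a version is AVAILABLE or failed to load
status = requests.get('http://localhost:8501/v1/models/model')
print(status.status_code, status.text)

# Model metadata: shows the input/output signature of the served model
metadata = requests.get('http://localhost:8501/v1/models/model/metadata')
print(metadata.status_code, metadata.text)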
When I run the predict request above, I get the following error:
---------------------------------------------------------------------------
RemoteDisconnected Traceback (most recent call last)
c:\Users\htmrhv\apps\Python\Python38\lib\site-packages\urllib3\connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
698 # Make the request on the httplib connection object.
--> 699 httplib_response = self._make_request(
700 conn,
c:\Users\htmrhv\apps\Python\Python38\lib\site-packages\urllib3\connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
444 # Otherwise it looks like a bug in the code.
--> 445 six.raise_from(e, None)
446 except (SocketTimeout, BaseSSLError, SocketError) as e:
c:\Users\htmrhv\apps\Python\Python38\lib\site-packages\urllib3\packages\six.py in raise_from(value, from_value)
c:\Users\htmrhv\apps\Python\Python38\lib\site-packages\urllib3\connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
439 try:
--> 440 httplib_response = conn.getresponse()
441 except BaseException as e:
c:\Users\htmrhv\apps\Python\Python38\lib\http\client.py in getresponse(self)
1343 try:
-> 1344 response.begin()
1345 except ConnectionError:
c:\Users\htmrhv\apps\Python\Python38\lib\http\client.py in begin(self)
...
--> 498 raise ConnectionError(err, request=request)
499
500 except MaxRetryError as e:
ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
The pod log shows the following output:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
/usr/bin/tf_serving_entrypoint.sh: line 3: 7 Aborted tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"
Please suggest what might be causing this.