Hi all!
Excited to share what @deep-diver and I have been working on for the past few months. In our latest project, we show how to deploy a deep learning model with Docker + Kubernetes + GitHub Actions. We show this with two promising candidates - FastAPI (for REST) and TF Serving (for gRPC).
The idea is to poll the repository's artifacts for newly released models, build a Docker image, and deploy it on a Kubernetes cluster via GKE. All of these steps are automated with GitHub Actions. Additionally, we support polling for new changes to specific locations in the codebase and conditioning the deployment on them.
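To make the triggering logic concrete, here is a minimal Python sketch of the decision described above. It is not the code from our repositories (there, the automation lives in GitHub Actions workflows). The function names, the watched path patterns, and the choice to deploy when either condition holds are all assumptions made for illustration.

```python
from fnmatch import fnmatch
from typing import Iterable, Optional, Sequence

# Hypothetical path patterns; the real watched locations depend on the repository.
WATCHED_PATTERNS: Sequence[str] = ("api/*", "Dockerfile")


def has_new_release(latest_tag: Optional[str], deployed_tag: Optional[str]) -> bool:
    """True when a release exists and its tag differs from the deployed one."""
    return latest_tag is not None and latest_tag != deployed_tag


def touches_watched_paths(
    changed_files: Iterable[str],
    patterns: Sequence[str] = WATCHED_PATTERNS,
) -> bool:
    """True when at least one changed file matches a watched pattern."""
    return any(
        fnmatch(path, pattern)
        for path in changed_files
        for pattern in patterns
    )


def should_deploy(
    latest_tag: Optional[str],
    deployed_tag: Optional[str],
    changed_files: Iterable[str],
    patterns: Sequence[str] = WATCHED_PATTERNS,
) -> bool:
    """Assumed policy: deploy on a new model release OR a change in a watched path."""
    return has_new_release(latest_tag, deployed_tag) or touches_watched_paths(
        changed_files, patterns
    )


if __name__ == "__main__":
    # Example inputs are made up; a real poller would fetch them from the repository.
    print(should_deploy("v1.1", "v1.0", ["README.md"]))
    print(should_deploy("v1.0", "v1.0", ["docs/setup.md"]))
```

In a real pipeline, a positive decision would be followed by the image build and the rollout to the cluster; those steps are left out here because they depend on the project setup.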
We provide example notebooks, extensive instructions, and results to help the community reproduce our work as easily as possible.
- FastAPI deployment: https://github.com/sayakpaul/ml-deployment-k8s-fastapi
- TF Serving deployment: https://github.com/deep-diver/ml-deployment-k8s-tfserving
Up next: a blog post detailing the load-testing performance of FastAPI and TF Serving. Stay tuned!