Deploying YOLOv8 Object Detection Model with TensorFlow Serving
In this guide, we will explain how to deploy a YOLOv8 object detection model using TensorFlow Serving. YOLOv8 is a state-of-the-art (SOTA) model that builds on the success of previous YOLO versions, providing cutting-edge accuracy and speed. Beyond object detection and tracking, it can also perform instance segmentation, image classification, and pose estimation. TensorFlow Serving is a flexible, high-performance serving system for machine learning models in production environments. It provides features such as model versioning, canarying and A/B testing, batching, and model caching, and it is easy to use, scalable, secure, and monitorable, which enables you to deploy your model efficiently and effectively.
Getting Started
Clone the Repository
Begin by cloning the yolov8_tf-serving repository to your local machine:
git clone https://github.com/Kawaeee/yolov8_tf-serving.git
cd yolov8_tf-serving/
Build the Docker Image
Next, build the Docker image for YOLOv8:
docker build -t yolov8conv .
Depending on your hardware, you can choose between CPU and GPU support. Use one of the following commands to access the Docker container’s bash shell:
CPU:
docker run -it -v $(pwd):/data --rm yolov8conv /bin/bash
GPU:
docker run -it -v $(pwd):/data --gpus all --rm yolov8conv /bin/bash
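If you use the GPU variant, you can optionally verify that the container can see your GPU before running the conversion. This is a quick sanity check, assuming the NVIDIA Container Toolkit is installed on the host (it injects nvidia-smi into containers started with --gpus):

# Inside the container: list visible GPUs and driver version.
# Requires NVIDIA drivers plus the NVIDIA Container Toolkit on the host.
nvidia-smi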
Run the Conversion Script
Execute the run.sh script with your YOLOv8 model file (in .pt format) as an argument. This script will perform the following tasks:
- Export the YOLOv8 model to ONNX format (export.py).
- Convert the ONNX model to TensorFlow SavedModel format (convert.py).
- Add pre-processing and post-processing layers to the TensorFlow SavedModel (customize.py).
In this example, we will use a pre-trained YOLOv8 model (YOLOv8l) as input:
wget -O /data/yolov8l.pt https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8l.pt
bash /app/run.sh /data/yolov8l.pt
After running the conversion script, you’ll obtain the converted model. Transfer the contents of the runs/xxxxxxxx/output directory to the demo/models directory.
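For example, a minimal copy might look like the following; the run folder name (shown as <run_id> here) is generated per run, so substitute the actual directory name:

# Hypothetical example: replace <run_id> with the actual run directory name
# so the converted SavedModel ends up under demo/models/output/.
cp -r runs/<run_id>/output demo/models/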
Setting up TensorFlow Serving
Navigate to the demo/ directory and run one of the following commands. This will set up a Docker container running TensorFlow Serving, with your converted YOLOv8 model ready for inference.
CPU:
docker run -it -p 8501:8501 -p 8500:8500 \
--mount type=bind,source=$PWD/models/output/,target=/models/output/ \
--mount type=bind,source=$PWD/models/models.config,target=/models/models.config \
-t tensorflow/serving:latest --model_config_file=/models/models.config
GPU:
docker run --rm --gpus all -p 8501:8501 -p 8500:8500 \
--mount type=bind,source=$PWD/models/output/,target=/models/output/ \
--mount type=bind,source=$PWD/models/models.config,target=/models/models.config \
-t tensorflow/serving:latest-gpu --model_config_file=/models/models.config
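Once the container is up, you can confirm the model loaded successfully by querying TensorFlow Serving’s model status endpoint. The model name used below (output, matching the mounted directory) is an assumption; use whatever name is declared in your models.config:

# Check that the model loaded; its state should be AVAILABLE.
# "output" is an assumed model name; match it to your models.config.
curl http://localhost:8501/v1/models/output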
Obtaining Prediction Results
After setting up the TensorFlow Serving container, you can obtain prediction results by making RESTful or gRPC requests to the TensorFlow Serving API.
- RESTful API sample request notebook: rest.ipynb
- gRPC API sample request notebook: grpc.ipynb
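The notebooks show the full request format. If you just want to inspect the input and output tensors the served model expects (which depend on the layers customize.py added), you can query the metadata endpoint, again assuming the model name output:

# Print the served model's signature: input/output tensor names and shapes.
curl http://localhost:8501/v1/models/output/metadata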
For more details and updates, check out the GitHub repository.
Additional resources: Ultralytics YOLOv8, TensorFlow Serving, onnx2tf
Thank you for reading. I hope this guide helps. Feel free to ask for clarification or more information.