This repository provides a Rust demonstration for performing Ultralytics YOLOv8 tasks such as Classification, Segmentation, Detection, Pose Estimation, and Oriented Bounding Box (OBB) detection using ONNX Runtime. The newly updated YOLOv8 example code is located in this repository.
This example supports:

- Classification, Segmentation, Detection, Pose (Keypoints) Detection, and OBB tasks
- FP16 & FP32 ONNX models
- CPU, CUDA, and TensorRT execution providers to accelerate computation
- Dynamic input shapes (batch, width, height)

To build it, first follow the official Rust installation guide: https://www.rust-lang.org/tools/install.
You will also need the ONNX Runtime shared library. Point the `ORT_DYLIB_PATH` environment variable at it:

```bash
export ORT_DYLIB_PATH=/path/to/onnxruntime/lib/libonnxruntime.so.1.19.0 # Adjust version/path as needed
```
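The `ort` crate reads this variable at runtime when built with its `load-dynamic` feature, so a wrong path only surfaces once the library is loaded. As a convenience, here is a minimal sketch (standard library only; the `check_ort_dylib` helper is hypothetical, not part of this repository) that fails fast with a clear message:

```rust
use std::{env, path::Path};

// Hypothetical helper: verify ORT_DYLIB_PATH before doing any real work,
// so a missing or mistyped path produces an immediate, readable error.
fn check_ort_dylib() -> Result<(), String> {
    let path = env::var("ORT_DYLIB_PATH")
        .map_err(|_| "ORT_DYLIB_PATH is not set".to_string())?;
    if Path::new(&path).is_file() {
        Ok(())
    } else {
        Err(format!("ORT_DYLIB_PATH points to a missing file: {path}"))
    }
}

fn main() {
    if let Err(msg) = check_ort_dylib() {
        eprintln!("{msg}");
        std::process::exit(1);
    }
    println!("ONNX Runtime library found");
}
```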
First, install the Ultralytics package:

```bash
pip install -U ultralytics
```

Then, export the desired Ultralytics YOLOv8 models to the ONNX format. See the Export documentation for more details.
```bash
# Export ONNX models with dynamic shapes (recommended for flexibility)
yolo export model=yolov8m.pt format=onnx simplify dynamic
yolo export model=yolov8m-cls.pt format=onnx simplify dynamic
yolo export model=yolov8m-pose.pt format=onnx simplify dynamic
yolo export model=yolov8m-seg.pt format=onnx simplify dynamic
# yolo export model=yolov8m-obb.pt format=onnx simplify dynamic # Add OBB export if needed

# Export ONNX models with constant shapes (if dynamic shapes are not required)
# yolo export model=yolov8m.pt format=onnx simplify
# yolo export model=yolov8m-cls.pt format=onnx simplify
# yolo export model=yolov8m-pose.pt format=onnx simplify
# yolo export model=yolov8m-seg.pt format=onnx simplify
# yolo export model=yolov8m-obb.pt format=onnx simplify
```
The following command runs inference with the specified ONNX model on a source image using the CPU:

```bash
cargo run --release -- --model MODEL_PATH.onnx --source SOURCE_IMAGE.jpg
```
Set `--cuda` to use the CUDA execution provider for faster inference on NVIDIA GPUs.

```bash
cargo run --release -- --cuda --model MODEL_PATH.onnx --source SOURCE_IMAGE.jpg
```
Set `--trt` to use the TensorRT execution provider. You can also set `--fp16` at the same time to leverage the TensorRT FP16 engine for potentially even greater speed, especially on compatible hardware.

```bash
cargo run --release -- --trt --fp16 --model MODEL_PATH.onnx --source SOURCE_IMAGE.jpg
```
Set `--device_id` to select a specific GPU device. If the specified device ID is invalid (e.g., setting `--device_id 1` when only one GPU exists), `ort` will automatically fall back to the CPU execution provider without causing a panic.

```bash
cargo run --release -- --cuda --device_id 0 --model MODEL_PATH.onnx --source SOURCE_IMAGE.jpg
```
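That fallback happens at session-creation time, when execution-provider registration fails. The sketch below is a rough illustration written against the `ort` 2.x builder API; the names used (`Session::builder`, `with_device_id`, `commit_from_file`) are assumptions and may differ from the `ort` version this example pins:

```rust
// Hedged sketch: CUDA EP registration with silent CPU fallback (ort 2.x-style
// API; names are assumptions, not necessarily this repository's code).
use ort::execution_providers::CUDAExecutionProvider;
use ort::session::Session;

fn main() -> ort::Result<()> {
    let _session = Session::builder()?
        // If this device id is invalid, registration fails, ort logs a
        // warning, and the session runs on the default CPU provider instead.
        .with_execution_providers([CUDAExecutionProvider::default()
            .with_device_id(0)
            .build()])?
        .commit_from_file("yolov8m.onnx")?;
    println!("session created");
    Ok(())
}
```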
Set `--batch` to perform inference with a specific batch size.

```bash
cargo run --release -- --cuda --batch 2 --model MODEL_PATH.onnx --source SOURCE_IMAGE.jpg
```
If you're using `--trt` with a model exported with a dynamic batch dimension, you can explicitly specify the minimum, optimal, and maximum batch sizes for TensorRT optimization using `--batch-min`, `--batch`, and `--batch-max`. Refer to the TensorRT Execution Provider documentation for details.
Set `--height` and `--width` to perform inference with dynamic image sizes. Note: the ONNX model must have been exported with dynamic input shapes (`dynamic=True`).

```bash
cargo run --release -- --cuda --width 480 --height 640 --model MODEL_PATH_dynamic.onnx --source SOURCE_IMAGE.jpg
```
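Under the hood, `--batch`, `--height`, and `--width` simply determine the shape of the NCHW input tensor handed to the session. A minimal sketch using the `ndarray` crate (an assumption for illustration; the repository may construct its input differently):

```rust
// Sketch: the CLI flags map onto the NCHW input shape. A dynamic-shape ONNX
// export accepts varying values along its dynamic axes; a fixed-shape export
// only accepts the exact shape it was exported with.
use ndarray::Array4;

fn main() {
    let (batch, height, width) = (2usize, 640usize, 480usize);
    // Zero-filled placeholder; real code would fill this with the
    // letterboxed, normalized image pixels.
    let input: Array4<f32> = Array4::zeros((batch, 3, height, width));
    println!("input shape: {:?}", input.shape()); // [2, 3, 640, 480]
}
```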
Set `--profile` to measure the time consumed in each stage of the inference pipeline (preprocessing, host-to-device transfer, inference, device-to-host transfer, postprocessing). Note: models often require a few warm-up runs (1-3 iterations) before reaching optimal performance, so run the command enough times to get a stable measurement.

```bash
cargo run --release -- --trt --fp16 --profile --model MODEL_PATH.onnx --source SOURCE_IMAGE.jpg
```
Example profile output (yolov8m.onnx, batch=1, 3 runs, TensorRT FP16, RTX 3060 Ti):

```text
==> 0 # Warm-up run
[Model Preprocess]: 12.75788ms
[ORT H2D]: 237.118µs
[ORT Inference]: 507.895469ms
[ORT D2H]: 191.655µs
[Model Inference]: 508.34589ms
[Model Postprocess]: 1.061122ms
==> 1 # Stable run
[Model Preprocess]: 13.658655ms
[ORT H2D]: 209.975µs
[ORT Inference]: 5.12372ms
[ORT D2H]: 182.389µs
[Model Inference]: 5.530022ms
[Model Postprocess]: 1.04851ms
==> 2 # Stable run
[Model Preprocess]: 12.475332ms
[ORT H2D]: 246.127µs
[ORT Inference]: 5.048432ms
[ORT D2H]: 187.117µs
[Model Inference]: 5.493119ms
[Model Postprocess]: 1.040906ms
```
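The run-0 inference time is dominated by TensorRT engine construction, which is why only the later runs reflect steady-state speed. Per-stage numbers like these can be collected with simple `Instant`-based instrumentation; here is a minimal sketch (standard library only; the `timed` helper is hypothetical, not the repository's profiler):

```rust
use std::time::Instant;

// Hypothetical helper: run a closure and print its elapsed time with a
// stage label, mirroring the per-stage lines shown above.
fn timed<T>(label: &str, f: impl FnOnce() -> T) -> T {
    let start = Instant::now();
    let out = f();
    println!("[{label}]: {:?}", start.elapsed());
    out
}

fn main() {
    for run in 0..3 {
        println!("==> {run}");
        // Run 0 acts as the warm-up; later runs give stable timings.
        timed("Model Preprocess", || { /* resize + normalize the image */ });
        timed("Model Inference", || { /* session.run(...) */ });
        timed("Model Postprocess", || { /* decode boxes, apply NMS, etc. */ });
    }
}
```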
Additional inference options:

- `--conf`: Confidence threshold for detections [default: 0.3]
- `--iou`: IoU (Intersection over Union) threshold for Non-Maximum Suppression (NMS) [default: 0.45]; see the IoU sketch after the help command below
- `--kconf`: Confidence threshold for keypoints (in Pose Estimation) [default: 0.55]
- `--plot`: Plot the inference results with random RGB colors and save the output image to the `runs` directory

You can view all available command-line arguments by running:
```bash
# Clone the repository if you haven't already
# git clone https://github.com/ultralytics/ultralytics
# cd ultralytics/examples/YOLOv8-ONNXRuntime-Rust
cargo run --release -- --help
```
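For intuition about the `--iou` threshold, here is a minimal, self-contained sketch of the pairwise IoU test that NMS applies (illustrative only, not the repository's implementation):

```rust
// Sketch of the IoU computation used by NMS: two axis-aligned boxes in
// (x1, y1, x2, y2) form. Boxes overlapping a higher-confidence box by more
// than the --iou threshold get suppressed.
fn iou(a: [f32; 4], b: [f32; 4]) -> f32 {
    let ix = (a[2].min(b[2]) - a[0].max(b[0])).max(0.0);
    let iy = (a[3].min(b[3]) - a[1].max(b[1])).max(0.0);
    let inter = ix * iy;
    let area_a = (a[2] - a[0]) * (a[3] - a[1]);
    let area_b = (b[2] - b[0]) * (b[3] - b[1]);
    inter / (area_a + area_b - inter)
}

fn main() {
    // These boxes have IoU ≈ 0.47, above the default --iou 0.45 cutoff.
    let suppressed = iou([0.0, 0.0, 10.0, 10.0], [2.0, 2.0, 12.0, 12.0]) > 0.45;
    println!("suppress: {suppressed}");
}
```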
Running a dynamic-shape ONNX classification model on the CPU with a specific image size (`--height 224 --width 224`). The plotted result image will be saved in the `runs` directory.

```bash
cargo run --release -- --model ../assets/weights/yolov8m-cls-dyn.onnx --source ../assets/images/dog.jpg --height 224 --width 224 --plot --profile
```
Example output:

```text
Summary:
> Task: Classify (Ultralytics 8.0.217) # Version might differ
> EP: Cpu
> Dtype: Float32
> Batch: 1 (Dynamic), Height: 224 (Dynamic), Width: 224 (Dynamic)
> nc: 1000 nk: 0, nm: 0, conf: 0.3, kconf: 0.55, iou: 0.45

[Model Preprocess]: 16.363477ms
[ORT H2D]: 50.722µs
[ORT Inference]: 16.295808ms
[ORT D2H]: 8.37µs
[Model Inference]: 16.367046ms
[Model Postprocess]: 3.527µs
[
    YOLOResult {
        Probs(top5): Some([(208, 0.6950566), (209, 0.13823675), (178, 0.04849795), (215, 0.019029364), (212, 0.016506357)]), # Class IDs and confidences
        Bboxes: None,
        Keypoints: None,
        Masks: None,
    },
]
```
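The `Probs(top5)` line above is a list of `(class_id, confidence)` pairs. Recovering such a list from a flat probability vector is straightforward; a small illustrative sketch (not the repository's API):

```rust
// Sketch: extract the top-k (class_id, confidence) pairs from a flat
// probability vector, as in the Probs(top5) output above.
fn top_k(probs: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut indexed: Vec<(usize, f32)> = probs.iter().copied().enumerate().collect();
    indexed.sort_by(|a, b| b.1.total_cmp(&a.1)); // descending by confidence
    indexed.truncate(k);
    indexed
}

fn main() {
    let probs = [0.01, 0.70, 0.09, 0.20];
    println!("{:?}", top_k(&probs, 3)); // [(1, 0.7), (3, 0.2), (2, 0.09)]
}
```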
Using the CUDA execution provider and a dynamic image size (`--height 640 --width 480`):

```bash
cargo run --release -- --cuda --model ../assets/weights/yolov8m-dynamic.onnx --source ../assets/images/bus.jpg --plot --height 640 --width 480
```
Using the TensorRT execution provider:

```bash
cargo run --release -- --trt --model ../assets/weights/yolov8m-pose.onnx --source ../assets/images/bus.jpg --plot
```
Using the TensorRT execution provider with an FP16 model (`--fp16`):

```bash
cargo run --release -- --trt --fp16 --model ../assets/weights/yolov8m-seg.onnx --source ../assets/images/0172.jpg --plot
```
Contributions are welcome! If you find any issues or have suggestions for improvement, please feel free to open an issue or submit a pull request to the main Ultralytics repository.