---
comments: true
description: Discover YOLOv10 for real-time object detection, eliminating NMS and boosting efficiency. Achieve top performance with a low computational cost.
keywords: YOLOv10, real-time object detection, NMS-free, deep learning, Tsinghua University, Ultralytics, machine learning, neural networks, performance optimization
---
YOLOv10, built on the Ultralytics Python package by researchers at Tsinghua University, introduces a new approach to real-time object detection, addressing both the post-processing and model architecture deficiencies found in previous YOLO versions. By eliminating non-maximum suppression (NMS) and optimizing various model components, YOLOv10 achieves state-of-the-art performance with significantly reduced computational overhead. Extensive experiments demonstrate its superior accuracy-latency trade-offs across multiple model scales.
**Watch:** How to Train YOLOv10 on SKU-110k Dataset using Ultralytics | Retail Dataset
Real-time object detection aims to accurately predict object categories and positions in images with low latency. The YOLO series has been at the forefront of this research due to its balance between performance and efficiency. However, reliance on NMS and architectural inefficiencies have hindered optimal performance. YOLOv10 addresses these issues by introducing consistent dual assignments for NMS-free training and a holistic efficiency-accuracy driven model design strategy.
The architecture of YOLOv10 builds upon the strengths of previous YOLO models while introducing several key innovations. The model architecture consists of the following components:

1. **Backbone**: Responsible for feature extraction, the backbone in YOLOv10 uses an enhanced version of CSPNet (Cross Stage Partial Network) to improve gradient flow and reduce computational redundancy.
2. **Neck**: Aggregates features from different scales and passes them to the head, using PAN (Path Aggregation Network) layers for effective multiscale feature fusion.
3. **One-to-Many Head**: Generates multiple predictions per object during training to provide rich supervisory signals and improve learning accuracy.
4. **One-to-One Head**: Generates a single best prediction per object during inference, eliminating the need for NMS and thereby reducing latency.
YOLOv10 comes in various model scales to cater to different application needs:

- **YOLOv10-N**: Nano version for extremely resource-constrained environments.
- **YOLOv10-S**: Small version balancing speed and accuracy.
- **YOLOv10-M**: Medium version for general-purpose use.
- **YOLOv10-B**: Balanced version with increased width for higher accuracy.
- **YOLOv10-L**: Large version for higher accuracy at the cost of increased computational resources.
- **YOLOv10-X**: Extra-large version for maximum accuracy and performance.
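To choose between scales in practice, it helps to compare their footprints directly. The sketch below loads each checkpoint and prints its layer, parameter, and GFLOPs summary via `model.info()`; the weight filenames match the download links later on this page.

```python
from ultralytics import YOLO

# Compare the footprint of each YOLOv10 scale before committing to one.
for scale in ("n", "s", "m", "b", "l", "x"):
    model = YOLO(f"yolov10{scale}.pt")  # weights are downloaded on first use
    model.info()  # prints layers, parameters, and GFLOPs for this scale
```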
YOLOv10 outperforms previous YOLO versions and other state-of-the-art models in terms of accuracy and efficiency. For example, YOLOv10s is 1.8x faster than RT-DETR-R18 with similar AP on the COCO dataset, and YOLOv10b has 46% less latency and 25% fewer parameters than YOLOv9-C with the same performance.
!!! tip "Performance"
=== "Detection (COCO)"
Latency measured with TensorRT FP16 on T4 GPU.
| Model | Input Size | AP<sup>val</sup> | FLOPs (G) | Latency (ms) |
| ------------- | ---------- | ---------------- | --------- | ------------ |
| [YOLOv10n][1] | 640 | 38.5 | **6.7** | **1.84** |
| [YOLOv10s][2] | 640 | 46.3 | 21.6 | 2.49 |
| [YOLOv10m][3] | 640 | 51.1 | 59.1 | 4.74 |
| [YOLOv10b][4] | 640 | 52.5 | 92.0 | 5.74 |
| [YOLOv10l][5] | 640 | 53.2 | 120.3 | 7.28 |
| [YOLOv10x][6] | 640 | **54.4** | 160.4 | 10.70 |
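To reproduce comparable speed and accuracy numbers on your own hardware, one option is the Ultralytics benchmark utility, sketched below with the `coco8.yaml` example dataset. Absolute latencies will differ from the table above, which was measured with TensorRT FP16 on a T4 GPU.

```python
from ultralytics.utils.benchmarks import benchmark

# Benchmark YOLOv10n across the supported export formats on GPU 0.
# half=True roughly mirrors the FP16 setting used for the table above.
benchmark(model="yolov10n.pt", data="coco8.yaml", imgsz=640, half=True, device=0)
```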
YOLOv10 employs dual label assignments, combining one-to-many and one-to-one strategies during training to ensure rich supervision and efficient end-to-end deployment. The consistent matching metric aligns the supervision between both strategies, enhancing the quality of predictions during inference.
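As a rough illustration of that consistent matching metric (a sketch, not the actual training code), the paper scores each prediction-target pair as m(α, β) = s · p^α · IoU^β, where p is the classification score and s is a spatial prior; both heads rank candidates with this same formula so their best matches agree. The α and β values below are illustrative defaults, not necessarily the paper's exact settings.

```python
import torch


def matching_metric(p, iou, alpha=0.5, beta=6.0, s=1.0):
    """m(alpha, beta) = s * p**alpha * IoU**beta.

    p: classification score for the target class
    iou: IoU between the predicted and ground-truth boxes
    s: spatial prior (1 if the anchor point lies inside the instance, else 0)
    """
    return s * p**alpha * iou**beta


# Because the one-to-many and one-to-one heads share this metric, the
# one-to-one head's single match coincides with the one-to-many head's top
# candidate, which is what lets inference skip NMS entirely.
p = torch.tensor([0.9, 0.6, 0.8])
iou = torch.tensor([0.85, 0.90, 0.40])
print(matching_metric(p, iou))
```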
YOLOv10 has been extensively tested on standard benchmarks like COCO, demonstrating superior performance and efficiency. The model achieves state-of-the-art results across different variants, showcasing significant improvements in latency and accuracy compared to previous versions and other contemporary detectors.
Compared to other state-of-the-art detectors:
!!! tip "Performance"
=== "Detection (COCO)"
Here is a detailed comparison of YOLOv10 variants with other state-of-the-art models:
| Model | Params<br><sup>(M) | FLOPs<br><sup>(G) | mAP<sup>val<br>50-95 | Latency<br><sup>(ms) | Latency-forward<br><sup>(ms) |
| ----------------- | ------------------ | ----------------- | -------------------- | -------------------- | ---------------------------- |
| YOLOv6-3.0-N | 4.7 | 11.4 | 37.0 | 2.69 | **1.76** |
| Gold-YOLO-N | 5.6 | 12.1 | **39.6** | 2.92 | 1.82 |
| YOLOv8n | 3.2 | 8.7 | 37.3 | 6.16 | 1.77 |
| **[YOLOv10n][1]** | **2.3** | **6.7** | 39.5 | **1.84** | 1.79 |
| | | | | | |
| YOLOv6-3.0-S | 18.5 | 45.3 | 44.3 | 3.42 | 2.35 |
| Gold-YOLO-S | 21.5 | 46.0 | 45.4 | 3.82 | 2.73 |
| YOLOv8s | 11.2 | 28.6 | 44.9 | 7.07 | **2.33** |
| **[YOLOv10s][2]** | **7.2** | **21.6** | **46.8** | **2.49** | 2.39 |
| | | | | | |
| RT-DETR-R18 | 20.0 | 60.0 | 46.5 | **4.58** | **4.49** |
| YOLOv6-3.0-M | 34.9 | 85.8 | 49.1 | 5.63 | 4.56 |
| Gold-YOLO-M | 41.3 | 87.5 | 49.8 | 6.38 | 5.45 |
| YOLOv8m | 25.9 | 78.9 | 50.6 | 9.50 | 5.09 |
| **[YOLOv10m][3]** | **15.4** | **59.1** | **51.3** | 4.74 | 4.63 |
| | | | | | |
| YOLOv6-3.0-L | 59.6 | 150.7 | 51.8 | 9.02 | 7.90 |
| Gold-YOLO-L | 75.1 | 151.7 | 51.8 | 10.65 | 9.78 |
| YOLOv8l | 43.7 | 165.2 | 52.9 | 12.39 | 8.06 |
| RT-DETR-R50 | 42.0 | 136.0 | 53.1 | 9.20 | 9.07 |
| **[YOLOv10l][5]** | **24.4** | **120.3** | **53.4** | **7.28** | **7.21** |
| | | | | | |
| YOLOv8x | 68.2 | 257.8 | 53.9 | 16.86 | 12.83 |
| RT-DETR-R101 | 76.0 | 259.0 | 54.3 | 13.71 | 13.58 |
| **[YOLOv10x][6]** | **29.5** | **160.4** | **54.4** | **10.70** | **10.60** |
[1]: https://github.com/ultralytics/assets/releases/download/v8.3.0/yolov10n.pt
[2]: https://github.com/ultralytics/assets/releases/download/v8.3.0/yolov10s.pt
[3]: https://github.com/ultralytics/assets/releases/download/v8.3.0/yolov10m.pt
[4]: https://github.com/ultralytics/assets/releases/download/v8.3.0/yolov10b.pt
[5]: https://github.com/ultralytics/assets/releases/download/v8.3.0/yolov10l.pt
[6]: https://github.com/ultralytics/assets/releases/download/v8.3.0/yolov10x.pt
For predicting new images with YOLOv10:
!!! example
=== "Python"
```python
from ultralytics import YOLO
# Load a pre-trained YOLOv10n model
model = YOLO("yolov10n.pt")
# Perform object detection on an image
results = model("image.jpg")
# Display the results
results[0].show()
```
=== "CLI"
```bash
# Load a COCO-pretrained YOLOv10n model and run inference on the 'bus.jpg' image
yolo detect predict model=yolov10n.pt source=path/to/bus.jpg
```
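Beyond displaying annotated images, the returned `Results` objects expose the raw detections directly, with no separate NMS step required. A short sketch of reading boxes, confidences, and class indices, using the same placeholder `image.jpg` as above:

```python
from ultralytics import YOLO

model = YOLO("yolov10n.pt")
results = model("image.jpg")

# Iterate over detections in the first (and only) result
for box in results[0].boxes:
    print(box.xyxy)  # bounding box as (x1, y1, x2, y2) pixel coordinates
    print(box.conf)  # confidence score
    print(box.cls)   # class index
```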
For training YOLOv10 on a custom dataset:
!!! example
=== "Python"
```python
from ultralytics import YOLO
# Build a new YOLOv10n model from scratch (randomly initialized from its YAML config)
model = YOLO("yolov10n.yaml")
# Train the model
model.train(data="coco8.yaml", epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Build a YOLOv10n model from scratch and train it on the COCO8 example dataset for 100 epochs
yolo train model=yolov10n.yaml data=coco8.yaml epochs=100 imgsz=640
# Build a YOLOv10n model from scratch and run inference on the 'bus.jpg' image
yolo predict model=yolov10n.yaml source=path/to/bus.jpg
```
The YOLOv10 model series offers a range of models, each optimized for high-performance object detection. These models cater to varying computational needs and accuracy requirements, making them versatile for a wide array of applications.
| Model   | Filenames                                                                | Tasks            | Inference | Validation | Training | Export |
| ------- | ------------------------------------------------------------------------ | ---------------- | --------- | ---------- | -------- | ------ |
| YOLOv10 | yolov10n.pt yolov10s.pt yolov10m.pt yolov10b.pt yolov10l.pt yolov10x.pt | Object Detection | ✅        | ✅         | ✅       | ✅     |
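Validation follows the same API pattern as prediction and training; the minimal sketch below uses the `coco8.yaml` example dataset referenced elsewhere on this page (substitute your own dataset YAML):

```python
from ultralytics import YOLO

# Validate a pretrained YOLOv10n checkpoint and report COCO-style mAP
model = YOLO("yolov10n.pt")
metrics = model.val(data="coco8.yaml", imgsz=640)
print(metrics.box.map)  # mAP50-95
```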
Due to the new operations introduced with YOLOv10, not all export formats provided by Ultralytics are currently supported. The table below outlines which formats have been successfully converted using Ultralytics for YOLOv10. Feel free to open a pull request if you can contribute export support for additional formats.
| Export Format | Export Support | Exported Model Inference | Notes                                                    |
| ------------- | -------------- | ------------------------ | -------------------------------------------------------- |
| TorchScript   | ✅             | ✅                       | Standard PyTorch model format.                            |
| ONNX          | ✅             | ✅                       | Widely supported for deployment.                          |
| OpenVINO      | ✅             | ✅                       | Optimized for Intel hardware.                             |
| TensorRT      | ✅             | ✅                       | Optimized for NVIDIA GPUs.                                |
| CoreML        | ✅             | ✅                       | Limited to Apple devices.                                 |
| TF SavedModel | ✅             | ✅                       | TensorFlow's standard model format.                       |
| TF GraphDef   | ✅             | ✅                       | Legacy TensorFlow format.                                 |
| TF Lite       | ✅             | ✅                       | Optimized for mobile and embedded.                        |
| TF Edge TPU   | ✅             | ✅                       | Specific to Google's Edge TPU devices.                    |
| TF.js         | ✅             | ✅                       | JavaScript environment for browser use.                   |
| PaddlePaddle  | ❌             | ❌                       | Popular in China; less global support.                    |
| NCNN          | ✅             | ❌                       | Fails with `Layer torch.topk not exists or registered`.   |
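For instance, exporting to ONNX, one of the supported formats above, and reloading the exported file for inference can be sketched as follows (`image.jpg` is a placeholder):

```python
from ultralytics import YOLO

model = YOLO("yolov10n.pt")

# Export to ONNX; this writes yolov10n.onnx alongside the weights
model.export(format="onnx")

# Exported models can be loaded back through the same YOLO interface
onnx_model = YOLO("yolov10n.onnx")
results = onnx_model("image.jpg")
```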
YOLOv10 sets a new standard in real-time object detection by addressing the shortcomings of previous YOLO versions and incorporating innovative design strategies. Its ability to deliver high accuracy with low computational cost makes it an ideal choice for a wide range of real-world applications including manufacturing, retail, and autonomous vehicles.
We would like to acknowledge the YOLOv10 authors from Tsinghua University for their extensive research and significant contributions to the Ultralytics framework:
!!! quote ""
=== "BibTeX"
```bibtex
@article{THU-MIGyolov10,
  title={YOLOv10: Real-Time End-to-End Object Detection},
  author={Ao Wang and Hui Chen and Lihao Liu and others},
  journal={arXiv preprint arXiv:2405.14458},
  year={2024},
  institution={Tsinghua University},
  license={AGPL-3.0}
}
```
For detailed implementation, architectural innovations, and experimental results, please refer to the YOLOv10 research paper and GitHub repository by the Tsinghua University team.
YOLOv10, developed by researchers at Tsinghua University, introduces several key innovations to real-time object detection. It eliminates the need for non-maximum suppression (NMS) by employing consistent dual assignments during training and optimized model components for superior performance with reduced computational overhead. For more details on its architecture and key features, check out the YOLOv10 overview section.
For easy inference, you can use the Ultralytics YOLO Python library or the command line interface (CLI). Below are examples of predicting new images using YOLOv10:
!!! example
=== "Python"
```python
from ultralytics import YOLO
# Load the pre-trained YOLOv10n model
model = YOLO("yolov10n.pt")
results = model("image.jpg")
results[0].show()
```
=== "CLI"
```bash
yolo detect predict model=yolov10n.pt source=path/to/image.jpg
```
For more usage examples, visit our Usage Examples section.
YOLOv10 offers several model variants, from YOLOv10-N through YOLOv10-X, to cater to different use cases. Each variant is designed for different computational needs and accuracy requirements, making the family versatile for a variety of applications. Explore the Model Variants section for more information.
YOLOv10 eliminates the need for non-maximum suppression (NMS) during inference by employing consistent dual assignments for training. This approach reduces inference latency and enhances prediction efficiency. The architecture also includes a one-to-one head for inference, ensuring that each object gets a single best prediction. For a detailed explanation, see the Consistent Dual Assignments for NMS-Free Training section.
YOLOv10 supports several export formats, including TorchScript, ONNX, OpenVINO, and TensorRT. However, not all export formats provided by Ultralytics are currently supported for YOLOv10 due to its new operations. For details on the supported formats and instructions on exporting, visit the Exporting YOLOv10 section.
YOLOv10 outperforms previous YOLO versions and other state-of-the-art models in both accuracy and efficiency. For example, YOLOv10s is 1.8x faster than RT-DETR-R18 with a similar AP on the COCO dataset. YOLOv10b shows 46% less latency and 25% fewer parameters than YOLOv9-C with the same performance. Detailed benchmarks can be found in the Comparisons section.