---
comments: true
description: Learn about essential data augmentation techniques in Ultralytics YOLO. Explore various transformations, their impacts, and how to implement them effectively for improved model performance.
keywords: YOLO data augmentation, computer vision, deep learning, image transformations, model training, Ultralytics YOLO, HSV adjustments, geometric transformations, mosaic augmentation
---
Data augmentation is a crucial technique in computer vision that artificially expands your training dataset by applying various transformations to existing images. When training deep learning models like Ultralytics YOLO, data augmentation helps improve model robustness, reduces overfitting, and enhances generalization to real-world scenarios.
**Watch:** How to use Mosaic, MixUp & more Data Augmentations to help Ultralytics YOLO Models generalize better 🚀
Data augmentation serves multiple critical purposes in training computer vision models: it increases the effective diversity of your training data, improves robustness to lighting, viewpoint, and occlusion changes, and reduces overfitting.
Ultralytics YOLO's implementation provides a comprehensive suite of augmentation techniques, each serving specific purposes and contributing to model performance in different ways. This guide will explore each augmentation parameter in detail, helping you understand when and how to use them effectively in your projects.
You can customize each parameter using the Python API, the command line interface (CLI), or a configuration file. Below are examples of how to set up data augmentation in each method.
!!! example "Configuration Examples"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO("yolo11n.pt")

        # Training with custom augmentation parameters
        model.train(data="coco.yaml", epochs=100, hsv_h=0.03, hsv_s=0.6, hsv_v=0.5)

        # Training without any augmentations (disabled values omitted for clarity)
        model.train(
            data="coco.yaml",
            epochs=100,
            hsv_h=0.0,
            hsv_s=0.0,
            hsv_v=0.0,
            translate=0.0,
            scale=0.0,
            fliplr=0.0,
            mosaic=0.0,
            erasing=0.0,
            auto_augment=None,
        )
        ```

    === "CLI"

        ```bash
        # Training with custom augmentation parameters
        yolo detect train data=coco8.yaml model=yolo11n.pt epochs=100 hsv_h=0.03 hsv_s=0.6 hsv_v=0.5
        ```
You can define all training parameters, including augmentations, in a YAML configuration file (e.g., `train_custom.yaml`). The `mode` parameter is only required when using the CLI. This new YAML file will then override the default one located in the `ultralytics` package.
```yaml
# train_custom.yaml
# 'mode' is required only for CLI usage
mode: train
data: coco8.yaml
model: yolo11n.pt
epochs: 100
hsv_h: 0.03
hsv_s: 0.6
hsv_v: 0.5
```
Then launch the training with either the Python API or the CLI:
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a COCO-pretrained YOLO11n model
model = YOLO("yolo11n.pt")
# Train the model with custom configuration
model.train(cfg="train_custom.yaml")
```
=== "CLI"
```bash
# Train the model with custom configuration
yolo detect train model="yolo11n.pt" cfg=train_custom.yaml
```
### Hue Adjustment (`hsv_h`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ hsv_h }}`

The `hsv_h` hyperparameter defines the shift magnitude, with the final adjustment randomly chosen between `-hsv_h` and `hsv_h`. For example, with `hsv_h=0.3`, the shift is randomly selected within `-0.3` to `0.3`. For values above `0.5`, the hue shift wraps around the color wheel, which is why the augmentations look the same for `0.5` and `-0.5`.

| `-0.5` | `-0.25` | `0.0` | `0.25` | `0.5` |
| --- | --- | --- | --- | --- |
| ![]() | ![]() | ![]() | ![]() | ![]() |
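To build intuition for what a given `hsv_h` value does, the sketch below applies a comparable hue shift with OpenCV. It illustrates the idea only and is not the exact Ultralytics implementation; the image path and the shift range are placeholders.

```python
import cv2
import numpy as np


def hue_shift(image_bgr: np.ndarray, shift: float) -> np.ndarray:
    """Apply a fractional hue shift, wrapping around the color wheel."""
    # OpenCV stores hue in [0, 180) for uint8 images, so scale the shift by 180.
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    h, s, v = cv2.split(hsv)
    h = ((h.astype(np.int32) + int(shift * 180)) % 180).astype(np.uint8)
    return cv2.cvtColor(cv2.merge((h, s, v)), cv2.COLOR_HSV2BGR)


img = cv2.imread("example.jpg")  # placeholder path
augmented = hue_shift(img, shift=np.random.uniform(-0.3, 0.3))  # mimics hsv_h=0.3
```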
### Saturation Adjustment (`hsv_s`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ hsv_s }}`

The `hsv_s` hyperparameter defines the shift magnitude, with the final adjustment randomly chosen between `-hsv_s` and `hsv_s`. For example, with `hsv_s=0.7`, the intensity is randomly selected within `-0.7` to `0.7`.

| `-1.0` | `-0.5` | `0.0` | `0.5` | `1.0` |
| --- | --- | --- | --- | --- |
| ![]() | ![]() | ![]() | ![]() | ![]() |
### Brightness Adjustment (`hsv_v`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ hsv_v }}`

The `hsv_v` hyperparameter defines the shift magnitude, with the final adjustment randomly chosen between `-hsv_v` and `hsv_v`. For example, with `hsv_v=0.4`, the intensity is randomly selected within `-0.4` to `0.4`.

| `-1.0` | `-0.5` | `0.0` | `0.5` | `1.0` |
| --- | --- | --- | --- | --- |
| ![]() | ![]() | ![]() | ![]() | ![]() |
### Rotation (`degrees`)

- **Range**: `0.0` to `180`
- **Default**: `{{ degrees }}`

The `degrees` hyperparameter defines the rotation angle, with the final adjustment randomly chosen between `-degrees` and `degrees`. For example, with `degrees=10.0`, the rotation is randomly selected within `-10.0` to `10.0`.

| `-180` | `-90` | `0.0` | `90` | `180` |
| --- | --- | --- | --- | --- |
| ![]() | ![]() | ![]() | ![]() | ![]() |
### Translation (`translate`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ translate }}`

The `translate` hyperparameter defines the shift magnitude, with the final adjustment randomly chosen twice (once for each axis) within the range `-translate` to `translate`. For example, with `translate=0.5`, the translation is randomly selected within `-0.5` to `0.5` on the x-axis, and another independent random value is selected within the same range on the y-axis.

The example images below show translations applied on both the `x` and `y` axes. Values `-1.0` and `1.0` are not shown as they would translate the image completely out of the frame.

| `-0.5` | `-0.25` | `0.0` | `0.25` | `0.5` |
| --- | --- | --- | --- | --- |
| ![]() | ![]() | ![]() | ![]() | ![]() |
### Scale (`scale`)

- **Range**: `≥ 0.0`
- **Default**: `{{ scale }}`

The `scale` hyperparameter defines the scaling factor, with the final adjustment randomly chosen between `1-scale` and `1+scale`. For example, with `scale=0.5`, the scaling is randomly selected within `0.5` to `1.5`.

The value `-1.0` is not shown as it would make the image disappear, while `1.0` simply results in a 2x zoom. The values in the table below represent the value applied through the hyperparameter `scale`, not the final scale factor.

If `scale` is greater than `1.0`, the image can be either very small or flipped, as the scaling factor is randomly chosen between `1-scale` and `1+scale`. For example, with `scale=3.0`, the scaling is randomly selected within `-2.0` to `4.0`. If a negative value is chosen, the image is flipped.

| `-0.5` | `-0.25` | `0.0` | `0.25` | `0.5` |
| --- | --- | --- | --- | --- |
| ![]() | ![]() | ![]() | ![]() | ![]() |
### Shear (`shear`)

- **Range**: `-180` to `+180`
- **Default**: `{{ shear }}`

The `shear` hyperparameter defines the shear angle, with the final adjustment randomly chosen between `-shear` and `shear`. For example, with `shear=10.0`, the shear is randomly selected within `-10` to `10` on the x-axis, and another independent random value is selected within the same range on the y-axis.

High `shear` values can rapidly distort the image, so it's recommended to start with small values and gradually increase them.

| `-10` | `-5` | `0.0` | `5` | `10` |
| --- | --- | --- | --- | --- |
| ![]() | ![]() | ![]() | ![]() | ![]() |
### Perspective (`perspective`)

- **Range**: `0.0` - `0.001`
- **Default**: `{{ perspective }}`

The `perspective` hyperparameter defines the perspective magnitude, with the final adjustment randomly chosen between `-perspective` and `perspective`. For example, with `perspective=0.001`, the perspective is randomly selected within `-0.001` to `0.001` on the x-axis, and another independent random value is selected within the same range on the y-axis.

| `-0.001` | `-0.0005` | `0.0` | `0.0005` | `0.001` |
| --- | --- | --- | --- | --- |
| ![]() | ![]() | ![]() | ![]() | ![]() |
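The geometric parameters above can all be combined in a single training call. The sketch below shows one way to enable them together; the specific values are illustrative starting points rather than recommended defaults.

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# Sketch: enable several geometric augmentations at once.
# The values below are illustrative, not tuned recommendations.
model.train(
    data="coco8.yaml",
    epochs=100,
    degrees=10.0,  # random rotation in [-10, 10] degrees
    translate=0.1,  # random shift of up to 10% per axis
    scale=0.5,  # random scale factor in [0.5, 1.5]
    shear=2.0,  # random shear in [-2, 2] degrees per axis
    perspective=0.0005,  # random perspective distortion
)
```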
### Flip Up-Down (`flipud`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ flipud }}`

The `flipud` hyperparameter defines the probability of applying the transformation, with a value of `flipud=1.0` ensuring that all images are flipped and a value of `flipud=0.0` disabling the transformation entirely. For example, with `flipud=0.5`, each image has a 50% chance of being flipped upside-down.

| flipud off | flipud on |
| --- | --- |
| ![]() | ![]() |
### Flip Left-Right (`fliplr`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ fliplr }}`

The `fliplr` hyperparameter defines the probability of applying the transformation, with a value of `fliplr=1.0` ensuring that all images are flipped and a value of `fliplr=0.0` disabling the transformation entirely. For example, with `fliplr=0.5`, each image has a 50% chance of being flipped left to right.

| fliplr off | fliplr on |
| --- | --- |
| ![]() | ![]() |
### BGR Channel Swap (`bgr`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ bgr }}`

The `bgr` hyperparameter defines the probability of applying the transformation, with `bgr=1.0` ensuring all images undergo the channel swap and `bgr=0.0` disabling it. For example, with `bgr=0.5`, each image has a 50% chance of being converted from RGB to BGR.

| bgr off | bgr on |
| --- | --- |
| ![]() | ![]() |
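The channel swap itself is simple: it reverses the order of the color channels. A minimal NumPy sketch of the concept (not the Ultralytics code path) could look like this:

```python
import numpy as np

# Illustration: swapping RGB <-> BGR just reverses the last (channel) axis.
rgb = np.random.randint(0, 256, size=(640, 640, 3), dtype=np.uint8)  # dummy image
bgr = rgb[..., ::-1]  # channel-swapped view of the same pixels
```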
### Mosaic (`mosaic`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ mosaic }}`

The `mosaic` hyperparameter defines the probability of applying the transformation, with `mosaic=1.0` ensuring that all images are combined and `mosaic=0.0` disabling the transformation. For example, with `mosaic=0.5`, each image has a 50% chance of being combined with three other images. While the `mosaic` augmentation makes the model more robust, it can also make the training process more challenging.

The `mosaic` augmentation can be disabled near the end of training by setting `close_mosaic` to the number of epochs before completion when it should be turned off. For example, if `epochs` is set to `200` and `close_mosaic` is set to `20`, the `mosaic` augmentation will be disabled after `180` epochs. If `close_mosaic` is set to `0`, the `mosaic` augmentation will be enabled for the entire training process.

The `mosaic` augmentation combines 4 images picked randomly from the dataset. If the dataset is small, the same image may be used multiple times in the same mosaic.

| mosaic off | mosaic on |
| --- | --- |
| ![]() | ![]() |
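As a brief example of the `close_mosaic` behavior described above, the call below keeps mosaic active for most of training and turns it off for the final 20 epochs; the dataset and epoch count are placeholders.

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# Mosaic stays enabled for the first 180 epochs, then is disabled
# for the last 20 epochs (epochs - close_mosaic = 200 - 20).
model.train(data="coco8.yaml", epochs=200, mosaic=1.0, close_mosaic=20)
```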
### Mixup (`mixup`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ mixup }}`

The `mixup` hyperparameter defines the probability of applying the transformation, with `mixup=1.0` ensuring that all images are mixed and `mixup=0.0` disabling the transformation. For example, with `mixup=0.5`, each image has a 50% chance of being mixed with another image. The `mixup` ratio is a random value picked from a `np.random.beta(32.0, 32.0)` beta distribution, meaning each image contributes approximately 50%, with slight variations.

| First image, mixup off | Second image, mixup off | mixup on |
| --- | --- | --- |
| ![]() | ![]() | ![]() |
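To make the blending concrete, here is a minimal sketch of the mixing step under the beta distribution mentioned above. It is a conceptual illustration only (label handling and the exact Ultralytics pipeline are not shown), and the two input arrays are dummy images.

```python
import numpy as np

# Dummy images standing in for two training samples of the same size.
img_a = np.random.randint(0, 256, size=(640, 640, 3), dtype=np.uint8)
img_b = np.random.randint(0, 256, size=(640, 640, 3), dtype=np.uint8)

# Mixing ratio drawn from Beta(32, 32): tightly centered around 0.5,
# so each image contributes roughly half of the result.
ratio = np.random.beta(32.0, 32.0)
mixed = (img_a * ratio + img_b * (1 - ratio)).astype(np.uint8)
```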
### CutMix (`cutmix`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ cutmix }}`

The `cutmix` hyperparameter defines the probability of applying the transformation, with `cutmix=1.0` ensuring that all images undergo this transformation and `cutmix=0.0` disabling it completely. For example, with `cutmix=0.5`, each image has a 50% chance of having a region replaced with a patch from another image. Unlike `mixup`, which blends entire images, `cutmix` maintains the original pixel intensities within the cut regions, preserving local features.

Only objects that keep at least `0.1` (10%) of their original area within the pasted region are preserved. This minimum area threshold cannot be changed with the current implementation and is set to `0.1` by default.

| First image, cutmix off | Second image, cutmix off | cutmix on |
| --- | --- | --- |
| ![]() | ![]() | ![]() |
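A minimal sketch of the cut-and-paste idea (region selection only, without the label bookkeeping that Ultralytics performs) might look like the following; the sizes and the random region are illustrative.

```python
import numpy as np

img_a = np.random.randint(0, 256, size=(640, 640, 3), dtype=np.uint8)
img_b = np.random.randint(0, 256, size=(640, 640, 3), dtype=np.uint8)

# Pick a random rectangular region and replace it with the patch from img_b.
h, w = img_a.shape[:2]
cut_h, cut_w = np.random.randint(h // 4, h // 2), np.random.randint(w // 4, w // 2)
y0 = np.random.randint(0, h - cut_h)
x0 = np.random.randint(0, w - cut_w)

cutmixed = img_a.copy()
cutmixed[y0 : y0 + cut_h, x0 : x0 + cut_w] = img_b[y0 : y0 + cut_h, x0 : x0 + cut_w]
```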
### Copy-Paste (`copy_paste`)

- **Range**: `0.0` - `1.0`
- **Default**: `{{ copy_paste }}`

Objects are copied according to the method set by `copy_paste_mode`. The `copy_paste` hyperparameter defines the probability of applying the transformation, with `copy_paste=1.0` ensuring that all images are copied and `copy_paste=0.0` disabling the transformation. For example, with `copy_paste=0.5`, each image has a 50% chance of having objects copied from another image. The `copy_paste` augmentation can be used to copy objects from one image to another.

For each object candidate selected according to `copy_paste_mode`, its Intersection over Area (IoA) is computed with all the objects of the source image. If all the IoA values are below `0.3` (30%), the object is pasted in the target image. If even one IoA is above `0.3`, the object is not pasted in the target image. The IoA threshold cannot be changed with the current implementation and is set to `0.3` by default.

| copy_paste off | copy_paste on with copy_paste_mode=flip | Visualize the copy_paste process |
| --- | --- | --- |
| ![]() | ![]() | ![]() |
### Copy-Paste Mode (`copy_paste_mode`)

- **Options**: `'flip'`, `'mixup'`
- **Default**: `'{{ copy_paste_mode }}'`

With `'flip'`, the objects come from the same image, while `'mixup'` allows objects to be copied from different images. The IoA computation is the same for both values of `copy_paste_mode`, but the way the objects are copied is different.

| Reference image | Chosen image for copy_paste | copy_paste on with copy_paste_mode=mixup |
| --- | --- | --- |
| ![]() | ![]() | ![]() |
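As a hedged example of enabling these two parameters together: the call below assumes a dataset with segmentation labels (the `coco8-seg` dataset and the segmentation checkpoint are placeholders), since copy-paste relies on masks to extract the objects it copies.

```python
from ultralytics import YOLO

# Placeholder segmentation model and dataset; copy-paste uses masks
# to extract the objects that get pasted into other images.
model = YOLO("yolo11n-seg.pt")
model.train(
    data="coco8-seg.yaml",
    epochs=100,
    copy_paste=0.5,  # 50% chance per image of copying objects
    copy_paste_mode="mixup",  # copy objects from a different image
)
```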
### Auto Augment (`auto_augment`)

- **Options**: `'randaugment'`, `'autoaugment'`, `'augmix'`, `None`
- **Default**: `'{{ auto_augment }}'`

The `'randaugment'` option uses RandAugment, `'autoaugment'` uses AutoAugment, and `'augmix'` uses AugMix. Setting it to `None` disables automated augmentation.
### Random Erasing (`erasing`)

- **Range**: `0.0` - `0.9`
- **Default**: `{{ erasing }}`

The `erasing` hyperparameter defines the probability of applying the transformation, with `erasing=0.9` ensuring that almost all images have a region erased and `erasing=0.0` disabling the transformation. For example, with `erasing=0.5`, each image has a 50% chance of having a portion erased.

The `erasing` augmentation comes with `scale`, `ratio`, and `value` hyperparameters that cannot be changed with the current implementation. Their default values are `(0.02, 0.33)`, `(0.3, 3.3)`, and `0`, respectively, as stated in the PyTorch documentation. The maximum value of the `erasing` hyperparameter is set to `0.9` to avoid applying the transformation to all images.

| erasing off | erasing on (example 1) | erasing on (example 2) | erasing on (example 3) |
| --- | --- | --- | --- |
| ![]() | ![]() | ![]() | ![]() |
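The fixed `scale`, `ratio`, and `value` defaults mentioned above come from PyTorch's `RandomErasing` transform, so a stand-alone sketch of the equivalent operation (outside the Ultralytics pipeline) could look like this:

```python
import torch
from torchvision import transforms

# Stand-alone transform using the defaults cited above.
erase = transforms.RandomErasing(p=0.5, scale=(0.02, 0.33), ratio=(0.3, 3.3), value=0)

image = torch.rand(3, 224, 224)  # dummy tensor image (C, H, W)
erased = erase(image)
```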
Choosing the right augmentations depends on your specific use case and dataset. Here are a few general guidelines to help you decide:

- Color-space adjustments such as `hsv_h`, `hsv_s`, and `hsv_v` are a solid starting point.
- If the camera position is fixed in your deployment, you may not need strong geometric augmentations such as `rotation`, `translation`, `scale`, `shear`, or `perspective`. However, if the camera angle may vary and you need the model to be more robust, it's better to keep these augmentations.
- Use the `mosaic` augmentation only if having partially occluded objects or multiple objects per image is acceptable and does not change the label value. Alternatively, you can keep `mosaic` active but increase the `close_mosaic` value to disable it earlier in the training process.

In short: keep it simple. Start with a small set of augmentations and gradually add more as needed. The goal is to improve the model's generalization and robustness, not to overcomplicate the training process. Also, make sure the augmentations you apply reflect the same data distribution your model will encounter in production.
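As an illustration of that "start simple" advice, the call below sketches a conservative configuration that keeps the color-space adjustments and horizontal flips while disabling the heavier composition-based augmentations; the exact values are placeholders to tune for your own dataset.

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# Sketch of a conservative starting point: keep light color and flip
# augmentations, disable the heavier composition-based ones.
model.train(
    data="coco8.yaml",
    epochs=100,
    hsv_h=0.015,
    hsv_s=0.7,
    hsv_v=0.4,
    fliplr=0.5,
    mosaic=0.0,
    mixup=0.0,
    erasing=0.0,
)
```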
### My training output includes an `albumentations: Blur[...]` reference. Does that mean Ultralytics YOLO runs additional augmentations like blurring?

If the `albumentations` package is installed, Ultralytics automatically applies a set of extra image augmentations using it. These augmentations are handled internally and require no additional configuration.

You can find the full list of applied transformations in our technical documentation, as well as in our Albumentations integration guide. Note that only the augmentations with a probability `p` greater than `0` are active. These are purposefully applied at low frequencies to mimic real-world visual artifacts, such as blur or grayscale effects.
Check if the `albumentations` package is installed. If not, you can install it by running `pip install albumentations`. Once installed, the package should be automatically detected and used by Ultralytics.
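A quick way to check from Python whether the package is available (a trivial sketch, not an Ultralytics API):

```python
try:
    import albumentations

    print(f"albumentations {albumentations.__version__} is installed")
except ImportError:
    print("albumentations is not installed; run: pip install albumentations")
```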
You can customize augmentations by creating a custom dataset class and trainer. For example, you can replace the default Ultralytics classification augmentations with PyTorch's `torchvision.transforms.Resize` or other transforms. See the custom training example in the classification documentation for implementation details.
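For instance, a replacement transform pipeline built purely from torchvision could look like the sketch below. Wiring it into a custom Ultralytics dataset and trainer follows the classification documentation referenced above; only the transform composition itself is shown here, and the image size is a placeholder.

```python
from torchvision import transforms

# Sketch: a custom transform pipeline that could stand in for the default
# classification augmentations (the 224x224 size is a placeholder).
custom_transforms = transforms.Compose(
    [
        transforms.Resize((224, 224)),
        transforms.RandomHorizontalFlip(p=0.5),
        transforms.ToTensor(),
    ]
)
```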