|
@@ -1,26 +1,35 @@
## Computer Vision Models - Pretrained Checkpoints
+You can load any of our pretrained models in two lines of code:
|
|
|
|
+```python
|
|
|
|
+from super_gradients.training import models
|
|
|
|
+model = models.get("yolox_s", pretrained_weights="coco")
|
|
|
|
+```
|
|
|
|
+
|
|
|
|
+The name to pass for each available model is listed in the `Model name` column of the tables below.
|
|
|
|
+
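As a rough illustration of how the `Model name` column is meant to be used, the sketch below maps a few rows of the tables to the arguments of `models.get`. The helper names (`MODEL_NAMES`, `get_args`, `load_pretrained`) are hypothetical, and the rule that the weights key is the lowercased dataset name is an assumption extrapolated from the `"coco"` example above; only that `"coco"` case is confirmed by this document.

```python
# Model-name strings copied from the `Model name` column of the tables.
MODEL_NAMES = {
    "ResNet 50": ("resnet50", "ImageNet"),
    "YOLOX small": ("yolox_s", "COCO"),
    "DDRNet 23": ("ddrnet_23", "Cityscapes"),
}


def get_args(display_name):
    """Return (model_name, pretrained_weights) for one table row."""
    model_name, dataset = MODEL_NAMES[display_name]
    # Assumption: the weights key is the lowercased dataset name, as in
    # the "coco" example; verify this for datasets other than COCO.
    return model_name, dataset.lower()


def load_pretrained(display_name):
    """Load a checkpoint; requires super_gradients and network access."""
    from super_gradients.training import models  # imported lazily

    model_name, weights = get_args(display_name)
    return models.get(model_name, pretrained_weights=weights)
```

With `super_gradients` installed, `load_pretrained("YOLOX small")` would make the same call as the two-line snippet above.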
|
|
|
|
+
|
|
### Pretrained Classification PyTorch Checkpoints
-| Model | Dataset | Resolution | Top-1 | Top-5 | Latency (HW)*<sub>T4</sub> | Latency (Production)**<sub>T4</sub> |Latency (HW)*<sub>Jetson Xavier NX</sub> | Latency (Production)**<sub>Jetson Xavier NX</sub> | Latency <sub>Cascade Lake</sub> |
|
|
|
|
-|------------ | ------ | ---------- |----------- | ----------- | ----------- |---------- |----------- | ----------- | :------: |
|
|
|
|
-| ViT base | ImageNet21K | 224x224 | 84.15 | - |**4.46ms** |**4.60ms** | **-** * |**-**|**57.22ms** |
|
|
|
|
-| ViT large | ImageNet21K | 224x224 | 85.64 | - |**12.81ms** |**13.19ms** | **-** * |**-**|**187.22ms** |
|
|
|
|
-| BEiT | ImageNet21K | 224x224 | - | - |**-ms** |**-ms** | **-** * |**-**|**-ms** |
|
|
|
|
-| EfficientNet B0 | ImageNet | 224x224 | 77.62 | 93.49 |**0.93ms** |**1.38ms** | **-** * |**-**|**3.44ms** |
|
|
|
|
-| RegNet Y200 | ImageNet |224x224 | 70.88 | 89.35 |**0.63ms** | **1.08ms** | **2.16ms** |**2.47ms**|**2.06ms** |
|
|
|
|
-| RegNet Y400 | ImageNet |224x224 | 74.74 | 91.46 |**0.80ms** | **1.25ms** |**2.62ms** |**2.91ms** |**2.87ms** |
|
|
|
|
-| RegNet Y600 | ImageNet |224x224 | 76.18 | 92.34 |**0.77ms** | **1.22ms** |**2.64ms** |**2.93ms** |**2.39ms** |
|
|
|
|
-| RegNet Y800 | ImageNet |224x224 | 77.07 | 93.26 |**0.74ms** | **1.19ms** |**2.77ms** |**3.04ms** |**2.81ms** |
|
|
|
|
-| ResNet 18 | ImageNet |224x224 | 70.6 | 89.64 |**0.52ms** | **0.95ms** |**2.01ms**|**2.30ms** |**4.56ms** |
|
|
|
|
-| ResNet 34 | ImageNet |224x224 | 74.13 | 91.7 |**0.92ms** |**1.34ms** |**3.57ms**|**3.87ms** | **7.64ms** |
|
|
|
|
-| ResNet 50 | ImageNet |224x224 | 81.91 | 93.0 |**1.03ms** | **1.44ms** | **4.78ms**|**5.10ms** |**9.25ms** |
|
|
|
|
-| MobileNet V3_large-150 epochs | ImageNet |224x224 | 73.79 | 91.54 |**0.67ms** | **1.11ms** |**2.42ms** |**2.71ms** |**1.76ms** |
|
|
|
|
-| MobileNet V3_large-300 epochs | ImageNet |224x224 | 74.52 | 91.92 |**0.67ms** | **1.11ms** |**2.42ms** |**2.71ms** |**1.76ms** |
|
|
|
|
-| MobileNet V3_small | ImageNet |224x224 |67.45 | 87.47 |**0.55ms** | **0.96ms** |**2.01ms** *|**2.35ms** |**1.06ms** |
|
|
|
|
-| MobileNet V2_w1 | ImageNet |224x224 | 73.08 | 91.1 |**0.46 ms**| **0.89ms** |**1.65ms** *|**1.90ms** | **1.56ms** |
+| Model | Model name | Dataset | Resolution | Top-1 | Top-5 | Latency (HW)*<sub>T4</sub> | Latency (Production)**<sub>T4</sub> | Latency (HW)*<sub>Jetson Xavier NX</sub> | Latency (Production)**<sub>Jetson Xavier NX</sub> | Latency <sub>Cascade Lake</sub> |
|
|
|
|
+|-------------------------------|--------------------|-------------|------------|---------|---------|----------------------------|-------------------------------------|------------------------------------------|---------------------------------------------------|:-------------------------------:|
|
|
|
|
+| ViT base | vit_base | ImageNet21K | 224x224 | 84.15 | - | **4.46ms** | **4.60ms** | **-** * | **-** | **57.22ms** |
|
|
|
|
+| ViT large | vit_large | ImageNet21K | 224x224 | 85.64 | - | **12.81ms** | **13.19ms** | **-** * | **-** | **187.22ms** |
|
|
|
|
+| BEiT | < NO CHECKPOINT ?> | ImageNet21K | 224x224 | - | - | **-ms** | **-ms** | **-** * | **-** | **-ms** |
|
|
|
|
+| EfficientNet B0 | efficientnet_b0 | ImageNet | 224x224 | 77.62 | 93.49 | **0.93ms** | **1.38ms** | **-** * | **-** | **3.44ms** |
|
|
|
|
+| RegNet Y200 | regnetY200 | ImageNet | 224x224 | 70.88 | 89.35 | **0.63ms** | **1.08ms** | **2.16ms** | **2.47ms** | **2.06ms** |
|
|
|
|
+| RegNet Y400 | regnetY400 | ImageNet | 224x224 | 74.74 | 91.46 | **0.80ms** | **1.25ms** | **2.62ms** | **2.91ms** | **2.87ms** |
|
|
|
|
+| RegNet Y600 | regnetY600 | ImageNet | 224x224 | 76.18 | 92.34 | **0.77ms** | **1.22ms** | **2.64ms** | **2.93ms** | **2.39ms** |
|
|
|
|
+| RegNet Y800 | regnetY800 | ImageNet | 224x224 | 77.07 | 93.26 | **0.74ms** | **1.19ms** | **2.77ms** | **3.04ms** | **2.81ms** |
|
|
|
|
+| ResNet 18 | resnet18 | ImageNet | 224x224 | 70.6 | 89.64 | **0.52ms** | **0.95ms** | **2.01ms** | **2.30ms** | **4.56ms** |
|
|
|
|
+| ResNet 34 | resnet34 | ImageNet | 224x224 | 74.13 | 91.7 | **0.92ms** | **1.34ms** | **3.57ms** | **3.87ms** | **7.64ms** |
|
|
|
|
+| ResNet 50 | resnet50 | ImageNet | 224x224 | 81.91 | 93.0 | **1.03ms** | **1.44ms** | **4.78ms** | **5.10ms** | **9.25ms** |
|
|
|
|
+| MobileNet V3_large-150 epochs | < WHY KEEP THIS?> | ImageNet | 224x224 | 73.79 | 91.54 | **0.67ms** | **1.11ms** | **2.42ms** | **2.71ms** | **1.76ms** |
|
|
|
|
+| MobileNet V3_large-300 epochs | mobilenet_v3_large | ImageNet | 224x224 | 74.52 | 91.92 | **0.67ms** | **1.11ms** | **2.42ms** | **2.71ms** | **1.76ms** |
|
|
|
|
+| MobileNet V3_small | mobilenet_v3_small | ImageNet | 224x224 | 67.45 | 87.47 | **0.55ms** | **0.96ms** | **2.01ms** * | **2.35ms** | **1.06ms** |
|
|
|
|
+| MobileNet V2_w1 | mobilenet_v2 ? | ImageNet | 224x224 | 73.08 | 91.1 | **0.46ms** | **0.89ms** | **1.65ms** * | **1.90ms** | **1.56ms** |
|
|
> **NOTE:** <br/>
|
|
> - Latency (HW)* - Hardware performance (not including IO)<br/>
|
|
> - Latency (Production)** - Production Performance (including IO)
|
|
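Because the HW latency excludes IO and the production latency includes it, the difference between the two columns gives a rough estimate of the IO overhead per inference. The sketch below computes it for three rows of the classification table (T4 columns); the names `T4_LATENCY_MS` and `io_overhead_ms` are hypothetical and introduced only for illustration.

```python
# (HW latency, production latency) in ms on T4, copied from the table.
T4_LATENCY_MS = {
    "resnet18": (0.52, 0.95),
    "resnet34": (0.92, 1.34),
    "resnet50": (1.03, 1.44),
}


def io_overhead_ms(hw_ms, production_ms):
    """Estimate IO overhead as production latency minus HW latency."""
    # Round to two decimals to match the precision of the table.
    return round(production_ms - hw_ms, 2)


for name, (hw, prod) in T4_LATENCY_MS.items():
    print(f"{name}: {io_overhead_ms(hw, prod):.2f} ms IO overhead")
```

For these rows the overhead stays close to 0.4 ms regardless of model size, so it makes up a larger share of the total for the fastest models.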
@@ -32,16 +41,15 @@
|
|
### Pretrained Object Detection PyTorch Checkpoints
-| Model | Dataset | Resolution | mAP<sup>val<br>0.5:0.95 | Latency (HW)*<sub>T4</sub> | Latency (Production)**<sub>T4</sub> |Latency (HW)*<sub>Jetson Xavier NX</sub> | Latency (Production)**<sub>Jetson Xavier NX</sub> | Latency <sub>Cascade Lake</sub> |
|
|
|
|
-|------------- |------ | ---------- |------ | -------- |------ | ---------- |------ | :------: |
|
|
|
|
-| SSD lite MobileNet v2 | COCO |320x320 |21.5 |**0.77ms** |**1.40ms**|**5.28ms** |**6.44ms** |**4.13ms**|
|
|
|
|
-| SSD lite MobileNet v1 | COCO |320x320 |24.3 |**1.55ms** |**2.84ms**|**8.07ms** |**9.14ms** |**22.76ms**|
|
|
|
|
-| YOLOX nano | COCO |640x640 |26.77|**2.47ms** |**4.09ms**|**11.49ms** |**12.97ms** |**-**|
|
|
|
|
-| YOLOX tiny | COCO |640x640 |37.18|**3.16ms** |**4.61ms**|**15.23ms** |**19.24ms** |**-**|
|
|
|
|
-| YOLOX small | COCO |640x640 |40.47 |**3.58ms** |**4.94ms**|**18.88ms** |**22.48ms** |**-**|
|
|
|
|
-| YOLOX medium| COCO |640x640 |46.4 |**6.40ms** |**7.65ms**|**39.22ms** |**44.5ms** |**-**|
|
|
|
|
-| YOLOX large | COCO |640x640 |49.25 |**10.07ms** |**11.12ms**|**68.73ms** |**77.01ms** |**-**|
|
|
|
|
-
+| Model | Model name | Dataset | Resolution | mAP<sup>val</sup><br>0.5:0.95 | Latency (HW)*<sub>T4</sub> | Latency (Production)**<sub>T4</sub> | Latency (HW)*<sub>Jetson Xavier NX</sub> | Latency (Production)**<sub>Jetson Xavier NX</sub> | Latency <sub>Cascade Lake</sub> |
|
|
|
|
+|-----------------------|-------------------------|---------|------------|-------------------------|----------------------------|-------------------------------------|------------------------------------------|---------------------------------------------------|:-------------------------------:|
|
|
|
|
+| SSD lite MobileNet v2 | ssd_lite_mobilenet_v2 | COCO | 320x320 | 21.5 | **0.77ms** | **1.40ms** | **5.28ms** | **6.44ms** | **4.13ms** |
|
|
|
|
+| SSD lite MobileNet v1 | ssd_mobilenet_v1 | COCO | 320x320 | 24.3 | **1.55ms** | **2.84ms** | **8.07ms** | **9.14ms** | **22.76ms** |
|
|
|
|
+| YOLOX nano | yolox_n | COCO | 640x640 | 26.77 | **2.47ms** | **4.09ms** | **11.49ms** | **12.97ms** | **-** |
|
|
|
|
+| YOLOX tiny | yolox_t | COCO | 640x640 | 37.18 | **3.16ms** | **4.61ms** | **15.23ms** | **19.24ms** | **-** |
|
|
|
|
+| YOLOX small | yolox_s | COCO | 640x640 | 40.47 | **3.58ms** | **4.94ms** | **18.88ms** | **22.48ms** | **-** |
|
|
|
|
+| YOLOX medium | yolox_m | COCO | 640x640 | 46.4 | **6.40ms** | **7.65ms** | **39.22ms** | **44.5ms** | **-** |
|
|
|
|
+| YOLOX large | yolox_l | COCO | 640x640 | 49.25 | **10.07ms** | **11.12ms** | **68.73ms** | **77.01ms** | **-** |
> **NOTE:** <br/>
|
|
> - Latency (HW)* - Hardware performance (not including IO)<br/>
@@ -51,20 +59,20 @@
### Pretrained Semantic Segmentation PyTorch Checkpoints
-| Model | Dataset | Resolution | mIoU | Latency b1<sub>T4</sub> | Latency b1<sub>T4</sub> including IO | Latency (Production)**<sub>Jetson Xavier NX</sub> |
|
|
|
|
-|--------------------- |------ | ---------- | ------ | -------- | ----------|:-------------------------------------------------:|
|
|
|
|
-| PP-LiteSeg B50 | Cityscapes |512x1024 |76.48 |**4.18ms** |**31.22ms**|**31.69ms**|
|
|
|
|
-| PP-LiteSeg B75 | Cityscapes |768x1536 |78.52 |**6.84ms** |**33.69ms**|**49.89ms** |
|
|
|
|
-| PP-LiteSeg T50 | Cityscapes |512x1024 |74.92 |**3.26ms** |**30.33ms**|**26.20ms** |
|
|
|
|
-| PP-LiteSeg T75 | Cityscapes |768x1536 |77.56 |**5.20ms** |**32.28ms**|**38.03ms** |
|
|
|
|
-| DDRNet 23 slim | Cityscapes |1024x2048 |78.01 |**5.74ms** |**32.01ms**| **45.18ms**|
|
|
|
|
-| DDRNet 23 | Cityscapes |1024x2048 |80.26 |**12.74ms** |**39.01ms**|**106.26ms** |
|
|
|
|
-| STDC 1-Seg50 | Cityscapes | 512x1024 |75.11 |**3.34ms** |**30.12ms**| **27.54ms**|
|
|
|
|
-| STDC 1-Seg75 | Cityscapes | 768x1536 |77.8 |**5.53ms** |**32.490ms**|**43.88**|
|
|
|
|
-| STDC 2-Seg50 | Cityscapes | 512x1024 |76.44 |**4.12ms** |**30.94ms**|**32.03ms** |
|
|
|
|
-| STDC 2-Seg75 | Cityscapes | 768x1536 |78.93 |**6.95ms** |**33.89ms**|**54.48ms**|
|
|
|
|
-| RegSeg (exp48) | Cityscapes | 1024x2048 |78.15 |**12.03ms** |**38.91ms**|**78.20ms**|
|
|
|
|
-| Larger RegSeg (exp53) | Cityscapes | 1024x2048 |79.2|**22.00ms** |**48.96ms**|**150.78ms**|
+| Model | Model name | Dataset | Resolution | mIoU | Latency b1<sub>T4</sub> | Latency b1<sub>T4</sub> including IO | Latency (Production)**<sub>Jetson Xavier NX</sub> |
|
|
|
|
+|-----------------------|-------------------|------------|------------|-------|-------------------------|--------------------------------------|:-------------------------------------------------:|
|
|
|
|
+| PP-LiteSeg B50 | pp_lite_b_seg50 | Cityscapes | 512x1024 | 76.48 | **4.18ms** | **31.22ms** | **31.69ms** |
|
|
|
|
+| PP-LiteSeg B75 | pp_lite_b_seg75 | Cityscapes | 768x1536 | 78.52 | **6.84ms** | **33.69ms** | **49.89ms** |
|
|
|
|
+| PP-LiteSeg T50 | pp_lite_t_seg50 | Cityscapes | 512x1024 | 74.92 | **3.26ms** | **30.33ms** | **26.20ms** |
|
|
|
|
+| PP-LiteSeg T75 | pp_lite_t_seg75 | Cityscapes | 768x1536 | 77.56 | **5.20ms** | **32.28ms** | **38.03ms** |
|
|
|
|
+| DDRNet 23 slim | ddrnet_23_slim | Cityscapes | 1024x2048 | 78.01 | **5.74ms** | **32.01ms** | **45.18ms** |
|
|
|
|
+| DDRNet 23 | ddrnet_23 | Cityscapes | 1024x2048 | 80.26 | **12.74ms** | **39.01ms** | **106.26ms** |
|
|
|
|
+| STDC 1-Seg50 | stdc1_seg50 | Cityscapes | 512x1024 | 75.11 | **3.34ms** | **30.12ms** | **27.54ms** |
|
|
|
|
+| STDC 1-Seg75 | stdc1_seg75 | Cityscapes | 768x1536 | 77.8 | **5.53ms** | **32.49ms** | **43.88ms** |
|
|
|
|
+| STDC 2-Seg50 | stdc2_seg50 | Cityscapes | 512x1024 | 76.44 | **4.12ms** | **30.94ms** | **32.03ms** |
|
|
|
|
+| STDC 2-Seg75 | stdc2_seg75 | Cityscapes | 768x1536 | 78.93 | **6.95ms** | **33.89ms** | **54.48ms** |
|
|
|
|
+| RegSeg (exp48) | regseg48 | Cityscapes | 1024x2048 | 78.15 | **12.03ms** | **38.91ms** | **78.20ms** |
|
|
|
|
+| Larger RegSeg (exp53) | <No Checkpoint ?> | Cityscapes | 1024x2048 | 79.2 | **22.00ms** | **48.96ms** | **150.78ms** |
> **NOTE:** Performance measured on T4 GPU with TensorRT, using FP16 precision and batch size 1 (latency), and not including IO