2025-10-18 22:03:55 +08:00

YOLOv11 re-implementation using PyTorch

Fix

  • Fixed the label tensor of size [0,1], which has two dimensions and was not adjusted to the expected size [1,]
    • This occurs when a picture contains no objects (that is, when its label is empty)
    • Search the code for XXX to find the change
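
The shape problem described above can be illustrated with a minimal sketch. It is not the repository's actual fix: the helper name `flatten_class_labels` is hypothetical, and the sketch assumes the class labels of one image are stored as a tensor with one row per object.

```python
import torch


def flatten_class_labels(cls: torch.Tensor) -> torch.Tensor:
    """Collapse a label tensor to one dimension (hypothetical helper).

    reshape(-1) merges all dimensions, so an empty [0, 1] tensor
    becomes an empty 1-D tensor of shape [0].
    """
    return cls.reshape(-1)


# Labels of an image without objects: zero rows, one column.
empty = torch.zeros((0, 1))
print(empty.ndim, tuple(empty.shape))  # 2 (0, 1)

# After flattening, the tensor is one-dimensional and still empty.
flat = flatten_class_labels(empty)
print(flat.ndim, tuple(flat.shape))  # 1 (0,)
```

Code that expects one-dimensional labels can then handle images with and without objects in the same way.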

Installation

conda create -n YOLO python=3.10.10
conda activate YOLO
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install opencv-python
pip install PyYAML
pip install tqdm

Train

  • Configure your dataset path in main.py for training
  • Run bash main.sh $ --train for training, where $ is the number of GPUs
  • Run nohup bash main.sh 1 --train --epochs 300 > train.log 2>&1 & to train in the background

Test

  • Configure your dataset path in main.py for testing
  • Run python main.py --test for testing

Results

Version   Epochs   Box mAP   Download
v11_n     600      38.6      Model
v11_n*    -        39.2      Model
v11_s*    -        46.5      Model
v11_m*    -        51.2      Model
v11_l*    -        53.0      Model
v11_x*    -        54.3      Model
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.386
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.551
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.415
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.196
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.420
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.569
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.321
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.533
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.588
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.361
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.646
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.777
  • * marks results taken from the original repository (see Reference)
  • The official YOLOv11 code uses mask annotation information, which leads to higher performance

Dataset structure

├── COCO 
    ├── images
        ├── train2017
            ├── 1111.jpg
            ├── 2222.jpg
        ├── val2017
            ├── 1111.jpg
            ├── 2222.jpg
    ├── labels
        ├── train2017
            ├── 1111.txt
            ├── 2222.txt
        ├── val2017
            ├── 1111.txt
            ├── 2222.txt
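
The layout above pairs each image with a label file of the same name under labels. A minimal sketch of that mapping follows, assuming .jpg images and .txt labels as shown in the tree. The helper names `label_path_for` and `images_without_labels` are hypothetical and are not part of this repository.

```python
from pathlib import Path


def label_path_for(image_path: Path) -> Path:
    """Map .../images/<split>/<name>.jpg to .../labels/<split>/<name>.txt."""
    parts = list(image_path.parts)
    # Replace the last "images" component; raises ValueError if there is none.
    idx = len(parts) - 1 - parts[::-1].index("images")
    parts[idx] = "labels"
    return Path(*parts).with_suffix(".txt")


def images_without_labels(root: Path, split: str) -> list[Path]:
    """List images in one split that have no matching label file."""
    image_dir = root / "images" / split
    return sorted(p for p in image_dir.glob("*.jpg")
                  if not label_path_for(p).is_file())


print(label_path_for(Path("COCO/images/train2017/1111.jpg")))
```

Calling images_without_labels(Path("COCO"), "train2017") before training would list the images that have no label file at all, which is a different case from an image whose label file exists but is empty.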

Reference
