YOLOV3-DarkNet53 Example

Description

This is an example of training YOLOV3-DarkNet53 on the COCO2014 dataset with MindSpore.

Requirements

Install MindSpore.
Download the dataset COCO2014.

Unzip the COCO2014 dataset to any path you like. The folder should contain the training and evaluation data, laid out as follows:

.
└─dataset
    ├─train2014
    ├─val2014
    └─annotations
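
Before launching a long training run, it can help to confirm that the unzipped folder matches the layout above. The following is a minimal sketch rather than part of the example's code: the helper name `missing_subdirs` is hypothetical, and it only checks that the three folders exist, not what they contain.

```python
from pathlib import Path

# Sub-folders named in the dataset layout above.
EXPECTED_SUBDIRS = ("train2014", "val2014", "annotations")


def missing_subdirs(dataset_root):
    """Return the expected sub-folders that are absent under dataset_root."""
    root = Path(dataset_root)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


# Hypothetical usage: "dataset" is the folder you would pass as [DATASET_PATH].
missing = missing_subdirs("dataset")
if missing:
    print("missing folders:", ", ".join(missing))
else:
    print("dataset layout looks complete")
```

An empty list means all three folders were found under the given path.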

Structure

.
└─yolov3_darknet53
  ├─README.md
  ├─scripts
  │ ├─run_standalone_train.sh         # launch standalone training (1p)
  │ ├─run_distribute_train.sh         # launch distributed training (8p)
  │ └─run_eval.sh                     # launch evaluation
  ├─src
  │ ├─config.py                       # parameter configuration
  │ ├─darknet.py                      # backbone of network
  │ ├─distributed_sampler.py          # iterator of dataset
  │ ├─initializer.py                  # initializer of parameters
  │ ├─logger.py                       # log function
  │ ├─loss.py                         # loss function
  │ ├─lr_scheduler.py                 # generate learning rate
  │ ├─transforms.py                   # preprocess data
  │ ├─util.py                         # utility functions
  │ ├─yolo.py                         # yolov3 network
  │ └─yolo_dataset.py                 # create dataset for YOLOV3
  ├─eval.py                           # eval net
  └─train.py                          # train net

Running the example

Train

Usage

# distributed training
sh run_distribute_train.sh [DATASET_PATH] [PRETRAINED_BACKBONE] [MINDSPORE_HCCL_CONFIG_PATH]
 
# standalone training
sh run_standalone_train.sh [DATASET_PATH] [PRETRAINED_BACKBONE]

Launch

# distributed training example(8p)
sh run_distribute_train.sh dataset/coco2014 backbone/backbone.ckpt rank_table_8p.json

# standalone training example(1p)
sh run_standalone_train.sh dataset/coco2014 backbone/backbone.ckpt

For details on rank_table.json (passed as [MINDSPORE_HCCL_CONFIG_PATH]), refer to the distributed training tutorial.

Result

Training results are stored under the scripts path, in a folder whose name begins with "train" or "train_parallel". That folder holds the checkpoint files, and its log.txt contains output like the following.

# distributed training result(8p)
epoch[0], iter[0], loss:14623.384766, 1.23 imgs/sec, lr:7.812499825377017e-05
epoch[0], iter[100], loss:1486.253051, 15.01 imgs/sec, lr:0.007890624925494194
epoch[0], iter[200], loss:288.579535, 490.41 imgs/sec, lr:0.015703124925494194
epoch[0], iter[300], loss:153.136754, 531.99 imgs/sec, lr:0.023515624925494194
epoch[1], iter[400], loss:106.429322, 405.14 imgs/sec, lr:0.03132812678813934
...
epoch[318], iter[102000], loss:34.135306, 431.06 imgs/sec, lr:9.63797629083274e-06
epoch[319], iter[102100], loss:35.652469, 449.52 imgs/sec, lr:2.409552052995423e-06
epoch[319], iter[102200], loss:34.652273, 384.02 imgs/sec, lr:2.409552052995423e-06
epoch[319], iter[102300], loss:35.430038, 423.49 imgs/sec, lr:2.409552052995423e-06
...
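
To track loss or throughput over time, the log lines above can be parsed into structured records. The sketch below assumes that every training line in log.txt follows exactly the format shown. The regular expression and the name `parse_log_line` are illustrative and are not taken from src/logger.py.

```python
import re

# Matches lines such as:
# epoch[0], iter[100], loss:1486.253051, 15.01 imgs/sec, lr:0.007890624925494194
LOG_LINE = re.compile(
    r"epoch\[(?P<epoch>\d+)\], iter\[(?P<iter>\d+)\], "
    r"loss:(?P<loss>[\d.]+), (?P<fps>[\d.]+) imgs/sec, "
    r"lr:(?P<lr>[\d.eE+-]+)"
)


def parse_log_line(line):
    """Return a dict of the fields in one training log line, or None."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    return {
        "epoch": int(match["epoch"]),
        "iter": int(match["iter"]),
        "loss": float(match["loss"]),
        "imgs_per_sec": float(match["fps"]),
        "lr": float(match["lr"]),
    }


# Two lines copied from the log above, plus one line that should be skipped.
sample = [
    "epoch[0], iter[0], loss:14623.384766, 1.23 imgs/sec, lr:7.812499825377017e-05",
    "epoch[0], iter[100], loss:1486.253051, 15.01 imgs/sec, lr:0.007890624925494194",
    "...",
]
records = [r for r in map(parse_log_line, sample) if r is not None]
for record in records:
    print(record["iter"], record["loss"], record["imgs_per_sec"])
```

Lines that do not match the pattern, such as the "..." placeholder, are skipped rather than raising an error.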

Infer

Usage

# infer
sh run_eval.sh [DATASET_PATH] [CHECKPOINT_PATH]

Launch

# infer with checkpoint
sh run_eval.sh dataset/coco2014/ checkpoint/0-319_102400.ckpt

The checkpoint file is produced during training.

Result

Inference results are stored under the scripts path, in a folder named "eval". Its log.txt contains output like the following.

=============coco eval reulst=========
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.311
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.528
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.322
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.127
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.323
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.428
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.259
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.398
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.423
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.224
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.442
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.551
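
To compare runs, the summary above can be read into a dictionary keyed by metric, IoU range, object area and detection limit. This is a minimal sketch that assumes the summary lines in log.txt have exactly the layout shown. The name `parse_coco_summary` is hypothetical and is not part of eval.py.

```python
import re

# Matches lines such as:
#  Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.311
SUMMARY_LINE = re.compile(
    r"\((?P<metric>AP|AR)\) @\[ IoU=(?P<iou>[\d.:]+)\s*\| "
    r"area=\s*(?P<area>\w+) \| maxDets=\s*(?P<max_dets>\d+) \] = (?P<value>-?[\d.]+)"
)


def parse_coco_summary(text):
    """Map (metric, iou, area, max_dets) to the reported value."""
    results = {}
    for match in SUMMARY_LINE.finditer(text):
        key = (match["metric"], match["iou"], match["area"], int(match["max_dets"]))
        results[key] = float(match["value"])
    return results


# Three lines copied from the summary above.
sample = """
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.311
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.528
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.259
"""
summary = parse_coco_summary(sample)
print(summary[("AP", "0.50:0.95", "all", 100)])
```

Keying on the full tuple keeps the twelve rows distinct, since several of them share the same metric name and IoU range.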