YOLOv3 Example

Description

YOLOv3 network based on ResNet-18, with support for training and evaluation.

Requirements

  • Install MindSpore.

  • Dataset

    We use COCO2017 as the training dataset.

    1. The directory structure is as follows:

      .
      ├── annotations  # annotation jsons
      ├── train2017    # train dataset
      └── val2017      # infer dataset
      
    2. Organize the dataset information into a TXT file; each row in the file is as follows:

      train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2
      

      Each row is one image annotation, with fields separated by spaces: the first column is the relative path of the image, and the remaining columns are box and class information in the format [xmin,ymin,xmax,ymax,class]. dataset.py is the parsing script; it reads each image from the path formed by joining image_dir (the dataset directory) with the relative path found in anno_path (the TXT file path). Both image_dir and anno_path are external inputs. A minimal parsing sketch is shown below.
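
      For reference, here is a minimal, hypothetical sketch of how one such row could be parsed (the actual parsing lives in dataset.py and may differ):

      import os
      import numpy as np

      def parse_anno_line(line, image_dir):
          """Parse one row: '<relative/path.jpg> xmin,ymin,xmax,ymax,class ...'."""
          fields = line.strip().split(" ")
          image_path = os.path.join(image_dir, fields[0])  # join dataset dir with the relative path
          boxes = [[int(v) for v in box.split(",")] for box in fields[1:]]
          return image_path, np.array(boxes, dtype=np.int64)  # shape: (num_boxes, 5)

      # Example with the row above and image_dir="./dataset":
      # parse_anno_line("train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2", "./dataset")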

Running the Example

Training

To train the model, run train.py with the dataset image_dir, anno_path and mindrecord_dir. If mindrecord_dir is empty, MindRecord files will be generated from image_dir and anno_path (the absolute image path is formed by joining image_dir with the relative path in anno_path). Note that if mindrecord_dir is not empty, it will be used directly and image_dir and anno_path will be ignored.
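
A rough sketch of that decision logic, with convert_to_mindrecord standing in as a hypothetical placeholder for the conversion step actually performed by the scripts:

    import os

    def convert_to_mindrecord(image_dir, anno_path, mindrecord_path):
        """Hypothetical placeholder for the image/annotation -> MindRecord conversion."""
        print("converting {} + {} into {}".format(image_dir, anno_path, mindrecord_path))

    def prepare_data(mindrecord_dir, image_dir, anno_path, prefix="yolo.mindrecord"):
        """Reuse existing MindRecord files if present, otherwise generate them."""
        if not os.path.isdir(mindrecord_dir) or not os.listdir(mindrecord_dir):
            # mindrecord_dir is empty: build MindRecord files from image_dir and anno_path
            os.makedirs(mindrecord_dir, exist_ok=True)
            convert_to_mindrecord(image_dir, anno_path, os.path.join(mindrecord_dir, prefix))
        # mindrecord_dir already holds data: image_dir and anno_path are ignored
        return os.path.join(mindrecord_dir, prefix)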

  • Standalone mode

    sh run_standalone_train.sh 0 50 ./Mindrecord_train ./dataset ./dataset/train.txt
    
    

    The input variables are device id, epoch size, mindrecord directory path, dataset directory path and train TXT file path.

  • Distributed mode

    sh run_distribute_train.sh 8 150 /data/Mindrecord_train /data /data/train.txt /data/hccl.json
    

    The input variables are the number of devices, epoch size, mindrecord directory path, dataset directory path, train TXT file path and HCCL JSON configuration file. It is better to use absolute paths.

You will get the loss value and time of each step as follows:

epoch: 145 step: 156, loss is 12.202981
epoch time: 25599.22742843628, per step time: 164.0976117207454
epoch: 146 step: 156, loss is 16.91706
epoch time: 23199.971675872803, per step time: 148.7177671530308
epoch: 147 step: 156, loss is 13.04007
epoch time: 23801.95164680481, per step time: 152.57661312054364
epoch: 148 step: 156, loss is 10.431475
epoch time: 23634.241580963135, per step time: 151.50154859591754
epoch: 149 step: 156, loss is 14.665991
epoch time: 24118.8325881958, per step time: 154.60790120638333 
epoch: 150 step: 156, loss is 10.779521
epoch time: 25319.57221031189, per step time: 162.30495006610187

Note that these results come from a two-class task (person and face) using our own annotations with COCO2017; you can change num_classes in config.py to train on your own dataset. Support for the 80 classes of COCO2017 will be added in the near future.

Evaluation

To evaluate, run eval.py with the dataset image_dir, anno_path (the eval TXT file), mindrecord_dir and ckpt_path. ckpt_path is the path of the checkpoint file.

sh run_eval.sh 0 yolo.ckpt ./Mindrecord_eval ./dataset ./dataset/eval.txt

The input variables are device id, checkpoint path, mindrecord directory path, dataset directory path and eval TXT file path.

You will get the precision and recall values for each class:

class 0 precision is 88.18%, recall is 66.00%
class 1 precision is 85.34%, recall is 79.13%

Note that the precision and recall values above are results of the two-class task (person and face) using our own annotations with COCO2017.
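
For reference, per-class precision and recall follow the standard definitions; a minimal sketch, assuming matched detections have already been counted as true positives, false positives and false negatives (the counts in the example are illustrative only):

    def precision_recall(tp, fp, fn):
        """precision = TP / (TP + FP), recall = TP / (TP + FN) for one class."""
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        return precision, recall

    # Example (illustrative counts): precision_recall(tp=88, fp=12, fn=45) -> (0.88, ~0.66)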