!16110 SSD ResNet50 without FPN for master branch

From: @wittlu
Reviewed-by: @c_34, @oacjiewen
Signed-off-by: @c_34
mindspore-ci-bot committed this on 2021-05-11 21:08:50 +08:00 via Gitee
commit 6e0246fa12
31 changed files with 3452 additions and 0 deletions

Dockerfile
@@ -0,0 +1,6 @@
ARG FROM_IMAGE_NAME
FROM ${FROM_IMAGE_NAME}
RUN apt install libgl1-mesa-glx -y
COPY requirements.txt .
RUN pip3.7 install -r requirements.txt

README.md
@@ -0,0 +1,305 @@
# Contents
- [Contents](#contents)
- [SSD Description](#ssd-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Run the scripts](#run-the-scripts)
- [Script Description](#script-description)
- [Script and Sample Code](#script-and-sample-code)
- [Script Parameters](#script-parameters)
- [Training Process](#training-process)
- [Training on Ascend](#training-on-ascend)
- [Evaluation Process](#evaluation-process)
- [Evaluation on Ascend](#evaluation-on-ascend)
- [Performance](#performance)
- [Export MindIR](#export-mindir)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)
## [SSD Description](#contents)
SSD discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape. Additionally, the network combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes.
[Paper](https://arxiv.org/abs/1512.02325): Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg. SSD: Single Shot MultiBox Detector. European Conference on Computer Vision (ECCV), 2016.
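The per-layer default-box scales are spread evenly between a minimum and a maximum scale. As a quick illustration, here is a minimal sketch mirroring the scale computation in `src/box_utils.py`, with the `min_scale`/`max_scale` values from `src/config.py`:
```python
# Per-layer default-box scales, mirroring src/box_utils.py (illustrative sketch).
min_scale, max_scale, num_layers = 0.2, 0.95, 5
scale_rate = (max_scale - min_scale) / (num_layers - 1)
scales = [min_scale + scale_rate * i for i in range(num_layers)] + [1.0]
print(scales)  # approximately [0.2, 0.3875, 0.575, 0.7625, 0.95, 1.0]
```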
## [Model Architecture](#contents)
The SSD approach is based on a feed-forward convolutional network that produces a fixed-size collection of bounding boxes and scores for the presence of object class instances in those boxes, followed by a non-maximum suppression step to produce the final detections. The early network layers are based on a standard architecture used for high-quality image classification, which is called the base network. Auxiliary structure is then added to the network to produce detections.
## [Dataset](#contents)
Note that you can run the scripts with the dataset mentioned in the original paper or with a dataset widely used in the relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below.
Dataset used: [COCO2017](<http://images.cocodataset.org/>)
- Dataset size: 19 GB
    - Train: 18 GB, 118,000 images
    - Val: 1 GB, 5,000 images
    - Annotations: 241 MB, instances, captions, person_keypoints, etc.
- Data format: images and json files
- Note: data will be processed in dataset.py
## [Environment Requirements](#contents)
- Install [MindSpore](https://www.mindspore.cn/install/en).
- Download the dataset COCO2017.
- We use COCO2017 as training dataset in this example by default, and you can also use your own datasets.
First, install Cython, pycocotools and opencv-python to process the data and get the evaluation results.
```shell
pip install Cython
pip install pycocotools
pip install opencv-python
```
1. If the COCO dataset is used. **Select dataset `coco` when running the script.**
Change the `coco_root` and other settings you need in `src/config.py`. The directory structure is as follows:
```shell
.
└─coco_dataset
  ├─annotations
    ├─instances_train2017.json
    └─instances_val2017.json
  ├─val2017
  └─train2017
```
2. If the VOC dataset is used. **Select dataset `voc` when running the script.**
Change `classes`, `num_classes`, `voc_json` and `voc_root` in `src/config.py`. `voc_json` is the path of the COCO-format json file used for evaluation, and `voc_root` is the path of the VOC dataset. The directory structure is as follows:
```shell
.
└─voc_dataset
  └─train
    ├─0001.jpg
    └─0001.xml
    ...
    ├─xxxx.jpg
    └─xxxx.xml
  └─eval
    ├─0001.jpg
    └─0001.xml
    ...
    ├─xxxx.jpg
    └─xxxx.xml
```
3. If your own dataset is used. **Select dataset `other` when running the script.**
Organize the dataset information into a TXT file, where each row is as follows:
```shell
train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2
```
Each row is one image annotation, split by spaces: the first column is the relative path of the image, and the rest are boxes and class information in the format [xmin,ymin,xmax,ymax,class]. Images are read from the path formed by joining `image_dir` (the dataset directory) with the relative path in `anno_path` (the TXT file path); `image_dir` and `anno_path` are set in `src/config.py`.
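For illustration, each row can be parsed with a few lines of Python (a minimal sketch; `parse_anno_line` is a hypothetical helper, not part of these scripts):
```python
# Parse one annotation row: "<relative image path> <xmin,ymin,xmax,ymax,class> ...".
def parse_anno_line(line):
    parts = line.strip().split(' ')
    image_path, boxes = parts[0], []
    for box in parts[1:]:
        xmin, ymin, xmax, ymax, cls = map(int, box.split(','))
        boxes.append([xmin, ymin, xmax, ymax, cls])
    return image_path, boxes

print(parse_anno_line("train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2"))
# -> ('train2017/0000001.jpg', [[0, 259, 401, 459, 7], [35, 28, 324, 201, 2]])
```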
## [Quick Start](#contents)
### Run the scripts
After installing MindSpore via the official website, you can start training and evaluation as follows:
- running on Ascend
```shell
# distributed training on Ascend
bash run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABLE_FILE]
# training on single NPU
bash run_standalone_train.sh
# run eval on Ascend
bash run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]
```
- Run on docker
Build the docker image (change the version to the one you actually use):
```shell
# build docker
docker build -t ssd:20.1.0 . --build-arg FROM_IMAGE_NAME=ascend-mindspore-arm:20.1.0
```
Create a container layer over the created image and start it:
```shell
# start docker
bash scripts/docker_start.sh ssd:20.1.0 [DATA_DIR] [MODEL_DIR]
```
Then you can run everything just like on Ascend.
## [Script Description](#contents)
### [Script and Sample Code](#contents)
```shell
.
└─ cv
  └─ ssd
    ├─ README.md                    # descriptions about SSD
    ├─ scripts
      ├─ docker_start.sh            # shell script for starting the docker container
      ├─ run_distribute_train.sh    # shell script for distributed training on Ascend
      ├─ run_standalone_train.sh    # shell script for standalone training on Ascend
      └─ run_eval.sh                # shell script for evaluation on Ascend
    ├─ src
      ├─ __init__.py                # init file
      ├─ anchor_generator.py        # anchor generator
      ├─ box_utils.py               # bbox utils
      ├─ eval_utils.py              # metrics utils
      ├─ config.py                  # total config
      ├─ config_ssd_resnet50.py     # config for ssd_resnet50
      ├─ dataset.py                 # create dataset and process dataset
      ├─ init_params.py             # parameters utils
      ├─ lr_schedule.py             # learning rate generator
      └─ ssd.py                     # ssd architecture
    ├─ eval.py                      # eval script
    ├─ train.py                     # train script
    ├─ export.py                    # export mindir script
    ├─ postprocess.py               # post process for 310 inference
    └─ mindspore_hub_conf.py        # mindspore hub interface
```
### [Script Parameters](#contents)
```shell
Major parameters in train.py and config.py are as follows:
"device_num": 1                                  # Number of devices to use
"lr": 0.05                                       # Initial learning rate
"dataset": coco                                  # Dataset name
"epoch_size": 500                                # Epoch size
"batch_size": 32                                 # Batch size of the input tensor
"pre_trained": None                              # Pretrained checkpoint file path
"pre_trained_epoch_size": 0                      # Pretrained epoch size
"save_checkpoint_epochs": 10                     # Epoch interval between two checkpoints. By default, a checkpoint is saved every 10 epochs
"loss_scale": 1024                               # Loss scale
"filter_weight": False                           # Whether to load parameters in the head layer. Set True if the number of classes in the training dataset differs from that in the pretrained checkpoint
"freeze_layer": "none"                           # Whether to freeze the backbone parameters; supports "none" and "backbone"
"class_num": 81                                  # Number of dataset classes
"image_shape": [300, 300]                        # Image height and width used as the model input
"mindrecord_dir": "/data/MindRecord_COCO"        # MindRecord path
"coco_root": "/data/coco2017"                    # COCO2017 dataset path
"voc_root": "/data/voc_dataset"                  # VOC original dataset path
"voc_json": "annotations/voc_instances_val.json" # COCO-format json file used for VOC evaluation
"image_dir": ""                                  # Other dataset image path; not used when coco or voc is selected
"anno_path": ""                                  # Other dataset annotation path; not used when coco or voc is selected
```
### [Training Process](#contents)
To train the model, run `train.py`. If `mindrecord_dir` is empty, [mindrecord](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/convert_dataset.html) files will be generated from `coco_root` (coco dataset), `voc_root` (voc dataset) or `image_dir` and `anno_path` (own dataset). **Note that if `mindrecord_dir` isn't empty, the files in it will be used instead of raw images.**
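A minimal sketch of this selection behavior (illustrative only; the actual logic lives in `create_mindrecord` in `src/dataset.py`, and the file-name prefix and third argument are assumptions here):
```python
from src.dataset import create_mindrecord

# Reuses existing MindRecord files under config.mindrecord_dir if present;
# otherwise converts the raw dataset selected by the first argument
# (coco_root, voc_root, or image_dir + anno_path) into MindRecord first.
mindrecord_file = create_mindrecord("coco", "ssd.mindrecord", True)
print(mindrecord_file)
```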
#### Training on Ascend
- Distributed mode
```shell
bash run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABLE_FILE] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional)
```
This script needs five or seven parameters.
- `DEVICE_NUM`: the number of devices for distributed training.
- `EPOCH_SIZE`: the epoch size for distributed training.
- `LR`: the initial learning rate for distributed training.
- `DATASET`: the dataset mode for distributed training.
- `RANK_TABLE_FILE`: the path of [rank_table.json](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools); an absolute path is recommended.
- `PRE_TRAINED`: the path of the pretrained checkpoint file; an absolute path is recommended.
- `PRE_TRAINED_EPOCH_SIZE`: the epoch size of the pretrained model.
Training results will be stored in the current path, in a folder whose name begins with "LOG". There you can find checkpoint files together with results like the following in the log:
```shell
epoch: 1 step: 458, loss is 3.1681802
epoch time: 228752.4654865265, per step time: 499.4595316299705
epoch: 2 step: 458, loss is 2.8847265
epoch time: 38912.93382644653, per step time: 84.96273761232868
epoch: 3 step: 458, loss is 2.8398118
epoch time: 38769.184827804565, per step time: 84.64887516987896
...
epoch: 498 step: 458, loss is 0.70908034
epoch time: 38771.079778671265, per step time: 84.65301261718616
epoch: 499 step: 458, loss is 0.7974688
epoch time: 38787.413120269775, per step time: 84.68867493508685
epoch: 500 step: 458, loss is 0.5548882
epoch time: 39064.8467540741, per step time: 85.29442522723602
```
### [Evaluation Process](#contents)
#### Evaluation on Ascend
```shell
bash run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]
```
This script needs three parameters.
- `DATASET`: the dataset mode of the evaluation dataset.
- `CHECKPOINT_PATH`: the absolute path of the checkpoint file.
- `DEVICE_ID`: the device id for evaluation.
> The checkpoint can be produced during the training process.
Inference results will be stored in the example path, in a folder whose name begins with "eval". There you can find results like the following in the log:
```shell
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.327
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.474
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.358
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.120
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.350
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.459
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.315
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.489
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.511
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.208
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.557
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.689
========================================
mAP: 0.32719216721918915
```
### [Performance](#contents)
| Parameters          | Ascend                |
| ------------------- | --------------------- |
| Model Version       | SSD ResNet50          |
| Resource            | Ascend 910            |
| Uploaded Date       | 2020-03-29            |
| MindSpore Version   | 1.1.0                 |
| Dataset             | COCO2017              |
| mAP                 | IoU=0.50: 32.7%       |
| Model Size          | 281M (.ckpt file)     |
## [Export MindIR](#contents)
```shell
python export.py --ckpt_file [CKPT_PATH] --file_name [FILE_NAME] --file_format [FILE_FORMAT]
```
The `ckpt_file` parameter is required, and `FILE_FORMAT` must be chosen from ["AIR", "MINDIR"].
## [Description of Random Situation](#contents)
In dataset.py, we set the seed inside the "create_dataset" function. We also use a random seed in train.py.
## [ModelZoo Homepage](#contents)
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).

README_CN.md
@@ -0,0 +1,263 @@
# Contents

<!-- TOC -->

- [Contents](#contents)
- [SSD Description](#ssd-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Script Parameters](#script-parameters)
    - [Training Process](#training-process)
        - [Training on Ascend](#training-on-ascend)
    - [Evaluation Process](#evaluation-process)
        - [Evaluation on Ascend](#evaluation-on-ascend)
        - [Performance](#performance)
    - [Export MindIR](#export-mindir)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)

<!-- /TOC -->
# SSD Description

SSD discretizes the output space of bounding boxes into a set of default boxes with different aspect ratios and scales for each feature map location. At prediction time, the network scores the presence of each object category in every default box and adjusts the box to better match the object shape. In addition, the network combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes.
[Paper](https://arxiv.org/abs/1512.02325): Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg. SSD: Single Shot MultiBox Detector. European Conference on Computer Vision (ECCV), 2016.
# Model Architecture

The SSD approach is based on a feed-forward convolutional network that produces a fixed-size collection of bounding boxes and scores for the presence of object class instances in those boxes, followed by a non-maximum suppression step that produces the final detections. The early network layers are based on a standard architecture used for high-quality image classification, called the base network. Auxiliary structure is then added to the network to produce detections.
# Dataset

Dataset used: [COCO2017](<http://images.cocodataset.org/>)

- Dataset size: 19 GB
    - Train: 18 GB, 118,000 images
    - Val: 1 GB, 5,000 images
    - Annotations: 241 MB, instances, captions, person_keypoints, etc.
- Data format: images and json files
- Note: data will be processed in dataset.py

# Environment Requirements

- Install [MindSpore](https://www.mindspore.cn/install).
- Download the dataset COCO2017.
- This example uses COCO2017 as the training dataset by default, and you can also use your own datasets.
1. If the COCO dataset is used. **Select dataset `coco` when running the script.**
Install Cython and pycocotools (mmcv can also be installed) to process the data:
```shell
pip install Cython
pip install pycocotools
```
Then change `COCO_ROOT` and other settings you need in `config.py`. The directory structure is as follows:
```text
.
└─cocodataset
  ├─annotations
    ├─instances_train2017.json
    └─instances_val2017.json
  ├─val2017
  └─train2017
```
2. If your own dataset is used. **Select dataset `other` when running the script.**
Organize the dataset information into a TXT file, where each row is as follows:
```text
train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2
```
Each row is one image annotation, split by spaces: the first column is the relative path of the image, and the rest are boxes and class information in the format [xmin,ymin,xmax,ymax,class]. Images are read from the path formed by joining `IMAGE_DIR` (the dataset directory) with the relative path in `ANNO_PATH` (the TXT file path). Set `IMAGE_DIR` and `ANNO_PATH` in `config.py`.
# Quick Start

After installing MindSpore via the official website, you can start training and evaluation as follows:
```shell script
# distributed training on Ascend
sh run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABLE_FILE]
```
```shell script
# standalone training
sh run_standalone_train.sh
```
```shell script
# run eval on Ascend
sh run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]
```
# Script Description

## Script and Sample Code
```text
.
└─ cv
  └─ ssd
    ├─ README.md                    ## descriptions about SSD
    ├─ scripts
      ├─ run_distribute_train.sh    ## shell script for distributed training on Ascend
      └─ run_eval.sh                ## shell script for evaluation on Ascend
    ├─ src
      ├─ __init__.py                ## init file
      ├─ box_util.py                ## bbox utils
      ├─ coco_eval.py               ## coco metrics utils
      ├─ config.py                  ## total config
      ├─ dataset.py                 ## create and process dataset
      ├─ init_params.py             ## parameters utils
      ├─ lr_schedule.py             ## learning rate generator
      └─ ssd.py                     ## SSD architecture
    ├─ eval.py                      ## eval script
    ├─ train.py                     ## train script
    └─ mindspore_hub_conf.py        ## MindSpore Hub interface
```
## Script Parameters
```text
Major parameters in train.py and config.py are as follows:
"device_num": 1 # 使用设备数量
"lr": 0.05 # 学习率初始值
"dataset": coco # 数据集名称
"epoch_size": 500 # 轮次大小
"batch_size": 32 # 输入张量的批次大小
"pre_trained": None # 预训练检查点文件路径
"pre_trained_epoch_size": 0 # 预训练轮次大小
"save_checkpoint_epochs": 10 # 两个检查点之间的轮次间隔。默认情况下每10个轮次都会保存检查点。
"loss_scale": 1024 # 损失放大
"class_num": 81 # 数据集类数
"image_shape": [300, 300] # 作为模型输入的图像高和宽
"mindrecord_dir": "/data/MindRecord_COCO" # MindRecord路径
"coco_root": "/data/coco2017" # COCO2017数据集路径
"voc_root": "" # VOC原始数据集路径
"image_dir": "" # 其他数据集图片路径如果使用coco或voc此参数无效。
"anno_path": "" # 其他数据集标注路径如果使用coco或voc此参数无效。
```
## Training Process

Run `train.py` to train the model. If `mindrecord_dir` is empty, [MindRecord](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/convert_dataset.html) files will be generated from `coco_root` (coco dataset) or from `image_dir` and `anno_path` (your own dataset). **Note that if mindrecord_dir is not empty, the files in it will be used instead of raw images.**
### Training on Ascend

- Distributed mode
```shell script
sh run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABLE_FILE] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional)
```
This script needs five or seven parameters.
- `DEVICE_NUM`: the number of devices for distributed training.
- `EPOCH_SIZE`: the epoch size for distributed training.
- `LR`: the initial learning rate for distributed training.
- `DATASET`: the dataset mode for distributed training.
- `RANK_TABLE_FILE`: the path of [rank_table.json](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools); an absolute path is recommended.
- `PRE_TRAINED`: the path of the pretrained checkpoint file; an absolute path is recommended.
- `PRE_TRAINED_EPOCH_SIZE`: the epoch size of the pretrained model.
Training results are stored in the current path, in a folder whose name begins with "LOG". You can find checkpoint files together with results like the following in this folder.
```text
epoch: 1 step: 458, loss is 3.1681802
epoch time: 228752.4654865265, per step time: 499.4595316299705
epoch: 2 step: 458, loss is 2.8847265
epoch time: 38912.93382644653, per step time: 84.96273761232868
epoch: 3 step: 458, loss is 2.8398118
epoch time: 38769.184827804565, per step time: 84.64887516987896
...
epoch: 498 step: 458, loss is 0.70908034
epoch time: 38771.079778671265, per step time: 84.65301261718616
epoch: 499 step: 458, loss is 0.7974688
epoch time: 38787.413120269775, per step time: 84.68867493508685
epoch: 500 step: 458, loss is 0.5548882
epoch time: 39064.8467540741, per step time: 85.29442522723602
```
## Evaluation Process

### Evaluation on Ascend
```shell script
sh run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]
```
This script needs three parameters.
- `DATASET`: the dataset mode of the evaluation dataset.
- `CHECKPOINT_PATH`: the absolute path of the checkpoint file.
- `DEVICE_ID`: the device id for evaluation.
> The checkpoint can be produced during the training process.
Inference results are stored in the example path, in a folder whose name begins with "eval". You can find results like the following in the log.
```text
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.327
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.474
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.358
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.120
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.350
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.459
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.315
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.489
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.511
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.208
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.557
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.689
========================================
mAP: 0.32719216721918915
```
### Performance

| Parameters          | Ascend                |
| ------------------- | --------------------- |
| Model Version       | SSD ResNet50          |
| Resource            | Ascend 910            |
| Uploaded Date       | 2021-03-29            |
| MindSpore Version   | 1.1.0                 |
| Dataset             | COCO2017              |
| mAP                 | IoU=0.50: 32.7%       |
| Model Size          | 281M (.ckpt file)     |
## Export MindIR
```shell
python export.py --ckpt_file [CKPT_PATH] --file_name [FILE_NAME] --file_format [FILE_FORMAT]
```
The `ckpt_file` parameter is required, and `FILE_FORMAT` must be chosen from ["AIR", "MINDIR"].
# Description of Random Situation

In dataset.py, the seed is set inside the "create_dataset" function. A random seed is also used in train.py.

# ModelZoo Homepage

Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).

ascend310_infer/CMakeLists.txt
@@ -0,0 +1,14 @@
cmake_minimum_required(VERSION 3.14.1)
project(Ascend310Infer)
add_compile_definitions(_GLIBCXX_USE_CXX11_ABI=0)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O0 -g -std=c++17 -Werror -Wall -fPIE -Wl,--allow-shlib-undefined")
set(PROJECT_SRC_ROOT ${CMAKE_CURRENT_LIST_DIR}/)
option(MINDSPORE_PATH "mindspore install path" "")
include_directories(${MINDSPORE_PATH})
include_directories(${MINDSPORE_PATH}/include)
include_directories(${PROJECT_SRC_ROOT})
find_library(MS_LIB libmindspore.so ${MINDSPORE_PATH}/lib)
file(GLOB_RECURSE MD_LIB ${MINDSPORE_PATH}/_c_dataengine*)
add_executable(main src/main.cc src/utils.cc)
target_link_libraries(main ${MS_LIB} ${MD_LIB} gflags)
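# Example invocation (mirrors build.sh; illustrative comment only):
#   cmake .. -DMINDSPORE_PATH="$(pip3.7 show mindspore-ascend | grep Location | awk '{print $2"/mindspore"}' | xargs realpath)"
#   make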

ascend310_infer/aipp.cfg
@@ -0,0 +1,26 @@
aipp_op {
aipp_mode : static
input_format : YUV420SP_U8
related_input_rank : 0
csc_switch : true
rbuv_swap_switch : false
matrix_r0c0 : 256
matrix_r0c1 : 0
matrix_r0c2 : 359
matrix_r1c0 : 256
matrix_r1c1 : -88
matrix_r1c2 : -183
matrix_r2c0 : 256
matrix_r2c1 : 454
matrix_r2c2 : 0
input_bias_0 : 0
input_bias_1 : 128
input_bias_2 : 128
mean_chn_0 : 124
mean_chn_1 : 117
mean_chn_2 : 104
var_reci_chn_0 : 0.0171247538316637
var_reci_chn_1 : 0.0175070028011204
var_reci_chn_2 : 0.0174291938997821
}
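# Note: the normalization above mirrors the CPU preprocessing path in src/main.cc,
# which applies Normalize({123.675, 116.28, 103.53}, {58.395, 57.120, 57.375});
# var_reci_chn_0 = 1/58.395, var_reci_chn_1 = 1/57.120, var_reci_chn_2 = 1/57.375,
# and mean_chn_* are the per-channel means rounded to integers.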

ascend310_infer/build.sh
@@ -0,0 +1,28 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ ! -d out ]; then
mkdir out
fi
cd out
if [ -f "Makefile" ]; then
make clean
fi
cmake .. \
-DMINDSPORE_PATH="`pip3.7 show mindspore-ascend | grep Location | awk '{print $2"/mindspore"}' | xargs realpath`"
make

ascend310_infer/inc/utils.h
@@ -0,0 +1,32 @@
/**
* Copyright 2021 Huawei Technologies Co., Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef MINDSPORE_INFERENCE_UTILS_H_
#define MINDSPORE_INFERENCE_UTILS_H_
#include <sys/stat.h>
#include <dirent.h>
#include <vector>
#include <string>
#include <memory>
#include "include/api/types.h"
std::vector<std::string> GetAllFiles(std::string_view dirName);
DIR *OpenDir(std::string_view dirName);
std::string RealPath(std::string_view path);
mindspore::MSTensor ReadFileToTensor(const std::string &file);
int WriteResult(const std::string& imageFile, const std::vector<mindspore::MSTensor> &outputs);
#endif

ascend310_infer/src/main.cc
@@ -0,0 +1,162 @@
/**
* Copyright 2021 Huawei Technologies Co., Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <sys/time.h>
#include <gflags/gflags.h>
#include <dirent.h>
#include <iostream>
#include <string>
#include <algorithm>
#include <iosfwd>
#include <vector>
#include <fstream>
#include <map>
#include <memory>
#include "include/api/model.h"
#include "include/api/context.h"
#include "include/api/types.h"
#include "include/api/serialization.h"
#include "include/minddata/dataset/include/vision_ascend.h"
#include "include/minddata/dataset/include/execute.h"
#include "include/minddata/dataset/include/vision.h"
#include "inc/utils.h"
using mindspore::GlobalContext;
using mindspore::Serialization;
using mindspore::Model;
using mindspore::ModelContext;
using mindspore::Status;
using mindspore::ModelType;
using mindspore::GraphCell;
using mindspore::kSuccess;
using mindspore::MSTensor;
using mindspore::dataset::Execute;
using mindspore::dataset::TensorTransform;
using mindspore::dataset::vision::DvppDecodeResizeJpeg;
using mindspore::dataset::vision::Resize;
using mindspore::dataset::vision::HWC2CHW;
using mindspore::dataset::vision::Normalize;
using mindspore::dataset::vision::Decode;
DEFINE_string(mindir_path, "", "mindir path");
DEFINE_string(dataset_path, ".", "dataset path");
DEFINE_int32(device_id, 0, "device id");
DEFINE_string(aipp_path, "./aipp.cfg", "aipp path");
DEFINE_string(cpu_dvpp, "DVPP", "cpu or dvpp process");
DEFINE_int32(image_height, 640, "image height");
DEFINE_int32(image_width, 640, "image width");
int main(int argc, char **argv) {
gflags::ParseCommandLineFlags(&argc, &argv, true);
if (RealPath(FLAGS_mindir_path).empty()) {
std::cout << "Invalid mindir" << std::endl;
return 1;
}
GlobalContext::SetGlobalDeviceTarget(mindspore::kDeviceTypeAscend310);
GlobalContext::SetGlobalDeviceID(FLAGS_device_id);
auto graph = Serialization::LoadModel(FLAGS_mindir_path, ModelType::kMindIR);
auto model_context = std::make_shared<mindspore::ModelContext>();
if (FLAGS_cpu_dvpp == "DVPP") {
if (RealPath(FLAGS_aipp_path).empty()) {
std::cout << "Invalid aipp path" << std::endl;
return 1;
} else {
ModelContext::SetInsertOpConfigPath(model_context, FLAGS_aipp_path);
}
}
Model model(GraphCell(graph), model_context);
Status ret = model.Build();
if (ret != kSuccess) {
std::cout << "ERROR: Build failed." << std::endl;
return 1;
}
auto all_files = GetAllFiles(FLAGS_dataset_path);
if (all_files.empty()) {
std::cout << "ERROR: no input data." << std::endl;
return 1;
}
std::map<double, double> costTime_map;
size_t size = all_files.size();
for (size_t i = 0; i < size; ++i) {
struct timeval start = {0};
struct timeval end = {0};
double startTimeMs;
double endTimeMs;
std::vector<MSTensor> inputs;
std::vector<MSTensor> outputs;
std::cout << "Start predict input files:" << all_files[i] << std::endl;
if (FLAGS_cpu_dvpp == "DVPP") {
auto resizeShape = {static_cast <uint32_t>(FLAGS_image_height), static_cast <uint32_t>(FLAGS_image_width)};
Execute resize_op(std::shared_ptr<DvppDecodeResizeJpeg>(new DvppDecodeResizeJpeg(resizeShape)));
auto imgDvpp = std::make_shared<MSTensor>();
resize_op(ReadFileToTensor(all_files[i]), imgDvpp.get());
inputs.emplace_back(imgDvpp->Name(), imgDvpp->DataType(), imgDvpp->Shape(),
imgDvpp->Data().get(), imgDvpp->DataSize());
} else {
std::shared_ptr<TensorTransform> decode(new Decode());
std::shared_ptr<TensorTransform> hwc2chw(new HWC2CHW());
std::shared_ptr<TensorTransform> normalize(
new Normalize({123.675, 116.28, 103.53}, {58.395, 57.120, 57.375}));
auto resizeShape = {FLAGS_image_height, FLAGS_image_width};
std::shared_ptr<TensorTransform> resize(new Resize(resizeShape));
Execute composeDecode({decode, resize, normalize, hwc2chw});
auto img = MSTensor();
auto image = ReadFileToTensor(all_files[i]);
composeDecode(image, &img);
std::vector<MSTensor> model_inputs = model.GetInputs();
if (model_inputs.empty()) {
std::cout << "Invalid model, inputs is empty." << std::endl;
return 1;
}
inputs.emplace_back(model_inputs[0].Name(), model_inputs[0].DataType(), model_inputs[0].Shape(),
img.Data().get(), img.DataSize());
}
gettimeofday(&start, nullptr);
ret = model.Predict(inputs, &outputs);
gettimeofday(&end, nullptr);
if (ret != kSuccess) {
std::cout << "Predict " << all_files[i] << " failed." << std::endl;
return 1;
}
startTimeMs = (1.0 * start.tv_sec * 1000000 + start.tv_usec) / 1000;
endTimeMs = (1.0 * end.tv_sec * 1000000 + end.tv_usec) / 1000;
costTime_map.insert(std::pair<double, double>(startTimeMs, endTimeMs));
WriteResult(all_files[i], outputs);
}
double average = 0.0;
int inferCount = 0;
char tmpCh[256] = {0};
for (auto iter = costTime_map.begin(); iter != costTime_map.end(); iter++) {
double diff = 0.0;
diff = iter->second - iter->first;
average += diff;
inferCount++;
}
average = average / inferCount;
snprintf(tmpCh, sizeof(tmpCh), \
"NN inference cost average time: %4.3f ms of infer_count %d \n", average, inferCount);
std::cout << "NN inference cost average time: "<< average << "ms of infer_count " << inferCount << std::endl;
std::string fileName = "./time_Result" + std::string("/test_perform_static.txt");
std::ofstream fileStream(fileName.c_str(), std::ios::trunc);
fileStream << tmpCh;
fileStream.close();
costTime_map.clear();
return 0;
}

ascend310_infer/src/utils.cc
@@ -0,0 +1,129 @@
/**
* Copyright 2021 Huawei Technologies Co., Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <fstream>
#include <algorithm>
#include <iostream>
#include <climits>
#include "inc/utils.h"
using mindspore::MSTensor;
using mindspore::DataType;
std::vector<std::string> GetAllFiles(std::string_view dirName) {
struct dirent *filename;
DIR *dir = OpenDir(dirName);
if (dir == nullptr) {
return {};
}
std::vector<std::string> res;
while ((filename = readdir(dir)) != nullptr) {
std::string dName = std::string(filename->d_name);
if (dName == "." || dName == ".." || filename->d_type != DT_REG) {
continue;
}
res.emplace_back(std::string(dirName) + "/" + filename->d_name);
}
std::sort(res.begin(), res.end());
for (auto &f : res) {
std::cout << "image file: " << f << std::endl;
}
return res;
}
int WriteResult(const std::string& imageFile, const std::vector<MSTensor> &outputs) {
std::string homePath = "./result_Files";
for (size_t i = 0; i < outputs.size(); ++i) {
size_t outputSize;
std::shared_ptr<const void> netOutput;
netOutput = outputs[i].Data();
outputSize = outputs[i].DataSize();
int pos = imageFile.rfind('/');
std::string fileName(imageFile, pos + 1);
fileName.replace(fileName.find('.'), fileName.size() - fileName.find('.'), '_' + std::to_string(i) + ".bin");
std::string outFileName = homePath + "/" + fileName;
FILE * outputFile = fopen(outFileName.c_str(), "wb");
fwrite(netOutput.get(), outputSize, sizeof(char), outputFile);
fclose(outputFile);
outputFile = nullptr;
}
return 0;
}
mindspore::MSTensor ReadFileToTensor(const std::string &file) {
if (file.empty()) {
std::cout << "Pointer file is nullptr" << std::endl;
return mindspore::MSTensor();
}
std::ifstream ifs(file);
if (!ifs.good()) {
std::cout << "File: " << file << " is not exist" << std::endl;
return mindspore::MSTensor();
}
if (!ifs.is_open()) {
std::cout << "File: " << file << "open failed" << std::endl;
return mindspore::MSTensor();
}
ifs.seekg(0, std::ios::end);
size_t size = ifs.tellg();
mindspore::MSTensor buffer(file, mindspore::DataType::kNumberTypeUInt8, {static_cast<int64_t>(size)}, nullptr, size);
ifs.seekg(0, std::ios::beg);
ifs.read(reinterpret_cast<char *>(buffer.MutableData()), size);
ifs.close();
return buffer;
}
DIR *OpenDir(std::string_view dirName) {
if (dirName.empty()) {
std::cout << " dirName is null ! " << std::endl;
return nullptr;
}
std::string realPath = RealPath(dirName);
struct stat s;
lstat(realPath.c_str(), &s);
if (!S_ISDIR(s.st_mode)) {
std::cout << "dirName is not a valid directory !" << std::endl;
return nullptr;
}
DIR *dir;
dir = opendir(realPath.c_str());
if (dir == nullptr) {
std::cout << "Can not open dir " << dirName << std::endl;
return nullptr;
}
std::cout << "Successfully opened the dir " << dirName << std::endl;
return dir;
}
std::string RealPath(std::string_view path) {
char realPathMem[PATH_MAX] = {0};
char *realPathRet = nullptr;
realPathRet = realpath(path.data(), realPathMem);
if (realPathRet == nullptr) {
std::cout << "File: " << path << " is not exist.";
return "";
}
std::string realPath(realPathMem);
std::cout << path << " realpath is: " << realPath << std::endl;
return realPath;
}

eval.py
@@ -0,0 +1,98 @@
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Evaluation for SSD"""
import os
import argparse
import time
import numpy as np
from mindspore import context, Tensor
from mindspore.train.serialization import load_checkpoint, load_param_into_net
from src.ssd import SsdInferWithDecoder, ssd_resnet50
from src.dataset import create_ssd_dataset, create_mindrecord
from src.config import config
from src.eval_utils import metrics
from src.box_utils import default_boxes
def ssd_eval(dataset_path, ckpt_path, anno_json):
"""SSD evaluation."""
batch_size = 1
ds = create_ssd_dataset(dataset_path, batch_size=batch_size, repeat_num=1,
is_training=False, use_multiprocessing=False)
if config.model == "ssd_resnet50":
net = ssd_resnet50(config=config)
else:
raise ValueError(f'config.model: {config.model} is not supported')
net = SsdInferWithDecoder(net, Tensor(default_boxes), config)
print("Load Checkpoint!")
param_dict = load_checkpoint(ckpt_path)
net.init_parameters_data()
load_param_into_net(net, param_dict)
net.set_train(False)
i = batch_size
total = ds.get_dataset_size() * batch_size
start = time.time()
pred_data = []
print("\n========================================\n")
print("total images num: ", total)
print("Processing, please wait a moment.")
for data in ds.create_dict_iterator(output_numpy=True, num_epochs=1):
img_id = data['img_id']
img_np = data['image']
image_shape = data['image_shape']
output = net(Tensor(img_np))
for batch_idx in range(img_np.shape[0]):
pred_data.append({"boxes": output[0].asnumpy()[batch_idx],
"box_scores": output[1].asnumpy()[batch_idx],
"img_id": int(np.squeeze(img_id[batch_idx])),
"image_shape": image_shape[batch_idx]})
percent = round(i / total * 100., 2)
print(f' {str(percent)} [{i}/{total}]', end='\r')
i += batch_size
cost_time = int((time.time() - start) * 1000)
print(f' 100% [{total}/{total}] cost {cost_time} ms')
mAP = metrics(pred_data, anno_json)
print("\n========================================\n")
print(f"mAP: {mAP}")
def get_eval_args():
parser = argparse.ArgumentParser(description='SSD evaluation')
parser.add_argument("--device_id", type=int, default=0, help="Device id, default is 0.")
parser.add_argument("--dataset", type=str, default="coco", help="Dataset, default is coco.")
parser.add_argument("--checkpoint_path", type=str, required=True, help="Checkpoint file path.")
parser.add_argument("--run_platform", type=str, default="Ascend", choices=("Ascend", "GPU", "CPU"),
help="run platform, support Ascend ,GPU and CPU.")
return parser.parse_args()
if __name__ == '__main__':
args_opt = get_eval_args()
if args_opt.dataset == "coco":
json_path = os.path.join(config.coco_root, config.instances_set.format(config.val_data_type))
elif args_opt.dataset == "voc":
json_path = os.path.join(config.voc_root, config.voc_json)
else:
raise ValueError('SSD eval only supports dataset modes "coco" and "voc"!')
context.set_context(mode=context.GRAPH_MODE, device_target=args_opt.run_platform, device_id=args_opt.device_id)
mindrecord_file = create_mindrecord(args_opt.dataset, "ssd_eval.mindrecord", False)
print("Start Eval!")
ssd_eval(mindrecord_file, args_opt.checkpoint_path, json_path)
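# Usage sketch (paths are placeholders; mirrors scripts/run_eval.sh):
#   python eval.py --dataset=coco --checkpoint_path=/path/to/ssd.ckpt --device_id=0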

export.py
@@ -0,0 +1,52 @@
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""export"""
import argparse
import numpy as np
from mindspore import context, Tensor, dtype
from mindspore.train.serialization import load_checkpoint, load_param_into_net, export
from src.ssd import SsdInferWithDecoder, ssd_resnet50
from src.config import config
from src.box_utils import default_boxes
parser = argparse.ArgumentParser(description='SSD export')
parser.add_argument("--device_id", type=int, default=0, help="Device id")
parser.add_argument("--batch_size", type=int, default=1, help="batch size")
parser.add_argument("--ckpt_file", type=str, required=True, help="Checkpoint file path.")
parser.add_argument("--file_name", type=str, default="ssd", help="output file name.")
parser.add_argument('--file_format', type=str, choices=["AIR", "MINDIR"], default='AIR', help='file format')
parser.add_argument("--device_target", type=str, choices=["Ascend", "GPU", "CPU"], default="Ascend",
help="device target")
args = parser.parse_args()
context.set_context(mode=context.GRAPH_MODE, device_target=args.device_target)
if args.device_target == "Ascend":
context.set_context(device_id=args.device_id)
if __name__ == '__main__':
if config.model == "ssd_resnet50":
net = ssd_resnet50(config=config)
else:
raise ValueError(f'config.model: {config.model} is not supported')
net = SsdInferWithDecoder(net, Tensor(default_boxes), config)
param_dict = load_checkpoint(args.ckpt_file)
net.init_parameters_data()
load_param_into_net(net, param_dict)
net.set_train(False)
input_shp = [args.batch_size, 3] + config.img_shape
input_array = Tensor(np.random.uniform(-1.0, 1.0, size=input_shp), dtype.float32)
export(net, input_array, file_name=args.file_name, file_format=args.file_format)

mindspore_hub_conf.py
@@ -0,0 +1,24 @@
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""hub config."""
from src.ssd import SSD300, ssd_mobilenet_v2
from src.config import config
def create_network(name, *args, **kwargs):
if name == "ssd300":
backbone = ssd_mobilenet_v2()
ssd = SSD300(backbone=backbone, config=config, *args, **kwargs)
return ssd
raise NotImplementedError(f"{name} is not implemented in the repo")

postprocess.py
@@ -0,0 +1,90 @@
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""post process for 310 inference"""
import os
import argparse
import numpy as np
from PIL import Image
from src.config import config
from src.eval_utils import metrics
batch_size = 1
parser = argparse.ArgumentParser(description="ssd acc calculation")
parser.add_argument("--result_path", type=str, required=True, help="result files path.")
parser.add_argument("--img_path", type=str, required=True, help="image file path.")
parser.add_argument("--drop", action="store_true", help="drop iscrowd images or not.")
args = parser.parse_args()
def get_imgSize(file_name):
img = Image.open(file_name)
return img.size
def get_result(result_path, img_id_file_path):
"""print the mAP"""
anno_json = os.path.join(config.coco_root, config.instances_set.format(config.val_data_type))
if args.drop:
from pycocotools.coco import COCO
train_cls = config.classes
train_cls_dict = {}
for i, cls in enumerate(train_cls):
train_cls_dict[cls] = i
coco = COCO(anno_json)
classs_dict = {}
cat_ids = coco.loadCats(coco.getCatIds())
for cat in cat_ids:
classs_dict[cat["id"]] = cat["name"]
files = os.listdir(img_id_file_path)
pred_data = []
for file in files:
img_ids_name = file.split('.')[0]
img_id = int(np.squeeze(img_ids_name))
if args.drop:
anno_ids = coco.getAnnIds(imgIds=img_id, iscrowd=None)
anno = coco.loadAnns(anno_ids)
annos = []
iscrowd = False
for label in anno:
bbox = label["bbox"]
class_name = classs_dict[label["category_id"]]
iscrowd = iscrowd or label["iscrowd"]
if class_name in train_cls:
x_min, x_max = bbox[0], bbox[0] + bbox[2]
y_min, y_max = bbox[1], bbox[1] + bbox[3]
annos.append(list(map(round, [y_min, x_min, y_max, x_max])) + [train_cls_dict[class_name]])
if iscrowd or (not annos):
continue
img_size = get_imgSize(os.path.join(img_id_file_path, file))
image_shape = np.array([img_size[1], img_size[0]])
result_path_0 = os.path.join(result_path, img_ids_name + "_0.bin")
result_path_1 = os.path.join(result_path, img_ids_name + "_1.bin")
boxes = np.fromfile(result_path_0, dtype=np.float32).reshape(config.num_ssd_boxes, 4)
box_scores = np.fromfile(result_path_1, dtype=np.float32).reshape(config.num_ssd_boxes, config.num_classes)
pred_data.append({
"boxes": boxes,
"box_scores": box_scores,
"img_id": img_id,
"image_shape": image_shape
})
mAP = metrics(pred_data, anno_json)
print(f" mAP:{mAP}")
if __name__ == '__main__':
get_result(args.result_path, args.img_path)
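# Usage sketch (paths are placeholders): run after 310 inference has written its
# outputs to ./result_Files (see ascend310_infer/src/utils.cc), e.g.
#   python postprocess.py --result_path=./result_Files --img_path=/data/coco2017/val2017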

requirements.txt
@@ -0,0 +1,4 @@
pycocotools
opencv-python
xml-python
Pillow

scripts/docker_start.sh
@@ -0,0 +1,29 @@
#!/bin/bash
docker_image=$1
data_dir=$2
model_dir=$3
docker run -it --ipc=host \
--device=/dev/davinci0 \
--device=/dev/davinci1 \
--device=/dev/davinci2 \
--device=/dev/davinci3 \
--device=/dev/davinci4 \
--device=/dev/davinci5 \
--device=/dev/davinci6 \
--device=/dev/davinci7 \
--device=/dev/davinci_manager \
--device=/dev/devmm_svm \
--device=/dev/hisi_hdc \
--privileged \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons \
-v ${data_dir}:${data_dir} \
-v ${model_dir}:${model_dir} \
-v /var/log/npu/conf/slog/slog.conf:/var/log/npu/conf/slog/slog.conf \
-v /var/log/npu/slog/:/var/log/npu/slog/ \
-v /var/log/npu/profiling/:/var/log/npu/profiling \
-v /var/log/npu/dump/:/var/log/npu/dump \
-v /var/log/npu/:/usr/slog ${docker_image} \
/bin/bash
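# Usage (from the README): bash scripts/docker_start.sh ssd:20.1.0 [DATA_DIR] [MODEL_DIR]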

scripts/run_distribute_train.sh
@@ -0,0 +1,84 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
echo "=============================================================================================================="
echo "Please run the script as: "
echo "sh run_distribute_train.sh DEVICE_NUM EPOCH_SIZE LR DATASET RANK_TABLE_FILE PRE_TRAINED PRE_TRAINED_EPOCH_SIZE"
echo "for example: sh run_distribute_train.sh 8 500 0.2 coco /data/hccl.json /opt/ssd-300.ckpt(optional) 200(optional)"
echo "It is better to use absolute path."
echo "================================================================================================================="
if [ $# != 5 ] && [ $# != 7 ]
then
echo "Usage: sh run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] \
[RANK_TABLE_FILE] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional)"
exit 1
fi
# Before start distribute train, first create mindrecord files.
BASE_PATH=$(cd "`dirname $0`" || exit; pwd)
cd $BASE_PATH/../ || exit
python train.py --only_create_dataset=True --dataset=$4
echo "After running the script, the network runs in the background. The log will be generated in LOGx/log.txt"
export RANK_SIZE=$1
EPOCH_SIZE=$2
LR=$3
DATASET=$4
PRE_TRAINED=$6
PRE_TRAINED_EPOCH_SIZE=$7
export RANK_TABLE_FILE=$5
start_device=0
for((i=0;i<RANK_SIZE;i++))
do
let device_id=i+start_device
export DEVICE_ID=$device_id
rm -rf LOG$i
mkdir ./LOG$i
cp ./*.py ./LOG$i
cp -r ./src ./LOG$i
cd ./LOG$i || exit
export RANK_ID=$i
echo "start training for rank $i, device $DEVICE_ID"
env > env.log
if [ $# == 5 ]
then
python train.py \
--distribute=True \
--lr=$LR \
--dataset=$DATASET \
--device_num=$RANK_SIZE \
--device_id=$DEVICE_ID \
--epoch_size=$EPOCH_SIZE > log.txt 2>&1 &
fi
if [ $# == 7 ]
then
python train.py \
--distribute=True \
--lr=$LR \
--dataset=$DATASET \
--device_num=$RANK_SIZE \
--device_id=$DEVICE_ID \
--pre_trained=$PRE_TRAINED \
--pre_trained_epoch_size=$PRE_TRAINED_EPOCH_SIZE \
--epoch_size=$EPOCH_SIZE > log.txt 2>&1 &
fi
cd ../
done

scripts/run_eval.sh
@@ -0,0 +1,65 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 3 ]
then
echo "Usage: sh run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]"
exit 1
fi
get_real_path(){
if [ "${1:0:1}" == "/" ]; then
echo "$1"
else
echo "$(realpath -m $PWD/$1)"
fi
}
DATASET=$1
CHECKPOINT_PATH=$(get_real_path $2)
echo $DATASET
echo $CHECKPOINT_PATH
if [ ! -f $CHECKPOINT_PATH ]
then
echo "error: CHECKPOINT_PATH=$PATH2 is not a file"
exit 1
fi
export DEVICE_NUM=1
export DEVICE_ID=$3
export RANK_SIZE=$DEVICE_NUM
export RANK_ID=0
BASE_PATH=$(cd "`dirname $0`" || exit; pwd)
cd $BASE_PATH/../ || exit
if [ -d "eval$3" ];
then
rm -rf ./eval$3
fi
mkdir ./eval$3
cp ./*.py ./eval$3
cp -r ./src ./eval$3
cd ./eval$3 || exit
env > env.log
echo "start inferring for device $DEVICE_ID"
python eval.py \
--dataset=$DATASET \
--checkpoint_path=$CHECKPOINT_PATH \
--device_id=$3 > log.txt 2>&1 &
cd ..

scripts/run_standalone_train.sh
@@ -0,0 +1,25 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
python train.py \
--distribute=False \
--lr=0.02 \
--dataset=coco \
--device_num=1 \
--device_id=1 \
--epoch_size=12 \
--save_checkpoint_epochs=2 \
> log.txt 2>&1 &

src/anchor_generator.py
@@ -0,0 +1,94 @@
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Anchor Generator"""
import numpy as np
class GridAnchorGenerator:
"""
Anchor Generator
"""
def __init__(self, image_shape, scale, scales_per_octave, aspect_ratios):
super(GridAnchorGenerator, self).__init__()
self.scale = scale
self.scales_per_octave = scales_per_octave
self.aspect_ratios = aspect_ratios
self.image_shape = image_shape
def generate(self, step):
"""generate anchors for one layer"""
scales = np.array([2**(float(scale) / self.scales_per_octave)
for scale in range(self.scales_per_octave)]).astype(np.float32)
aspects = np.array(list(self.aspect_ratios)).astype(np.float32)
scales_grid, aspect_ratios_grid = np.meshgrid(scales, aspects)
scales_grid = scales_grid.reshape([-1])
aspect_ratios_grid = aspect_ratios_grid.reshape([-1])
feature_size = [self.image_shape[0] / step, self.image_shape[1] / step]
grid_height, grid_width = feature_size
base_size = np.array([self.scale * step, self.scale * step]).astype(np.float32)
anchor_offset = step / 2.0
ratio_sqrt = np.sqrt(aspect_ratios_grid)
heights = scales_grid / ratio_sqrt * base_size[0]
widths = scales_grid * ratio_sqrt * base_size[1]
y_centers = np.arange(grid_height).astype(np.float32)
y_centers = y_centers * step + anchor_offset
x_centers = np.arange(grid_width).astype(np.float32)
x_centers = x_centers * step + anchor_offset
x_centers, y_centers = np.meshgrid(x_centers, y_centers)
x_centers_shape = x_centers.shape
y_centers_shape = y_centers.shape
widths_grid, x_centers_grid = np.meshgrid(widths, x_centers.reshape([-1]))
heights_grid, y_centers_grid = np.meshgrid(heights, y_centers.reshape([-1]))
x_centers_grid = x_centers_grid.reshape(*x_centers_shape, -1)
y_centers_grid = y_centers_grid.reshape(*y_centers_shape, -1)
widths_grid = widths_grid.reshape(-1, *x_centers_shape)
heights_grid = heights_grid.reshape(-1, *y_centers_shape)
bbox_centers = np.stack([y_centers_grid, x_centers_grid], axis=3)
bbox_sizes = np.stack([heights_grid, widths_grid], axis=3)
bbox_centers = bbox_centers.reshape([-1, 2])
bbox_sizes = bbox_sizes.reshape([-1, 2])
bbox_corners = np.concatenate([bbox_centers - 0.5 * bbox_sizes, bbox_centers + 0.5 * bbox_sizes], axis=1)
self.bbox_corners = bbox_corners / np.array([*self.image_shape, *self.image_shape]).astype(np.float32)
self.bbox_centers = np.concatenate([bbox_centers, bbox_sizes], axis=1)
self.bbox_centers = self.bbox_centers / np.array([*self.image_shape, *self.image_shape]).astype(np.float32)
print(self.bbox_centers.shape)
return self.bbox_centers, self.bbox_corners
def generate_multi_levels(self, steps):
"""generate anchor for multi layer"""
bbox_centers_list = []
bbox_corners_list = []
for step in steps:
bbox_centers, bbox_corners = self.generate(step)
bbox_centers_list.append(bbox_centers)
bbox_corners_list.append(bbox_corners)
self.bbox_centers = np.concatenate(bbox_centers_list, axis=0)
self.bbox_corners = np.concatenate(bbox_corners_list, axis=0)
return self.bbox_centers, self.bbox_corners
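# Usage sketch (illustrative comment, not part of the original file): with the values
# used in src/box_utils.py for this model (image shape [640, 640], scale 4,
# 2 scales per octave, aspect ratios [1.0, 2.0, 0.5], steps (8, 16, 32, 64, 128)),
# each location gets 2 * 3 = 6 anchors, so generate_multi_levels returns
# (80*80 + 40*40 + 20*20 + 10*10 + 5*5) * 6 = 51150 boxes of shape (51150, 4):
#   generator = GridAnchorGenerator([640, 640], 4, 2, [1.0, 2.0, 0.5])
#   centers, corners = generator.generate_multi_levels([8, 16, 32, 64, 128])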

src/box_utils.py
@@ -0,0 +1,170 @@
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Bbox utils"""
import math
import itertools as it
import numpy as np
from .config import config
from .anchor_generator import GridAnchorGenerator
class GeneratDefaultBoxes():
"""
Generate default boxes for SSD, following the order of (W, H, anchor_sizes).
`self.default_boxes` has a shape of [anchor_sizes, H, W, 4]; the last dimension is [y, x, h, w].
`self.default_boxes_tlbr` has the same shape as `self.default_boxes`, with the last dimension [y1, x1, y2, x2].
"""
def __init__(self):
fk = config.img_shape[0] / np.array(config.steps)
scale_rate = (config.max_scale - config.min_scale) / (len(config.num_default) - 1)
scales = [config.min_scale + scale_rate * i for i in range(len(config.num_default))] + [1.0]
self.default_boxes = []
for idex, feature_size in enumerate(config.feature_size):
sk1 = scales[idex]
sk2 = scales[idex + 1]
sk3 = math.sqrt(sk1 * sk2)
if idex == 0 and not config.aspect_ratios[idex]:
w, h = sk1 * math.sqrt(2), sk1 / math.sqrt(2)
all_sizes = [(0.1, 0.1), (w, h), (h, w)]
else:
all_sizes = [(sk1, sk1)]
for aspect_ratio in config.aspect_ratios[idex]:
w, h = sk1 * math.sqrt(aspect_ratio), sk1 / math.sqrt(aspect_ratio)
all_sizes.append((w, h))
all_sizes.append((h, w))
all_sizes.append((sk3, sk3))
assert len(all_sizes) == config.num_default[idex]
for i, j in it.product(range(feature_size), repeat=2):
for w, h in all_sizes:
cx, cy = (j + 0.5) / fk[idex], (i + 0.5) / fk[idex]
self.default_boxes.append([cy, cx, h, w])
def to_tlbr(cy, cx, h, w):
return cy - h / 2, cx - w / 2, cy + h / 2, cx + w / 2
# For IoU calculation
self.default_boxes_tlbr = np.array(tuple(to_tlbr(*i) for i in self.default_boxes), dtype='float32')
self.default_boxes = np.array(self.default_boxes, dtype='float32')
if 'use_anchor_generator' in config and config.use_anchor_generator:
generator = GridAnchorGenerator(config.img_shape, 4, 2, [1.0, 2.0, 0.5])
default_boxes, default_boxes_tlbr = generator.generate_multi_levels(config.steps)
else:
default_boxes_tlbr = GeneratDefaultBoxes().default_boxes_tlbr
default_boxes = GeneratDefaultBoxes().default_boxes
y1, x1, y2, x2 = np.split(default_boxes_tlbr[:, :4], 4, axis=-1)
vol_anchors = (x2 - x1) * (y2 - y1)
matching_threshold = config.match_threshold
def ssd_bboxes_encode(boxes):
"""
Labels anchors with ground truth inputs.
Args:
boxes: ground truth with shape [N, 5]; each row stores [y, x, h, w, cls].
Returns:
gt_loc: location ground truth with shape [num_anchors, 4].
gt_label: class ground truth with shape [num_anchors, 1].
num_matched_boxes: number of positives in an image.
"""
def jaccard_with_anchors(bbox):
"""Compute jaccard score a box and the anchors."""
# Intersection bbox and volume.
ymin = np.maximum(y1, bbox[0])
xmin = np.maximum(x1, bbox[1])
ymax = np.minimum(y2, bbox[2])
xmax = np.minimum(x2, bbox[3])
w = np.maximum(xmax - xmin, 0.)
h = np.maximum(ymax - ymin, 0.)
# Volumes.
inter_vol = h * w
union_vol = vol_anchors + (bbox[2] - bbox[0]) * (bbox[3] - bbox[1]) - inter_vol
jaccard = inter_vol / union_vol
return np.squeeze(jaccard)
pre_scores = np.zeros((config.num_ssd_boxes), dtype=np.float32)
t_boxes = np.zeros((config.num_ssd_boxes, 4), dtype=np.float32)
t_label = np.zeros((config.num_ssd_boxes), dtype=np.int64)
for bbox in boxes:
label = int(bbox[4])
scores = jaccard_with_anchors(bbox)
idx = np.argmax(scores)
scores[idx] = 2.0
mask = (scores > matching_threshold)
mask = mask & (scores > pre_scores)
pre_scores = np.maximum(pre_scores, scores * mask)
t_label = mask * label + (1 - mask) * t_label
for i in range(4):
t_boxes[:, i] = mask * bbox[i] + (1 - mask) * t_boxes[:, i]
index = np.nonzero(t_label)
# Transform to tlbr.
bboxes = np.zeros((config.num_ssd_boxes, 4), dtype=np.float32)
bboxes[:, [0, 1]] = (t_boxes[:, [0, 1]] + t_boxes[:, [2, 3]]) / 2
bboxes[:, [2, 3]] = t_boxes[:, [2, 3]] - t_boxes[:, [0, 1]]
# Encode features.
bboxes_t = bboxes[index]
default_boxes_t = default_boxes[index]
bboxes_t[:, :2] = (bboxes_t[:, :2] - default_boxes_t[:, :2]) / (default_boxes_t[:, 2:] * config.prior_scaling[0])
tmp = np.maximum(bboxes_t[:, 2:4] / default_boxes_t[:, 2:4], 0.000001)
bboxes_t[:, 2:4] = np.log(tmp) / config.prior_scaling[1]
bboxes[index] = bboxes_t
num_match = np.array([len(np.nonzero(t_label)[0])], dtype=np.int32)
return bboxes, t_label.astype(np.int32), num_match
def ssd_bboxes_decode(boxes):
"""Decode predict boxes to [y, x, h, w]"""
boxes_t = boxes.copy()
default_boxes_t = default_boxes.copy()
boxes_t[:, :2] = boxes_t[:, :2] * config.prior_scaling[0] * default_boxes_t[:, 2:] + default_boxes_t[:, :2]
boxes_t[:, 2:4] = np.exp(boxes_t[:, 2:4] * config.prior_scaling[1]) * default_boxes_t[:, 2:4]
bboxes = np.zeros((len(boxes_t), 4), dtype=np.float32)
bboxes[:, [0, 1]] = boxes_t[:, [0, 1]] - boxes_t[:, [2, 3]] / 2
bboxes[:, [2, 3]] = boxes_t[:, [0, 1]] + boxes_t[:, [2, 3]] / 2
return np.clip(bboxes, 0, 1)
def intersect(box_a, box_b):
"""Compute the intersect of two sets of boxes."""
max_yx = np.minimum(box_a[:, 2:4], box_b[2:4])
min_yx = np.maximum(box_a[:, :2], box_b[:2])
inter = np.clip((max_yx - min_yx), a_min=0, a_max=np.inf)
return inter[:, 0] * inter[:, 1]
def jaccard_numpy(box_a, box_b):
"""Compute the jaccard overlap of two sets of boxes."""
inter = intersect(box_a, box_b)
area_a = ((box_a[:, 2] - box_a[:, 0]) *
(box_a[:, 3] - box_a[:, 1]))
area_b = ((box_b[2] - box_b[0]) *
(box_b[3] - box_b[1]))
union = area_a + area_b - inter
return inter / union
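# Illustrative note (not part of the original file): jaccard_numpy broadcasts a
# single box box_b ([y1, x1, y2, x2]) against N boxes in box_a, e.g.
#   jaccard_numpy(np.array([[0., 0., 1., 1.]]), np.array([0., 0., 0.5, 0.5]))
# returns array([0.25]): the intersection (0.25) equals box_b's area, and the
# union equals box_a's area (1.0).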

View File

@ -0,0 +1,36 @@
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Config parameters for SSD models."""
from .config_ssd_resnet50 import config as config_ssd_resnet50
using_model = "ssd_resnet50"
config_map = {
"ssd_resnet50": config_ssd_resnet50
}
print("...............using "+using_model+" model................")
config = config_map[using_model]
if config.num_ssd_boxes == -1:
num = 0
h, w = config.img_shape
for i in range(len(config.steps)):
num += (h // config.steps[i]) * (w // config.steps[i]) * config.num_default[i]
config.num_ssd_boxes = num
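# Worked example (illustrative): for the default ssd_resnet50 settings
# (img_shape 640x640, steps (8, 16, 32, 64, 128), six default boxes per
# location), the loop above resolves to
#   (80*80 + 40*40 + 20*20 + 10*10 + 5*5) * 6 = 8525 * 6 = 51150
# so config.num_ssd_boxes becomes 51150.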

View File

@ -0,0 +1,88 @@
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#" ============================================================================
"""Config parameters for SSD models."""
from easydict import EasyDict as ed
config = ed({
"model": "ssd_resnet50",
"img_shape": [640, 640],
"num_ssd_boxes": -1,
"match_threshold": 0.5,
"nms_threshold": 0.6,
"min_score": 0.1,
"max_boxes": 100,
# learning rate settings
"global_step": 0,
"lr_init": 0.01333,
"lr_end_rate": 0.0,
"warmup_epochs": 2,
"weight_decay": 4e-4,
"momentum": 0.9,
# network
"num_default": [6, 6, 6, 6, 6],
"extras_in_channels": [256, 512, 1024, 256, 256],
"extras_out_channels": [512, 1024, 512, 256, 256],
# "extras_out_channels": [256, 256, 256, 256, 256],
"extras_strides": [1, 1, 2, 2, 2, 2],
"extras_ratio": [0.2, 0.2, 0.2, 0.25, 0.5, 0.25],
"feature_size": [80, 40, 20, 10, 5],
"min_scale": 0.2,
"max_scale": 0.95,
"aspect_ratios": [(2, 3), (2, 3), (2, 3), (2, 3), (2, 3), (2, 3)],
"steps": (8, 16, 32, 64, 128),
"prior_scaling": (0.1, 0.2),
"gamma": 2.0,
"alpha": 0.25,
"num_addition_layers": 4,
"use_anchor_generator": True,
"use_global_norm": True,
"use_float16": True,
# It is best to use absolute paths for `mindrecord_dir` and `coco_root`.
"feature_extractor_base_param": "/ckpt/resnet50.ckpt",
"checkpoint_filter_list": ['multi_loc_layers', 'multi_cls_layers'],
"mindrecord_dir": "/data/MindRecord_COCO",
"coco_root": "/data/coco2017",
"train_data_type": "train2017",
"val_data_type": "val2017",
"instances_set": "annotations/instances_{}.json",
"classes": ('background', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
'train', 'truck', 'boat', 'traffic light', 'fire hydrant',
'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog',
'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra',
'giraffe', 'backpack', 'umbrella', 'handbag', 'tie',
'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
'kite', 'baseball bat', 'baseball glove', 'skateboard',
'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup',
'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza',
'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed',
'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote',
'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink',
'refrigerator', 'book', 'clock', 'vase', 'scissors',
'teddy bear', 'hair drier', 'toothbrush'),
"num_classes": 81,
# Path of the annotation json for the VOC validation dataset.
"voc_json": "annotations/voc_instances_val.json",
# Root of the original VOC dataset.
"voc_root": "/data/voc_dataset",
# If coco or voc is used, `image_dir` and `anno_path` are ignored.
"image_dir": "",
"anno_path": ""
})
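# Illustrative sanity check: every entry of `feature_size` equals img_shape
# divided by the matching stride in `steps`, e.g. 640 // 8 = 80.
assert all(config.img_shape[0] // s == f
           for s, f in zip(config.steps, config.feature_size))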

View File

@ -0,0 +1,453 @@
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""SSD dataset"""
from __future__ import division
import os
import json
import xml.etree.ElementTree as et
import numpy as np
import cv2
import mindspore.dataset as de
import mindspore.dataset.vision.c_transforms as C
from mindspore.mindrecord import FileWriter
from .config import config
from .box_utils import jaccard_numpy, ssd_bboxes_encode
def _rand(a=0., b=1.):
"""Generate random."""
return np.random.rand() * (b - a) + a
def get_imageId_from_fileName(filename, id_iter):
"""Get imageID from fileName if fileName is int, else return id_iter."""
filename = os.path.splitext(filename)[0]
if filename.isdigit():
return int(filename)
return id_iter
def random_sample_crop(image, boxes):
"""Random Crop the image and boxes"""
height, width, _ = image.shape
min_iou = np.random.choice([None, 0.1, 0.3, 0.5, 0.7, 0.9])
if min_iou is None:
return image, boxes
# max trials (50)
for _ in range(50):
image_t = image
w = _rand(0.3, 1.0) * width
h = _rand(0.3, 1.0) * height
# aspect ratio constraint between 0.5 and 2
if h / w < 0.5 or h / w > 2:
continue
left = _rand() * (width - w)
top = _rand() * (height - h)
rect = np.array([int(top), int(left), int(top + h), int(left + w)])
overlap = jaccard_numpy(boxes, rect)
# drop boxes that do not overlap the crop
drop_mask = overlap > 0
if not drop_mask.any():
continue
if overlap[drop_mask].min() < min_iou and overlap[drop_mask].max() > (min_iou + 0.2):
continue
image_t = image_t[rect[0]:rect[2], rect[1]:rect[3], :]
centers = (boxes[:, :2] + boxes[:, 2:4]) / 2.0
m1 = (rect[0] < centers[:, 0]) * (rect[1] < centers[:, 1])
m2 = (rect[2] > centers[:, 0]) * (rect[3] > centers[:, 1])
# mask where both m1 and m2 are true
mask = m1 * m2 * drop_mask
# have any valid boxes? try again if not
if not mask.any():
continue
# take only matching gt boxes
boxes_t = boxes[mask, :].copy()
boxes_t[:, :2] = np.maximum(boxes_t[:, :2], rect[:2])
boxes_t[:, :2] -= rect[:2]
boxes_t[:, 2:4] = np.minimum(boxes_t[:, 2:4], rect[2:4])
boxes_t[:, 2:4] -= rect[:2]
return image_t, boxes_t
return image, boxes
def preprocess_fn(img_id, image, box, is_training):
"""Preprocess function for dataset."""
cv2.setNumThreads(2)
def _infer_data(image, input_shape):
img_h, img_w, _ = image.shape
input_h, input_w = input_shape
image = cv2.resize(image, (input_w, input_h))
# When the image has a single channel
if len(image.shape) == 2:
image = np.expand_dims(image, axis=-1)
image = np.concatenate([image, image, image], axis=-1)
return img_id, image, np.array((img_h, img_w), np.float32)
def _data_aug(image, box, is_training, image_size=(300, 300)):
"""Data augmentation function."""
ih, iw, _ = image.shape
h, w = image_size
if not is_training:
return _infer_data(image, image_size)
# Random crop
box = box.astype(np.float32)
image, box = random_sample_crop(image, box)
ih, iw, _ = image.shape
# Resize image
image = cv2.resize(image, (w, h))
# Flip image or not
flip = _rand() < .5
if flip:
image = cv2.flip(image, 1, dst=None)
# When the image has a single channel
if len(image.shape) == 2:
image = np.expand_dims(image, axis=-1)
image = np.concatenate([image, image, image], axis=-1)
box[:, [0, 2]] = box[:, [0, 2]] / ih
box[:, [1, 3]] = box[:, [1, 3]] / iw
if flip:
box[:, [1, 3]] = 1 - box[:, [3, 1]]
box, label, num_match = ssd_bboxes_encode(box)
return image, box, label, num_match
return _data_aug(image, box, is_training, image_size=config.img_shape)
def create_voc_label(is_training):
"""Get image path and annotation from VOC."""
voc_root = config.voc_root
cls_map = {name: i for i, name in enumerate(config.classes)}
sub_dir = 'train' if is_training else 'eval'
voc_dir = os.path.join(voc_root, sub_dir)
if not os.path.isdir(voc_dir):
raise ValueError(f'Cannot find {sub_dir} dataset path.')
image_dir = anno_dir = voc_dir
if os.path.isdir(os.path.join(voc_dir, 'Images')):
image_dir = os.path.join(voc_dir, 'Images')
if os.path.isdir(os.path.join(voc_dir, 'Annotations')):
anno_dir = os.path.join(voc_dir, 'Annotations')
if not is_training:
json_file = os.path.join(config.voc_root, config.voc_json)
file_dir = os.path.split(json_file)[0]
if not os.path.isdir(file_dir):
os.makedirs(file_dir)
json_dict = {"images": [], "type": "instances", "annotations": [],
"categories": []}
bnd_id = 1
image_files_dict = {}
image_anno_dict = {}
images = []
id_iter = 0
for anno_file in os.listdir(anno_dir):
print(anno_file)
if not anno_file.endswith('xml'):
continue
tree = et.parse(os.path.join(anno_dir, anno_file))
root_node = tree.getroot()
file_name = root_node.find('filename').text
img_id = get_imageId_from_fileName(file_name, id_iter)
id_iter += 1
image_path = os.path.join(image_dir, file_name)
print(image_path)
if not os.path.isfile(image_path):
print(f'Cannot find image {file_name} according to annotations.')
continue
labels = []
for obj in root_node.iter('object'):
cls_name = obj.find('name').text
if cls_name not in cls_map:
print(f'Label "{cls_name}" not in "{config.classes}"')
continue
bnd_box = obj.find('bndbox')
x_min = int(float(bnd_box.find('xmin').text)) - 1
y_min = int(float(bnd_box.find('ymin').text)) - 1
x_max = int(float(bnd_box.find('xmax').text)) - 1
y_max = int(float(bnd_box.find('ymax').text)) - 1
labels.append([y_min, x_min, y_max, x_max, cls_map[cls_name]])
if not is_training:
o_width = abs(x_max - x_min)
o_height = abs(y_max - y_min)
ann = {'area': o_width * o_height, 'iscrowd': 0, 'image_id': \
img_id, 'bbox': [x_min, y_min, o_width, o_height], \
'category_id': cls_map[cls_name], 'id': bnd_id, \
'ignore': 0, \
'segmentation': []}
json_dict['annotations'].append(ann)
bnd_id = bnd_id + 1
if labels:
images.append(img_id)
image_files_dict[img_id] = image_path
image_anno_dict[img_id] = np.array(labels)
if not is_training:
size = root_node.find("size")
width = int(size.find('width').text)
height = int(size.find('height').text)
image = {'file_name': file_name, 'height': height, 'width': width,
'id': img_id}
json_dict['images'].append(image)
if not is_training:
for cls_name, cid in cls_map.items():
cat = {'supercategory': 'none', 'id': cid, 'name': cls_name}
json_dict['categories'].append(cat)
json_fp = open(json_file, 'w')
json_str = json.dumps(json_dict)
json_fp.write(json_str)
json_fp.close()
return images, image_files_dict, image_anno_dict
def create_coco_label(is_training):
"""Get image path and annotation from COCO."""
from pycocotools.coco import COCO
coco_root = config.coco_root
data_type = config.val_data_type
if is_training:
data_type = config.train_data_type
# Classes needed for training or testing.
train_cls = config.classes
train_cls_dict = {}
for i, cls in enumerate(train_cls):
train_cls_dict[cls] = i
anno_json = os.path.join(coco_root, config.instances_set.format(data_type))
coco = COCO(anno_json)
classs_dict = {}
cat_ids = coco.loadCats(coco.getCatIds())
for cat in cat_ids:
classs_dict[cat["id"]] = cat["name"]
image_ids = coco.getImgIds()
images = []
image_path_dict = {}
image_anno_dict = {}
for img_id in image_ids:
image_info = coco.loadImgs(img_id)
file_name = image_info[0]["file_name"]
anno_ids = coco.getAnnIds(imgIds=img_id, iscrowd=None)
anno = coco.loadAnns(anno_ids)
image_path = os.path.join(coco_root, data_type, file_name)
annos = []
iscrowd = False
for label in anno:
bbox = label["bbox"]
class_name = classs_dict[label["category_id"]]
iscrowd = iscrowd or label["iscrowd"]
if class_name in train_cls:
x_min, x_max = bbox[0], bbox[0] + bbox[2]
y_min, y_max = bbox[1], bbox[1] + bbox[3]
annos.append(list(map(round, [y_min, x_min, y_max, x_max])) + [train_cls_dict[class_name]])
if not is_training and iscrowd:
continue
if len(annos) >= 1:
images.append(img_id)
image_path_dict[img_id] = image_path
image_anno_dict[img_id] = np.array(annos)
return images, image_path_dict, image_anno_dict
def anno_parser(annos_str):
"""Parse annotation from string to list."""
annos = []
for anno_str in annos_str:
anno = list(map(int, anno_str.strip().split(',')))
annos.append(anno)
return annos
def filter_valid_data(image_dir, anno_path):
"""Filter valid image file, which both in image_dir and anno_path."""
images = []
image_path_dict = {}
image_anno_dict = {}
if not os.path.isdir(image_dir):
raise RuntimeError("Path given is not valid.")
if not os.path.isfile(anno_path):
raise RuntimeError("Annotation file is not valid.")
with open(anno_path, "rb") as f:
lines = f.readlines()
for img_id, line in enumerate(lines):
line_str = line.decode("utf-8").strip()
line_split = str(line_str).split(' ')
file_name = line_split[0]
image_path = os.path.join(image_dir, file_name)
if os.path.isfile(image_path):
images.append(img_id)
image_path_dict[img_id] = image_path
image_anno_dict[img_id] = anno_parser(line_split[1:])
return images, image_path_dict, image_anno_dict
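# Expected annotation line format for the "other" dataset path (illustrative):
#   image1.jpg 20,10,200,180,1 50,30,120,100,2
# i.e. the image file name followed by space-separated boxes, each encoded as
# "ymin,xmin,ymax,xmax,cls".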
def voc_data_to_mindrecord(mindrecord_dir, is_training, prefix="ssd.mindrecord", file_num=8):
"""Create MindRecord file by image_dir and anno_path."""
mindrecord_path = os.path.join(mindrecord_dir, prefix)
writer = FileWriter(mindrecord_path, file_num)
images, image_path_dict, image_anno_dict = create_voc_label(is_training)
ssd_json = {
"img_id": {"type": "int32", "shape": [1]},
"image": {"type": "bytes"},
"annotation": {"type": "int32", "shape": [-1, 5]},
}
writer.add_schema(ssd_json, "ssd_json")
for img_id in images:
image_path = image_path_dict[img_id]
with open(image_path, 'rb') as f:
img = f.read()
annos = np.array(image_anno_dict[img_id], dtype=np.int32)
img_id = np.array([img_id], dtype=np.int32)
row = {"img_id": img_id, "image": img, "annotation": annos}
writer.write_raw_data([row])
writer.commit()
def data_to_mindrecord_byte_image(dataset="coco", is_training=True, prefix="ssd.mindrecord", file_num=8):
"""Create MindRecord file."""
mindrecord_dir = config.mindrecord_dir
mindrecord_path = os.path.join(mindrecord_dir, prefix)
writer = FileWriter(mindrecord_path, file_num)
if dataset == "coco":
images, image_path_dict, image_anno_dict = create_coco_label(is_training)
else:
images, image_path_dict, image_anno_dict = filter_valid_data(config.image_dir, config.anno_path)
ssd_json = {
"img_id": {"type": "int32", "shape": [1]},
"image": {"type": "bytes"},
"annotation": {"type": "int32", "shape": [-1, 5]},
}
writer.add_schema(ssd_json, "ssd_json")
for img_id in images:
image_path = image_path_dict[img_id]
with open(image_path, 'rb') as f:
img = f.read()
annos = np.array(image_anno_dict[img_id], dtype=np.int32)
img_id = np.array([img_id], dtype=np.int32)
row = {"img_id": img_id, "image": img, "annotation": annos}
writer.write_raw_data([row])
writer.commit()
def create_ssd_dataset(mindrecord_file, batch_size=32, repeat_num=10, device_num=1, rank=0,
is_training=True, num_parallel_workers=4, use_multiprocessing=True):
"""Create SSD dataset with MindDataset."""
ds = de.MindDataset(mindrecord_file, columns_list=["img_id", "image", "annotation"], num_shards=device_num,
shard_id=rank, num_parallel_workers=num_parallel_workers, shuffle=is_training)
decode = C.Decode()
ds = ds.map(operations=decode, input_columns=["image"])
change_swap_op = C.HWC2CHW()
normalize_op = C.Normalize(mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
std=[0.229 * 255, 0.224 * 255, 0.225 * 255])
color_adjust_op = C.RandomColorAdjust(brightness=0.4, contrast=0.4, saturation=0.4)
compose_map_func = (lambda img_id, image, annotation: preprocess_fn(img_id, image, annotation, is_training))
if is_training:
output_columns = ["image", "box", "label", "num_match"]
trans = [color_adjust_op, normalize_op, change_swap_op]
else:
output_columns = ["img_id", "image", "image_shape"]
trans = [normalize_op, change_swap_op]
ds = ds.map(operations=compose_map_func, input_columns=["img_id", "image", "annotation"],
output_columns=output_columns, column_order=output_columns,
python_multiprocessing=use_multiprocessing,
num_parallel_workers=num_parallel_workers)
ds = ds.map(operations=trans, input_columns=["image"], python_multiprocessing=use_multiprocessing,
num_parallel_workers=num_parallel_workers)
ds = ds.batch(batch_size, drop_remainder=True)
ds = ds.repeat(repeat_num)
return ds
def create_mindrecord(dataset="coco", prefix="ssd.mindrecord", is_training=True):
""" It will generate mindrecord file in config.mindrecord_dir,
and the file name is ssd.mindrecord0, 1, ... file_num.
"""
print("Start create dataset!")
mindrecord_dir = config.mindrecord_dir
mindrecord_file = os.path.join(mindrecord_dir, prefix + "0")
if not os.path.exists(mindrecord_file):
if not os.path.isdir(mindrecord_dir):
os.makedirs(mindrecord_dir)
if dataset == "coco":
if os.path.isdir(config.coco_root):
print("Create Mindrecord.")
data_to_mindrecord_byte_image("coco", is_training, prefix)
print("Create Mindrecord Done, at {}".format(mindrecord_dir))
else:
print("coco_root not exits.")
elif dataset == "voc":
if os.path.isdir(config.voc_root):
print("Create Mindrecord.")
voc_data_to_mindrecord(mindrecord_dir, is_training, prefix)
print("Create Mindrecord Done, at {}".format(mindrecord_dir))
else:
print("voc_root not exits.")
else:
if os.path.isdir(config.image_dir) and os.path.exists(config.anno_path):
print("Create Mindrecord.")
data_to_mindrecord_byte_image("other", is_training, prefix)
print("Create Mindrecord Done, at {}".format(mindrecord_dir))
else:
print("image_dir or anno_path not exits.")
return mindrecord_file
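# ----------------------------------------------------------------------------
# Minimal usage sketch (illustrative; requires the paths configured in
# src/config.py to exist, and `python -m` launch because of relative imports):
if __name__ == "__main__":
    mr_file = create_mindrecord("coco", "ssd.mindrecord", True)
    ds = create_ssd_dataset(mr_file, batch_size=32, repeat_num=1)
    print("dataset size:", ds.get_dataset_size())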

View File

@ -0,0 +1,119 @@
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Coco metrics utils"""
import json
import numpy as np
from .config import config
def apply_nms(all_boxes, all_scores, thres, max_boxes):
"""Apply NMS to bboxes."""
y1 = all_boxes[:, 0]
x1 = all_boxes[:, 1]
y2 = all_boxes[:, 2]
x2 = all_boxes[:, 3]
areas = (x2 - x1 + 1) * (y2 - y1 + 1)
order = all_scores.argsort()[::-1]
keep = []
while order.size > 0:
i = order[0]
keep.append(i)
if len(keep) >= max_boxes:
break
xx1 = np.maximum(x1[i], x1[order[1:]])
yy1 = np.maximum(y1[i], y1[order[1:]])
xx2 = np.minimum(x2[i], x2[order[1:]])
yy2 = np.minimum(y2[i], y2[order[1:]])
w = np.maximum(0.0, xx2 - xx1 + 1)
h = np.maximum(0.0, yy2 - yy1 + 1)
inter = w * h
ovr = inter / (areas[i] + areas[order[1:]] - inter)
inds = np.where(ovr <= thres)[0]
order = order[inds + 1]
return keep
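# Tiny worked example (illustrative): two heavily overlapping boxes and one
# distant box; with thres=0.5 the lower-scored duplicate is suppressed.
if __name__ == "__main__":
    toy_boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [50, 50, 60, 60]],
                         dtype=np.float32)
    toy_scores = np.array([0.9, 0.8, 0.7], dtype=np.float32)
    print(apply_nms(toy_boxes, toy_scores, thres=0.5, max_boxes=100))  # [0, 2]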
def metrics(pred_data, anno_json):
"""Calculate mAP of predicted bboxes."""
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
num_classes = config.num_classes
# Classes needed for training or testing.
val_cls = config.classes
val_cls_dict = {}
for i, cls in enumerate(val_cls):
val_cls_dict[i] = cls
coco_gt = COCO(anno_json)
classs_dict = {}
cat_ids = coco_gt.loadCats(coco_gt.getCatIds())
for cat in cat_ids:
classs_dict[cat["name"]] = cat["id"]
predictions = []
img_ids = []
for sample in pred_data:
pred_boxes = sample['boxes']
box_scores = sample['box_scores']
img_id = sample['img_id']
h, w = sample['image_shape']
final_boxes = []
final_label = []
final_score = []
img_ids.append(img_id)
for c in range(1, num_classes):
class_box_scores = box_scores[:, c]
score_mask = class_box_scores > config.min_score
class_box_scores = class_box_scores[score_mask]
class_boxes = pred_boxes[score_mask] * [h, w, h, w]
if score_mask.any():
nms_index = apply_nms(class_boxes, class_box_scores, config.nms_threshold, config.max_boxes)
class_boxes = class_boxes[nms_index]
class_box_scores = class_box_scores[nms_index]
final_boxes += class_boxes.tolist()
final_score += class_box_scores.tolist()
final_label += [classs_dict[val_cls_dict[c]]] * len(class_box_scores)
for loc, label, score in zip(final_boxes, final_label, final_score):
res = {}
res['image_id'] = img_id
res['bbox'] = [loc[1], loc[0], loc[3] - loc[1], loc[2] - loc[0]]
res['score'] = score
res['category_id'] = label
predictions.append(res)
with open('predictions.json', 'w') as f:
json.dump(predictions, f)
coco_dt = coco_gt.loadRes('predictions.json')
E = COCOeval(coco_gt, coco_dt, iouType='bbox')
E.params.imgIds = img_ids
E.evaluate()
E.accumulate()
E.summarize()
return E.stats[0]

View File

@ -0,0 +1,50 @@
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Parameters utils"""
from mindspore.common.initializer import initializer, TruncatedNormal
def init_net_param(network, initialize_mode='TruncatedNormal'):
"""Init the parameters in net."""
params = network.trainable_params()
for p in params:
if 'beta' not in p.name and 'gamma' not in p.name and 'bias' not in p.name:
if initialize_mode == 'TruncatedNormal':
p.set_data(initializer(TruncatedNormal(0.02), p.data.shape, p.data.dtype))
else:
p.set_data(initialize_mode, p.data.shape, p.data.dtype)
def load_backbone_params(network, param_dict):
"""Init the parameters from pre-train model, default is mobilenetv2."""
for _, param in network.parameters_and_names():
param_name = param.name.replace('network.backbone.', '')
name_split = param_name.split('.')
if 'features_1' in param_name:
param_name = param_name.replace('features_1', 'features')
if 'features_2' in param_name:
param_name = '.'.join(['features', str(int(name_split[1]) + 14)] + name_split[2:])
if param_name in param_dict:
param.set_data(param_dict[param_name].data)
def filter_checkpoint_parameter_by_list(param_dict, filter_list):
"""remove useless parameters according to filter_list"""
for key in list(param_dict.keys()):
for name in filter_list:
if name in key:
print("Delete parameter from checkpoint: ", key)
del param_dict[key]
break
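# Usage sketch (illustrative; the checkpoint path is hypothetical):
# from mindspore.train.serialization import load_checkpoint
# param_dict = load_checkpoint("/path/to/ssd.ckpt")
# filter_checkpoint_parameter_by_list(param_dict,
#                                     ['multi_loc_layers', 'multi_cls_layers'])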

View File

@ -0,0 +1,55 @@
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Learning rate schedule"""
import math
import numpy as np
def get_lr(global_step, lr_init, lr_end, lr_max, warmup_epochs, total_epochs, steps_per_epoch):
"""
Generate the learning rate array.
Args:
global_step(int): step to resume training from; the returned array starts at this step
lr_init(float): init learning rate
lr_end(float): end learning rate
lr_max(float): max learning rate
warmup_epochs(float): number of warmup epochs
total_epochs(int): total number of training epochs
steps_per_epoch(int): steps of one epoch
Returns:
np.array, learning rate array
"""
lr_each_step = []
total_steps = steps_per_epoch * total_epochs
warmup_steps = steps_per_epoch * warmup_epochs
for i in range(total_steps):
if i < warmup_steps:
lr = lr_init + (lr_max - lr_init) * i / warmup_steps
else:
lr = lr_end + \
(lr_max - lr_end) * \
(1. + math.cos(math.pi * (i - warmup_steps) / (total_steps - warmup_steps))) / 2.
if lr < 0.0:
lr = 0.0
lr_each_step.append(lr)
current_step = global_step
lr_each_step = np.array(lr_each_step).astype(np.float32)
learning_rate = lr_each_step[current_step:]
return learning_rate
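# Behavior sketch (illustrative): warm up linearly for 2 epochs, then decay
# with a cosine curve toward lr_end.
if __name__ == "__main__":
    lr = get_lr(global_step=0, lr_init=0.01333, lr_end=0.0, lr_max=0.05,
                warmup_epochs=2, total_epochs=10, steps_per_epoch=100)
    print(lr.shape)          # (1000,)
    print(lr[0], lr[199])    # near lr_init, near lr_max
    print(lr[-1])            # near lr_end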

View File

@ -0,0 +1,222 @@
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""ResNet."""
import mindspore.nn as nn
from mindspore.ops import operations as P
def _conv3x3(in_channel, out_channel, stride=1):
return nn.Conv2d(in_channel, out_channel,
kernel_size=3, stride=stride, padding=0, pad_mode='same')
def _conv1x1(in_channel, out_channel, stride=1):
return nn.Conv2d(in_channel, out_channel, kernel_size=1, stride=stride, padding=0, pad_mode='same')
def _conv7x7(in_channel, out_channel, stride=1):
return nn.Conv2d(in_channel, out_channel, kernel_size=7, stride=stride, padding=0, pad_mode='same')
def _bn(channel):
return nn.BatchNorm2d(channel, eps=1e-3, momentum=0.997,
gamma_init=1, beta_init=0, moving_mean_init=0, moving_var_init=1)
def _bn_last(channel):
return nn.BatchNorm2d(channel, eps=1e-3, momentum=0.997,
gamma_init=0, beta_init=0, moving_mean_init=0, moving_var_init=1)
class ResidualBlock(nn.Cell):
"""
ResNet V1 residual block definition.
Args:
in_channel (int): Input channel.
out_channel (int): Output channel.
stride (int): Stride size for the first convolutional layer. Default: 1.
Returns:
Tensor, output tensor.
Examples:
>>> ResidualBlock(3, 256, stride=2)
"""
expansion = 4
def __init__(self,
in_channel,
out_channel,
stride=1):
super(ResidualBlock, self).__init__()
self.stride = stride
channel = out_channel // self.expansion
self.conv1 = _conv1x1(in_channel, channel, stride=1)
self.bn1 = _bn(channel)
self.conv2 = _conv3x3(channel, channel, stride=stride)
self.bn2 = _bn(channel)
self.conv3 = _conv1x1(channel, out_channel, stride=1)
self.bn3 = _bn_last(out_channel)
self.relu = nn.ReLU()
self.down_sample = False
if stride != 1 or in_channel != out_channel:
self.down_sample = True
self.down_sample_layer = None
if self.down_sample:
self.down_sample_layer = nn.SequentialCell([_conv1x1(in_channel, out_channel, stride), _bn(out_channel)])
self.add = P.Add()
def construct(self, x):
"""
Forward
"""
identity = x
out = self.conv1(x)
out = self.bn1(out)
out = self.relu(out)
out = self.conv2(out)
out = self.bn2(out)
out = self.relu(out)
out = self.conv3(out)
out = self.bn3(out)
if self.down_sample:
identity = self.down_sample_layer(identity)
out = self.add(out, identity)
out = self.relu(out)
return out
class ResNet(nn.Cell):
"""
ResNet architecture.
Args:
block (Cell): Block for network.
layer_nums (list): Numbers of block in different layers.
in_channels (list): Input channel in each layer.
out_channels (list): Output channel in each layer.
strides (list): Stride size in each layer.
Returns:
Tensor, output tensor.
Examples:
>>> ResNet(ResidualBlock,
>>> [3, 4, 6, 3],
>>> [64, 256, 512, 1024],
>>> [256, 512, 1024, 2048],
>>> [1, 2, 2, 2],
>>> 10)
"""
def __init__(self,
block,
layer_nums,
in_channels,
out_channels,
strides):
super(ResNet, self).__init__()
if not len(layer_nums) == len(in_channels) == len(out_channels) == 4:
raise ValueError("the length of layer_num, in_channels, out_channels list must be 4!")
self.conv1 = _conv7x7(3, 64, stride=2)
self.bn1 = _bn(64)
self.relu = P.ReLU()
self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, pad_mode="same")
self.layer1 = self._make_layer(block,
layer_nums[0],
in_channel=in_channels[0],
out_channel=out_channels[0],
stride=strides[0])
self.layer2 = self._make_layer(block,
layer_nums[1],
in_channel=in_channels[1],
out_channel=out_channels[1],
stride=strides[1])
self.layer3 = self._make_layer(block,
layer_nums[2],
in_channel=in_channels[2],
out_channel=out_channels[2],
stride=strides[2])
self.layer4 = self._make_layer(block,
layer_nums[3],
in_channel=in_channels[3],
out_channel=out_channels[3],
stride=strides[3])
def _make_layer(self, block, layer_num, in_channel, out_channel, stride):
"""
Make stage network of ResNet.
Args:
block (Cell): Resnet block.
layer_num (int): Layer number.
in_channel (int): Input channel.
out_channel (int): Output channel.
stride (int): Stride size for the first convolutional layer.
Returns:
SequentialCell, the output layer.
Examples:
>>> _make_layer(ResidualBlock, 3, 128, 256, 2)
"""
layers = []
resnet_block = block(in_channel, out_channel, stride=stride)
layers.append(resnet_block)
for _ in range(1, layer_num):
resnet_block = block(out_channel, out_channel, stride=1)
layers.append(resnet_block)
return nn.SequentialCell(layers)
def construct(self, x):
"""
Forward
"""
x = self.conv1(x)
x = self.bn1(x)
x = self.relu(x)
c1 = self.maxpool(x)
c2 = self.layer1(c1)
c3 = self.layer2(c2)
c4 = self.layer3(c3)
c5 = self.layer4(c4)
return c1, c2, c3, c4, c5
def resnet50():
"""
Get ResNet50 neural network.
Returns:
Cell, cell instance of ResNet50 neural network.
Examples:
>>> net = resnet50()
"""
return ResNet(ResidualBlock,
[3, 4, 6, 3],
[64, 256, 512, 1024],
[256, 512, 1024, 2048],
[1, 2, 2, 2])
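# Shape sketch (illustrative): for a 640x640 input the five returned feature
# maps have strides 4, 4, 8, 16, 32, i.e. spatial sizes 160, 160, 80, 40, 20.
if __name__ == "__main__":
    import numpy as np
    from mindspore import Tensor
    net = resnet50()
    outs = net(Tensor(np.zeros((1, 3, 640, 640), np.float32)))
    print([o.shape for o in outs])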

View File

@ -0,0 +1,68 @@
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""resnet extractor"""
import mindspore.nn as nn
from .resnet import resnet50
def conv_bn_relu(in_channel, out_channel, kernel_size, stride, depthwise, activation='relu6'):
output = []
output.append(nn.Conv2d(in_channel, out_channel, kernel_size, stride, pad_mode="same",
group=1 if not depthwise else in_channel))
output.append(nn.BatchNorm2d(out_channel))
if activation:
output.append(nn.get_activation(activation))
return nn.SequentialCell(output)
class ExtraLayer(nn.Cell):
"""
extra feature extractor
"""
def __init__(self, levels, res_channels, channels, kernel_size, stride):
super(ExtraLayer, self).__init__()
self.levels = levels
self.Channel_cover = conv_bn_relu(512, channels, kernel_size, 1, False)
bottom_up_cells = [
conv_bn_relu(channels, channels, kernel_size, stride, False) for x in range(self.levels)
]
self.blocks = nn.CellList(bottom_up_cells)
def construct(self, features):
"""
Forward
"""
mid_feature = self.Channel_cover(features[-1])
features = features + (self.blocks[0](mid_feature),)
features = features + (self.blocks[1](features[-1]),)
return features
class resnet50_extra(nn.Cell):
"""
ResNet with extra feature.
"""
def __init__(self):
super(resnet50_extra, self).__init__()
self.resnet = resnet50()
self.extra = ExtraLayer(2, 512, 256, 3, 2)
self.Channel_cover = conv_bn_relu(2048, 512, 3, 1, False)
def construct(self, x):
"""
Forward
"""
_, _, c3, c4, c5 = self.resnet(x)
c5 = self.Channel_cover(c5)
features = self.extra((c3, c4, c5))
return features
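# Feature-map sketch (illustrative): with a 640x640 input, (c3, c4, c5) have
# spatial sizes 80, 40, 20; the two stride-2 extra blocks append 10 and 5,
# matching config.feature_size = [80, 40, 20, 10, 5].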

View File

@ -0,0 +1,501 @@
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""SSD net based MobilenetV2."""
import mindspore.common.dtype as mstype
import mindspore as ms
import mindspore.nn as nn
from mindspore import context, Tensor
from mindspore.context import ParallelMode
from mindspore.parallel._auto_parallel_context import auto_parallel_context
from mindspore.communication.management import get_group_size
from mindspore.ops import operations as P
from mindspore.ops import functional as F
from mindspore.ops import composite as C
from .resnet_extra import resnet50_extra
def _make_divisible(v, divisor, min_value=None):
"""nsures that all layers have a channel number that is divisible by 8."""
if min_value is None:
min_value = divisor
new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
# Make sure that round down does not go down by more than 10%.
if new_v < 0.9 * v:
new_v += divisor
return new_v
def _conv2d(in_channel, out_channel, kernel_size=3, stride=1, pad_mod='same'):
return nn.Conv2d(in_channel, out_channel, kernel_size=kernel_size, stride=stride,
padding=0, pad_mode=pad_mod, has_bias=True)
def _bn(channel):
return nn.BatchNorm2d(channel, eps=1e-3, momentum=0.97,
gamma_init=1, beta_init=0, moving_mean_init=0, moving_var_init=1)
def _last_conv2d(in_channel, out_channel, kernel_size=3, stride=1, pad_mod='same', pad=0):
in_channels = in_channel
out_channels = in_channel
depthwise_conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, pad_mode='same',
padding=pad, group=in_channels)
conv = _conv2d(in_channel, out_channel, kernel_size=1)
return nn.SequentialCell([depthwise_conv, _bn(in_channel), nn.ReLU6(), conv])
class ConvBNReLU(nn.Cell):
"""
Convolution/Depthwise fused with Batchnorm and ReLU block definition.
Args:
in_planes (int): Input channel.
out_planes (int): Output channel.
kernel_size (int): Input kernel size.
stride (int): Stride size for the first convolutional layer. Default: 1.
groups (int): channel group. Convolution is 1 while Depthwise is input channel. Default: 1.
shared_conv(Cell): Use the weight shared conv, default: None.
Returns:
Tensor, output tensor.
Examples:
>>> ConvBNReLU(16, 256, kernel_size=1, stride=1, groups=1)
"""
def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1, shared_conv=None):
super(ConvBNReLU, self).__init__()
padding = 0
in_channels = in_planes
out_channels = out_planes
if shared_conv is None:
if groups == 1:
conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, pad_mode='same', padding=padding)
else:
out_channels = in_planes
conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, pad_mode='same',
padding=padding, group=in_channels)
layers = [conv, _bn(out_planes), nn.ReLU6()]
else:
layers = [shared_conv, _bn(out_planes), nn.ReLU6()]
self.features = nn.SequentialCell(layers)
def construct(self, x):
output = self.features(x)
return output
class InvertedResidual(nn.Cell):
"""
Residual block definition.
Args:
inp (int): Input channel.
oup (int): Output channel.
stride (int): Stride size for the first convolutional layer. Default: 1.
expand_ratio (int): expand ratio of input channel
Returns:
Tensor, output tensor.
Examples:
>>> ResidualBlock(3, 256, 1, 1)
"""
def __init__(self, inp, oup, stride, expand_ratio, last_relu=False):
super(InvertedResidual, self).__init__()
assert stride in [1, 2]
hidden_dim = int(round(inp * expand_ratio))
self.use_res_connect = stride == 1 and inp == oup
layers = []
if expand_ratio != 1:
layers.append(ConvBNReLU(inp, hidden_dim, kernel_size=1))
layers.extend([
# dw
ConvBNReLU(hidden_dim, hidden_dim, stride=stride, groups=hidden_dim),
# pw-linear
nn.Conv2d(hidden_dim, oup, kernel_size=1, stride=1, has_bias=False),
_bn(oup),
])
self.conv = nn.SequentialCell(layers)
self.add = P.Add()
self.cast = P.Cast()
self.last_relu = last_relu
self.relu = nn.ReLU6()
def construct(self, x):
identity = x
x = self.conv(x)
if self.use_res_connect:
x = self.add(identity, x)
if self.last_relu:
x = self.relu(x)
return x
class FlattenConcat(nn.Cell):
"""
Concatenate predictions into a single tensor.
Args:
config (dict): The default config of SSD.
Returns:
Tensor, flatten predictions.
"""
def __init__(self, config):
super(FlattenConcat, self).__init__()
self.num_ssd_boxes = config.num_ssd_boxes
self.concat = P.Concat(axis=1)
self.transpose = P.Transpose()
def construct(self, inputs):
output = ()
batch_size = F.shape(inputs[0])[0]
for x in inputs:
x = self.transpose(x, (0, 2, 3, 1))
output += (F.reshape(x, (batch_size, -1)),)
res = self.concat(output)
return F.reshape(res, (batch_size, self.num_ssd_boxes, -1))
class MultiBox(nn.Cell):
"""
Multibox conv layers. Each multibox layer contains class conf scores and localization predictions.
Args:
config (dict): The default config of SSD.
Returns:
Tensor, localization predictions.
Tensor, class conf scores.
"""
def __init__(self, config):
super(MultiBox, self).__init__()
num_classes = config.num_classes
out_channels = config.extras_out_channels
num_default = config.num_default
loc_layers = []
cls_layers = []
for k, out_channel in enumerate(out_channels):
loc_layers += [_last_conv2d(out_channel, 4 * num_default[k],
kernel_size=3, stride=1, pad_mod='same', pad=0)]
cls_layers += [_last_conv2d(out_channel, num_classes * num_default[k],
kernel_size=3, stride=1, pad_mod='same', pad=0)]
self.multi_loc_layers = nn.layer.CellList(loc_layers)
self.multi_cls_layers = nn.layer.CellList(cls_layers)
self.flatten_concat = FlattenConcat(config)
def construct(self, inputs):
loc_outputs = ()
cls_outputs = ()
for i in range(len(self.multi_loc_layers)):
loc_outputs += (self.multi_loc_layers[i](inputs[i]),)
cls_outputs += (self.multi_cls_layers[i](inputs[i]),)
return self.flatten_concat(loc_outputs), self.flatten_concat(cls_outputs)
class WeightSharedMultiBox(nn.Cell):
"""
Weight shared Multi-box conv layers. Each multi-box layer contains class conf scores and localization predictions.
All box predictors share the same conv weights across different feature maps.
Args:
config (dict): The default config of SSD.
loc_cls_shared_addition(bool): Whether the location predictor and classifier prediction share the
same addition layer.
Returns:
Tensor, localization predictions.
Tensor, class conf scores.
"""
def __init__(self, config, loc_cls_shared_addition=False):
super(WeightSharedMultiBox, self).__init__()
num_classes = config.num_classes
out_channels = config.extras_out_channels[0]
num_default = config.num_default[0]
num_features = len(config.feature_size)
num_addition_layers = config.num_addition_layers
self.loc_cls_shared_addition = loc_cls_shared_addition
if not loc_cls_shared_addition:
loc_convs = [
_conv2d(out_channels, out_channels, 3, 1) for x in range(num_addition_layers)
]
cls_convs = [
_conv2d(out_channels, out_channels, 3, 1) for x in range(num_addition_layers)
]
addition_loc_layer_list = []
addition_cls_layer_list = []
for _ in range(num_features):
addition_loc_layer = [
ConvBNReLU(out_channels, out_channels, 3, 1, 1, loc_convs[x]) for x in range(num_addition_layers)
]
addition_cls_layer = [
ConvBNReLU(out_channels, out_channels, 3, 1, 1, cls_convs[x]) for x in range(num_addition_layers)
]
addition_loc_layer_list.append(nn.SequentialCell(addition_loc_layer))
addition_cls_layer_list.append(nn.SequentialCell(addition_cls_layer))
self.addition_layer_loc = nn.CellList(addition_loc_layer_list)
self.addition_layer_cls = nn.CellList(addition_cls_layer_list)
else:
convs = [
_conv2d(out_channels, out_channels, 3, 1) for x in range(num_addition_layers)
]
addition_layer_list = []
for _ in range(num_features):
addition_layers = [
ConvBNReLU(out_channels, out_channels, 3, 1, 1, convs[x]) for x in range(num_addition_layers)
]
addition_layer_list.append(nn.SequentialCell(addition_layers))
self.addition_layer = nn.SequentialCell(addition_layer_list)
loc_layers = [_conv2d(out_channels, 4 * num_default,
kernel_size=3, stride=1, pad_mod='same')]
cls_layers = [_conv2d(out_channels, num_classes * num_default,
kernel_size=3, stride=1, pad_mod='same')]
self.loc_layers = nn.SequentialCell(loc_layers)
self.cls_layers = nn.SequentialCell(cls_layers)
self.flatten_concat = FlattenConcat(config)
def construct(self, inputs):
"""
Forward
"""
loc_outputs = ()
cls_outputs = ()
num_heads = len(inputs)
for i in range(num_heads):
if self.loc_cls_shared_addition:
features = self.addition_layer[i](inputs[i])
loc_outputs += (self.loc_layers(features),)
cls_outputs += (self.cls_layers(features),)
else:
features = self.addition_layer_loc[i](inputs[i])
loc_outputs += (self.loc_layers(features),)
features = self.addition_layer_cls[i](inputs[i])
cls_outputs += (self.cls_layers(features),)
return self.flatten_concat(loc_outputs), self.flatten_concat(cls_outputs)
class SsdResNet50(nn.Cell):
"""
SSD Network using ResNet50 to extract features
Args:
config (dict): The default config of SSD.
Returns:
Tensor, localization predictions.
Tensor, class conf scores.
Examples:
>>> SsdResNet50(config)
"""
def __init__(self, config):
super(SsdResNet50, self).__init__()
self.multi_box = MultiBox(config)
self.activation = P.Sigmoid()
self.feature_extractor = resnet50_extra()
def construct(self, x):
"""
Forward
"""
features = self.feature_extractor(x)
pred_loc, pred_label = self.multi_box(features)
if not self.training:
pred_label = self.activation(pred_label)
pred_loc = F.cast(pred_loc, mstype.float32)
pred_label = F.cast(pred_label, mstype.float32)
return pred_loc, pred_label
class SigmoidFocalClassificationLoss(nn.Cell):
""""
Sigmoid focal-loss for classification.
Args:
gamma (float): Hyper-parameter to balance the easy and hard examples. Default: 2.0
alpha (float): Hyper-parameter to balance the positive and negative example. Default: 0.25
Returns:
Tensor, the focal loss.
"""
def __init__(self, gamma=2.0, alpha=0.25):
super(SigmoidFocalClassificationLoss, self).__init__()
self.sigmoid_cross_entropy = P.SigmoidCrossEntropyWithLogits()
self.sigmoid = P.Sigmoid()
self.pow = P.Pow()
self.onehot = P.OneHot()
self.on_value = Tensor(1.0, mstype.float32)
self.off_value = Tensor(0.0, mstype.float32)
self.gamma = gamma
self.alpha = alpha
def construct(self, logits, label):
"""
Forward
"""
label = self.onehot(label, F.shape(logits)[-1], self.on_value, self.off_value)
sigmoid_cross_entropy = self.sigmoid_cross_entropy(logits, label)
sigmoid = self.sigmoid(logits)
label = F.cast(label, mstype.float32)
p_t = label * sigmoid + (1 - label) * (1 - sigmoid)
modulating_factor = self.pow(1 - p_t, self.gamma)
alpha_weight_factor = label * self.alpha + (1 - label) * (1 - self.alpha)
focal_loss = modulating_factor * alpha_weight_factor * sigmoid_cross_entropy
return focal_loss
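# The focal term above follows Lin et al. (2017): with one-hot label y and
# predicted probability p, the loss is alpha_t * (1 - p_t)^gamma * BCE, where
# p_t = y * p + (1 - y) * (1 - p) and alpha_t = y * alpha + (1 - y) * (1 - alpha).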
class SSDWithLossCell(nn.Cell):
""""
Provide SSD training loss through network.
Args:
network (Cell): The training network.
config (dict): SSD config.
Returns:
Tensor, the loss of the network.
"""
def __init__(self, network, config):
super(SSDWithLossCell, self).__init__()
self.network = network
self.less = P.Less()
self.tile = P.Tile()
self.reduce_sum = P.ReduceSum()
self.expand_dims = P.ExpandDims()
self.class_loss = SigmoidFocalClassificationLoss(config.gamma, config.alpha)
self.loc_loss = nn.SmoothL1Loss()
def construct(self, x, gt_loc, gt_label, num_matched_boxes):
"""
Forward
"""
pred_loc, pred_label = self.network(x)
mask = F.cast(self.less(0, gt_label), mstype.float32)
num_matched_boxes = self.reduce_sum(F.cast(num_matched_boxes, mstype.float32))
# Localization Loss
mask_loc = self.tile(self.expand_dims(mask, -1), (1, 1, 4))
smooth_l1 = self.loc_loss(pred_loc, gt_loc) * mask_loc
loss_loc = self.reduce_sum(self.reduce_sum(smooth_l1, -1), -1)
# Classification Loss
loss_cls = self.class_loss(pred_label, gt_label)
loss_cls = self.reduce_sum(loss_cls, (1, 2))
return self.reduce_sum((loss_cls + loss_loc) / num_matched_boxes)
grad_scale = C.MultitypeFuncGraph("grad_scale")
@grad_scale.register("Tensor", "Tensor")
def tensor_grad_scale(scale, grad):
return grad * P.Reciprocal()(scale)
class TrainingWrapper(nn.Cell):
"""
Encapsulation class of SSD network training.
Append an optimizer to the training network; afterwards, the construct
function can be called to create the backward graph.
Args:
network (Cell): The training network. Note that loss function should have been added.
optimizer (Optimizer): Optimizer for updating the weights.
sens (Number): The adjust parameter. Default: 1.0.
use_global_norm (bool): Whether to apply global norm before the optimizer. Default: False.
"""
def __init__(self, network, optimizer, sens=1.0, use_global_norm=False):
super(TrainingWrapper, self).__init__(auto_prefix=False)
self.network = network
self.network.set_grad()
self.weights = ms.ParameterTuple(network.trainable_params())
self.optimizer = optimizer
self.grad = C.GradOperation(get_by_list=True, sens_param=True)
self.sens = sens
self.reducer_flag = False
self.grad_reducer = None
self.use_global_norm = use_global_norm
self.parallel_mode = context.get_auto_parallel_context("parallel_mode")
if self.parallel_mode in [ParallelMode.DATA_PARALLEL, ParallelMode.HYBRID_PARALLEL]:
self.reducer_flag = True
if self.reducer_flag:
mean = context.get_auto_parallel_context("gradients_mean")
if auto_parallel_context().get_device_num_is_set():
degree = context.get_auto_parallel_context("device_num")
else:
degree = get_group_size()
self.grad_reducer = nn.DistributedGradReducer(optimizer.parameters, mean, degree)
self.hyper_map = C.HyperMap()
def construct(self, *args):
"""
Forward
"""
weights = self.weights
loss = self.network(*args)
sens = P.Fill()(P.DType()(loss), P.Shape()(loss), self.sens)
grads = self.grad(self.network, weights)(*args, sens)
if self.reducer_flag:
# apply grad reducer on grads
grads = self.grad_reducer(grads)
if self.use_global_norm:
grads = self.hyper_map(F.partial(grad_scale, F.scalar_to_array(self.sens)), grads)
grads = C.clip_by_global_norm(grads)
return F.depend(loss, self.optimizer(grads))
class SsdInferWithDecoder(nn.Cell):
"""
SSD Infer wrapper to decode the bbox locations.
Args:
network (Cell): the origin ssd infer network without bbox decoder.
default_boxes (Tensor): the default_boxes from anchor generator
config (dict): ssd config
Returns:
Tensor, the decoded bbox locations in (ymin, xmin, ymax, xmax) form.
Tensor, the prediction labels.
"""
def __init__(self, network, default_boxes, config):
super(SsdInferWithDecoder, self).__init__()
self.network = network
self.default_boxes = default_boxes
self.prior_scaling_xy = config.prior_scaling[0]
self.prior_scaling_wh = config.prior_scaling[1]
def construct(self, x):
"""
Forward
"""
pred_loc, pred_label = self.network(x)
default_bbox_xy = self.default_boxes[..., :2]
default_bbox_wh = self.default_boxes[..., 2:]
pred_xy = pred_loc[..., :2] * self.prior_scaling_xy * default_bbox_wh + default_bbox_xy
pred_wh = P.Exp()(pred_loc[..., 2:] * self.prior_scaling_wh) * default_bbox_wh
pred_xy_0 = pred_xy - pred_wh / 2.0
pred_xy_1 = pred_xy + pred_wh / 2.0
pred_xy = P.Concat(-1)((pred_xy_0, pred_xy_1))
pred_xy = P.Maximum()(pred_xy, 0)
pred_xy = P.Minimum()(pred_xy, 1)
return pred_xy, pred_label
def ssd_resnet50(**kwargs):
return SsdResNet50(**kwargs)
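# Eval-network sketch (illustrative):
# from src.config import config
# from src.box_utils import default_boxes
# eval_net = SsdInferWithDecoder(ssd_resnet50(config=config),
#                                Tensor(default_boxes, mstype.float32), config)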

View File

@ -0,0 +1,160 @@
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Train SSD and get checkpoint files."""
import argparse
import ast
import mindspore.nn as nn
from mindspore import context, Tensor
from mindspore.communication.management import init, get_rank
from mindspore.train.callback import CheckpointConfig, ModelCheckpoint, LossMonitor, TimeMonitor
from mindspore.train import Model
from mindspore.context import ParallelMode
from mindspore.train.serialization import load_checkpoint, load_param_into_net
from mindspore.common import set_seed, dtype
from src.ssd import SSDWithLossCell, TrainingWrapper, ssd_resnet50
from src.config import config
from src.dataset import create_ssd_dataset, create_mindrecord
from src.lr_schedule import get_lr
from src.init_params import init_net_param, filter_checkpoint_parameter_by_list
set_seed(1)
def get_args():
"""
Get command-line arguments.
"""
parser = argparse.ArgumentParser(description="SSD training")
parser.add_argument("--run_platform", type=str, default="Ascend", choices=("Ascend", "GPU", "CPU"),
help="run platform, support Ascend, GPU and CPU.")
parser.add_argument("--only_create_dataset", type=ast.literal_eval, default=False,
help="If set it true, only create Mindrecord, default is False.")
parser.add_argument("--distribute", type=ast.literal_eval, default=False,
help="Run distribute, default is False.")
parser.add_argument("--device_id", type=int, default=0, help="Device id, default is 0.")
parser.add_argument("--device_num", type=int, default=1, help="Use device nums, default is 1.")
parser.add_argument("--lr", type=float, default=0.05, help="Learning rate, default is 0.05.")
parser.add_argument("--mode", type=str, default="sink", help="Run sink mode or not, default is sink.")
parser.add_argument("--dataset", type=str, default="coco", help="Dataset, default is coco.")
parser.add_argument("--epoch_size", type=int, default=500, help="Epoch size, default is 500.")
parser.add_argument("--batch_size", type=int, default=32, help="Batch size, default is 32.")
parser.add_argument("--pre_trained", type=str, default=None, help="Pretrained Checkpoint file path.")
parser.add_argument("--pre_trained_epoch_size", type=int, default=0, help="Pretrained epoch size.")
parser.add_argument("--save_checkpoint_epochs", type=int, default=10, help="Save checkpoint epochs, default is 10.")
parser.add_argument("--loss_scale", type=int, default=1024, help="Loss scale, default is 1024.")
parser.add_argument("--filter_weight", type=ast.literal_eval, default=False,
help="Filter head weight parameters, default is False.")
parser.add_argument('--freeze_layer', type=str, default="none", choices=["none", "backbone"],
help="freeze the weights of network, support freeze the backbone's weights, "
"default is not freezing.")
args_opt = parser.parse_args()
return args_opt
def ssd_model_build(args_opt):
"""
Build the SSD model.
"""
if config.model == "ssd_resnet50":
ssd = ssd_resnet50(config=config)
init_net_param(ssd)
if config.feature_extractor_base_param != "":
param_dict = load_checkpoint(config.feature_extractor_base_param)
for x in list(param_dict.keys()):
param_dict["network.feature_extractor.resnet." + x] = param_dict[x]
del param_dict[x]
load_param_into_net(ssd.feature_extractor.resnet, param_dict)
else:
raise ValueError(f'config.model: {config.model} is not supported')
return ssd
def main():
args_opt = get_args()
rank = 0
device_num = 1
if args_opt.run_platform == "CPU":
context.set_context(mode=context.GRAPH_MODE, device_target="CPU")
else:
context.set_context(mode=context.GRAPH_MODE, device_target=args_opt.run_platform, device_id=args_opt.device_id)
if args_opt.distribute:
device_num = args_opt.device_num
context.reset_auto_parallel_context()
context.set_auto_parallel_context(parallel_mode=ParallelMode.DATA_PARALLEL, gradients_mean=True,
device_num=device_num)
init()
context.set_auto_parallel_context(all_reduce_fusion_config=[29, 58, 89])
rank = get_rank()
mindrecord_file = create_mindrecord(args_opt.dataset, "ssd.mindrecord", True)
if args_opt.only_create_dataset:
return
loss_scale = float(args_opt.loss_scale)
if args_opt.run_platform == "CPU":
loss_scale = 1.0
# When creating MindDataset, use the first mindrecord file, such as ssd.mindrecord0.
use_multiprocessing = (args_opt.run_platform != "CPU")
dataset = create_ssd_dataset(mindrecord_file, repeat_num=1, batch_size=args_opt.batch_size,
device_num=device_num, rank=rank, use_multiprocessing=use_multiprocessing)
dataset_size = dataset.get_dataset_size()
print(f"Create dataset done! dataset size is {dataset_size}")
ssd = ssd_model_build(args_opt)
print("finish ssd model building ...............")
if ("use_float16" in config and config.use_float16) or args_opt.run_platform == "GPU":
ssd.to_float(dtype.float16)
net = SSDWithLossCell(ssd, config)
# checkpoint
ckpt_config = CheckpointConfig(save_checkpoint_steps=dataset_size * args_opt.save_checkpoint_epochs)
save_ckpt_path = './ckpt_' + config.model + '_' + str(rank) + '/'
ckpoint_cb = ModelCheckpoint(prefix="ssd", directory=save_ckpt_path, config=ckpt_config)
if args_opt.pre_trained:
param_dict = load_checkpoint(args_opt.pre_trained)
if args_opt.filter_weight:
filter_checkpoint_parameter_by_list(param_dict, config.checkpoint_filter_list)
load_param_into_net(net, param_dict, True)
lr = Tensor(get_lr(global_step=args_opt.pre_trained_epoch_size * dataset_size,
lr_init=config.lr_init, lr_end=config.lr_end_rate * args_opt.lr, lr_max=args_opt.lr,
warmup_epochs=config.warmup_epochs,
total_epochs=args_opt.epoch_size,
steps_per_epoch=dataset_size))
if "use_global_norm" in config and config.use_global_norm:
opt = nn.Momentum(filter(lambda x: x.requires_grad, net.get_parameters()), lr,
config.momentum, config.weight_decay, 1.0)
net = TrainingWrapper(net, opt, loss_scale, True)
else:
opt = nn.Momentum(filter(lambda x: x.requires_grad, net.get_parameters()), lr,
config.momentum, config.weight_decay, loss_scale)
net = TrainingWrapper(net, opt, loss_scale)
callback = [TimeMonitor(data_size=dataset_size), LossMonitor(), ckpoint_cb]
model = Model(net)
dataset_sink_mode = False
if args_opt.mode == "sink" and args_opt.run_platform != "CPU":
print("In sink mode, one epoch return a loss.")
dataset_sink_mode = True
print("Start train SSD, the first epoch will be slower because of the graph compilation.")
model.train(args_opt.epoch_size, dataset, callbacks=callback, dataset_sink_mode=dataset_sink_mode)
if __name__ == '__main__':
main()
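# Example launches (illustrative):
#   python train.py --run_platform Ascend --dataset coco --epoch_size 500 --batch_size 32
#   python train.py --distribute True --device_num 8 --lr 0.05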