forked from mindspore-Ecosystem/mindspore
!21946 [Ascend Zhongzhi (昇腾众智)] RetinaFace network: Zhongzhi RetinaFace_ResNet50 and GPU call-for-contributions RetinaFace_MobileNet0.25
Merge pull request !21946 from yexijoe/RetinaFace_ResNet50
commit 1b4f0bd033
@ -0,0 +1,451 @@
# Contents

<!-- TOC -->

- [Contents](#contents)
- [RetinaFace Description](#retinaface-description)
- [Pretrained Model](#pretrained-model)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
    - [Scripts and Sample Code](#scripts-and-sample-code)
    - [Script Parameters](#script-parameters)
    - [Training Process](#training-process)
        - [Usage](#usage)
        - [Distributed Training](#distributed-training)
    - [Evaluation Process](#evaluation-process)
        - [Evaluation](#evaluation)
    - [Export Process](#export-process)
        - [Export](#export)
    - [Inference Process](#inference-process)
        - [Inference](#inference)
- [Model Description](#model-description)
    - [Performance](#performance)
        - [Evaluation Performance](#evaluation-performance)
            - [RetinaFace on WIDERFACE](#retinaface-on-widerface)
        - [Inference Performance](#inference-performance)
            - [RetinaFace on WIDERFACE](#retinaface-on-widerface)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)

<!-- /TOC -->
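
# RetinaFace Description

RetinaFace is a face detection model proposed in 2019 that achieved the best results on the WIDER FACE dataset at the time. The RetinaFace paper is "RetinaFace: Single-stage Dense Face Localisation in the Wild". Compared with S3FD and MTCNN, RetinaFace significantly improves the recall rate for small faces, but is not well suited to multi-scale face detection. To address this, RetinaFace uses a feature pyramid structure (adopted from RetinaNet) to fuse features across scales and adds an SSH module.

[Paper](https://arxiv.org/abs/1905.00641v2): Jiankang Deng, Jia Guo, Yuxiang Zhou, Jinke Yu, Irene Kotsia, Stefanos Zafeiriou. "RetinaFace: Single-stage Dense Face Localisation in the Wild". 2019.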
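
# Pretrained Model

RetinaFace can use either a ResNet50 or a MobileNet0.25 backbone to extract image features for detection. When ResNet50 is the backbone, use ./src/resnet.py as the model file, then take the ResNet50 training script from ModelZoo (with its default parameter configuration) and train it on ImageNet2012 to obtain the pretrained ResNet50 model.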
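
# Model Architecture

RetinaFace is built on RetinaNet: it adopts RetinaNet's feature pyramid structure and adds the SSH structure. Besides the conventional detection branch, the network adds a landmark (keypoint) prediction branch and a self-supervised branch. Results show that both branches improve model performance. The self-supervised branch is not covered here.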

# Dataset

Dataset used: [WIDERFACE](<http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/WiderFace_Results.html>)

To obtain the dataset:

1. Click [here](<https://github.com/peteryuX/retinaface-tf2>) to get the dataset and annotations.
2. Click [here](<https://github.com/peteryuX/retinaface-tf2/tree/master/widerface_evaluate/ground_truth>) to get the ground-truth labels used for evaluation.

- Dataset size: 3.42 GB, 32,203 color images
    - Training set: 1.36 GB, 12,800 images
    - Validation set: 345.95 MB, 3,226 images
    - Test set: 1.72 GB, 16,177 images

- The dataset directory structure is as follows:

```bash
├── data/
    ├── widerface/
        ├── ground_truth/
        │   ├──wider_easy_val.mat
        │   ├──wider_face_val.mat
        │   ├──wider_hard_val.mat
        │   ├──wider_medium_val.mat
        ├── train/
        │   ├──images/
        │   │   ├──0--Parade/
        │   │   │   ├──0_Parade_marchingband_1_5.jpg
        │   │   │   ├──...
        │   │   ├──.../
        │   ├──label.txt
        ├── val/
        │   ├──images/
        │   │   ├──0--Parade/
        │   │   │   ├──0_Parade_marchingband_1_20.jpg
        │   │   │   ├──...
        │   │   ├──.../
        │   ├──label.txt
```
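
Before training, it can help to confirm that the dataset was unpacked into the layout shown above. The following is a minimal sketch and not part of the repository: the helper name `missing_widerface_entries` and the `EXPECTED` list are hypothetical, and the entries are simply read off the directory tree.

```python
from pathlib import Path

# Hypothetical checklist, copied from the directory tree in this README.
EXPECTED = [
    "ground_truth/wider_easy_val.mat",
    "ground_truth/wider_face_val.mat",
    "ground_truth/wider_hard_val.mat",
    "ground_truth/wider_medium_val.mat",
    "train/images",
    "train/label.txt",
    "val/images",
    "val/label.txt",
]


def missing_widerface_entries(root):
    """Return the expected files/directories that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

Calling `missing_widerface_entries("data/widerface")` returns an empty list when every expected entry is present, and otherwise lists what is missing.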

# Environment Requirements

- Hardware (Ascend, GPU)
    - Use Ascend to set up the hardware environment when ResNet50 is the backbone.
    - Use GPU to set up the hardware environment when MobileNet0.25 is the backbone.
- Framework
    - [MindSpore](https://www.mindspore.cn/install)
- For more information, see the following resources:
    - [MindSpore Tutorials](https://www.mindspore.cn/tutorials/zh-CN/master/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/docs/api/zh-CN/master/index.html)

# Quick Start

After installing MindSpore from the official website, you can train and evaluate the model as follows:

- Running on Ascend (ResNet50 backbone)

```bash
# Training example
python train.py --backbone_name 'ResNet50' > train.log 2>&1 &
# OR
bash ./scripts/run_standalone_train_ascend.sh

# Distributed training example
bash ./scripts/run_distribution_train_ascend.sh [RANK_TABLE_FILE]

# Evaluation example
python eval.py --backbone_name 'ResNet50' --val_model [CKPT_FILE] > ./eval.log 2>&1 &
# OR
bash ./scripts/run_standalone_eval_ascend.sh './train_parallel3/checkpoint/ckpt_3/RetinaFace-56_201.ckpt'

# Inference example
bash run_infer_310.sh ../retinaface.mindir /home/dataset/widerface/val/ 0
```

- Running on GPU (MobileNet0.25 backbone)

```bash
# Training example
export CUDA_VISIBLE_DEVICES=0
python train.py --backbone_name 'MobileNet025' > train.log 2>&1 &

# Distributed training example
bash scripts/run_distribution_train_gpu.sh 2 0,1

# Evaluation example
export CUDA_VISIBLE_DEVICES=0
python eval.py --backbone_name 'MobileNet025' --val_model [CKPT_FILE] > eval.log 2>&1 &
# OR
bash scripts/run_standalone_eval_gpu.sh 0 './checkpoint/ckpt_0/RetinaFace-117_804.ckpt'
```

# Script Description

## Scripts and Sample Code

```bash
├── model_zoo
    ├── README.md                                  // descriptions of all models
    ├── retinaface
        ├── README_CN.md                           // RetinaFace description
        ├── ascend310_infer                        // source code for Ascend 310 inference
        ├── scripts
        │   ├──run_distribution_train_ascend.sh    // shell script for distributed training on Ascend
        │   ├──run_distribution_train_gpu.sh       // shell script for distributed training on GPU
        │   ├──run_infer_310.sh                    // shell script for Ascend inference (ResNet50 backbone)
        │   ├──run_standalone_eval_ascend.sh       // shell script for evaluation on Ascend
        │   ├──run_standalone_eval_gpu.sh          // shell script for evaluation on GPU
        │   ├──run_standalone_train_ascend.sh      // shell script for single-device training on Ascend
        ├── src
        │   ├──augmentation.py                     // data augmentation
        │   ├──config.py                           // parameter configuration
        │   ├──dataset.py                          // dataset creation
        │   ├──loss.py                             // loss function
        │   ├──lr_schedule.py                      // learning-rate decay schedule
        │   ├──network_with_mobilenet.py           // RetinaFace architecture with a MobileNet0.25 backbone
        │   ├──network_with_resnet.py              // RetinaFace architecture with a ResNet50 backbone
        │   ├──resnet.py                           // ResNet50 architecture used for pretraining the ResNet50 backbone
        │   ├──utils.py                            // data preprocessing
        ├── data
        │   ├──widerface                           // dataset
        │   ├──resnet-90_625.ckpt                  // ResNet50 model pretrained on ImageNet
        │   ├──ground_truth                        // evaluation labels
        ├── eval.py                                // evaluation script
        ├── export.py                              // exports a checkpoint file to air/mindir (ResNet50 backbone)
        ├── postprocess.py                         // post-processing script for Ascend 310 inference
        ├── preprocess.py                          // pre-processing script for Ascend 310 inference
        ├── train.py                               // training script
```

## Script Parameters

Training and evaluation parameters can both be configured in config.py.

- Configuration for RetinaFace with a ResNet50 backbone on the WIDER FACE dataset

```python
    'variance': [0.1, 0.2],                                   # variance
    'clip': False,                                            # whether to clip
    'loc_weight': 2.0,                                        # bounding-box regression loss weight
    'class_weight': 1.0,                                      # confidence/classification loss weight
    'landm_weight': 1.0,                                      # landmark regression loss weight
    'batch_size': 8,                                          # training batch size
    'num_workers': 16,                                        # number of threads used to load the dataset
    'num_anchor': 29126,                                      # number of anchor boxes; depends on the image size
    'nnpu': 8,                                                # number of NPUs used for training
    'image_size': 840,                                        # training image size
    'match_thresh': 0.35,                                     # threshold for matching boxes
    'optim': 'sgd',                                           # optimizer type
    'momentum': 0.9,                                          # optimizer momentum
    'weight_decay': 1e-4,                                     # optimizer weight decay
    'epoch': 60,                                              # number of training epochs
    'decay1': 20,                                             # epoch of the first learning-rate decay
    'decay2': 40,                                             # epoch of the second learning-rate decay
    'initial_lr': 0.04,                                       # initial learning rate; set to 0.04 for 8-device parallel training
    'warmup_epoch': -1,                                       # number of warm-up epochs; -1 means no warm-up
    'gamma': 0.1,                                             # learning-rate decay factor
    'ckpt_path': './checkpoint/',                             # path for saving the model
    'keep_checkpoint_max': 8,                                 # maximum number of checkpoints to keep
    'resume_net': None,                                       # network to resume from; default is None
    'training_dataset': '../data/widerface/train/label.txt',  # path to the training dataset labels
    'pretrain': True,                                         # whether to train from a pretrained backbone
    'pretrain_path': '../data/resnet-90_625.ckpt',            # path to the pretrained backbone checkpoint
    # Validation
    'val_model': './train_parallel3/checkpoint/ckpt_3/RetinaFace-56_201.ckpt',    # path of the model to validate
    'val_dataset_folder': './data/widerface/val/',            # path of the validation dataset
    'val_origin_size': True,                                  # whether to validate at the original image size
    'val_confidence_threshold': 0.02,                         # confidence threshold for validation
    'val_nms_threshold': 0.4,                                 # NMS threshold for validation
    'val_iou_threshold': 0.5,                                 # IoU threshold for validation
    'val_save_result': False,                                 # whether to save results
    'val_predict_save_folder': './widerface_result',          # path for saving results
    'val_gt_dir': './data/ground_truth/',                     # path to the validation ground truth
    # Inference
    'infer_dataset_folder': '/home/dataset/widerface/val/',   # validation dataset path for Ascend 310 inference
    'infer_gt_dir': '/home/dataset/widerface/ground_truth/',  # validation ground-truth path for Ascend 310 inference
```
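
The `initial_lr`, `decay1`, `decay2`, `gamma` and `warmup_epoch` entries together describe a step-decay schedule. The sketch below shows one plausible reading of those parameters; it is not the code in src/lr_schedule.py, which may differ (for example by working per step rather than per epoch). The function name `multistep_lr`, the 0-based epoch convention and the linear warm-up shape are all assumptions made for illustration.

```python
def multistep_lr(epoch, initial_lr=0.04, decay1=20, decay2=40, gamma=0.1, warmup_epoch=-1):
    """Learning rate for a 0-based epoch under a two-milestone step decay.

    Illustrative only: defaults mirror the ResNet50 configuration shown in
    this README; the warm-up shape is an assumption.
    """
    if warmup_epoch > 0 and epoch < warmup_epoch:
        # Assumed linear warm-up that reaches initial_lr at the last warm-up epoch.
        return initial_lr * (epoch + 1) / warmup_epoch
    if epoch < decay1:
        return initial_lr
    if epoch < decay2:
        return initial_lr * gamma
    return initial_lr * gamma * gamma
```

Under these assumptions the ResNet50 settings give 0.04 up to epoch 19, 0.004 from epoch 20 and 0.0004 from epoch 40.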

- Configuration for RetinaFace with a MobileNet0.25 backbone on the WIDER FACE dataset

```python
    'name': 'MobileNet025',                                   # backbone name
    'variance': [0.1, 0.2],                                   # variance
    'clip': False,                                            # whether to clip
    'loc_weight': 2.0,                                        # bounding-box regression loss weight
    'class_weight': 1.0,                                      # confidence/classification loss weight
    'landm_weight': 1.0,                                      # landmark regression loss weight
    'batch_size': 8,                                          # training batch size
    'num_workers': 12,                                        # number of threads used to load the dataset
    'num_anchor': 16800,                                      # number of anchor boxes; depends on the image size
    'ngpu': 2,                                                # number of GPUs used for training
    'epoch': 120,                                             # number of training epochs
    'decay1': 70,                                             # epoch of the first learning-rate decay
    'decay2': 90,                                             # epoch of the second learning-rate decay
    'image_size': 640,                                        # training image size
    'match_thresh': 0.35,                                     # threshold for matching boxes
    'optim': 'sgd',                                           # optimizer type
    'momentum': 0.9,                                          # optimizer momentum
    'weight_decay': 5e-4,                                     # optimizer weight decay
    'initial_lr': 0.02,                                       # learning rate
    'warmup_epoch': 5,                                        # number of warm-up epochs; -1 means no warm-up
    'gamma': 0.1,                                             # learning-rate decay factor
    'ckpt_path': './checkpoint/',                             # path for saving the model
    'save_checkpoint_steps': 2000,                            # number of steps between checkpoints
    'keep_checkpoint_max': 3,                                 # maximum number of checkpoints to keep
    'resume_net': None,                                       # network to resume from; default is None
    'training_dataset': '',                                   # path to the training dataset labels, e.g. data/widerface/train/label.txt
    'pretrain': False,                                        # whether to train from a pretrained backbone
    'pretrain_path': './data/mobilenetv1-90_5004.ckpt',       # path to the pretrained backbone checkpoint
    # Validation
    'val_model': './checkpoint/ckpt_0/RetinaFace-117_804.ckpt',    # path of the model to validate
    'val_dataset_folder': './data/widerface/val/',            # path of the validation dataset
    'val_origin_size': False,                                 # whether to validate at the original image size
    'val_confidence_threshold': 0.02,                         # confidence threshold for validation
    'val_nms_threshold': 0.4,                                 # NMS threshold for validation
    'val_iou_threshold': 0.5,                                 # IoU threshold for validation
    'val_save_result': False,                                 # whether to save results
    'val_predict_save_folder': './widerface_result',          # path for saving results
    'val_gt_dir': './data/ground_truth/',                     # path to the validation ground truth
```
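
The `num_anchor` comment says the value depends on the image size. A small calculation makes that dependency concrete. It assumes three feature-map strides (8, 16, 32) and two anchors per feature-map cell, which is the usual RetinaFace setup but is not stated in this README; the function name `num_anchors` is hypothetical. Under those assumptions the formula reproduces both configured values.

```python
import math


def num_anchors(image_size, strides=(8, 16, 32), anchors_per_cell=2):
    """Count anchors for a square input: one grid per stride, a fixed number of anchors per cell.

    The strides and anchors_per_cell defaults are assumptions, not values from config.py.
    """
    return sum(math.ceil(image_size / s) ** 2 * anchors_per_cell for s in strides)


print(num_anchors(840))  # 29126, the ResNet50 configuration
print(num_anchors(640))  # 16800, the MobileNet0.25 configuration
```

If `image_size` is changed, `num_anchor` would need to be recomputed in the same way.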

## Training Process

### Usage

- Running on Ascend (ResNet50 backbone)

```bash
python train.py --backbone_name 'ResNet50' > train.log 2>&1 &
# OR
bash ./scripts/run_standalone_train_ascend.sh
```

The python command above runs in the background; check the results in `train.log`.

After training, the loss values look like this:

```bash
epoch: 7 step: 1609, loss is 5.327434
epoch time: 466281.709 ms, per step time: 289.796 ms
epoch: 8 step: 1609, loss is 4.7512465
epoch time: 466995.237 ms, per step time: 290.239 ms
```

- Running on GPU (MobileNet0.25 backbone)

```bash
export CUDA_VISIBLE_DEVICES=0
python train.py --backbone_name 'MobileNet025' > train.log 2>&1 &
```

The python command above runs in the background; check the results in `train.log`.

After training, the checkpoint files are in the default folder `./checkpoint/`.

### Distributed Training

- Running on Ascend (ResNet50 backbone)

```bash
bash ./scripts/run_distribution_train_ascend.sh [RANK_TABLE_FILE]
```

The shell script above runs distributed training in the background; check the results in `train_parallel0/log`.

After training, the loss values look like this:

```bash
epoch: 4 step: 201, loss is 4.870843
epoch time: 60460.177 ms, per step time: 300.797 ms
epoch: 5 step: 201, loss is 4.649786
epoch time: 60527.898 ms, per step time: 301.134 ms
```

- Running on GPU (MobileNet0.25 backbone)

```bash
bash scripts/run_distribution_train_gpu.sh 2 0,1
```

The shell script above runs distributed training in the background; check the results in `train/train.log`.

After training, the checkpoint files are in the default folder `./checkpoint/ckpt_0/`.
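
The loss and timing lines shown in the training logs of this section follow a regular pattern, so they can be collected with a short script. This is a sketch that assumes the log format is exactly as in the samples in this README; the function name `parse_train_log` and the record keys are hypothetical.

```python
import re

# Patterns modelled on the sample log lines in this README.
LOSS_RE = re.compile(r"epoch:\s*(\d+)\s+step:\s*(\d+),\s*loss is\s*([0-9.eE+-]+)")
TIME_RE = re.compile(r"epoch time:\s*([0-9.]+)\s*ms,\s*per step time:\s*([0-9.]+)\s*ms")


def parse_train_log(lines):
    """Return one record per epoch with its loss and, when present, its timing."""
    records = []
    for line in lines:
        m = LOSS_RE.search(line)
        if m:
            records.append({"epoch": int(m.group(1)), "step": int(m.group(2)),
                            "loss": float(m.group(3))})
            continue
        m = TIME_RE.search(line)
        if m and records:
            # The timing line follows the loss line of the same epoch.
            records[-1]["epoch_ms"] = float(m.group(1))
            records[-1]["step_ms"] = float(m.group(2))
    return records


sample = [
    "epoch: 7 step: 1609, loss is 5.327434",
    "epoch time: 466281.709 ms, per step time: 289.796 ms",
    "epoch: 8 step: 1609, loss is 4.7512465",
    "epoch time: 466995.237 ms, per step time: 290.239 ms",
]
for rec in parse_train_log(sample):
    print(rec["epoch"], rec["loss"], rec["step_ms"])
```

In the sample, dividing each epoch time by the step count gives the reported per-step time, which is a quick sanity check on a parsed log.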

## Evaluation Process

### Evaluation

- Evaluating on the WIDER FACE dataset on Ascend (ResNet50 backbone)

CKPT_FILE is the path of the checkpoint to evaluate, for example './train_parallel3/checkpoint/ckpt_3/RetinaFace-56_201.ckpt'.

```bash
python eval.py --backbone_name 'ResNet50' --val_model [CKPT_FILE] > ./eval.log 2>&1 &
# OR
bash ./scripts/run_standalone_eval_ascend.sh [CKPT_FILE]
```

The python command above runs in the background; check the results in "eval.log". The accuracy on the validation set is as follows:

```bash
# grep "Val AP" eval.log
Easy   Val AP : 0.9516
Medium Val AP : 0.9381
Hard   Val AP : 0.8403
```

- Evaluating on the WIDER FACE dataset on GPU (MobileNet0.25 backbone)

CKPT_FILE is the path of the checkpoint to evaluate, for example './checkpoint/ckpt_0/RetinaFace-117_804.ckpt'.

```bash
export CUDA_VISIBLE_DEVICES=0
python eval.py --backbone_name 'MobileNet025' --val_model [CKPT_FILE] > eval.log 2>&1 &
```

The python command above runs in the background; check the results in "eval.log". The accuracy on the validation set is as follows:

```bash
# grep "Val AP" eval.log
Easy   Val AP : 0.8877
Medium Val AP : 0.8698
Hard   Val AP : 0.8005
```

## Export Process

### Export

Export the checkpoint file to a model in mindir format (ResNet50 backbone).

```shell
python export.py --ckpt_file [CKPT_FILE]
```

## Inference Process

### Inference

The model must be exported before inference. A mindir model can be exported in any environment, whereas an air model can only be exported on Ascend 910. The following shows an example of running inference with a mindir model.

- Inference on the WIDER FACE dataset on Ascend 310 (ResNet50 backbone)

The inference command is shown below. 'MINDIR_PATH' is the path of the mindir file; 'DATASET_PATH' is the path of the inference dataset, for example '/home/dataset/widerface/val/'; 'DEVICE_ID' is optional and defaults to 0.

```shell
# Ascend310 inference
bash run_infer_310.sh [MINDIR_PATH] [DATASET_PATH] [DEVICE_ID]
```

The accuracy results are saved in the scripts directory; the acc.log file contains accuracy results similar to those below. The performance results are saved in the scripts/time_Result directory; the test_perform_static.txt file contains performance results similar to those below.

```bash
Easy   Val AP : 0.9498
Medium Val AP : 0.9351
Hard   Val AP : 0.8306
NN inference cost average time: 365.584 ms of infer_count 3226
```
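
The last line of the inference output gives the average latency and the number of images, from which throughput and total inference time follow directly. The sketch below assumes the summary line has the format shown in this README; the function name `parse_infer_summary` and the returned keys are hypothetical.

```python
import re


def parse_infer_summary(line):
    """Extract average latency and image count, then derive throughput and total time."""
    m = re.search(r"average time:\s*([0-9.]+)\s*ms of infer_count\s*(\d+)", line)
    if m is None:
        raise ValueError("unrecognised summary line: " + line)
    avg_ms = float(m.group(1))
    count = int(m.group(2))
    return {
        "avg_ms": avg_ms,
        "count": count,
        "images_per_s": 1000.0 / avg_ms,         # one image per inference call
        "total_min": avg_ms * count / 60000.0,   # pure network time, excluding I/O
    }


summary = parse_infer_summary("NN inference cost average time: 365.584 ms of infer_count 3226")
print(round(summary["images_per_s"], 2), round(summary["total_min"], 2))
```

For the figures reported here this works out to roughly 2.7 images per second and just under 20 minutes of network time for the 3,226 validation images.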

# Model Description

## Performance

### Evaluation Performance

#### RetinaFace on WIDERFACE

| Parameter                  | Ascend                                                       | GPU                                                          |
| -------------------------- | -------------------------------------------------------------| -------------------------------------------------------------|
| Model version              | RetinaFace + ResNet50                                        | RetinaFace + MobileNet0.25                                   |
| Resource                   | Ascend 910                                                   | Tesla V100-32G                                               |
| Upload date                | 2021-08-17                                                   | 2021-08-16                                                   |
| MindSpore version          | 1.2.0                                                        | 1.4.0                                                        |
| Dataset                    | WIDERFACE                                                    | WIDERFACE                                                    |
| Training parameters        | epoch=60, steps=201, batch_size=8, lr=0.04 (0.04 for 8 devices; 0.01 can be used for a single device) | epoch=120, steps=804, batch_size=8, initial_lr=0.02 |
| Optimizer                  | SGD                                                          | SGD                                                          |
| Loss function              | MultiBoxLoss + Softmax cross-entropy                         | MultiBoxLoss + Softmax cross-entropy                         |
| Output                     | Bounding boxes + confidence + landmarks                      | Bounding boxes + confidence + landmarks                      |
| Accuracy                   | Easy: 0.9516; Medium: 0.9381; Hard: 0.8403                   | Easy: 0.8877; Medium: 0.8698; Hard: 0.8005                   |
| Speed                      | 1 device: 290 ms/step; 8 devices: 301 ms/step                | 2 devices: 435 ms/step                                       |
| Total time                 | 8 devices: 1.05 hours                                        | 2 devices: 11.74 hours                                       |
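
The step counts and total times in the table can be cross-checked with simple arithmetic. This is a rough estimate only: it multiplies epochs, steps per epoch and the reported step time, and ignores anything outside the training steps, so it is expected to land somewhat below the reported totals. The function name `estimated_hours` is hypothetical; the single-device step count of 1609 comes from the training log shown earlier in this README.

```python
def estimated_hours(epochs, steps_per_epoch, ms_per_step):
    """Pure step time in hours: epochs * steps * ms per step, converted from milliseconds."""
    return epochs * steps_per_epoch * ms_per_step / 3.6e6


# Steps per epoch shrink with the device count (1609 steps on a single device).
print(1609 // 8, 1609 // 2)                       # 201 804

print(round(estimated_hours(60, 201, 301), 2))    # Ascend, 8 devices
print(round(estimated_hours(120, 804, 435), 2))   # GPU, 2 devices
```

The estimates come out at about 1.01 and 11.66 hours, against reported totals of 1.05 and 11.74 hours; the gap is plausibly start-up, checkpointing and data-loading overhead, though the README does not say.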

### Inference Performance

#### RetinaFace on WIDERFACE

| Parameter                  | Ascend                                                      |
| -------------------------- | ----------------------------------------------------------- |
| Model version              | RetinaFace + ResNet50                                       |
| Resource                   | Ascend 310                                                  |
| Upload date                | 2021-08-17                                                  |
| MindSpore version          | 1.4.0.20210805                                              |
| Dataset                    | WIDERFACE                                                   |
| Accuracy                   | Easy: 0.9498; Medium: 0.9351; Hard: 0.8306                  |
| Speed                      | NN inference cost average time: 365.584 ms of infer_count 3226 |

# Description of Random Situation

The seed is set in train.py with the mindspore.common.seed.set_seed() function.

# ModelZoo Homepage

Please visit the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
@ -0,0 +1,14 @@
cmake_minimum_required(VERSION 3.14.1)
project(Ascend310Infer)
add_compile_definitions(_GLIBCXX_USE_CXX11_ABI=0)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O0 -g -std=c++17 -Werror -Wall -fPIE -Wl,--allow-shlib-undefined")
set(PROJECT_SRC_ROOT ${CMAKE_CURRENT_LIST_DIR}/)
# option() only declares ON/OFF switches, so a path is declared as a cache entry instead.
set(MINDSPORE_PATH "" CACHE PATH "mindspore install path")
include_directories(${MINDSPORE_PATH})
include_directories(${MINDSPORE_PATH}/include)
include_directories(${PROJECT_SRC_ROOT})
find_library(MS_LIB libmindspore.so ${MINDSPORE_PATH}/lib)
file(GLOB_RECURSE MD_LIB ${MINDSPORE_PATH}/_c_dataengine*)

add_executable(main src/main.cc src/utils.cc)
target_link_libraries(main ${MS_LIB} ${MD_LIB} gflags)
@ -0,0 +1,29 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ -d out ]; then
    rm -rf out
fi

mkdir out
cd out || exit

if [ -f "Makefile" ]; then
    make clean
fi

cmake .. \
    -DMINDSPORE_PATH="$(pip3.7 show mindspore-ascend | grep Location | awk '{print $2"/mindspore"}' | xargs realpath)"
make
@ -0,0 +1,35 @@
/**
 * Copyright 2021 Huawei Technologies Co., Ltd
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

#ifndef MINDSPORE_INFERENCE_UTILS_H_
#define MINDSPORE_INFERENCE_UTILS_H_

#include <sys/stat.h>
#include <dirent.h>
#include <vector>
#include <string>
#include <string_view>
#include <memory>
#include "include/api/types.h"

std::vector<std::string> GetAllFiles(std::string_view dirName);
DIR *OpenDir(std::string_view dirName);
std::string RealPath(std::string_view path);
mindspore::MSTensor ReadFileToTensor(const std::string &file);
int WriteResult(const std::string& imageFile, const std::vector<mindspore::MSTensor> &outputs);
// Overload taking std::string; unlike the string_view overload it does not filter by entry type.
std::vector<std::string> GetAllFiles(std::string dir_name);
std::vector<std::vector<std::string>> GetAllInputData(std::string dir_name);

#endif  // MINDSPORE_INFERENCE_UTILS_H_
@ -0,0 +1,190 @@
/**
 * Copyright 2021 Huawei Technologies Co., Ltd
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
#include <sys/time.h>
#include <gflags/gflags.h>
#include <dirent.h>
#include <iostream>
#include <string>
#include <algorithm>
#include <iosfwd>
#include <map>
#include <memory>
#include <utility>
#include <vector>
#include <fstream>
#include <sstream>

#include "include/api/model.h"
#include "include/api/context.h"
#include "include/api/types.h"
#include "include/api/serialization.h"
#include "include/dataset/vision_ascend.h"
#include "include/dataset/execute.h"
#include "include/dataset/transforms.h"
#include "include/dataset/vision.h"
#include "inc/utils.h"

using mindspore::Context;
using mindspore::Serialization;
using mindspore::Model;
using mindspore::Status;
using mindspore::ModelType;
using mindspore::GraphCell;
using mindspore::kSuccess;
using mindspore::MSTensor;
using mindspore::dataset::Execute;
using mindspore::dataset::vision::Decode;
using mindspore::dataset::vision::Resize;
using mindspore::dataset::vision::CenterCrop;
using mindspore::dataset::vision::Normalize;
using mindspore::dataset::vision::HWC2CHW;


DEFINE_string(mindir_path, "", "mindir path");
DEFINE_string(input0_path, ".", "input0 path");
DEFINE_string(dataset_name, "widerface", "dataset name");
DEFINE_int32(device_id, 0, "device id");

int load_model(Model *model, std::vector<MSTensor> *model_inputs, std::string mindir_path, int device_id) {
  if (RealPath(mindir_path).empty()) {
    std::cout << "Invalid mindir" << std::endl;
    return 1;
  }

  auto context = std::make_shared<Context>();
  auto ascend310 = std::make_shared<mindspore::Ascend310DeviceInfo>();
  ascend310->SetDeviceID(device_id);
  context->MutableDeviceInfo().push_back(ascend310);
  mindspore::Graph graph;
  // Check the load status rather than building from a graph that may not have loaded.
  Status ret = Serialization::Load(mindir_path, ModelType::kMindIR, &graph);
  if (ret != kSuccess) {
    std::cout << "ERROR: Load failed." << std::endl;
    return 1;
  }

  ret = model->Build(GraphCell(graph), context);
  if (ret != kSuccess) {
    std::cout << "ERROR: Build failed." << std::endl;
    return 1;
  }

  *model_inputs = model->GetInputs();
  if (model_inputs->empty()) {
    std::cout << "Invalid model, inputs is empty." << std::endl;
    return 1;
  }
  return 0;
}

int main(int argc, char **argv) {
  gflags::ParseCommandLineFlags(&argc, &argv, true);

  Model model;
  std::vector<MSTensor> model_inputs;
  // Stop early if the model could not be loaded or built.
  if (load_model(&model, &model_inputs, FLAGS_mindir_path, FLAGS_device_id) != 0) {
    return 1;
  }

  // Maps the start time of each inference call to its end time (both in ms).
  std::map<double, double> costTime_map;
  struct timeval start = {0};
  struct timeval end = {0};
  double startTimeMs;
  double endTimeMs;

  if (FLAGS_dataset_name == "widerface") {
    // Inputs are preprocessed binary files, fed to the model as-is.
    auto input0_files = GetAllFiles(FLAGS_input0_path);
    if (input0_files.empty()) {
      std::cout << "ERROR: no input data." << std::endl;
      return 1;
    }
    size_t size = input0_files.size();
    for (size_t i = 0; i < size; ++i) {
      std::vector<MSTensor> inputs;
      std::vector<MSTensor> outputs;
      std::cout << "Start predict input files:" << input0_files[i] << std::endl;
      auto input0 = ReadFileToTensor(input0_files[i]);
      inputs.emplace_back(model_inputs[0].Name(), model_inputs[0].DataType(), model_inputs[0].Shape(),
                          input0.Data().get(), input0.DataSize());

      gettimeofday(&start, nullptr);
      Status ret = model.Predict(inputs, &outputs);
      gettimeofday(&end, nullptr);
      if (ret != kSuccess) {
        std::cout << "Predict " << input0_files[i] << " failed." << std::endl;
        return 1;
      }
      startTimeMs = (1.0 * start.tv_sec * 1000000 + start.tv_usec) / 1000;
      endTimeMs = (1.0 * end.tv_sec * 1000000 + end.tv_usec) / 1000;
      costTime_map.insert(std::pair<double, double>(startTimeMs, endTimeMs));
      int rst = WriteResult(input0_files[i], outputs);
      if (rst != 0) {
        std::cout << "write result failed." << std::endl;
        return rst;
      }
    }
  } else {
    // Inputs are image files grouped in sub-directories; decode and preprocess before inference.
    auto input0_files = GetAllInputData(FLAGS_input0_path);
    if (input0_files.empty()) {
      std::cout << "ERROR: no input data." << std::endl;
      return 1;
    }
    size_t size = input0_files.size();
    for (size_t i = 0; i < size; ++i) {
      for (size_t j = 0; j < input0_files[i].size(); ++j) {
        std::vector<MSTensor> inputs;
        std::vector<MSTensor> outputs;
        std::cout << "Start predict input files:" << input0_files[i][j] << std::endl;
        auto decode = Decode();
        auto resize = Resize({256, 256});
        auto centercrop = CenterCrop({224, 224});
        auto normalize = Normalize({123.675, 116.28, 103.53}, {58.395, 57.12, 57.375});
        auto hwc2chw = HWC2CHW();

        Execute SingleOp({decode, resize, centercrop, normalize, hwc2chw});
        auto imgDvpp = std::make_shared<MSTensor>();
        SingleOp(ReadFileToTensor(input0_files[i][j]), imgDvpp.get());
        inputs.emplace_back(model_inputs[0].Name(), model_inputs[0].DataType(), model_inputs[0].Shape(),
                            imgDvpp->Data().get(), imgDvpp->DataSize());
        gettimeofday(&start, nullptr);
        Status ret = model.Predict(inputs, &outputs);
        gettimeofday(&end, nullptr);
        if (ret != kSuccess) {
          std::cout << "Predict " << input0_files[i][j] << " failed." << std::endl;
          return 1;
        }
        startTimeMs = (1.0 * start.tv_sec * 1000000 + start.tv_usec) / 1000;
        endTimeMs = (1.0 * end.tv_sec * 1000000 + end.tv_usec) / 1000;
        costTime_map.insert(std::pair<double, double>(startTimeMs, endTimeMs));
        int rst = WriteResult(input0_files[i][j], outputs);
        if (rst != 0) {
          std::cout << "write result failed." << std::endl;
          return rst;
        }
      }
    }
  }

  double average = 0.0;
  int inferCount = 0;

  for (auto iter = costTime_map.begin(); iter != costTime_map.end(); iter++) {
    double diff = 0.0;
    diff = iter->second - iter->first;
    average += diff;
    inferCount++;
  }
  // Avoid dividing by zero when no image was processed.
  if (inferCount > 0) {
    average = average / inferCount;
  }
  std::stringstream timeCost;
  timeCost << "NN inference cost average time: " << average << " ms of infer_count " << inferCount << std::endl;
  std::cout << "NN inference cost average time: " << average << " ms of infer_count " << inferCount << std::endl;
  std::string fileName = "./time_Result" + std::string("/test_perform_static.txt");
  std::ofstream fileStream(fileName.c_str(), std::ios::trunc);
  fileStream << timeCost.str();
  fileStream.close();
  costTime_map.clear();
  return 0;
}
@ -0,0 +1,197 @@
/**
 * Copyright 2021 Huawei Technologies Co., Ltd
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

#include <climits>
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <algorithm>
#include <iostream>
#include <iterator>
#include "inc/utils.h"

using mindspore::MSTensor;
using mindspore::DataType;

std::vector<std::vector<std::string>> GetAllInputData(std::string dir_name) {
  std::vector<std::vector<std::string>> ret;

  DIR *dir = OpenDir(dir_name);
  if (dir == nullptr) {
    return {};
  }
  struct dirent *filename;
  /* collect all sub-directories of dir_name */
  std::vector<std::string> sub_dirs;
  while ((filename = readdir(dir)) != nullptr) {
    std::string d_name = std::string(filename->d_name);
    // get rid of "." and ".."
    if (d_name == "." || d_name == ".." || d_name.empty()) {
      continue;
    }
    std::string dir_path = RealPath(std::string(dir_name) + "/" + filename->d_name);
    struct stat s;
    // Skip entries that cannot be stat'ed or are not directories.
    if (lstat(dir_path.c_str(), &s) != 0 || !S_ISDIR(s.st_mode)) {
      continue;
    }

    sub_dirs.emplace_back(dir_path);
  }
  closedir(dir);  // release the directory handle
  std::sort(sub_dirs.begin(), sub_dirs.end());

  (void)std::transform(sub_dirs.begin(), sub_dirs.end(), std::back_inserter(ret),
                       [](const std::string &d) { return GetAllFiles(d); });

  return ret;
}


std::vector<std::string> GetAllFiles(std::string dir_name) {
  struct dirent *filename;
  DIR *dir = OpenDir(dir_name);
  if (dir == nullptr) {
    return {};
  }

  std::vector<std::string> res;
  while ((filename = readdir(dir)) != nullptr) {
    std::string d_name = std::string(filename->d_name);
    // Skip "." and ".." as well as names of three characters or fewer.
    if (d_name == "." || d_name == ".." || d_name.size() <= 3) {
      continue;
    }
    res.emplace_back(std::string(dir_name) + "/" + filename->d_name);
  }
  closedir(dir);  // release the directory handle
  std::sort(res.begin(), res.end());

  return res;
}


std::vector<std::string> GetAllFiles(std::string_view dirName) {
  struct dirent *filename;
  DIR *dir = OpenDir(dirName);
  if (dir == nullptr) {
    return {};
  }
  std::vector<std::string> res;
  while ((filename = readdir(dir)) != nullptr) {
    std::string dName = std::string(filename->d_name);
    // Keep regular files only.
    if (dName == "." || dName == ".." || filename->d_type != DT_REG) {
      continue;
    }
    res.emplace_back(std::string(dirName) + "/" + filename->d_name);
  }
  closedir(dir);  // release the directory handle
  std::sort(res.begin(), res.end());
  for (auto &f : res) {
    std::cout << "image file: " << f << std::endl;
  }
  return res;
}


int WriteResult(const std::string& imageFile, const std::vector<MSTensor> &outputs) {
  std::string homePath = "./result_Files";
  const int INVALID_POINTER = -1;
  const int ERROR = -2;
  for (size_t i = 0; i < outputs.size(); ++i) {
    size_t outputSize;
    std::shared_ptr<const void> netOutput;
    netOutput = outputs[i].Data();
    outputSize = outputs[i].DataSize();
    // If there is no '/', rfind returns npos and npos + 1 wraps to 0, so the whole name is kept.
    size_t pos = imageFile.rfind('/');
    std::string fileName(imageFile, pos + 1);
    // Replace everything from the first '.' with "_<i>.bin"; append the suffix if there is no '.'.
    size_t dot = fileName.find('.');
    if (dot == std::string::npos) {
      dot = fileName.size();
    }
    fileName.replace(dot, fileName.size() - dot, '_' + std::to_string(i) + ".bin");
    std::string outFileName = homePath + "/" + fileName;
    FILE *outputFile = fopen(outFileName.c_str(), "wb");
    if (outputFile == nullptr) {
      std::cout << "open result file " << outFileName << " failed" << std::endl;
      return INVALID_POINTER;
    }
    size_t size = fwrite(netOutput.get(), sizeof(char), outputSize, outputFile);
    if (size != outputSize) {
      fclose(outputFile);
      outputFile = nullptr;
      std::cout << "write result file " << outFileName << " failed, write size[" << size <<
        "] is smaller than output size[" << outputSize << "], maybe the disk is full." << std::endl;
      return ERROR;
    }
    fclose(outputFile);
    outputFile = nullptr;
  }
  return 0;
}

mindspore::MSTensor ReadFileToTensor(const std::string &file) {
  if (file.empty()) {
    std::cout << "File path is empty" << std::endl;
    return mindspore::MSTensor();
  }

  // Read the file as raw bytes.
  std::ifstream ifs(file, std::ios::binary);
  if (!ifs.good()) {
    std::cout << "File: " << file << " does not exist" << std::endl;
    return mindspore::MSTensor();
  }

  if (!ifs.is_open()) {
    std::cout << "File: " << file << " open failed" << std::endl;
    return mindspore::MSTensor();
  }

  ifs.seekg(0, std::ios::end);
  size_t size = ifs.tellg();
  mindspore::MSTensor buffer(file, mindspore::DataType::kNumberTypeUInt8, {static_cast<int64_t>(size)}, nullptr, size);

  ifs.seekg(0, std::ios::beg);
  ifs.read(reinterpret_cast<char *>(buffer.MutableData()), size);
  ifs.close();

  return buffer;
}
|
||||
|
||||
|
||||
DIR *OpenDir(std::string_view dirName) {
|
||||
if (dirName.empty()) {
|
||||
std::cout << " dirName is null ! " << std::endl;
|
||||
return nullptr;
|
||||
}
|
||||
std::string realPath = RealPath(dirName);
|
||||
struct stat s;
|
||||
if (lstat(realPath.c_str(), &s) != 0) {
  std::cout << "Can not stat dir " << dirName << std::endl;
  return nullptr;
}
|
||||
if (!S_ISDIR(s.st_mode)) {
|
||||
std::cout << "dirName is not a valid directory !" << std::endl;
|
||||
return nullptr;
|
||||
}
|
||||
DIR *dir;
|
||||
dir = opendir(realPath.c_str());
|
||||
if (dir == nullptr) {
|
||||
std::cout << "Can not open dir " << dirName << std::endl;
|
||||
return nullptr;
|
||||
}
|
||||
std::cout << "Successfully opened the dir " << dirName << std::endl;
|
||||
return dir;
|
||||
}
|
||||
|
||||
std::string RealPath(std::string_view path) {
|
||||
char realPathMem[PATH_MAX] = {0};
|
||||
char *realPathRet = nullptr;
|
||||
realPathRet = realpath(std::string(path).c_str(), realPathMem);  // string_view is not guaranteed to be null-terminated
|
||||
if (realPathRet == nullptr) {
|
||||
std::cout << "File: " << path << " does not exist." << std::endl;
|
||||
return "";
|
||||
}
|
||||
|
||||
std::string realPath(realPathMem);
|
||||
std::cout << path << " realpath is: " << realPath << std::endl;
|
||||
return realPath;
|
||||
}
|
|
@ -0,0 +1,558 @@
|
|||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
"""Eval Retinaface_resnet50_or_mobilenet0.25."""
|
||||
import argparse
|
||||
import os
|
||||
import time
|
||||
import datetime
|
||||
import numpy as np
|
||||
import cv2
|
||||
|
||||
from mindspore import Tensor, context
|
||||
from mindspore.train.serialization import load_checkpoint, load_param_into_net
|
||||
|
||||
from src.config import cfg_res50, cfg_mobile025
|
||||
from src.utils import decode_bbox, prior_box
|
||||
|
||||
class Timer():
|
||||
def __init__(self):
|
||||
self.start_time = 0.
|
||||
self.diff = 0.
|
||||
|
||||
def start(self):
|
||||
self.start_time = time.time()
|
||||
|
||||
def end(self):
|
||||
self.diff = time.time() - self.start_time
|
||||
|
||||
class DetectionEngine:
|
||||
"""DetectionEngine"""
|
||||
def __init__(self, cfg):
|
||||
self.results = {}
|
||||
self.nms_thresh = cfg['val_nms_threshold']
|
||||
self.conf_thresh = cfg['val_confidence_threshold']
|
||||
self.iou_thresh = cfg['val_iou_threshold']
|
||||
self.var = cfg['variance']
|
||||
self.save_prefix = cfg['val_predict_save_folder']
|
||||
self.gt_dir = cfg['val_gt_dir']
|
||||
|
||||
def _iou(self, a, b):
|
||||
"""_iou"""
|
||||
A = a.shape[0]
|
||||
B = b.shape[0]
|
||||
max_xy = np.minimum(
|
||||
np.broadcast_to(np.expand_dims(a[:, 2:4], 1), [A, B, 2]),
|
||||
np.broadcast_to(np.expand_dims(b[:, 2:4], 0), [A, B, 2]))
|
||||
min_xy = np.maximum(
|
||||
np.broadcast_to(np.expand_dims(a[:, 0:2], 1), [A, B, 2]),
|
||||
np.broadcast_to(np.expand_dims(b[:, 0:2], 0), [A, B, 2]))
|
||||
inter = np.maximum((max_xy - min_xy + 1), np.zeros_like(max_xy - min_xy))
|
||||
inter = inter[:, :, 0] * inter[:, :, 1]
|
||||
|
||||
area_a = np.broadcast_to(
|
||||
np.expand_dims(
|
||||
(a[:, 2] - a[:, 0] + 1) * (a[:, 3] - a[:, 1] + 1), 1),
|
||||
np.shape(inter))
|
||||
area_b = np.broadcast_to(
|
||||
np.expand_dims(
|
||||
(b[:, 2] - b[:, 0] + 1) * (b[:, 3] - b[:, 1] + 1), 0),
|
||||
np.shape(inter))
|
||||
union = area_a + area_b - inter
|
||||
return inter / union
|
||||
|
||||
def _nms(self, boxes, threshold=0.5):
|
||||
"""_nms"""
|
||||
x1 = boxes[:, 0]
|
||||
y1 = boxes[:, 1]
|
||||
x2 = boxes[:, 2]
|
||||
y2 = boxes[:, 3]
|
||||
scores = boxes[:, 4]
|
||||
|
||||
areas = (x2 - x1 + 1) * (y2 - y1 + 1)
|
||||
order = scores.argsort()[::-1]
|
||||
|
||||
reserved_boxes = []
|
||||
while order.size > 0:
|
||||
i = order[0]
|
||||
reserved_boxes.append(i)
|
||||
max_x1 = np.maximum(x1[i], x1[order[1:]])
|
||||
max_y1 = np.maximum(y1[i], y1[order[1:]])
|
||||
min_x2 = np.minimum(x2[i], x2[order[1:]])
|
||||
min_y2 = np.minimum(y2[i], y2[order[1:]])
|
||||
|
||||
intersect_w = np.maximum(0.0, min_x2 - max_x1 + 1)
|
||||
intersect_h = np.maximum(0.0, min_y2 - max_y1 + 1)
|
||||
intersect_area = intersect_w * intersect_h
|
||||
ovr = intersect_area / (areas[i] + areas[order[1:]] - intersect_area)
|
||||
|
||||
indices = np.where(ovr <= threshold)[0]
|
||||
order = order[indices + 1]
|
||||
|
||||
return reserved_boxes
|
||||
|
||||
def write_result(self):
|
||||
"""write_result"""
|
||||
# save result to file.
|
||||
import json
|
||||
t = datetime.datetime.now().strftime('_%Y_%m_%d_%H_%M_%S')
|
||||
try:
|
||||
if not os.path.isdir(self.save_prefix):
|
||||
os.makedirs(self.save_prefix)
|
||||
|
||||
self.file_path = self.save_prefix + '/predict' + t + '.json'
|
||||
f = open(self.file_path, 'w')
|
||||
json.dump(self.results, f)
|
||||
except IOError as e:
|
||||
raise RuntimeError("Unable to open json file to dump. What(): {}".format(str(e)))
|
||||
else:
|
||||
f.close()
|
||||
return self.file_path
|
||||
|
||||
def detect(self, boxes, confs, resize, scale, image_path, priors):
|
||||
"""detect"""
|
||||
if boxes.shape[0] == 0:
|
||||
# add to result
|
||||
event_name, img_name = image_path.split('/')
|
||||
self.results.setdefault(event_name, {})[img_name[:-4]] = {'img_path': image_path,
|
||||
'bboxes': []}
|
||||
return
|
||||
|
||||
boxes = decode_bbox(np.squeeze(boxes.asnumpy(), 0), priors, self.var)
|
||||
boxes = boxes * scale / resize
|
||||
|
||||
scores = np.squeeze(confs.asnumpy(), 0)[:, 1]
|
||||
# ignore low scores
|
||||
inds = np.where(scores > self.conf_thresh)[0]
|
||||
boxes = boxes[inds]
|
||||
scores = scores[inds]
|
||||
|
||||
# sort by score (descending) before NMS
|
||||
order = scores.argsort()[::-1]
|
||||
boxes = boxes[order]
|
||||
scores = scores[order]
|
||||
|
||||
# do NMS
|
||||
dets = np.hstack((boxes, scores[:, np.newaxis])).astype(np.float32, copy=False)
|
||||
keep = self._nms(dets, self.nms_thresh)
|
||||
dets = dets[keep, :]
|
||||
|
||||
dets[:, 2:4] = (dets[:, 2:4].astype(int) - dets[:, 0:2].astype(int)).astype(float)  # x2y2 -> wh, truncated to integers
|
||||
dets[:, 0:4] = dets[:, 0:4].astype(int).astype(float)  # truncate box coordinates to integers
|
||||
|
||||
|
||||
# add to result
|
||||
event_name, img_name = image_path.split('/')
|
||||
if event_name not in self.results.keys():
|
||||
self.results[event_name] = {}
|
||||
self.results[event_name][img_name[:-4]] = {'img_path': image_path,
|
||||
'bboxes': dets[:, :5].astype(float).tolist()}
|
||||
|
||||
def _get_gt_boxes(self):
|
||||
"""_get_gt_boxes"""
|
||||
from scipy.io import loadmat
|
||||
gt = loadmat(os.path.join(self.gt_dir, 'wider_face_val.mat'))
|
||||
hard = loadmat(os.path.join(self.gt_dir, 'wider_hard_val.mat'))
|
||||
medium = loadmat(os.path.join(self.gt_dir, 'wider_medium_val.mat'))
|
||||
easy = loadmat(os.path.join(self.gt_dir, 'wider_easy_val.mat'))
|
||||
|
||||
faceboxes = gt['face_bbx_list']
|
||||
events = gt['event_list']
|
||||
files = gt['file_list']
|
||||
|
||||
hard_gt_list = hard['gt_list']
|
||||
medium_gt_list = medium['gt_list']
|
||||
easy_gt_list = easy['gt_list']
|
||||
|
||||
return faceboxes, events, files, hard_gt_list, medium_gt_list, easy_gt_list
|
||||
|
||||
def _norm_pre_score(self):
|
||||
"""_norm_pre_score"""
|
||||
max_score = 0
|
||||
min_score = 1
|
||||
|
||||
for event in self.results:
|
||||
for name in self.results[event].keys():
|
||||
bbox = np.array(self.results[event][name]['bboxes']).astype(float)
|
||||
if bbox.shape[0] <= 0:
|
||||
continue
|
||||
max_score = max(max_score, np.max(bbox[:, -1]))
|
||||
min_score = min(min_score, np.min(bbox[:, -1]))
|
||||
|
||||
length = max(max_score - min_score, 1e-12)  # avoid division by zero when all scores are equal
|
||||
for event in self.results:
|
||||
for name in self.results[event].keys():
|
||||
bbox = np.array(self.results[event][name]['bboxes']).astype(float)
|
||||
if bbox.shape[0] <= 0:
|
||||
continue
|
||||
bbox[:, -1] -= min_score
|
||||
bbox[:, -1] /= length
|
||||
self.results[event][name]['bboxes'] = bbox.tolist()
|
||||
|
||||
def _image_eval(self, predict, gt, keep, iou_thresh, section_num):
|
||||
"""_image_eval"""
|
||||
_predict = predict.copy()
|
||||
_gt = gt.copy()
|
||||
|
||||
image_p_right = np.zeros(_predict.shape[0])
|
||||
image_gt_right = np.zeros(_gt.shape[0])
|
||||
proposal = np.ones(_predict.shape[0])
|
||||
|
||||
# x1y1wh -> x1y1x2y2
|
||||
_predict[:, 2:4] = _predict[:, 0:2] + _predict[:, 2:4]
|
||||
_gt[:, 2:4] = _gt[:, 0:2] + _gt[:, 2:4]
|
||||
|
||||
ious = self._iou(_predict[:, 0:4], _gt[:, 0:4])
|
||||
for i in range(_predict.shape[0]):
|
||||
gt_ious = ious[i, :]
|
||||
max_iou, max_index = gt_ious.max(), gt_ious.argmax()
|
||||
if max_iou >= iou_thresh:
|
||||
if keep[max_index] == 0:
|
||||
image_gt_right[max_index] = -1
|
||||
proposal[i] = -1
|
||||
elif image_gt_right[max_index] == 0:
|
||||
image_gt_right[max_index] = 1
|
||||
|
||||
right_index = np.where(image_gt_right == 1)[0]
|
||||
image_p_right[i] = len(right_index)
|
||||
|
||||
|
||||
|
||||
image_pr = np.zeros((section_num, 2), dtype=float)
|
||||
for section in range(section_num):
|
||||
_thresh = 1 - (section + 1)/section_num
|
||||
over_score_index = np.where(predict[:, 4] >= _thresh)[0]
|
||||
if over_score_index.shape[0] <= 0:
|
||||
image_pr[section, 0] = 0
|
||||
image_pr[section, 1] = 0
|
||||
else:
|
||||
index = over_score_index[-1]
|
||||
p_num = len(np.where(proposal[0:(index+1)] == 1)[0])
|
||||
image_pr[section, 0] = p_num
|
||||
image_pr[section, 1] = image_p_right[index]
|
||||
|
||||
return image_pr
|
||||
|
||||
|
||||
def get_eval_result(self):
|
||||
"""get_eval_result"""
|
||||
self._norm_pre_score()
|
||||
facebox_list, event_list, file_list, hard_gt_list, medium_gt_list, easy_gt_list = self._get_gt_boxes()
|
||||
section_num = 1000
|
||||
sets = ['easy', 'medium', 'hard']
|
||||
set_gts = [easy_gt_list, medium_gt_list, hard_gt_list]
|
||||
ap_key_dict = {0: "Easy Val AP : ", 1: "Medium Val AP : ", 2: "Hard Val AP : ",}
|
||||
ap_dict = {}
|
||||
for _set in range(len(sets)):
|
||||
gt_list = set_gts[_set]
|
||||
count_gt = 0
|
||||
pr_curve = np.zeros((section_num, 2), dtype=float)
|
||||
for i, _ in enumerate(event_list):
|
||||
event = str(event_list[i][0][0])
|
||||
image_list = file_list[i][0]
|
||||
event_predict_dict = self.results[event]
|
||||
event_gt_index_list = gt_list[i][0]
|
||||
event_gt_box_list = facebox_list[i][0]
|
||||
|
||||
for j, _ in enumerate(image_list):
|
||||
predict = np.array(event_predict_dict[str(image_list[j][0][0])]['bboxes']).astype(float)
|
||||
gt_boxes = event_gt_box_list[j][0].astype('float')
|
||||
keep_index = event_gt_index_list[j][0]
|
||||
count_gt += len(keep_index)
|
||||
|
||||
if gt_boxes.shape[0] <= 0 or predict.shape[0] <= 0:
|
||||
continue
|
||||
keep = np.zeros(gt_boxes.shape[0])
|
||||
if keep_index.shape[0] > 0:
|
||||
keep[keep_index-1] = 1
|
||||
|
||||
image_pr = self._image_eval(predict, gt_boxes, keep,
|
||||
iou_thresh=self.iou_thresh,
|
||||
section_num=section_num)
|
||||
pr_curve += image_pr
|
||||
|
||||
precision = pr_curve[:, 1] / pr_curve[:, 0]
|
||||
recall = pr_curve[:, 1] / count_gt
|
||||
|
||||
precision = np.concatenate((np.array([0.]), precision, np.array([0.])))
|
||||
recall = np.concatenate((np.array([0.]), recall, np.array([1.])))
|
||||
for i in range(precision.shape[0]-1, 0, -1):
|
||||
precision[i-1] = np.maximum(precision[i-1], precision[i])
|
||||
index = np.where(recall[1:] != recall[:-1])[0]
|
||||
ap = np.sum((recall[index + 1] - recall[index]) * precision[index + 1])
|
||||
|
||||
|
||||
print(ap_key_dict[_set] + '{:.4f}'.format(ap))
ap_dict[sets[_set]] = ap  # record the AP so the returned dict is not empty
|
||||
|
||||
return ap_dict
|
||||
|
||||
|
||||
def val_with_resnet(args_opt):
|
||||
"""val_with_resnet"""
|
||||
from src.network_with_resnet import RetinaFace, resnet50
|
||||
cfg = cfg_res50
|
||||
|
||||
context.set_context(mode=context.GRAPH_MODE, device_target='Ascend', device_id=cfg['device_id'], save_graphs=False)
|
||||
|
||||
backbone = resnet50(1001)
|
||||
network = RetinaFace(phase='predict', backbone=backbone)
|
||||
backbone.set_train(False)
|
||||
network.set_train(False)
|
||||
|
||||
# load checkpoint
|
||||
assert args_opt.val_model is not None, 'val_model is None.'
|
||||
param_dict = load_checkpoint(args_opt.val_model)
|
||||
print('Load trained model done. {}'.format(args_opt.val_model))
|
||||
network.init_parameters_data()
|
||||
load_param_into_net(network, param_dict)
|
||||
|
||||
# testing dataset
|
||||
testset_folder = cfg['val_dataset_folder']
|
||||
testset_label_path = os.path.join(cfg['val_dataset_folder'], 'label.txt')
|
||||
with open(testset_label_path, 'r') as f:
|
||||
_test_dataset = f.readlines()
|
||||
test_dataset = []
|
||||
for im_path in _test_dataset:
|
||||
if im_path.startswith('# '):
|
||||
test_dataset.append(im_path[2:-1]) # delete '# ...\n'
|
||||
|
||||
num_images = len(test_dataset)
|
||||
|
||||
timers = {'forward_time': Timer(), 'misc': Timer()}
|
||||
|
||||
if cfg['val_origin_size']:
|
||||
h_max, w_max = 0, 0
|
||||
for img_name in test_dataset:
|
||||
image_path = os.path.join(testset_folder, 'images', img_name)
|
||||
_img = cv2.imread(image_path, cv2.IMREAD_COLOR)
|
||||
if _img.shape[0] > h_max:
|
||||
h_max = _img.shape[0]
|
||||
if _img.shape[1] > w_max:
|
||||
w_max = _img.shape[1]
|
||||
|
||||
h_max = (int(h_max / 32) + 1) * 32
|
||||
w_max = (int(w_max / 32) + 1) * 32
|
||||
|
||||
priors = prior_box(image_sizes=(h_max, w_max),
|
||||
min_sizes=[[16, 32], [64, 128], [256, 512]],
|
||||
steps=[8, 16, 32],
|
||||
clip=False)
|
||||
else:
|
||||
target_size = 1600
|
||||
max_size = 2176
|
||||
priors = prior_box(image_sizes=(max_size, max_size),
|
||||
min_sizes=[[16, 32], [64, 128], [256, 512]],
|
||||
steps=[8, 16, 32],
|
||||
clip=False)
|
||||
|
||||
# init detection engine
|
||||
detection = DetectionEngine(cfg)
|
||||
|
||||
# testing begin
|
||||
print('Predict box starting')
|
||||
for i, img_name in enumerate(test_dataset):
|
||||
image_path = os.path.join(testset_folder, 'images', img_name)
|
||||
|
||||
img_raw = cv2.imread(image_path, cv2.IMREAD_COLOR)
|
||||
img = np.float32(img_raw)
|
||||
|
||||
# testing scale
|
||||
if cfg['val_origin_size']:
|
||||
resize = 1
|
||||
assert img.shape[0] <= h_max and img.shape[1] <= w_max
|
||||
image_t = np.empty((h_max, w_max, 3), dtype=img.dtype)
|
||||
image_t[:, :] = (104.0, 117.0, 123.0)
|
||||
image_t[0:img.shape[0], 0:img.shape[1]] = img
|
||||
img = image_t
|
||||
else:
|
||||
im_size_min = np.min(img.shape[0:2])
|
||||
im_size_max = np.max(img.shape[0:2])
|
||||
resize = float(target_size) / float(im_size_min)
|
||||
# prevent bigger axis from being more than max_size:
|
||||
if np.round(resize * im_size_max) > max_size:
|
||||
resize = float(max_size) / float(im_size_max)
|
||||
|
||||
img = cv2.resize(img, None, None, fx=resize, fy=resize, interpolation=cv2.INTER_LINEAR)
|
||||
|
||||
assert img.shape[0] <= max_size and img.shape[1] <= max_size
|
||||
image_t = np.empty((max_size, max_size, 3), dtype=img.dtype)
|
||||
image_t[:, :] = (104.0, 117.0, 123.0)
|
||||
image_t[0:img.shape[0], 0:img.shape[1]] = img
|
||||
img = image_t
|
||||
|
||||
scale = np.array([img.shape[1], img.shape[0], img.shape[1], img.shape[0]], dtype=img.dtype)
|
||||
img -= (104, 117, 123)
|
||||
img = img.transpose(2, 0, 1)
|
||||
img = np.expand_dims(img, 0)
|
||||
img = Tensor(img)
|
||||
|
||||
timers['forward_time'].start()
|
||||
boxes, confs, _ = network(img)
|
||||
timers['forward_time'].end()
|
||||
timers['misc'].start()
|
||||
detection.detect(boxes, confs, resize, scale, img_name, priors)
|
||||
timers['misc'].end()
|
||||
|
||||
print('im_detect: {:d}/{:d} forward_pass_time: {:.4f}s misc: {:.4f}s'.format(i + 1, num_images,
|
||||
timers['forward_time'].diff,
|
||||
timers['misc'].diff))
|
||||
print('Predict box done.')
|
||||
print('Eval starting')
|
||||
|
||||
if cfg['val_save_result']:
|
||||
# Save the predict result if you want.
|
||||
predict_result_path = detection.write_result()
|
||||
print('predict result path is {}'.format(predict_result_path))
|
||||
|
||||
detection.get_eval_result()
|
||||
print(args_opt.val_model)
|
||||
print('Eval done.')
|
||||
|
||||
|
||||
def val_with_mobilenet(args_opt):
|
||||
"""val_with_mobilenet"""
|
||||
from src.network_with_mobilenet import RetinaFace, resnet50, mobilenet025
|
||||
context.set_context(mode=context.GRAPH_MODE, device_target='GPU', save_graphs=False)
|
||||
|
||||
# cfg = cfg_res50
|
||||
cfg = cfg_mobile025
|
||||
|
||||
if cfg['name'] == 'ResNet50':
|
||||
backbone = resnet50(1001)
|
||||
elif cfg['name'] == 'MobileNet025':
|
||||
backbone = mobilenet025(1000)
|
||||
network = RetinaFace(phase='predict', backbone=backbone, cfg=cfg)
|
||||
backbone.set_train(False)
|
||||
network.set_train(False)
|
||||
|
||||
# load checkpoint
|
||||
assert args_opt.val_model is not None, 'val_model is None.'
|
||||
param_dict = load_checkpoint(args_opt.val_model)
|
||||
print('Load trained model done. {}'.format(args_opt.val_model))
|
||||
network.init_parameters_data()
|
||||
load_param_into_net(network, param_dict)
|
||||
|
||||
# testing dataset
|
||||
testset_folder = cfg['val_dataset_folder']
|
||||
testset_label_path = os.path.join(cfg['val_dataset_folder'], 'label.txt')
|
||||
with open(testset_label_path, 'r') as f:
|
||||
_test_dataset = f.readlines()
|
||||
test_dataset = []
|
||||
for im_path in _test_dataset:
|
||||
if im_path.startswith('# '):
|
||||
test_dataset.append(im_path[2:-1]) # delete '# ...\n'
|
||||
|
||||
num_images = len(test_dataset)
|
||||
|
||||
timers = {'forward_time': Timer(), 'misc': Timer()}
|
||||
|
||||
if cfg['val_origin_size']:
|
||||
h_max, w_max = 0, 0
|
||||
for img_name in test_dataset:
|
||||
image_path = os.path.join(testset_folder, 'images', img_name)
|
||||
_img = cv2.imread(image_path, cv2.IMREAD_COLOR)
|
||||
if _img.shape[0] > h_max:
|
||||
h_max = _img.shape[0]
|
||||
if _img.shape[1] > w_max:
|
||||
w_max = _img.shape[1]
|
||||
|
||||
h_max = (int(h_max / 32) + 1) * 32
|
||||
w_max = (int(w_max / 32) + 1) * 32
|
||||
|
||||
priors = prior_box(image_sizes=(h_max, w_max),
|
||||
min_sizes=[[16, 32], [64, 128], [256, 512]],
|
||||
steps=[8, 16, 32],
|
||||
clip=False)
|
||||
else:
|
||||
target_size = 1600
|
||||
max_size = 2176
|
||||
priors = prior_box(image_sizes=(max_size, max_size),
|
||||
min_sizes=[[16, 32], [64, 128], [256, 512]],
|
||||
steps=[8, 16, 32],
|
||||
clip=False)
|
||||
|
||||
# init detection engine
|
||||
detection = DetectionEngine(cfg)
|
||||
|
||||
# testing begin
|
||||
print('Predict box starting')
|
||||
for i, img_name in enumerate(test_dataset):
|
||||
image_path = os.path.join(testset_folder, 'images', img_name)
|
||||
|
||||
img_raw = cv2.imread(image_path, cv2.IMREAD_COLOR)
|
||||
img = np.float32(img_raw)
|
||||
|
||||
# testing scale
|
||||
if cfg['val_origin_size']:
|
||||
resize = 1
|
||||
assert img.shape[0] <= h_max and img.shape[1] <= w_max
|
||||
image_t = np.empty((h_max, w_max, 3), dtype=img.dtype)
|
||||
image_t[:, :] = (104.0, 117.0, 123.0)
|
||||
image_t[0:img.shape[0], 0:img.shape[1]] = img
|
||||
img = image_t
|
||||
else:
|
||||
im_size_min = np.min(img.shape[0:2])
|
||||
im_size_max = np.max(img.shape[0:2])
|
||||
resize = float(target_size) / float(im_size_min)
|
||||
# prevent bigger axis from being more than max_size:
|
||||
if np.round(resize * im_size_max) > max_size:
|
||||
resize = float(max_size) / float(im_size_max)
|
||||
|
||||
img = cv2.resize(img, None, None, fx=resize, fy=resize, interpolation=cv2.INTER_LINEAR)
|
||||
|
||||
assert img.shape[0] <= max_size and img.shape[1] <= max_size
|
||||
image_t = np.empty((max_size, max_size, 3), dtype=img.dtype)
|
||||
image_t[:, :] = (104.0, 117.0, 123.0)
|
||||
image_t[0:img.shape[0], 0:img.shape[1]] = img
|
||||
img = image_t
|
||||
|
||||
scale = np.array([img.shape[1], img.shape[0], img.shape[1], img.shape[0]], dtype=img.dtype)
|
||||
img -= (104, 117, 123)
|
||||
img = img.transpose(2, 0, 1)
|
||||
img = np.expand_dims(img, 0)
|
||||
img = Tensor(img)
|
||||
|
||||
timers['forward_time'].start()
|
||||
boxes, confs, _ = network(img)
|
||||
timers['forward_time'].end()
|
||||
timers['misc'].start()
|
||||
detection.detect(boxes, confs, resize, scale, img_name, priors)
|
||||
timers['misc'].end()
|
||||
|
||||
print('im_detect: {:d}/{:d} forward_pass_time: {:.4f}s misc: {:.4f}s'.format(i + 1, num_images,
|
||||
timers['forward_time'].diff,
|
||||
timers['misc'].diff))
|
||||
print('Predict box done.')
|
||||
print('Eval starting')
|
||||
|
||||
if cfg['val_save_result']:
|
||||
# Save the predict result if you want.
|
||||
predict_result_path = detection.write_result()
|
||||
print('predict result path is {}'.format(predict_result_path))
|
||||
|
||||
detection.get_eval_result()
|
||||
print('Eval done.')
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
parser = argparse.ArgumentParser(description='val')
|
||||
parser.add_argument('--backbone_name', type=str, default='ResNet50',
|
||||
help='backbone name')
|
||||
parser.add_argument('--val_model', type=str, default='./train_parallel3/checkpoint/ckpt_3/RetinaFace-56_201.ckpt',
|
||||
help='val_model location')
|
||||
args = parser.parse_args()
|
||||
if args.backbone_name == 'ResNet50':
|
||||
val_with_resnet(args_opt=args)
|
||||
elif args.backbone_name == 'MobileNet025':
|
||||
val_with_mobilenet(args_opt=args)
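
The `_nms` method in `DetectionEngine` above is a greedy non-maximum suppression over `(x1, y1, x2, y2, score)` rows, using the inclusive-pixel (`+ 1`) convention for widths and heights. As a rough illustration only, the same logic can be sketched in plain Python without NumPy. The names `iou_inclusive`, `greedy_nms` and the `demo` boxes are made up for this sketch and are not part of the repository.

```python
def iou_inclusive(a, b):
    """IoU of two (x1, y1, x2, y2, ...) boxes with the inclusive-pixel (+1) convention."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]) + 1)
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]) + 1)
    inter = inter_w * inter_h
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


def greedy_nms(dets, threshold=0.5):
    """Return indices of kept detections; each detection is (x1, y1, x2, y2, score)."""
    # Visit detections from highest to lowest score.
    order = sorted(range(len(dets)), key=lambda k: dets[k][4], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Keep only the remaining boxes whose overlap with `best` is at most the threshold.
        order = [k for k in order if iou_inclusive(dets[best], dets[k]) <= threshold]
    return keep


# Hypothetical detections: two heavily overlapping boxes and one far away.
demo = [
    (10, 10, 50, 50, 0.9),
    (12, 12, 52, 52, 0.8),
    (100, 100, 140, 140, 0.7),
]
kept = greedy_nms(demo, threshold=0.4)
print(kept)
```

With these made-up boxes, the second detection overlaps the first with an IoU of about 0.83 and is suppressed, while the distant third box survives.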
|
|
@ -0,0 +1,62 @@
|
|||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
"""
|
||||
##############export checkpoint file into air, onnx or mindir model#################
|
||||
python export.py
|
||||
"""
|
||||
import argparse
|
||||
import numpy as np
|
||||
|
||||
from mindspore import Tensor, load_checkpoint, load_param_into_net, export, context
|
||||
|
||||
from src.network import RetinaFace, resnet50
|
||||
from src.config import cfg_res50
|
||||
|
||||
parser = argparse.ArgumentParser(description='retinaface export')
|
||||
parser.add_argument("--device_id", type=int, default=0, help="Device id")
|
||||
parser.add_argument("--batch_size", type=int, default=1, help="batch size")
|
||||
parser.add_argument("--ckpt_file", type=str, required=True, help="Checkpoint file path.")
|
||||
parser.add_argument("--file_name", type=str, default="retinaface", help="output file name.")
|
||||
parser.add_argument("--file_format", type=str, choices=["AIR", "ONNX", "MINDIR"], default="MINDIR", help="file format")
|
||||
parser.add_argument("--device_target", type=str, default="Ascend",
|
||||
choices=["Ascend", "GPU", "CPU"], help="device target(default: Ascend)")
|
||||
args = parser.parse_args()
|
||||
|
||||
context.set_context(mode=context.GRAPH_MODE, device_target=args.device_target)
|
||||
if args.device_target == "Ascend":
|
||||
context.set_context(device_id=args.device_id)
|
||||
|
||||
|
||||
def export_net():
|
||||
"""export net"""
|
||||
if cfg_res50['val_origin_size']:
|
||||
height, width = 5568, 1056
|
||||
else:
|
||||
height, width = 2176, 2176
|
||||
|
||||
backbone = resnet50(1001)
|
||||
net = RetinaFace(phase='predict', backbone=backbone)
|
||||
backbone.set_train(False)
|
||||
net.set_train(False)
|
||||
|
||||
assert args.ckpt_file is not None, "checkpoint_path is None."
|
||||
param_dict = load_checkpoint(args.ckpt_file)
|
||||
net.init_parameters_data()
|
||||
load_param_into_net(net, param_dict)
|
||||
input_arr = Tensor(np.zeros([args.batch_size, 3, height, width], np.float32))
|
||||
export(net, input_arr, file_name=args.file_name, file_format=args.file_format)
|
||||
|
||||
if __name__ == '__main__':
|
||||
export_net()
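
`export_net` fixes the exported input shape at 5568x1056 or 2176x2176, while `eval.py` pads every image to a size derived with `(int(x / 32) + 1) * 32` or to `max_size` squared after rescaling. The sketch below reproduces only that arithmetic in plain Python, assuming the defaults seen in `eval.py` (`target_size = 1600`, `max_size = 2176`, stride 32). The helper names `pad_to_stride` and `resize_factor` are hypothetical and do not exist in the repository.

```python
def pad_to_stride(size, stride=32):
    """Round up as eval.py does: always to the next multiple of `stride`."""
    return (int(size / stride) + 1) * stride


def resize_factor(height, width, target_size=1600, max_size=2176):
    """Scale the short side to target_size unless the long side would exceed max_size."""
    short_side, long_side = min(height, width), max(height, width)
    factor = float(target_size) / float(short_side)
    if round(factor * long_side) > max_size:
        factor = float(max_size) / float(long_side)
    return factor


padded_a = pad_to_stride(1000)        # rounds up to the next multiple of 32
padded_b = pad_to_stride(1024)        # an exact multiple still grows by one stride
factor_a = resize_factor(768, 1024)   # short side drives the scale
factor_b = resize_factor(683, 1024)   # long side is capped at max_size
print(padded_a, padded_b, factor_a, factor_b)
```

One consequence worth noting is that a dimension that is already a multiple of 32 is still padded by a full extra stride, so the padded canvas is always strictly larger than the largest image.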
|
|
@ -0,0 +1,423 @@
|
|||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
"""Infer Retinaface_resnet50."""
|
||||
from __future__ import print_function
|
||||
import argparse
|
||||
import os
|
||||
import time
|
||||
import datetime
|
||||
import numpy as np
|
||||
import cv2
|
||||
|
||||
from mindspore import context
|
||||
|
||||
from src.config import cfg_res50
|
||||
from src.utils import decode_bbox, prior_box
|
||||
|
||||
class Timer():
|
||||
def __init__(self):
|
||||
self.start_time = 0.
|
||||
self.diff = 0.
|
||||
|
||||
def start(self):
|
||||
self.start_time = time.time()
|
||||
|
||||
def end(self):
|
||||
self.diff = time.time() - self.start_time
|
||||
|
||||
class DetectionEngine:
|
||||
"""DetectionEngine"""
|
||||
def __init__(self, cfg):
|
||||
self.results = {}
|
||||
self.nms_thresh = cfg['val_nms_threshold']
|
||||
self.conf_thresh = cfg['val_confidence_threshold']
|
||||
self.iou_thresh = cfg['val_iou_threshold']
|
||||
self.var = cfg['variance']
|
||||
self.save_prefix = cfg['val_predict_save_folder']
|
||||
self.gt_dir = cfg['infer_gt_dir']
|
||||
|
||||
def _iou(self, a, b):
|
||||
"""_iou"""
|
||||
A = a.shape[0]
|
||||
B = b.shape[0]
|
||||
max_xy = np.minimum(
|
||||
np.broadcast_to(np.expand_dims(a[:, 2:4], 1), [A, B, 2]),
|
||||
np.broadcast_to(np.expand_dims(b[:, 2:4], 0), [A, B, 2]))
|
||||
min_xy = np.maximum(
|
||||
np.broadcast_to(np.expand_dims(a[:, 0:2], 1), [A, B, 2]),
|
||||
np.broadcast_to(np.expand_dims(b[:, 0:2], 0), [A, B, 2]))
|
||||
inter = np.maximum((max_xy - min_xy + 1), np.zeros_like(max_xy - min_xy))
|
||||
inter = inter[:, :, 0] * inter[:, :, 1]
|
||||
|
||||
area_a = np.broadcast_to(
|
||||
np.expand_dims(
|
||||
(a[:, 2] - a[:, 0] + 1) * (a[:, 3] - a[:, 1] + 1), 1),
|
||||
np.shape(inter))
|
||||
area_b = np.broadcast_to(
|
||||
np.expand_dims(
|
||||
(b[:, 2] - b[:, 0] + 1) * (b[:, 3] - b[:, 1] + 1), 0),
|
||||
np.shape(inter))
|
||||
union = area_a + area_b - inter
|
||||
return inter / union
|
||||
|
||||
def _nms(self, boxes, threshold=0.5):
|
||||
"""_nms"""
|
||||
x1 = boxes[:, 0]
|
||||
y1 = boxes[:, 1]
|
||||
x2 = boxes[:, 2]
|
||||
y2 = boxes[:, 3]
|
||||
scores = boxes[:, 4]
|
||||
|
||||
areas = (x2 - x1 + 1) * (y2 - y1 + 1)
|
||||
order = scores.argsort()[::-1]
|
||||
|
||||
reserved_boxes = []
|
||||
while order.size > 0:
|
||||
i = order[0]
|
||||
reserved_boxes.append(i)
|
||||
max_x1 = np.maximum(x1[i], x1[order[1:]])
|
||||
max_y1 = np.maximum(y1[i], y1[order[1:]])
|
||||
min_x2 = np.minimum(x2[i], x2[order[1:]])
|
||||
min_y2 = np.minimum(y2[i], y2[order[1:]])
|
||||
|
||||
intersect_w = np.maximum(0.0, min_x2 - max_x1 + 1)
|
||||
intersect_h = np.maximum(0.0, min_y2 - max_y1 + 1)
|
||||
intersect_area = intersect_w * intersect_h
|
||||
ovr = intersect_area / (areas[i] + areas[order[1:]] - intersect_area)
|
||||
|
||||
indices = np.where(ovr <= threshold)[0]
|
||||
order = order[indices + 1]
|
||||
|
||||
return reserved_boxes
|
||||
|
||||
def write_result(self):
|
||||
"""write_result"""
|
||||
# save result to file.
|
||||
import json
|
||||
t = datetime.datetime.now().strftime('_%Y_%m_%d_%H_%M_%S')
|
||||
try:
|
||||
if not os.path.isdir(self.save_prefix):
|
||||
os.makedirs(self.save_prefix)
|
||||
|
||||
self.file_path = self.save_prefix + '/predict' + t + '.json'
|
||||
f = open(self.file_path, 'w')
|
||||
json.dump(self.results, f)
|
||||
except IOError as e:
|
||||
raise RuntimeError("Unable to open json file to dump. What(): {}".format(str(e)))
|
||||
else:
|
||||
f.close()
|
||||
return self.file_path
|
||||
|
||||
def detect(self, boxes, confs, resize, scale, image_path, priors):
|
||||
"""detect"""
|
||||
if boxes.shape[0] == 0:
|
||||
# add to result
|
||||
event_name, img_name = image_path.split('/')
|
||||
self.results.setdefault(event_name, {})[img_name[:-4]] = {'img_path': image_path,
|
||||
'bboxes': []}
|
||||
return
|
||||
|
||||
boxes = decode_bbox(np.squeeze(boxes, 0), priors, self.var)
|
||||
boxes = boxes * scale / resize
|
||||
|
||||
scores = np.squeeze(confs, 0)[:, 1]
|
||||
# ignore low scores
|
||||
inds = np.where(scores > self.conf_thresh)[0]
|
||||
boxes = boxes[inds]
|
||||
scores = scores[inds]
|
||||
|
||||
# sort by score (descending) before NMS
|
||||
order = scores.argsort()[::-1]
|
||||
boxes = boxes[order]
|
||||
scores = scores[order]
|
||||
|
||||
# do NMS
|
||||
dets = np.hstack((boxes, scores[:, np.newaxis])).astype(np.float32, copy=False)
|
||||
keep = self._nms(dets, self.nms_thresh)
|
||||
dets = dets[keep, :]
|
||||
|
||||
dets[:, 2:4] = (dets[:, 2:4].astype(int) - dets[:, 0:2].astype(int)).astype(float)  # x2y2 -> wh, truncated to integers
|
||||
dets[:, 0:4] = dets[:, 0:4].astype(int).astype(float)  # truncate box coordinates to integers
|
||||
|
||||
|
||||
# add to result
|
||||
event_name, img_name = image_path.split('/')
|
||||
if event_name not in self.results.keys():
|
||||
self.results[event_name] = {}
|
||||
self.results[event_name][img_name[:-4]] = {'img_path': image_path,
|
||||
'bboxes': dets[:, :5].astype(float).tolist()}
|
||||
|
||||
def _get_gt_boxes(self):
|
||||
"""_get_gt_boxes"""
|
||||
from scipy.io import loadmat
|
||||
gt = loadmat(os.path.join(self.gt_dir, 'wider_face_val.mat'))
|
||||
hard = loadmat(os.path.join(self.gt_dir, 'wider_hard_val.mat'))
|
||||
medium = loadmat(os.path.join(self.gt_dir, 'wider_medium_val.mat'))
|
||||
easy = loadmat(os.path.join(self.gt_dir, 'wider_easy_val.mat'))
|
||||
|
||||
faceboxes = gt['face_bbx_list']
|
||||
events = gt['event_list']
|
||||
files = gt['file_list']
|
||||
|
||||
hard_gt_list = hard['gt_list']
|
||||
medium_gt_list = medium['gt_list']
|
||||
easy_gt_list = easy['gt_list']
|
||||
|
||||
return faceboxes, events, files, hard_gt_list, medium_gt_list, easy_gt_list
|
||||
|
||||
def _norm_pre_score(self):
|
||||
"""_norm_pre_score"""
|
||||
max_score = 0
|
||||
min_score = 1
|
||||
|
||||
for event in self.results:
|
||||
for name in self.results[event].keys():
|
||||
bbox = np.array(self.results[event][name]['bboxes']).astype(float)
|
||||
if bbox.shape[0] <= 0:
|
||||
continue
|
||||
max_score = max(max_score, np.max(bbox[:, -1]))
|
||||
min_score = min(min_score, np.min(bbox[:, -1]))
|
||||
|
||||
length = max(max_score - min_score, 1e-12)  # avoid division by zero when all scores are equal
|
||||
for event in self.results:
|
||||
for name in self.results[event].keys():
|
||||
bbox = np.array(self.results[event][name]['bboxes']).astype(float)
|
||||
if bbox.shape[0] <= 0:
|
||||
continue
|
||||
bbox[:, -1] -= min_score
|
||||
bbox[:, -1] /= length
|
||||
self.results[event][name]['bboxes'] = bbox.tolist()
|
||||
|
||||
def _image_eval(self, predict, gt, keep, iou_thresh, section_num):
|
||||
"""_image_eval"""
|
||||
_predict = predict.copy()
|
||||
_gt = gt.copy()
|
||||
|
||||
image_p_right = np.zeros(_predict.shape[0])
|
||||
image_gt_right = np.zeros(_gt.shape[0])
|
||||
proposal = np.ones(_predict.shape[0])
|
||||
|
||||
# x1y1wh -> x1y1x2y2
|
||||
_predict[:, 2:4] = _predict[:, 0:2] + _predict[:, 2:4]
|
||||
_gt[:, 2:4] = _gt[:, 0:2] + _gt[:, 2:4]
|
||||
|
||||
ious = self._iou(_predict[:, 0:4], _gt[:, 0:4])
|
||||
for i in range(_predict.shape[0]):
|
||||
gt_ious = ious[i, :]
|
||||
max_iou, max_index = gt_ious.max(), gt_ious.argmax()
|
||||
if max_iou >= iou_thresh:
|
||||
if keep[max_index] == 0:
|
||||
image_gt_right[max_index] = -1
|
||||
proposal[i] = -1
|
||||
elif image_gt_right[max_index] == 0:
|
||||
image_gt_right[max_index] = 1
|
||||
|
||||
right_index = np.where(image_gt_right == 1)[0]
|
||||
image_p_right[i] = len(right_index)
|
||||
|
||||
|
||||
|
||||
        image_pr = np.zeros((section_num, 2), dtype=np.float64)
|
||||
for section in range(section_num):
|
||||
_thresh = 1 - (section + 1)/section_num
|
||||
over_score_index = np.where(predict[:, 4] >= _thresh)[0]
|
||||
if over_score_index.shape[0] <= 0:
|
||||
image_pr[section, 0] = 0
|
||||
image_pr[section, 1] = 0
|
||||
else:
|
||||
index = over_score_index[-1]
|
||||
p_num = len(np.where(proposal[0:(index+1)] == 1)[0])
|
||||
image_pr[section, 0] = p_num
|
||||
image_pr[section, 1] = image_p_right[index]
|
||||
|
||||
return image_pr
|
||||
|
||||
|
||||
def get_eval_result(self):
|
||||
        """Compute and print the easy/medium/hard AP on the WIDER FACE validation set."""
|
||||
self._norm_pre_score()
|
||||
facebox_list, event_list, file_list, hard_gt_list, medium_gt_list, easy_gt_list = self._get_gt_boxes()
|
||||
section_num = 1000
|
||||
sets = ['easy', 'medium', 'hard']
|
||||
set_gts = [easy_gt_list, medium_gt_list, hard_gt_list]
|
||||
ap_key_dict = {0: "Easy Val AP : ", 1: "Medium Val AP : ", 2: "Hard Val AP : ",}
|
||||
ap_dict = {}
|
||||
for _set in range(len(sets)):
|
||||
gt_list = set_gts[_set]
|
||||
count_gt = 0
|
||||
            pr_curve = np.zeros((section_num, 2), dtype=np.float64)
|
||||
for i, _ in enumerate(event_list):
|
||||
event = str(event_list[i][0][0])
|
||||
image_list = file_list[i][0]
|
||||
event_predict_dict = self.results[event]
|
||||
event_gt_index_list = gt_list[i][0]
|
||||
event_gt_box_list = facebox_list[i][0]
|
||||
|
||||
for j, _ in enumerate(image_list):
|
||||
                    predict = np.array(event_predict_dict[str(image_list[j][0][0])]['bboxes']).astype(np.float64)
|
||||
gt_boxes = event_gt_box_list[j][0].astype('float')
|
||||
keep_index = event_gt_index_list[j][0]
|
||||
count_gt += len(keep_index)
|
||||
|
||||
if gt_boxes.shape[0] <= 0 or predict.shape[0] <= 0:
|
||||
continue
|
||||
keep = np.zeros(gt_boxes.shape[0])
|
||||
if keep_index.shape[0] > 0:
|
||||
keep[keep_index-1] = 1
|
||||
|
||||
image_pr = self._image_eval(predict, gt_boxes, keep,
|
||||
iou_thresh=self.iou_thresh,
|
||||
section_num=section_num)
|
||||
pr_curve += image_pr
|
||||
|
||||
precision = pr_curve[:, 1] / pr_curve[:, 0]
|
||||
recall = pr_curve[:, 1] / count_gt
|
||||
|
||||
precision = np.concatenate((np.array([0.]), precision, np.array([0.])))
|
||||
recall = np.concatenate((np.array([0.]), recall, np.array([1.])))
|
||||
for i in range(precision.shape[0]-1, 0, -1):
|
||||
precision[i-1] = np.maximum(precision[i-1], precision[i])
|
||||
index = np.where(recall[1:] != recall[:-1])[0]
|
||||
ap = np.sum((recall[index + 1] - recall[index]) * precision[index + 1])
|
||||
|
||||
|
||||
            print(ap_key_dict[_set] + '{:.4f}'.format(ap))
            # record the AP so that the returned dict is not empty
            ap_dict[sets[_set]] = ap
|
||||
|
||||
return ap_dict
|
||||
|
||||
|
||||
def val():
|
||||
    """Evaluate the inference outputs stored as .bin files under ./result_Files."""
|
||||
parser = argparse.ArgumentParser(description='Postprocess file')
|
||||
parser.add_argument('--device_id', type=int, default=0, help='device id.')
|
||||
args_opt = parser.parse_args()
|
||||
cfg = cfg_res50
|
||||
|
||||
context.set_context(mode=context.GRAPH_MODE, device_target='Ascend',
|
||||
device_id=args_opt.device_id, save_graphs=False)
|
||||
|
||||
# testing dataset
|
||||
testset_folder = cfg['infer_dataset_folder']
|
||||
    testset_label_path = os.path.join(cfg['infer_dataset_folder'], "label.txt")
|
||||
with open(testset_label_path, 'r') as f:
|
||||
_test_dataset = f.readlines()
|
||||
test_dataset = [] # such as "0--Parade/0_Parade_marchingband_1_465.jpg"
|
||||
for im_path in _test_dataset:
|
||||
if im_path.startswith('# '):
|
||||
test_dataset.append(im_path[2:-1]) # delete '# ...\n'
|
||||
|
||||
num_images = len(test_dataset)
|
||||
|
||||
timers = {'forward_time': Timer(), 'misc': Timer()}
|
||||
|
||||
if cfg['val_origin_size']:
|
||||
h_max, w_max = 0, 0
|
||||
for img_name in test_dataset:
|
||||
image_path = os.path.join(testset_folder, 'images', img_name) # .jpg's location
|
||||
_img = cv2.imread(image_path, cv2.IMREAD_COLOR)
|
||||
if _img.shape[0] > h_max:
|
||||
h_max = _img.shape[0]
|
||||
if _img.shape[1] > w_max:
|
||||
w_max = _img.shape[1]
|
||||
|
||||
h_max = (int(h_max / 32) + 1) * 32
|
||||
w_max = (int(w_max / 32) + 1) * 32
|
||||
|
||||
priors = prior_box(image_sizes=(h_max, w_max),
|
||||
min_sizes=[[16, 32], [64, 128], [256, 512]],
|
||||
steps=[8, 16, 32],
|
||||
clip=False)
|
||||
else:
|
||||
target_size = 1600
|
||||
max_size = 2176
|
||||
priors = prior_box(image_sizes=(max_size, max_size),
|
||||
min_sizes=[[16, 32], [64, 128], [256, 512]],
|
||||
steps=[8, 16, 32],
|
||||
clip=False)
|
||||
|
||||
# init detection engine
|
||||
detection = DetectionEngine(cfg)
|
||||
|
||||
# testing begin
|
||||
print('Predict box starting')
|
||||
for i, img_name in enumerate(test_dataset):
|
||||
image_path = os.path.join(testset_folder, 'images', img_name) # .jpg's location
|
||||
|
||||
img_raw = cv2.imread(image_path, cv2.IMREAD_COLOR)
|
||||
img = np.float32(img_raw)
|
||||
|
||||
# testing scale
|
||||
if cfg['val_origin_size']:
|
||||
resize = 1
|
||||
assert img.shape[0] <= h_max and img.shape[1] <= w_max
|
||||
image_t = np.empty((h_max, w_max, 3), dtype=img.dtype)
|
||||
image_t[:, :] = (104.0, 117.0, 123.0)
|
||||
image_t[0:img.shape[0], 0:img.shape[1]] = img
|
||||
img = image_t
|
||||
else:
|
||||
im_size_min = np.min(img.shape[0:2])
|
||||
im_size_max = np.max(img.shape[0:2])
|
||||
resize = float(target_size) / float(im_size_min)
|
||||
# prevent bigger axis from being more than max_size:
|
||||
if np.round(resize * im_size_max) > max_size:
|
||||
resize = float(max_size) / float(im_size_max)
|
||||
|
||||
img = cv2.resize(img, None, None, fx=resize, fy=resize, interpolation=cv2.INTER_LINEAR)
|
||||
|
||||
assert img.shape[0] <= max_size and img.shape[1] <= max_size
|
||||
image_t = np.empty((max_size, max_size, 3), dtype=img.dtype)
|
||||
image_t[:, :] = (104.0, 117.0, 123.0)
|
||||
image_t[0:img.shape[0], 0:img.shape[1]] = img
|
||||
img = image_t
|
||||
|
||||
scale = np.array([img.shape[1], img.shape[0], img.shape[1], img.shape[0]], dtype=img.dtype)
|
||||
|
||||
timers['forward_time'].start()
|
||||
boxes_name = os.path.join("./result_Files", "widerface_test" + "_" + str(i) + "_0.bin")
|
||||
boxes = np.fromfile(boxes_name, np.float32)
|
||||
if cfg['val_origin_size']:
|
||||
boxes = boxes.reshape(1, 241164, 4)
|
||||
else:
|
||||
boxes = boxes.reshape(1, 194208, 4)
|
||||
confs_name = os.path.join("./result_Files", "widerface_test" + "_" + str(i) + "_1.bin")
|
||||
confs = np.fromfile(confs_name, np.float32)
|
||||
if cfg['val_origin_size']:
|
||||
confs = confs.reshape(1, 241164, 2)
|
||||
else:
|
||||
confs = confs.reshape(1, 194208, 2)
|
||||
|
||||
timers['forward_time'].end()
|
||||
timers['misc'].start()
|
||||
detection.detect(boxes, confs, resize, scale, img_name, priors)
|
||||
timers['misc'].end()
|
||||
|
||||
print('im_detect: {:d}/{:d} forward_pass_time: {:.4f}s misc: {:.4f}s'.format(i + 1, num_images,
|
||||
timers['forward_time'].diff,
|
||||
timers['misc'].diff))
|
||||
print('Predict box done.')
|
||||
print('Eval starting')
|
||||
|
||||
if cfg['val_save_result']:
|
||||
# Save the predict result if you want.
|
||||
predict_result_path = detection.write_result()
|
||||
print('predict result path is {}'.format(predict_result_path))
|
||||
|
||||
detection.get_eval_result()
|
||||
print(cfg['val_model'])
|
||||
print('Eval done.')
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
val()
|
|
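The last step of `get_eval_result` turns the accumulated precision/recall curve into an AP value: the curve is padded, precision is made monotonically non-increasing from right to left, and precision is summed over the recall steps. The following is a minimal standalone sketch of that step, assuming plain NumPy arrays as input. The function name `voc_ap` and the sample values are illustrative only and are not part of this commit.

```python
import numpy as np


def voc_ap(recall, precision):
    """Area under a precision-recall curve using a monotone precision envelope."""
    # Pad so the curve starts at recall 0 and ends at recall 1.
    mpre = np.concatenate(([0.0], precision, [0.0]))
    mrec = np.concatenate(([0.0], recall, [1.0]))
    # Sweep right to left so each precision is the maximum of everything after it.
    for i in range(mpre.shape[0] - 1, 0, -1):
        mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])
    # Only points where recall changes contribute area.
    index = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[index + 1] - mrec[index]) * mpre[index + 1]))


# Hypothetical curve with four operating points.
recall = np.array([0.25, 0.5, 0.75, 1.0])
precision = np.array([1.0, 1.0, 0.75, 0.5])
ap = voc_ap(recall, precision)
print('{:.4f}'.format(ap))  # prints 0.8125
```

In the evaluation script the same computation runs once per difficulty subset, with `section_num = 1000` score thresholds defining the points of the curve.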
@ -0,0 +1,88 @@
|
|||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
"""preprocess"""
|
||||
from __future__ import print_function
|
||||
import argparse
|
||||
import os
|
||||
import numpy as np
|
||||
import cv2
|
||||
|
||||
from src.config import cfg_res50
|
||||
|
||||
cfg = cfg_res50
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
parser = argparse.ArgumentParser(description='Process file')
|
||||
parser.add_argument('--val_dataset_folder', type=str, default='/home/dataset/widerface/val',
|
||||
help='val dataset folder.')
|
||||
args_opt = parser.parse_args()
|
||||
|
||||
# testing dataset
|
||||
testset_folder = args_opt.val_dataset_folder
|
||||
testset_label_path = os.path.join(args_opt.val_dataset_folder, "label.txt")
|
||||
with open(testset_label_path, 'r') as f:
|
||||
_test_dataset = f.readlines()
|
||||
test_dataset = []
|
||||
for im_path in _test_dataset:
|
||||
if im_path.startswith('# '):
|
||||
test_dataset.append(im_path[2:-1]) # delete '# ...\n'
|
||||
|
||||
# transform data to bin_file
|
||||
print('Transform starting')
|
||||
img_path = "./bin_file"
|
||||
    os.makedirs(img_path, exist_ok=True)  # do not fail when ./bin_file already exists
|
||||
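    h_max, w_max = 5568, 1056
    # needed by the branch taken when cfg['val_origin_size'] is False;
    # the values match those used in the postprocessing script
    target_size, max_size = 1600, 2176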
|
||||
for i, img_name in enumerate(test_dataset):
|
||||
image_path = os.path.join(testset_folder, 'images', img_name)
|
||||
|
||||
img_raw = cv2.imread(image_path, cv2.IMREAD_COLOR)
|
||||
img = np.float32(img_raw)
|
||||
|
||||
# testing scale
|
||||
if cfg['val_origin_size']:
|
||||
resize = 1
|
||||
assert img.shape[0] <= h_max and img.shape[1] <= w_max
|
||||
image_t = np.empty((h_max, w_max, 3), dtype=img.dtype)
|
||||
image_t[:, :] = (104.0, 117.0, 123.0)
|
||||
image_t[0:img.shape[0], 0:img.shape[1]] = img
|
||||
img = image_t
|
||||
else:
|
||||
im_size_min = np.min(img.shape[0:2])
|
||||
im_size_max = np.max(img.shape[0:2])
|
||||
resize = float(target_size) / float(im_size_min)
|
||||
# prevent bigger axis from being more than max_size:
|
||||
if np.round(resize * im_size_max) > max_size:
|
||||
resize = float(max_size) / float(im_size_max)
|
||||
|
||||
img = cv2.resize(img, None, None, fx=resize, fy=resize, interpolation=cv2.INTER_LINEAR)
|
||||
|
||||
assert img.shape[0] <= max_size and img.shape[1] <= max_size
|
||||
image_t = np.empty((max_size, max_size, 3), dtype=img.dtype)
|
||||
image_t[:, :] = (104.0, 117.0, 123.0)
|
||||
image_t[0:img.shape[0], 0:img.shape[1]] = img
|
||||
img = image_t
|
||||
|
||||
scale = np.array([img.shape[1], img.shape[0], img.shape[1], img.shape[0]], dtype=img.dtype)
|
||||
img -= (104, 117, 123)
|
||||
img = img.transpose(2, 0, 1)
|
||||
img = np.expand_dims(img, 0) # [1, c, h, w] (1, 3, 2176, 2176)
|
||||
# save bin file
|
||||
file_name = "widerface_test" + "_" + str(i) + ".bin"
|
||||
file_path = os.path.join(img_path, file_name)
|
||||
img.tofile(file_path)
|
||||
if i % 50 == 0:
|
||||
print("Finish {} files".format(i))
|
||||
print("=" * 20, "export bin files finished", "=" * 20)
|
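When `cfg['val_origin_size']` is false, both the preprocessing and postprocessing scripts rescale each image so that its shorter side reaches `target_size`, unless that would push the longer side past `max_size`. A minimal sketch of that rule follows. The helper name `compute_resize` is hypothetical; the default values 1600 and 2176 are the ones that appear in the postprocessing script.

```python
import numpy as np


def compute_resize(height, width, target_size=1600, max_size=2176):
    """Return the scale factor applied before padding to (max_size, max_size)."""
    im_size_min = min(height, width)
    im_size_max = max(height, width)
    # Scale the shorter side up (or down) to target_size.
    resize = float(target_size) / float(im_size_min)
    # Fall back to fitting the longer side when it would exceed max_size.
    if np.round(resize * im_size_max) > max_size:
        resize = float(max_size) / float(im_size_max)
    return resize


print(compute_resize(800, 1000))   # shorter side drives the scale: 2.0
print(compute_resize(500, 1000))   # longer side is capped at max_size
```

The resized image is then copied into the top-left corner of a `max_size` x `max_size` canvas filled with the mean pixel `(104, 117, 123)`, so every input to the network has the same shape.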
|
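The hard-coded reshape sizes in the postprocessing script (241164 and 194208) follow from the padded input sizes, the strides `[8, 16, 32]`, and two anchor sizes per feature-map cell. The sketch below reproduces that arithmetic. It assumes each feature map has `ceil(size / step)` cells per axis, which is how RetinaFace prior boxes are usually generated. `num_priors` is a hypothetical helper and is not the `prior_box` function from this repository.

```python
from math import ceil


def num_priors(height, width, steps=(8, 16, 32), anchors_per_cell=2):
    """Count the prior boxes generated for a padded input of the given size."""
    total = 0
    for step in steps:
        # One feature-map cell per stride step along each axis.
        total += ceil(height / step) * ceil(width / step) * anchors_per_cell
    return total


print(num_priors(5568, 1056))  # original-size padding -> 241164
print(num_priors(2176, 2176))  # max_size padding -> 194208
```

If `h_max`, `w_max` or `max_size` are changed, the reshape sizes used when reading the `.bin` outputs need to be updated to match.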
@ -0,0 +1,3 @@
|
|||
numpy
|
||||
opencv-python
|
||||
scipy
|
|
@ -0,0 +1,40 @@
|
|||
#!/bin/bash
|
||||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
|
||||
|
||||
ulimit -u unlimited
|
||||
export DEVICE_NUM=8
|
||||
export RANK_SIZE=8
|
||||
RANK_TABLE_FILE=$(realpath $1)
|
||||
export RANK_TABLE_FILE
|
||||
echo "RANK_TABLE_FILE=${RANK_TABLE_FILE}"
|
||||
|
||||
export SERVER_ID=0
|
||||
rank_start=$((DEVICE_NUM * SERVER_ID))
|
||||
for((i=0; i<${DEVICE_NUM}; i++))
|
||||
do
|
||||
export DEVICE_ID=$i
|
||||
export RANK_ID=$((rank_start + i))
|
||||
rm -rf ./train_parallel$i
|
||||
mkdir ./train_parallel$i
|
||||
cp -r ./src ./train_parallel$i
|
||||
cp ./train.py ./train_parallel$i
|
||||
echo "start training for rank $RANK_ID, device $DEVICE_ID"
|
||||
cd ./train_parallel$i ||exit
|
||||
env > env.log
|
||||
python train.py --backbone_name 'ResNet50' --device_id=$i > log 2>&1 &
|
||||
cd ..
|
||||
done
|
|
@ -0,0 +1,27 @@
|
|||
#!/bin/bash
|
||||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
|
||||
echo "=============================================================================================================="
|
||||
echo "Please run the script as: "
|
||||
echo "bash run_distribute_gpu_train.sh DEVICE_NUM CUDA_VISIBLE_DEVICES"
|
||||
echo "for example: bash run_distribute_gpu_train.sh 4 0,1,2,3"
|
||||
echo "=============================================================================================================="
|
||||
|
||||
RANK_SIZE=$1
|
||||
export CUDA_VISIBLE_DEVICES="$2"
|
||||
|
||||
mpirun --allow-run-as-root -n $RANK_SIZE --output-filename log_output --merge-stderr-to-stdout \
|
||||
python train.py --backbone_name 'MobileNet025' > train.log 2>&1 &
|
|
@ -0,0 +1,129 @@
|
|||
#!/bin/bash
|
||||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
|
||||
if [[ $# -lt 2 || $# -gt 3 ]]; then
|
||||
echo "Usage: bash run_infer_310.sh [MINDIR_PATH] [DATASET_PATH] [DEVICE_ID]
|
||||
DEVICE_ID is optional, it can be set by environment variable device_id, otherwise the value is zero"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
get_real_path(){
|
||||
if [ "${1:0:1}" == "/" ]; then
|
||||
echo "$1"
|
||||
else
|
||||
echo "$(realpath -m $PWD/$1)"
|
||||
fi
|
||||
}
|
||||
model=$(get_real_path $1)
|
||||
|
||||
dataset_path=$(get_real_path $2)
|
||||
|
||||
|
||||
device_id=0
|
||||
if [ $# == 3 ]; then
|
||||
device_id=$3
|
||||
fi
|
||||
|
||||
echo "mindir name: "$model
|
||||
echo "dataset path: "$dataset_path
|
||||
echo "device id: "$device_id
|
||||
|
||||
export ASCEND_HOME=/usr/local/Ascend/
|
||||
if [ -d ${ASCEND_HOME}/ascend-toolkit ]; then
|
||||
export PATH=$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/ascend-toolkit/latest/atc/bin:$PATH
|
||||
export LD_LIBRARY_PATH=/usr/local/lib:$ASCEND_HOME/ascend-toolkit/latest/atc/lib64:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/lib64:$ASCEND_HOME/driver/lib64:$ASCEND_HOME/add-ons:$LD_LIBRARY_PATH
|
||||
export TBE_IMPL_PATH=$ASCEND_HOME/ascend-toolkit/latest/opp/op_impl/built-in/ai_core/tbe
|
||||
export PYTHONPATH=${TBE_IMPL_PATH}:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/python/site-packages:$PYTHONPATH
|
||||
export ASCEND_OPP_PATH=$ASCEND_HOME/ascend-toolkit/latest/opp
|
||||
else
|
||||
export PATH=$ASCEND_HOME/atc/ccec_compiler/bin:$ASCEND_HOME/atc/bin:$PATH
|
||||
export LD_LIBRARY_PATH=/usr/local/lib:$ASCEND_HOME/atc/lib64:$ASCEND_HOME/acllib/lib64:$ASCEND_HOME/driver/lib64:$ASCEND_HOME/add-ons:$LD_LIBRARY_PATH
|
||||
export PYTHONPATH=$ASCEND_HOME/atc/python/site-packages:$PYTHONPATH
|
||||
export ASCEND_OPP_PATH=$ASCEND_HOME/opp
|
||||
fi
|
||||
|
||||
export ASCEND_HOME=/usr/local/Ascend
|
||||
|
||||
export PATH=$ASCEND_HOME/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/fwkacllib/bin:$ASCEND_HOME/toolkit/bin:$PATH
|
||||
|
||||
export LD_LIBRARY_PATH=/usr/local/lib/:/usr/local/fwkacllib/lib64:$ASCEND_HOME/driver/lib64:$ASCEND_HOME/add-ons:/usr/local/Ascend/toolkit/lib64:$LD_LIBRARY_PATH
|
||||
|
||||
export PYTHONPATH=$ASCEND_HOME/fwkacllib/python/site-packages
|
||||
|
||||
export PATH=/usr/local/python375/bin:$PATH
|
||||
export NPU_HOST_LIB=/usr/local/Ascend/acllib/lib64/stub
|
||||
export ASCEND_OPP_PATH=/usr/local/Ascend/opp
|
||||
export ASCEND_AICPU_PATH=/usr/local/Ascend
|
||||
export LD_LIBRARY_PATH=/usr/local/lib64/:$LD_LIBRARY_PATH
|
||||
|
||||
function preprocess_data()
|
||||
{
|
||||
if [ -d preprocess_Result ]; then
|
||||
rm -rf ./preprocess_Result
|
||||
fi
|
||||
mkdir preprocess_Result
|
||||
python3.7 ../preprocess.py --val_dataset_folder=$dataset_path
|
||||
}
|
||||
|
||||
function compile_app()
|
||||
{
|
||||
cd ../ascend310_infer/ || exit
|
||||
bash build.sh &> build.log
|
||||
}
|
||||
|
||||
function infer()
|
||||
{
|
||||
cd - || exit
|
||||
if [ -d result_Files ]; then
|
||||
rm -rf ./result_Files
|
||||
fi
|
||||
if [ -d time_Result ]; then
|
||||
rm -rf ./time_Result
|
||||
fi
|
||||
mkdir result_Files
|
||||
mkdir time_Result
|
||||
|
||||
../ascend310_infer/out/main --mindir_path=$model --input0_path=./bin_file --device_id=$device_id &> infer.log
|
||||
}
|
||||
|
||||
function cal_acc()
|
||||
{
|
||||
python3.7 ../postprocess.py --device_id=$device_id &> acc.log
|
||||
}
|
||||
|
||||
preprocess_data
|
||||
if [ $? -ne 0 ]; then
|
||||
echo "preprocess dataset failed"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
compile_app
|
||||
if [ $? -ne 0 ]; then
|
||||
echo "compile app code failed"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
infer
|
||||
if [ $? -ne 0 ]; then
|
||||
echo " execute inference failed"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
cal_acc
|
||||
if [ $? -ne 0 ]; then
|
||||
echo "calculate accuracy failed"
|
||||
exit 1
|
||||
fi
|
|
@ -0,0 +1,22 @@
|
|||
#!/bin/bash
|
||||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
|
||||
echo "=============================================================================================================="
|
||||
echo "Please run the script as: "
|
||||
echo "for example: bash run_standalone_eval_ascend.sh [CKPT_FILE]"
|
||||
echo "=============================================================================================================="
|
||||
|
||||
python eval.py --backbone_name 'ResNet50' --val_model $1 > ./eval.log 2>&1 &
|
|
@ -0,0 +1,24 @@
|
|||
#!/bin/bash
|
||||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
|
||||
echo "=============================================================================================================="
|
||||
echo "Please run the script as: "
|
||||
echo "bash run_standalone_gpu_eval.sh CUDA_VISIBLE_DEVICES [CKPT_FILE]"
|
||||
echo "for example: bash run_standalone_gpu_eval.sh 0 [CKPT_FILE]"
|
||||
echo "=============================================================================================================="
|
||||
|
||||
export CUDA_VISIBLE_DEVICES="$1"
|
||||
python eval.py --backbone_name 'MobileNet025' --val_model $2 > eval.log 2>&1 &
|
|
@ -0,0 +1,19 @@
|
|||
#!/bin/bash
|
||||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
|
||||
echo "Usage: bash ./scripts/run_standalone_train_ascend.sh"
|
||||
|
||||
python train.py --backbone_name 'ResNet50' > train.log 2>&1 &
|
|
@ -0,0 +1,315 @@
|
|||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
"""Augmentation."""
|
||||
import random
|
||||
import copy
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
def _rand(a=0., b=1.):
|
||||
return np.random.rand() * (b - a) + a
|
||||
|
||||
def bbox_iof(bbox_a, bbox_b, offset=0):
|
||||
"""bbox_iof"""
|
||||
if bbox_a.shape[1] < 4 or bbox_b.shape[1] < 4:
|
||||
raise IndexError("Bounding boxes axis 1 must have at least length 4")
|
||||
|
||||
tl = np.maximum(bbox_a[:, None, 0:2], bbox_b[:, 0:2])
|
||||
br = np.minimum(bbox_a[:, None, 2:4], bbox_b[:, 2:4])
|
||||
|
||||
area_i = np.prod(br - tl + offset, axis=2) * (tl < br).all(axis=2)
|
||||
area_a = np.prod(bbox_a[:, 2:4] - bbox_a[:, :2] + offset, axis=1)
|
||||
return area_i / np.maximum(area_a[:, None], 1)
|
||||
|
||||
def _is_iof_satisfied_constraint(box, crop_box):
|
||||
iof = bbox_iof(box, crop_box)
|
||||
satisfied = np.any((iof >= 1.0))
|
||||
return satisfied
|
||||
|
||||
def _choose_candidate(max_trial, image_w, image_h, boxes):
|
||||
"""_choose_candidate"""
|
||||
# add default candidate
|
||||
candidates = [(0, 0, image_w, image_h)]
|
||||
|
||||
for _ in range(max_trial):
|
||||
# box_data should have at least one box
|
||||
if _rand() > 0.2:
|
||||
scale = _rand(0.3, 1.0)
|
||||
else:
|
||||
scale = 1.0
|
||||
|
||||
nh = int(scale * min(image_w, image_h))
|
||||
nw = nh
|
||||
|
||||
dx = int(_rand(0, image_w - nw))
|
||||
dy = int(_rand(0, image_h - nh))
|
||||
|
||||
if boxes.shape[0] > 0:
|
||||
crop_box = np.array((dx, dy, dx + nw, dy + nh))
|
||||
if not _is_iof_satisfied_constraint(boxes, crop_box[np.newaxis]):
|
||||
continue
|
||||
else:
|
||||
candidates.append((dx, dy, nw, nh))
|
||||
else:
|
||||
            raise Exception("at least one annotation box is required")
|
||||
|
||||
if len(candidates) >= 3:
|
||||
break
|
||||
|
||||
return candidates
|
||||
|
||||
def _correct_bbox_by_candidates(candidates, input_w, input_h, flip, boxes, labels, landms, allow_outside_center):
|
||||
"""Calculate correct boxes."""
|
||||
while candidates:
|
||||
if len(candidates) > 1:
|
||||
            # skip the default candidate, which does not crop
|
||||
candidate = candidates.pop(np.random.randint(1, len(candidates)))
|
||||
else:
|
||||
candidate = candidates.pop(np.random.randint(0, len(candidates)))
|
||||
dx, dy, nw, nh = candidate
|
||||
|
||||
boxes_t = copy.deepcopy(boxes)
|
||||
landms_t = copy.deepcopy(landms)
|
||||
labels_t = copy.deepcopy(labels)
|
||||
landms_t = landms_t.reshape([-1, 5, 2])
|
||||
|
||||
if nw == nh:
|
||||
scale = float(input_w) / float(nw)
|
||||
else:
|
||||
scale = float(input_w) / float(max(nh, nw))
|
||||
boxes_t[:, [0, 2]] = (boxes_t[:, [0, 2]] - dx) * scale
|
||||
boxes_t[:, [1, 3]] = (boxes_t[:, [1, 3]] - dy) * scale
|
||||
landms_t[:, :, 0] = (landms_t[:, :, 0] - dx) * scale
|
||||
landms_t[:, :, 1] = (landms_t[:, :, 1] - dy) * scale
|
||||
|
||||
if flip:
|
||||
boxes_t[:, [0, 2]] = input_w - boxes_t[:, [2, 0]]
|
||||
landms_t[:, :, 0] = input_w - landms_t[:, :, 0]
|
||||
# flip landms
|
||||
landms_t_1 = landms_t[:, 1, :].copy()
|
||||
landms_t[:, 1, :] = landms_t[:, 0, :]
|
||||
landms_t[:, 0, :] = landms_t_1
|
||||
landms_t_4 = landms_t[:, 4, :].copy()
|
||||
landms_t[:, 4, :] = landms_t[:, 3, :]
|
||||
landms_t[:, 3, :] = landms_t_4
|
||||
|
||||
if allow_outside_center:
|
||||
pass
|
||||
else:
|
||||
mask1 = np.logical_and((boxes_t[:, 0] + boxes_t[:, 2])/2. >= 0., (boxes_t[:, 1] + boxes_t[:, 3])/2. >= 0.)
|
||||
boxes_t = boxes_t[mask1]
|
||||
landms_t = landms_t[mask1]
|
||||
labels_t = labels_t[mask1]
|
||||
|
||||
mask2 = np.logical_and((boxes_t[:, 0] + boxes_t[:, 2]) / 2. <= input_w,
|
||||
(boxes_t[:, 1] + boxes_t[:, 3]) / 2. <= input_h)
|
||||
boxes_t = boxes_t[mask2]
|
||||
landms_t = landms_t[mask2]
|
||||
labels_t = labels_t[mask2]
|
||||
|
||||
        # clamp x1, y1 to zero: after shifting by dx, dy some coordinates can be negative
|
||||
boxes_t[:, 0:2][boxes_t[:, 0:2] < 0] = 0
|
||||
        # clamp x2, y2 to the input size
|
||||
boxes_t[:, 2][boxes_t[:, 2] > input_w] = input_w
|
||||
boxes_t[:, 3][boxes_t[:, 3] > input_h] = input_h
|
||||
box_w = boxes_t[:, 2] - boxes_t[:, 0]
|
||||
box_h = boxes_t[:, 3] - boxes_t[:, 1]
|
||||
# discard invalid box: w or h smaller than 1 pixel
|
||||
mask3 = np.logical_and(box_w > 1, box_h > 1)
|
||||
boxes_t = boxes_t[mask3]
|
||||
landms_t = landms_t[mask3]
|
||||
labels_t = labels_t[mask3]
|
||||
|
||||
        # normalize coordinates to [0, 1]
|
||||
boxes_t[:, [0, 2]] /= input_w
|
||||
boxes_t[:, [1, 3]] /= input_h
|
||||
landms_t[:, :, 0] /= input_w
|
||||
landms_t[:, :, 1] /= input_h
|
||||
|
||||
landms_t = landms_t.reshape([-1, 10])
|
||||
labels_t = np.expand_dims(labels_t, 1)
|
||||
|
||||
targets_t = np.hstack((boxes_t, landms_t, labels_t))
|
||||
|
||||
if boxes_t.shape[0] > 0:
|
||||
|
||||
return targets_t, candidate
|
||||
|
||||
    raise Exception('no candidate satisfies the bbox re-correction constraints')
|
||||
|
||||
def get_interp_method(interp, sizes=()):
|
||||
"""Get the interpolation method for resize functions.
|
||||
The major purpose of this function is to wrap a random interp method selection
|
||||
    and an auto-estimation method.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
interp : int
|
||||
interpolation method for all resizing operations
|
||||
|
||||
Possible values:
|
||||
0: Nearest Neighbors Interpolation.
|
||||
1: Bilinear interpolation.
|
||||
2: Bicubic interpolation over 4x4 pixel neighborhood.
|
||||
3: Nearest Neighbors. [Originally it should be Area-based,
|
||||
as we cannot find Area-based, so we use NN instead.
|
||||
Area-based (resampling using pixel area relation). It may be a
|
||||
preferred method for image decimation, as it gives moire-free
|
||||
results. But when the image is zoomed, it is similar to the Nearest
|
||||
Neighbors method. (used by default).
|
||||
4: Lanczos interpolation over 8x8 pixel neighborhood.
|
||||
9: Cubic for enlarge, area for shrink, bilinear for others
|
||||
10: Random select from interpolation method mentioned above.
|
||||
Note:
|
||||
When shrinking an image, it will generally look best with AREA-based
|
||||
interpolation, whereas, when enlarging an image, it will generally look best
|
||||
with Bicubic (slow) or Bilinear (faster but still looks OK).
|
||||
More details can be found in the documentation of OpenCV, please refer to
|
||||
http://docs.opencv.org/master/da/d54/group__imgproc__transform.html.
|
||||
sizes : tuple of int
|
||||
(old_height, old_width, new_height, new_width), if None provided, auto(9)
|
||||
will return Area(2) anyway.
|
||||
|
||||
Returns
|
||||
-------
|
||||
int
|
||||
interp method from 0 to 4
|
||||
"""
|
||||
if interp == 9:
|
||||
if sizes:
|
||||
assert len(sizes) == 4
|
||||
oh, ow, nh, nw = sizes
|
||||
if nh > oh and nw > ow:
|
||||
return 2
|
||||
if nh < oh and nw < ow:
|
||||
return 0
|
||||
return 1
|
||||
return 2
|
||||
if interp == 10:
|
||||
return random.randint(0, 4)
|
||||
if interp not in (0, 1, 2, 3, 4):
|
||||
raise ValueError('Unknown interp method %d' % interp)
|
||||
return interp
|
||||
|
||||
def cv_image_reshape(interp):
|
||||
    """Map an interpolation index (0-4) to the corresponding OpenCV interpolation flag."""
|
||||
reshape_type = {
|
||||
0: cv2.INTER_LINEAR,
|
||||
1: cv2.INTER_CUBIC,
|
||||
2: cv2.INTER_AREA,
|
||||
3: cv2.INTER_NEAREST,
|
||||
4: cv2.INTER_LANCZOS4,
|
||||
}
|
||||
return reshape_type[interp]
|
||||
|
||||
def color_convert(image, a=1, b=0):
|
||||
c_image = image.astype(float) * a + b
|
||||
c_image[c_image < 0] = 0
|
||||
c_image[c_image > 255] = 255
|
||||
|
||||
image[:] = c_image
|
||||
|
||||
def color_distortion(image):
|
||||
"""color_distortion"""
|
||||
image = copy.deepcopy(image)
|
||||
|
||||
if _rand() > 0.5:
|
||||
if _rand() > 0.5:
|
||||
color_convert(image, b=_rand(-32, 32))
|
||||
if _rand() > 0.5:
|
||||
color_convert(image, a=_rand(0.5, 1.5))
|
||||
image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
|
||||
if _rand() > 0.5:
|
||||
color_convert(image[:, :, 1], a=_rand(0.5, 1.5))
|
||||
if _rand() > 0.5:
|
||||
h_img = image[:, :, 0].astype(int) + random.randint(-18, 18)
|
||||
h_img %= 180
|
||||
image[:, :, 0] = h_img
|
||||
image = cv2.cvtColor(image, cv2.COLOR_HSV2BGR)
|
||||
else:
|
||||
if _rand() > 0.5:
|
||||
color_convert(image, b=random.uniform(-32, 32))
|
||||
image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
|
||||
if _rand() > 0.5:
|
||||
color_convert(image[:, :, 1], a=random.uniform(0.5, 1.5))
|
||||
if _rand() > 0.5:
|
||||
tmp = image[:, :, 0].astype(int) + random.randint(-18, 18)
|
||||
tmp %= 180
|
||||
image[:, :, 0] = tmp
|
||||
image = cv2.cvtColor(image, cv2.COLOR_HSV2BGR)
|
||||
if _rand() > 0.5:
|
||||
color_convert(image, a=random.uniform(0.5, 1.5))
|
||||
|
||||
return image
|
||||
|
||||
class preproc():
|
||||
"""preproc"""
|
||||
def __init__(self, image_dim):
|
||||
self.image_input_size = image_dim
|
||||
|
||||
def __call__(self, image, target):
|
||||
assert target.shape[0] > 0, "target without ground truth."
|
||||
_target = copy.deepcopy(target)
|
||||
boxes = _target[:, :4]
|
||||
landms = _target[:, 4:-1]
|
||||
labels = _target[:, -1]
|
||||
|
||||
aug_image, aug_target = self._data_aug(image, boxes, labels, landms, self.image_input_size)
|
||||
|
||||
return aug_image, aug_target
|
||||
|
||||
def _data_aug(self, image, boxes, labels, landms, image_input_size, max_trial=250):
|
||||
"""_data_aug"""
|
||||
image_h, image_w, _ = image.shape
|
||||
input_h, input_w = image_input_size, image_input_size
|
||||
|
||||
flip = _rand() < .5
|
||||
|
||||
candidates = _choose_candidate(max_trial=max_trial,
|
||||
image_w=image_w,
|
||||
image_h=image_h,
|
||||
boxes=boxes)
|
||||
targets, candidate = _correct_bbox_by_candidates(candidates=candidates,
|
||||
input_w=input_w,
|
||||
input_h=input_h,
|
||||
flip=flip,
|
||||
boxes=boxes,
|
||||
labels=labels,
|
||||
landms=landms,
|
||||
allow_outside_center=False)
|
||||
# crop image
|
||||
dx, dy, nw, nh = candidate
|
||||
image = image[dy:(dy + nh), dx:(dx + nw)]
|
||||
|
||||
if nw != nh:
|
||||
assert nw == image_w and nh == image_h
|
||||
# pad the original image to a square canvas
|
||||
l = max(nw, nh)
|
||||
t_image = np.empty((l, l, 3), dtype=image.dtype)
|
||||
t_image[:, :] = (104, 117, 123)
|
||||
t_image[:nh, :nw] = image
|
||||
image = t_image
|
||||
|
||||
interp = get_interp_method(interp=10)
|
||||
image = cv2.resize(image, (input_w, input_h), interpolation=cv_image_reshape(interp))
|
||||
|
||||
if flip:
|
||||
image = image[:, ::-1]
|
||||
|
||||
image = image.astype(np.float32)
|
||||
image -= (104, 117, 123)
|
||||
image = image.transpose(2, 0, 1)
|
||||
|
||||
return image, targets
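The two colour operations used by `color_distortion` above can be sketched in isolation. This is a minimal pure-Python illustration, not the repository's implementation: the helper names `scale_shift_clamp` and `shift_hue` are hypothetical, and plain lists stand in for the NumPy image arrays.

```python
def scale_shift_clamp(pixels, a=1.0, b=0.0):
    """Apply pixel * a + b, then clamp each value to [0, 255].

    Mirrors the arithmetic in color_convert; the name and the
    list-based interface are assumptions made for this sketch.
    """
    return [min(255.0, max(0.0, p * a + b)) for p in pixels]


def shift_hue(hues, delta):
    """Shift 8-bit OpenCV-style hues (0..179) with wrap-around.

    Mirrors the `% 180` step applied to the H channel above.
    """
    return [(h + delta) % 180 for h in hues]


contrast = scale_shift_clamp([0, 100, 250], a=1.5, b=-10)
wrapped = shift_hue([0, 90, 175], 10)
```

Brightness and contrast changes saturate at the ends of the range, while hue is circular, so it wraps instead of clamping.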
|
|
@ -0,0 +1,134 @@
|
|||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
"""Config for train and eval."""
|
||||
cfg_res50 = {
|
||||
'name': 'ResNet50',
|
||||
'device_target': "Ascend",
|
||||
'device_id': 0,
|
||||
'variance': [0.1, 0.2],
|
||||
'clip': False,
|
||||
'loc_weight': 2.0,
|
||||
'class_weight': 1.0,
|
||||
'landm_weight': 1.0,
|
||||
'batch_size': 8,
|
||||
'num_workers': 16,
|
||||
'num_anchor': 29126,
|
||||
'nnpu': 8,
|
||||
'image_size': 840,
|
||||
'in_channel': 256,
|
||||
'out_channel': 256,
|
||||
'match_thresh': 0.35,
|
||||
|
||||
# opt
|
||||
'optim': 'sgd', # 'sgd' or 'momentum'
|
||||
'momentum': 0.9,
|
||||
'weight_decay': 1e-4,
|
||||
'loss_scale': 1,
|
||||
|
||||
# seed
|
||||
'seed': 1,
|
||||
|
||||
# lr
|
||||
'epoch': 60,
|
||||
'T_max': 50, # cosine_annealing
|
||||
'eta_min': 0.0, # cosine_annealing
|
||||
'decay1': 20,
|
||||
'decay2': 40,
|
||||
'lr_type': 'dynamic_lr', # 'dynamic_lr' or 'cosine_annealing'
|
||||
'initial_lr': 0.04,
|
||||
'warmup_epoch': -1, # dynamic_lr: -1, cosine_annealing:0
|
||||
'gamma': 0.1,
|
||||
|
||||
# checkpoint
|
||||
'ckpt_path': './checkpoint/',
|
||||
'keep_checkpoint_max': 8,
|
||||
'resume_net': None,
|
||||
|
||||
# dataset
|
||||
'training_dataset': '../data/widerface/train/label.txt',
|
||||
'pretrain': True,
|
||||
'pretrain_path': '../data/resnet-90_625.ckpt',
|
||||
|
||||
# val
|
||||
'val_model': './train_parallel3/checkpoint/ckpt_3/RetinaFace-56_201.ckpt',
|
||||
'val_dataset_folder': './data/widerface/val/',
|
||||
'val_origin_size': True,
|
||||
'val_confidence_threshold': 0.02,
|
||||
'val_nms_threshold': 0.4,
|
||||
'val_iou_threshold': 0.5,
|
||||
'val_save_result': False,
|
||||
'val_predict_save_folder': './widerface_result',
|
||||
'val_gt_dir': './data/ground_truth/',
|
||||
|
||||
# infer
|
||||
'infer_dataset_folder': '/home/dataset/widerface/val/',
|
||||
'infer_gt_dir': '/home/dataset/widerface/ground_truth/',
|
||||
}
|
||||
|
||||
cfg_mobile025 = {
|
||||
'name': 'MobileNet025',
|
||||
'variance': [0.1, 0.2],
|
||||
'clip': False,
|
||||
'loc_weight': 2.0,
|
||||
'class_weight': 1.0,
|
||||
'landm_weight': 1.0,
|
||||
'batch_size': 8,
|
||||
'num_workers': 12,
|
||||
'num_anchor': 16800,
|
||||
'ngpu': 2,
|
||||
'image_size': 640,
|
||||
'in_channel': 32,
|
||||
'out_channel': 64,
|
||||
'match_thresh': 0.35,
|
||||
|
||||
# opt
|
||||
'optim': 'sgd',
|
||||
'momentum': 0.9,
|
||||
'weight_decay': 5e-4,
|
||||
|
||||
# seed
|
||||
'seed': 1,
|
||||
|
||||
# lr
|
||||
'epoch': 120,
|
||||
'decay1': 70,
|
||||
'decay2': 90,
|
||||
'lr_type': 'dynamic_lr',
|
||||
'initial_lr': 0.02,
|
||||
'warmup_epoch': 5,
|
||||
'gamma': 0.1,
|
||||
|
||||
# checkpoint
|
||||
'ckpt_path': './checkpoint/',
|
||||
'save_checkpoint_steps': 2000,
|
||||
'keep_checkpoint_max': 3,
|
||||
'resume_net': None,
|
||||
|
||||
# dataset
|
||||
'training_dataset': '../data/widerface/train/label.txt',
|
||||
'pretrain': False,
|
||||
'pretrain_path': '../data/mobilenetv1-90_5004.ckpt',
|
||||
|
||||
# val
|
||||
'val_model': './checkpoint/ckpt_0/RetinaFace-117_804.ckpt',
|
||||
'val_dataset_folder': './data/widerface/val/',
|
||||
'val_origin_size': False,
|
||||
'val_confidence_threshold': 0.02,
|
||||
'val_nms_threshold': 0.4,
|
||||
'val_iou_threshold': 0.5,
|
||||
'val_save_result': False,
|
||||
'val_predict_save_folder': './widerface_result',
|
||||
'val_gt_dir': './data/ground_truth/',
|
||||
}
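The `num_anchor` values in both configs are consistent with a three-level feature pyramid. The sketch below assumes strides of 8, 16 and 32, two anchors per feature-map cell, and feature-map sizes rounded up. None of these is stated in the config file itself, and `count_anchors` is a hypothetical helper name.

```python
import math


def count_anchors(image_size, strides=(8, 16, 32), anchors_per_cell=2):
    """Count anchors for a square input (assumed strides and anchor count)."""
    total = 0
    for stride in strides:
        side = math.ceil(image_size / stride)  # feature-map side length
        total += side * side * anchors_per_cell
    return total


res50_anchors = count_anchors(840)   # cfg_res50['image_size']
mobile_anchors = count_anchors(640)  # cfg_mobile025['image_size']
```

Under these assumptions the counts match `cfg_res50['num_anchor']` and `cfg_mobile025['num_anchor']`, so changing `image_size` would likely require updating `num_anchor` as well.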
|
|
@ -0,0 +1,171 @@
|
|||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
"""Dataset for train and eval."""
|
||||
import os
|
||||
import copy
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
import mindspore.dataset as de
|
||||
from mindspore.communication.management import init, get_rank, get_group_size
|
||||
|
||||
from .augmemtation import preproc
|
||||
from .utils import bbox_encode
|
||||
|
||||
|
||||
class WiderFace():
|
||||
"""WiderFace"""
|
||||
def __init__(self, label_path):
|
||||
self.images_list = []
|
||||
self.labels_list = []
|
||||
f = open(label_path, 'r')
|
||||
lines = f.readlines()
|
||||
First = True
|
||||
labels = []
|
||||
for line in lines:
|
||||
line = line.rstrip()
|
||||
if line.startswith('#'):
|
||||
if First is True:
|
||||
First = False
|
||||
else:
|
||||
c_labels = copy.deepcopy(labels)
|
||||
self.labels_list.append(c_labels)
|
||||
labels.clear()
|
||||
# remove '# '
|
||||
path = line[2:]
|
||||
path = label_path.replace('label.txt', 'images/') + path
|
||||
|
||||
assert os.path.exists(path), 'image path does not exist.'
|
||||
|
||||
self.images_list.append(path)
|
||||
else:
|
||||
line = line.split(' ')
|
||||
label = [float(x) for x in line]
|
||||
labels.append(label)
|
||||
# add the last label
|
||||
self.labels_list.append(labels)
|
||||
|
||||
# remove bounding boxes whose width or height is not positive
|
||||
for i in range(len(self.labels_list) - 1, -1, -1):
|
||||
labels = self.labels_list[i]
|
||||
for j in range(len(labels) - 1, -1, -1):
|
||||
label = labels[j]
|
||||
if label[2] <= 0 or label[3] <= 0:
|
||||
labels.pop(j)
|
||||
if not labels:
|
||||
self.images_list.pop(i)
|
||||
self.labels_list.pop(i)
|
||||
else:
|
||||
self.labels_list[i] = labels
|
||||
|
||||
def __len__(self):
|
||||
return len(self.images_list)
|
||||
|
||||
def __getitem__(self, item):
|
||||
return self.images_list[item], self.labels_list[item]
|
||||
|
||||
def read_dataset(img_path, annotation):
|
||||
"""read_dataset"""
|
||||
cv2.setNumThreads(2)
|
||||
|
||||
if isinstance(img_path, str):
|
||||
img = cv2.imread(img_path)
|
||||
else:
|
||||
img = cv2.imread(img_path.tobytes().decode("utf-8"))
|
||||
|
||||
labels = annotation
|
||||
anns = np.zeros((0, 15))
|
||||
if labels.shape[0] <= 0:
|
||||
return img, anns
|
||||
for _, label in enumerate(labels):
|
||||
ann = np.zeros((1, 15))
|
||||
|
||||
# get bbox
|
||||
ann[0, 0:2] = label[0:2] # x1, y1
|
||||
ann[0, 2:4] = label[0:2] + label[2:4] # x2, y2
|
||||
|
||||
# get landmarks
|
||||
ann[0, 4:14] = label[[4, 5, 7, 8, 10, 11, 13, 14, 16, 17]]
|
||||
|
||||
# set flag
|
||||
if (ann[0, 4] < 0):
|
||||
ann[0, 14] = -1
|
||||
else:
|
||||
ann[0, 14] = 1
|
||||
|
||||
anns = np.append(anns, ann, axis=0)
|
||||
target = np.array(anns).astype(np.float32)
|
||||
|
||||
return img, target
|
||||
|
||||
|
||||
def create_dataset(data_dir, cfg, batch_size=32, repeat_num=1, shuffle=True, multiprocessing=True, num_worker=16):
|
||||
"""create_dataset"""
|
||||
dataset = WiderFace(data_dir)
|
||||
|
||||
if cfg['name'] == 'ResNet50':
|
||||
device_num, rank_id = _get_rank_info()
|
||||
elif cfg['name'] == 'MobileNet025':
|
||||
init("nccl")
|
||||
rank_id = get_rank()
|
||||
device_num = get_group_size()
|
||||
if device_num == 1:
|
||||
de_dataset = de.GeneratorDataset(dataset, ["image", "annotation"],
|
||||
shuffle=shuffle,
|
||||
num_parallel_workers=num_worker)
|
||||
else:
|
||||
de_dataset = de.GeneratorDataset(dataset, ["image", "annotation"],
|
||||
shuffle=shuffle,
|
||||
num_parallel_workers=num_worker,
|
||||
num_shards=device_num,
|
||||
shard_id=rank_id)
|
||||
|
||||
aug = preproc(cfg['image_size'])
|
||||
encode = bbox_encode(cfg)
|
||||
|
||||
def union_data(image, annot):
|
||||
i, a = read_dataset(image, annot)
|
||||
i, a = aug(i, a)
|
||||
out = encode(i, a)
|
||||
|
||||
return out
|
||||
|
||||
de_dataset = de_dataset.map(input_columns=["image", "annotation"],
|
||||
output_columns=["image", "truths", "conf", "landm"],
|
||||
column_order=["image", "truths", "conf", "landm"],
|
||||
operations=union_data,
|
||||
python_multiprocessing=multiprocessing,
|
||||
num_parallel_workers=num_worker)
|
||||
|
||||
de_dataset = de_dataset.batch(batch_size, drop_remainder=True)
|
||||
de_dataset = de_dataset.repeat(repeat_num)
|
||||
|
||||
|
||||
return de_dataset
|
||||
|
||||
|
||||
def _get_rank_info():
|
||||
"""
|
||||
get rank size and rank id
|
||||
"""
|
||||
rank_size = int(os.environ.get("RANK_SIZE", 1))
|
||||
|
||||
if rank_size > 1:
|
||||
rank_size = get_group_size()
|
||||
rank_id = get_rank()
|
||||
else:
|
||||
rank_size = rank_id = None
|
||||
|
||||
return rank_size, rank_id
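The label parsing in `WiderFace.__init__` and the annotation layout built in `read_dataset` can be illustrated without OpenCV or MindSpore. This is a hedged sketch: `parse_labels` and `to_annotation` are hypothetical names, the sample line is invented for illustration, and the per-line field layout (box as x, y, w, h, then landmark triples) is inferred from the index list used in `read_dataset`.

```python
LANDMARK_IDX = (4, 5, 7, 8, 10, 11, 13, 14, 16, 17)  # x, y of 5 landmarks


def parse_labels(lines):
    """Group numeric label lines under the preceding '# <image path>' line."""
    images, labels = [], []
    for line in lines:
        line = line.rstrip()
        if not line:
            continue
        if line.startswith('#'):
            images.append(line[2:])  # drop the leading '# '
            labels.append([])
        else:
            labels[-1].append([float(x) for x in line.split(' ')])
    return images, labels


def to_annotation(label):
    """Build the 15-value row: x1, y1, x2, y2, 10 landmark coords, flag."""
    x1, y1, w, h = label[:4]
    landms = [label[i] for i in LANDMARK_IDX]
    flag = -1.0 if landms[0] < 0 else 1.0  # -1 marks missing landmarks
    return [x1, y1, x1 + w, y1 + h] + landms + [flag]


sample = [
    "# 0--Parade/example.jpg",  # hypothetical image path
    "10 20 30 40 1 2 0 3 4 0 5 6 0 7 8 0 9 10 0 0.9",
]
images, labels = parse_labels(sample)
row = to_annotation(labels[0][0])
```

Starting a fresh list at each `#` line avoids the deep copy and the special-cased first image in the original loop. Filtering out zero-sized boxes would still be a separate pass.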
|
|
@ -0,0 +1,121 @@
|
|||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
"""Loss."""
|
||||
import mindspore.common.dtype as mstype
|
||||
import mindspore.nn as nn
|
||||
from mindspore.ops import operations as P
|
||||
from mindspore.ops import functional as F
|
||||
from mindspore import Tensor
|
||||
|
||||
|
||||
class SoftmaxCrossEntropyWithLogits(nn.Cell):
|
||||
"""SoftmaxCrossEntropyWithLogits"""
|
||||
def __init__(self):
|
||||
super(SoftmaxCrossEntropyWithLogits, self).__init__()
|
||||
self.log_softmax = P.LogSoftmax()
|
||||
self.neg = P.Neg()
|
||||
self.one_hot = P.OneHot()
|
||||
self.on_value = Tensor(1.0, mstype.float32)
|
||||
self.off_value = Tensor(0.0, mstype.float32)
|
||||
self.reduce_sum = P.ReduceSum()
|
||||
|
||||
def construct(self, logits, labels):
|
||||
"""construct"""
|
||||
prob = self.log_softmax(logits)
|
||||
labels = self.one_hot(labels, F.shape(logits)[-1], self.on_value, self.off_value)
|
||||
|
||||
return self.neg(self.reduce_sum(prob * labels, 1))
|
||||
|
||||
|
||||
class MultiBoxLoss(nn.Cell):
|
||||
"""MultiBoxLoss"""
|
||||
def __init__(self, num_classes, num_boxes, neg_pre_positive, batch_size):
|
||||
super(MultiBoxLoss, self).__init__()
|
||||
self.num_classes = num_classes
|
||||
self.num_boxes = num_boxes
|
||||
self.neg_pre_positive = neg_pre_positive
|
||||
self.notequal = P.NotEqual()
|
||||
self.less = P.Less()
|
||||
self.tile = P.Tile()
|
||||
self.reduce_sum = P.ReduceSum()
|
||||
self.reduce_mean = P.ReduceMean()
|
||||
self.expand_dims = P.ExpandDims()
|
||||
self.smooth_l1_loss = P.SmoothL1Loss()
|
||||
self.cross_entropy = SoftmaxCrossEntropyWithLogits()
|
||||
self.maximum = P.Maximum()
|
||||
self.minimum = P.Minimum()
|
||||
self.sort_descend = P.TopK(True)
|
||||
self.sort = P.TopK(True)
|
||||
self.max = P.ReduceMax()
|
||||
self.log = P.Log()
|
||||
self.exp = P.Exp()
|
||||
self.concat = P.Concat(axis=1)
|
||||
self.reduce_sum2 = P.ReduceSum(keep_dims=True)
|
||||
self.mul = P.Mul()
|
||||
self.reduce_sum_new = P.ReduceSum(keep_dims=True)
|
||||
|
||||
def construct(self, loc_data, loc_t, conf_data, conf_t, landm_data, landm_t):
|
||||
"""construct"""
|
||||
# landm loss
|
||||
mask_pos1 = F.cast(self.less(0.0, F.cast(conf_t, mstype.float32)), mstype.float32)
|
||||
|
||||
N1 = self.maximum(self.reduce_sum(mask_pos1), 1)
|
||||
mask_pos_idx1 = self.tile(self.expand_dims(mask_pos1, -1), (1, 1, 10))
|
||||
loss_landm = self.reduce_sum(self.smooth_l1_loss(landm_data, landm_t) * mask_pos_idx1)
|
||||
loss_landm = loss_landm / N1
|
||||
|
||||
# Localization Loss
|
||||
mask_pos = F.cast(self.notequal(0, conf_t), mstype.float32)
|
||||
conf_t = F.cast(mask_pos, mstype.int32)
|
||||
|
||||
N = self.maximum(self.reduce_sum(mask_pos), 1)
|
||||
mask_pos_idx = self.tile(self.expand_dims(mask_pos, -1), (1, 1, 4))
|
||||
loss_l = self.reduce_sum(self.smooth_l1_loss(loc_data, loc_t) * mask_pos_idx)
|
||||
loss_l = loss_l / N
|
||||
|
||||
# Conf Loss
|
||||
conf_t_shape = F.shape(conf_t)
|
||||
conf_t = F.reshape(conf_t, (-1,))
|
||||
indices = self.concat((1 - F.reshape(conf_t, (-1, 1)), F.reshape(conf_t, (-1, 1))))
|
||||
|
||||
batch_conf = F.reshape(conf_data, (-1, self.num_classes))
|
||||
x_max = self.max(batch_conf)
|
||||
loss_c = self.log(self.reduce_sum2(self.exp(batch_conf - x_max), 1)) + x_max
|
||||
mul_tensor = self.mul(indices, batch_conf)
|
||||
loss_c = loss_c - self.reduce_sum_new(mul_tensor, 1)
|
||||
loss_c = F.reshape(loss_c, conf_t_shape)
|
||||
|
||||
# hard example mining
|
||||
num_matched_boxes = F.reshape(self.reduce_sum(mask_pos, 1), (-1,))
|
||||
neg_masked_cross_entropy = F.cast(loss_c * (1 - mask_pos), mstype.float32)
|
||||
|
||||
_, loss_idx = self.sort_descend(neg_masked_cross_entropy, self.num_boxes)
|
||||
_, relative_position = self.sort(F.cast(loss_idx, mstype.float32), self.num_boxes)
|
||||
relative_position = F.cast(relative_position, mstype.float32)
|
||||
relative_position = relative_position[:, ::-1]
|
||||
relative_position = F.cast(relative_position, mstype.int32)
|
||||
|
||||
num_neg_boxes = self.minimum(num_matched_boxes * self.neg_pre_positive, self.num_boxes - 1)
|
||||
tile_num_neg_boxes = self.tile(self.expand_dims(num_neg_boxes, -1), (1, self.num_boxes))
|
||||
top_k_neg_mask = F.cast(self.less(relative_position, tile_num_neg_boxes), mstype.float32)
|
||||
|
||||
cross_entropy = self.cross_entropy(batch_conf, conf_t)
|
||||
cross_entropy = F.reshape(cross_entropy, conf_t_shape)
|
||||
|
||||
loss_c = self.reduce_sum(cross_entropy * self.minimum(mask_pos + top_k_neg_mask, 1))
|
||||
|
||||
loss_c = loss_c / N
|
||||
|
||||
return loss_l, loss_c, loss_landm
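The hard example mining step in `MultiBoxLoss.construct` ranks negative anchors by their classification loss and keeps only the highest-loss ones, capped at a multiple of the number of positives. Below is a minimal pure-Python sketch of that ranking idea for a single image. The function name `hard_negative_mask` and the ratio of 2 used in the example are assumptions chosen for illustration, not values taken from the source.

```python
def hard_negative_mask(neg_losses, num_pos, neg_per_pos):
    """Return a 0/1 mask selecting the highest-loss negatives.

    neg_losses: per-anchor loss with positives already zeroed out.
    num_pos: number of matched (positive) anchors in this image.
    neg_per_pos: negatives kept per positive (hypothetical value below).
    """
    num_boxes = len(neg_losses)
    num_neg = min(num_pos * neg_per_pos, num_boxes - 1)
    # Anchor indices ordered from highest to lowest loss.
    order = sorted(range(num_boxes), key=lambda i: neg_losses[i], reverse=True)
    # rank[i] is the position of anchor i in that ordering.
    rank = [0] * num_boxes
    for position, idx in enumerate(order):
        rank[idx] = position
    return [1 if r < num_neg else 0 for r in rank]


mask = hard_negative_mask([0.1, 0.9, 0.0, 0.5, 0.3], num_pos=1, neg_per_pos=2)
```

The MindSpore code obtains the same per-anchor rank with two `TopK` calls, because a direct argsort-of-argsort is expressed there through sorting operators rather than Python lists.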
|
|
@ -0,0 +1,81 @@
|
|||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
"""learning rate schedule."""
|
||||
import math
|
||||
import numpy as np
|
||||
|
||||
|
||||
def warmup_cosine_annealing_lr(lr5, steps_per_epoch, warmup_epochs, max_epoch, T_max, eta_min=0):
|
||||
""" warmup cosine annealing lr"""
|
||||
base_lr = lr5
|
||||
warmup_init_lr = 0
|
||||
total_steps = int(max_epoch * steps_per_epoch)
|
||||
warmup_steps = int(warmup_epochs * steps_per_epoch)
|
||||
|
||||
lr_each_step = []
|
||||
for i in range(total_steps):
|
||||
last_epoch = i // steps_per_epoch
|
||||
if i < warmup_steps:
|
||||
lr5 = _linear_warmup_learning_rate(i + 1, warmup_steps, base_lr, warmup_init_lr)
|
||||
else:
|
||||
lr5 = eta_min + (base_lr - eta_min) * (1. + math.cos(math.pi * last_epoch / T_max)) / 2
|
||||
lr_each_step.append(lr5)
|
||||
|
||||
return np.array(lr_each_step).astype(np.float32)
|
||||
|
||||
|
||||
def _linear_warmup_learning_rate(current_step, warmup_steps, base_lr, init_lr):
|
||||
lr_inc = (float(base_lr) - float(init_lr)) / float(warmup_steps)
|
||||
learning_rate = float(init_lr) + lr_inc * current_step
|
||||
return learning_rate
|
||||
|
||||
|
||||
def _a_cosine_learning_rate(current_step, base_lr, warmup_steps, decay_steps):
|
||||
base = float(current_step - warmup_steps) / float(decay_steps)
|
||||
learning_rate = (1 + math.cos(base * math.pi)) / 2 * base_lr
|
||||
return learning_rate
|
||||
|
||||
|
||||
def _dynamic_lr(base_lr, total_steps, warmup_steps, warmup_ratio=1 / 3):
|
||||
lr = []
|
||||
for i in range(total_steps):
|
||||
if i < warmup_steps:
|
||||
lr.append(_linear_warmup_learning_rate(i, warmup_steps, base_lr, base_lr * warmup_ratio))
|
||||
else:
|
||||
lr.append(_a_cosine_learning_rate(i, base_lr, warmup_steps, total_steps))
|
||||
|
||||
return lr
|
||||
|
||||
|
||||
def adjust_learning_rate(initial_lr, gamma, stepvalues, steps_pre_epoch, total_epochs, warmup_epoch=5, lr_type1=None):
|
||||
"""adjust_learning_rate"""
|
||||
if lr_type1 == 'dynamic_lr':
|
||||
return _dynamic_lr(initial_lr, total_epochs * steps_pre_epoch, warmup_epoch * steps_pre_epoch,
|
||||
warmup_ratio=1 / 3)
|
||||
|
||||
lr_each_step = []
|
||||
for epoch in range(1, total_epochs + 1):
|
||||
for _ in range(steps_pre_epoch):
|
||||
if epoch <= warmup_epoch:
|
||||
lr = 0.1 * initial_lr * (1.5849 ** (epoch - 1))
|
||||
else:
|
||||
if stepvalues[0] <= epoch <= stepvalues[1]:
|
||||
lr = initial_lr * (gamma ** (1))
|
||||
elif epoch > stepvalues[1]:
|
||||
lr = initial_lr * (gamma ** (2))
|
||||
else:
|
||||
lr = initial_lr
|
||||
lr_each_step.append(lr)
|
||||
return lr_each_step
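The `'dynamic_lr'` branch above combines a linear warmup with a half-cosine decay. The following is a self-contained sketch of the same formulas using only the standard library. The function name `warmup_cosine_schedule` and the example step counts are assumptions made for illustration.

```python
import math


def warmup_cosine_schedule(base_lr, total_steps, warmup_steps, warmup_ratio=1 / 3):
    """Return one learning rate per step: linear warmup, then cosine decay."""
    init_lr = base_lr * warmup_ratio
    schedule = []
    for step in range(total_steps):
        if step < warmup_steps:
            # Linear ramp from init_lr towards base_lr.
            lr = init_lr + (base_lr - init_lr) * step / warmup_steps
        else:
            # Half-cosine decay; as in _a_cosine_learning_rate, progress is
            # divided by total_steps, so the rate does not reach exactly zero.
            progress = (step - warmup_steps) / total_steps
            lr = (1 + math.cos(progress * math.pi)) / 2 * base_lr
        schedule.append(lr)
    return schedule


lrs = warmup_cosine_schedule(base_lr=0.04, total_steps=100, warmup_steps=10)
```

With these example numbers the schedule starts at one third of `base_lr`, peaks at `base_lr` when warmup ends, and decreases monotonically afterwards.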
|
|
@ -0,0 +1,610 @@
|
|||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
"""Network."""
|
||||
import math
|
||||
from functools import reduce
|
||||
import numpy as np
|
||||
|
||||
import mindspore
|
||||
import mindspore.nn as nn
|
||||
from mindspore.ops import functional as F
|
||||
from mindspore.ops import operations as P
|
||||
from mindspore.ops import composite as C
|
||||
from mindspore import context, Tensor
|
||||
from mindspore.parallel._auto_parallel_context import auto_parallel_context
|
||||
from mindspore.communication.management import get_group_size
|
||||
|
||||
# ResNet
|
||||
def _weight_variable(shape, factor=0.01):
|
||||
init_value = np.random.randn(*shape).astype(np.float32) * factor
|
||||
return Tensor(init_value)
|
||||
|
||||
def _conv3x3(in_channel, out_channel, stride=1):
|
||||
weight_shape = (out_channel, in_channel, 3, 3)
|
||||
weight = _weight_variable(weight_shape)
|
||||
|
||||
return nn.Conv2d(in_channel, out_channel,
|
||||
kernel_size=3, stride=stride, padding=1, pad_mode='pad', weight_init=weight)
|
||||
|
||||
def _conv1x1(in_channel, out_channel, stride=1):
|
||||
weight_shape = (out_channel, in_channel, 1, 1)
|
||||
weight = _weight_variable(weight_shape)
|
||||
|
||||
return nn.Conv2d(in_channel, out_channel,
|
||||
kernel_size=1, stride=stride, padding=0, pad_mode='pad', weight_init=weight)
|
||||
|
||||
def _conv7x7(in_channel, out_channel, stride=1):
|
||||
weight_shape = (out_channel, in_channel, 7, 7)
|
||||
weight = _weight_variable(weight_shape)
|
||||
|
||||
return nn.Conv2d(in_channel, out_channel,
|
||||
kernel_size=7, stride=stride, padding=3, pad_mode='pad', weight_init=weight)
|
||||
|
||||
def _bn(channel):
|
||||
return nn.BatchNorm2d(channel)
|
||||
|
||||
|
||||
def _bn_last(channel):
|
||||
return nn.BatchNorm2d(channel)
|
||||
|
||||
|
||||
def _fc(in_channel, out_channel):
|
||||
weight_shape = (out_channel, in_channel)
|
||||
weight = _weight_variable(weight_shape)
|
||||
return nn.Dense(in_channel, out_channel, has_bias=True, weight_init=weight, bias_init=0)
|
||||
|
||||
class ResidualBlock(nn.Cell):
|
||||
"""ResidualBlock"""
|
||||
expansion = 4
|
||||
|
||||
def __init__(self,
|
||||
in_channel,
|
||||
out_channel,
|
||||
stride=1):
|
||||
super(ResidualBlock, self).__init__()
|
||||
|
||||
channel = out_channel // self.expansion
|
||||
self.conv1 = _conv1x1(in_channel, channel, stride=1)
|
||||
self.bn1 = _bn(channel)
|
||||
|
||||
self.conv2 = _conv3x3(channel, channel, stride=stride)
|
||||
self.bn2 = _bn(channel)
|
||||
|
||||
self.conv3 = _conv1x1(channel, out_channel, stride=1)
|
||||
self.bn3 = _bn_last(out_channel)
|
||||
|
||||
self.relu = nn.ReLU()
|
||||
|
||||
self.down_sample = False
|
||||
|
||||
if stride != 1 or in_channel != out_channel:
|
||||
self.down_sample = True
|
||||
self.down_sample_layer = None
|
||||
|
||||
if self.down_sample:
|
||||
self.down_sample_layer = nn.SequentialCell([_conv1x1(in_channel, out_channel, stride),
|
||||
_bn(out_channel)])
|
||||
self.add = P.Add()
|
||||
|
||||
def construct(self, x):
|
||||
"""construct"""
|
||||
identity = x
|
||||
|
||||
out = self.conv1(x)
|
||||
out = self.bn1(out)
|
||||
out = self.relu(out)
|
||||
|
||||
out = self.conv2(out)
|
||||
out = self.bn2(out)
|
||||
out = self.relu(out)
|
||||
|
||||
out = self.conv3(out)
|
||||
out = self.bn3(out)
|
||||
|
||||
if self.down_sample:
|
||||
identity = self.down_sample_layer(identity)
|
||||
|
||||
out = self.add(out, identity)
|
||||
out = self.relu(out)
|
||||
|
||||
return out
|
||||
|
||||
class ResNet(nn.Cell):
|
||||
"""ResNet"""
|
||||
def __init__(self,
|
||||
block,
|
||||
layer_nums,
|
||||
in_channels,
|
||||
out_channels,
|
||||
strides,
|
||||
num_classes):
|
||||
super(ResNet, self).__init__()
|
||||
|
||||
if not len(layer_nums) == len(in_channels) == len(out_channels) == 4:
|
||||
raise ValueError("the length of layer_num, in_channels, out_channels list must be 4!")
|
||||
|
||||
self.conv1 = _conv7x7(3, 64, stride=2)
|
||||
self.bn1 = _bn(64)
|
||||
self.relu = P.ReLU()
|
||||
|
||||
|
||||
self.pad = P.Pad(((0, 0), (0, 0), (1, 0), (1, 0)))
|
||||
self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, pad_mode="valid")
|
||||
|
||||
|
||||
self.layer1 = self._make_layer(block,
|
||||
layer_nums[0],
|
||||
in_channel=in_channels[0],
|
||||
out_channel=out_channels[0],
|
||||
stride=strides[0])
|
||||
self.layer2 = self._make_layer(block,
|
||||
layer_nums[1],
|
||||
in_channel=in_channels[1],
|
||||
out_channel=out_channels[1],
|
||||
stride=strides[1])
|
||||
self.layer3 = self._make_layer(block,
|
||||
layer_nums[2],
|
||||
in_channel=in_channels[2],
|
||||
out_channel=out_channels[2],
|
||||
stride=strides[2])
|
||||
self.layer4 = self._make_layer(block,
|
||||
layer_nums[3],
|
||||
in_channel=in_channels[3],
|
||||
out_channel=out_channels[3],
|
||||
stride=strides[3])
|
||||
|
||||
self.mean = P.ReduceMean(keep_dims=True)
|
||||
self.flatten = nn.Flatten()
|
||||
self.end_point = _fc(out_channels[3], num_classes)
|
||||
|
||||
def _make_layer(self, block, layer_num, in_channel, out_channel, stride):
|
||||
"""_make_layer"""
|
||||
layers = []
|
||||
|
||||
resnet_block = block(in_channel, out_channel, stride=stride)
|
||||
layers.append(resnet_block)
|
||||
|
||||
for _ in range(1, layer_num):
|
||||
resnet_block = block(out_channel, out_channel, stride=1)
|
||||
layers.append(resnet_block)
|
||||
|
||||
return nn.SequentialCell(layers)
|
||||
|
||||
def construct(self, x):
|
||||
"""construct"""
|
||||
|
||||
x = self.conv1(x)
|
||||
x = self.bn1(x)
|
||||
x = self.relu(x)
|
||||
x = self.pad(x)
|
||||
|
||||
c1 = self.maxpool(x)
|
||||
|
||||
c2 = self.layer1(c1)
|
||||
c3 = self.layer2(c2)
|
||||
c4 = self.layer3(c3)
|
||||
c5 = self.layer4(c4)
|
||||
|
||||
out = self.mean(c5, (2, 3))
|
||||
out = self.flatten(out)
|
||||
out = self.end_point(out)
|
||||
|
||||
return c3, c4, c5
|
||||
|
||||
def resnet50(class_num=10):
|
||||
return ResNet(ResidualBlock,
|
||||
[3, 4, 6, 3],
|
||||
[64, 256, 512, 1024],
|
||||
[256, 512, 1024, 2048],
|
||||
[1, 2, 2, 2],
|
||||
class_num)
|
||||
|
||||
|
||||
# MobileNet0.25
|
||||
def conv_bn(inp, oup, stride=1, leaky=0):
|
||||
return nn.SequentialCell([
|
||||
nn.Conv2d(in_channels=inp, out_channels=oup, kernel_size=3, stride=stride,
|
||||
pad_mode='pad', padding=1, has_bias=False),
|
||||
nn.BatchNorm2d(num_features=oup, momentum=0.9),
|
||||
nn.LeakyReLU(alpha=leaky) # ms official: nn.get_activation('relu6')
|
||||
])
|
||||
|
||||
def conv_dw(inp, oup, stride, leaky=0.1):
|
||||
return nn.SequentialCell([
|
||||
nn.Conv2d(in_channels=inp, out_channels=inp, kernel_size=3, stride=stride,
|
||||
pad_mode='pad', padding=1, group=inp, has_bias=False),
|
||||
nn.BatchNorm2d(num_features=inp, momentum=0.9),
|
||||
nn.LeakyReLU(alpha=leaky), # ms official: nn.get_activation('relu6')
|
||||
|
||||
nn.Conv2d(in_channels=inp, out_channels=oup, kernel_size=1, stride=1,
|
||||
pad_mode='pad', padding=0, has_bias=False),
|
||||
nn.BatchNorm2d(num_features=oup, momentum=0.9),
|
||||
nn.LeakyReLU(alpha=leaky), # ms official: nn.get_activation('relu6')
|
||||
])
|
||||
|
||||
|
||||
class MobileNetV1(nn.Cell):
|
||||
"""MobileNetV1"""
|
||||
def __init__(self, num_classes):
|
||||
super(MobileNetV1, self).__init__()
|
||||
self.stage1 = nn.SequentialCell([
|
||||
conv_bn(3, 8, 2, leaky=0.1), # 3
|
||||
conv_dw(8, 16, 1), # 7
|
||||
conv_dw(16, 32, 2), # 11
|
||||
conv_dw(32, 32, 1), # 19
|
||||
conv_dw(32, 64, 2), # 27
|
||||
conv_dw(64, 64, 1), # 43
|
||||
])
|
||||
self.stage2 = nn.SequentialCell([
|
||||
conv_dw(64, 128, 2), # 43 + 16 = 59
|
||||
conv_dw(128, 128, 1), # 59 + 32 = 91
|
||||
conv_dw(128, 128, 1), # 91 + 32 = 123
|
||||
conv_dw(128, 128, 1), # 123 + 32 = 155
|
||||
conv_dw(128, 128, 1), # 155 + 32 = 187
|
||||
conv_dw(128, 128, 1), # 187 + 32 = 219
|
||||
])
|
||||
self.stage3 = nn.SequentialCell([
|
||||
conv_dw(128, 256, 2), # 219 +3 2 = 241
|
||||
conv_dw(256, 256, 1), # 241 + 64 = 301
|
||||
])
|
||||
self.avg = P.ReduceMean()
|
||||
self.fc = nn.Dense(in_channels=256, out_channels=num_classes)
|
||||
|
||||
def construct(self, x):
|
||||
x1 = self.stage1(x)
|
||||
x2 = self.stage2(x1)
|
||||
x3 = self.stage3(x2)
|
||||
out = self.avg(x3, (2, 3))
|
||||
out = self.fc(out)
|
||||
return x1, x2, x3
|
||||
|
||||
|
||||
def mobilenet025(class_num=1000):
|
||||
return MobileNetV1(class_num)
|
||||
|
||||
|
||||
# RetinaFace
|
||||
def Init_KaimingUniform(arr_shape, a=0, nonlinearity='leaky_relu', has_bias=False):
|
||||
"""Init_KaimingUniform"""
|
||||
def _calculate_in_and_out(arr_shape):
|
||||
dim = len(arr_shape)
|
||||
if dim < 2:
|
||||
raise ValueError("Kaiming uniform initialization requires data with at least 2 dimensions.")
|
||||
|
||||
n_in = arr_shape[1]
|
||||
n_out = arr_shape[0]
|
||||
|
||||
if dim > 2:
|
||||
|
||||
counter = reduce(lambda x, y: x * y, arr_shape[2:])
|
||||
n_in *= counter
|
||||
n_out *= counter
|
||||
return n_in, n_out
|
||||
|
||||
def calculate_gain(nonlinearity, a=None):
|
||||
linear_fans = ['linear', 'conv1d', 'conv2d', 'conv3d',
|
||||
'conv_transpose1d', 'conv_transpose2d', 'conv_transpose3d']
|
||||
if nonlinearity in linear_fans or nonlinearity == 'sigmoid':
|
||||
return 1
|
||||
if nonlinearity == 'tanh':
|
||||
return 5.0 / 3
|
||||
if nonlinearity == 'relu':
|
||||
return math.sqrt(2.0)
|
||||
if nonlinearity == 'leaky_relu':
|
||||
if a is None:
|
||||
negative_slope = 0.01
|
||||
elif not isinstance(a, bool) and isinstance(a, int) or isinstance(a, float):
|
||||
negative_slope = a
|
||||
else:
|
||||
raise ValueError("negative_slope {} not a valid number".format(a))
|
||||
return math.sqrt(2.0 / (1 + negative_slope ** 2))
|
||||
|
||||
raise ValueError("Unsupported nonlinearity {}".format(nonlinearity))
|
||||
|
||||
fan_in, _ = _calculate_in_and_out(arr_shape)
|
||||
gain = calculate_gain(nonlinearity, a)
|
||||
std = gain / math.sqrt(fan_in)
|
||||
bound = math.sqrt(3.0) * std
|
||||
weight = np.random.uniform(-bound, bound, arr_shape).astype(np.float32)
|
||||
|
||||
bias = None
|
||||
if has_bias:
|
||||
bound_bias = 1 / math.sqrt(fan_in)
|
||||
bias = np.random.uniform(-bound_bias, bound_bias, arr_shape[0:1]).astype(np.float32)
|
||||
bias = Tensor(bias)
|
||||
|
||||
return Tensor(weight), bias
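The bound computed in `Init_KaimingUniform` has a convenient closed form for the `a=math.sqrt(5)` setting that `ConvBNReLU` and `ConvBN` pass in. With `gain = sqrt(2 / (1 + a^2))` and `bound = sqrt(3) * gain / sqrt(fan_in)`, setting `a = sqrt(5)` gives `gain = sqrt(1/3)` and therefore `bound = 1 / sqrt(fan_in)`. The sketch below checks this numerically. `kaiming_uniform_bound` is a hypothetical helper that mirrors only the `leaky_relu` branch.

```python
import math
from functools import reduce


def kaiming_uniform_bound(shape, a=0.0):
    """Uniform sampling bound for a weight of the given shape (leaky_relu gain)."""
    if len(shape) < 2:
        raise ValueError("shape must have at least 2 dimensions")
    # fan_in = input channels * kernel area (kernel area is 1 for 2-D weights).
    kernel_area = reduce(lambda x, y: x * y, shape[2:], 1)
    fan_in = shape[1] * kernel_area
    gain = math.sqrt(2.0 / (1 + a ** 2))
    return math.sqrt(3.0) * gain / math.sqrt(fan_in)


# 3x3 convolution with 32 input channels: fan_in = 32 * 9 = 288.
bound = kaiming_uniform_bound((64, 32, 3, 3), a=math.sqrt(5))
```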
|
||||
|
||||
class ConvBNReLU(nn.SequentialCell):
|
||||
def __init__(self, in_planes, out_planes, kernel_size, stride, padding, groups, norm_layer, leaky=0):
|
||||
weight_shape = (out_planes, in_planes, kernel_size, kernel_size)
|
||||
kaiming_weight, _ = Init_KaimingUniform(weight_shape, a=math.sqrt(5))
|
||||
|
||||
super(ConvBNReLU, self).__init__(
|
||||
nn.Conv2d(in_planes, out_planes, kernel_size, stride, pad_mode='pad', padding=padding, group=groups,
|
||||
has_bias=False, weight_init=kaiming_weight),
|
||||
norm_layer(out_planes),
|
||||
nn.LeakyReLU(alpha=leaky)
|
||||
)
|
||||
|
||||
class ConvBN(nn.SequentialCell):
|
||||
def __init__(self, in_planes, out_planes, kernel_size, stride, padding, groups, norm_layer):
|
||||
weight_shape = (out_planes, in_planes, kernel_size, kernel_size)
|
||||
kaiming_weight, _ = Init_KaimingUniform(weight_shape, a=math.sqrt(5))
|
||||
|
||||
super(ConvBN, self).__init__(
|
||||
nn.Conv2d(in_planes, out_planes, kernel_size, stride, pad_mode='pad', padding=padding, group=groups,
|
||||
has_bias=False, weight_init=kaiming_weight),
|
||||
norm_layer(out_planes),
|
||||
)
|
||||
|
||||
class SSH(nn.Cell):
|
||||
"""SSH"""
|
||||
def __init__(self, in_channel, out_channel):
|
||||
super(SSH, self).__init__()
|
||||
assert out_channel % 4 == 0
|
||||
leaky = 0
|
||||
if out_channel <= 64:
|
||||
leaky = 0.1
|
||||
|
||||
norm_layer = nn.BatchNorm2d
|
||||
self.conv3X3 = ConvBN(in_channel, out_channel // 2, kernel_size=3, stride=1, padding=1, groups=1,
|
||||
norm_layer=norm_layer)
|
||||
|
||||
self.conv5X5_1 = ConvBNReLU(in_channel, out_channel // 4, kernel_size=3, stride=1, padding=1, groups=1,
|
||||
norm_layer=norm_layer, leaky=leaky)
|
||||
self.conv5X5_2 = ConvBN(out_channel // 4, out_channel // 4, kernel_size=3, stride=1, padding=1, groups=1,
|
||||
norm_layer=norm_layer)
|
||||
|
||||
self.conv7X7_2 = ConvBNReLU(out_channel // 4, out_channel // 4, kernel_size=3, stride=1, padding=1, groups=1,
|
||||
norm_layer=norm_layer, leaky=leaky)
|
||||
self.conv7X7_3 = ConvBN(out_channel // 4, out_channel // 4, kernel_size=3, stride=1, padding=1, groups=1,
|
||||
norm_layer=norm_layer)
|
||||
|
||||
self.cat = P.Concat(axis=1)
|
||||
self.relu = nn.ReLU()
|
||||
|
||||
def construct(self, x):
|
||||
"""construct"""
|
||||
conv3X3 = self.conv3X3(x)
|
||||
|
||||
conv5X5_1 = self.conv5X5_1(x)
|
||||
conv5X5 = self.conv5X5_2(conv5X5_1)
|
||||
|
||||
conv7X7_2 = self.conv7X7_2(conv5X5_1)
|
||||
conv7X7 = self.conv7X7_3(conv7X7_2)
|
||||
|
||||
out = self.cat((conv3X3, conv5X5, conv7X7))
|
||||
out = self.relu(out)
|
||||
|
||||
return out
|
||||
|
||||
class FPN(nn.Cell):
|
||||
"""FPN"""
|
||||
def __init__(self, cfg):
|
||||
super(FPN, self).__init__()
|
||||
out_channels = cfg['out_channel']
|
||||
leaky = 0
|
||||
if out_channels <= 64:
|
||||
leaky = 0.1
|
||||
norm_layer = nn.BatchNorm2d
|
||||
self.output1 = ConvBNReLU(cfg['in_channel'] * 2, cfg['out_channel'], kernel_size=1, stride=1,
|
||||
padding=0, groups=1, norm_layer=norm_layer, leaky=leaky)
|
||||
self.output2 = ConvBNReLU(cfg['in_channel'] * 4, cfg['out_channel'], kernel_size=1, stride=1,
|
||||
padding=0, groups=1, norm_layer=norm_layer, leaky=leaky)
|
||||
self.output3 = ConvBNReLU(cfg['in_channel'] * 8, cfg['out_channel'], kernel_size=1, stride=1,
|
||||
padding=0, groups=1, norm_layer=norm_layer, leaky=leaky)
|
||||
|
||||
self.merge1 = ConvBNReLU(cfg['out_channel'], cfg['out_channel'], kernel_size=3, stride=1, padding=1, groups=1,
|
||||
norm_layer=norm_layer, leaky=leaky)
|
||||
self.merge2 = ConvBNReLU(cfg['out_channel'], cfg['out_channel'], kernel_size=3, stride=1, padding=1, groups=1,
|
||||
norm_layer=norm_layer, leaky=leaky)
|
||||
|
||||
def construct(self, input1, input2, input3):
|
||||
"""construct"""
|
||||
output1 = self.output1(input1)
|
||||
output2 = self.output2(input2)
|
||||
output3 = self.output3(input3)
|
||||
|
||||
up3 = P.ResizeNearestNeighbor([P.Shape()(output2)[2], P.Shape()(output2)[3]])(output3)
|
||||
output2 = up3 + output2
|
||||
output2 = self.merge2(output2)
|
||||
|
||||
up2 = P.ResizeNearestNeighbor([P.Shape()(output1)[2], P.Shape()(output1)[3]])(output2)
|
||||
output1 = up2 + output1
|
||||
output1 = self.merge1(output1)
|
||||
|
||||
return output1, output2, output3
|
||||
|
||||
class ClassHead(nn.Cell):
|
||||
"""ClassHead"""
|
||||
def __init__(self, inchannels=512, num_anchors=3):
|
||||
super(ClassHead, self).__init__()
|
||||
self.num_anchors = num_anchors
|
||||
|
||||
weight_shape = (self.num_anchors * 2, inchannels, 1, 1)
|
||||
kaiming_weight, kaiming_bias = Init_KaimingUniform(weight_shape, a=math.sqrt(5), has_bias=True)
|
||||
self.conv1x1 = nn.Conv2d(inchannels, self.num_anchors * 2, kernel_size=(1, 1), stride=1, padding=0,
|
||||
has_bias=True, weight_init=kaiming_weight, bias_init=kaiming_bias)
|
||||
|
||||
self.permute = P.Transpose()
|
||||
self.reshape = P.Reshape()
|
||||
|
||||
def construct(self, x):
|
||||
out = self.conv1x1(x)
|
||||
out = self.permute(out, (0, 2, 3, 1))
|
||||
return self.reshape(out, (P.Shape()(out)[0], -1, 2))
|
||||
|
||||
class BboxHead(nn.Cell):
|
||||
"""BboxHead"""
|
||||
def __init__(self, inchannels=512, num_anchors=3):
|
||||
super(BboxHead, self).__init__()
|
||||
|
||||
weight_shape = (num_anchors * 4, inchannels, 1, 1)
|
||||
kaiming_weight, kaiming_bias = Init_KaimingUniform(weight_shape, a=math.sqrt(5), has_bias=True)
|
||||
self.conv1x1 = nn.Conv2d(inchannels, num_anchors * 4, kernel_size=(1, 1), stride=1, padding=0, has_bias=True,
|
||||
weight_init=kaiming_weight, bias_init=kaiming_bias)
|
||||
|
||||
self.permute = P.Transpose()
|
||||
self.reshape = P.Reshape()
|
||||
|
||||
def construct(self, x):
|
||||
out = self.conv1x1(x)
|
||||
out = self.permute(out, (0, 2, 3, 1))
|
||||
return self.reshape(out, (P.Shape()(out)[0], -1, 4))
|
||||
|
||||
class LandmarkHead(nn.Cell):
|
||||
"""LandmarkHead"""
|
||||
def __init__(self, inchannels=512, num_anchors=3):
|
||||
super(LandmarkHead, self).__init__()
|
||||
|
||||
weight_shape = (num_anchors * 10, inchannels, 1, 1)
|
||||
kaiming_weight, kaiming_bias = Init_KaimingUniform(weight_shape, a=math.sqrt(5), has_bias=True)
|
||||
self.conv1x1 = nn.Conv2d(inchannels, num_anchors * 10, kernel_size=(1, 1), stride=1, padding=0, has_bias=True,
|
||||
weight_init=kaiming_weight, bias_init=kaiming_bias)
|
||||
|
||||
self.permute = P.Transpose()
|
||||
self.reshape = P.Reshape()
|
||||
|
||||
def construct(self, x):
|
||||
out = self.conv1x1(x)
|
||||
out = self.permute(out, (0, 2, 3, 1))
|
||||
return self.reshape(out, (P.Shape()(out)[0], -1, 10))
|
||||
|
||||
class RetinaFace(nn.Cell):
|
||||
"""RetinaFace"""
|
||||
def __init__(self, phase='train', backbone=None, cfg=None):
|
||||
|
||||
super(RetinaFace, self).__init__()
|
||||
self.phase = phase
|
||||
|
||||
self.base = backbone
|
||||
|
||||
self.fpn = FPN(cfg)
|
||||
|
||||
self.ssh1 = SSH(cfg['out_channel'], cfg['out_channel'])
|
||||
self.ssh2 = SSH(cfg['out_channel'], cfg['out_channel'])
|
||||
self.ssh3 = SSH(cfg['out_channel'], cfg['out_channel'])
|
||||
|
||||
self.ClassHead = self._make_class_head(fpn_num=3, inchannels=[cfg['out_channel'], cfg['out_channel'],
|
||||
cfg['out_channel']], anchor_num=[2, 2, 2])
|
||||
self.BboxHead = self._make_bbox_head(fpn_num=3, inchannels=[cfg['out_channel'], cfg['out_channel'],
|
||||
cfg['out_channel']], anchor_num=[2, 2, 2])
|
||||
self.LandmarkHead = self._make_landmark_head(fpn_num=3, inchannels=[cfg['out_channel'],
|
||||
cfg['out_channel'],
|
||||
cfg['out_channel']],
|
||||
anchor_num=[2, 2, 2])
|
||||
|
||||
self.cat = P.Concat(axis=1)
|
||||
|
||||
def _make_class_head(self, fpn_num, inchannels, anchor_num):
|
||||
classhead = nn.CellList()
|
||||
for i in range(fpn_num):
|
||||
classhead.append(ClassHead(inchannels[i], anchor_num[i]))
|
||||
return classhead
|
||||
|
||||
def _make_bbox_head(self, fpn_num, inchannels, anchor_num):
|
||||
bboxhead = nn.CellList()
|
||||
for i in range(fpn_num):
|
||||
bboxhead.append(BboxHead(inchannels[i], anchor_num[i]))
|
||||
return bboxhead
|
||||
|
||||
def _make_landmark_head(self, fpn_num, inchannels, anchor_num):
|
||||
landmarkhead = nn.CellList()
|
||||
for i in range(fpn_num):
|
||||
landmarkhead.append(LandmarkHead(inchannels[i], anchor_num[i]))
|
||||
return landmarkhead
|
||||
|
||||
def construct(self, inputs):
|
||||
"""construct"""
|
||||
f1, f2, f3 = self.base(inputs)
|
||||
f1, f2, f3 = self.fpn(f1, f2, f3)
|
||||
|
||||
# SSH
|
||||
f1 = self.ssh1(f1)
|
||||
f2 = self.ssh2(f2)
|
||||
f3 = self.ssh3(f3)
|
||||
features = [f1, f2, f3]
|
||||
|
||||
bbox = ()
|
||||
for i, feature in enumerate(features):
|
||||
bbox = bbox + (self.BboxHead[i](feature),)
|
||||
bbox_regressions = self.cat(bbox)
|
||||
|
||||
cls = ()
|
||||
for i, feature in enumerate(features):
|
||||
cls = cls + (self.ClassHead[i](feature),)
|
||||
classifications = self.cat(cls)
|
||||
|
||||
landm = ()
|
||||
for i, feature in enumerate(features):
|
||||
landm = landm + (self.LandmarkHead[i](feature),)
|
||||
ldm_regressions = self.cat(landm)
|
||||
|
||||
if self.phase == 'train':
|
||||
output = (bbox_regressions, classifications, ldm_regressions)
|
||||
else:
|
||||
output = (bbox_regressions, P.Softmax(-1)(classifications), ldm_regressions)
|
||||
|
||||
return output
|
||||
|
||||
class RetinaFaceWithLossCell(nn.Cell):
|
||||
"""RetinaFaceWithLossCell"""
|
||||
def __init__(self, network, multibox_loss, config):
|
||||
super(RetinaFaceWithLossCell, self).__init__()
|
||||
self.network = network
|
||||
self.loc_weight = config['loc_weight']
|
||||
self.class_weight = config['class_weight']
|
||||
self.landm_weight = config['landm_weight']
|
||||
self.multibox_loss = multibox_loss
|
||||
|
||||
def construct(self, img, loc_t, conf_t, landm_t):
|
||||
pred_loc, pre_conf, pre_landm = self.network(img)
|
||||
loss_loc, loss_conf, loss_landm = self.multibox_loss(pred_loc, loc_t, pre_conf, conf_t, pre_landm, landm_t)
|
||||
|
||||
return loss_loc * self.loc_weight + loss_conf * self.class_weight + loss_landm * self.landm_weight
|
||||
|
||||
class TrainingWrapper(nn.Cell):
|
||||
"""TrainingWrapper"""
|
||||
def __init__(self, network, optimizer, sens=1.0):
|
||||
super(TrainingWrapper, self).__init__(auto_prefix=False)
|
||||
self.network = network
|
||||
self.weights = mindspore.ParameterTuple(network.trainable_params())
|
||||
self.optimizer = optimizer
|
||||
self.grad = C.GradOperation(get_by_list=True, sens_param=True)
|
||||
self.sens = sens
|
||||
self.reducer_flag = False
|
||||
self.grad_reducer = None
|
||||
self.parallel_mode = context.get_auto_parallel_context("parallel_mode")
|
||||
class_list = [mindspore.context.ParallelMode.DATA_PARALLEL, mindspore.context.ParallelMode.HYBRID_PARALLEL]
|
||||
if self.parallel_mode in class_list:
|
||||
self.reducer_flag = True
|
||||
if self.reducer_flag:
|
||||
mean = context.get_auto_parallel_context("gradients_mean")
|
||||
if auto_parallel_context().get_device_num_is_set():
|
||||
degree = context.get_auto_parallel_context("device_num")
|
||||
else:
|
||||
degree = get_group_size()
|
||||
self.grad_reducer = nn.DistributedGradReducer(optimizer.parameters, mean, degree)
|
||||
|
||||
def construct(self, *args):
|
||||
weights = self.weights
|
||||
loss = self.network(*args)
|
||||
sens = P.Fill()(P.DType()(loss), P.Shape()(loss), self.sens)
|
||||
grads = self.grad(self.network, weights)(*args, sens)
|
||||
if self.reducer_flag:
|
||||
# apply grad reducer on grads
|
||||
grads = self.grad_reducer(grads)
|
||||
return F.depend(loss, self.optimizer(grads))
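The `Init_KaimingUniform` helper above is always called with `a=math.sqrt(5)`. A minimal sketch of the same arithmetic, using only the standard library, suggests why: with that slope the sampling bound reduces to `1 / sqrt(fan_in)`. The function name `kaiming_uniform_bound` and the example shape are illustrative assumptions, not part of the commit.

```python
import math
from functools import reduce


def kaiming_uniform_bound(shape, a=0.0):
    """Half-width of the uniform range for a leaky-ReLU Kaiming init (hypothetical helper)."""
    if len(shape) < 2:
        raise ValueError("shape must have at least 2 dimensions")
    # fan_in = input channels times the receptive-field size (kernel height * width).
    receptive_field = reduce(lambda x, y: x * y, shape[2:], 1)
    fan_in = shape[1] * receptive_field
    # Same formulas as in the helper: gain for leaky ReLU, then std, then bound.
    gain = math.sqrt(2.0 / (1 + a ** 2))
    std = gain / math.sqrt(fan_in)
    return math.sqrt(3.0) * std


# Illustrative 3x3 convolution weight: 64 output channels, 32 input channels.
bound = kaiming_uniform_bound((64, 32, 3, 3), a=math.sqrt(5))
```

With `a = sqrt(5)` the gain is `sqrt(1/3)`, so the factor `sqrt(3)` cancels and the weight bound matches the `1 / sqrt(fan_in)` bound the helper uses for the bias.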
|
|
@ -0,0 +1,578 @@
|
|||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
"""Network."""
|
||||
import math
|
||||
from functools import reduce
|
||||
import numpy as np
|
||||
|
||||
import mindspore
|
||||
import mindspore.nn as nn
|
||||
from mindspore.ops import functional as F
|
||||
from mindspore.ops import operations as P
|
||||
from mindspore.ops import composite as C
|
||||
from mindspore import context, Tensor
|
||||
from mindspore.parallel._auto_parallel_context import auto_parallel_context
|
||||
from mindspore.communication.management import get_group_size
|
||||
|
||||
conv_weight_init = 'HeUniform'
|
||||
|
||||
# ResNet
|
||||
def _weight_variable(shape, factor=0.01):
|
||||
init_value = np.random.randn(*shape).astype(np.float32) * factor
|
||||
return Tensor(init_value)
|
||||
|
||||
def _conv3x3(in_channel, out_channel, stride=1):
|
||||
weight_shape = (out_channel, in_channel, 3, 3)
|
||||
weight = _weight_variable(weight_shape)
|
||||
|
||||
return nn.Conv2d(in_channel, out_channel,
|
||||
kernel_size=3, stride=stride, padding=1, pad_mode='pad', weight_init=weight)
|
||||
|
||||
def _conv1x1(in_channel, out_channel, stride=1):
|
||||
weight_shape = (out_channel, in_channel, 1, 1)
|
||||
weight = _weight_variable(weight_shape)
|
||||
|
||||
return nn.Conv2d(in_channel, out_channel,
|
||||
kernel_size=1, stride=stride, padding=0, pad_mode='pad', weight_init=weight)
|
||||
|
||||
def _conv7x7(in_channel, out_channel, stride=1):
|
||||
weight_shape = (out_channel, in_channel, 7, 7)
|
||||
weight = _weight_variable(weight_shape)
|
||||
|
||||
return nn.Conv2d(in_channel, out_channel,
|
||||
kernel_size=7, stride=stride, padding=3, pad_mode='pad', weight_init=weight)
|
||||
|
||||
def _bn(channel):
|
||||
return nn.BatchNorm2d(channel)
|
||||
|
||||
|
||||
def _bn_last(channel):
|
||||
return nn.BatchNorm2d(channel)
|
||||
|
||||
|
||||
def _fc(in_channel, out_channel):
|
||||
weight_shape = (out_channel, in_channel)
|
||||
weight = _weight_variable(weight_shape)
|
||||
return nn.Dense(in_channel, out_channel, has_bias=True, weight_init=weight, bias_init=0)
|
||||
|
||||
class ResidualBlock(nn.Cell):
|
||||
"""ResidualBlock"""
|
||||
expansion = 4
|
||||
|
||||
def __init__(self,
|
||||
in_channel,
|
||||
out_channel,
|
||||
stride=1):
|
||||
super(ResidualBlock, self).__init__()
|
||||
|
||||
channel = out_channel // self.expansion
|
||||
self.conv1 = _conv1x1(in_channel, channel, stride=1)
|
||||
self.bn1 = _bn(channel)
|
||||
|
||||
self.conv2 = _conv3x3(channel, channel, stride=stride)
|
||||
self.bn2 = _bn(channel)
|
||||
|
||||
self.conv3 = _conv1x1(channel, out_channel, stride=1)
|
||||
self.bn3 = _bn_last(out_channel)
|
||||
|
||||
self.relu = nn.ReLU()
|
||||
|
||||
self.down_sample = False
|
||||
|
||||
if stride != 1 or in_channel != out_channel:
|
||||
self.down_sample = True
|
||||
self.down_sample_layer = None
|
||||
|
||||
if self.down_sample:
|
||||
self.down_sample_layer = nn.SequentialCell([_conv1x1(in_channel, out_channel, stride),
|
||||
_bn(out_channel)])
|
||||
self.add = P.Add()
|
||||
|
||||
def construct(self, x):
|
||||
"""construct"""
|
||||
identity = x
|
||||
|
||||
out = self.conv1(x)
|
||||
out = self.bn1(out)
|
||||
out = self.relu(out)
|
||||
|
||||
out = self.conv2(out)
|
||||
out = self.bn2(out)
|
||||
out = self.relu(out)
|
||||
|
||||
out = self.conv3(out)
|
||||
out = self.bn3(out)
|
||||
|
||||
if self.down_sample:
|
||||
identity = self.down_sample_layer(identity)
|
||||
|
||||
out = self.add(out, identity)
|
||||
out = self.relu(out)
|
||||
|
||||
return out
|
||||
|
||||
class ResNet(nn.Cell):
|
||||
"""ResNet"""
|
||||
def __init__(self,
|
||||
block,
|
||||
layer_nums,
|
||||
in_channels,
|
||||
out_channels,
|
||||
strides,
|
||||
num_classes):
|
||||
super(ResNet, self).__init__()
|
||||
|
||||
if not len(layer_nums) == len(in_channels) == len(out_channels) == 4:
|
||||
raise ValueError("the length of layer_num, in_channels, out_channels list must be 4!")
|
||||
|
||||
self.conv1 = _conv7x7(3, 64, stride=2)
|
||||
self.bn1 = _bn(64)
|
||||
self.relu = P.ReLU()
|
||||
|
||||
self.zeros1 = P.Zeros()
|
||||
self.zeros2 = P.Zeros()
|
||||
self.concat1 = P.Concat(axis=2)
|
||||
self.concat2 = P.Concat(axis=3)
|
||||
self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, pad_mode="valid")
|
||||
|
||||
|
||||
self.layer1 = self._make_layer(block,
|
||||
layer_nums[0],
|
||||
in_channel=in_channels[0],
|
||||
out_channel=out_channels[0],
|
||||
stride=strides[0])
|
||||
self.layer2 = self._make_layer(block,
|
||||
layer_nums[1],
|
||||
in_channel=in_channels[1],
|
||||
out_channel=out_channels[1],
|
||||
stride=strides[1])
|
||||
self.layer3 = self._make_layer(block,
|
||||
layer_nums[2],
|
||||
in_channel=in_channels[2],
|
||||
out_channel=out_channels[2],
|
||||
stride=strides[2])
|
||||
self.layer4 = self._make_layer(block,
|
||||
layer_nums[3],
|
||||
in_channel=in_channels[3],
|
||||
out_channel=out_channels[3],
|
||||
stride=strides[3])
|
||||
|
||||
self.mean = P.ReduceMean(keep_dims=True)
|
||||
self.flatten = nn.Flatten()
|
||||
self.end_point = _fc(out_channels[3], num_classes)
|
||||
|
||||
def _make_layer(self, block, layer_num, in_channel, out_channel, stride):
|
||||
"""_make_layer"""
|
||||
layers = []
|
||||
|
||||
resnet_block = block(in_channel, out_channel, stride=stride)
|
||||
layers.append(resnet_block)
|
||||
|
||||
for _ in range(1, layer_num):
|
||||
resnet_block = block(out_channel, out_channel, stride=1)
|
||||
layers.append(resnet_block)
|
||||
|
||||
return nn.SequentialCell(layers)
|
||||
|
||||
def construct(self, x):
|
||||
"""construct"""
|
||||
x = self.conv1(x)
|
||||
x = self.bn1(x)
|
||||
x = self.relu(x)
|
||||
zeros1 = self.zeros1((x.shape[0], x.shape[1], 1, x.shape[3]), mindspore.float32)
|
||||
x = self.concat1((zeros1, x))
|
||||
zeros2 = self.zeros2((x.shape[0], x.shape[1], x.shape[2], 1), mindspore.float32)
|
||||
x = self.concat2((zeros2, x))
|
||||
|
||||
c1 = self.maxpool(x)
|
||||
|
||||
c2 = self.layer1(c1)
|
||||
c3 = self.layer2(c2)
|
||||
c4 = self.layer3(c3)
|
||||
c5 = self.layer4(c4)
|
||||
|
||||
out = self.mean(c5, (2, 3))
|
||||
out = self.flatten(out)
|
||||
out = self.end_point(out)
|
||||
|
||||
return c3, c4, c5
|
||||
|
||||
def resnet50(class_num=10):
|
||||
return ResNet(ResidualBlock,
|
||||
[3, 4, 6, 3],
|
||||
[64, 256, 512, 1024],
|
||||
[256, 512, 1024, 2048],
|
||||
[1, 2, 2, 2],
|
||||
class_num)
|
||||
|
||||
|
||||
# RetinaFace
|
||||
def Init_KaimingUniform(arr_shape, a=0, nonlinearity='leaky_relu', has_bias=False):
|
||||
"""Init_KaimingUniform"""
|
||||
def _calculate_in_and_out(arr_shape):
|
||||
dim = len(arr_shape)
|
||||
if dim < 2:
|
||||
raise ValueError("To initialize data with Kaiming uniform, the data must have at least 2 dimensions.")
|
||||
|
||||
n_in = arr_shape[1]
|
||||
n_out = arr_shape[0]
|
||||
|
||||
if dim > 2:
|
||||
|
||||
counter = reduce(lambda x, y: x * y, arr_shape[2:])
|
||||
n_in *= counter
|
||||
n_out *= counter
|
||||
return n_in, n_out
|
||||
|
||||
def calculate_gain(nonlinearity, a=None):
|
||||
linear_fans = ['linear', 'conv1d', 'conv2d', 'conv3d',
|
||||
'conv_transpose1d', 'conv_transpose2d', 'conv_transpose3d']
|
||||
if nonlinearity in linear_fans or nonlinearity == 'sigmoid':
|
||||
return 1
|
||||
if nonlinearity == 'tanh':
|
||||
return 5.0 / 3
|
||||
if nonlinearity == 'relu':
|
||||
return math.sqrt(2.0)
|
||||
if nonlinearity == 'leaky_relu':
|
||||
if a is None:
|
||||
negative_slope = 0.01
|
||||
elif not isinstance(a, bool) and isinstance(a, (int, float)):
|
||||
negative_slope = a
|
||||
else:
|
||||
raise ValueError("negative_slope {} not a valid number".format(a))
|
||||
return math.sqrt(2.0 / (1 + negative_slope ** 2))
|
||||
|
||||
raise ValueError("Unsupported nonlinearity {}".format(nonlinearity))
|
||||
|
||||
fan_in, _ = _calculate_in_and_out(arr_shape)
|
||||
gain = calculate_gain(nonlinearity, a)
|
||||
std = gain / math.sqrt(fan_in)
|
||||
bound = math.sqrt(3.0) * std
|
||||
weight = np.random.uniform(-bound, bound, arr_shape).astype(np.float32)
|
||||
|
||||
bias = None
|
||||
if has_bias:
|
||||
bound_bias = 1 / math.sqrt(fan_in)
|
||||
bias = np.random.uniform(-bound_bias, bound_bias, arr_shape[0:1]).astype(np.float32)
|
||||
bias = Tensor(bias)
|
||||
|
||||
return Tensor(weight), bias
|
||||
|
||||
class ConvBNReLU(nn.SequentialCell):
|
||||
def __init__(self, in_planes, out_planes, kernel_size, stride, padding, groups, norm_layer, leaky=0):
|
||||
weight_shape = (out_planes, in_planes, kernel_size, kernel_size)
|
||||
kaiming_weight, _ = Init_KaimingUniform(weight_shape, a=math.sqrt(5))
|
||||
|
||||
super(ConvBNReLU, self).__init__(
|
||||
nn.Conv2d(in_planes, out_planes, kernel_size, stride, pad_mode='pad', padding=padding, group=groups,
|
||||
has_bias=False, weight_init=kaiming_weight),
|
||||
norm_layer(out_planes),
|
||||
nn.ReLU()  # `leaky` is accepted but unused here; FPN and SSH pass leaky=0 in this file
|
||||
)
|
||||
|
||||
class ConvBN(nn.SequentialCell):
|
||||
def __init__(self, in_planes, out_planes, kernel_size, stride, padding, groups, norm_layer):
|
||||
weight_shape = (out_planes, in_planes, kernel_size, kernel_size)
|
||||
kaiming_weight, _ = Init_KaimingUniform(weight_shape, a=math.sqrt(5))
|
||||
|
||||
super(ConvBN, self).__init__(
|
||||
nn.Conv2d(in_planes, out_planes, kernel_size, stride, pad_mode='pad', padding=padding, group=groups,
|
||||
has_bias=False, weight_init=kaiming_weight),
|
||||
norm_layer(out_planes),
|
||||
)
|
||||
|
||||
class SSH(nn.Cell):
|
||||
"""SSH"""
|
||||
def __init__(self, in_channel, out_channel):
|
||||
super(SSH, self).__init__()
|
||||
assert out_channel % 4 == 0
|
||||
leaky = 0
|
||||
if out_channel <= 64:
|
||||
leaky = 0.1
|
||||
|
||||
norm_layer = nn.BatchNorm2d
|
||||
self.conv3X3 = ConvBN(in_channel, out_channel // 2, kernel_size=3, stride=1, padding=1, groups=1,
|
||||
norm_layer=norm_layer)
|
||||
|
||||
self.conv5X5_1 = ConvBNReLU(in_channel, out_channel // 4, kernel_size=3, stride=1, padding=1, groups=1,
|
||||
norm_layer=norm_layer, leaky=leaky)
|
||||
self.conv5X5_2 = ConvBN(out_channel // 4, out_channel // 4, kernel_size=3, stride=1, padding=1, groups=1,
|
||||
norm_layer=norm_layer)
|
||||
|
||||
self.conv7X7_2 = ConvBNReLU(out_channel // 4, out_channel // 4, kernel_size=3, stride=1, padding=1, groups=1,
|
||||
norm_layer=norm_layer, leaky=leaky)
|
||||
self.conv7X7_3 = ConvBN(out_channel // 4, out_channel // 4, kernel_size=3, stride=1, padding=1, groups=1,
|
||||
norm_layer=norm_layer)
|
||||
|
||||
self.cat = P.Concat(axis=1)
|
||||
self.relu = nn.ReLU()
|
||||
|
||||
def construct(self, x):
|
||||
"""construct"""
|
||||
conv3X3 = self.conv3X3(x)
|
||||
|
||||
conv5X5_1 = self.conv5X5_1(x)
|
||||
conv5X5 = self.conv5X5_2(conv5X5_1)
|
||||
|
||||
conv7X7_2 = self.conv7X7_2(conv5X5_1)
|
||||
conv7X7 = self.conv7X7_3(conv7X7_2)
|
||||
|
||||
out = self.cat((conv3X3, conv5X5, conv7X7))
|
||||
out = self.relu(out)
|
||||
|
||||
return out
|
||||
|
||||
class FPN(nn.Cell):
|
||||
"""FPN"""
|
||||
def __init__(self):
|
||||
super(FPN, self).__init__()
|
||||
out_channels = 256
|
||||
leaky = 0
|
||||
if out_channels <= 64:
|
||||
leaky = 0.1
|
||||
norm_layer = nn.BatchNorm2d
|
||||
self.output1 = ConvBNReLU(512, 256, kernel_size=1, stride=1, padding=0, groups=1,
|
||||
norm_layer=norm_layer, leaky=leaky)
|
||||
self.output2 = ConvBNReLU(1024, 256, kernel_size=1, stride=1, padding=0, groups=1,
|
||||
norm_layer=norm_layer, leaky=leaky)
|
||||
self.output3 = ConvBNReLU(2048, 256, kernel_size=1, stride=1, padding=0, groups=1,
|
||||
norm_layer=norm_layer, leaky=leaky)
|
||||
|
||||
self.merge1 = ConvBNReLU(256, 256, kernel_size=3, stride=1, padding=1, groups=1,
|
||||
norm_layer=norm_layer, leaky=leaky)
|
||||
self.merge2 = ConvBNReLU(256, 256, kernel_size=3, stride=1, padding=1, groups=1,
|
||||
norm_layer=norm_layer, leaky=leaky)
|
||||
|
||||
def construct(self, input1, input2, input3):
|
||||
"""construct"""
|
||||
output1 = self.output1(input1)
|
||||
output2 = self.output2(input2)
|
||||
output3 = self.output3(input3)
|
||||
|
||||
up3 = P.ResizeNearestNeighbor([P.Shape()(output2)[2], P.Shape()(output2)[3]])(output3)
|
||||
output2 = up3 + output2
|
||||
output2 = self.merge2(output2)
|
||||
|
||||
up2 = P.ResizeNearestNeighbor([P.Shape()(output1)[2], P.Shape()(output1)[3]])(output2)
|
||||
output1 = up2 + output1
|
||||
output1 = self.merge1(output1)
|
||||
|
||||
return output1, output2, output3
|
||||
|
||||
class ClassHead(nn.Cell):
|
||||
"""ClassHead"""
|
||||
def __init__(self, inchannels=512, num_anchors=3):
|
||||
super(ClassHead, self).__init__()
|
||||
self.num_anchors = num_anchors
|
||||
|
||||
weight_shape = (self.num_anchors * 2, inchannels, 1, 1)
|
||||
kaiming_weight, kaiming_bias = Init_KaimingUniform(weight_shape, a=math.sqrt(5), has_bias=True)
|
||||
self.conv1x1 = nn.Conv2d(inchannels, self.num_anchors * 2, kernel_size=(1, 1), stride=1, padding=0,
|
||||
has_bias=True, weight_init=kaiming_weight, bias_init=kaiming_bias)
|
||||
|
||||
self.permute = P.Transpose()
|
||||
self.reshape = P.Reshape()
|
||||
|
||||
def construct(self, x):
|
||||
"""construct"""
|
||||
out = self.conv1x1(x)
|
||||
out = self.permute(out, (0, 2, 3, 1))
|
||||
return self.reshape(out, (P.Shape()(out)[0], -1, 2))
|
||||
|
||||
class BboxHead(nn.Cell):
|
||||
"""BboxHead"""
|
||||
def __init__(self, inchannels=512, num_anchors=3):
|
||||
super(BboxHead, self).__init__()
|
||||
|
||||
weight_shape = (num_anchors * 4, inchannels, 1, 1)
|
||||
kaiming_weight, kaiming_bias = Init_KaimingUniform(weight_shape, a=math.sqrt(5), has_bias=True)
|
||||
self.conv1x1 = nn.Conv2d(inchannels, num_anchors * 4, kernel_size=(1, 1), stride=1, padding=0, has_bias=True,
|
||||
weight_init=kaiming_weight, bias_init=kaiming_bias)
|
||||
|
||||
self.permute = P.Transpose()
|
||||
self.reshape = P.Reshape()
|
||||
|
||||
def construct(self, x):
|
||||
"""construct"""
|
||||
out = self.conv1x1(x)
|
||||
out = self.permute(out, (0, 2, 3, 1))
|
||||
return self.reshape(out, (P.Shape()(out)[0], -1, 4))
|
||||
|
||||
class LandmarkHead(nn.Cell):
|
||||
"""LandmarkHead"""
|
||||
def __init__(self, inchannels=512, num_anchors=3):
|
||||
super(LandmarkHead, self).__init__()
|
||||
|
||||
weight_shape = (num_anchors * 10, inchannels, 1, 1)
|
||||
kaiming_weight, kaiming_bias = Init_KaimingUniform(weight_shape, a=math.sqrt(5), has_bias=True)
|
||||
self.conv1x1 = nn.Conv2d(inchannels, num_anchors * 10, kernel_size=(1, 1), stride=1, padding=0, has_bias=True,
|
||||
weight_init=kaiming_weight, bias_init=kaiming_bias)
|
||||
|
||||
self.permute = P.Transpose()
|
||||
self.reshape = P.Reshape()
|
||||
|
||||
def construct(self, x):
|
||||
"""construct"""
|
||||
out = self.conv1x1(x)
|
||||
out = self.permute(out, (0, 2, 3, 1))
|
||||
return self.reshape(out, (P.Shape()(out)[0], -1, 10))
|
||||
|
||||
class RetinaFace(nn.Cell):
|
||||
"""RetinaFace"""
|
||||
def __init__(self, phase='train', backbone=None):
|
||||
|
||||
super(RetinaFace, self).__init__()
|
||||
self.phase = phase
|
||||
|
||||
self.base = backbone
|
||||
|
||||
self.fpn = FPN()
|
||||
|
||||
self.ssh1 = SSH(256, 256)
|
||||
self.ssh2 = SSH(256, 256)
|
||||
self.ssh3 = SSH(256, 256)
|
||||
|
||||
self.ClassHead = self._make_class_head(fpn_num=3, inchannels=[256, 256, 256], anchor_num=[2, 2, 2])
|
||||
self.BboxHead = self._make_bbox_head(fpn_num=3, inchannels=[256, 256, 256], anchor_num=[2, 2, 2])
|
||||
self.LandmarkHead = self._make_landmark_head(fpn_num=3, inchannels=[256, 256, 256], anchor_num=[2, 2, 2])
|
||||
|
||||
self.cat = P.Concat(axis=1)
|
||||
|
||||
def _make_class_head(self, fpn_num, inchannels, anchor_num):
|
||||
classhead = nn.CellList()
|
||||
for i in range(fpn_num):
|
||||
classhead.append(ClassHead(inchannels[i], anchor_num[i]))
|
||||
return classhead
|
||||
|
||||
def _make_bbox_head(self, fpn_num, inchannels, anchor_num):
|
||||
bboxhead = nn.CellList()
|
||||
for i in range(fpn_num):
|
||||
bboxhead.append(BboxHead(inchannels[i], anchor_num[i]))
|
||||
return bboxhead
|
||||
|
||||
def _make_landmark_head(self, fpn_num, inchannels, anchor_num):
|
||||
landmarkhead = nn.CellList()
|
||||
for i in range(fpn_num):
|
||||
landmarkhead.append(LandmarkHead(inchannels[i], anchor_num[i]))
|
||||
return landmarkhead
|
||||
|
||||
def construct(self, inputs):
|
||||
"""construct"""
|
||||
f1, f2, f3 = self.base(inputs)
|
||||
f1, f2, f3 = self.fpn(f1, f2, f3)
|
||||
|
||||
# SSH
|
||||
f1 = self.ssh1(f1)
|
||||
f2 = self.ssh2(f2)
|
||||
f3 = self.ssh3(f3)
|
||||
features = [f1, f2, f3]
|
||||
|
||||
bbox = ()
|
||||
for i, feature in enumerate(features):
|
||||
bbox = bbox + (self.BboxHead[i](feature),)
|
||||
bbox_regressions = self.cat(bbox)
|
||||
|
||||
cls = ()
|
||||
for i, feature in enumerate(features):
|
||||
cls = cls + (self.ClassHead[i](feature),)
|
||||
classifications = self.cat(cls)
|
||||
|
||||
landm = ()
|
||||
for i, feature in enumerate(features):
|
||||
landm = landm + (self.LandmarkHead[i](feature),)
|
||||
ldm_regressions = self.cat(landm)
|
||||
|
||||
if self.phase == 'train':
|
||||
output = (bbox_regressions, classifications, ldm_regressions)
|
||||
else:
|
||||
output = (bbox_regressions, P.Softmax(-1)(classifications), ldm_regressions)
|
||||
|
||||
return output
|
||||
|
||||
class RetinaFaceWithLossCell(nn.Cell):
|
||||
"""RetinaFaceWithLossCell"""
|
||||
def __init__(self, network, multibox_loss, config):
|
||||
super(RetinaFaceWithLossCell, self).__init__()
|
||||
self.network = network
|
||||
self.loc_weight = config['loc_weight']
|
||||
self.class_weight = config['class_weight']
|
||||
self.landm_weight = config['landm_weight']
|
||||
self.multibox_loss = multibox_loss
|
||||
|
||||
def construct(self, img, loc_t, conf_t, landm_t):
|
||||
"""construct"""
|
||||
pred_loc, pre_conf, pre_landm = self.network(img)
|
||||
loss_loc, loss_conf, loss_landm = self.multibox_loss(pred_loc, loc_t, pre_conf, conf_t, pre_landm, landm_t) # calls the construct function of the MultiBoxLoss class in loss.py
|
||||
|
||||
return loss_loc * self.loc_weight + loss_conf * self.class_weight + loss_landm * self.landm_weight
|
||||
|
||||
# from dsj
|
||||
GRADIENT_CLIP_TYPE = 1
|
||||
GRADIENT_CLIP_VALUE = 1.0
|
||||
|
||||
clip_grad = C.MultitypeFuncGraph("clip_grad")
|
||||
|
||||
|
||||
@clip_grad.register("Number", "Number", "Tensor")
|
||||
def _clip_grad(clip_type, clip_value, grad):
|
||||
"""_clip_grad"""
|
||||
if clip_type not in (0, 1):
|
||||
return grad
|
||||
dt = F.dtype(grad)
|
||||
if clip_type == 0:
|
||||
new_grad = C.clip_by_value(grad, F.cast(F.tuple_to_array((-clip_value,)), dt),
|
||||
F.cast(F.tuple_to_array((clip_value,)), dt))
|
||||
else:
|
||||
new_grad = nn.ClipByNorm()(grad, F.cast(F.tuple_to_array((clip_value,)), dt))
|
||||
return new_grad
|
||||
|
||||
|
||||
class TrainingWrapper(nn.Cell):
|
||||
"""TrainingWrapper"""
|
||||
def __init__(self, network, optimizer, sens=1.0):
|
||||
super(TrainingWrapper, self).__init__(auto_prefix=False)
|
||||
self.network = network
|
||||
self.weights = mindspore.ParameterTuple(network.trainable_params())
|
||||
self.optimizer = optimizer
|
||||
self.grad = C.GradOperation(get_by_list=True, sens_param=True)
|
||||
self.sens = sens
|
||||
self.reducer_flag = False
|
||||
self.grad_reducer = None
|
||||
self.parallel_mode = context.get_auto_parallel_context("parallel_mode")
|
||||
class_list = [mindspore.context.ParallelMode.DATA_PARALLEL, mindspore.context.ParallelMode.HYBRID_PARALLEL]
|
||||
if self.parallel_mode in class_list:
|
||||
self.reducer_flag = True
|
||||
if self.reducer_flag:
|
||||
mean = context.get_auto_parallel_context("gradients_mean")
|
||||
if auto_parallel_context().get_device_num_is_set():
|
||||
degree = context.get_auto_parallel_context("device_num")
|
||||
else:
|
||||
degree = get_group_size()
|
||||
self.grad_reducer = nn.DistributedGradReducer(optimizer.parameters, mean, degree)
|
||||
# from dsj
|
||||
self.hyper_map = mindspore.ops.HyperMap()
|
||||
|
||||
def construct(self, *args):
|
||||
"""construct"""
|
||||
weights = self.weights
|
||||
loss = self.network(*args)
|
||||
sens = P.Fill()(P.DType()(loss), P.Shape()(loss), self.sens)
|
||||
grads = self.grad(self.network, weights)(*args, sens)
|
||||
# from dsj
|
||||
grads = self.hyper_map(F.partial(clip_grad, GRADIENT_CLIP_TYPE, GRADIENT_CLIP_VALUE), grads)
|
||||
if self.reducer_flag:
|
||||
# apply grad reducer on grads
|
||||
grads = self.grad_reducer(grads)
|
||||
return F.depend(loss, self.optimizer(grads))
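The wrapper above clips every gradient before the reducer runs, and `GRADIENT_CLIP_TYPE = 1` selects norm clipping at `GRADIENT_CLIP_VALUE = 1.0`. As a rough illustration of how the two clip types differ, here is a minimal sketch in plain Python. It uses lists instead of tensors, and the function names `clip_by_value` and `clip_by_norm` are hypothetical stand-ins, not the MindSpore operators themselves.

```python
import math


def clip_by_value(grad, clip_value):
    """Clamp every element of `grad` into [-clip_value, clip_value]."""
    return [max(-clip_value, min(clip_value, g)) for g in grad]


def clip_by_norm(grad, clip_value):
    """Rescale `grad` so that its L2 norm is at most `clip_value`."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= clip_value:
        return list(grad)
    scale = clip_value / norm
    return [g * scale for g in grad]


grad = [3.0, -4.0]                    # L2 norm is 5.0
by_value = clip_by_value(grad, 1.0)   # each element is clamped independently
by_norm = clip_by_norm(grad, 1.0)     # direction is kept, norm is scaled down to 1.0
```

Value clipping changes the direction of the gradient when elements saturate, while norm clipping only shortens it, which is presumably why the norm variant is the default here.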
|
|
@ -0,0 +1,204 @@
|
|||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
"""Network."""
|
||||
import numpy as np
|
||||
|
||||
import mindspore.nn as nn
|
||||
from mindspore.ops import operations as P
|
||||
from mindspore import Tensor
|
||||
|
||||
# ResNet
|
||||
def _weight_variable(shape, factor=0.01):
|
||||
init_value = np.random.randn(*shape).astype(np.float32) * factor
|
||||
return Tensor(init_value)
|
||||
|
||||
def _conv3x3(in_channel, out_channel, stride=1):
|
||||
weight_shape = (out_channel, in_channel, 3, 3)
|
||||
weight = _weight_variable(weight_shape)
|
||||
|
||||
return nn.Conv2d(in_channel, out_channel,
|
||||
kernel_size=3, stride=stride, padding=1, pad_mode='pad', weight_init=weight)
|
||||
|
||||
def _conv1x1(in_channel, out_channel, stride=1):
|
||||
weight_shape = (out_channel, in_channel, 1, 1)
|
||||
weight = _weight_variable(weight_shape)
|
||||
|
||||
return nn.Conv2d(in_channel, out_channel,
|
||||
kernel_size=1, stride=stride, padding=0, pad_mode='pad', weight_init=weight)
|
||||
|
||||
def _conv7x7(in_channel, out_channel, stride=1):
|
||||
weight_shape = (out_channel, in_channel, 7, 7)
|
||||
weight = _weight_variable(weight_shape)
|
||||
|
||||
return nn.Conv2d(in_channel, out_channel,
|
||||
kernel_size=7, stride=stride, padding=3, pad_mode='pad', weight_init=weight)
|
||||
|
||||
def _bn(channel):
|
||||
return nn.BatchNorm2d(channel)
|
||||
|
||||
|
||||
def _bn_last(channel):
|
||||
return nn.BatchNorm2d(channel)
|
||||
|
||||
|
||||
def _fc(in_channel, out_channel):
|
||||
weight_shape = (out_channel, in_channel)
|
||||
weight = _weight_variable(weight_shape)
|
||||
return nn.Dense(in_channel, out_channel, has_bias=True, weight_init=weight, bias_init=0)
|
||||
|
||||
class ResidualBlock(nn.Cell):
|
||||
"""ResidualBlock"""
|
||||
expansion = 4
|
||||
|
||||
def __init__(self,
|
||||
in_channel,
|
||||
out_channel,
|
||||
stride=1):
|
||||
super(ResidualBlock, self).__init__()
|
||||
|
||||
channel = out_channel // self.expansion
|
||||
self.conv1 = _conv1x1(in_channel, channel, stride=1)
|
||||
self.bn1 = _bn(channel)
|
||||
|
||||
self.conv2 = _conv3x3(channel, channel, stride=stride)
|
||||
self.bn2 = _bn(channel)
|
||||
|
||||
self.conv3 = _conv1x1(channel, out_channel, stride=1)
|
||||
self.bn3 = _bn_last(out_channel)
|
||||
|
||||
self.relu = nn.ReLU()
|
||||
|
||||
self.down_sample = False
|
||||
|
||||
if stride != 1 or in_channel != out_channel:
|
||||
self.down_sample = True
|
||||
self.down_sample_layer = None
|
||||
|
||||
if self.down_sample:
|
||||
self.down_sample_layer = nn.SequentialCell([_conv1x1(in_channel, out_channel, stride),
|
||||
_bn(out_channel)])
|
||||
self.add = P.Add()
|
||||
|
||||
def construct(self, x):
|
||||
"""construct"""
|
||||
identity = x
|
||||
|
||||
out = self.conv1(x)
|
||||
out = self.bn1(out)
|
||||
out = self.relu(out)
|
||||
|
||||
out = self.conv2(out)
|
||||
out = self.bn2(out)
|
||||
out = self.relu(out)
|
||||
|
||||
out = self.conv3(out)
|
||||
out = self.bn3(out)
|
||||
|
||||
if self.down_sample:
|
||||
identity = self.down_sample_layer(identity)
|
||||
|
||||
out = self.add(out, identity)
|
||||
out = self.relu(out)
|
||||
|
||||
return out
|
||||
|
||||
class ResNet(nn.Cell):
|
||||
"""ResNet"""
|
||||
def __init__(self,
|
||||
block,
|
||||
layer_nums,
|
||||
in_channels,
|
||||
out_channels,
|
||||
strides,
|
||||
num_classes):
|
||||
super(ResNet, self).__init__()
|
||||
|
||||
if not len(layer_nums) == len(in_channels) == len(out_channels) == 4:
|
||||
raise ValueError("the lengths of layer_nums, in_channels and out_channels must all be 4!")
|
||||
|
||||
self.conv1 = _conv7x7(3, 64, stride=2)
|
||||
self.bn1 = _bn(64)
|
||||
self.relu = P.ReLU()
|
||||
|
||||
|
||||
self.pad = P.Pad(((0, 0), (0, 0), (1, 0), (1, 0)))
|
||||
self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, pad_mode="valid")
|
||||
|
||||
|
||||
self.layer1 = self._make_layer(block,
|
||||
layer_nums[0],
|
||||
in_channel=in_channels[0],
|
||||
out_channel=out_channels[0],
|
||||
stride=strides[0])
|
||||
self.layer2 = self._make_layer(block,
|
||||
layer_nums[1],
|
||||
in_channel=in_channels[1],
|
||||
out_channel=out_channels[1],
|
||||
stride=strides[1])
|
||||
self.layer3 = self._make_layer(block,
|
||||
layer_nums[2],
|
||||
in_channel=in_channels[2],
|
||||
out_channel=out_channels[2],
|
||||
stride=strides[2])
|
||||
self.layer4 = self._make_layer(block,
|
||||
layer_nums[3],
|
||||
in_channel=in_channels[3],
|
||||
out_channel=out_channels[3],
|
||||
stride=strides[3])
|
||||
|
||||
self.mean = P.ReduceMean(keep_dims=True)
|
||||
self.flatten = nn.Flatten()
|
||||
self.end_point = _fc(out_channels[3], num_classes)
|
||||
|
||||
def _make_layer(self, block, layer_num, in_channel, out_channel, stride):
|
||||
"""_make_layer"""
|
||||
layers = []
|
||||
|
||||
resnet_block = block(in_channel, out_channel, stride=stride)
|
||||
layers.append(resnet_block)
|
||||
|
||||
for _ in range(1, layer_num):
|
||||
resnet_block = block(out_channel, out_channel, stride=1)
|
||||
layers.append(resnet_block)
|
||||
|
||||
return nn.SequentialCell(layers)
|
||||
|
||||
def construct(self, x):
|
||||
"""construct"""
|
||||
x = self.conv1(x)
|
||||
x = self.bn1(x)
|
||||
x = self.relu(x)
|
||||
x = self.pad(x)
|
||||
|
||||
c1 = self.maxpool(x)
|
||||
|
||||
c2 = self.layer1(c1)
|
||||
c3 = self.layer2(c2)
|
||||
c4 = self.layer3(c3)
|
||||
c5 = self.layer4(c4)
|
||||
|
||||
out = self.mean(c5, (2, 3))
|
||||
out = self.flatten(out)
|
||||
out = self.end_point(out)
|
||||
|
||||
return out
|
||||
|
||||
def resnet50(class_num=1000):
|
||||
return ResNet(ResidualBlock,
|
||||
[3, 4, 6, 3],
|
||||
[64, 256, 512, 1024],
|
||||
[256, 512, 1024, 2048],
|
||||
[1, 2, 2, 2],
|
||||
class_num)
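To see how the stem and the four stages above shrink the input, the usual convolution output-size formula can be applied layer by layer. The sketch below is an illustration only: the helper names are hypothetical and the 640-pixel input size is an assumption, not a value taken from this file.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def backbone_sizes(size):
    """Spatial size after each of the four ResNet stages (layer1..layer4)."""
    size = conv_out(size, kernel=7, stride=2, padding=3)   # conv1 (7x7, stride 2)
    size = size + 1                                        # P.Pad adds one row/column at the top/left
    size = conv_out(size, kernel=3, stride=2, padding=0)   # max pool with pad_mode="valid"
    sizes = []
    for stride in (1, 2, 2, 2):                            # strides passed by resnet50()
        # Only the first block of a stage is strided, through its 3x3 conv.
        size = conv_out(size, kernel=3, stride=stride, padding=1)
        sizes.append(size)
    return sizes


sizes = backbone_sizes(640)   # 640 is an assumed input size
```

Under that assumption the last three stages are 8, 16 and 32 times smaller than the input, the same step values that `bbox_encode` in the utilities file passes to `prior_box`.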
|
|
@ -0,0 +1,165 @@
|
|||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
"""Utils."""
|
||||
from itertools import product
|
||||
import math
|
||||
import numpy as np
|
||||
|
||||
|
||||
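def prior_box(image_sizes, min_sizes, steps, clip=False):
|
||||
"""Generate centre-form anchors (cx, cy, w, h), normalised by the image size, for every feature-map cell and min_size."""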
|
||||
feature_maps = [
|
||||
[math.ceil(image_sizes[0] / step), math.ceil(image_sizes[1] / step)]
|
||||
for step in steps]
|
||||
|
||||
anchors = []
|
||||
for k, f in enumerate(feature_maps):
|
||||
for i, j in product(range(f[0]), range(f[1])):
|
||||
for min_size in min_sizes[k]:
|
||||
s_kx = min_size / image_sizes[1]
|
||||
s_ky = min_size / image_sizes[0]
|
||||
cx = (j + 0.5) * steps[k] / image_sizes[1]
|
||||
cy = (i + 0.5) * steps[k] / image_sizes[0]
|
||||
anchors += [cx, cy, s_kx, s_ky]
|
||||
|
||||
output = np.asarray(anchors).reshape([-1, 4]).astype(np.float32)
|
||||
|
||||
if clip:
|
||||
output = np.clip(output, 0, 1)
|
||||
|
||||
return output
|
||||
|
||||
def center_point_2_box(boxes):
|
||||
return np.concatenate((boxes[:, 0:2] - boxes[:, 2:4] / 2,
|
||||
boxes[:, 0:2] + boxes[:, 2:4] / 2), axis=1)
|
||||
|
||||
def compute_intersect(a, b):
|
||||
"""Pairwise intersection areas of corner-form boxes a [A, 4] and b [B, 4]; returns an [A, B] array."""
|
||||
A = a.shape[0]
|
||||
B = b.shape[0]
|
||||
max_xy = np.minimum(
|
||||
np.broadcast_to(np.expand_dims(a[:, 2:4], 1), [A, B, 2]),
|
||||
np.broadcast_to(np.expand_dims(b[:, 2:4], 0), [A, B, 2]))
|
||||
min_xy = np.maximum(
|
||||
np.broadcast_to(np.expand_dims(a[:, 0:2], 1), [A, B, 2]),
|
||||
np.broadcast_to(np.expand_dims(b[:, 0:2], 0), [A, B, 2]))
|
||||
inter = np.maximum((max_xy - min_xy), np.zeros_like(max_xy - min_xy))
|
||||
return inter[:, :, 0] * inter[:, :, 1]
|
||||
|
||||
def compute_overlaps(a, b):
|
||||
"""Pairwise IoU (Jaccard overlap) of corner-form boxes a [A, 4] and b [B, 4]; returns an [A, B] array."""
|
||||
inter = compute_intersect(a, b)
|
||||
area_a = np.broadcast_to(
|
||||
np.expand_dims(
|
||||
(a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1]), 1),
|
||||
np.shape(inter))
|
||||
area_b = np.broadcast_to(
|
||||
np.expand_dims(
|
||||
(b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1]), 0),
|
||||
np.shape(inter))
|
||||
union = area_a + area_b - inter
|
||||
return inter / union
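The two functions above compute the overlap for all box pairs at once through broadcasting. For a single pair the same arithmetic reduces to the scalar sketch below, written in plain Python for illustration. The function name `iou` is hypothetical.

```python
def iou(box_a, box_b):
    """IoU of two boxes given in corner form (x0, y0, x1, y1)."""
    # Width and height of the intersection, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


half_overlap = iou((0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 3.0, 2.0))   # boxes share half their area
disjoint = iou((0.0, 0.0, 1.0, 1.0), (2.0, 2.0, 3.0, 3.0))       # boxes do not touch
```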
|
||||
|
||||
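def match(threshold, boxes, priors, var, labels, landms):
|
||||
"""Match priors to ground-truth boxes by IoU and encode the box, label and landmark targets for every prior."""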
|
||||
overlaps = compute_overlaps(boxes, center_point_2_box(priors))
|
||||
|
||||
best_prior_overlap = overlaps.max(1, keepdims=True)
|
||||
best_prior_idx = np.argsort(-overlaps, axis=1)[:, 0:1]
|
||||
|
||||
valid_gt_idx = best_prior_overlap[:, 0] >= 0.2
|
||||
best_prior_idx_filter = best_prior_idx[valid_gt_idx, :]
|
||||
if best_prior_idx_filter.shape[0] <= 0:
|
||||
loc = np.zeros((priors.shape[0], 4), dtype=np.float32)
|
||||
conf = np.zeros((priors.shape[0],), dtype=np.int32)
|
||||
landm = np.zeros((priors.shape[0], 10), dtype=np.float32)
|
||||
return loc, conf, landm
|
||||
|
||||
best_truth_overlap = overlaps.max(0, keepdims=True)
|
||||
best_truth_idx = np.argsort(-overlaps, axis=0)[:1, :]
|
||||
|
||||
best_truth_idx = best_truth_idx.squeeze(0)
|
||||
best_truth_overlap = best_truth_overlap.squeeze(0)
|
||||
best_prior_idx = best_prior_idx.squeeze(1)
|
||||
best_prior_idx_filter = best_prior_idx_filter.squeeze(1)
|
||||
best_truth_overlap[best_prior_idx_filter] = 2
|
||||
|
||||
for j in range(best_prior_idx.shape[0]):
|
||||
best_truth_idx[best_prior_idx[j]] = j
|
||||
|
||||
matches = boxes[best_truth_idx]
|
||||
|
||||
# encode boxes
|
||||
offset_cxcy = (matches[:, 0:2] + matches[:, 2:4]) / 2 - priors[:, 0:2]
|
||||
offset_cxcy /= (var[0] * priors[:, 2:4])
|
||||
wh = (matches[:, 2:4] - matches[:, 0:2]) / priors[:, 2:4]
|
||||
wh[wh == 0] = 1e-12
|
||||
wh = np.log(wh) / var[1]
|
||||
loc = np.concatenate([offset_cxcy, wh], axis=1)
|
||||
|
||||
|
||||
conf = labels[best_truth_idx]
|
||||
conf[best_truth_overlap < threshold] = 0
|
||||
|
||||
matches_landm = landms[best_truth_idx]
|
||||
|
||||
# encode landms
|
||||
matched = np.reshape(matches_landm, [-1, 5, 2])
|
||||
priors = np.broadcast_to(np.expand_dims(priors, 1), [priors.shape[0], 5, 4])
|
||||
offset_cxcy = matched[:, :, 0:2] - priors[:, :, 0:2]
|
||||
offset_cxcy /= (priors[:, :, 2:4] * var[0])
|
||||
landm = np.reshape(offset_cxcy, [-1, 10])
|
||||
|
||||
|
||||
return loc, np.array(conf, dtype=np.int32), landm
|
||||
|
||||
|
||||
class bbox_encode():
|
||||
"""bbox_encode"""
|
||||
def __init__(self, cfg):
|
||||
self.match_thresh = cfg['match_thresh']
|
||||
self.variances = cfg['variance']
|
||||
self.priors = prior_box((cfg['image_size'], cfg['image_size']),
|
||||
[[16, 32], [64, 128], [256, 512]],
|
||||
[8, 16, 32],
|
||||
cfg['clip'])
|
||||
|
||||
def __call__(self, image, targets):
|
||||
|
||||
boxes = targets[:, :4]
|
||||
labels = targets[:, -1]
|
||||
landms = targets[:, 4:14]
|
||||
priors = self.priors
|
||||
|
||||
loc_t, conf_t, landm_t = match(self.match_thresh, boxes, priors, self.variances, labels, landms)
|
||||
|
||||
return image, loc_t, conf_t, landm_t
|
||||
|
||||
def decode_bbox(bbox, priors, var):
|
||||
boxes = np.concatenate((
|
||||
priors[:, 0:2] + bbox[:, 0:2] * var[0] * priors[:, 2:4],
|
||||
priors[:, 2:4] * np.exp(bbox[:, 2:4] * var[1])), axis=1) # (xc, yc, w, h)
|
||||
boxes[:, :2] -= boxes[:, 2:] / 2 # (x0, y0, w, h)
|
||||
boxes[:, 2:] += boxes[:, :2] # (x0, y0, x1, y1)
|
||||
return boxes
|
||||
|
||||
def decode_landm(landm, priors, var):
|
||||
|
||||
return np.concatenate((priors[:, 0:2] + landm[:, 0:2] * var[0] * priors[:, 2:4],
|
||||
priors[:, 0:2] + landm[:, 2:4] * var[0] * priors[:, 2:4],
|
||||
priors[:, 0:2] + landm[:, 4:6] * var[0] * priors[:, 2:4],
|
||||
priors[:, 0:2] + landm[:, 6:8] * var[0] * priors[:, 2:4],
|
||||
priors[:, 0:2] + landm[:, 8:10] * var[0] * priors[:, 2:4],
|
||||
), axis=1)
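The box encoding in `match` and the decoding in `decode_bbox` are meant to be inverses of each other. A minimal scalar sketch of that round trip follows, in plain Python. The helper names `encode` and `decode` are hypothetical, and the variance pair `(0.1, 0.2)` is an assumed value, since `cfg['variance']` is defined in a file not shown here.

```python
import math


def encode(box, prior, var):
    """Encode a corner-form box (x0, y0, x1, y1) against a centre-form prior (cx, cy, w, h)."""
    x0, y0, x1, y1 = box
    pcx, pcy, pw, ph = prior
    return (((x0 + x1) / 2 - pcx) / (var[0] * pw),   # centre offset, scaled by prior size
            ((y0 + y1) / 2 - pcy) / (var[0] * ph),
            math.log((x1 - x0) / pw) / var[1],       # log size ratio
            math.log((y1 - y0) / ph) / var[1])


def decode(loc, prior, var):
    """Invert `encode` and return the box in corner form."""
    pcx, pcy, pw, ph = prior
    cx = pcx + loc[0] * var[0] * pw
    cy = pcy + loc[1] * var[0] * ph
    w = pw * math.exp(loc[2] * var[1])
    h = ph * math.exp(loc[3] * var[1])
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


var = (0.1, 0.2)                  # assumed variances
prior = (0.5, 0.5, 0.2, 0.2)      # one anchor, normalised coordinates
box = (0.35, 0.40, 0.60, 0.70)    # one ground-truth box, normalised coordinates
restored = decode(encode(box, prior, var), prior, var)
```

Dividing by the variances enlarges the regression targets, which is a common choice in SSD-style detectors. The landmark encoding uses only the centre-offset half of this scheme.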
|
|
@ -0,0 +1,227 @@
|
|||
# Copyright 2021 Huawei Technologies Co., Ltd
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
"""Train RetinaFace with a ResNet50 or MobileNet0.25 backbone."""
|
||||
|
||||
import argparse
|
||||
import math
|
||||
import mindspore
|
||||
|
||||
from mindspore import context
|
||||
from mindspore.context import ParallelMode
|
||||
from mindspore.train import Model
|
||||
from mindspore.train.callback import ModelCheckpoint, CheckpointConfig, LossMonitor, TimeMonitor
|
||||
from mindspore.communication.management import init, get_rank, get_group_size
|
||||
from mindspore.train.serialization import load_checkpoint, load_param_into_net
|
||||
|
||||
from src.config import cfg_res50, cfg_mobile025
|
||||
from src.loss import MultiBoxLoss
|
||||
from src.dataset import create_dataset
|
||||
from src.lr_schedule import adjust_learning_rate, warmup_cosine_annealing_lr
|
||||
|
||||
def train_with_resnet(cfg):
|
||||
"""train_with_resnet"""
|
||||
mindspore.common.seed.set_seed(cfg['seed'])
|
||||
from src.network_with_resnet import RetinaFace, RetinaFaceWithLossCell, TrainingWrapper, resnet50
|
||||
context.set_context(mode=context.GRAPH_MODE, device_target=cfg['device_target'])
|
||||
device_num = cfg['nnpu']
|
||||
rank = 0
|
||||
if cfg['device_target'] == "Ascend":
|
||||
if device_num > 1:
|
||||
context.reset_auto_parallel_context()
|
||||
context.set_auto_parallel_context(device_num=device_num, parallel_mode=ParallelMode.DATA_PARALLEL,
|
||||
gradients_mean=True)
|
||||
init()
|
||||
rank = get_rank()
|
||||
else:
|
||||
context.set_context(device_id=cfg['device_id'])
|
||||
elif cfg['device_target'] == "GPU":
|
||||
if cfg['ngpu'] > 1:
|
||||
init("nccl")
|
||||
context.set_auto_parallel_context(device_num=get_group_size(), parallel_mode=ParallelMode.DATA_PARALLEL,
|
||||
gradients_mean=True)
|
||||
rank = get_rank()
|
||||
|
||||
batch_size = cfg['batch_size']
|
||||
max_epoch = cfg['epoch']
|
||||
|
||||
momentum = cfg['momentum']
|
||||
lr_type = cfg['lr_type']
|
||||
weight_decay = cfg['weight_decay']
|
||||
loss_scale = cfg['loss_scale']
|
||||
initial_lr = cfg['initial_lr']
|
||||
gamma = cfg['gamma']
|
||||
T_max = cfg['T_max']
|
||||
eta_min = cfg['eta_min']
|
||||
training_dataset = cfg['training_dataset']
|
||||
num_classes = 2
|
||||
negative_ratio = 7
|
||||
stepvalues = (cfg['decay1'], cfg['decay2'])
|
||||
|
||||
ds_train = create_dataset(training_dataset, cfg, batch_size, multiprocessing=True, num_worker=cfg['num_workers'])
|
||||
print('dataset size is : \n', ds_train.get_dataset_size())
|
||||
|
||||
steps_per_epoch = math.ceil(ds_train.get_dataset_size())
|
||||
|
||||
multibox_loss = MultiBoxLoss(num_classes, cfg['num_anchor'], negative_ratio, cfg['batch_size'])
|
||||
backbone = resnet50(1001)
|
||||
backbone.set_train(True)
|
||||
|
||||
if cfg['pretrain'] and cfg['resume_net'] is None:
|
||||
pretrained_res50 = cfg['pretrain_path']
|
||||
param_dict_res50 = load_checkpoint(pretrained_res50)
|
||||
load_param_into_net(backbone, param_dict_res50)
|
||||
print('Load resnet50 from [{}] done.'.format(pretrained_res50))
|
||||
|
||||
net = RetinaFace(phase='train', backbone=backbone)
|
||||
net.set_train(True)
|
||||
|
||||
if cfg['resume_net'] is not None:
|
||||
pretrain_model_path = cfg['resume_net']
|
||||
param_dict_retinaface = load_checkpoint(pretrain_model_path)
|
||||
load_param_into_net(net, param_dict_retinaface)
|
||||
print('Resume Model from [{}] Done.'.format(cfg['resume_net']))
|
||||
|
||||
net = RetinaFaceWithLossCell(net, multibox_loss, cfg)
|
||||
|
||||
if lr_type == 'dynamic_lr':
|
||||
lr = adjust_learning_rate(initial_lr, gamma, stepvalues, steps_per_epoch, max_epoch,
|
||||
warmup_epoch=cfg['warmup_epoch'], lr_type1=lr_type)
|
||||
elif lr_type == 'cosine_annealing':
|
||||
lr = warmup_cosine_annealing_lr(initial_lr, steps_per_epoch, cfg['warmup_epoch'], max_epoch, T_max, eta_min)
|
||||
else:
|
||||
raise ValueError("lr_type must be 'dynamic_lr' or 'cosine_annealing'.")
|
||||
|
||||
if cfg['optim'] == 'momentum':
|
||||
opt = mindspore.nn.Momentum(net.trainable_params(), lr, momentum, weight_decay, loss_scale)
|
||||
elif cfg['optim'] == 'sgd':
|
||||
opt = mindspore.nn.SGD(params=net.trainable_params(), learning_rate=lr, momentum=momentum,
|
||||
weight_decay=weight_decay, loss_scale=loss_scale)
|
||||
else:
|
||||
raise ValueError("optim is not defined; expected 'momentum' or 'sgd'.")
|
||||
|
||||
net = TrainingWrapper(net, opt)
|
||||
|
||||
model = Model(net)
|
||||
|
||||
config_ck = CheckpointConfig(save_checkpoint_steps=ds_train.get_dataset_size() * 1,
|
||||
keep_checkpoint_max=cfg['keep_checkpoint_max'])
|
||||
cfg['ckpt_path'] = cfg['ckpt_path'] + "ckpt_" + str(rank) + "/"
|
||||
ckpoint_cb = ModelCheckpoint(prefix="RetinaFace", directory=cfg['ckpt_path'], config=config_ck)
|
||||
|
||||
time_cb = TimeMonitor(data_size=ds_train.get_dataset_size())
|
||||
callback_list = [LossMonitor(), time_cb, ckpoint_cb]
|
||||
|
||||
print("============== Starting Training ==============")
|
||||
model.train(max_epoch, ds_train, callbacks=callback_list)
|
||||
|
||||
|
||||
def train_with_mobilenet(cfg):
|
||||
"""train_with_mobilenet"""
|
||||
mindspore.common.seed.set_seed(cfg['seed'])
|
||||
from src.network_with_mobilenet import RetinaFace, RetinaFaceWithLossCell, TrainingWrapper, resnet50, mobilenet025
|
||||
context.set_context(mode=context.GRAPH_MODE, device_target='GPU', save_graphs=False)
|
||||
if context.get_context("device_target") == "GPU":
|
||||
# Enable graph kernel
|
||||
context.set_context(enable_graph_kernel=True, graph_kernel_flags="--enable_parallel_fusion")
|
||||
if cfg['ngpu'] > 1:
|
||||
init("nccl")
|
||||
context.set_auto_parallel_context(device_num=get_group_size(), parallel_mode=ParallelMode.DATA_PARALLEL,
|
||||
gradients_mean=True)
|
||||
cfg['ckpt_path'] = cfg['ckpt_path'] + "ckpt_" + str(get_rank()) + "/"
|
||||
|
||||
batch_size = cfg['batch_size']
|
||||
max_epoch = cfg['epoch']
|
||||
|
||||
momentum = cfg['momentum']
|
||||
lr_type = cfg['lr_type']
|
||||
weight_decay = cfg['weight_decay']
|
||||
initial_lr = cfg['initial_lr']
|
||||
gamma = cfg['gamma']
|
||||
training_dataset = cfg['training_dataset']
|
||||
num_classes = 2
|
||||
negative_ratio = 7
|
||||
stepvalues = (cfg['decay1'], cfg['decay2'])
|
||||
|
||||
ds_train = create_dataset(training_dataset, cfg, batch_size, multiprocessing=True, num_worker=cfg['num_workers'])
|
||||
print('dataset size is : \n', ds_train.get_dataset_size())
|
||||
|
||||
steps_per_epoch = math.ceil(ds_train.get_dataset_size())
|
||||
|
||||
multibox_loss = MultiBoxLoss(num_classes, cfg['num_anchor'], negative_ratio, cfg['batch_size'])
|
||||
if cfg['name'] == 'ResNet50':
|
||||
backbone = resnet50(1001)
|
||||
elif cfg['name'] == 'MobileNet025':
|
||||
backbone = mobilenet025(1000)
|
||||
backbone.set_train(True)
|
||||
|
||||
if cfg['name'] == 'ResNet50' and cfg['pretrain'] and cfg['resume_net'] is None:
|
||||
pretrained_res50 = cfg['pretrain_path']
|
||||
param_dict_res50 = load_checkpoint(pretrained_res50)
|
||||
load_param_into_net(backbone, param_dict_res50)
|
||||
print('Load resnet50 from [{}] done.'.format(pretrained_res50))
|
||||
elif cfg['name'] == 'MobileNet025' and cfg['pretrain'] and cfg['resume_net'] is None:
|
||||
pretrained_mobile025 = cfg['pretrain_path']
|
||||
param_dict_mobile025 = load_checkpoint(pretrained_mobile025)
|
||||
load_param_into_net(backbone, param_dict_mobile025)
|
||||
print('Load mobilenet0.25 from [{}] done.'.format(pretrained_mobile025))
|
||||
|
||||
net = RetinaFace(phase='train', backbone=backbone, cfg=cfg)
|
||||
net.set_train(True)
|
||||
|
||||
if cfg['resume_net'] is not None:
|
||||
pretrain_model_path = cfg['resume_net']
|
||||
param_dict_retinaface = load_checkpoint(pretrain_model_path)
|
||||
load_param_into_net(net, param_dict_retinaface)
|
||||
print('Resume Model from [{}] Done.'.format(cfg['resume_net']))
|
||||
|
||||
net = RetinaFaceWithLossCell(net, multibox_loss, cfg)
|
||||
|
||||
lr = adjust_learning_rate(initial_lr, gamma, stepvalues, steps_per_epoch, max_epoch,
|
||||
warmup_epoch=cfg['warmup_epoch'], lr_type1=lr_type)
|
||||
|
||||
if cfg['optim'] == 'momentum':
|
||||
opt = mindspore.nn.Momentum(net.trainable_params(), lr, momentum)
|
||||
elif cfg['optim'] == 'sgd':
|
||||
opt = mindspore.nn.SGD(params=net.trainable_params(), learning_rate=lr, momentum=momentum,
|
||||
weight_decay=weight_decay, loss_scale=1)
|
||||
else:
|
||||
raise ValueError("optim is not defined; expected 'momentum' or 'sgd'.")
|
||||
|
||||
net = TrainingWrapper(net, opt)
|
||||
|
||||
model = Model(net)
|
||||
|
||||
config_ck = CheckpointConfig(save_checkpoint_steps=cfg['save_checkpoint_steps'],
|
||||
keep_checkpoint_max=cfg['keep_checkpoint_max'])
|
||||
ckpoint_cb = ModelCheckpoint(prefix="RetinaFace", directory=cfg['ckpt_path'], config=config_ck)
|
||||
|
||||
time_cb = TimeMonitor(data_size=ds_train.get_dataset_size())
|
||||
callback_list = [LossMonitor(), time_cb, ckpoint_cb]
|
||||
|
||||
print("============== Starting Training ==============")
|
||||
model.train(max_epoch, ds_train, callbacks=callback_list, dataset_sink_mode=True)
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
parser = argparse.ArgumentParser(description='train')
|
||||
parser.add_argument('--backbone_name', type=str, default='ResNet50',
|
||||
help='backbone name')
|
||||
args_opt = parser.parse_args()
|
||||
|
||||
if args_opt.backbone_name == 'ResNet50':
|
||||
config = cfg_res50
|
||||
train_with_resnet(cfg=config)
|
||||
elif args_opt.backbone_name == 'MobileNet025':
|
||||
config = cfg_mobile025
|
||||
train_with_mobilenet(cfg=config)
|
||||
else:
|
||||
raise ValueError("backbone_name must be 'ResNet50' or 'MobileNet025'.")
|
||||
print('train config:\n', config)
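The ResNet50 path can select a `cosine_annealing` schedule through `warmup_cosine_annealing_lr`, whose body lives in `src/lr_schedule.py` and is not shown in this part of the diff. The sketch below is therefore a generic warmup-plus-cosine schedule under assumed semantics (linear warmup per step, cosine decay per epoch), not the repository's implementation. The function name and the example values are hypothetical.

```python
import math


def warmup_cosine_lr(base_lr, steps_per_epoch, warmup_epochs, max_epoch, t_max, eta_min=0.0):
    """Per-step learning rates: linear warmup, then cosine annealing by epoch."""
    warmup_steps = warmup_epochs * steps_per_epoch
    total_steps = max_epoch * steps_per_epoch
    lrs = []
    for step in range(total_steps):
        if step < warmup_steps:
            # Ramp linearly from base_lr / warmup_steps up to base_lr.
            lr = base_lr * (step + 1) / warmup_steps
        else:
            epoch = step // steps_per_epoch
            lr = eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max)) / 2
        lrs.append(lr)
    return lrs


schedule = warmup_cosine_lr(base_lr=0.01, steps_per_epoch=4, warmup_epochs=1,
                            max_epoch=5, t_max=5)
```

The result is one learning rate per training step, which is the same shape of schedule that the training functions above build from `steps_per_epoch` and `max_epoch`.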
|