add deeplabv3plus gpu

modified:   README_CN.md
zhangxiaoxiao 2021-08-23 15:26:47 +08:00
parent f960f0671f
commit aba56674c6
22 changed files with 711 additions and 49 deletions

View File

@ -17,10 +17,12 @@
- [Training Process](#训练过程)
- [Usage](#用法)
- [Running on Ascend](#ascend处理器环境运行)
- [Running on GPU](#gpu处理器环境运行)
- [Results](#结果)
- [Evaluation Process](#评估过程)
- [Usage](#用法-1)
- [Running on Ascend](#ascend处理器环境运行-1)
- [Running on GPU](#gpu处理器环境运行-1)
- [Results](#结果-1)
- [Training Accuracy](#训练准确率)
- [Model Description](#模型描述)
@ -173,6 +175,66 @@ run_eval_s8_multiscale.sh
run_eval_s8_multiscale_flip.sh
```
- Running on GPU
Follow the training steps below for 8-card distributed training.
1. Train s16 on the VOCaug dataset, fine-tuning the pretrained ResNet-101 model. The script is as follows:
```bash
bash run_distribute_train_s16_r1_gpu.sh /PATH/TO/MINDRECORD_NAME /PATH/TO/PRETRAIN_MODEL
```
2. Train s8 on the VOCaug dataset, fine-tuning the model from the previous step. The script is as follows:
```bash
bash run_distribute_train_s8_r1_gpu.sh /PATH/TO/MINDRECORD_NAME /PATH/TO/PRETRAIN_MODEL
```
3. Train s8 on the VOCtrain dataset, fine-tuning the model from the previous step. The script is as follows:
```bash
bash run_distribute_train_s8_r2_gpu.sh /PATH/TO/MINDRECORD_NAME /PATH/TO/PRETRAIN_MODEL
```
The evaluation steps are as follows:
1. Evaluate s16 on the VOC val dataset. The evaluation script is as follows:
```bash
bash run_eval_s16_gpu.sh /PATH/TO/DATA /PATH/TO/DATA_lst.txt /PATH/TO/PRETRAIN_MODEL DEVICE_ID
```
2. Evaluate multiscale s16 on the VOC val dataset. The evaluation script is as follows:
```bash
bash run_eval_s16_multiscale_gpu.sh /PATH/TO/DATA /PATH/TO/DATA_lst.txt /PATH/TO/PRETRAIN_MODEL DEVICE_ID
```
3. Evaluate multiscale s16 with flipping on the VOC val dataset. The evaluation script is as follows:
```bash
bash run_eval_s16_multiscale_flip_gpu.sh /PATH/TO/DATA /PATH/TO/DATA_lst.txt /PATH/TO/PRETRAIN_MODEL DEVICE_ID
```
4. Evaluate s8 on the VOC val dataset. The evaluation script is as follows:
```bash
bash run_eval_s8_gpu.sh /PATH/TO/DATA /PATH/TO/DATA_lst.txt /PATH/TO/PRETRAIN_MODEL DEVICE_ID
```
5. Evaluate multiscale s8 on the VOC val dataset. The evaluation script is as follows:
```bash
bash run_eval_s8_multiscale_gpu.sh /PATH/TO/DATA /PATH/TO/DATA_lst.txt /PATH/TO/PRETRAIN_MODEL DEVICE_ID
```
6. Evaluate multiscale s8 with flipping on the VOC val dataset. The evaluation script is as follows:
```bash
bash run_eval_s8_multiscale_flip_gpu.sh /PATH/TO/DATA /PATH/TO/DATA_lst.txt /PATH/TO/PRETRAIN_MODEL DEVICE_ID
```
# Script Description
## Script and Sample Code
@ -192,6 +254,15 @@ run_eval_s8_multiscale_flip.sh
├── run_eval_s8.sh                            # Launch Ascend evaluation with the s8 structure
├── run_eval_s8_multiscale.sh                 # Launch Ascend evaluation with the multiscale s8 structure
├── run_eval_s8_multiscale_flip.sh            # Launch Ascend evaluation with the multiscale and flipped s8 structure
├── run_distribute_train_s16_r1_gpu.sh        # Launch 8-card GPU distributed training on the VOCaug dataset with the s16 structure
├── run_distribute_train_s8_r1_gpu.sh         # Launch 8-card GPU distributed training on the VOCaug dataset with the s8 structure
├── run_distribute_train_s8_r2_gpu.sh         # Launch 8-card GPU distributed training on the VOCtrain dataset with the s8 structure
├── run_eval_s16_gpu.sh                       # Launch GPU evaluation with the s16 structure
├── run_eval_s16_multiscale_gpu.sh            # Launch GPU evaluation with the multiscale s16 structure
├── run_eval_s16_multiscale_flip_gpu.sh       # Launch GPU evaluation with the multiscale and flipped s16 structure
├── run_eval_s8_gpu.sh                        # Launch GPU evaluation with the s8 structure
├── run_eval_s8_multiscale_gpu.sh             # Launch GPU evaluation with the multiscale s8 structure
├── run_eval_s8_multiscale_flip_gpu.sh        # Launch GPU evaluation with the multiscale and flipped s8 structure
├── src
├── tools
├── get_dataset_list.py                   # Get the dataset list file
@ -274,7 +345,7 @@ do
echo 'start rank='$i', device id='$DEVICE_ID'...'
mkdir ${train_path}/device$DEVICE_ID
cd ${train_path}/device$DEVICE_ID
ython ${train_code_path}/train.py --train_dir=${train_path}/ckpt \
python ${train_code_path}/train.py --train_dir=${train_path}/ckpt \
--data_file=/PATH/TO/MINDRECORD_NAME \
--train_epochs=300 \
--batch_size=32 \
@ -374,6 +445,10 @@ python train.py --train_url=/PATH/TO/OUTPUT_DIR \
--save_steps=410 \
```
#### Running on GPU
For the detailed parameter configuration, refer to the 8-card training scripts in [Quick Start](#快速入门), summarized in the sketch below.
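As a quick reference, the core of the 8-card GPU launch performed by `run_distribute_train_s16_r1_gpu.sh` reduces to roughly the following sketch. Paths are placeholders, and several fixed arguments (e.g. `crop_size`, `lr_type`, `min_scale`, `max_scale`) are omitted here; see the full script for all options.

```bash
# Condensed sketch of the s16 8-card GPU training launch (run_distribute_train_s16_r1_gpu.sh);
# the /PATH/TO/... values are placeholders.
export RANK_SIZE=8
mpirun -allow-run-as-root -n $RANK_SIZE --output-filename log_output --merge-stderr-to-stdout \
    python ./train.py --train_dir=/PATH/TO/EXPERIMENTS_DIR/ckpt \
                      --data_file=/PATH/TO/MINDRECORD_NAME \
                      --train_epochs=300 \
                      --batch_size=16 \
                      --base_lr=0.04 \
                      --model=DeepLabV3plus_s16 \
                      --ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
                      --is_distributed \
                      --device_target="GPU" >log 2>&1 &
```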
### Results
#### Running on Ascend
@ -483,6 +558,10 @@ python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA \
--ckpt_path=/PATH/TO/PRETRAIN_MODEL >${eval_path}/eval_log 2>&1 &
```
#### Running on GPU
For the detailed parameter configuration, refer to the evaluation scripts in [Quick Start](#快速入门), summarized in the sketch below.
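For reference, the single-card GPU evaluation launched by `run_eval_s16_gpu.sh` reduces to roughly the following sketch. Paths are placeholders; `ignore_label`, `num_classes`, and the working-directory setup done by the script are omitted here.

```bash
# Condensed sketch of the s16 GPU evaluation launch (run_eval_s16_gpu.sh);
# the /PATH/TO/... values are placeholders.
export CUDA_VISIBLE_DEVICES=0
python ./eval.py --data_root=/PATH/TO/DATA \
                 --data_lst=/PATH/TO/DATA_lst.txt \
                 --batch_size=32 \
                 --crop_size=513 \
                 --model=DeepLabV3plus_s16 \
                 --scales=1.0 \
                 --freeze_bn \
                 --ckpt_path=/PATH/TO/PRETRAIN_MODEL \
                 --device_target="GPU" \
                 --device_id=0 > eval_log 2>&1 &
```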
### Results
Run the applicable training scripts to obtain the results. To get the same results, follow the steps in Quick Start.
@ -506,21 +585,21 @@ python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA \
### Evaluation Performance
| Parameters | Ascend 910 |
| -------------------------- | -------------------------------------- |
| Model version | DeepLabV3+ |
| Resource | Ascend 910 |
| Upload date | 2021-03-16 |
| MindSpore version | 1.1.1 |
| Dataset | PASCAL VOC2012 + SBD |
| Training parameters | epoch = 300, batch_size = 32 (s16_r1); epoch = 800, batch_size = 16 (s8_r1); epoch = 300, batch_size = 16 (s8_r2) |
| Optimizer | Momentum |
| Loss function | Softmax cross-entropy |
| Output | Probability |
| Loss | 0.0041095633 |
| Performance | 187736.386 ms (1-card, s16)<br> 44474.187 ms (8-card, s16) |
| Fine-tuned checkpoint | 453M .ckpt file |
| Script | [Link](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv/deeplabv3plus) |
| Parameters | Ascend 910 | GPU |
| -------------------------- | -------------------------------------- | -------------------------------------- |
| Model version | DeepLabV3+ | DeepLabV3+ |
| Resource | Ascend 910 | NV SMX2 V100-32G |
| Upload date | 2021-03-16 | 2021-08-23 |
| MindSpore version | 1.1.1 | 1.4.0 |
| Dataset | PASCAL VOC2012 + SBD | PASCAL VOC2012 + SBD |
| Training parameters | epoch = 300, batch_size = 32 (s16_r1); epoch = 800, batch_size = 16 (s8_r1); epoch = 300, batch_size = 16 (s8_r2) | epoch = 300, batch_size = 16 (s16_r1); epoch = 800, batch_size = 8 (s8_r1); epoch = 300, batch_size = 8 (s8_r2) |
| Optimizer | Momentum | Momentum |
| Loss function | Softmax cross-entropy | Softmax cross-entropy |
| Output | Probability | Probability |
| Loss | 0.0041095633 | 0.003395824 |
| Performance | 187736.386 ms (1-card, s16)<br> 44474.187 ms (8-card, s16) | 1080 ms/step (1-card, s16) |
| Fine-tuned checkpoint | 453M .ckpt file | 454M .ckpt file |
| Script | [Link](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv/deeplabv3plus) | [Link](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv/deeplabv3plus) |
# Description of Random Situation

View File

@ -24,9 +24,6 @@ from mindspore import context
from mindspore.train.serialization import load_checkpoint, load_param_into_net
from src.deeplab_v3plus import DeepLabV3Plus
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend", save_graphs=False,
device_id=int(os.getenv('DEVICE_ID')))
def parse_args():
"""parse_args"""
@ -44,6 +41,11 @@ def parse_args():
parser.add_argument('--ignore_label', type=int, default=255, help='ignore label')
parser.add_argument('--num_classes', type=int, default=21, help='number of classes')
# device info
parser.add_argument('--device_target', type=str, default='Ascend', choices=['Ascend', 'GPU', 'CPU'],
help='device where the code will be implemented. (Default: Ascend)')
parser.add_argument('--device_id', type=int, default=0, help='device id')
# model
parser.add_argument('--model', type=str, default='', help='select model')
parser.add_argument('--freeze_bn', action='store_true', default=False, help='freeze bn')
@ -154,7 +156,8 @@ def eval_batch_scales(args, eval_net, img_lst, scales,
def net_eval():
"""net_eval"""
args = parse_args()
context.set_context(mode=context.GRAPH_MODE, device_target=args.device_target, save_graphs=False,
device_id=args.device_id)
# data list
with open(args.data_lst) as f:
img_lst = f.readlines()

View File

@ -14,7 +14,7 @@
# limitations under the License.
# ============================================================================
export DEVICE_ID=5
DEVICE_ID=5
export SLOG_PRINT_TO_STDOUT=0
train_path=/PATH/TO/EXPERIMENTS_DIR
train_code_path=/PATH/TO/MODEL_ZOO_CODE
@ -41,4 +41,5 @@ python ${train_code_path}/train.py --data_file=/PATH/TO/MINDRECORD_NAME \
--model=DeepLabV3plus_s16 \
--ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
--save_steps=1500 \
--keep_checkpoint_max=200 >log 2>&1 &
--keep_checkpoint_max=200 \
--device_id=$DEVICE_ID >log 2>&1 &

View File

@ -31,7 +31,7 @@ mkdir ${train_path}/ckpt
for((i=0;i<=$RANK_SIZE-1;i++));
do
export RANK_ID=${i}
export DEVICE_ID=$((i + RANK_START_ID))
DEVICE_ID=$((i + RANK_START_ID))
echo 'start rank='${i}', device id='${DEVICE_ID}'...'
mkdir ${train_path}/device${DEVICE_ID}
cd ${train_path}/device${DEVICE_ID} || exit
@ -50,5 +50,6 @@ do
--ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
--is_distributed \
--save_steps=410 \
--keep_checkpoint_max=200 >log 2>&1 &
--keep_checkpoint_max=200 \
--device_id=$DEVICE_ID >log 2>&1 &
done

View File

@ -0,0 +1,66 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 2 ]
then
echo "=============================================================================================================="
echo "Please run the script as: "
echo "bash run_distribute_train_s16_r1_gpu.sh /PATH/TO/MINDRECORD_NAME /PATH/TO/PRETRAIN_MODEL"
echo "for example:"
echo "bash run_distribute_train_s16_r1_gpu.sh \
voc2012/mindrecord_train/vocaug_mindrecord0 resnet101_ascend_v120_imagenet2012_official_cv_bs32_acc78.ckpt"
echo "It is better to use absolute path."
echo "=============================================================================================================="
exit 1
fi
DATA_FILE=$1
CKPT_PRE_TRAINED=$2
ulimit -c unlimited
export SLOG_PRINT_TO_STDOUT=0
export RANK_SIZE=8
export GLOG_v=2
train_path=s16_train
if [ -d ${train_path} ]; then
rm -rf ${train_path}
fi
mkdir -p ${train_path}
mkdir ${train_path}/ckpt
cp ../*.py ${train_path}
cp -r ../src ${train_path}
cd ${train_path} || exit
mpirun -allow-run-as-root -n $RANK_SIZE --output-filename log_output --merge-stderr-to-stdout \
python ./train.py --train_dir=${train_path}/ckpt \
--data_file=$DATA_FILE \
--train_epochs=300 \
--batch_size=16 \
--crop_size=513 \
--base_lr=0.04 \
--lr_type=cos \
--min_scale=0.5 \
--max_scale=2.0 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s16 \
--ckpt_pre_trained=$CKPT_PRE_TRAINED \
--is_distributed \
--save_steps=410 \
--keep_checkpoint_max=200 \
--device_target="GPU" >log 2>&1 &

View File

@ -31,7 +31,7 @@ mkdir ${train_path}/ckpt
for((i=0;i<=$RANK_SIZE-1;i++));
do
export RANK_ID=${i}
export DEVICE_ID=$((i + RANK_START_ID))
DEVICE_ID=$((i + RANK_START_ID))
echo 'start rank='${i}', device id='${DEVICE_ID}'...'
mkdir ${train_path}/device${DEVICE_ID}
cd ${train_path}/device${DEVICE_ID} || exit
@ -51,5 +51,6 @@ do
--ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
--is_distributed \
--save_steps=820 \
--keep_checkpoint_max=200 >log 2>&1 &
--keep_checkpoint_max=200 \
--device_id=$DEVICE_ID >log 2>&1 &
done

View File

@ -0,0 +1,67 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 2 ]
then
echo "=============================================================================================================="
echo "Please run the script as: "
echo "bash run_distribute_train_s8_r1_gpu.sh /PATH/TO/MINDRECORD_NAME /PATH/TO/PRETRAIN_MODEL"
echo "for example:"
echo "bash run_distribute_train_s8_r1_gpu.sh \
voc2012/mindrecord_train/vocaug_mindrecord0 ckpt/DeepLabV3plus_s16-300_82.ckpt"
echo "It is better to use absolute path."
echo "=============================================================================================================="
exit 1
fi
DATA_FILE=$1
CKPT_PRE_TRAINED=$2
ulimit -c unlimited
export SLOG_PRINT_TO_STDOUT=0
export RANK_SIZE=8
export GLOG_v=2
train_path=s8_r1_train
if [ -d ${train_path} ]; then
rm -rf ${train_path}
fi
mkdir -p ${train_path}
mkdir ${train_path}/ckpt
cp ../*.py ${train_path}
cp -r ../src ${train_path}
cd ${train_path} || exit
mpirun -allow-run-as-root -n $RANK_SIZE --output-filename log_output --merge-stderr-to-stdout \
python ./train.py --train_dir=${train_path}/ckpt \
--data_file=$DATA_FILE \
--train_epochs=800 \
--batch_size=8 \
--crop_size=513 \
--base_lr=0.01 \
--lr_type=cos \
--min_scale=0.5 \
--max_scale=2.0 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s8 \
--loss_scale=2048 \
--ckpt_pre_trained=$CKPT_PRE_TRAINED \
--is_distributed \
--save_steps=820 \
--keep_checkpoint_max=200 \
--device_target="GPU" >log 2>&1 &

View File

@ -31,7 +31,7 @@ mkdir ${train_path}/ckpt
for((i=0;i<=$RANK_SIZE-1;i++));
do
export RANK_ID=${i}
export DEVICE_ID=$((i + RANK_START_ID))
DEVICE_ID=$((i + RANK_START_ID))
echo 'start rank='${i}', device id='${DEVICE_ID}'...'
mkdir ${train_path}/device${DEVICE_ID}
cd ${train_path}/device${DEVICE_ID} || exit
@ -51,5 +51,6 @@ do
--ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
--is_distributed \
--save_steps=110 \
--keep_checkpoint_max=200 >log 2>&1 &
--keep_checkpoint_max=200 \
--device_id=$DEVICE_ID >log 2>&1 &
done

View File

@ -0,0 +1,70 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 2 ]
then
echo "=============================================================================================================="
echo "Please run the script as: "
echo "bash run_distribute_train_s8_r2_gpu.sh /PATH/TO/MINDRECORD_NAME /PATH/TO/PRETRAIN_MODEL"
echo "for example:"
echo "bash run_distribute_train_s8_r2_gpu.sh \
voc2012/mindrecord_voctrain/mindrecord_voctrain0 ckpt/DeepLabV3plus_s8-800_165.ckpt"
echo "It is better to use absolute path."
echo "=============================================================================================================="
exit 1
fi
DATA_FILE=$1
CKPT_PRE_TRAINED=$2
ulimit -c unlimited
export SLOG_PRINT_TO_STDOUT=0
export RANK_SIZE=8
export GLOG_v=2
train_path=s8_r2_train
if [ -d ${train_path} ]; then
rm -rf ${train_path}
fi
mkdir -p ${train_path}
mkdir ${train_path}/ckpt
cp ../*.py ${train_path}
cp -r ../src ${train_path}
cd ${train_path} || exit
mpirun -allow-run-as-root -n $RANK_SIZE --output-filename log_output --merge-stderr-to-stdout \
python ./train.py --train_dir=${train_path}/ckpt \
--data_file=$DATA_FILE \
--train_epochs=300 \
--batch_size=8 \
--crop_size=513 \
--base_lr=0.004 \
--lr_type=cos \
--min_scale=0.5 \
--max_scale=2.0 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s8 \
--loss_scale=2048 \
--ckpt_pre_trained=$CKPT_PRE_TRAINED \
--is_distributed \
--save_steps=110 \
--keep_checkpoint_max=200 \
--device_target="GPU" >log 2>&1 &

View File

@ -14,7 +14,7 @@
# limitations under the License.
# ============================================================================
export DEVICE_ID=3
DEVICE_ID=3
export SLOG_PRINT_TO_STDOUT=0
train_code_path=/PATH/TO/MODEL_ZOO_CODE
eval_path=/PATH/TO/EVAL
@ -33,5 +33,6 @@ python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA \
--model=DeepLabV3plus_s16 \
--scales=1.0 \
--freeze_bn \
--ckpt_path=/PATH/TO/PRETRAIN_MODEL >${eval_path}/eval_log 2>&1 &
--ckpt_path=/PATH/TO/PRETRAIN_MODEL \
--device_id=$DEVICE_ID >${eval_path}/eval_log 2>&1 &

View File

@ -0,0 +1,56 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 4 ]
then
echo "=============================================================================================================="
echo "Please run the script as: "
echo "bash run_eval_s16_gpu.sh /PATH/TO/DATA /PATH/TO/DATA_lst.txt /PATH/TO/PRETRAIN_MODEL DEVICE_ID"
echo "for example:"
echo "bash run_eval_s16_gpu.sh \
voc2012/VOCdevkit/VOC2012 voc2012/voc_val_lst.txt ckpt/DeepLabV3plus_s16-300_82.ckpt 0"
echo "It is better to use absolute path."
echo "=============================================================================================================="
exit 1
fi
DATA_ROOT=$1
DATA_LST=$2
CKPT_PATH=$3
export CUDA_VISIBLE_DEVICES=$4
export SLOG_PRINT_TO_STDOUT=0
eval_path=s16_eval
if [ -d ${eval_path} ]; then
rm -rf ${eval_path}
fi
mkdir -p ${eval_path}
cp ../*.py ${eval_path}
cp -r ../src ${eval_path}
cd ${eval_path} || exit
python ./eval.py --data_root=$DATA_ROOT \
--data_lst=$DATA_LST \
--batch_size=32 \
--crop_size=513 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s16 \
--scales=1.0 \
--freeze_bn \
--ckpt_path=$CKPT_PATH \
--device_target="GPU" \
--device_id=0 > eval_log 2>&1 &

View File

@ -14,7 +14,7 @@
# limitations under the License.
# ============================================================================
export DEVICE_ID=3
DEVICE_ID=3
export SLOG_PRINT_TO_STDOUT=0
train_code_path=/PATH/TO/MODEL_ZOO_CODE
eval_path=/PATH/TO/EVAL
@ -37,4 +37,5 @@ python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA \
--scales=1.25 \
--scales=1.75 \
--freeze_bn \
--ckpt_path=/PATH/TO/PRETRAIN_MODEL >${eval_path}/eval_log 2>&1 &
--ckpt_path=/PATH/TO/PRETRAIN_MODEL \
--device_id=$DEVICE_ID >${eval_path}/eval_log 2>&1 &

View File

@ -14,7 +14,7 @@
# limitations under the License.
# ============================================================================
export DEVICE_ID=3
DEVICE_ID=3
export SLOG_PRINT_TO_STDOUT=0
train_code_path=/PATH/TO/MODEL_ZOO_CODE
eval_path=/PATH/TO/EVAL
@ -38,5 +38,6 @@ python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA \
--scales=1.75 \
--flip \
--freeze_bn \
--ckpt_path=/PATH/TO/PRETRAIN_MODEL >${eval_path}/eval_log 2>&1 &
--ckpt_path=/PATH/TO/PRETRAIN_MODEL \
--device_id=$DEVICE_ID >${eval_path}/eval_log 2>&1 &

View File

@ -0,0 +1,63 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 4 ]
then
echo "=============================================================================================================="
echo "Please run the script as: "
echo "bash run_eval_s16_multiscale_flip_gpu.sh /PATH/TO/DATA /PATH/TO/DATA_lst.txt /PATH/TO/PRETRAIN_MODEL DEVICE_ID"
echo "for example:"
echo "bash run_eval_s16_multiscale_flip_gpu.sh \
voc2012/VOCdevkit/VOC2012 voc2012/voc_val_lst.txt ckpt/DeepLabV3plus_s16-300_82.ckpt 0"
echo "It is better to use absolute path."
echo "=============================================================================================================="
exit 1
fi
DATA_ROOT=$1
DATA_LST=$2
CKPT_PATH=$3
export CUDA_VISIBLE_DEVICES=$4
export SLOG_PRINT_TO_STDOUT=0
eval_path=s16_multiscale_flip_eval
if [ -d ${eval_path} ]; then
rm -rf ${eval_path}
fi
mkdir -p ${eval_path}
cp ../*.py ${eval_path}
cp -r ../src ${eval_path}
cd ${eval_path} || exit
python ./eval.py --data_root=$DATA_ROOT \
--data_lst=$DATA_LST \
--batch_size=16 \
--crop_size=513 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s16 \
--scales=0.5 \
--scales=0.75 \
--scales=1.0 \
--scales=1.25 \
--scales=1.75 \
--flip \
--freeze_bn \
--ckpt_path=$CKPT_PATH \
--device_target="GPU" \
--device_id=0 > eval_log 2>&1 &

View File

@ -0,0 +1,59 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 4 ]
then
echo "=============================================================================================================="
echo "Please run the script as: "
echo "bash run_eval_s16_multiscale_gpu.sh /PATH/TO/DATA /PATH/TO/DATA_lst.txt /PATH/TO/PRETRAIN_MODEL DEVICE_ID"
echo "for example:"
echo "bash run_eval_s16_multiscale_gpu.sh \
voc2012/VOCdevkit/VOC2012 voc2012/voc_val_lst.txt ckpt/DeepLabV3plus_s16-300_82.ckpt 0"
echo "It is better to use absolute path."
echo "=============================================================================================================="
exit 1
fi
DATA_ROOT=$1
DATA_LST=$2
CKPT_PATH=$3
export CUDA_VISIBLE_DEVICES=$4
export SLOG_PRINT_TO_STDOUT=0
eval_path=s16_multiscale_eval
if [ -d ${eval_path} ]; then
rm -rf ${eval_path}
fi
mkdir -p ${eval_path}
cp ../*.py ${eval_path}
cp -r ../src ${eval_path}
cd ${eval_path} || exit
python ./eval.py --data_root=$DATA_ROOT \
--data_lst=$DATA_LST \
--batch_size=16 \
--crop_size=513 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s16 \
--scales=0.5 \
--scales=0.75 \
--scales=1.0 \
--scales=1.25 \
--scales=1.75 \
--freeze_bn \
--ckpt_path=$CKPT_PATH \
--device_target="GPU" \
--device_id=0 > eval_log 2>&1 &

View File

@ -14,7 +14,7 @@
# limitations under the License.
# ============================================================================
export DEVICE_ID=3
DEVICE_ID=3
export SLOG_PRINT_TO_STDOUT=0
train_code_path=/PATH/TO/MODEL_ZOO_CODE
eval_path=/PATH/TO/EVAL
@ -33,5 +33,6 @@ python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA \
--model=DeepLabV3plus_s8 \
--scales=1.0 \
--freeze_bn \
--ckpt_path=/PATH/TO/PRETRAIN_MODEL >${eval_path}/eval_log 2>&1 &
--ckpt_path=/PATH/TO/PRETRAIN_MODEL \
--device_id=$DEVICE_ID >${eval_path}/eval_log 2>&1 &

View File

@ -0,0 +1,59 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 4 ]
then
echo "=============================================================================================================="
echo "Please run the script as: "
echo "bash run_eval_s8_gpu.sh /PATH/TO/DATA /PATH/TO/DATA_lst.txt /PATH/TO/PRETRAIN_MODEL DEVICE_ID"
echo "for example:"
echo "bash run_eval_s8_gpu.sh \
voc2012/VOCdevkit/VOC2012 voc2012/voc_val_lst.txt ckpt/DeepLabV3plus_s16-300_82.ckpt 0"
echo "It is better to use absolute path."
echo "=============================================================================================================="
exit 1
fi
DATA_ROOT=$1
DATA_LST=$2
CKPT_PATH=$3
export CUDA_VISIBLE_DEVICES=$4
export SLOG_PRINT_TO_STDOUT=0
eval_path=s8_eval
if [ -d ${eval_path} ]; then
rm -rf ${eval_path}
fi
mkdir -p ${eval_path}
cp ../*.py ${eval_path}
cp -r ../src ${eval_path}
cd ${eval_path} || exit
python ./eval.py --data_root=$DATA_ROOT \
--data_lst=$DATA_LST \
--batch_size=16 \
--crop_size=513 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s8 \
--scales=1.0 \
--freeze_bn \
--ckpt_path=$CKPT_PATH \
--device_target="GPU" \
--device_id=0 > eval_log 2>&1 &

View File

@ -14,7 +14,7 @@
# limitations under the License.
# ============================================================================
export DEVICE_ID=3
DEVICE_ID=3
export SLOG_PRINT_TO_STDOUT=0
train_code_path=/PATH/TO/MODEL_ZOO_CODE
eval_path=/PATH/TO/EVAL
@ -37,5 +37,6 @@ python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA \
--scales=1.25 \
--scales=1.75 \
--freeze_bn \
--ckpt_path=/PATH/TO/PRETRAIN_MODEL >${eval_path}/eval_log 2>&1 &
--ckpt_path=/PATH/TO/PRETRAIN_MODEL \
--device_id=$DEVICE_ID >${eval_path}/eval_log 2>&1 &

View File

@ -14,7 +14,7 @@
# limitations under the License.
# ============================================================================
export DEVICE_ID=3
DEVICE_ID=3
export SLOG_PRINT_TO_STDOUT=0
train_code_path=/PATH/TO/MODEL_ZOO_CODE
eval_path=/PATH/TO/EVAL
@ -38,5 +38,6 @@ python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA \
--scales=1.75 \
--flip \
--freeze_bn \
--ckpt_path=/PATH/TO/PRETRAIN_MODEL >${eval_path}/eval_log 2>&1 &
--ckpt_path=/PATH/TO/PRETRAIN_MODEL \
--device_id=$DEVICE_ID >${eval_path}/eval_log 2>&1 &

View File

@ -0,0 +1,63 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 4 ]
then
echo "=============================================================================================================="
echo "Please run the script as: "
echo "bash run_eval_s8_multiscale_flip_gpu.sh /PATH/TO/DATA /PATH/TO/DATA_lst.txt /PATH/TO/PRETRAIN_MODEL DEVICE_ID"
echo "for example:"
echo "bash run_eval_s8_multiscale_flip_gpu.sh \
voc2012/VOCdevkit/VOC2012 voc2012/voc_val_lst.txt ckpt/DeepLabV3plus_s16-300_82.ckpt 0"
echo "It is better to use absolute path."
echo "=============================================================================================================="
exit 1
fi
DATA_ROOT=$1
DATA_LST=$2
CKPT_PATH=$3
export CUDA_VISIBLE_DEVICES=$4
export SLOG_PRINT_TO_STDOUT=0
eval_path=s8_multiscale_flip_eval
if [ -d ${eval_path} ]; then
rm -rf ${eval_path}
fi
mkdir -p ${eval_path}
cp ../*.py ${eval_path}
cp -r ../src ${eval_path}
cd ${eval_path} || exit
python ./eval.py --data_root=$DATA_ROOT \
--data_lst=$DATA_LST \
--batch_size=16 \
--crop_size=513 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s8 \
--scales=0.5 \
--scales=0.75 \
--scales=1.0 \
--scales=1.25 \
--scales=1.75 \
--flip \
--freeze_bn \
--ckpt_path=$CKPT_PATH \
--device_target="GPU" \
--device_id=0 > eval_log 2>&1 &

View File

@ -0,0 +1,59 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 4 ]
then
echo "=============================================================================================================="
echo "Please run the script as: "
echo "bash run_eval_s8_multiscale_gpu.sh /PATH/TO/DATA /PATH/TO/DATA_lst.txt /PATH/TO/PRETRAIN_MODEL DEVICE_ID"
echo "for example:"
echo "bash run_eval_s8_multiscale_gpu.sh \
voc2012/VOCdevkit/VOC2012 voc2012/voc_val_lst.txt ckpt/DeepLabV3plus_s16-300_82.ckpt 0"
echo "It is better to use absolute path."
echo "=============================================================================================================="
exit 1
fi
DATA_ROOT=$1
DATA_LST=$2
CKPT_PATH=$3
export CUDA_VISIBLE_DEVICES=$4
export SLOG_PRINT_TO_STDOUT=0
eval_path=s8_multiscale_eval
if [ -d ${eval_path} ]; then
rm -rf ${eval_path}
fi
mkdir -p ${eval_path}
cp ../*.py ${eval_path}
cp -r ../src ${eval_path}
cd ${eval_path} || exit
python ./eval.py --data_root=$DATA_ROOT \
--data_lst=$DATA_LST \
--batch_size=16 \
--crop_size=513 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s8 \
--scales=0.5 \
--scales=0.75 \
--scales=1.0 \
--scales=1.25 \
--scales=1.75 \
--freeze_bn \
--ckpt_path=$CKPT_PATH \
--device_target="GPU" \
--device_id=0 > eval_log 2>&1 &

View File

@ -76,8 +76,9 @@ def parse_args():
parser.add_argument('--ckpt_pre_trained', type=str, default='', help='PreTrained model')
# train
parser.add_argument('--device_target', type=str, default='Ascend', choices=['Ascend', 'CPU'],
parser.add_argument('--device_target', type=str, default='Ascend', choices=['Ascend', 'GPU', 'CPU'],
help='device where the code will be implemented. (Default: Ascend)')
parser.add_argument('--device_id', type=int, default=0, help='device id')
parser.add_argument('--is_distributed', action='store_true', help='distributed training')
parser.add_argument('--rank', type=int, default=0, help='local rank of distributed')
parser.add_argument('--group_size', type=int, default=1, help='world size of distributed')
@ -99,17 +100,16 @@ def parse_args():
def train():
"""train"""
args = parse_args()
if args.device_target == "CPU":
context.set_context(mode=context.GRAPH_MODE, save_graphs=False, device_target="CPU")
else:
context.set_context(mode=context.GRAPH_MODE, enable_auto_mixed_precision=True, save_graphs=False,
device_target="Ascend", device_id=int(os.getenv('DEVICE_ID')))
context.set_context(mode=context.GRAPH_MODE, save_graphs=False, device_target=args.device_target)
if args.device_target != "CPU":
context.set_context(enable_auto_mixed_precision=True, device_id=args.device_id)
# init multicards training
if args.modelArts_mode:
import moxing as mox
local_data_url = '/cache/data'
local_train_url = '/cache/ckpt'
device_id = int(os.getenv('DEVICE_ID'))
device_id = args.device_id
device_num = int(os.getenv('RANK_SIZE'))
if device_num > 1:
init()
@ -169,7 +169,15 @@ def train():
# load pretrained model
if args.ckpt_pre_trained or args.pretrainedmodel_filename:
param_dict = load_checkpoint(ckpt_file)
load_param_into_net(train_net, param_dict)
if args.model == 'DeepLabV3plus_s16':
trans_param_dict = {}
for key, val in param_dict.items():
key = key.replace("down_sample_layer", "downsample")
trans_param_dict[f"network.resnet.{key}"] = val
load_param_into_net(train_net, trans_param_dict)
else:
load_param_into_net(train_net, param_dict)
print('load_model {} success'.format(args.ckpt_pre_trained))
# optimizer
iters_per_epoch = dataset.get_dataset_size()
@ -188,7 +196,7 @@ def train():
# loss scale
manager_loss_scale = FixedLossScaleManager(args.loss_scale, drop_overflow_update=False)
amp_level = "O0" if args.device_target == "CPU" else "O3"
amp_level = "O0" if args.device_target != "Ascend" else "O3"
model = Model(train_net, optimizer=opt, amp_level=amp_level, loss_scale_manager=manager_loss_scale)
# callback for saving ckpts