!16420 Adding CPU mode for FaceQualityAssessment model

From: @huangbo77
Reviewed-by: 
Signed-off-by:
mindspore-ci-bot 2021-06-01 20:20:47 +08:00 committed by Gitee
commit ac7178b8b9
14 changed files with 498 additions and 65 deletions


@ -13,7 +13,7 @@
# [Face Quality Assessment Description](#contents)
This is a Face Quality Assessment network based on Resnet12, with support for training and evaluation on Ascend910.
This is a Face Quality Assessment network based on Resnet12, with support for training and evaluation on Ascend 910, GPU, and CPU.
ResNet (residual neural network) was proposed by Kaiming He and his colleagues at Microsoft Research. Using residual units, they successfully trained a 152-layer network and won first place in ILSVRC 2015, with a top-5 error rate of 3.57% and fewer parameters than VGGNet. Traditional convolutional or fully connected networks lose some information at every layer and are prone to vanishing or exploding gradients, which makes very deep networks hard to train. ResNet alleviates this by passing the input directly through to the output, so the information is preserved and the network only has to learn the residual between input and output, which simplifies the learning objective. This structure speeds up training considerably and improves model accuracy, and residual units have been widely adopted in other network architectures.
@ -67,8 +67,8 @@ We use about 122K face images as training dataset and 2K as evaluating dataset i
# [Environment Requirements](#contents)
- Hardware(Ascend)
- Prepare hardware environment with Ascend processor.
- Hardware(Ascend/GPU/CPU)
- Prepare hardware environment with Ascend/GPU/CPU processor.
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below:
@ -89,7 +89,14 @@ The entire code structure is as following:
├─ run_standalone_train.sh # launch standalone training(1p) in ascend
├─ run_distribute_train.sh # launch distributed training(8p) in ascend
├─ run_eval.sh # launch evaluating in ascend
└─ run_export.sh # launch exporting air model
├─ run_export.sh # launch exporting air model
├─ run_standalone_train_gpu.sh # launch standalone training(1p) in gpu
├─ run_distribute_train_gpu.sh # launch distributed training(8p) in gpu
├─ run_eval_gpu.sh # launch evaluating in gpu
├─ run_export_gpu.sh # launch exporting mindir model in gpu
├─ run_standalone_train_cpu.sh # launch standalone training(1p) in cpu
├─ run_eval_cpu.sh # launch evaluating in cpu
└─ run_export_cpu.sh # launch exporting mindir model in cpu
├─ src
├─ config.py # parameter configuration
├─ dataset.py # dataset loading and preprocessing for training
@ -109,18 +116,50 @@ The entire code structure is as following:
- Standalone mode
```bash
Ascend
cd ./scripts
sh run_standalone_train.sh [TRAIN_LABEL_FILE] [USE_DEVICE_ID]
```
```bash
GPU
cd ./scripts
sh run_standalone_train_gpu.sh [TRAIN_LABEL_FILE]
```
```bash
CPU
cd ./scripts
sh run_standalone_train_cpu.sh [TRAIN_LABEL_FILE]
```
or (fine-tune)
```bash
Ascend
cd ./scripts
sh run_standalone_train.sh [TRAIN_LABEL_FILE] [USE_DEVICE_ID] [PRETRAINED_BACKBONE]
```
for example:
```bash
GPU
cd ./scripts
sh run_standalone_train_gpu.sh [TRAIN_LABEL_FILE] [PRETRAINED_BACKBONE]
```
```bash
CPU
cd ./scripts
sh run_standalone_train_cpu.sh [TRAIN_LABEL_FILE] [PRETRAINED_BACKBONE]
```
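For example, on CPU (the label-file and checkpoint paths below are placeholders for your own files):
```bash
cd ./scripts
# train from scratch
sh run_standalone_train_cpu.sh /path/to/train_label.txt
# or fine-tune from an existing checkpoint
sh run_standalone_train_cpu.sh /path/to/train_label.txt /path/to/pretrained_backbone.ckpt
```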
for example, on Ascend:
```bash
cd ./scripts
@ -130,18 +169,36 @@ The entire code structure is as following:
- Distributed mode (recommended)
```bash
Ascend
cd ./scripts
sh run_distribute_train.sh [TRAIN_LABEL_FILE] [RANK_TABLE]
```
```bash
GPU
cd ./scripts
sh run_distribute_train_gpu.sh [DEVICE_NUM] [VISIBLE_DEVICES(0,1,2,3,4,5,6,7)] [TRAIN_LABEL_FILE]
```
or (fine-tune)
```bash
Ascend
cd ./scripts
sh run_distribute_train.sh [TRAIN_LABEL_FILE] [RANK_TABLE] [PRETRAINED_BACKBONE]
```
for example:
```bash
GPU
cd ./scripts
sh run_distribute_train_gpu.sh [DEVICE_NUM] [VISIBLE_DEVICES(0,1,2,3,4,5,6,7)] [TRAIN_LABEL_FILE] [PRETRAINED_BACKBONE]
```
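For example, on 8 GPUs (the label-file path below is a placeholder):
```bash
cd ./scripts
sh run_distribute_train_gpu.sh 8 0,1,2,3,4,5,6,7 /path/to/train_label.txt
```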
for example, on Ascend:
```bash
cd ./scripts
@ -167,11 +224,27 @@ epoch[39], iter[19110], loss:2.111101, 8791.05 imgs/sec
### Evaluation
```bash
Ascend
cd ./scripts
sh run_eval.sh [EVAL_DIR] [USE_DEVICE_ID] [PRETRAINED_BACKBONE]
```
for example:
```bash
GPU
cd ./scripts
sh run_eval_gpu.sh [EVAL_DIR] [PRETRAINED_BACKBONE]
```
```bash
CPU
cd ./scripts
sh run_eval_cpu.sh [EVAL_DIR] [PRETRAINED_BACKBONE]
```
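For example, on CPU (the evaluation directory and checkpoint path below are placeholders):
```bash
cd ./scripts
sh run_eval_cpu.sh /path/to/eval_dir /path/to/checkpoint.ckpt
```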
for example, on Ascend:
```bash
cd ./scripts
@ -192,45 +265,63 @@ MAE of elur:18.021210976971098
If you want to run inference with the network on Ascend 310, you should convert the model to the AIR format:
```bash
Ascend
cd ./scripts
sh run_export.sh [BATCH_SIZE] [USE_DEVICE_ID] [PRETRAINED_BACKBONE]
```
Or, if you would like to convert your model to a MINDIR file on GPU or CPU:
```bash
GPU
cd ./scripts
sh run_export_gpu.sh [PRETRAINED_BACKBONE] [BATCH_SIZE] [FILE_NAME](optional)
```
```bash
CPU
cd ./scripts
sh run_export_cpu.sh [PRETRAINED_BACKBONE] [BATCH_SIZE] [FILE_NAME](optional)
```
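For example, to export a MINDIR model on CPU with batch size 1 (the checkpoint path and output file name below are placeholders):
```bash
cd ./scripts
sh run_export_cpu.sh /path/to/checkpoint.ckpt 1 face_quality_assessment
```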
# [Model Description](#contents)
## [Performance](#contents)
### Training Performance
| Parameters | Face Quality Assessment |
| -------------------------- | ----------------------------------------------------------- |
| Model Version | V1 |
| Resource | Ascend 910; CPU 2.60GHz, 192cores; Memory 755G; OS Euler2.8 |
| uploaded Date | 09/30/2020 (month/day/year) |
| MindSpore Version | 1.0.0 |
| Dataset | 122K images |
| Training Parameters | epoch=40, batch_size=32, momentum=0.9, lr=0.02 |
| Optimizer | Momentum |
| Loss Function | MSELoss, Softmax Cross Entropy |
| outputs | probability and point |
| Speed | 1pc: 200-240 ms/step; 8pcs: 35-40 ms/step |
| Total time | 1pc: 2.5 hours; 8pcs: 0.5 hours |
| Checkpoint for Fine tuning | 16M (.ckpt file) |
| Parameters | Ascend | CPU |
| -------------------------- | ------------------------------------------------- | ------------------------------------------ |
| Model Version | V1 | V1 |
| Resource | Ascend 910; CPU 2.60GHz, 192cores; Memory 755G; OS Euler2.8 | Intel(R) Xeon(R) CPU E5-2690 v4 |
| Uploaded Date | 09/30/2020 (month/day/year) | 05/14/2021 (month/day/year) |
| MindSpore Version | 1.0.0 | 1.2.0 |
| Dataset | 122K images | 122K images |
| Training Parameters | epoch=40, batch_size=32, momentum=0.9, lr=0.02 | epoch=40, batch_size=32, momentum=0.9, lr=0.02 |
| Optimizer | Momentum | Momentum |
| Loss Function | MSELoss, Softmax Cross Entropy | MSELoss, Softmax Cross Entropy |
| Outputs | probability and point | probability and point |
| Speed | 1pc: 200-240 ms/step; 8pcs: 35-40 ms/step | 1pc: 6 s/step |
| Total time | 1pc: 2.5 hours; 8pcs: 0.5 hours | 1pc: 32 hours |
| Checkpoint for Fine tuning | 16M (.ckpt file) | 16M (.ckpt file) |
### Evaluation Performance
| Parameters | Face Quality Assessment |
| ------------------- | --------------------------- |
| Model Version | V1 |
| Resource | Ascend 910; OS Euler2.8 |
| Uploaded Date | 09/30/2020 (month/day/year) |
| MindSpore Version | 1.0.0 |
| Dataset | 2K images |
| batch_size | 1 |
| outputs | IPN, MAE |
| Accuracy(8pcs) | IPN of 5 keypoints:19.5 |
| | MAE of elur:18.02 |
| Model for inference | 16M (.ckpt file) |
| Parameters | Ascend | CPU |
| ------------------- | --------------------------- | --------------------------- |
| Model Version | V1 | V1 |
| Resource | Ascend 910; OS Euler2.8 | Intel(R) Xeon(R) CPU E5-2690 v4 |
| Uploaded Date | 09/30/2020 (month/day/year) | 05/14/2021 (month/day/year) |
| MindSpore Version | 1.0.0 | 1.2.0 |
| Dataset | 2K images | 2K images |
| batch_size | 256 | 256 |
| Outputs | IPN, MAE | IPN, MAE |
| Accuracy | 8pcs: IPN of 5 keypoints:19.5 | 1pc: IPN of 5 keypoints:20.09 |
| | 8pcs: MAE of elur:18.02 | 1pc: MAE of elur:18.23 |
| Model for inference | 16M (.ckpt file) | 16M (.ckpt file) |
# [ModelZoo Homepage](#contents)


@ -1,4 +1,4 @@
# Copyright 2020 Huawei Technologies Co., Ltd
# Copyright 2020-2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@ -29,8 +29,6 @@ from mindspore import context
from src.face_qa import FaceQABackbone
warnings.filterwarnings('ignore')
devid = int(os.getenv('DEVICE_ID'))
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend", save_graphs=False, device_id=devid)
def softmax(x):
@ -210,7 +208,14 @@ if __name__ == "__main__":
parser = argparse.ArgumentParser(description='Face Quality Assessment')
parser.add_argument('--eval_dir', type=str, default='', help='eval image dir, e.g. /home/test')
parser.add_argument('--pretrained', type=str, default='', help='pretrained model to load')
parser.add_argument('--device_target', type=str, choices=['Ascend', 'GPU', 'CPU'], default='Ascend',
help='device target')
arg = parser.parse_args()
context.set_context(mode=context.GRAPH_MODE, device_target=arg.device_target, save_graphs=False)
if arg.device_target == 'Ascend':
devid = int(os.getenv('DEVICE_ID'))
context.set_context(device_id=devid)
test_trains(arg)
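With this change, eval.py can also be launched directly rather than through the wrapper scripts; a minimal CPU invocation from the model's top-level directory might look like the following (the paths are placeholders):
```bash
python eval.py \
    --eval_dir=/path/to/eval_dir \
    --device_target=CPU \
    --pretrained=/path/to/checkpoint.ckpt
```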


@ -1,4 +1,4 @@
# Copyright 2020 Huawei Technologies Co., Ltd
# Copyright 2020-2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Convert ckpt to air."""
"""Convert ckpt to air/mindir."""
import os
import argparse
import numpy as np
@ -23,9 +23,6 @@ from mindspore.train.serialization import export, load_checkpoint, load_param_in
from src.face_qa import FaceQABackbone
devid = int(os.getenv('DEVICE_ID'))
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend", save_graphs=False, device_id=devid)
def main(args):
network = FaceQABackbone()
@ -48,17 +45,25 @@ def main(args):
input_data = np.random.uniform(low=0, high=1.0, size=(args.batch_size, 3, 96, 96)).astype(np.float32)
tensor_input_data = Tensor(input_data)
export(network, tensor_input_data, file_name=ckpt_path.replace('.ckpt', '_' + str(args.batch_size) + 'b.air'),
file_format='AIR')
export(network, tensor_input_data, file_name=args.file_name, file_format=args.file_format)
print('-----------------------export model success-----------------------')
if __name__ == "__main__":
parser = argparse.ArgumentParser(description='Convert ckpt to air')
parser = argparse.ArgumentParser(description='Convert ckpt to air/mindir')
parser.add_argument('--pretrained', type=str, default='', help='pretrained model to load')
parser.add_argument('--batch_size', type=int, default=8, help='batch size')
parser.add_argument('--device_target', type=str, choices=['Ascend', 'GPU', 'CPU'], default='Ascend',
help='device target')
parser.add_argument('--file_name', type=str, default='FaceQualityAssessment', help='output file name')
parser.add_argument('--file_format', type=str, choices=['AIR', 'ONNX', 'MINDIR'], default='AIR', help='file format')
arg = parser.parse_args()
context.set_context(mode=context.GRAPH_MODE, device_target=arg.device_target, save_graphs=False)
if arg.device_target == 'Ascend':
devid = int(os.getenv('DEVICE_ID'))
context.set_context(device_id=devid)
main(arg)
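Similarly, export.py can be called directly with the new flags; a CPU MINDIR export might look like this sketch (the checkpoint path and output name are placeholders):
```bash
python export.py \
    --pretrained=/path/to/checkpoint.ckpt \
    --device_target=CPU \
    --batch_size=1 \
    --file_format=MINDIR \
    --file_name=face_quality_assessment
```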


@ -0,0 +1,74 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# -lt 3 ]
then
echo "Usage: sh run_distribute_train_gpu.sh [DEVICE_NUM] [VISIBLE_DEVICES(0,1,2,3,4,5,6,7)] [TRAIN_LABEL_FILE]
[PRETRAINED_BACKBONE](optional)"
exit 1
fi
if [ $1 -lt 1 ] || [ $1 -gt 8 ]
then
echo "error: DEVICE_NUM=$1 is not in (1-8)"
exit 1
fi
export DEVICE_NUM=$1
export RANK_SIZE=$1
BASEPATH=$(cd "`dirname $0`" || exit; pwd)
export PYTHONPATH=${BASEPATH}:$PYTHONPATH
if [ -d "../train" ]
then
rm -rf ../train
fi
mkdir ../train
cd ../train || exit
export CUDA_VISIBLE_DEVICES="$2"
if [ $4 ] # pretrained ckpt
then
if [ $1 -gt 1 ]
then
mpirun -n $1 --allow-run-as-root python ${BASEPATH}/../train.py \
--train_label_file=$3 \
--is_distributed=1 \
--device_target='GPU' \
--pretrained=$4 > train.log 2>&1 &
else
python ${BASEPATH}/../train.py \
--train_label_file=$3 \
--is_distributed=0 \
--device_target='GPU' \
--pretrained=$4 > train.log 2>&1 &
fi
else
if [ $1 -gt 1 ]
then
mpirun -n $1 --allow-run-as-root python ${BASEPATH}/../train.py \
--train_label_file=$3 \
--is_distributed=1 \
--device_target='GPU' > train.log 2>&1 &
else
python ${BASEPATH}/../train.py \
--train_label_file=$3 \
--is_distributed=0 \
--device_target='GPU' > train.log 2>&1 &
fi
fi


@ -0,0 +1,31 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# -lt 2 ]
then
echo "Usage: sh run_eval_cpu.sh [EVALDATA_PATH] [PRETRAINED_BACKBONE]"
exit 1
fi
BASEPATH=$(cd "`dirname $0`" || exit; pwd)
export PYTHONPATH=${BASEPATH}:$PYTHONPATH
cd ..
python ${BASEPATH}/../eval.py \
--eval_dir=$1 \
--device_target='CPU' \
--pretrained=$2 > eval.log 2>&1 &


@ -0,0 +1,31 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# -lt 2 ]
then
echo "Usage: sh run_eval_gpu.sh [EVALDATA_PATH] [PRETRAINED_BACKBONE]"
exit 1
fi
BASEPATH=$(cd "`dirname $0`" || exit; pwd)
export PYTHONPATH=${BASEPATH}:$PYTHONPATH
cd ..
python ${BASEPATH}/../eval.py \
--eval_dir=$1 \
--device_target='GPU' \
--pretrained=$2 > eval.log 2>&1 &


@ -0,0 +1,42 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# -lt 2 ]
then
echo "Usage: sh run_export_cpu.sh [PRETRAINED_BACKBONE] [BATCH_SIZE] [FILE_NAME](optional)"
exit 1
fi
BASEPATH=$(cd "`dirname $0`" || exit; pwd)
export PYTHONPATH=${BASEPATH}:$PYTHONPATH
cd ..
if [ $3 ] # file name
then
python ${BASEPATH}/../export.py \
--pretrained=$1 \
--device_target='CPU' \
--batch_size=$2 \
--file_format=MINDIR \
--file_name=$3 > convert.log 2>&1 &
else
python ${BASEPATH}/../export.py \
--pretrained=$1 \
--device_target='CPU' \
--batch_size=$2 \
--file_format=MINDIR > convert.log 2>&1 &
fi


@ -0,0 +1,42 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# -lt 2 ]
then
echo "Usage: sh run_export_gpu.sh [PRETRAINED_BACKBONE] [BATCH_SIZE] [FILE_NAME](optional)"
exit 1
fi
BASEPATH=$(cd "`dirname $0`" || exit; pwd)
export PYTHONPATH=${BASEPATH}:$PYTHONPATH
cd ..
if [ $3 ] # file name
then
python ${BASEPATH}/../export.py \
--pretrained=$1 \
--device_target='GPU' \
--batch_size=$2 \
--file_format=MINDIR \
--file_name=$3 > convert.log 2>&1 &
else
python ${BASEPATH}/../export.py \
--pretrained=$1 \
--device_target='GPU' \
--batch_size=$2 \
--file_format=MINDIR > convert.log 2>&1 &
fi


@ -0,0 +1,43 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# -lt 1 ]
then
echo "Usage: sh run_standalone_train_cpu.sh [TRAIN_LABEL_FILE] [PRETRAINED_BACKBONE](optional)"
exit 1
fi
BASEPATH=$(cd "`dirname $0`" || exit; pwd)
export PYTHONPATH=${BASEPATH}:$PYTHONPATH
if [ -d "../train" ]
then
rm -rf ../train
fi
mkdir ../train
cd ../train || exit
if [ $2 ] # pretrained ckpt
then
python ${BASEPATH}/../train.py \
--train_label_file=$1 \
--device_target='CPU' \
--pretrained=$2 > train.log 2>&1 &
else
python ${BASEPATH}/../train.py \
--train_label_file=$1 \
--device_target='CPU' > train.log 2>&1 &
fi


@ -0,0 +1,43 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# -lt 1 ]
then
echo "Usage: sh run_standalone_train_gpu.sh [TRAIN_LABEL_FILE] [PRETRAINED_BACKBONE](optional)"
exit 1
fi
BASEPATH=$(cd "`dirname $0`" || exit; pwd)
export PYTHONPATH=${BASEPATH}:$PYTHONPATH
if [ -d "../train" ]
then
rm -rf ../train
fi
mkdir ../train
cd ../train || exit
if [ $2 ] # pretrained ckpt
then
python ${BASEPATH}/../train.py \
--train_label_file=$1 \
--device_target='GPU' \
--pretrained=$2 > train.log 2>&1 &
else
python ${BASEPATH}/../train.py \
--train_label_file=$1 \
--device_target='GPU' > train.log 2>&1 &
fi


@ -1,4 +1,4 @@
# Copyright 2020 Huawei Technologies Co., Ltd
# Copyright 2020-2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@ -19,14 +19,6 @@ from mindspore.ops import operations as P
from mindspore.nn import Dense, Cell
class Cut(nn.Cell):
def construct(self, x):
return x
def bn_with_initialize(out_channels):
bn = nn.BatchNorm2d(out_channels, momentum=0.9, eps=1e-5)
return bn


@ -1,4 +1,4 @@
# Copyright 2020 Huawei Technologies Co., Ltd
# Copyright 2020-2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@ -23,11 +23,28 @@ from mindspore import Tensor
eps = 1e-24
class log_softmax(nn.Cell):
''' Replacement of P.LogSoftmax() that supports 3-D inputs (len(x.shape) == 3). '''
def __init__(self):
super(log_softmax, self).__init__()
self.lsm = P.LogSoftmax()
self.concat = P.Concat(1)
self.reshape = P.Reshape()
def construct(self, x):
dim1 = x.shape[1]
result = []
for i in range(dim1):
lsm = self.lsm(x[:, i, :])
lsm = self.reshape(lsm, (F.shape(lsm)[0], 1, F.shape(lsm)[1]))
result = lsm if i == 0 else self.concat((result, lsm))
return result
class CEWithIgnoreIndex3D(_Loss):
'''CEWithIgnoreIndex3D'''
def __init__(self):
super(CEWithIgnoreIndex3D, self).__init__()
self.exp = P.Exp()
self.sum = P.ReduceSum()
self.reshape = P.Reshape()
@ -41,6 +58,7 @@ class CEWithIgnoreIndex3D(_Loss):
self.relu = P.ReLU()
self.maximum = P.Maximum()
self.resum = P.ReduceSum(keep_dims=False)
self.logsoftmax = log_softmax()
def construct(self, logit, label):
'''construct'''
@ -50,10 +68,7 @@ class CEWithIgnoreIndex3D(_Loss):
mask = self.relu(mask) / (mask)
logit = logit * mask
exp = self.exp(logit)
exp_sum = self.sum(exp, -1)
exp_sum = self.reshape(exp_sum, (F.shape(exp_sum)[0], F.shape(exp_sum)[1], 1))
softmax_result = self.log(exp / exp_sum + self.eps_const)
softmax_result = self.logsoftmax(logit)
one_hot_label = self.onehot(
self.cast(label, mstype.int32), F.shape(logit)[2], self.on_value, self.off_value)
loss = (softmax_result * self.cast(one_hot_label, mstype.float32) * self.cast(F.scalar_to_array(-1),
@ -62,7 +77,6 @@ class CEWithIgnoreIndex3D(_Loss):
loss = self.sum(loss, -1)
loss = self.sum(loss, -1)
loss = self.sum(loss, 0)
loss = loss
return loss


@ -1,4 +1,4 @@
# Copyright 2020 Huawei Technologies Co., Ltd
# Copyright 2020-2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@ -38,8 +38,6 @@ from src.dataset import faceqa_dataset
from src.log import get_logger, AverageMeter
warnings.filterwarnings('ignore')
devid = int(os.getenv('DEVICE_ID'))
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend", save_graphs=False, device_id=devid)
mindspore.common.seed.set_seed(1)
def main(args):
@ -181,7 +179,14 @@ if __name__ == "__main__":
parser.add_argument('--is_distributed', type=int, default=0, help='if multi device')
parser.add_argument('--train_label_file', type=str, default='', help='image label list file, e.g. /home/label.txt')
parser.add_argument('--pretrained', type=str, default='', help='pretrained model to load')
parser.add_argument('--device_target', type=str, choices=['Ascend', 'GPU', 'CPU'], default='Ascend',
help='device target')
arg = parser.parse_args()
context.set_context(mode=context.GRAPH_MODE, device_target=arg.device_target, save_graphs=False)
if arg.device_target == 'Ascend':
devid = int(os.getenv('DEVICE_ID'))
context.set_context(device_id=devid)
main(arg)
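Likewise, a single-device CPU training run can be started by calling train.py directly; a minimal sketch (the label-file path is a placeholder):
```bash
python train.py \
    --train_label_file=/path/to/train_label.txt \
    --is_distributed=0 \
    --device_target=CPU
```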


@ -16,7 +16,7 @@
if [ $# -lt 3 ]
then
echo "Usage: sh run_distributed_train_gpu.sh [DEVICE_NUM] [VISIBLE_DEVICES(0,1,2,3,4,5,6,7)] [DATASET_PATH]
echo "Usage: sh run_distribute_train_gpu.sh [DEVICE_NUM] [VISIBLE_DEVICES(0,1,2,3,4,5,6,7)] [DATASET_PATH]
[PRE_TRAINED](optional)"
exit 1
fi
@ -42,8 +42,23 @@ cd ../train || exit
export CUDA_VISIBLE_DEVICES="$2"
if [ $4 ] #pretrained ckpt
if [ $4 ] # pretrained ckpt
then
if [ $1 -gt 1 ]
then
mpirun -n $1 --allow-run-as-root python3 ${BASEPATH}/../train.py \
--data_dir=$3 \
--is_distributed=1 \
--device_target='GPU' \
--pretrained=$4
else
python3 ${BASEPATH}/../train.py \
--data_dir=$3 \
--is_distributed=0 \
--device_target='GPU' \
--pretrained=$4
fi
else
if [ $1 -gt 1 ]
then
mpirun -n $1 --allow-run-as-root python3 ${BASEPATH}/../train.py \