!18782 Add Textcnn and Googlenet CPU Scripts

Merge pull request !18782 from wuxuejian/textcnn
i-robot 2021-06-25 08:25:49 +00:00 committed by Gitee
commit d3a53ff1f6
10 changed files with 272 additions and 10 deletions

View File

@ -71,8 +71,8 @@ For FP16 operators, if the input data type is FP32, the backend of MindSpore wil
# [Environment Requirements](#contents)
- Hardware (Ascend/GPU)
- Prepare hardware environment with Ascend or GPU processor.
- Hardware (Ascend/GPU/CPU)
- Prepare hardware environment with Ascend/GPU/CPU processor.
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below
@ -125,6 +125,16 @@ After installing MindSpore via the official website, you can start training and
sh run_eval_gpu.sh [CHECKPOINT_PATH]
```
- running on CPU
```bash
# run training example
nohup python train.py --config_path=cifar10_config_cpu.yaml --dataset_name=cifar10 > train.log 2>&1 &
# run evaluation example
nohup python eval.py --checkpoint_path=[CHECKPOINT_PATH] > eval.log 2>&1 &
```
We use the CIFAR-10 dataset by default. You can also pass `$dataset_type` to the scripts to select a different dataset. For more details, please refer to the specific script.
- ModelArts (If you want to run in modelarts, please check the official documentation of [modelarts](https://support.huaweicloud.com/modelarts/), and you can start training as follows)
@ -232,9 +242,11 @@ We use CIFAR-10 dataset by default. Your can also pass `$dataset_type` to the sc
├── scripts
│ ├──run_train.sh // shell script for distributed on Ascend
│ ├──run_train_gpu.sh // shell script for distributed on GPU
│ ├──run_train_cpu.sh // shell script for training on CPU
│ ├──run_eval.sh // shell script for evaluation on Ascend
│ ├──run_infer_310.sh // shell script for 310 inference
│ ├──run_eval_gpu.sh // shell script for evaluation on GPU
│ ├──run_eval_cpu.sh // shell script for evaluation on CPU
├── src
│ ├──dataset.py // creating dataset
│ ├──googlenet.py // googlenet architecture
@ -338,6 +350,16 @@ For more configuration details, please refer the script `config.py`.
After training, you'll get some checkpoint files under the folder `./ckpt_0/` by default.
- running on CPU
```bash
nohup python train.py --config_path=cifar10_config_cpu.yaml --dataset_name=cifar10 > train.log 2>&1 &
```
The python command above runs in the background; you can view the results in the file `train.log`.
After training, you'll get some checkpoint files under the folder defined in config.yaml.
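As a rough illustration of the step above, the newest checkpoint in the output folder could be located like this. It is only a sketch: the helper name `latest_checkpoint` is hypothetical, and the `./ckpt/` default assumes the `ckpt_save_dir` value from the CPU config is the folder in use.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(ckpt_dir: str) -> Optional[Path]:
    """Return the most recently modified .ckpt file in ckpt_dir, or None."""
    # A missing directory simply yields no candidates.
    candidates = sorted(Path(ckpt_dir).glob("*.ckpt"), key=lambda p: p.stat().st_mtime)
    return candidates[-1] if candidates else None


if __name__ == "__main__":
    # "./ckpt/" is an assumption taken from ckpt_save_dir; adjust to your config.
    print(latest_checkpoint("./ckpt/"))
```

The returned path can then be passed to `eval.py` through `--checkpoint_path`.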
### Distributed Training
- running on Ascend

View File

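@ -78,8 +78,8 @@ GoogleNet chains multiple inception modules in series so that it can go deeper. The dimension-reduced
# Environment Requirements
- Hardware (Ascend/GPU)
- Prepare the hardware environment with an Ascend or GPU processor.
- Hardware (Ascend/GPU/CPU)
- Prepare the hardware environment with an Ascend/GPU/CPU processor.
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below: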
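@ -132,6 +132,16 @@ GoogleNet chains multiple inception modules in series so that it can go deeper. The dimension-reduced
sh run_eval_gpu.sh [CHECKPOINT_PATH]
```
- running on CPU
```bash
# run training example
nohup python train.py --config_path=cifar10_config_cpu.yaml --dataset_name=cifar10 > train.log 2>&1 &
# run evaluation example
nohup python eval.py --checkpoint_path=[CHECKPOINT_PATH] > eval.log 2>&1 &
```
The CIFAR-10 dataset is used by default. You can also pass `$dataset_type` to the scripts to select a different dataset. For more details, please refer to the specific script.
- Training on ModelArts (if you want to run on ModelArts, please refer to the [modelarts](https://support.huaweicloud.com/modelarts/) documentation)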
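@ -239,9 +249,11 @@ GoogleNet chains multiple inception modules in series so that it can go deeper. The dimension-reduced
├── scripts
│ ├──run_train.sh // shell script for distributed training on Ascend
│ ├──run_train_gpu.sh // shell script for distributed training on GPU
│ ├──run_train_cpu.sh // shell script for training on CPU
│ ├──run_eval.sh // shell script for evaluation on Ascend
│ ├──run_infer_310.sh // shell script for Ascend inference
│ ├──run_eval_gpu.sh // shell script for evaluation on GPU
│ ├──run_eval_cpu.sh // shell script for evaluation on CPU
├── src
│ ├──dataset.py // creating dataset
│ ├──googlenet.py // googlenet architecture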
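@ -313,6 +325,16 @@ GoogleNet chains multiple inception modules in series so that it can go deeper. The dimension-reduced
After training, you can find the checkpoint files under the script folder `./ckpt_0/` by default.
- running on CPU
```bash
nohup python train.py --config_path=cifar10_config_cpu.yaml --dataset_name=cifar10 > train.log 2>&1 &
```
The python command above runs in the background; you can view the results in the `train.log` file.
After training, you can find the checkpoint files under the folder configured in the yaml file.
### Distributed Training
- running on Ascend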

View File

@ -0,0 +1,51 @@
# Builtin Configurations(DO NOT CHANGE THESE CONFIGURATIONS unless you know exactly what you are doing)
enable_modelarts: False
# Url for modelarts
data_url: ""
train_url: ""
checkpoint_url: ""
# Path for local
data_path: "/cache/data"
output_path: "/cache/train"
load_path: "/cache/checkpoint_path"
device_target: "CPU"
need_modelarts_dataset_unzip: True
modelarts_dataset_unzip_name: "cifar10"
# ==============================================================================
# options
dataset_name: "cifar10"
name: "cifar10"
pre_trained: False
num_classes: 10
lr_init: 0.1
batch_size: 128
epoch_size: 125
momentum: 0.9
weight_decay: 0.0005 #5e-4
image_height: 224
image_width: 224
train_data_path: "/cache/data/cifar10/"
val_data_path: "/cache/data/cifar10/"
keep_checkpoint_max: 10
checkpoint_path: "./train_googlenet_cifar10-125_390.ckpt"
onnx_filename: "googlenet"
air_filename: "googlenet"
ckpt_save_dir: "./ckpt/"
# export option
ckpt_file: ""
file_name: "googlenet"
file_format: "MINDIR"
#batch_size: 1
---
# Help description for each configuration
enable_modelarts: "Whether training on modelarts, default: False"
data_url: "Url for modelarts"
train_url: "Url for modelarts"
data_path: "The location of the input data."
output_path: "The location of the output file."
device_target: 'Target device type'
enable_profiling: 'Whether enable profiling while training, default: False'
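The default `checkpoint_path` above appears to encode the training schedule. The sketch below checks that reading under two assumptions that the diff does not confirm: checkpoint files are named `<prefix>-<epoch>_<step>.ckpt`, and the incomplete last batch of each epoch is dropped. The helper names are hypothetical.

```python
import re

CIFAR10_TRAIN_IMAGES = 50_000  # size of the standard CIFAR-10 training split


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of full batches per epoch (assumes the remainder batch is dropped)."""
    return num_samples // batch_size


def parse_ckpt_name(name: str):
    """Split '<prefix>-<epoch>_<step>.ckpt' into (prefix, epoch, step)."""
    match = re.fullmatch(r"(.+)-(\d+)_(\d+)\.ckpt", name)
    if match is None:
        raise ValueError(f"unexpected checkpoint name: {name}")
    return match.group(1), int(match.group(2)), int(match.group(3))


prefix, epoch, step = parse_ckpt_name("train_googlenet_cifar10-125_390.ckpt")
print(prefix, epoch, step)                         # train_googlenet_cifar10 125 390
print(steps_per_epoch(CIFAR10_TRAIN_IMAGES, 128))  # 390
```

Under these assumptions, `epoch_size: 125` and `batch_size: 128` line up with the `125_390` suffix, so changing either value would likely change the name of the final checkpoint.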

View File

@ -0,0 +1,38 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
ulimit -u unlimited
# check checkpoint file
# quote "$1" so that a missing or empty argument fails the check
if [ ! -f "$1" ]
then
    echo "error: CHECKPOINT_PATH=$1 is not a file"
    exit 1
fi
dataset_type='cifar10'
BASEPATH=$(cd "`dirname $0`" || exit; pwd)
export PYTHONPATH=${BASEPATH}:$PYTHONPATH
if [ -d "../eval" ];
then
rm -rf ../eval
fi
mkdir ../eval
cd ../eval || exit
config_path="${BASEPATH}/../${dataset_type}_config_cpu.yaml"
echo "config path is : ${config_path}"
nohup python "${BASEPATH}/../eval.py" --config_path="$config_path" --checkpoint_path="$1" --dataset_name="$dataset_type" > ./eval.log 2>&1 &
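The last line of the script detaches the evaluation with `nohup ... > ./eval.log 2>&1 &`. For readers who launch jobs from Python instead, a roughly equivalent sketch follows. The function name `launch_in_background` is hypothetical, and `start_new_session` only has an effect on POSIX systems.

```python
import subprocess
import sys


def launch_in_background(cmd, log_path):
    """Start cmd detached from the terminal, with stdout and stderr sent to log_path."""
    with open(log_path, "w") as log_file:
        proc = subprocess.Popen(
            cmd,
            stdout=log_file,
            stderr=subprocess.STDOUT,  # same effect as 2>&1
            stdin=subprocess.DEVNULL,
            start_new_session=True,    # POSIX only: detach from the parent's session
        )
    # The child keeps its own copy of the log descriptor after the parent closes it.
    return proc
```

A call such as `launch_in_background([sys.executable, "eval.py", "--checkpoint_path=..."], "eval.log")` returns immediately; `proc.wait()` blocks until the job ends.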

View File

@ -0,0 +1,30 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
BASEPATH=$(cd "`dirname $0`" || exit; pwd)
export PYTHONPATH=${BASEPATH}:$PYTHONPATH
if [ -d "./train" ];
then
rm -rf ./train
fi
mkdir ./train
cd ./train || exit
dataset_type='cifar10'
config_path="${BASEPATH}/../${dataset_type}_config_cpu.yaml"
echo "config path is : ${config_path}"
nohup python "${BASEPATH}/../train.py" --config_path="$config_path" --dataset_name="$dataset_type" > train.log 2>&1 &
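Because the script changes into `./train` before launching, the config file has to be addressed relative to the script location rather than the working directory. The sketch below mirrors the `${BASEPATH}/../${dataset_type}_config_cpu.yaml` expression with `pathlib`; the function name and the `suffix` parameter are hypothetical.

```python
from pathlib import Path


def config_path_for(script_path: str, dataset_type: str,
                    suffix: str = "_config_cpu.yaml") -> Path:
    """Return <parent of the script's directory>/<dataset_type><suffix>."""
    script_dir = Path(script_path).resolve().parent  # plays the role of BASEPATH
    return script_dir.parent / f"{dataset_type}{suffix}"


if __name__ == "__main__":
    print(config_path_for("scripts/run_train_cpu.sh", "cifar10"))
```

Resolving the script path first keeps the result valid after a later change of working directory.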

View File

@ -184,8 +184,6 @@ def run_train():
context.reset_auto_parallel_context()
context.set_auto_parallel_context(device_num=device_num, parallel_mode=ParallelMode.DATA_PARALLEL,
gradients_mean=True)
else:
raise ValueError("Unsupported platform.")
if cfg.dataset_name == "cifar10":
dataset = create_dataset_cifar10(cfg.train_data_path, 1, cifar_cfg=cfg)
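This hunk drops the `else` branch that raised `ValueError("Unsupported platform.")`, which lets `device_target: "CPU"` reach the dataset setup. A side effect is that any other string now passes through as well. If an explicit check is still wanted, an allow-list is one option. The sketch below is not part of the PR; the names `SUPPORTED_TARGETS` and `check_device_target` are hypothetical, and the list of targets is taken from the help text in the TextCNN CPU config.

```python
SUPPORTED_TARGETS = ("Ascend", "GPU", "CPU")


def check_device_target(device_target: str) -> str:
    """Return device_target unchanged, or raise if it is not a known platform."""
    if device_target not in SUPPORTED_TARGETS:
        raise ValueError(
            f"Unsupported platform: {device_target!r}; expected one of {SUPPORTED_TARGETS}"
        )
    return device_target


print(check_device_target("CPU"))  # CPU
```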

View File

@ -41,7 +41,7 @@ Dataset used: [Movie Review Data](<http://www.cs.cornell.edu/people/pabo/movie-r
# [Environment Requirements](#contents)
- Hardware (Ascend)
- Hardware (Ascend/CPU)
- Prepare hardware environment with Ascend processor.
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
@ -53,7 +53,7 @@ Dataset used: [Movie Review Data](<http://www.cs.cornell.edu/people/pabo/movie-r
After installing MindSpore via the official website, you can start training and evaluation as follows:
- running on Ascend
- running on Ascend/CPU
```python
# run training example
@ -110,6 +110,8 @@ If you want to run in modelarts, please check the official documentation of [mod
├──scripts
│ ├── run_train.sh // shell script for distributed on Ascend
│ ├── run_eval.sh // shell script for evaluation on Ascend
│ ├── run_train_cpu.sh // shell script for training on CPU
│ ├── run_eval_cpu.sh // shell script for evaluation on CPU
├── src
│ ├── dataset.py // Processing dataset
│ ├── textcnn.py // textcnn architecture
@ -119,6 +121,7 @@ If you want to run in modelarts, please check the official documentation of [mod
│ ├──moxing_adapter.py // moxing adapter
│ ├──config.py // parameter analysis
├── mr_config.yaml // parameter configuration
├── mr_config_cpu.yaml // parameter configuration
├── sst2_config.yaml // parameter configuration
├── subj_config.yaml // parameter configuration
├── train.py // training script
@ -152,7 +155,7 @@ For more configuration details, please refer the script `*.yaml`.
## [Training Process](#contents)
- running on Ascend
- running on Ascend/CPU
```python
# need to set config_path in the config.py file and data_path in the yaml file
@ -176,7 +179,7 @@ For more configuration details, please refer the script `*.yaml`.
## [Evaluation Process](#contents)
- evaluation on movie review dataset when running on Ascend
- evaluation on movie review dataset when running on Ascend/CPU
Before running the command below, please check the checkpoint path used for evaluation. Please set the checkpoint path to be the absolute full path, e.g., "username/textcnn/ckpt/train_textcnn.ckpt".
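One way to follow the advice above is to normalize the checkpoint path before passing it on. This is a minimal sketch; the helper name `absolute_checkpoint` is hypothetical.

```python
from pathlib import Path


def absolute_checkpoint(path_str: str) -> Path:
    """Expand '~', make the path absolute, and verify that it points to a file."""
    path = Path(path_str).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"checkpoint not found: {path}")
    return path
```

The result can be converted with `str(...)` and supplied as the checkpoint argument.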

View File

@ -0,0 +1,59 @@
# Builtin Configurations(DO NOT CHANGE THESE CONFIGURATIONS unless you know exactly what you are doing)
enable_modelarts: False
# Url for modelarts
data_url: ""
train_url: ""
checkpoint_url: ""
# Path for local
data_path: "./data"
output_path: "./train"
load_path: "./checkpoint_path/"
device_target: 'CPU'
enable_profiling: False
# ==============================================================================
# Training options
dataset: 'MR'
pre_trained: False
num_classes: 2
batch_size: 64
epoch_size: 4
weight_decay: 3e-5
keep_checkpoint_max: 1
checkpoint_path: './checkpoint/'
checkpoint_file_path: './train/checkpoint/train_textcnn-4_149.ckpt'
word_len: 51
vec_length: 40
base_lr: 1e-3
label_dir: ""
result_dir: ""
result_path: './preprocess_Result/'
# Export options
device_id: 0
file_name: "textcnn"
file_format: "AIR"
---
# Help description for each configuration
enable_modelarts: 'Whether training on modelarts, default: False'
data_url: 'Dataset url for obs'
train_url: 'Training output url for obs'
checkpoint_url: 'The location of checkpoint for obs'
data_path: 'Dataset path for local'
output_path: 'Training output path for local'
load_path: 'The location of checkpoint for obs'
device_target: 'Target device type, available: [Ascend, GPU, CPU]'
enable_profiling: 'Whether enable profiling while training, default: False'
dataset: "Dataset to be trained and evaluated, choice: ['MR', 'SUBJ', 'SST2']"
train_epochs: "The number of epochs used to train."
pre_trained: 'If need load pre_trained checkpoint, default: False'
num_classes: 'Class for dataset'
batch_size: "Batch size for training and evaluation"
epoch_size: "Total training epochs."
weight_decay: "Weight decay."
keep_checkpoint_max: "keep the last keep_checkpoint_max checkpoint"
num_factors: "The Embedding size of MF model."
checkpoint_path: "The location of the checkpoint file."
eval_file_name: "Eval output file."
checkpoint_file_path: "The location of the checkpoint file."
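The file above holds two YAML documents: values before the `---` separator and help strings after it. The sketch below shows one way such a pair could drive command-line overrides, and how to spot help entries that have no matching value (for example `train_epochs` here). The dictionaries are a hand-copied subset standing in for the parsed file, and the function names are hypothetical; the repository's own `config.py` may work differently.

```python
import argparse

# Hand-copied subset of the two YAML documents; a real loader would parse the file.
values = {"device_target": "CPU", "batch_size": 64, "epoch_size": 4, "base_lr": 1e-3}
helps = {
    "device_target": "Target device type, available: [Ascend, GPU, CPU]",
    "batch_size": "Batch size for training and evaluation",
    "epoch_size": "Total training epochs.",
    "train_epochs": "The number of epochs used to train.",
}


def build_parser(values, helps):
    """Create one --key option per config value, typed after its default."""
    parser = argparse.ArgumentParser(description="config overrides")
    for key, default in values.items():
        # Note: boolean defaults would need special handling; none are used here.
        parser.add_argument(f"--{key}", type=type(default), default=default,
                            help=helps.get(key, "(no description)"))
    return parser


def orphan_help_keys(values, helps):
    """Help entries that describe a key with no value in the config."""
    return sorted(set(helps) - set(values))


args = build_parser(values, helps).parse_args(["--batch_size", "32"])
print(args.batch_size, args.device_target)  # 32 CPU
print(orphan_help_keys(values, helps))      # ['train_epochs']
```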

View File

@ -0,0 +1,19 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
BASE_PATH=$(dirname "$(dirname "$(readlink -f $0)")")
dataset_type='MR'
CONFIG_FILE="${BASE_PATH}/mr_config_cpu.yaml"
python eval.py --checkpoint_file_path="$1" --dataset=$dataset_type --config_path=$CONFIG_FILE > eval.log 2>&1 &

View File

@ -0,0 +1,20 @@
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
BASE_PATH=$(dirname "$(dirname "$(readlink -f $0)")")
dataset_type='MR'
echo $BASE_PATH
CONFIG_FILE="${BASE_PATH}/mr_config_cpu.yaml"
python train.py --dataset=$dataset_type --config_path=$CONFIG_FILE > train.log 2>&1 &