!18782 Add Textcnn and Googlenet CPU Scripts

Merge pull request !18782 from wuxuejian/textcnn
2021-06-25 08:25:49 +00:00 · d3a53ff1f6
parent 246f96f21b 4b90d90e1f
commit d3a53ff1f6
10 changed files with 272 additions and 10 deletions
--- a/model_zoo/official/cv/googlenet/README.md
+++ b/model_zoo/official/cv/googlenet/README.md
@@ -71,8 +71,8 @@ For FP16 operators, if the input data type is FP32, the backend of MindSpore wil

 # [Environment Requirements](#contents)

-- Hardware（Ascend/GPU）
-    - Prepare hardware environment with Ascend or GPU processor.
+- Hardware（Ascend/GPU/CPU）
+    - Prepare a hardware environment with an Ascend, GPU, or CPU processor.
 - Framework
    - [MindSpore](https://www.mindspore.cn/install/en)
 - For more information, please check the resources below：
@@ -125,6 +125,16 @@ After installing MindSpore via the official website, you can start training and
  sh run_eval_gpu.sh [CHECKPOINT_PATH]
  ```

+- running on CPU
+
+  ```bash
+  # run training example
+  nohup python train.py --config_path=cifar10_config_cpu.yaml --dataset_name=cifar10 > train.log 2>&1 &
+
+  # run evaluation example
+  nohup python eval.py --config_path=cifar10_config_cpu.yaml --checkpoint_path=[CHECKPOINT_PATH] --dataset_name=cifar10 > eval.log 2>&1 &
+  ```
+
 We use the CIFAR-10 dataset by default. You can also pass `$dataset_type` to the scripts to select a different dataset. For more details, please refer to the specific script.

 - ModelArts (If you want to run in modelarts, please check the official documentation of [modelarts](https://support.huaweicloud.com/modelarts/), and you can start training as follows)
@@ -232,9 +242,11 @@ We use CIFAR-10 dataset by default. You can also pass `$dataset_type` to the sc
        ├── scripts
        │   ├──run_train.sh             // shell script for distributed on Ascend
        │   ├──run_train_gpu.sh         // shell script for distributed on GPU
+        │   ├──run_train_cpu.sh         // shell script for training on CPU
        │   ├──run_eval.sh              // shell script for evaluation on Ascend
        │   ├──run_infer_310.sh         // shell script for 310 inference
        │   ├──run_eval_gpu.sh          // shell script for evaluation on GPU
+        │   ├──run_eval_cpu.sh          // shell script for evaluation on CPU
        ├── src
        │   ├──dataset.py             // creating dataset
        │   ├──googlenet.py          // googlenet architecture
@@ -338,6 +350,16 @@ For more configuration details, please refer to the script `config.py`.

  After training, you'll get some checkpoint files under the folder `./ckpt_0/` by default.

+- running on CPU
+
+  ```bash
+  nohup python train.py --config_path=cifar10_config_cpu.yaml --dataset_name=cifar10 > train.log 2>&1 &
+  ```
+
+  The command above runs in the background. You can view the results in the file `train.log`.
+
+  After training, you'll get checkpoint files under the folder configured in the YAML config file.
+
 ### Distributed Training

 - running on Ascend
--- a/model_zoo/official/cv/googlenet/README_CN.md
+++ b/model_zoo/official/cv/googlenet/README_CN.md
@@ -78,8 +78,8 @@ GoogleNet chains multiple inception modules in series so it can go deeper.  Dimension-reduced

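 # Environment Requirements

-- Hardware (Ascend/GPU)
-    - Set up the hardware environment with an Ascend or GPU processor.
+- Hardware (Ascend/GPU/CPU)
+    - Set up the hardware environment with an Ascend, GPU, or CPU processor.
 - Framework
     - [MindSpore](https://www.mindspore.cn/install/en)
 - For more information, see the following resources: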
@@ -132,6 +132,16 @@ GoogleNet chains multiple inception modules in series so it can go deeper.  Dimension-reduced
  sh run_eval_gpu.sh [CHECKPOINT_PATH]
  ```

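+- running on CPU
+
+  ```bash
+  # run training example
+  nohup python train.py --config_path=cifar10_config_cpu.yaml --dataset_name=cifar10 > train.log 2>&1 &
+
+  # run evaluation example
+  nohup python eval.py --config_path=cifar10_config_cpu.yaml --checkpoint_path=[CHECKPOINT_PATH] --dataset_name=cifar10 > eval.log 2>&1 &
+  ```
+
 The CIFAR-10 dataset is used by default. You can also pass `$dataset_type` to the scripts to select a different dataset. For more details, refer to the corresponding script.

 - Training on ModelArts (if you want to run on modelarts, see the [modelarts](https://support.huaweicloud.com/modelarts/) documentation)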
@@ -239,9 +249,11 @@ GoogleNet chains multiple inception modules in series so it can go deeper.  Dimension-reduced
        ├── scripts
        │   ├──run_train.sh             // shell script for distributed training on Ascend
        │   ├──run_train_gpu.sh         // shell script for distributed training on GPU
+        │   ├──run_train_cpu.sh         // shell script for training on CPU
        │   ├──run_eval.sh              // shell script for evaluation on Ascend
        │   ├──run_infer_310.sh         // shell script for Ascend inference
        │   ├──run_eval_gpu.sh          // shell script for evaluation on GPU
+        │   ├──run_eval_cpu.sh          // shell script for evaluation on CPU
        ├── src
        │   ├──dataset.py             // creating dataset
        │   ├──googlenet.py          // googlenet architecture
@@ -313,6 +325,16 @@ GoogleNet chains multiple inception modules in series so it can go deeper.  Dimension-reduced

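  After training, you can find the checkpoint files under the default `./ckpt_0/` folder.

+- running on CPU
+
+  ```bash
+  nohup python train.py --config_path=cifar10_config_cpu.yaml --dataset_name=cifar10 > train.log 2>&1 &
+  ```
+
+  The python command above runs in the background. You can view the results in the `train.log` file.
+
+  After training, you can find the checkpoint files under the folder configured in the YAML file.
+
 ### Distributed Training

 - running on Ascend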
--- a/model_zoo/official/cv/googlenet/cifar10_config_cpu.yaml
+++ b/model_zoo/official/cv/googlenet/cifar10_config_cpu.yaml
@@ -0,0 +1,51 @@
+# Builtin Configurations(DO NOT CHANGE THESE CONFIGURATIONS unless you know exactly what you are doing)
+enable_modelarts: False
+# Url for modelarts
+data_url: ""
+train_url: ""
+checkpoint_url: ""
+# Path for local
+data_path: "/cache/data"
+output_path: "/cache/train"
+load_path: "/cache/checkpoint_path"
+device_target: "CPU"
+need_modelarts_dataset_unzip: True
+modelarts_dataset_unzip_name: "cifar10"
+
+# ==============================================================================
+# options
+dataset_name: "cifar10"
+name: "cifar10"
+pre_trained: False
+num_classes: 10
+lr_init: 0.1
+batch_size: 128
+epoch_size: 125
+momentum: 0.9
+weight_decay: 0.0005 #5e-4
+image_height: 224
+image_width: 224
+train_data_path: "/cache/data/cifar10/"
+val_data_path: "/cache/data/cifar10/"
+keep_checkpoint_max: 10
+checkpoint_path: "./train_googlenet_cifar10-125_390.ckpt"
+onnx_filename: "googlenet"
+air_filename: "googlenet"
+ckpt_save_dir: "./ckpt/"
+
+# export option
+ckpt_file: ""
+file_name: "googlenet"
+file_format: "MINDIR"
+#batch_size: 1
+
+---
+
+# Help description for each configuration
+enable_modelarts: "Whether training on modelarts, default: False"
+data_url: "Url for modelarts"
+train_url: "Url for modelarts"
+data_path: "The location of the input data."
+output_path: "The location of the output file."
+device_target: 'Target device type'
+enable_profiling: 'Whether enable profiling while training, default: False'
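Reviewer note: the new scripts pass `--config_path` together with `--dataset_name=cifar10`, which suggests that command-line flags are layered on top of the YAML defaults. The following is a minimal sketch of that idea under that assumption; it is not the repository's config loader. `YAML_DEFAULTS` copies a few values from `cifar10_config_cpu.yaml`, and `merge_cli_overrides` is a hypothetical helper.

```python
import argparse

# A few defaults copied from the options shown in cifar10_config_cpu.yaml.
YAML_DEFAULTS = {
    "device_target": "CPU",
    "dataset_name": "cifar10",
    "batch_size": 128,
    "epoch_size": 125,
    "lr_init": 0.1,
}


def merge_cli_overrides(defaults, argv):
    """Hypothetical helper: expose each default as a --flag and return the merged settings."""
    parser = argparse.ArgumentParser(description="override YAML defaults from the command line")
    for key, value in defaults.items():
        # Reuse the type of the YAML value so "--batch_size=64" is parsed as an int.
        parser.add_argument(f"--{key}", type=type(value), default=value)
    return vars(parser.parse_args(argv))


settings = merge_cli_overrides(YAML_DEFAULTS, ["--batch_size=64", "--dataset_name=cifar10"])
print(settings["batch_size"], settings["device_target"])
```

Boolean options such as `pre_trained` are left out on purpose: `type=bool` in `argparse` treats any non-empty string as true, so they would need a dedicated converter.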
--- a/model_zoo/official/cv/googlenet/scripts/run_eval_cpu.sh
+++ b/model_zoo/official/cv/googlenet/scripts/run_eval_cpu.sh
@@ -0,0 +1,38 @@
+#!/bin/bash
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+
+ulimit -u unlimited
+# check checkpoint file
+if [ ! -f "$1" ]
+then
+    echo "error: CHECKPOINT_PATH=$1 is not a file"
+    exit 1
+fi
+
+dataset_type='cifar10'
+BASEPATH=$(cd "`dirname $0`" || exit; pwd)
+export PYTHONPATH=${BASEPATH}:$PYTHONPATH
+if [ -d "../eval" ];
+then
+    rm -rf ../eval
+fi
+mkdir ../eval
+cd ../eval || exit
+
+config_path="${BASEPATH}/../${dataset_type}_config_cpu.yaml"
+echo "config path is : ${config_path}"
+
+nohup python ${BASEPATH}/../eval.py --config_path=$config_path --checkpoint_path=$1 --dataset_name=$dataset_type > ./eval.log 2>&1 &
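Reviewer note: `run_eval_cpu.sh` checks `$1` before `cd ../eval` and passes the same string to `eval.py` afterwards, so a relative checkpoint path may stop resolving once the working directory changes. Below is a minimal Python sketch of the same steps that resolves the path first. It assumes the work directory is anchored at the model root rather than at the caller's current directory, which differs from the script; `prepare_eval` and its parameters are hypothetical names.

```python
import shutil
import sys
from pathlib import Path


def prepare_eval(checkpoint, model_root, dataset_type="cifar10"):
    """Hypothetical Python rendering of the steps in run_eval_cpu.sh."""
    ckpt = Path(checkpoint)
    if not ckpt.is_file():
        raise FileNotFoundError(f"CHECKPOINT_PATH={checkpoint} is not a file")

    root = Path(model_root).resolve()
    work_dir = root / "eval"
    # Recreate the work directory, as the script does with rm -rf and mkdir.
    shutil.rmtree(work_dir, ignore_errors=True)
    work_dir.mkdir(parents=True)

    config_path = root / f"{dataset_type}_config_cpu.yaml"
    command = [
        sys.executable,
        str(root / "eval.py"),
        f"--config_path={config_path}",
        # Resolve to an absolute path so it stays valid from inside work_dir.
        f"--checkpoint_path={ckpt.resolve()}",
        f"--dataset_name={dataset_type}",
    ]
    return work_dir, command
```

The returned `command` list could then be handed to `subprocess.Popen(command, cwd=work_dir, ...)` with output redirected to `eval.log`; that step is omitted here so the sketch has no side effects beyond creating the directory.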
--- a/model_zoo/official/cv/googlenet/scripts/run_train_cpu.sh
+++ b/model_zoo/official/cv/googlenet/scripts/run_train_cpu.sh
@@ -0,0 +1,30 @@
+#!/bin/bash
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+
+BASEPATH=$(cd "`dirname $0`" || exit; pwd)
+export PYTHONPATH=${BASEPATH}:$PYTHONPATH
+if [ -d "./train" ];
+then
+    rm -rf ./train
+fi
+mkdir ./train
+cd ./train || exit
+
+dataset_type='cifar10'
+
+config_path="${BASEPATH}/../${dataset_type}_config_cpu.yaml"
+echo "config path is : ${config_path}"
+nohup python ${BASEPATH}/../train.py --config_path=$config_path --dataset_name=$dataset_type > train.log 2>&1 &
--- a/model_zoo/official/cv/googlenet/train.py
+++ b/model_zoo/official/cv/googlenet/train.py
@@ -184,8 +184,6 @@ def run_train():
            context.reset_auto_parallel_context()
            context.set_auto_parallel_context(device_num=device_num, parallel_mode=ParallelMode.DATA_PARALLEL,
                                              gradients_mean=True)
-    else:
-        raise ValueError("Unsupported platform.")

    if cfg.dataset_name == "cifar10":
        dataset = create_dataset_cifar10(cfg.train_data_path, 1, cifar_cfg=cfg)
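Reviewer note: this hunk removes the `else: raise ValueError("Unsupported platform.")` branch so that a CPU target no longer aborts. A side effect is that a mistyped `device_target` is no longer rejected at this point. A minimal sketch of one alternative, assuming the surrounding code branches on the device target: validate against an explicit allowlist before any device setup. `SUPPORTED_TARGETS`, `check_device_target`, and `needs_parallel_setup` are hypothetical names, not part of `train.py`.

```python
SUPPORTED_TARGETS = ("Ascend", "GPU", "CPU")


def check_device_target(device_target):
    """Hypothetical guard: accept known targets and reject anything else early."""
    if device_target not in SUPPORTED_TARGETS:
        raise ValueError(f"Unsupported platform: {device_target!r}")
    return device_target


def needs_parallel_setup(device_target, device_num):
    """Assumption: only multi-device Ascend/GPU runs need auto-parallel setup."""
    return device_target in ("Ascend", "GPU") and device_num > 1


print(check_device_target("CPU"), needs_parallel_setup("CPU", 1))
```

With a guard like this, the CPU path can skip the distributed setup while unknown targets still fail with a clear message.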
--- a/model_zoo/official/nlp/textcnn/README.md
+++ b/model_zoo/official/nlp/textcnn/README.md
@@ -41,7 +41,7 @@ Dataset used: [Movie Review Data](<http://www.cs.cornell.edu/people/pabo/movie-r

 # [Environment Requirements](#contents)

-- Hardware（Ascend）
+- Hardware（Ascend/CPU）
    - Prepare hardware environment with Ascend processor.
 - Framework
    - [MindSpore](https://www.mindspore.cn/install/en)
@@ -53,7 +53,7 @@ Dataset used: [Movie Review Data](<http://www.cs.cornell.edu/people/pabo/movie-r

 After installing MindSpore via the official website, you can start training and evaluation as follows:

-- running on Ascend
+- running on Ascend/CPU

  ```python
  # run training example
@@ -110,6 +110,8 @@ If you want to run in modelarts, please check the official documentation of [mod
        ├──scripts
        │   ├── run_train.sh              // shell script for distributed on Ascend
        │   ├── run_eval.sh               // shell script for evaluation on Ascend
+        │   ├── run_train_cpu.sh          // shell script for training on CPU
+        │   ├── run_eval_cpu.sh           // shell script for evaluation on CPU
        ├── src
        │   ├── dataset.py                // Processing dataset
        │   ├── textcnn.py                // textcnn architecture
@@ -119,6 +121,7 @@ If you want to run in modelarts, please check the official documentation of [mod
        │   ├──moxing_adapter.py          // moxing adapter
        │   ├──config.py                  // parameter analysis
        ├── mr_config.yaml                 // parameter configuration
+        ├── mr_config_cpu.yaml             // parameter configuration for CPU
        ├── sst2_config.yaml               // parameter configuration
        ├── subj_config.yaml               // parameter configuration
        ├── train.py                       // training script
@@ -152,7 +155,7 @@ For more configuration details, please refer to the script `*.yaml`.

 ## [Training Process](#contents)

-- running on Ascend
+- running on Ascend/CPU

  ```python
  # need set config_path in config.py file and set data_path in yaml file
@@ -176,7 +179,7 @@ For more configuration details, please refer to the script `*.yaml`.

 ## [Evaluation Process](#contents)

-- evaluation on movie review dataset when running on Ascend
+- evaluation on movie review dataset when running on Ascend/CPU

  Before running the command below, please check the checkpoint path used for evaluation. Please set the checkpoint path to be the absolute full path, e.g., "username/textcnn/ckpt/train_textcnn.ckpt".

--- a/model_zoo/official/nlp/textcnn/mr_config_cpu.yaml
+++ b/model_zoo/official/nlp/textcnn/mr_config_cpu.yaml
@@ -0,0 +1,59 @@
+# Builtin Configurations(DO NOT CHANGE THESE CONFIGURATIONS unless you know exactly what you are doing)
+enable_modelarts: False
+# Url for modelarts
+data_url: ""
+train_url: ""
+checkpoint_url: ""
+# Path for local
+data_path: "./data"
+output_path: "./train"
+load_path: "./checkpoint_path/"
+device_target: 'CPU'
+enable_profiling: False
+
+# ==============================================================================
+# Training options
+dataset: 'MR'
+pre_trained: False
+num_classes: 2
+batch_size: 64
+epoch_size: 4
+weight_decay: 3e-5
+keep_checkpoint_max: 1
+checkpoint_path: './checkpoint/'
+checkpoint_file_path: './train/checkpoint/train_textcnn-4_149.ckpt'
+word_len: 51
+vec_length: 40
+base_lr: 1e-3
+label_dir: ""
+result_dir: ""
+result_path: './preprocess_Result/'
+
+# Export options
+device_id: 0
+file_name: "textcnn"
+file_format: "AIR"
+
+---
+# Help description for each configuration
+enable_modelarts: 'Whether training on modelarts, default: False'
+data_url: 'Dataset url for obs'
+train_url: 'Training output url for obs'
+checkpoint_url: 'The location of checkpoint for obs'
+data_path: 'Dataset path for local'
+output_path: 'Training output path for local'
+load_path: 'The location of checkpoint for obs'
+device_target: 'Target device type, available: [Ascend, GPU, CPU]'
+enable_profiling: 'Whether enable profiling while training, default: False'
+dataset: "Dataset to be trained and evaluated, choice: ['MR', 'SUBJ', 'SST2']"
+train_epochs: "The number of epochs used to train."
+pre_trained: 'If need load pre_trained checkpoint, default: False'
+num_classes: 'Class for dataset'
+batch_size: "Batch size for training and evaluation"
+epoch_size: "Total training epochs."
+weight_decay: "Weight decay."
+keep_checkpoint_max: "keep the last keep_checkpoint_max checkpoint"
+num_factors: "The Embedding size of MF model."
+checkpoint_path: "The location of the checkpoint file."
+eval_file_name: "Eval output file."
+checkpoint_file_path: "The location of the checkpoint file."
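Reviewer note: the help document after `---` lists keys such as `train_epochs`, `num_factors`, and `eval_file_name` that have no matching option in the first document, while options such as `base_lr` have no help text. A minimal sketch of a consistency check follows. It assumes a flat `key: value` layout and uses a naive line scanner on an abbreviated excerpt instead of a real YAML parser; `top_level_keys` and `compare_sections` are hypothetical helpers.

```python
# Abbreviated excerpt of mr_config_cpu.yaml: an options document, then a help document.
EXCERPT = """\
device_target: 'CPU'
dataset: 'MR'
epoch_size: 4
base_lr: 1e-3
---
device_target: 'Target device type, available: [Ascend, GPU, CPU]'
dataset: "Dataset to be trained and evaluated"
train_epochs: "The number of epochs used to train."
epoch_size: "Total training epochs."
"""


def top_level_keys(document):
    """Collect keys from unindented `key: value` lines, skipping blanks and comments."""
    keys = set()
    for line in document.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or line[0].isspace():
            continue
        key, sep, _ = stripped.partition(":")
        if sep:
            keys.add(key)
    return keys


def compare_sections(text):
    """Return (help keys without an option, options without help text)."""
    config_doc, _, help_doc = text.partition("\n---\n")
    config_keys = top_level_keys(config_doc)
    help_keys = top_level_keys(help_doc)
    return sorted(help_keys - config_keys), sorted(config_keys - help_keys)


help_only, config_only = compare_sections(EXCERPT)
print("help without option:", help_only)
print("option without help:", config_only)
```

On this excerpt the check reports `train_epochs` as help without an option and `base_lr` as an option without help.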
--- a/model_zoo/official/nlp/textcnn/scripts/run_eval_cpu.sh
+++ b/model_zoo/official/nlp/textcnn/scripts/run_eval_cpu.sh
@@ -0,0 +1,19 @@
+#!/bin/bash
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+BASE_PATH=$(dirname "$(dirname "$(readlink -f $0)")")
+dataset_type='MR'
+CONFIG_FILE="${BASE_PATH}/mr_config_cpu.yaml"
+python eval.py --checkpoint_file_path="$1" --dataset=$dataset_type --config_path=$CONFIG_FILE > eval.log 2>&1 &
--- a/model_zoo/official/nlp/textcnn/scripts/run_train_cpu.sh
+++ b/model_zoo/official/nlp/textcnn/scripts/run_train_cpu.sh
@@ -0,0 +1,20 @@
+#!/bin/bash
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+BASE_PATH=$(dirname "$(dirname "$(readlink -f $0)")")
+dataset_type='MR'
+echo $BASE_PATH
+CONFIG_FILE="${BASE_PATH}/mr_config_cpu.yaml"
+python train.py --dataset=$dataset_type --config_path=$CONFIG_FILE > train.log 2>&1 &