fix resnext152 eval data path and use bash in readme

2021-08-10 17:43:38 +08:00 · e91cd800bd
parent 3b0c3e640b
commit e91cd800bd
5 changed files with 71 additions and 75 deletions
--- a/model_zoo/research/cv/resnext152_64x4d/README.md
+++ b/model_zoo/research/cv/resnext152_64x4d/README.md
@@ -37,8 +37,8 @@ The overall network architecture of ResNeXt is show below:
 Dataset used: [imagenet](http://www.image-net.org/)

 - Dataset size: ~125G, 1.2W colorful images in 1000 classes
- Train: 120G, 1.2W images
- Test: 5G, 50000 images
+    - Train: 120G, 1.2W images
+    - Test: 5G, 50000 images
 - Data format: RGB images
 - Note: Data will be processed in src/dataset.py

@@ -53,12 +53,11 @@ For FP16 operators, if the input data type is FP32, the backend of MindSpore wil
 # [Environment Requirements](#contents)

 - Hardware（Ascend）
- Prepare hardware environment with Ascend processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
 - Framework
- [MindSpore](https://www.mindspore.cn/install/en)
+    - [MindSpore](https://www.mindspore.cn/install)
 - For more information, please check the resources below：
- [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
+    - [MindSpore Tutorials](https://www.mindspore.cn/tutorials/zh-CN/master/index.html)
+    - [MindSpore Python API](https://www.mindspore.cn/docs/api/zh-CN/master/index.html)

 # [Script description](#contents)

@@ -145,18 +144,18 @@ or shell script:
 ```script
 Ascend:
    # distribute training example(8p)
-    sh run_distribute_train.sh RANK_TABLE_FILE DATA_PATH
+    bash run_distribute_train.sh RANK_TABLE_FILE DATA_PATH
    # standalone training
-    sh run_standalone_train.sh DEVICE_ID DATA_PATH
+    bash run_standalone_train.sh DEVICE_ID DATA_PATH
 ```

 #### Launch

 ```bash
 # distributed training example(8p) for Ascend
-sh scripts/run_distribute_train.sh RANK_TABLE_FILE /dataset/train
+bash scripts/run_distribute_train.sh RANK_TABLE_FILE DATA_PATH
 # standalone training example for Ascend
-sh scripts/run_standalone_train.sh 0 /dataset/train
+bash scripts/run_standalone_train.sh DEVICE_ID DATA_PATH
 ```

 You can find checkpoint file together with result in log.
@@ -175,7 +174,7 @@ or shell script:

 ```script
 # Evaluation
-sh run_eval.sh DEVICE_ID DATA_PATH PRETRAINED_CKPT_PATH PLATFORM
+bash run_eval.sh DEVICE_ID DATA_PATH PRETRAINED_CKPT_PATH PLATFORM
 ```

 PLATFORM is Ascend, default is Ascend.
@@ -184,10 +183,10 @@ PLATFORM is Ascend, default is Ascend.

 ```bash
 # Evaluation with checkpoint
-sh scripts/run_eval.sh 0 /opt/npu/datasets/classification/val /resnext152_100.ckpt Ascend
+bash scripts/run_eval.sh DEVICE_ID DATA_PATH PRETRAINED_CKPT_PATH PLATFORM

-#Directly use the script to run
-python eval.py --data_dir /opt/npu/pvc/dataset/storage/imagenet/val/ --platform Ascend --pretrained /root/test/resnext152_64x4d/outputs_demo/best_acc_4.ckpt
+# Directly use the script to run
+python eval.py --data_dir ~/imagenet/val/ --platform Ascend --pretrained ~/best_acc_4.ckpt
 ```

 #### Result
@@ -213,31 +212,31 @@ python export.py --device_target [PLATFORM] --ckpt_file [CKPT_PATH] --file_forma

 ### Training Performance

-| Parameters                 | ResNeXt152                                    |      |
-| -------------------------- | --------------------------------------------- | ---- |
-| Resource                   | Ascend 910, cpu:2.60GHz 192cores, memory:755G |      |
-| uploaded Date              | 06/30/2021                                    |      |
-| MindSpore Version          | 1.2                                           |      |
-| Dataset                    | ImageNet                                      |      |
-| Training Parameters        | src/config.py                                 |      |
-| Optimizer                  | Momentum                                      |      |
-| Loss Function              | SoftmaxCrossEntropy                           |      |
-| Loss                       | 1.28923                                       |      |
-| Accuracy                   | 80.08%(TOP1)                                  |      |
-| Total time                 | 7.8 h 8ps                                     |      |
-| Checkpoint for Fine tuning | 192 M(.ckpt file)                             |      |
+| Parameters                 | ResNeXt152                                    |
+| -------------------------- | --------------------------------------------- |
+| Resource                   | Ascend 910, cpu:2.60GHz 192cores, memory:755G |
+| uploaded Date              | 06/30/2021                                    |
+| MindSpore Version          | 1.2                                           |
+| Dataset                    | ImageNet                                      |
+| Training Parameters        | src/config.py                                 |
+| Optimizer                  | Momentum                                      |
+| Loss Function              | SoftmaxCrossEntropy                           |
+| Loss                       | 1.28923                                       |
+| Accuracy                   | 80.08%(TOP1)                                  |
+| Total time                 | 7.8 h 8ps                                     |
+| Checkpoint for Fine tuning | 192 M(.ckpt file)                             |

 #### Inference Performance

-| Parameters        |      |      |                  |
-| ----------------- | ---- | ---- | ---------------- |
-| Resource          |      |      | Ascend 910       |
-| uploaded Date     |      |      | 06/20/2021       |
-| MindSpore Version |      |      | 1.2              |
-| Dataset           |      |      | ImageNet, 1.2W   |
-| batch_size        |      |      | 1                |
-| outputs           |      |      | probability      |
-| Accuracy          |      |      | acc=80.08%(TOP1) |
+| Parameters        |                  |
+| ----------------- | ---------------- |
+| Resource          | Ascend 910       |
+| uploaded Date     | 06/20/2021       |
+| MindSpore Version | 1.2              |
+| Dataset           | ImageNet, 1.2W   |
+| batch_size        | 1                |
+| outputs           | probability      |
+| Accuracy          | acc=80.08%(TOP1) |

 # [Description of Random Situation](#contents)

--- a/model_zoo/research/cv/resnext152_64x4d/README_CN.md
+++ b/model_zoo/research/cv/resnext152_64x4d/README_CN.md
@@ -57,13 +57,11 @@ The overall network architecture of ResNeXt is shown below:

 # Environment Requirements

- Hardware (Ascend)
-    - Prepare the hardware environment with Ascend processors. To try Ascend, send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
 - Framework
     - [MindSpore](https://www.mindspore.cn/install)
 - For more information, see the following resources:
-    - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html)
-    - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html)
+    - [MindSpore Tutorials](https://www.mindspore.cn/tutorials/zh-CN/master/index.html)
+    - [MindSpore Python API](https://www.mindspore.cn/docs/api/zh-CN/master/index.html)

 # Script Description

@@ -149,18 +147,18 @@ python train.py --data_dir ~/imagenet/train/ --platform Ascend --is_distributed
 ```shell
 Ascend:
     # distributed training example (8 devices)
-    sh run_distribute_train.sh RANK_TABLE_FILE DATA_PATH
+    bash run_distribute_train.sh RANK_TABLE_FILE DATA_PATH
     # standalone training
-    sh run_standalone_train.sh DEVICE_ID DATA_PATH
+    bash run_standalone_train.sh DEVICE_ID DATA_PATH
 ```

 ### Example

 ```shell
 # distributed training example (8 devices) for Ascend
-sh scripts/run_distribute_train.sh RANK_TABLE_FILE /dataset/train
+bash scripts/run_distribute_train.sh RANK_TABLE_FILE DATA_PATH
 # standalone training example for Ascend
-sh scripts/run_standalone_train.sh 0 /dataset/train
+bash scripts/run_standalone_train.sh DEVICE_ID DATA_PATH
 ```

 You can find the checkpoint file and the results in the log.
@@ -179,7 +177,7 @@ python eval.py --data_dir ~/imagenet/val/ --platform Ascend --pretrained resnext

 ```shell
 # Evaluation
-sh run_eval.sh DEVICE_ID DATA_PATH PRETRAINED_CKPT_PATH PLATFORM
+bash run_eval.sh DEVICE_ID DATA_PATH PRETRAINED_CKPT_PATH PLATFORM
 ```

 PLATFORM is Ascend, default is Ascend.
@@ -188,10 +186,10 @@ PLATFORM is Ascend, default is Ascend.

 ```shell
 # Evaluation with checkpoint
-sh scripts/run_eval.sh 0 /opt/npu/datasets/classification/val /resnext152_100.ckpt Ascend
+bash scripts/run_eval.sh DEVICE_ID DATA_PATH PRETRAINED_CKPT_PATH PLATFORM

 # Or run the script directly
-python eval.py --data_dir /opt/npu/pvc/dataset/storage/imagenet/val/ --platform Ascend --pretrained /root/test/resnext152_64x4d/outputs_demo/best_acc_0.ckpt
+python eval.py --data_dir ~/imagenet/val/ --platform Ascend --pretrained ~/best_acc_0.ckpt
 ```

 #### Result
@@ -217,31 +215,31 @@ python export.py --device_target [PLATFORM] --ckpt_file [CKPT_PATH] --file_forma

 ### Training Performance

-| Parameters | ResNeXt152 | |
-| -------------------------- | ---------------------------------------------------------- | ------------------------- |
-| Resource | Ascend 910; CPU: 2.60GHz, 192 cores; memory: 755GB | |
-| Upload date | 2021-6-30 | |
-| MindSpore version | 1.2 | |
-| Dataset | ImageNet | |
-| Training parameters | src/config.py | |
-| Optimizer | Momentum | |
-| Loss function | Softmax cross entropy | |
-| Loss | 1.2892 | |
-| Accuracy | 80.08%(TOP1) | |
-| Total time | 7.8 hours (8 devices) | |
-| Checkpoint for fine-tuning | 192 M (.ckpt file) | |
+| Parameters | ResNeXt152 |
+| -------------------------- | ---------------------------------------------------------- |
+| Resource | Ascend 910; CPU: 2.60GHz, 192 cores; memory: 755GB |
+| Upload date | 2021-6-30 |
+| MindSpore version | 1.2 |
+| Dataset | ImageNet |
+| Training parameters | src/config.py |
+| Optimizer | Momentum |
+| Loss function | Softmax cross entropy |
+| Loss | 1.2892 |
+| Accuracy | 80.08%(TOP1) |
+| Total time | 7.8 hours (8 devices) |
+| Checkpoint for fine-tuning | 192 M (.ckpt file) |

 #### Inference Performance

-| Parameters | | | |
-| -------------------------- | ----------------------------- | ------------------------- | -------------------- |
-| Resource | | | Ascend 910 |
-| Upload date | | | 2021-6-20 |
-| MindSpore version | | | 1.2 |
-| Dataset | | | ImageNet, 12,000 |
-| batch_size | | | 1 |
-| Outputs | | | probability |
-| Accuracy | | | acc=80.08%(TOP1) |
+| Parameters | |
+| -------------------------- | -------------------- |
+| Resource | Ascend 910 |
+| Upload date | 2021-6-20 |
+| MindSpore version | 1.2 |
+| Dataset | ImageNet, 12,000 |
+| batch_size | 1 |
+| Outputs | probability |
+| Accuracy | acc=80.08%(TOP1) |

 # Description of Random Situation

--- a/model_zoo/research/cv/resnext152_64x4d/scripts/run_distribute_train.sh
+++ b/model_zoo/research/cv/resnext152_64x4d/scripts/run_distribute_train.sh
@@ -52,6 +52,7 @@ do
    --is_distribute=1 \
    --device_id=$DEVICE_ID \
    --pretrained=$PATH_CHECKPOINT \
-    --data_dir=$DATA_DIR > log_less.txt 2>&1 &
+    --data_dir=$DATA_DIR \
+    --run_eval=False > log_less.txt 2>&1 &
    cd ../
 done
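
Review note: both launch scripts now pass `--run_eval=False`, but how `train.py` declares that flag is not shown in this diff. If it were declared with `type=bool`, argparse would call `bool("False")`, which is `True`, and evaluation would stay enabled. A minimal sketch of a string-to-boolean converter, assuming a hypothetical `str2bool` helper and an assumed `default=True`, neither of which is taken from the commit:

```python
import argparse


def str2bool(value):
    """Hypothetical helper: map common true/false spellings to a bool."""
    if isinstance(value, bool):
        return value
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(
        "expected a boolean, got {!r}".format(value))


def build_parser():
    parser = argparse.ArgumentParser()
    # Assumed declaration; the real one in train.py is not part of this diff.
    parser.add_argument('--run_eval', type=str2bool, default=True,
                        help='whether to evaluate during training')
    return parser


if __name__ == '__main__':
    args = build_parser().parse_args(['--run_eval=False'])
    print(args.run_eval)
```

With a converter like this, the literal string `False` passed by the shell scripts turns evaluation off instead of being treated as a non-empty, truthy string.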
--- a/model_zoo/research/cv/resnext152_64x4d/scripts/run_standalone_train.sh
+++ b/model_zoo/research/cv/resnext152_64x4d/scripts/run_standalone_train.sh
@@ -26,5 +26,6 @@ python train.py  \
    --is_distribute=0 \
    --device_id=$DEVICE_ID \
    --pretrained=$PATH_CHECKPOINT \
-    --data_dir=$DATA_DIR > log.txt 2>&1 &
+    --data_dir=$DATA_DIR \
+    --run_eval=False > log.txt 2>&1 &

--- a/model_zoo/research/cv/resnext152_64x4d/train.py
+++ b/model_zoo/research/cv/resnext152_64x4d/train.py
@@ -146,7 +146,7 @@ def parse_args(cloud_args=None):
    #dataset of eval dataset
    parser.add_argument('--eval_data_dir',
                        type=str,
-                        default='/opt/npu/pvc/dataset/storage/imagenet/val',
+                        default='',
                        help='eval data dir')
    parser.add_argument('--eval_per_batch_size',
                        default=32,
@@ -289,9 +289,6 @@ def train(cloud_args=None):
    # checkpoint save
    progress_cb = ProgressMonitor(args)
    callbacks = [progress_cb,]
-    #eval dataset
-    if args.eval_data_dir is None or (not os.path.isdir(args.eval_data_dir)):
-        raise ValueError("{} is not a existing path.".format(args.eval_data_dir))
    #code like eval.py
    #if run eval
    if args.run_eval:
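
Review note: with the default of `--eval_data_dir` now empty and the unconditional check removed, this hunk no longer validates the eval path at all. One option is to validate it only when `run_eval` is set. The sketch below assumes a hypothetical `check_eval_data_dir` helper that is not part of the commit:

```python
import os


def check_eval_data_dir(run_eval, eval_data_dir):
    """Hypothetical helper: validate the eval path only when evaluation is on."""
    if not run_eval:
        # Training-only runs do not need an eval dataset.
        return None
    if not eval_data_dir or not os.path.isdir(eval_data_dir):
        raise ValueError(
            "{} is not an existing path.".format(eval_data_dir))
    return eval_data_dir


if __name__ == '__main__':
    # Evaluation disabled: an empty path is accepted.
    print(check_eval_data_dir(False, ''))
    # Evaluation enabled: the path must be an existing directory.
    print(check_eval_data_dir(True, os.getcwd()))
```

Called inside the `if args.run_eval:` branch, a check of this shape would keep training-only runs from failing on a missing eval directory while still reporting a bad path early when evaluation is requested.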