forked from mindspore-Ecosystem/mindspore
fix resnext152 eval data path and use bash in readme
parent 3b0c3e640b
commit e91cd800bd
@@ -37,8 +37,8 @@ The overall network architecture of ResNeXt is show below:
 Dataset used: [imagenet](http://www.image-net.org/)
 
 - Dataset size: ~125G, 1.2W colorful images in 1000 classes
-- Train: 120G, 1.2W images
-- Test: 5G, 50000 images
+- Train: 120G, 1.2W images
+- Test: 5G, 50000 images
 - Data format: RGB images
 - Note: Data will be processed in src/dataset.py
 
@@ -53,12 +53,11 @@ For FP16 operators, if the input data type is FP32, the backend of MindSpore wil
 # [Environment Requirements](#contents)
 
 - Hardware(Ascend)
 - Prepare hardware environment with Ascend processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
 - Framework
-- [MindSpore](https://www.mindspore.cn/install/en)
+- [MindSpore](https://www.mindspore.cn/install)
 - For more information, please check the resources below:
-- [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
-- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
+- [MindSpore Tutorials](https://www.mindspore.cn/tutorials/zh-CN/master/index.html)
+- [MindSpore Python API](https://www.mindspore.cn/docs/api/zh-CN/master/index.html)
 
 # [Script description](#contents)
 
@@ -145,18 +144,18 @@ or shell script:
 ```script
 Ascend:
 # distribute training example(8p)
-sh run_distribute_train.sh RANK_TABLE_FILE DATA_PATH
+bash run_distribute_train.sh RANK_TABLE_FILE DATA_PATH
 # standalone training
-sh run_standalone_train.sh DEVICE_ID DATA_PATH
+bash run_standalone_train.sh DEVICE_ID DATA_PATH
 ```
 
 #### Launch
 
 ```bash
 # distributed training example(8p) for Ascend
-sh scripts/run_distribute_train.sh RANK_TABLE_FILE /dataset/train
+bash scripts/run_distribute_train.sh RANK_TABLE_FILE DATA_PATH
 # standalone training example for Ascend
-sh scripts/run_standalone_train.sh 0 /dataset/train
+bash scripts/run_standalone_train.sh DEVICE_ID DATA_PATH
 ```
 
 You can find checkpoint file together with result in log.
@@ -175,7 +174,7 @@ or shell script:
 
 ```script
 # Evaluation
-sh run_eval.sh DEVICE_ID DATA_PATH PRETRAINED_CKPT_PATH PLATFORM
+bash run_eval.sh DEVICE_ID DATA_PATH PRETRAINED_CKPT_PATH PLATFORM
 ```
 
 PLATFORM is Ascend, default is Ascend.
@@ -184,10 +183,10 @@ PLATFORM is Ascend, default is Ascend.
 
 ```bash
 # Evaluation with checkpoint
-sh scripts/run_eval.sh 0 /opt/npu/datasets/classification/val /resnext152_100.ckpt Ascend
+bash scripts/run_eval.sh DEVICE_ID PRETRAINED_CKPT_PATH PLATFORM
 
-#Directly use the script to run
-python eval.py --data_dir /opt/npu/pvc/dataset/storage/imagenet/val/ --platform Ascend --pretrained /root/test/resnext152_64x4d/outputs_demo/best_acc_4.ckpt
+# Directly use the script to run
+python eval.py --data_dir ~/imagenet/val/ --platform Ascend --pretrained ~/best_acc_4.ckpt
 ```
 
 #### Result
@@ -213,31 +212,31 @@ python export.py --device_target [PLATFORM] --ckpt_file [CKPT_PATH] --file_forma
 
 ### Training Performance
 
-| Parameters | ResNeXt152 | |
-| -------------------------- | --------------------------------------------- | ---- |
-| Resource | Ascend 910, cpu:2.60GHz 192cores, memory:755G | |
-| uploaded Date | 06/30/2021 | |
-| MindSpore Version | 1.2 | |
-| Dataset | ImageNet | |
-| Training Parameters | src/config.py | |
-| Optimizer | Momentum | |
-| Loss Function | SoftmaxCrossEntropy | |
-| Loss | 1.28923 | |
-| Accuracy | 80.08%(TOP1) | |
-| Total time | 7.8 h 8ps | |
-| Checkpoint for Fine tuning | 192 M(.ckpt file) | |
+| Parameters | ResNeXt152 |
+| -------------------------- | --------------------------------------------- |
+| Resource | Ascend 910, cpu:2.60GHz 192cores, memory:755G |
+| uploaded Date | 06/30/2021 |
+| MindSpore Version | 1.2 |
+| Dataset | ImageNet |
+| Training Parameters | src/config.py |
+| Optimizer | Momentum |
+| Loss Function | SoftmaxCrossEntropy |
+| Loss | 1.28923 |
+| Accuracy | 80.08%(TOP1) |
+| Total time | 7.8 h 8ps |
+| Checkpoint for Fine tuning | 192 M(.ckpt file) |
 
 #### Inference Performance
 
-| Parameters | | | |
-| ----------------- | ---- | ---- | ---------------- |
-| Resource | | | Ascend 910 |
-| uploaded Date | | | 06/20/2021 |
-| MindSpore Version | | | 1.2 |
-| Dataset | | | ImageNet, 1.2W |
-| batch_size | | | 1 |
-| outputs | | | probability |
-| Accuracy | | | acc=80.08%(TOP1) |
+| Parameters | |
+| ----------------- | ---------------- |
+| Resource | Ascend 910 |
+| uploaded Date | 06/20/2021 |
+| MindSpore Version | 1.2 |
+| Dataset | ImageNet, 1.2W |
+| batch_size | 1 |
+| outputs | probability |
+| Accuracy | acc=80.08%(TOP1) |
 
 # [Description of Random Situation](#contents)
 
@@ -57,13 +57,11 @@ The overall network architecture of ResNeXt is shown below:
 
 # Environment Requirements
 
 - Hardware (Ascend)
 - Prepare a hardware environment with Ascend processors. To try Ascend processors, send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Resources are granted once the application is approved.
 - Framework
 - [MindSpore](https://www.mindspore.cn/install)
 - For more information, see the following resources:
-- [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html)
-- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html)
+- [MindSpore Tutorials](https://www.mindspore.cn/tutorials/zh-CN/master/index.html)
+- [MindSpore Python API](https://www.mindspore.cn/docs/api/zh-CN/master/index.html)
 
 # Script Description
 
@@ -149,18 +147,18 @@ python train.py --data_dir ~/imagenet/train/ --platform Ascend --is_distributed
 ```shell
 Ascend:
 # distributed training example (8 devices)
-sh run_distribute_train.sh RANK_TABLE_FILE DATA_PATH
+bash run_distribute_train.sh RANK_TABLE_FILE DATA_PATH
 # standalone training
-sh run_standalone_train.sh DEVICE_ID DATA_PATH
+bash run_standalone_train.sh DEVICE_ID DATA_PATH
 ```
 
 ### Example
 
 ```shell
 # distributed training example (8 devices) on Ascend
-sh scripts/run_distribute_train.sh RANK_TABLE_FILE /dataset/train
+bash scripts/run_distribute_train.sh RANK_TABLE_FILE DATA_PATH
 # standalone training example on Ascend
-sh scripts/run_standalone_train.sh 0 /dataset/train
+bash scripts/run_standalone_train.sh DEVICE_ID DATA_PATH
 ```
 
 You can find the checkpoint files and results in the log.
@@ -179,7 +177,7 @@ python eval.py --data_dir ~/imagenet/val/ --platform Ascend --pretrained resnext
 
 ```shell
 # Evaluation
-sh run_eval.sh DEVICE_ID DATA_PATH PRETRAINED_CKPT_PATH PLATFORM
+bash run_eval.sh DEVICE_ID DATA_PATH PRETRAINED_CKPT_PATH PLATFORM
 ```
 
 PLATFORM is Ascend, default is Ascend.
@@ -188,10 +186,10 @@ PLATFORM is Ascend, default is Ascend.
 
 ```shell
 # Evaluation with a checkpoint
-sh scripts/run_eval.sh 0 /opt/npu/datasets/classification/val /resnext152_100.ckpt Ascend
+bash scripts/run_eval.sh DEVICE_ID PRETRAINED_CKPT_PATH PLATFORM
 
 # Or run the script directly
-python eval.py --data_dir /opt/npu/pvc/dataset/storage/imagenet/val/ --platform Ascend --pretrained /root/test/resnext152_64x4d/outputs_demo/best_acc_0.ckpt
+python eval.py --data_dir ~/imagenet/val/ --platform Ascend --pretrained ~/best_acc_0.ckpt
 ```
 
 #### Result
@@ -217,31 +215,31 @@ python export.py --device_target [PLATFORM] --ckpt_file [CKPT_PATH] --file_forma
 
 ### Training Performance
 
-| Parameter | ResNeXt152 | |
-| -------------------------- | ---------------------------------------------------------- | ------------------------- |
-| Resource | Ascend 910; CPU: 2.60GHz, 192 cores; memory: 755GB | |
-| Upload date | 2021-6-30 | |
-| MindSpore version | 1.2 | |
-| Dataset | ImageNet | |
-| Training parameters | src/config.py | |
-| Optimizer | Momentum | |
-| Loss function | Softmax cross-entropy | |
-| Loss | 1.2892 | |
-| Accuracy | 80.08% (TOP1) | |
-| Total time | 7.8 hours (8 devices) | |
-| Checkpoint for fine-tuning | 192 M (.ckpt file) | |
+| Parameter | ResNeXt152 |
+| -------------------------- | ---------------------------------------------------------- |
+| Resource | Ascend 910; CPU: 2.60GHz, 192 cores; memory: 755GB |
+| Upload date | 2021-6-30 |
+| MindSpore version | 1.2 |
+| Dataset | ImageNet |
+| Training parameters | src/config.py |
+| Optimizer | Momentum |
+| Loss function | Softmax cross-entropy |
+| Loss | 1.2892 |
+| Accuracy | 80.08% (TOP1) |
+| Total time | 7.8 hours (8 devices) |
+| Checkpoint for fine-tuning | 192 M (.ckpt file) |
 
 #### Inference Performance
 
-| Parameter | | | |
-| -------------------------- | ----------------------------- | ------------------------- | -------------------- |
-| Resource | | | Ascend 910 |
-| Upload date | | | 2021-6-20 |
-| MindSpore version | | | 1.2 |
-| Dataset | | | ImageNet, 12,000 |
-| batch_size | | | 1 |
-| Output | | | probability |
-| Accuracy | | | acc=80.08% (TOP1) |
+| Parameter | |
+| -------------------------- | -------------------- |
+| Resource | Ascend 910 |
+| Upload date | 2021-6-20 |
+| MindSpore version | 1.2 |
+| Dataset | ImageNet, 12,000 |
+| batch_size | 1 |
+| Output | probability |
+| Accuracy | acc=80.08% (TOP1) |
 
 # Description of Random Situation
 
@@ -52,6 +52,7 @@ do
         --is_distribute=1 \
         --device_id=$DEVICE_ID \
         --pretrained=$PATH_CHECKPOINT \
-        --data_dir=$DATA_DIR > log_less.txt 2>&1 &
+        --data_dir=$DATA_DIR \
+        --run_eval=False > log_less.txt 2>&1 &
     cd ../
 done
@@ -26,5 +26,6 @@ python train.py \
     --is_distribute=0 \
     --device_id=$DEVICE_ID \
     --pretrained=$PATH_CHECKPOINT \
-    --data_dir=$DATA_DIR > log.txt 2>&1 &
+    --data_dir=$DATA_DIR \
+    --run_eval=False > log.txt 2>&1 &
 
@@ -146,7 +146,7 @@ def parse_args(cloud_args=None):
     #dataset of eval dataset
     parser.add_argument('--eval_data_dir',
                         type=str,
-                        default='/opt/npu/pvc/dataset/storage/imagenet/val',
+                        default='',
                         help='eval data dir')
     parser.add_argument('--eval_per_batch_size',
                         default=32,
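Review note: the launch scripts in this commit pass `--run_eval=False` as a string, so the flag needs a parser that turns the text `False` into a real boolean. The sketch below is a trimmed, hypothetical stand-in for `parse_args`. Only `--eval_data_dir` and `--eval_per_batch_size` come from the hunk above. The `type=int` on the batch size and the use of `ast.literal_eval` for `--run_eval` are assumptions, since the diff does not show how the repository declares them.

```python
import argparse
import ast


def parse_args(argv=None):
    """Hypothetical, trimmed-down parser (not the repository's parse_args)."""
    parser = argparse.ArgumentParser()
    # Empty default: no machine-specific path is baked into the script.
    parser.add_argument('--eval_data_dir', type=str, default='',
                        help='eval data dir')
    # type=int is an assumption; the hunk only shows default=32.
    parser.add_argument('--eval_per_batch_size', type=int, default=32,
                        help='eval batch size')
    # Assumption: literal_eval maps the strings "True"/"False" to booleans.
    # A plain type=bool would not work, because bool("False") is True.
    parser.add_argument('--run_eval', type=ast.literal_eval, default=False,
                        help='whether to evaluate during training')
    return parser.parse_args(argv)


if __name__ == '__main__':
    args = parse_args(['--run_eval=False'])
    print(args.run_eval, repr(args.eval_data_dir))
```

With an empty default, a run that never evaluates does not need a validation directory at all, which matches the `--run_eval=False` additions in the two shell scripts.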
@@ -289,9 +289,6 @@ def train(cloud_args=None):
     # checkpoint save
     progress_cb = ProgressMonitor(args)
     callbacks = [progress_cb,]
-    #eval dataset
-    if args.eval_data_dir is None or (not os.path.isdir(args.eval_data_dir)):
-        raise ValueError("{} is not a existing path.".format(args.eval_data_dir))
     #code like eval.py
     #if run eval
     if args.run_eval:
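Review note: the hunk above touches the unconditional `eval_data_dir` check that sits ahead of `if args.run_eval:`. With the new empty default, an unconditional check would reject every training run that does not supply a validation set. One possible arrangement, shown here only as a sketch, is to validate the directory when evaluation is actually requested. `check_eval_dir` is a hypothetical helper name, and `SimpleNamespace` stands in for the parsed arguments.

```python
import os
import tempfile
from types import SimpleNamespace


def check_eval_dir(args):
    """Validate args.eval_data_dir only when evaluation is requested.

    Hypothetical helper, not part of the repository.
    Returns the directory when run_eval is set, otherwise None.
    """
    if not args.run_eval:
        return None
    # An empty string (the new default) is treated like a missing path.
    if not args.eval_data_dir or not os.path.isdir(args.eval_data_dir):
        raise ValueError(
            "{} is not an existing path.".format(args.eval_data_dir))
    return args.eval_data_dir


if __name__ == '__main__':
    # Training only: no eval directory required.
    print(check_eval_dir(SimpleNamespace(run_eval=False, eval_data_dir='')))
    # Training with evaluation: the directory must exist.
    with tempfile.TemporaryDirectory() as val_dir:
        print(check_eval_dir(SimpleNamespace(run_eval=True,
                                             eval_data_dir=val_dir)))
```

Under this arrangement a typo in `--eval_data_dir` still fails fast, but only for runs that would read the directory.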