forked from mindspore-Ecosystem/mindspore
!19155 export on modelarts
Merge pull request !19155 from huchunmei/export1
commit
8ecfc11dbf
|
@ -71,6 +71,109 @@ sh run_standalone_train_ascend.sh [DATA_PATH] [CKPT_SAVE_PATH]
|
|||
sh run_standalone_eval_ascend.sh [DATA_PATH] [CKPT_NAME]
|
||||
```
|
||||
|
||||
- Running on [ModelArts](https://support.huaweicloud.com/modelarts/)
|
||||
|
||||
```bash
|
||||
# Train 8p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "distribute=True" on default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" on default_config.yaml file.
|
||||
# Set "epoch_size: 30" on default_config.yaml file.
|
||||
# (optional)Set "checkpoint_url='s3://dir_to_your_pretrained/'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "distribute=True" on the website UI interface.
|
||||
# Add "dataset_path='/cache/data'" on the website UI interface.
|
||||
# Add "epoch_size: 30" on the website UI interface.
|
||||
# (optional)Add "checkpoint_url='s3://dir_to_your_pretrained/'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your pretrained model to S3 bucket if you want to finetune.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, zip MindRecord dataset to one zip file.
|
||||
# Second, upload your zip dataset to S3 bucket.
|
||||
# b. Upload the original dataset to S3 bucket.
|
||||
# (Dataset conversion then occurs during training and takes a long time. It happens every time you train.)
|
||||
# (5) Set the code directory to "/path/alexnet" on the website UI interface.
|
||||
# (6) Set the startup file to "train.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
#
|
||||
# Train 1p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" on default_config.yaml file.
|
||||
# Set "epoch_size: 30" on default_config.yaml file.
|
||||
# (optional)Set "checkpoint_url='s3://dir_to_your_pretrained/'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "dataset_path='/cache/data'" on the website UI interface.
|
||||
# Add "epoch_size: 30" on the website UI interface.
|
||||
# (optional)Add "checkpoint_url='s3://dir_to_your_pretrained/'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your pretrained model to S3 bucket if you want to finetune.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, zip MindRecord dataset to one zip file.
|
||||
# Second, upload your zip dataset to S3 bucket.
|
||||
# b. Upload the original dataset to S3 bucket.
|
||||
# (Dataset conversion then occurs during training and takes a long time. It happens every time you train.)
|
||||
# (5) Set the code directory to "/path/alexnet" on the website UI interface.
|
||||
# (6) Set the startup file to "train.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
#
|
||||
# Eval 1p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "checkpoint_url='s3://dir_to_your_trained_model/'" on default_config.yaml file.
|
||||
# Set "checkpoint='./alexnet/alexnet_trained.ckpt'" on default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "checkpoint_url='s3://dir_to_your_trained_model/'" on the website UI interface.
|
||||
# Add "checkpoint='./alexnet/alexnet_trained.ckpt'" on the website UI interface.
|
||||
# Add "dataset_path='/cache/data'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your trained model to S3 bucket.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, zip MindRecord dataset to one zip file.
|
||||
# Second, upload your zip dataset to S3 bucket.
|
||||
# b. Upload the original dataset to S3 bucket.
|
||||
# (Dataset conversion then occurs during evaluation and takes a long time. It happens every time you evaluate.)
|
||||
# (5) Set the code directory to "/path/alexnet" on the website UI interface.
|
||||
# (6) Set the startup file to "eval.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
```
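
Step (4) a above asks for the MindRecord dataset to be packed into a single zip file before it is uploaded. A minimal sketch of that step using only the Python standard library; the helper name `zip_dataset` and the directory layout are assumptions for illustration and are not part of the repository:

```python
import os
import zipfile


def zip_dataset(src_dir, zip_path):
    """Pack every file under src_dir into one zip archive (hypothetical helper)."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                full_path = os.path.join(root, name)
                # Store paths relative to src_dir so the archive unpacks cleanly.
                archive.write(full_path, os.path.relpath(full_path, src_dir))
    return zip_path
```

Write the archive outside `src_dir`, otherwise the walk may pick up the partially written zip file itself.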
|
||||
|
||||
- Export on ModelArts (If you want to run on ModelArts, please check the official documentation of [modelarts](https://support.huaweicloud.com/modelarts/); you can start exporting as follows)
|
||||
|
||||
1. Export the model on ModelArts. The export steps are as follows:
|
||||
|
||||
```python
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on base_config.yaml file.
|
||||
# Set "file_name='alexnet'" on base_config.yaml file.
|
||||
# Set "file_format='AIR'" on base_config.yaml file.
|
||||
# Set "checkpoint_url='/The path of checkpoint in S3/'" on base_config.yaml file.
|
||||
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on base_config.yaml file.
|
||||
# Set other parameters on base_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "file_name='alexnet'" on the website UI interface.
|
||||
# Add "file_format='AIR'" on the website UI interface.
|
||||
# Add "checkpoint_url='/The path of checkpoint in S3/'" on the website UI interface.
|
||||
# Add "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Upload or copy your trained model to S3 bucket.
|
||||
# (3) Set the code directory to "/path/alexnet" on the website UI interface.
|
||||
# (4) Set the startup file to "export.py" on the website UI interface.
|
||||
# (5) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (6) Create your job.
|
||||
```
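
The export settings listed above can be sanity-checked before the job is created. The sketch below is illustrative only: `check_export_settings` is a hypothetical helper, and the set of accepted formats assumes the three values documented for MindSpore's `export` (`AIR`, `ONNX`, `MINDIR`).

```python
# Assumption: the file formats accepted by mindspore.export.
SUPPORTED_FORMATS = {"AIR", "ONNX", "MINDIR"}
REQUIRED_KEYS = ("enable_modelarts", "file_name", "file_format", "checkpoint_url", "ckpt_file")


def check_export_settings(settings):
    """Return a list of problems found in an export configuration dict."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in settings]
    file_format = settings.get("file_format")
    if file_format is not None and file_format not in SUPPORTED_FORMATS:
        problems.append(f"unsupported file_format: {file_format}")
    if not str(settings.get("ckpt_file", "")).endswith(".ckpt"):
        problems.append("ckpt_file should point to a .ckpt file")
    return problems


settings = {
    "enable_modelarts": True,
    "file_name": "alexnet",
    "file_format": "AIR",
    "checkpoint_url": "/The path of checkpoint in S3/",
    "ckpt_file": "/cache/checkpoint_path/model.ckpt",
}
print(check_export_settings(settings))  # prints []
```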
|
||||
|
||||
## [Script Description](#contents)
|
||||
|
||||
### [Script and Sample Code](#contents)
|
||||
|
|
|
@ -139,6 +139,31 @@ sh run_standalone_eval_ascend.sh [DATA_PATH] [CKPT_NAME]
|
|||
# (8) Create the training job.
```

- Export on ModelArts (if you want to run on ModelArts, refer to the [modelarts](https://support.huaweicloud.com/modelarts/) documentation)

1. Export the model on ModelArts. The export steps are as follows:

```python
# (1) Perform a or b.
# a. Set "enable_modelarts=True" in the base_config.yaml file.
# Set "file_name='alexnet'" in the base_config.yaml file.
# Set "file_format='AIR'" in the base_config.yaml file.
# Set "checkpoint_url='/The path of checkpoint in S3/'" in the base_config.yaml file.
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" in the base_config.yaml file.
# Set any other parameters you need in the base_config.yaml file.
# b. Set "enable_modelarts=True" on the web page.
# Set "file_name='alexnet'" on the web page.
# Set "file_format='AIR'" on the web page.
# Set "checkpoint_url='/The path of checkpoint in S3/'" on the web page.
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the web page.
# Set any other parameters on the web page.
# (2) Upload your trained model to the S3 bucket.
# (3) Set the code directory to "/path/alexnet" on the web page.
# (4) Set the startup file to "export.py" on the web page.
# (5) Set the "Dataset path", "Output file path", and "Job log path" on the web page.
# (6) Create the training job.
```

## Script Description

### Script and Sample Code
|
@ -18,6 +18,7 @@ python export.py
 """
 
 from src.model_utils.config import config
+from src.model_utils.moxing_adapter import moxing_wrapper
 from src.alexnet import AlexNet
 
 import numpy as np
@ -29,7 +30,12 @@ context.set_context(mode=context.GRAPH_MODE, device_target=config.device_target)
 if config.device_target == "Ascend":
     context.set_context(device_id=config.device_id)
 
-if __name__ == '__main__':
+def modelarts_process():
+    pass
+
+@moxing_wrapper(pre_process=modelarts_process)
+def export_alexnet():
+    """ export_alexnet """
     if config.dataset_name == 'imagenet':
         net = AlexNet(num_classes=config.num_classes)
         param_dict = load_checkpoint(config.ckpt_file)
@ -42,4 +48,6 @@ if __name__ == '__main__':
         load_param_into_net(net, param_dict)
         input_arr = Tensor(np.zeros([config.batch_size, 3, config.image_height, config.image_width]), ms.float32)
         export(net, input_arr, file_name=config.file_name, file_format=config.file_format)
-
+
+if __name__ == '__main__':
+    export_alexnet()
|
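
The change above moves the export logic into a function decorated with `moxing_wrapper(pre_process=...)`. The `moxing_adapter` module itself is not shown in this diff, so the following is only a sketch of the general pattern: a decorator factory that runs an optional hook before and after the wrapped function. `hook_wrapper` and its parameters are hypothetical names, and the real wrapper presumably does more (for example, copying data to and from ModelArts storage).

```python
import functools


def hook_wrapper(pre_process=None, post_process=None):
    """Hypothetical stand-in for a pre/post-process decorator factory."""
    def decorator(run_func):
        @functools.wraps(run_func)
        def wrapped(*args, **kwargs):
            if pre_process is not None:
                pre_process()        # runs before the wrapped function
            result = run_func(*args, **kwargs)
            if post_process is not None:
                post_process()       # runs after the wrapped function
            return result
        return wrapped
    return decorator


calls = []


@hook_wrapper(pre_process=lambda: calls.append("pre"))
def export_model():
    calls.append("export")
    return "done"
```

Because the hooks are optional, a no-op `modelarts_process` like the one in the diff leaves local runs unchanged while keeping a single entry point for both environments.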
@ -158,6 +158,31 @@ For FP16 operators, if the input data type is FP32, the backend of MindSpore wil
|
|||
# (8) Create your job.
|
||||
```
|
||||
|
||||
- Export on ModelArts (If you want to run on ModelArts, please check the official documentation of [modelarts](https://support.huaweicloud.com/modelarts/); you can start exporting as follows)
|
||||
|
||||
1. Export the model on ModelArts. The export steps are as follows:
|
||||
|
||||
```python
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on base_config.yaml file.
|
||||
# Set "file_name='inceptionv3'" on base_config.yaml file.
|
||||
# Set "file_format='AIR'" on base_config.yaml file.
|
||||
# Set "checkpoint_url='/The path of checkpoint in S3/'" on base_config.yaml file.
|
||||
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on base_config.yaml file.
|
||||
# Set other parameters on base_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "file_name='inceptionv3'" on the website UI interface.
|
||||
# Add "file_format='AIR'" on the website UI interface.
|
||||
# Add "checkpoint_url='/The path of checkpoint in S3/'" on the website UI interface.
|
||||
# Add "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Upload or copy your trained model to S3 bucket.
|
||||
# (3) Set the code directory to "/path/inceptionv3" on the website UI interface.
|
||||
# (4) Set the startup file to "export.py" on the website UI interface.
|
||||
# (5) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (6) Create your job.
|
||||
```
|
||||
|
||||
# [Script description](#contents)
|
||||
|
||||
## [Script and sample code](#contents)
|
||||
|
|
|
@ -79,7 +79,7 @@ The overall network architecture of InceptionV3 is as follows:
 
 - Train on ModelArts (if you want to run on ModelArts, refer to the [modelarts](https://support.huaweicloud.com/modelarts/) documentation)
 
-```bash
+```python
 # Train with 8 devices on ModelArts
 # (1) Perform a or b.
 # a. Set "enable_modelarts=True" in the default_config.yaml file.
|
@ -169,6 +169,31 @@ The overall network architecture of InceptionV3 is as follows:
# (8) Create the training job.
```

- Export on ModelArts (if you want to run on ModelArts, refer to the [modelarts](https://support.huaweicloud.com/modelarts/) documentation)

1. Export the model on ModelArts. The export steps are as follows:

```python
# (1) Perform a or b.
# a. Set "enable_modelarts=True" in the base_config.yaml file.
# Set "file_name='inceptionv3'" in the base_config.yaml file.
# Set "file_format='AIR'" in the base_config.yaml file.
# Set "checkpoint_url='/The path of checkpoint in S3/'" in the base_config.yaml file.
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" in the base_config.yaml file.
# Set any other parameters you need in the base_config.yaml file.
# b. Set "enable_modelarts=True" on the web page.
# Set "file_name='inceptionv3'" on the web page.
# Set "file_format='AIR'" on the web page.
# Set "checkpoint_url='/The path of checkpoint in S3/'" on the web page.
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the web page.
# Set any other parameters on the web page.
# (2) Upload your trained model to the S3 bucket.
# (3) Set the code directory to "/path/inceptionv3" on the web page.
# (4) Set the startup file to "export.py" on the web page.
# (5) Set the "Dataset path", "Output file path", and "Job log path" on the web page.
# (6) Create the training job.
```

# Script Description

## Script and Sample Code
|
@ -16,6 +16,7 @@
 import numpy as np
 
 from src.model_utils.config import config
+from src.model_utils.moxing_adapter import moxing_wrapper
 from src.model_utils.device_adapter import get_device_id
 from src.inception_v3 import InceptionV3
 
@ -29,7 +30,12 @@ context.set_context(mode=context.GRAPH_MODE, device_target=config.device_target)
 if config.device_target == "Ascend":
     context.set_context(device_id=get_device_id())
 
-if __name__ == '__main__':
+def modelarts_process():
+    pass
+
+@moxing_wrapper(pre_process=modelarts_process)
+def export_inceptionv3():
+    """ export_inceptionv3 """
     net = InceptionV3(num_classes=config.num_classes, is_training=False)
     param_dict = load_checkpoint(config.ckpt_file)
     load_param_into_net(net, param_dict)
@ -37,3 +43,6 @@ if __name__ == '__main__':
     input_arr = Tensor(np.random.uniform(0.0, 1.0, size=[config.batch_size, 3, config.width, \
                                                          config.height]), ms.float32)
     export(net, input_arr, file_name=config.file_name, file_format=config.file_format)
+
+if __name__ == '__main__':
+    export_inceptionv3()
|
@ -144,6 +144,31 @@ sh run_standalone_eval_ascend.sh [DATA_PATH] [CKPT_NAME]
|
|||
# (8) Create your job.
|
||||
```
|
||||
|
||||
- Export on ModelArts (If you want to run on ModelArts, please check the official documentation of [modelarts](https://support.huaweicloud.com/modelarts/); you can start exporting as follows)
|
||||
|
||||
1. Export the model on ModelArts. The export steps are as follows:
|
||||
|
||||
```python
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on base_config.yaml file.
|
||||
# Set "file_name='lenet'" on base_config.yaml file.
|
||||
# Set "file_format='AIR'" on base_config.yaml file.
|
||||
# Set "checkpoint_url='/The path of checkpoint in S3/'" on base_config.yaml file.
|
||||
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on base_config.yaml file.
|
||||
# Set other parameters on base_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "file_name='lenet'" on the website UI interface.
|
||||
# Add "file_format='AIR'" on the website UI interface.
|
||||
# Add "checkpoint_url='/The path of checkpoint in S3/'" on the website UI interface.
|
||||
# Add "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Upload or copy your trained model to S3 bucket.
|
||||
# (3) Set the code directory to "/path/lenet" on the website UI interface.
|
||||
# (4) Set the startup file to "export.py" on the website UI interface.
|
||||
# (5) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (6) Create your job.
|
||||
```
|
||||
|
||||
## [Script Description](#contents)
|
||||
|
||||
### [Script and Sample Code](#contents)
|
||||
|
|
|
@ -145,6 +145,31 @@ sh run_standalone_eval_ascend.sh [DATA_PATH] [CKPT_NAME]
|
|||
# (8) Create the training job.
```

- Export on ModelArts (if you want to run on ModelArts, refer to the [modelarts](https://support.huaweicloud.com/modelarts/) documentation)

1. Export the model on ModelArts. The export steps are as follows:

```python
# (1) Perform a or b.
# a. Set "enable_modelarts=True" in the base_config.yaml file.
# Set "file_name='lenet'" in the base_config.yaml file.
# Set "file_format='AIR'" in the base_config.yaml file.
# Set "checkpoint_url='/The path of checkpoint in S3/'" in the base_config.yaml file.
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" in the base_config.yaml file.
# Set any other parameters you need in the base_config.yaml file.
# b. Set "enable_modelarts=True" on the web page.
# Set "file_name='lenet'" on the web page.
# Set "file_format='AIR'" on the web page.
# Set "checkpoint_url='/The path of checkpoint in S3/'" on the web page.
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the web page.
# Set any other parameters on the web page.
# (2) Upload your trained model to the S3 bucket.
# (3) Set the code directory to "/path/lenet" on the web page.
# (4) Set the startup file to "export.py" on the web page.
# (5) Set the "Dataset path", "Output file path", and "Job log path" on the web page.
# (6) Create the training job.
```

## Script Description

### Script and Sample Code
|
@ -16,6 +16,7 @@
 
 from src.model_utils.config import config
 from src.model_utils.device_adapter import get_device_id
+from src.model_utils.moxing_adapter import moxing_wrapper
 from src.lenet import LeNet5
 
 import numpy as np
@ -27,8 +28,11 @@ context.set_context(mode=context.GRAPH_MODE, device_target=config.device_target)
 if config.device_target == "Ascend":
     context.set_context(device_id=get_device_id())
 
-if __name__ == "__main__":
+def modelarts_process():
+    pass
 
+@moxing_wrapper(pre_process=modelarts_process)
+def export_lenet():
     # define fusion network
     network = LeNet5(config.num_classes)
     # load network checkpoint
@ -38,3 +42,6 @@ if __name__ == "__main__":
     # export network
     inputs = Tensor(np.ones([config.batch_size, 1, config.image_height, config.image_width]), mindspore.float32)
     export(network, inputs, file_name=config.file_name, file_format=config.file_format)
+
+if __name__ == '__main__':
+    export_lenet()
|
@ -290,6 +290,31 @@ bash run_eval.sh [VALIDATION_JSON_FILE] [CHECKPOINT_PATH] [DATA_PATH]
|
|||
# (8) Create your job.
|
||||
```
|
||||
|
||||
- Export on ModelArts (If you want to run on ModelArts, please check the official documentation of [modelarts](https://support.huaweicloud.com/modelarts/); you can start exporting as follows)
|
||||
|
||||
1. Export the model on ModelArts. The export steps are as follows:
|
||||
|
||||
```python
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on base_config.yaml file.
|
||||
# Set "file_name='maskrcnn'" on base_config.yaml file.
|
||||
# Set "file_format='AIR'" on base_config.yaml file.
|
||||
# Set "checkpoint_url='/The path of checkpoint in S3/'" on base_config.yaml file.
|
||||
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on base_config.yaml file.
|
||||
# Set other parameters on base_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "file_name='maskrcnn'" on the website UI interface.
|
||||
# Add "file_format='AIR'" on the website UI interface.
|
||||
# Add "checkpoint_url='/The path of checkpoint in S3/'" on the website UI interface.
|
||||
# Add "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Upload or copy your trained model to S3 bucket.
|
||||
# (3) Set the code directory to "/path/maskrcnn" on the website UI interface.
|
||||
# (4) Set the startup file to "export.py" on the website UI interface.
|
||||
# (5) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (6) Create your job.
|
||||
```
|
||||
|
||||
# [Script Description](#contents)
|
||||
|
||||
## [Script and Sample Code](#contents)
|
||||
|
|
|
@ -282,6 +282,31 @@ bash run_eval.sh [VALIDATION_JSON_FILE] [CHECKPOINT_PATH]
|
|||
# (7) Create the training job.
```

- Export on ModelArts (if you want to run on ModelArts, refer to the [modelarts](https://support.huaweicloud.com/modelarts/) documentation)

1. Export the model on ModelArts. The export steps are as follows:

```python
# (1) Perform a or b.
# a. Set "enable_modelarts=True" in the base_config.yaml file.
# Set "file_name='maskrcnn'" in the base_config.yaml file.
# Set "file_format='AIR'" in the base_config.yaml file.
# Set "checkpoint_url='/The path of checkpoint in S3/'" in the base_config.yaml file.
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" in the base_config.yaml file.
# Set any other parameters you need in the base_config.yaml file.
# b. Set "enable_modelarts=True" on the web page.
# Set "file_name='maskrcnn'" on the web page.
# Set "file_format='AIR'" on the web page.
# Set "checkpoint_url='/The path of checkpoint in S3/'" on the web page.
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the web page.
# Set any other parameters on the web page.
# (2) Upload your trained model to the S3 bucket.
# (3) Set the code directory to "/path/maskrcnn" on the web page.
# (4) Set the startup file to "export.py" on the web page.
# (5) Set the "Dataset path", "Output file path", and "Job log path" on the web page.
# (6) Create the training job.
```

# Script Description

## Script and Sample Code
|
@ -18,10 +18,11 @@ import re
 import numpy as np
 from src.model_utils.config import config
 from src.model_utils.device_adapter import get_device_id
+from src.model_utils.moxing_adapter import moxing_wrapper
 from src.maskrcnn.mask_rcnn_r50 import MaskRcnn_Infer
 
 from mindspore import Tensor, context, load_checkpoint, load_param_into_net, export
 
-
+
 lss = [int(re.findall(r'[0-9]+', i)[0]) for i in config.feature_shapes]
 config.feature_shapes = [(lss[2*i], lss[2*i+1]) for i in range(int(len(lss)/2))]
 config.roi_layer = dict(type='RoIAlign', out_size=7, mask_out_size=14, sample_num=2)
@ -38,7 +39,12 @@ context.set_context(mode=context.GRAPH_MODE, device_target=config.device_target)
 if config.device_target == "Ascend":
     context.set_context(device_id=get_device_id())
 
-if __name__ == '__main__':
+def modelarts_process():
+    pass
+
+@moxing_wrapper(pre_process=modelarts_process)
+def export_maskrcnn():
+    """ export_maskrcnn """
     net = MaskRcnn_Infer(config=config)
     param_dict = load_checkpoint(config.ckpt_file)
 
@ -49,10 +55,11 @@ if __name__ == '__main__':
     load_param_into_net(net, param_dict_new)
     net.set_train(False)
 
-    bs = config.test_batch_size
-
     img = Tensor(np.zeros([config.batch_size, 3, config.img_height, config.img_width], np.float16))
     img_metas = Tensor(np.zeros([config.batch_size, 4], np.float16))
 
     input_data = [img, img_metas]
     export(net, *input_data, file_name=config.file_name, file_format=config.file_format)
+
+if __name__ == '__main__':
+    export_maskrcnn()
|
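
The two `config.feature_shapes` lines in the export script above turn a flat list of string fragments into pairs of integers. A small worked example of that parsing, assuming the configuration loader yields fragments such as `'(192'` and `'320)'`; the input values and the function name `parse_feature_shapes` are made up for illustration:

```python
import re


def parse_feature_shapes(fragments):
    """Extract the first integer from each fragment and pair them up in order."""
    numbers = [int(re.findall(r'[0-9]+', item)[0]) for item in fragments]
    return [(numbers[2 * i], numbers[2 * i + 1]) for i in range(len(numbers) // 2)]


print(parse_feature_shapes(['(192', '320)', '(96', '160)']))  # prints [(192, 320), (96, 160)]
```

The sketch uses `len(numbers) // 2` where the script uses `int(len(lss)/2)`; both give the same count for the even-length lists this code expects.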
@ -38,7 +38,7 @@ It shows top results in all three tracks of the COCO suite of challenges, includ
 # [Model Architecture](#contents)
 
 MaskRCNN is a two-stage target detection network. It extends FasterRCNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. This network uses a region proposal network (RPN), which can share the convolution features of the whole image with the detection network, so that the calculation of region proposals is almost cost-free. The whole network further combines the RPN and the mask branch into one network by sharing the convolution features.
-This network uses MobileNetV1 as the backbone of the MaskRCNN network.
+This network uses MobileNetV1 as the backbone of the maskrcnn_mobilenetv1 network.
 
 [Paper](http://cn.arxiv.org/pdf/1703.06870v3): Kaiming He, Georgia Gkioxari, Piotr Dollar and Ross Girshick. "MaskRCNN"
 
@ -113,7 +113,7 @@ pip install mmcv=0.2.14
 Note:
 1. To speed up data preprocessing, MindSpore provides a data format named MindRecord, so the first step is to generate MindRecord files from the COCO2017 dataset before training. Converting the raw COCO2017 dataset to MindRecord format may take about 4 hours.
 2. For distributed training, a [hccl configuration file](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools) in JSON format needs to be created in advance.
-3. For large models like MaskRCNN, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend the hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could time out because compile time increases with model size.
+3. For large models like maskrcnn_mobilenetv1, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend the hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could time out because compile time increases with model size.
 
 4. Execute eval script.
 After training, you can start evaluation as follows:
|
@ -138,6 +138,146 @@ pip install mmcv=0.2.14
|
|||
1. MODEL_PATH is a model file, exported by export script file.
|
||||
2. ANN_FILE_PATH is an annotation file for inference.
|
||||
|
||||
- Running on [ModelArts](https://support.huaweicloud.com/modelarts/)
|
||||
|
||||
```bash
|
||||
# Train 8p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "distribute=True" on default_config.yaml file.
|
||||
# Set "need_modelarts_dataset_unzip=True" on default_config.yaml file.
|
||||
# Set "modelarts_dataset_unzip_name='cocodataset'" on default_config.yaml file.
|
||||
# Set "base_lr=0.02" on default_config.yaml file.
|
||||
# Set "mindrecord_dir='./MindRecord_COCO'" on default_config.yaml file.
|
||||
# Set "data_path='/cache/data'" on default_config.yaml file.
|
||||
# Set "ann_file='./annotations/instances_val2017.json'" on default_config.yaml file.
|
||||
# Set "epoch_size=12" on default_config.yaml file.
|
||||
# Set "ckpt_path='./ckpt_maskrcnn/mask_rcnn-12_7393.ckpt'" on default_config.yaml file.
|
||||
# (optional)Set "checkpoint_url='s3://dir_to_your_pretrained/'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "need_modelarts_dataset_unzip=True" on the website UI interface.
|
||||
# Add "modelarts_dataset_unzip_name='cocodataset'" on the website UI interface.
|
||||
# Add "distribute=True" on the website UI interface.
|
||||
# Add "base_lr=0.02" on the website UI interface.
|
||||
# Add "mindrecord_dir='./MindRecord_COCO'" on the website UI interface.
|
||||
# Add "data_path='/cache/data'" on the website UI interface.
|
||||
# Add "ann_file='./annotations/instances_val2017.json'" on the website UI interface.
|
||||
# Add "epoch_size=12" on the website UI interface.
|
||||
# Add "ckpt_path='./ckpt_maskrcnn/mask_rcnn-12_7393.ckpt'" on the website UI interface.
|
||||
# (optional)Add "checkpoint_url='s3://dir_to_your_pretrained/'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your pretrained model to S3 bucket if you want to finetune.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, run "train.py" like the following to create MindRecord dataset locally from coco2017.
|
||||
# "python train.py --only_create_dataset=True --mindrecord_dir=$MINDRECORD_DIR --data_path=$DATA_PATH --ann_file=$ANNO_PATH"
|
||||
# Second, zip MindRecord dataset to one zip file.
|
||||
# Finally, upload your zip dataset to S3 bucket. (You could also upload the original MindRecord dataset, but it can be very slow.)
|
||||
# b. Upload the original coco dataset to S3 bucket.
|
||||
# (Dataset conversion then occurs during training and takes a long time. It happens every time you train.)
|
||||
# (5) Set the code directory to "/path/maskrcnn" on the website UI interface.
|
||||
# (6) Set the startup file to "train.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
#
|
||||
# Train 1p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "need_modelarts_dataset_unzip=True" on default_config.yaml file.
|
||||
# Set "modelarts_dataset_unzip_name='cocodataset'" on default_config.yaml file.
|
||||
# Set "mindrecord_dir='./MindRecord_COCO'" on default_config.yaml file.
|
||||
# Set "data_path='/cache/data'" on default_config.yaml file.
|
||||
# Set "ann_file='./annotations/instances_val2017.json'" on default_config.yaml file.
|
||||
# Set "epoch_size=12" on default_config.yaml file.
|
||||
# Set "ckpt_path='./ckpt_maskrcnn/mask_rcnn-12_7393.ckpt'" on default_config.yaml file.
|
||||
# (optional)Set "checkpoint_url='s3://dir_to_your_pretrained/'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "need_modelarts_dataset_unzip=True" on the website UI interface.
|
||||
# Add "modelarts_dataset_unzip_name='cocodataset'" on the website UI interface.
|
||||
# Add "mindrecord_dir='./MindRecord_COCO'" on the website UI interface.
|
||||
# Add "data_path='/cache/data'" on the website UI interface.
|
||||
# Add "ann_file='./annotations/instances_val2017.json'" on the website UI interface.
|
||||
# Add "epoch_size=12" on the website UI interface.
|
||||
# Add "ckpt_path='./ckpt_maskrcnn/mask_rcnn-12_7393.ckpt'" on the website UI interface.
|
||||
# (optional)Add "checkpoint_url='s3://dir_to_your_pretrained/'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your pretrained model to S3 bucket if you want to finetune.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, run "train.py" like the following to create MindRecord dataset locally from coco2017.
|
||||
# "python train.py --only_create_dataset=True --mindrecord_dir=$MINDRECORD_DIR --data_path=$DATA_PATH --ann_file=$ANNO_PATH"
|
||||
# Second, zip MindRecord dataset to one zip file.
|
||||
# Finally, upload your zip dataset to S3 bucket. (You could also upload the original MindRecord dataset, but it can be very slow.)
|
||||
# b. Upload the original coco dataset to S3 bucket.
|
||||
# (Dataset conversion then occurs during training and takes a long time. It happens every time you train.)
|
||||
# (5) Set the code directory to "/path/maskrcnn" on the website UI interface.
|
||||
# (6) Set the startup file to "train.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
#
|
||||
# Eval 1p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "need_modelarts_dataset_unzip=True" on default_config.yaml file.
|
||||
# Set "modelarts_dataset_unzip_name='cocodataset'" on default_config.yaml file.
|
||||
# Set "checkpoint_url='s3://dir_to_your_trained_model/'" on default_config.yaml file.
|
||||
# Set "checkpoint_path='./ckpt_maskrcnn/mask_rcnn-12_7393.ckpt'" on default_config.yaml file.
|
||||
# Set "mindrecord_file='/cache/data/cocodataset/MindRecord_COCO'" on default_config.yaml file.
|
||||
# Set "data_path='/cache/data'" on default_config.yaml file.
|
||||
# Set "ann_file='./annotations/instances_val2017.json'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "need_modelarts_dataset_unzip=True" on the website UI interface.
|
||||
# Add "modelarts_dataset_unzip_name='cocodataset'" on the website UI interface.
|
||||
# Add "checkpoint_url='s3://dir_to_your_trained_model/'" on the website UI interface.
|
||||
# Add "checkpoint_path='./ckpt_maskrcnn/mask_rcnn-12_7393.ckpt'" on the website UI interface.
|
||||
# Add "mindrecord_file='/cache/data/cocodataset/MindRecord_COCO'" on the website UI interface.
|
||||
# Add "data_path='/cache/data'" on the website UI interface.
|
||||
# Add "ann_file='./annotations/instances_val2017.json'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your trained model to S3 bucket.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, run "eval.py" like the following to create MindRecord dataset locally from coco2017.
|
||||
# "python eval.py --only_create_dataset=True --mindrecord_dir=$MINDRECORD_DIR --data_path=$DATA_PATH --ann_file=$ANNO_PATH \
|
||||
# --checkpoint_path=$CHECKPOINT_PATH"
|
||||
# Second, zip MindRecord dataset to one zip file.
|
||||
# Finally, upload your zip dataset to S3 bucket. (You could also upload the original MindRecord dataset, but that can be very slow.)
|
||||
# b. Upload the original coco dataset to S3 bucket.
|
||||
# (Dataset conversion then occurs during the evaluation job and takes a long time. It happens every time you run it.)
|
||||
# (5) Set the code directory to "/path/maskrcnn" on the website UI interface.
|
||||
# (6) Set the startup file to "eval.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
```
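Options a and b in the steps above set the same keys, once in the YAML file and once as `key=value` parameters in the web UI. The following is a minimal sketch of how such UI parameters could override file defaults. `parse_value` and `merge_overrides` are hypothetical helpers written for illustration, not part of the repository, and the default values are placeholders.

```python
def parse_value(text):
    """Convert a UI string into bool, int, float, or a plain string."""
    stripped = text.strip().strip("'\"")
    if stripped in ("True", "False"):
        return stripped == "True"
    for cast in (int, float):
        try:
            return cast(stripped)
        except ValueError:
            continue
    return stripped


def merge_overrides(defaults, ui_args):
    """Return a copy of defaults with each 'key=value' UI argument applied on top."""
    merged = dict(defaults)
    for item in ui_args:
        key, _, raw = item.partition("=")
        merged[key.strip()] = parse_value(raw)
    return merged


# Placeholder defaults standing in for default_config.yaml (values are illustrative).
defaults = {"enable_modelarts": False, "data_path": "/cache/data", "epoch_size": 12}
ui_args = ["enable_modelarts=True", "checkpoint_url='s3://dir_to_your_trained_model/'"]
merged = merge_overrides(defaults, ui_args)
print(merged)
```

Keys that are not overridden keep their file defaults, which is why either option a or option b is enough on its own.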
|
||||
|
||||
- Export on ModelArts (If you want to run on ModelArts, please check the official documentation of [modelarts](https://support.huaweicloud.com/modelarts/); you can then start exporting as follows)
|
||||
|
||||
1. Export the model on ModelArts as follows:
|
||||
|
||||
```python
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on base_config.yaml file.
|
||||
# Set "file_name='maskrcnn_mobilenetv1'" on base_config.yaml file.
|
||||
# Set "file_format='AIR'" on base_config.yaml file.
|
||||
# Set "checkpoint_url='/The path of checkpoint in S3/'" on base_config.yaml file.
|
||||
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on base_config.yaml file.
|
||||
# Set other parameters on base_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "file_name='maskrcnn_mobilenetv1'" on the website UI interface.
|
||||
# Add "file_format='AIR'" on the website UI interface.
|
||||
# Add "checkpoint_url='/The path of checkpoint in S3/'" on the website UI interface.
|
||||
# Add "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Upload or copy your trained model to S3 bucket.
|
||||
# (3) Set the code directory to "/path/maskrcnn_mobilenetv1" on the website UI interface.
|
||||
# (4) Set the startup file to "export.py" on the website UI interface.
|
||||
# (5) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (6) Create your job.
|
||||
```
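The export steps above pair `checkpoint_url` (a folder in S3) with `ckpt_file` (a path under `/cache/checkpoint_path`). A minimal sketch of that relationship follows, assuming the job mirrors the S3 folder into the local load path before `export.py` runs. `local_ckpt_file` is a hypothetical helper and the bucket name is a placeholder.

```python
import posixpath


def local_ckpt_file(s3_object, checkpoint_url, load_path="/cache/checkpoint_path"):
    """Map an S3 checkpoint object to the local path it would have after mirroring."""
    prefix = checkpoint_url.rstrip("/") + "/"
    if not s3_object.startswith(prefix):
        raise ValueError("checkpoint is not inside checkpoint_url")
    relative = s3_object[len(prefix):]
    return posixpath.join(load_path, relative)


# Placeholder bucket and folder names, for illustration only.
ckpt_file = local_ckpt_file("s3://bucket/ckpt/model.ckpt", "s3://bucket/ckpt/")
print(ckpt_file)
```

Under this assumption, the file name in `ckpt_file` has to match the checkpoint uploaded in step (2).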
|
||||
|
||||
# [Script Description](#contents)
|
||||
|
||||
## [Script and Sample Code](#contents)
|
||||
|
@ -158,7 +298,7 @@ pip install mmcv=0.2.14
|
|||
├─anchor_generator.py # generate base bounding box anchors
|
||||
├─bbox_assign_sample.py # filter positive and negative bbox for the first stage learning
|
||||
├─bbox_assign_sample_stage2.py # filter positive and negative bbox for the second stage learning
|
||||
├─mask_rcnn_mobilenetv1.py # main network architecture of maskrcnn
|
||||
├─mask_rcnn_mobilenetv1.py # main network architecture of maskrcnn_mobilenetv1
|
||||
├─fpn_neck.py # fpn network
|
||||
├─proposal_generator.py # generate proposals based on feature map
|
||||
├─rcnn_cls.py # rcnn bounding box regression branch
|
||||
|
@ -169,7 +309,7 @@ pip install mmcv=0.2.14
|
|||
├─config.py # network configuration
|
||||
├─dataset.py # dataset utils
|
||||
├─lr_schedule.py # learning rate generator
|
||||
├─network_define.py # network define for maskrcnn
|
||||
├─network_define.py # network define for maskrcnn_mobilenetv1
|
||||
└─util.py # routine operation
|
||||
├─mindspore_hub_conf.py # mindspore hub interface
|
||||
├─export.py # script to export AIR, MINDIR model
|
||||
|
@ -338,7 +478,7 @@ Usage: sh run_standalone_train.sh [PRETRAINED_MODEL]
|
|||
|
||||
### [Training](#content)
|
||||
|
||||
- Run `run_standalone_train.sh` for non-distributed training of MaskRCNN model.
|
||||
- Run `run_standalone_train.sh` for non-distributed training of maskrcnn_mobilenetv1 model.
|
||||
|
||||
```bash
|
||||
# standalone training
|
||||
|
|
|
@ -18,6 +18,7 @@ import numpy as np
|
|||
from mindspore import Tensor, context, load_checkpoint, export, load_param_into_net
|
||||
from src.model_utils.config import config
|
||||
from src.model_utils.device_adapter import get_device_id
|
||||
from src.model_utils.moxing_adapter import moxing_wrapper
|
||||
from src.network_define import MaskRcnn_Mobilenetv1_Infer
|
||||
|
||||
def config_(cfg):
|
||||
|
@ -36,7 +37,12 @@ context.set_context(mode=context.GRAPH_MODE, device_target=config.device_target)
|
|||
if config.device_target == "Ascend":
|
||||
context.set_context(device_id=get_device_id())
|
||||
|
||||
if __name__ == '__main__':
|
||||
def modelarts_process():
|
||||
pass
|
||||
|
||||
@moxing_wrapper(pre_process=modelarts_process)
|
||||
def export_maskrcnn_mobilenetv1():
|
||||
""" export_maskrcnn_mobilenetv1 """
|
||||
config.test_batch_size = config.batch_size_export
|
||||
net = MaskRcnn_Mobilenetv1_Infer(config)
|
||||
|
||||
|
@ -55,3 +61,6 @@ if __name__ == '__main__':
|
|||
|
||||
input_data = [img_data, img_metas]
|
||||
export(net, *input_data, file_name=config.file_name, file_format=config.file_format)
|
||||
|
||||
if __name__ == '__main__':
|
||||
export_maskrcnn_mobilenetv1()
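The hunk above moves the export logic into a function decorated with `moxing_wrapper(pre_process=modelarts_process)`. The following is a minimal sketch of the decorator pattern only. `simple_wrapper` is a hypothetical stand-in; the real `moxing_wrapper` in `src/model_utils/moxing_adapter.py` may do more, such as syncing data with S3.

```python
import functools


def simple_wrapper(pre_process=None, post_process=None):
    """Build a decorator that runs optional hooks around the wrapped function."""
    def decorator(run_func):
        @functools.wraps(run_func)
        def wrapped(*args, **kwargs):
            if pre_process is not None:
                pre_process()
            result = run_func(*args, **kwargs)
            if post_process is not None:
                post_process()
            return result
        return wrapped
    return decorator


calls = []


def modelarts_process():
    """Hook executed before the wrapped function."""
    calls.append("pre")


@simple_wrapper(pre_process=modelarts_process)
def export_model():
    """Stand-in for the export function."""
    calls.append("export")
    return "done"


result = export_model()
print(calls, result)
```

Keeping the hook as a separate function lets the same entry point run both locally and on ModelArts.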
|
||||
|
|
|
@ -71,6 +71,121 @@ For FP16 operators, if the input data type is FP32, the backend of MindSpore wil
|
|||
- [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
|
||||
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
|
||||
|
||||
- Running on [ModelArts](https://support.huaweicloud.com/modelarts/)
|
||||
|
||||
```bash
|
||||
# Train 8p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "distribute=True" on default_config.yaml file.
|
||||
# Set "need_modelarts_dataset_unzip=True" on default_config.yaml file.
|
||||
# Set "modelarts_dataset_unzip_name='ImageNet_Original'" on default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" on default_config.yaml file.
|
||||
# Set "epoch_size=90" on default_config.yaml file.
|
||||
# (optional)Set "checkpoint_url='s3://dir_to_your_pretrained/'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "need_modelarts_dataset_unzip=True" on the website UI interface.
|
||||
# Add "modelarts_dataset_unzip_name='ImageNet_Original'" on the website UI interface.
|
||||
# Add "distribute=True" on the website UI interface.
|
||||
# Add "dataset_path=/cache/data" on the website UI interface.
|
||||
# Add "epoch_size=90" on the website UI interface.
|
||||
# (optional)Add "checkpoint_url='s3://dir_to_your_pretrained/'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your pretrained model to S3 bucket if you want to finetune.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, zip MindRecord dataset to one zip file.
|
||||
# Second, upload your zip dataset to S3 bucket. (You could also upload the original MindRecord dataset, but that can be very slow.)
|
||||
# b. Upload the original dataset to S3 bucket.
|
||||
# (Dataset conversion then occurs during the training process and takes a long time. It happens every time you train.)
|
||||
# (5) Set the code directory to "/path/mobilenetv1" on the website UI interface.
|
||||
# (6) Set the startup file to "train.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
#
|
||||
# Train 1p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "need_modelarts_dataset_unzip=True" on default_config.yaml file.
|
||||
# Set "modelarts_dataset_unzip_name='ImageNet_Original'" on default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" on default_config.yaml file.
|
||||
# Set "epoch_size=90" on default_config.yaml file.
|
||||
# (optional)Set "checkpoint_url='s3://dir_to_your_pretrained/'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "need_modelarts_dataset_unzip=True" on the website UI interface.
|
||||
# Add "modelarts_dataset_unzip_name='ImageNet_Original'" on the website UI interface.
|
||||
# Add "dataset_path='/cache/data'" on the website UI interface.
|
||||
# Add "epoch_size=90" on the website UI interface.
|
||||
# (optional)Add "checkpoint_url='s3://dir_to_your_pretrained/'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your pretrained model to S3 bucket if you want to finetune.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, zip MindRecord dataset to one zip file.
|
||||
# Second, upload your zip dataset to S3 bucket. (You could also upload the original MindRecord dataset, but that can be very slow.)
|
||||
# b. Upload the original dataset to S3 bucket.
|
||||
# (Dataset conversion then occurs during the training process and takes a long time. It happens every time you train.)
|
||||
# (5) Set the code directory to "/path/mobilenetv1" on the website UI interface.
|
||||
# (6) Set the startup file to "train.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
#
|
||||
# Eval 1p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "need_modelarts_dataset_unzip=True" on default_config.yaml file.
|
||||
# Set "modelarts_dataset_unzip_name='ImageNet_Original'" on default_config.yaml file.
|
||||
# Set "checkpoint_url='s3://dir_to_your_trained_model/'" on default_config.yaml file.
|
||||
# Set "checkpoint='./mobilenetv1/mobilenetv1_trained.ckpt'" on default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "need_modelarts_dataset_unzip=True" on the website UI interface.
|
||||
# Add "modelarts_dataset_unzip_name='ImageNet_Original'" on the website UI interface.
|
||||
# Add "checkpoint_url='s3://dir_to_your_trained_model/'" on the website UI interface.
|
||||
# Add "checkpoint='./mobilenetv1/mobilenetv1_trained.ckpt'" on the website UI interface.
|
||||
# Add "dataset_path='/cache/data'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your trained model to S3 bucket.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, zip MindRecord dataset to one zip file.
|
||||
# Second, upload your zip dataset to S3 bucket. (You could also upload the original MindRecord dataset, but that can be very slow.)
|
||||
# b. Upload the original dataset to S3 bucket.
|
||||
# (Dataset conversion then occurs during the evaluation job and takes a long time. It happens every time you run it.)
|
||||
# (5) Set the code directory to "/path/mobilenetv1" on the website UI interface.
|
||||
# (6) Set the startup file to "eval.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
```
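Step (4) a above asks for the dataset to be zipped into one file before upload. A minimal sketch using Python's standard `zipfile` module is shown below. `zip_dataset` is a hypothetical helper, and the sketch assumes the archive should contain a single top-level folder whose name matches `modelarts_dataset_unzip_name`.

```python
import os
import tempfile
import zipfile


def zip_dataset(dataset_dir, zip_path):
    """Zip dataset_dir so that the archive has one top-level folder named after it."""
    parent = os.path.dirname(os.path.normpath(dataset_dir))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _, files in os.walk(dataset_dir):
            for name in files:
                full_path = os.path.join(folder, name)
                archive.write(full_path, os.path.relpath(full_path, parent))
    return zip_path


# Demonstration on a throwaway directory with one placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    dataset_dir = os.path.join(tmp, "ImageNet_Original")
    os.makedirs(os.path.join(dataset_dir, "train"))
    with open(os.path.join(dataset_dir, "train", "sample.txt"), "w") as handle:
        handle.write("placeholder")
    archive_path = zip_dataset(dataset_dir, os.path.join(tmp, "ImageNet_Original.zip"))
    with zipfile.ZipFile(archive_path) as archive:
        archive_names = archive.namelist()

print(archive_names)
```

Uploading one archive avoids transferring many small files to S3 one by one, which is the slowness the note in option a refers to.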
|
||||
|
||||
- Export on ModelArts (If you want to run on ModelArts, please check the official documentation of [modelarts](https://support.huaweicloud.com/modelarts/); you can then start exporting as follows)
|
||||
|
||||
1. Export the model on ModelArts as follows:
|
||||
|
||||
```python
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on base_config.yaml file.
|
||||
# Set "file_name='mobilenetv1'" on base_config.yaml file.
|
||||
# Set "file_format='AIR'" on base_config.yaml file.
|
||||
# Set "checkpoint_url='/The path of checkpoint in S3/'" on base_config.yaml file.
|
||||
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on base_config.yaml file.
|
||||
# Set other parameters on base_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "file_name='mobilenetv1'" on the website UI interface.
|
||||
# Add "file_format='AIR'" on the website UI interface.
|
||||
# Add "checkpoint_url='/The path of checkpoint in S3/'" on the website UI interface.
|
||||
# Add "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Upload or copy your trained model to S3 bucket.
|
||||
# (3) Set the code directory to "/path/mobilenetv1" on the website UI interface.
|
||||
# (4) Set the startup file to "export.py" on the website UI interface.
|
||||
# (5) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (6) Create your job.
|
||||
```
|
||||
|
||||
## Script description
|
||||
|
||||
### Script and sample code
|
||||
|
|
|
@ -11,9 +11,6 @@ device_target: Ascend
|
|||
enable_profiling: False
|
||||
|
||||
# ==============================================================================
|
||||
modelarts_dataset_unzip_name: 'cifar10'
|
||||
need_modelarts_dataset_unzip: True
|
||||
|
||||
# config for mobilenet, cifar10
|
||||
class_num: 10
|
||||
batch_size: 32
|
||||
|
@ -86,6 +83,8 @@ device_id: 'Device id, default is 0.'
|
|||
device_num: 'Use device nums, default is 1.'
|
||||
rank_id: 'Rank id, default is 0.'
|
||||
file_format: 'file format'
|
||||
dataset_path: "Change to your own dataset path."
|
||||
checkpoint_path: "Change to your own ckpt path."
|
||||
|
||||
---
|
||||
device_target: ['Ascend', 'GPU', 'CPU']
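The YAML hunk above has three parts separated by `---`: values, help strings, and allowed choices. A minimal sketch of how such parts could be combined into a command-line parser follows. `build_parser` is a hypothetical helper, the dictionaries are hand-written stand-ins for the parsed YAML, and the actual loader in `src/model_utils/config.py` may differ.

```python
import argparse


def build_parser(defaults, helps, choices):
    """Create one command-line option per config key, using help text and choices if present."""
    parser = argparse.ArgumentParser(description="config sketch")
    for key, value in defaults.items():
        parser.add_argument(
            "--" + key,
            type=type(value),
            default=value,
            choices=choices.get(key),
            help=helps.get(key, "Please refer to the config file."),
        )
    return parser


# Stand-ins for the three YAML parts (values are illustrative).
defaults = {"device_target": "Ascend", "device_num": 1, "file_format": "AIR",
            "dataset_path": "/cache/data"}
helps = {"device_num": "Use device nums, default is 1.", "file_format": "file format",
         "dataset_path": "Change to your own dataset path."}
choices = {"device_target": ["Ascend", "GPU", "CPU"]}

parser = build_parser(defaults, helps, choices)
args = parser.parse_args(["--device_target", "GPU", "--device_num", "8"])
print(args)
```

Boolean keys are left out of this sketch because `type=bool` would treat any non-empty string, including "False", as true.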
|
||||
|
|
|
@ -3,7 +3,7 @@ enable_modelarts: False
|
|||
data_url: ""
|
||||
train_url: ""
|
||||
checkpoint_url: ""
|
||||
data_path: "/store4/ImageNet_Original/"
|
||||
data_path: "/cache/data"
|
||||
output_path: "/cache/train"
|
||||
load_path: "/cache/checkpoint_path"
|
||||
checkpoint_file: './checkpoint/mobilenetv1-90_625.ckpt'
|
||||
|
@ -38,12 +38,12 @@ lr_end: 0.0
|
|||
dataset: 'imagenet2012'
|
||||
run_distribute: False
|
||||
device_num: 4
|
||||
dataset_path: "/cache/data" #change to your own dataset path
|
||||
dataset_path: "/cache/data"
|
||||
pre_trained: ''
|
||||
parameter_server: False
|
||||
|
||||
# Image classification - eval
|
||||
checkpoint_path: "./ckpt_0/mobilenetv1-90_1251.ckpt" #change to your own ckpt path
|
||||
checkpoint_path: "./ckpt_0/mobilenetv1-90_1251.ckpt"
|
||||
|
||||
# mobilenetv1 export
|
||||
device_id: 0
|
||||
|
@ -88,6 +88,8 @@ device_id: 'Device id, default is 0.'
|
|||
device_num: 'Use device nums, default is 1.'
|
||||
rank_id: 'Rank id, default is 0.'
|
||||
file_format: 'file format'
|
||||
dataset_path: "Change to your own dataset path."
|
||||
checkpoint_path: "Change to your own ckpt path."
|
||||
|
||||
---
|
||||
device_target: ['Ascend', 'GPU', 'CPU']
|
||||
|
|
|
@ -88,6 +88,8 @@ device_id: 'Device id, default is 0.'
|
|||
device_num: 'Use device nums, default is 1.'
|
||||
rank_id: 'Rank id, default is 0.'
|
||||
file_format: 'file format'
|
||||
dataset_path: "Change to your own dataset path."
|
||||
checkpoint_path: "Change to your own ckpt path."
|
||||
|
||||
---
|
||||
device_target: ['Ascend', 'GPU', 'CPU']
|
||||
|
|
|
@ -94,7 +94,9 @@ def modelarts_process():
|
|||
|
||||
@moxing_wrapper(pre_process=modelarts_process)
|
||||
def eval_mobilenetv1():
|
||||
config.dataset_path = os.path.join(config.dataset_path, 'validation_preprocess')
|
||||
""" eval_mobilenetv1 """
|
||||
if config.dataset == 'imagenet2012':
|
||||
config.dataset_path = os.path.join(config.dataset_path, 'validation_preprocess')
|
||||
print('\nconfig:\n', config)
|
||||
target = config.device_target
|
||||
|
||||
|
|
|
@ -21,17 +21,26 @@ from mindspore.train.serialization import export, load_checkpoint
|
|||
from src.mobilenet_v1 import mobilenet_v1 as mobilenet
|
||||
from src.model_utils.config import config
|
||||
from src.model_utils.device_adapter import get_device_id
|
||||
from src.model_utils.moxing_adapter import moxing_wrapper
|
||||
|
||||
|
||||
context.set_context(mode=context.GRAPH_MODE, device_target=config.device_target)
|
||||
|
||||
if __name__ == "__main__":
|
||||
def modelarts_process():
|
||||
pass
|
||||
|
||||
@moxing_wrapper(pre_process=modelarts_process)
|
||||
def export_mobilenetv1():
|
||||
""" export_mobilenetv1 """
|
||||
target = config.device_target
|
||||
if target != "GPU":
|
||||
context.set_context(device_id=get_device_id())
|
||||
|
||||
network = mobilenet(class_num=config.class_num)
|
||||
param_dict = load_checkpoint(config.ckpt_file, net=network)
|
||||
load_checkpoint(config.ckpt_file, net=network)
|
||||
network.set_train(False)
|
||||
input_data = Tensor(np.zeros([config.batch_size, 3, config.height, config.width]).astype(np.float32))
|
||||
export(network, input_data, file_name=config.file_name, file_format=config.file_format)
|
||||
|
||||
if __name__ == '__main__':
|
||||
export_mobilenetv1()
|
||||
|
|
|
@ -98,12 +98,13 @@ def modelarts_pre_process():
|
|||
|
||||
config.dataset_path = os.path.join(config.data_path, config.modelarts_dataset_unzip_name)
|
||||
config.save_checkpoint_path = config.output_path
|
||||
# config.pre_trained = os.path.join(config.dataset_path, config.pre_trained)
|
||||
|
||||
|
||||
@moxing_wrapper(pre_process=modelarts_pre_process)
|
||||
def train_mobilenetv1():
|
||||
config.dataset_path = os.path.join(config.dataset_path, 'train')
|
||||
""" train_mobilenetv1 """
|
||||
if config.dataset == 'imagenet2012':
|
||||
config.dataset_path = os.path.join(config.dataset_path, 'train')
|
||||
target = config.device_target
|
||||
ckpt_save_dir = config.save_checkpoint_path
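The path handling in this hunk can be sketched as a small pure function: the unzipped dataset folder is joined onto `data_path`, and a split subfolder is appended only for `imagenet2012`. `resolve_dataset_path` is a hypothetical helper for illustration; the folder names are taken from the diff.

```python
import posixpath


def resolve_dataset_path(data_path, unzip_name, dataset, split_dir):
    """Join the unzipped folder onto data_path and add the split folder for imagenet2012."""
    dataset_path = posixpath.join(data_path, unzip_name)
    if dataset == "imagenet2012":
        dataset_path = posixpath.join(dataset_path, split_dir)
    return dataset_path


train_path = resolve_dataset_path("/cache/data", "ImageNet_Original", "imagenet2012", "train")
eval_path = resolve_dataset_path("/cache/data", "ImageNet_Original", "imagenet2012",
                                 "validation_preprocess")
cifar_path = resolve_dataset_path("/cache/data", "cifar10", "cifar10", "train")
print(train_path, eval_path, cifar_path)
```

Guarding the join with the dataset name is what lets the cifar10 configuration reuse the same scripts without an extra subfolder.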
|
||||
|
||||
|
|
|
@ -61,6 +61,121 @@ For FP16 operators, if the input data type is FP32, the backend of MindSpore wil
|
|||
- [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
|
||||
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
|
||||
|
||||
- Running on [ModelArts](https://support.huaweicloud.com/modelarts/)
|
||||
|
||||
```bash
|
||||
# Train 8p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "distribute=True" on default_config.yaml file.
|
||||
# Set "need_modelarts_dataset_unzip=True" on default_config.yaml file.
|
||||
# Set "modelarts_dataset_unzip_name='ImageNet_Original'" on default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" on default_config.yaml file.
|
||||
# Set "epoch_size: 200" on default_config.yaml file.
|
||||
# (optional)Set "checkpoint_url='s3://dir_to_your_pretrained/'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "need_modelarts_dataset_unzip=True" on the website UI interface.
|
||||
# Add "modelarts_dataset_unzip_name='ImageNet_Original'" on the website UI interface.
|
||||
# Add "distribute=True" on the website UI interface.
|
||||
# Add "dataset_path=/cache/data" on the website UI interface.
|
||||
# Add "epoch_size: 200" on the website UI interface.
|
||||
# (optional)Add "checkpoint_url='s3://dir_to_your_pretrained/'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your pretrained model to S3 bucket if you want to finetune.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, zip MindRecord dataset to one zip file.
|
||||
# Second, upload your zip dataset to S3 bucket. (You could also upload the original MindRecord dataset, but that can be very slow.)
|
||||
# b. Upload the original dataset to S3 bucket.
|
||||
# (Dataset conversion then occurs during the training process and takes a long time. It happens every time you train.)
|
||||
# (5) Set the code directory to "/path/mobilenetv2" on the website UI interface.
|
||||
# (6) Set the startup file to "train.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
#
|
||||
# Train 1p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "need_modelarts_dataset_unzip=True" on default_config.yaml file.
|
||||
# Set "modelarts_dataset_unzip_name='ImageNet_Original'" on default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" on default_config.yaml file.
|
||||
# Set "epoch_size: 200" on default_config.yaml file.
|
||||
# (optional)Set "checkpoint_url='s3://dir_to_your_pretrained/'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "need_modelarts_dataset_unzip=True" on the website UI interface.
|
||||
# Add "modelarts_dataset_unzip_name='ImageNet_Original'" on the website UI interface.
|
||||
# Add "dataset_path='/cache/data'" on the website UI interface.
|
||||
# Add "epoch_size: 200" on the website UI interface.
|
||||
# (optional)Add "checkpoint_url='s3://dir_to_your_pretrained/'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your pretrained model to S3 bucket if you want to finetune.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, zip MindRecord dataset to one zip file.
|
||||
# Second, upload your zip dataset to S3 bucket. (You could also upload the original MindRecord dataset, but that can be very slow.)
|
||||
# b. Upload the original dataset to S3 bucket.
|
||||
# (Dataset conversion then occurs during the training process and takes a long time. It happens every time you train.)
|
||||
# (5) Set the code directory to "/path/mobilenetv2" on the website UI interface.
|
||||
# (6) Set the startup file to "train.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
#
|
||||
# Eval 1p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "need_modelarts_dataset_unzip=True" on default_config.yaml file.
|
||||
# Set "modelarts_dataset_unzip_name='ImageNet_Original'" on default_config.yaml file.
|
||||
# Set "checkpoint_url='s3://dir_to_your_trained_model/'" on default_config.yaml file.
|
||||
# Set "checkpoint='./mobilenetv2/mobilenetv2_trained.ckpt'" on default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "need_modelarts_dataset_unzip=True" on the website UI interface.
|
||||
# Add "modelarts_dataset_unzip_name='ImageNet_Original'" on the website UI interface.
|
||||
# Add "checkpoint_url='s3://dir_to_your_trained_model/'" on the website UI interface.
|
||||
# Add "checkpoint='./mobilenetv2/mobilenetv2_trained.ckpt'" on the website UI interface.
|
||||
# Add "dataset_path='/cache/data'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your trained model to S3 bucket.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, zip MindRecord dataset to one zip file.
|
||||
# Second, upload your zip dataset to S3 bucket. (You could also upload the original MindRecord dataset, but that can be very slow.)
|
||||
# b. Upload the original dataset to S3 bucket.
|
||||
# (Dataset conversion then occurs during the evaluation job and takes a long time. It happens every time you run it.)
|
||||
# (5) Set the code directory to "/path/mobilenetv2" on the website UI interface.
|
||||
# (6) Set the startup file to "eval.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
```
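The settings `need_modelarts_dataset_unzip=True` and `modelarts_dataset_unzip_name='ImageNet_Original'` above imply that the uploaded archive is unpacked under the data path before the job starts. A minimal sketch of that step with the standard `zipfile` module follows. `unzip_dataset` is a hypothetical helper, and the real pre-processing hook in the repository may work differently.

```python
import os
import tempfile
import zipfile


def unzip_dataset(zip_path, data_path):
    """Extract the archive into data_path and return data_path."""
    os.makedirs(data_path, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(data_path)
    return data_path


# Demonstration with a tiny archive built on the fly.
with tempfile.TemporaryDirectory() as tmp:
    zip_path = os.path.join(tmp, "ImageNet_Original.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("ImageNet_Original/train/sample.txt", "placeholder")
    data_path = unzip_dataset(zip_path, os.path.join(tmp, "data"))
    dataset_path = os.path.join(data_path, "ImageNet_Original")
    extracted_ok = os.path.isfile(os.path.join(dataset_path, "train", "sample.txt"))

print(extracted_ok)
```

In this sketch the top-level folder inside the archive becomes the dataset folder, so its name has to match `modelarts_dataset_unzip_name`.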
|
||||
|
||||
- Export on ModelArts (If you want to run on ModelArts, please check the official documentation of [modelarts](https://support.huaweicloud.com/modelarts/); you can then start exporting as follows)
|
||||
|
||||
1. Export the model on ModelArts as follows:
|
||||
|
||||
```python
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on base_config.yaml file.
|
||||
# Set "file_name='mobilenetv2'" on base_config.yaml file.
|
||||
# Set "file_format='AIR'" on base_config.yaml file.
|
||||
# Set "checkpoint_url='/The path of checkpoint in S3/'" on base_config.yaml file.
|
||||
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on base_config.yaml file.
|
||||
# Set other parameters on base_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "file_name='mobilenetv2'" on the website UI interface.
|
||||
# Add "file_format='AIR'" on the website UI interface.
|
||||
# Add "checkpoint_url='/The path of checkpoint in S3/'" on the website UI interface.
|
||||
# Add "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Upload or copy your trained model to S3 bucket.
|
||||
# (3) Set the code directory to "/path/mobilenetv2" on the website UI interface.
|
||||
# (4) Set the startup file to "export.py" on the website UI interface.
|
||||
# (5) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (6) Create your job.
|
||||
```
|
||||
|
||||
# [Script description](#contents)
|
||||
|
||||
## [Script and sample code](#contents)
|
||||
|
|
|
@ -67,6 +67,121 @@ The overall network architecture of MobileNetV2 is as follows:
|
|||
- [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html)
|
||||
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html)
|
||||
|
||||
- Training on ModelArts (if you want to run on ModelArts, refer to the [modelarts](https://support.huaweicloud.com/modelarts/) documentation)
|
||||
|
||||
```python
|
||||
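# Train with 8 devices on ModelArts
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" in the default_config.yaml file.
|
||||
# Set "distribute=True" in the default_config.yaml file.
|
||||
# Set "need_modelarts_dataset_unzip=True" in the default_config.yaml file.
|
||||
# Set "modelarts_dataset_unzip_name='ImageNet_Original'" in the default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" in the default_config.yaml file.
|
||||
# Set "epoch_size: 200" in the default_config.yaml file.
|
||||
# (optional) Set "checkpoint_url='s3://dir_to_your_pretrained/'" in the default_config.yaml file.
|
||||
# Set any other parameters you need in the default_config.yaml file.
|
||||
# b. Add "enable_modelarts=True" on the web page.
|
||||
# Add "need_modelarts_dataset_unzip=True" on the web page.
|
||||
# Add "modelarts_dataset_unzip_name='ImageNet_Original'" on the web page.
|
||||
# Add "distribute=True" on the web page.
|
||||
# Add "dataset_path=/cache/data" on the web page.
|
||||
# Add "epoch_size: 200" on the web page.
|
||||
# (optional) Add "checkpoint_url='s3://dir_to_your_pretrained/'" on the web page.
|
||||
# Add any other parameters on the web page.
|
||||
# (2) Prepare the model code.
|
||||
# (3) If you want to fine-tune your model, upload your pretrained model to the S3 bucket.
|
||||
# (4) Perform a or b. (Option a is recommended.)
|
||||
# a. First, compress the dataset into a single ".zip" file.
|
||||
# Second, upload your zipped dataset to the S3 bucket. (You can also upload the uncompressed dataset, but that may be very slow.)
|
||||
# b. Upload the original dataset to the S3 bucket.
|
||||
# (Dataset conversion then happens during training and takes a long time. It is repeated every time you train.)
|
||||
# (5) Set the code directory to "/path/mobilenetv2" on the web page.
|
||||
# (6) Set the startup file to "train.py" on the web page.
|
||||
# (7) Set the "training dataset", "training output file path", "job log path", and so on, on the web page.
|
||||
# (8) Create the training job.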
|
||||
#
|
||||
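# Train with a single device on ModelArts
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" in the default_config.yaml file.
|
||||
# Set "need_modelarts_dataset_unzip=True" in the default_config.yaml file.
|
||||
# Set "modelarts_dataset_unzip_name='ImageNet_Original'" in the default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" in the default_config.yaml file.
|
||||
# Set "epoch_size: 200" in the default_config.yaml file.
|
||||
# (optional) Set "checkpoint_url='s3://dir_to_your_pretrained/'" in the default_config.yaml file.
|
||||
# Set any other parameters you need in the default_config.yaml file.
|
||||
# b. Add "enable_modelarts=True" on the web page.
|
||||
# Add "need_modelarts_dataset_unzip=True" on the web page.
|
||||
# Add "modelarts_dataset_unzip_name='ImageNet_Original'" on the web page.
|
||||
# Add "dataset_path='/cache/data'" on the web page.
|
||||
# Add "epoch_size: 200" on the web page.
|
||||
# (optional) Add "checkpoint_url='s3://dir_to_your_pretrained/'" on the web page.
|
||||
# Add any other parameters on the web page.
|
||||
# (2) Prepare the model code.
|
||||
# (3) If you want to fine-tune your model, upload your pretrained model to the S3 bucket.
|
||||
# (4) Perform a or b. (Option a is recommended.)
|
||||
# a. First, compress the dataset into a single ".zip" file.
|
||||
# Second, upload your zipped dataset to the S3 bucket. (You can also upload the uncompressed dataset, but that may be very slow.)
|
||||
# b. Upload the original dataset to the S3 bucket.
|
||||
# (Dataset conversion then happens during training and takes a long time. It is repeated every time you train.)
|
||||
# (5) Set the code directory to "/path/mobilenetv2" on the web page.
|
||||
# (6) Set the startup file to "train.py" on the web page.
|
||||
# (7) Set the "training dataset", "training output file path", "job log path", and so on, on the web page.
|
||||
# (8) Create the training job.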
|
||||
#
|
||||
# Evaluate with a single device on ModelArts
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" in the default_config.yaml file.
|
||||
# Set "need_modelarts_dataset_unzip=True" in the default_config.yaml file.
|
||||
# Set "modelarts_dataset_unzip_name='ImageNet_Original'" in the default_config.yaml file.
|
||||
# Set "checkpoint_url='s3://dir_to_your_trained_model/'" in the default_config.yaml file.
|
||||
# Set "checkpoint='./mobilenetv2/mobilenetv2_trained.ckpt'" in the default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" in the default_config.yaml file.
|
||||
# Set any other parameters you need in the default_config.yaml file.
|
||||
# b. Add "enable_modelarts=True" on the web page.
|
||||
# Add "need_modelarts_dataset_unzip=True" on the web page.
|
||||
# Add "modelarts_dataset_unzip_name='ImageNet_Original'" on the web page.
|
||||
# Add "checkpoint_url='s3://dir_to_your_trained_model/'" on the web page.
|
||||
# Add "checkpoint='./mobilenetv2/mobilenetv2_trained.ckpt'" on the web page.
|
||||
# Add "dataset_path='/cache/data'" on the web page.
|
||||
# Add any other parameters on the web page.
|
||||
# (2) Prepare the model code.
|
||||
# (3) Upload your trained model to the S3 bucket.
|
||||
# (4) Perform a or b. (Option a is recommended.)
|
||||
# a. First, compress the dataset into a single ".zip" file.
|
||||
# Second, upload your zipped dataset to the S3 bucket. (You can also upload the uncompressed dataset, but that may be very slow.)
|
||||
# b. Upload the original dataset to the S3 bucket.
|
||||
# (Dataset conversion then happens during the job and takes a long time. It is repeated every time you run it.)
|
||||
# (5) Set the code directory to "/path/mobilenetv2" on the web page.
|
||||
# (6) Set the startup file to "eval.py" on the web page.
|
||||
# (7) Set the "training dataset", "training output file path", "job log path", and so on, on the web page.
|
||||
# (8) Create the job.
|
||||
```
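Step (4) a asks for the dataset as a single `.zip` file. A minimal sketch of that packaging step follows, assuming a local dataset folder whose name matches `modelarts_dataset_unzip_name` (for example `ImageNet_Original`). The helper name `zip_dataset` and all paths are illustrative and not part of this commit.

```python
import shutil
from pathlib import Path


def zip_dataset(dataset_dir: str, output_dir: str) -> Path:
    """Pack dataset_dir into <output_dir>/<dataset name>.zip and return the archive path."""
    src = Path(dataset_dir).resolve()
    if not src.is_dir():
        raise NotADirectoryError(f"dataset directory not found: {src}")
    out = Path(output_dir).resolve()
    out.mkdir(parents=True, exist_ok=True)
    # root_dir/base_dir keep the top-level folder name inside the archive,
    # so unzipping recreates e.g. ImageNet_Original/...
    archive = shutil.make_archive(
        base_name=str(out / src.name),
        format="zip",
        root_dir=str(src.parent),
        base_dir=src.name,
    )
    return Path(archive)
```

The output directory should live outside the dataset folder, otherwise the archive would try to include itself.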
|
||||
|
||||
- Export on ModelArts (to run on ModelArts, see the [ModelArts](https://support.huaweicloud.com/modelarts/) documentation)

1. Export the model on ModelArts. The steps are as follows:
|
||||
|
||||
```python
|
||||
# (1) Perform a or b.
# a. Set "enable_modelarts=True" in base_config.yaml.
# Set "file_name='mobilenetv2'" in base_config.yaml.
# Set "file_format='AIR'" in base_config.yaml.
# Set "checkpoint_url='/The path of checkpoint in S3/'" in base_config.yaml.
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" in base_config.yaml.
# Set any other parameters you need in base_config.yaml.
# b. Set "enable_modelarts=True" on the web UI.
# Set "file_name='mobilenetv2'" on the web UI.
# Set "file_format='AIR'" on the web UI.
# Set "checkpoint_url='/The path of checkpoint in S3/'" on the web UI.
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the web UI.
# Set any other parameters you need on the web UI.
# (2) Upload your trained model to the S3 bucket.
# (3) Set the code directory to "/path/mobilenetv2" on the web UI.
# (4) Set the startup file to "export.py" on the web UI.
# (5) Set "Dataset path", "Output file path", "Job log path", etc. on the web UI.
# (6) Create the training job.
|
||||
```
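Option b passes the same settings as `key=value` parameters from the web UI instead of editing the YAML file. How the repository's config loader merges them is not shown in this diff, so the sketch below only illustrates the general idea with a hypothetical `apply_overrides` helper and made-up default values.

```python
import ast


def parse_value(text: str):
    """Turn 'True', '200' or "'AIR'" into Python values; fall back to the raw string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_overrides(defaults: dict, overrides: list) -> dict:
    """Return a copy of defaults updated with 'key=value' strings."""
    merged = dict(defaults)
    for item in overrides:
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got: {item!r}")
        merged[key.strip()] = parse_value(value.strip())
    return merged


# Illustrative defaults, not the real contents of the YAML file.
defaults = {"enable_modelarts": False, "file_name": "mobilenetv2", "file_format": "MINDIR"}
merged = apply_overrides(defaults, ["enable_modelarts=True", "file_format='AIR'"])
print(merged)
```

Values that are not valid Python literals, such as an unquoted `/cache/data`, are kept as plain strings.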
|
||||
|
||||
# Script Description

## Script and Sample Code
|
||||
|
|
|
@ -21,13 +21,13 @@ from src.models import define_net, load_ckpt
|
|||
from src.utils import set_context
|
||||
from src.model_utils.config import config
|
||||
from src.model_utils.device_adapter import get_device_id, get_device_num, get_rank_id
|
||||
from src.model_utils.moxing_adapter import moxing_wrapper
|
||||
|
||||
|
||||
config.device_id = get_device_id()
|
||||
config.rank_id = get_rank_id()
|
||||
config.rank_size = get_device_num()
|
||||
config.run_distribute = config.rank_size > 1.
|
||||
|
||||
config.batch_size = config.batch_size_export
|
||||
config.is_training = config.is_training_export
|
||||
|
||||
|
@ -35,7 +35,12 @@ context.set_context(mode=context.GRAPH_MODE, device_target=config.platform)
|
|||
if config.platform == "Ascend":
    context.set_context(device_id=get_device_id())

if __name__ == '__main__':
def modelarts_process():
    pass

@moxing_wrapper(pre_process=modelarts_process)
def export_mobilenetv2():
    """ export_mobilenetv2 """
    print('\nconfig: \n', config)
    set_context(config)
    _, _, net = define_net(config, config.is_training)
|
||||
|
@ -44,3 +49,6 @@ if __name__ == '__main__':
|
|||
input_shp = [config.batch_size, 3, config.image_height, config.image_width]
|
||||
input_array = Tensor(np.random.uniform(-1.0, 1.0, size=input_shp).astype(np.float32))
|
||||
export(net, input_array, file_name=config.file_name, file_format=config.file_format)
|
||||
|
||||
if __name__ == '__main__':
|
||||
export_mobilenetv2()
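
The hunks above wrap the export entry point in `moxing_wrapper(pre_process=modelarts_process)`. The real wrapper lives in `src/model_utils/moxing_adapter.py` and is not part of this diff. The sketch below is a simplified stand-in, named `sketch_wrapper` here, that shows only the decorator pattern: optional hooks run before and after the wrapped function.

```python
import functools


def sketch_wrapper(pre_process=None, post_process=None):
    """Hypothetical stand-in for moxing_wrapper: call hooks around the wrapped function."""
    def decorator(run_func):
        @functools.wraps(run_func)
        def wrapped(*args, **kwargs):
            if pre_process is not None:
                pre_process()  # e.g. fetch the checkpoint before exporting
            result = run_func(*args, **kwargs)
            if post_process is not None:
                post_process()  # e.g. upload the exported file afterwards
            return result
        return wrapped
    return decorator


calls = []


def modelarts_process():
    calls.append("pre")


@sketch_wrapper(pre_process=modelarts_process)
def export_model():
    calls.append("export")
    return "done"


status = export_model()
```

Moving the export logic into a named function, as this commit does, is what makes it possible to attach such a wrapper at all.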
|
||||
|
|
|
@ -137,6 +137,31 @@ After installing MindSpore via the official website and Dataset is correctly gen
|
|||
#
|
||||
```
|
||||
|
||||
- Export on ModelArts (to run on ModelArts, see the official [ModelArts](https://support.huaweicloud.com/modelarts/) documentation; then export as follows)

1. Export the model on ModelArts. The steps are as follows:
|
||||
|
||||
```python
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "file_name='gat'" on default_config.yaml file.
|
||||
# Set "file_format='MINDIR'" on default_config.yaml file.
|
||||
# Set "checkpoint_url='/The path of checkpoint in S3/'" on default_config.yaml file.
|
||||
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "file_name='gat'" on the website UI interface.
|
||||
# Add "file_format='MINDIR'" on the website UI interface.
|
||||
# Add "checkpoint_url='/The path of checkpoint in S3/'" on the website UI interface.
|
||||
# Add "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Upload or copy your trained model to S3 bucket.
|
||||
# (3) Set the code directory to "/path/gat" on the website UI interface.
|
||||
# (4) Set the startup file to "export.py" on the website UI interface.
|
||||
# (5) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (6) Create your job.
|
||||
```
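`checkpoint_url` and `ckpt_file` work as a pair: the first names where the checkpoint is stored, the second the local path the export script reads. The sketch below assumes the checkpoint directory is copied into a local cache directory such as `/cache/checkpoint_path` before export. It uses `shutil.copytree` as a local stand-in for the actual S3 transfer, and `fetch_checkpoint_dir` is a hypothetical helper.

```python
import shutil
from pathlib import Path


def fetch_checkpoint_dir(source_dir: str, local_dir: str, ckpt_name: str) -> Path:
    """Copy a checkpoint directory to local_dir and return the path of ckpt_name inside it."""
    dst = Path(local_dir)
    # dirs_exist_ok (Python 3.8+) lets repeated runs reuse the same cache directory.
    shutil.copytree(source_dir, dst, dirs_exist_ok=True)
    ckpt_file = dst / ckpt_name
    if not ckpt_file.is_file():
        raise FileNotFoundError(f"{ckpt_name} not found under {dst}")
    return ckpt_file
```

Failing early when the file is missing surfaces a mismatch between `checkpoint_url` and `ckpt_file` before any export work starts.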
|
||||
|
||||
## [Script Description](#contents)
|
||||
|
||||
## [Script and Sample Code](#contents)
|
||||
|
|
|
@ -133,6 +133,31 @@
|
|||
# (8) Create the training job.
|
||||
```
|
||||
|
||||
- Export on ModelArts (to run on ModelArts, see the [ModelArts](https://support.huaweicloud.com/modelarts/) documentation)

1. Export the model on ModelArts. The steps are as follows:
|
||||
|
||||
```python
|
||||
# (1) Perform a or b.
# a. Set "enable_modelarts=True" in default_config.yaml.
# Set "file_name='gat'" in default_config.yaml.
# Set "file_format='MINDIR'" in default_config.yaml.
# Set "checkpoint_url='/The path of checkpoint in S3/'" in default_config.yaml.
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" in default_config.yaml.
# Set any other parameters you need in default_config.yaml.
# b. Set "enable_modelarts=True" on the web UI.
# Set "file_name='gat'" on the web UI.
# Set "file_format='MINDIR'" on the web UI.
# Set "checkpoint_url='/The path of checkpoint in S3/'" on the web UI.
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the web UI.
# Set any other parameters you need on the web UI.
# (2) Upload your trained model to the S3 bucket.
# (3) Set the code directory to "/path/gat" on the web UI.
# (4) Set the startup file to "export.py" on the web UI.
# (5) Set "Dataset path", "Output file path", "Job log path", etc. on the web UI.
# (6) Create the training job.
|
||||
```
|
||||
|
||||
## Script Description

### Script and Sample Code
|
||||
|
|
|
@ -16,16 +16,21 @@
|
|||
import numpy as np
|
||||
from src.model_utils.config import config
|
||||
from src.model_utils.device_adapter import get_device_id
|
||||
from src.model_utils.moxing_adapter import moxing_wrapper
|
||||
from src.gat import GAT
|
||||
from mindspore import Tensor, context, load_checkpoint, export
|
||||
|
||||
|
||||
context.set_context(mode=context.GRAPH_MODE, device_target=config.device_target)
|
||||
if config.device_target == "Ascend":
    context.set_context(device_id=get_device_id()) # config.device_id
    context.set_context(device_id=get_device_id())

if __name__ == "__main__":
def modelarts_process():
    pass

@moxing_wrapper(pre_process=modelarts_process)
def export_gat():
    """ export_gat """
    if config.dataset == "citeseer":
        feature_size = [1, 3312, 3703]
        biases_size = [1, 3312, 3312]
|
||||
|
@ -57,3 +62,6 @@ if __name__ == "__main__":
|
|||
gat_net.add_flags_recursive(fp16=True)
|
||||
|
||||
export(gat_net, Tensor(feature), Tensor(biases), file_name=config.file_name, file_format=config.file_format)
|
||||
|
||||
if __name__ == '__main__':
|
||||
export_gat()
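
For the `citeseer` branch above, the dummy export inputs have shapes `[1, 3312, 3703]` and `[1, 3312, 3312]`. A quick calculation of their size follows, assuming `float32` elements (the dtype is not visible in this hunk).

```python
import math

FLOAT32_BYTES = 4  # assumption: inputs are float32

shapes = {
    "feature": [1, 3312, 3703],
    "biases": [1, 3312, 3312],
}

sizes = {}
for name, shape in shapes.items():
    elements = math.prod(shape)
    sizes[name] = elements * FLOAT32_BYTES
    print(f"{name}: {elements} elements, {sizes[name] / 2**20:.1f} MiB")
```

Both tensors are a few tens of MiB each under that assumption, which is worth keeping in mind when choosing the instance for the export job.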
|
||||
|
|
|
@ -80,6 +80,109 @@ Note that you can run the scripts based on the dataset mentioned in original pap
|
|||
bash run_eval_cpu.sh ./aclimdb ./glove_dir lstm-20_390.ckpt
|
||||
```
|
||||
|
||||
- Running on [ModelArts](https://support.huaweicloud.com/modelarts/)
|
||||
|
||||
```bash
|
||||
# Train 8p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "distribute=True" on default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" on default_config.yaml file.
|
||||
# Set "num_epochs: 20" on default_config.yaml file.
|
||||
# (optional)Set "checkpoint_url='s3://dir_to_your_pretrained/'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "distribute=True" on the website UI interface.
|
||||
# Add "dataset_path=/cache/data" on the website UI interface.
|
||||
# Add "num_epochs: 20" on the website UI interface.
|
||||
# (optional)Add "checkpoint_url='s3://dir_to_your_pretrained/'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your pretrained model to S3 bucket if you want to finetune.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, zip the MindRecord dataset into a single zip file.
# Second, upload the zipped dataset to your S3 bucket. (You can also upload the original MindRecord dataset unzipped, but that can be slow.)
# b. Upload the original dataset to your S3 bucket.
# (Dataset conversion then happens during training and takes a long time; it is repeated on every training run.)
|
||||
# (5) Set the code directory to "/path/lstm" on the website UI interface.
|
||||
# (6) Set the startup file to "train.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
#
|
||||
# Train 1p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" on default_config.yaml file.
|
||||
# Set "num_epochs: 20" on default_config.yaml file.
|
||||
# (optional)Set "checkpoint_url='s3://dir_to_your_pretrained/'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "dataset_path='/cache/data'" on the website UI interface.
|
||||
# Add "num_epochs: 20" on the website UI interface.
|
||||
# (optional)Add "checkpoint_url='s3://dir_to_your_pretrained/'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your pretrained model to S3 bucket if you want to finetune.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, zip the MindRecord dataset into a single zip file.
# Second, upload the zipped dataset to your S3 bucket. (You can also upload the original MindRecord dataset unzipped, but that can be slow.)
# b. Upload the original dataset to your S3 bucket.
# (Dataset conversion then happens during training and takes a long time; it is repeated on every training run.)
|
||||
# (5) Set the code directory to "/path/lstm" on the website UI interface.
|
||||
# (6) Set the startup file to "train.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
#
|
||||
# Eval 1p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "checkpoint_url='s3://dir_to_your_trained_model/'" on default_config.yaml file.
|
||||
# Set "checkpoint='./lstm/lstm_trained.ckpt'" on default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "checkpoint_url='s3://dir_to_your_trained_model/'" on the website UI interface.
|
||||
# Add "checkpoint='./lstm/lstm_trained.ckpt'" on the website UI interface.
|
||||
# Add "dataset_path='/cache/data'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your trained model to S3 bucket.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, zip the MindRecord dataset into a single zip file.
# Second, upload the zipped dataset to your S3 bucket. (You can also upload the original MindRecord dataset unzipped, but that can be slow.)
# b. Upload the original dataset to your S3 bucket.
# (Dataset conversion then happens during the run and takes a long time; it is repeated on every run.)
|
||||
# (5) Set the code directory to "/path/lstm" on the website UI interface.
|
||||
# (6) Set the startup file to "eval.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
```
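The 8p and 1p recipes above differ mainly in `distribute=True`. Elsewhere in this commit a similar flag is derived from the device count (`config.run_distribute = config.rank_size > 1.` in the MobileNetV2 export script). The sketch below assumes the device count arrives through a `RANK_SIZE` environment variable, which is a common Ascend convention but is not confirmed by this diff. `get_device_num_sketch` is a hypothetical name.

```python
import os


def get_device_num_sketch(environ=None) -> int:
    """Read the device count from RANK_SIZE, defaulting to a single device."""
    env = os.environ if environ is None else environ
    return int(env.get("RANK_SIZE", "1"))


def is_distributed(environ=None) -> bool:
    """A run counts as distributed when more than one device takes part."""
    return get_device_num_sketch(environ) > 1
```

Passing a plain dict instead of `os.environ` keeps the helper easy to try out locally without touching the real environment.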
|
||||
|
||||
- Export on ModelArts (to run on ModelArts, see the official [ModelArts](https://support.huaweicloud.com/modelarts/) documentation; then export as follows)

1. Export the model on ModelArts. The steps are as follows:
|
||||
|
||||
```python
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on base_config.yaml file.
|
||||
# Set "file_name='lstm'" on base_config.yaml file.
|
||||
# Set "file_format='AIR'" on base_config.yaml file.
|
||||
# Set "checkpoint_url='/The path of checkpoint in S3/'" on base_config.yaml file.
|
||||
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on base_config.yaml file.
|
||||
# Set other parameters on base_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "file_name='lstm'" on the website UI interface.
|
||||
# Add "file_format='AIR'" on the website UI interface.
|
||||
# Add "checkpoint_url='/The path of checkpoint in S3/'" on the website UI interface.
|
||||
# Add "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Upload or copy your trained model to S3 bucket.
|
||||
# (3) Set the code directory to "/path/lstm" on the website UI interface.
|
||||
# (4) Set the startup file to "export.py" on the website UI interface.
|
||||
# (5) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (6) Create your job.
|
||||
```
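`file_name` and `file_format` together determine the exported artifact. To my understanding, MindSpore's `export` appends a format-specific suffix to `file_name` when it is missing. The mapping below (`.air`, `.onnx`, `.mindir`) reflects that understanding and should be checked against the MindSpore version in use. `exported_path` is a hypothetical helper.

```python
# Assumed suffix per export format; verify against your MindSpore version.
SUFFIXES = {"AIR": ".air", "ONNX": ".onnx", "MINDIR": ".mindir"}


def exported_path(file_name: str, file_format: str) -> str:
    """Predict the output file name for a given file_name/file_format pair."""
    try:
        suffix = SUFFIXES[file_format.upper()]
    except KeyError:
        raise ValueError(f"unsupported file_format: {file_format!r}") from None
    return file_name if file_name.endswith(suffix) else file_name + suffix
```

Knowing the final name up front helps when locating the result under the job's output path afterwards.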
|
||||
|
||||
# [Script Description](#contents)
|
||||
|
||||
## [Script and Sample Code](#contents)
|
||||
|
|
|
@ -87,6 +87,109 @@ The LSTM model consists of an embedding layer, an encoder, and a decoder; the encoder module
|
|||
bash run_eval_cpu.sh ./aclimdb ./glove_dir lstm-20_390.ckpt
|
||||
```
|
||||
|
||||
- Train on ModelArts (to run on ModelArts, see the [ModelArts](https://support.huaweicloud.com/modelarts/) documentation)
|
||||
|
||||
```python
|
||||
# Train on ModelArts with 8 devices
# (1) Perform a or b.
# a. Set "enable_modelarts=True" in default_config.yaml.
# Set "distribute=True" in default_config.yaml.
# Set "dataset_path='/cache/data'" in default_config.yaml.
# Set "num_epochs: 20" in default_config.yaml.
# (Optional) Set "checkpoint_url='s3://dir_to_your_pretrained/'" in default_config.yaml.
# Set any other parameters you need in default_config.yaml.
# b. Set "enable_modelarts=True" on the web UI.
# Set "distribute=True" on the web UI.
# Set "dataset_path=/cache/data" on the web UI.
# Set "num_epochs: 20" on the web UI.
# (Optional) Set "checkpoint_url='s3://dir_to_your_pretrained/'" on the web UI.
# Set any other parameters you need on the web UI.
# (2) Prepare the model code.
# (3) If you want to fine-tune, upload your pretrained model to the S3 bucket.
# (4) Perform a or b (option a is recommended).
# a. First, compress the dataset into a single ".zip" file.
# Second, upload the zipped dataset to the S3 bucket. (You can also upload the uncompressed dataset, but that may be slow.)
# b. Upload the original dataset to the S3 bucket.
# (Dataset conversion then happens during training and takes considerable time; it is repeated on every training run.)
# (5) Set the code directory to "/path/lstm" on the web UI.
# (6) Set the startup file to "train.py" on the web UI.
# (7) Set "Dataset path", "Output file path", "Job log path", etc. on the web UI.
# (8) Create the training job.
|
||||
#
|
||||
# Train on ModelArts with a single device
# (1) Perform a or b.
# a. Set "enable_modelarts=True" in default_config.yaml.
# Set "dataset_path='/cache/data'" in default_config.yaml.
# Set "num_epochs: 20" in default_config.yaml.
# (Optional) Set "checkpoint_url='s3://dir_to_your_pretrained/'" in default_config.yaml.
# Set any other parameters you need in default_config.yaml.
# b. Set "enable_modelarts=True" on the web UI.
# Set "dataset_path='/cache/data'" on the web UI.
# Set "num_epochs: 20" on the web UI.
# (Optional) Set "checkpoint_url='s3://dir_to_your_pretrained/'" on the web UI.
# Set any other parameters you need on the web UI.
# (2) Prepare the model code.
# (3) If you want to fine-tune, upload your pretrained model to the S3 bucket.
# (4) Perform a or b (option a is recommended).
# a. First, compress the dataset into a single ".zip" file.
# Second, upload the zipped dataset to the S3 bucket. (You can also upload the uncompressed dataset, but that may be slow.)
# b. Upload the original dataset to the S3 bucket.
# (Dataset conversion then happens during training and takes considerable time; it is repeated on every training run.)
# (5) Set the code directory to "/path/lstm" on the web UI.
# (6) Set the startup file to "train.py" on the web UI.
# (7) Set "Dataset path", "Output file path", "Job log path", etc. on the web UI.
# (8) Create the training job.
|
||||
#
|
||||
# Evaluate on ModelArts with a single device
# (1) Perform a or b.
# a. Set "enable_modelarts=True" in default_config.yaml.
# Set "checkpoint_url='s3://dir_to_your_trained_model/'" in default_config.yaml.
# Set "checkpoint='./lstm/lstm_trained.ckpt'" in default_config.yaml.
# Set "dataset_path='/cache/data'" in default_config.yaml.
# Set any other parameters you need in default_config.yaml.
# b. Set "enable_modelarts=True" on the web UI.
# Set "checkpoint_url='s3://dir_to_your_trained_model/'" on the web UI.
# Set "checkpoint='./lstm/lstm_trained.ckpt'" on the web UI.
# Set "dataset_path='/cache/data'" on the web UI.
# Set any other parameters you need on the web UI.
# (2) Prepare the model code.
# (3) Upload your trained model to the S3 bucket.
# (4) Perform a or b (option a is recommended).
# a. First, compress the dataset into a single ".zip" file.
# Second, upload the zipped dataset to the S3 bucket. (You can also upload the uncompressed dataset, but that may be slow.)
# b. Upload the original dataset to the S3 bucket.
# (Dataset conversion then happens during the run and takes considerable time; it is repeated on every run.)
# (5) Set the code directory to "/path/lstm" on the web UI.
# (6) Set the startup file to "eval.py" on the web UI.
# (7) Set "Dataset path", "Output file path", "Job log path", etc. on the web UI.
# (8) Create the training job.
|
||||
```
|
||||
|
||||
- Export on ModelArts (to run on ModelArts, see the [ModelArts](https://support.huaweicloud.com/modelarts/) documentation)

1. Export the model on ModelArts. The steps are as follows:
|
||||
|
||||
```python
|
||||
# (1) Perform a or b.
# a. Set "enable_modelarts=True" in base_config.yaml.
# Set "file_name='lstm'" in base_config.yaml.
# Set "file_format='AIR'" in base_config.yaml.
# Set "checkpoint_url='/The path of checkpoint in S3/'" in base_config.yaml.
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" in base_config.yaml.
# Set any other parameters you need in base_config.yaml.
# b. Set "enable_modelarts=True" on the web UI.
# Set "file_name='lstm'" on the web UI.
# Set "file_format='AIR'" on the web UI.
# Set "checkpoint_url='/The path of checkpoint in S3/'" on the web UI.
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the web UI.
# Set any other parameters you need on the web UI.
# (2) Upload your trained model to the S3 bucket.
# (3) Set the code directory to "/path/lstm" on the web UI.
# (4) Set the startup file to "export.py" on the web UI.
# (5) Set "Dataset path", "Output file path", "Job log path", etc. on the web UI.
# (6) Create the training job.
|
||||
```
|
||||
|
||||
# Script Description

## Script and Sample Code
|
||||
|
|
|
@ -141,6 +141,109 @@ After installing MindSpore via the official website, you can start training and
|
|||
--device_target='CPU' > ms_log/eval_output.log 2>&1 &
|
||||
```
|
||||
|
||||
- Running on [ModelArts](https://support.huaweicloud.com/modelarts/)
|
||||
|
||||
```bash
|
||||
# Train 8p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "distribute=True" on default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" on default_config.yaml file.
|
||||
# Set "train_epochs: 5" on default_config.yaml file.
|
||||
# (optional)Set "checkpoint_url='s3://dir_to_your_pretrained/'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "distribute=True" on the website UI interface.
|
||||
# Add "dataset_path=/cache/data" on the website UI interface.
|
||||
# Add "train_epochs: 5" on the website UI interface.
|
||||
# (optional)Add "checkpoint_url='s3://dir_to_your_pretrained/'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your pretrained model to S3 bucket if you want to finetune.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, zip the MindRecord dataset into a single zip file.
# Second, upload the zipped dataset to your S3 bucket. (You can also upload the original MindRecord dataset unzipped, but that can be slow.)
# b. Upload the original dataset to your S3 bucket.
# (Dataset conversion then happens during training and takes a long time; it is repeated on every training run.)
|
||||
# (5) Set the code directory to "/path/deepfm" on the website UI interface.
|
||||
# (6) Set the startup file to "train.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
#
|
||||
# Train 1p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" on default_config.yaml file.
|
||||
# Set "train_epochs: 5" on default_config.yaml file.
|
||||
# (optional)Set "checkpoint_url='s3://dir_to_your_pretrained/'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "dataset_path='/cache/data'" on the website UI interface.
|
||||
# Add "train_epochs: 5" on the website UI interface.
|
||||
# (optional)Add "checkpoint_url='s3://dir_to_your_pretrained/'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your pretrained model to S3 bucket if you want to finetune.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, zip the MindRecord dataset into a single zip file.
# Second, upload the zipped dataset to your S3 bucket. (You can also upload the original MindRecord dataset unzipped, but that can be slow.)
# b. Upload the original dataset to your S3 bucket.
# (Dataset conversion then happens during training and takes a long time; it is repeated on every training run.)
|
||||
# (5) Set the code directory to "/path/deepfm" on the website UI interface.
|
||||
# (6) Set the startup file to "train.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
#
|
||||
# Eval 1p with Ascend
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on default_config.yaml file.
|
||||
# Set "checkpoint_url='s3://dir_to_your_trained_model/'" on default_config.yaml file.
|
||||
# Set "checkpoint='./deepfm/deepfm_trained.ckpt'" on default_config.yaml file.
|
||||
# Set "dataset_path='/cache/data'" on default_config.yaml file.
|
||||
# Set other parameters on default_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "checkpoint_url='s3://dir_to_your_trained_model/'" on the website UI interface.
|
||||
# Add "checkpoint='./deepfm/deepfm_trained.ckpt'" on the website UI interface.
|
||||
# Add "dataset_path='/cache/data'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Prepare model code
|
||||
# (3) Upload or copy your trained model to S3 bucket.
|
||||
# (4) Perform a or b. (suggested option a)
|
||||
# a. First, zip the MindRecord dataset into a single zip file.
# Second, upload the zipped dataset to your S3 bucket. (You can also upload the original MindRecord dataset unzipped, but that can be slow.)
# b. Upload the original dataset to your S3 bucket.
# (Dataset conversion then happens during the run and takes a long time; it is repeated on every run.)
|
||||
# (5) Set the code directory to "/path/deepfm" on the website UI interface.
|
||||
# (6) Set the startup file to "eval.py" on the website UI interface.
|
||||
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (8) Create your job.
|
||||
```
|
||||
|
||||
- Export on ModelArts (to run on ModelArts, see the official [ModelArts](https://support.huaweicloud.com/modelarts/) documentation; then export as follows)

1. Export the model on ModelArts. The steps are as follows:
|
||||
|
||||
```python
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on base_config.yaml file.
|
||||
# Set "file_name='deepfm'" on base_config.yaml file.
|
||||
# Set "file_format='AIR'" on base_config.yaml file.
|
||||
# Set "checkpoint_url='/The path of checkpoint in S3/'" on base_config.yaml file.
|
||||
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on base_config.yaml file.
|
||||
# Set other parameters on base_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "file_name='deepfm'" on the website UI interface.
|
||||
# Add "file_format='AIR'" on the website UI interface.
|
||||
# Add "checkpoint_url='/The path of checkpoint in S3/'" on the website UI interface.
|
||||
# Add "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Upload or copy your trained model to S3 bucket.
|
||||
# (3) Set the code directory to "/path/deepfm" on the website UI interface.
|
||||
# (4) Set the startup file to "export.py" on the website UI interface.
|
||||
# (5) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (6) Create your job.
|
||||
```
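Before creating the export job, it can help to check the settings from step (1) locally. The check below is only a sketch: the required keys are the ones listed in the steps above, while the rule set itself (`validate_export_config`) is hypothetical and not part of this commit.

```python
REQUIRED_KEYS = ("enable_modelarts", "file_name", "file_format", "checkpoint_url", "ckpt_file")


def validate_export_config(cfg: dict) -> list:
    """Return a list of human-readable problems; an empty list means the config looks usable."""
    problems = [f"missing setting: {key}" for key in REQUIRED_KEYS if key not in cfg]
    if cfg.get("enable_modelarts") is not True:
        problems.append("enable_modelarts must be True for a ModelArts job")
    ckpt_file = cfg.get("ckpt_file", "")
    if ckpt_file and not ckpt_file.endswith(".ckpt"):
        problems.append(f"ckpt_file should point to a .ckpt file: {ckpt_file}")
    return problems
```

Returning a list rather than raising on the first issue lets all problems be fixed in one pass before the job is submitted.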
|
||||
|
||||
# [Script Description](#contents)
|
||||
|
||||
## [Script and Sample Code](#contents)
|
||||
|
|
|
@ -126,6 +126,109 @@ The FM and deep learning parts share the same raw input feature vector, enabling DeepFM to learn from
|
|||
sh scripts/run_eval.sh 0 GPU /dataset_path /checkpoint_path/deepfm.ckpt
|
||||
```
|
||||
|
||||
- Train on ModelArts (to run on ModelArts, see the [ModelArts](https://support.huaweicloud.com/modelarts/) documentation)
|
||||
|
||||
```python
|
||||
# Train on ModelArts with 8 devices
# (1) Perform a or b.
# a. Set "enable_modelarts=True" in default_config.yaml.
# Set "distribute=True" in default_config.yaml.
# Set "dataset_path='/cache/data'" in default_config.yaml.
# Set "train_epochs: 5" in default_config.yaml.
# (Optional) Set "checkpoint_url='s3://dir_to_your_pretrained/'" in default_config.yaml.
# Set any other parameters you need in default_config.yaml.
# b. Set "enable_modelarts=True" on the web UI.
# Set "distribute=True" on the web UI.
# Set "dataset_path=/cache/data" on the web UI.
# Set "train_epochs: 5" on the web UI.
# (Optional) Set "checkpoint_url='s3://dir_to_your_pretrained/'" on the web UI.
# Set any other parameters you need on the web UI.
# (2) Prepare the model code.
# (3) If you want to fine-tune, upload your pretrained model to the S3 bucket.
# (4) Perform a or b (option a is recommended).
# a. First, compress the dataset into a single ".zip" file.
# Second, upload the zipped dataset to the S3 bucket. (You can also upload the uncompressed dataset, but that may be slow.)
# b. Upload the original dataset to the S3 bucket.
# (Dataset conversion then happens during training and takes considerable time; it is repeated on every training run.)
# (5) Set the code directory to "/path/deepfm" on the web UI.
# (6) Set the startup file to "train.py" on the web UI.
# (7) Set "Dataset path", "Output file path", "Job log path", etc. on the web UI.
# (8) Create the training job.
|
||||
#
|
||||
# Train on ModelArts with a single device
# (1) Perform a or b.
# a. Set "enable_modelarts=True" in default_config.yaml.
# Set "dataset_path='/cache/data'" in default_config.yaml.
# Set "train_epochs: 5" in default_config.yaml.
# (Optional) Set "checkpoint_url='s3://dir_to_your_pretrained/'" in default_config.yaml.
# Set any other parameters you need in default_config.yaml.
# b. Set "enable_modelarts=True" on the web UI.
# Set "dataset_path='/cache/data'" on the web UI.
# Set "train_epochs: 5" on the web UI.
# (Optional) Set "checkpoint_url='s3://dir_to_your_pretrained/'" on the web UI.
# Set any other parameters you need on the web UI.
# (2) Prepare the model code.
# (3) If you want to fine-tune, upload your pretrained model to the S3 bucket.
# (4) Perform a or b (option a is recommended).
# a. First, compress the dataset into a single ".zip" file.
# Second, upload the zipped dataset to the S3 bucket. (You can also upload the uncompressed dataset, but that may be slow.)
# b. Upload the original dataset to the S3 bucket.
# (Dataset conversion then happens during training and takes considerable time; it is repeated on every training run.)
# (5) Set the code directory to "/path/deepfm" on the web UI.
# (6) Set the startup file to "train.py" on the web UI.
# (7) Set "Dataset path", "Output file path", "Job log path", etc. on the web UI.
# (8) Create the training job.
|
||||
#
|
||||
# Evaluate on ModelArts with a single device
# (1) Perform a or b.
# a. Set "enable_modelarts=True" in default_config.yaml.
# Set "checkpoint_url='s3://dir_to_your_trained_model/'" in default_config.yaml.
# Set "checkpoint='./deepfm/deepfm_trained.ckpt'" in default_config.yaml.
# Set "dataset_path='/cache/data'" in default_config.yaml.
# Set any other parameters you need in default_config.yaml.
# b. Set "enable_modelarts=True" on the web UI.
# Set "checkpoint_url='s3://dir_to_your_trained_model/'" on the web UI.
# Set "checkpoint='./deepfm/deepfm_trained.ckpt'" on the web UI.
# Set "dataset_path='/cache/data'" on the web UI.
# Set any other parameters you need on the web UI.
# (2) Prepare the model code.
# (3) Upload your trained model to the S3 bucket.
# (4) Perform a or b (option a is recommended).
# a. First, compress the dataset into a single ".zip" file.
# Second, upload the zipped dataset to the S3 bucket. (You can also upload the uncompressed dataset, but that may be slow.)
# b. Upload the original dataset to the S3 bucket.
# (Dataset conversion then happens during the run and takes considerable time; it is repeated on every run.)
# (5) Set the code directory to "/path/deepfm" on the web UI.
# (6) Set the startup file to "eval.py" on the web UI.
# (7) Set "Dataset path", "Output file path", "Job log path", etc. on the web UI.
# (8) Create the training job.
|
||||
```
|
||||
|
||||
- Export on ModelArts (to run on ModelArts, see the [ModelArts](https://support.huaweicloud.com/modelarts/) documentation)

1. Export the model on ModelArts. The steps are as follows:
|
||||
|
||||
```python
|
||||
# (1) Perform a or b.
# a. Set "enable_modelarts=True" in base_config.yaml.
# Set "file_name='deepfm'" in base_config.yaml.
# Set "file_format='AIR'" in base_config.yaml.
# Set "checkpoint_url='/The path of checkpoint in S3/'" in base_config.yaml.
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" in base_config.yaml.
# Set any other parameters you need in base_config.yaml.
# b. Set "enable_modelarts=True" on the web UI.
# Set "file_name='deepfm'" on the web UI.
# Set "file_format='AIR'" on the web UI.
# Set "checkpoint_url='/The path of checkpoint in S3/'" on the web UI.
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the web UI.
# Set any other parameters you need on the web UI.
# (2) Upload your trained model to the S3 bucket.
# (3) Set the code directory to "/path/deepfm" on the web UI.
# (4) Set the startup file to "export.py" on the web UI.
# (5) Set "Dataset path", "Output file path", "Job log path", etc. on the web UI.
# (6) Create the training job.
|
||||
```
|
||||
|
||||
## Script Description

## Script and Sample Code
|
||||
|
|
|
@ -21,15 +21,19 @@ from mindspore.train.serialization import export, load_checkpoint
|
|||
from src.deepfm import ModelBuilder
|
||||
from src.model_utils.config import config
|
||||
from src.model_utils.device_adapter import get_device_id
|
||||
|
||||
from src.model_utils.moxing_adapter import moxing_wrapper
|
||||
|
||||
|
||||
context.set_context(mode=context.GRAPH_MODE, device_target=config.device_target)
|
||||
if config.device_target == "Ascend":
    context.set_context(device_id=get_device_id())

if __name__ == "__main__":
def modelarts_process():
    pass

@moxing_wrapper(pre_process=modelarts_process)
def export_deepfm():
    """ export_deepfm """
    model_builder = ModelBuilder(config, config)
    _, network = model_builder.get_train_eval_net()
    network.set_train(False)
|
||||
|
@ -42,3 +46,6 @@ if __name__ == "__main__":
|
|||
|
||||
input_data = [batch_ids, batch_wts, labels]
|
||||
export(network, *input_data, file_name=config.file_name, file_format=config.file_format)
|
||||
|
||||
if __name__ == '__main__':
|
||||
export_deepfm()
|
||||
|
|
|
@ -145,6 +145,31 @@ SLOG_PRINT_TO_STDOUT=1 python eval.py --device_id 0
|
|||
# (8) Create your job.
|
||||
```
|
||||
|
||||
- Export on ModelArts (to run on ModelArts, see the official [ModelArts](https://support.huaweicloud.com/modelarts/) documentation; then export as follows)

1. Export the model on ModelArts. The steps are as follows:
|
||||
|
||||
```python
|
||||
# (1) Perform a or b.
|
||||
# a. Set "enable_modelarts=True" on base_config.yaml file.
|
||||
# Set "file_name='fcn-4'" on base_config.yaml file.
|
||||
# Set "file_format='AIR'" on base_config.yaml file.
|
||||
# Set "checkpoint_url='/The path of checkpoint in S3/'" on base_config.yaml file.
|
||||
# Set "ckpt_file='/cache/checkpoint_path/model.ckpt'" on base_config.yaml file.
|
||||
# Set other parameters on base_config.yaml file you need.
|
||||
# b. Add "enable_modelarts=True" on the website UI interface.
|
||||
# Add "file_name='fcn-4'" on the website UI interface.
|
||||
# Add "file_format='AIR'" on the website UI interface.
|
||||
# Add "checkpoint_url='/The path of checkpoint in S3/'" on the website UI interface.
|
||||
# Add "ckpt_file='/cache/checkpoint_path/model.ckpt'" on the website UI interface.
|
||||
# Add other parameters on the website UI interface.
|
||||
# (2) Upload or copy your trained model to S3 bucket.
|
||||
# (3) Set the code directory to "/path/fcn-4" on the website UI interface.
|
||||
# (4) Set the startup file to "export.py" on the website UI interface.
|
||||
# (5) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
|
||||
# (6) Create your job.
|
||||
```
|
||||
|
||||
## [Script Description](#contents)
|
||||
|
||||
### [Script and Sample Code](#contents)
|
||||
|
|
|
@ -23,10 +23,16 @@ from mindspore import Tensor
|
|||
from mindspore.train.serialization import load_checkpoint, load_param_into_net
|
||||
|
||||
from src.model_utils.config import config
|
||||
from src.model_utils.moxing_adapter import moxing_wrapper
|
||||
from src.musictagger import MusicTaggerCNN
|
||||
|
||||
|
||||
if __name__ == "__main__":
def modelarts_process():
    pass

@moxing_wrapper(pre_process=modelarts_process)
def export_fcn4():
    """ export_fcn4 """
    network = MusicTaggerCNN(in_classes=[1, 128, 384, 768, 2048],
                             kernel_size=[3, 3, 3, 3, 3],
                             padding=[0] * 5,
|
||||
|
@ -36,3 +42,6 @@ if __name__ == "__main__":
|
|||
load_param_into_net(network, param_dict)
|
||||
input_data = np.random.uniform(0.0, 1.0, size=[config.batch_size, 1, 96, 1366]).astype(np.float32)
|
||||
export(network, Tensor(input_data), file_name=config.file_name, file_format=config.file_format)
|
||||
|
||||
if __name__ == '__main__':
|
||||
export_fcn4()
|
||||
|
|