edit readme

yoonlee666 2020-10-10 09:36:42 +08:00
parent 8fa83cca87
commit 263a591473
7 changed files with 15 additions and 12 deletions

View File

@@ -411,20 +411,22 @@ epoch: 0.0, current epoch percent: 0.002, step: 200, outpus are (Tensor(shape=[1
Before running the command below, please make sure the path for loading the pretrained checkpoint has been set. The checkpoint path must be an absolute path, e.g. "/username/pretrain/checkpoint_100_300.ckpt".
```
bash scripts/run_classifier.sh
```
The command above will run in the background; you can view the training logs in classfier_log.txt.
If you choose accuracy as the assessment method, the result will be as follows:
```
acc_num XXX, total_num XXX, accuracy 0.588986
```
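For reference, the reported accuracy is simply the ratio of the two counts printed in the log. A minimal illustrative sketch (the counts below are made up; the script prints the real ones):
```
# accuracy = acc_num / total_num, mirroring the "acc_num ..., total_num ..., accuracy ..." log line
acc_num, total_num = 5890, 10000        # hypothetical counts for illustration
accuracy = acc_num / total_num
print(f"acc_num {acc_num}, total_num {total_num}, accuracy {accuracy:.6f}")
```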
#### evaluation on cluener dataset when running on Ascend
```
bash scripts/ner.sh
```
The command above will run in the background; you can view the training logs in ner_log.txt.
If you choose F1 as the assessment method, the result will be as follows:
```
Precision 0.920507
Recall 0.948683
F1 0.920507
@@ -433,9 +435,10 @@ F1 0.920507
#### evaluation on squad v1.1 dataset when running on Ascend
```
bash scripts/squad.sh
```
The command above will run in the background; you can view the training logs in squad_log.txt.
The result will be as follows:
```
{"exact_match": 80.3878923040233284, "f1": 87.6902384023850329}
```
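These are the standard SQuAD v1.1 metrics: exact_match is the percentage of predictions that exactly match a reference answer after normalization, and f1 is the average token-overlap F1. A simplified sketch of how they are typically computed; the official evaluate-v1.1.py additionally strips punctuation and articles and takes the maximum over multiple reference answers, and the reported numbers are these per-question scores averaged over the dataset and multiplied by 100:
```
# Simplified sketch of SQuAD-style metrics; normalization here is only
# lowercasing and whitespace collapsing, unlike the official script.
from collections import Counter

def normalize(text):
    return " ".join(text.lower().split())

def exact_match(prediction, reference):
    return float(normalize(prediction) == normalize(reference))

def token_f1(prediction, reference):
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```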

View File

@@ -1,11 +1,11 @@
# Run distribute pretrain
## description
The number of D chips can be automatically allocated based on the device_num set in hccl config file, You don not need to specify that.
The number of Ascend accelerators can be allocated automatically based on the device_num set in the hccl config file; you do not need to specify it.
## how to use
For example, if we want to generate the launch command of the distributed training of Bert model on D chip, we can run the following command in `/bert/` dir:
For example, to generate the launch command for distributed training of the Bert model on Ascend accelerators, run the following command in the `/bert/` dir:
```
python ./scripts/ascend_distributed_launcher/get_distribute_pretrain_cmd.py --run_script_dir ./run_pretrain.py --hyper_parameter_config_dir ./scripts/ascend_distributed_launcher/hyper_parameter_config.ini --data_dir /path/dataset/ --hccl_config_dir model_zoo/utils/hccl_tools/hccl_2p_56_x.x.x.x.json
```
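For context on the `--hccl_config_dir` argument above: the launcher derives the device count from the HCCL rank-table JSON itself. A rough sketch, assuming the usual rank-table layout with a `server_list` of per-device entries (the field names are assumptions, not taken from this repository):
```
# Rough sketch: count the devices listed in an HCCL rank-table JSON.
# "server_list" and "device" are assumed field names of the usual rank-table layout.
import json

def count_devices(hccl_config_path):
    with open(hccl_config_path) as f:
        rank_table = json.load(f)
    return sum(len(server.get("device", [])) for server in rank_table.get("server_list", []))

# e.g. count_devices("model_zoo/utils/hccl_tools/hccl_2p_56_x.x.x.x.json") would give 2
```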

View File

@@ -59,7 +59,7 @@ def append_cmd_env(cmd, key, value):
def distribute_pretrain():
"""
distribute pretrain scripts. The number of D chips can be automatically allocated
distribute pretrain scripts. The number of Ascend accelerators can be automatically allocated
based on the device_num set in the hccl config file; you do not need to specify it.
"""
cmd = ""

View File

@@ -1,11 +1,11 @@
# Run distribute pretrain
## description
The number of D chips can be automatically allocated based on the device_num set in hccl config file, You don not need to specify that.
The number of Ascend accelerators can be allocated automatically based on the device_num set in the hccl config file; you do not need to specify it.
## how to use
For example, if we want to generate the launch command of the distributed training of Bert model on D chip, we can run the following command in `/bert/` dir:
For example, to generate the launch command for distributed training of the Bert model on Ascend accelerators, run the following command in the `/bert/` dir:
```
python ./scripts/ascend_distributed_launcher/get_distribute_pretrain_cmd.py --run_script_dir ./run_pretrain.py --hyper_parameter_config_dir ./scripts/ascend_distributed_launcher/hyper_parameter_config.ini --data_dir /path/dataset/ --hccl_config_dir model_zoo/utils/hccl_tools/hccl_2p_56_x.x.x.x.json
```

View File

@@ -59,7 +59,7 @@ def append_cmd_env(cmd, key, value):
def distribute_pretrain():
"""
distribute pretrain scripts. The number of D chips can be automatically allocated
distribute pretrain scripts. The number of Ascend accelerators can be automatically allocated
based on the device_num set in the hccl config file; you do not need to specify it.
"""
cmd = ""

View File

@@ -1,6 +1,6 @@
# description
mindspore distributed training launch helper utilty that will generate hccl config file.
MindSpore distributed training launch helper utility that will generate the hccl config file.
# use
@@ -14,4 +14,4 @@ hccl_[device_num]p_[which device]_[server_ip].json
```
# Note
Please note that the D chips used must be continuous, such [0,4) means to use four chips 0123; [0,1) means to use chip 0; The first four chips are a group, and the last four chips are a group. In addition to the [0,8) chips are allowed, other cross-group such as [3,6) are prohibited.
Please note that the Ascend accelerators used must be contiguous; for example, [0,4) means using the four accelerators 0, 1, 2 and 3, and [0,1) means using accelerator 0. The first four accelerators form one group and the last four form another; apart from the full range [0,8), cross-group ranges such as [3,6) are prohibited.
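A small illustrative sketch (not the actual hccl_tools.py logic) of how such a half-open range string maps to device IDs and of the grouping rule described above:
```
# Illustrative only: expand a half-open "[first,last)" device range and enforce
# the grouping rule (devices 0-3 and 4-7 are separate groups; only [0,8) may span both).
def parse_device_num(spec):
    first, last = (int(x) for x in spec.strip("[)").split(","))
    devices = list(range(first, last))
    same_group = all(d < 4 for d in devices) or all(d >= 4 for d in devices)
    if not same_group and (first, last) != (0, 8):
        raise ValueError(f"cross-group range {spec} is prohibited")
    return devices

print(parse_device_num("[0,4)"))   # [0, 1, 2, 3]
print(parse_device_num("[0,1)"))   # [0]
```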

View File

@@ -37,7 +37,7 @@ def parse_args():
"helper utilty that will generate hccl"
" config file")
parser.add_argument("--device_num", type=str, default="[0,8)",
help="The number of the D chip used. please note that the D chips"
help="The number of the Ascend accelerators used. please note that the Ascend accelerators"
"used must be continuous, such [0,4) means to use four chips "
"0123; [0,1) means to use chip 0; The first four chips are"
"a group, and the last four chips are a group. In addition to"