fixed the bad links
parent
7697492fb2
commit
1df083b8d7
|
@ -1,6 +1,6 @@
|
|||
![](https://www.mindspore.cn/static/img/logo.a3e472c9.png)
|
||||
# ![MindSpore Logo](https://www.mindspore.cn/static/img/logo_black.6a5c850d.png)
|
||||
|
||||
# Welcome to the Model Zoo for MindSpore
|
||||
## Welcome to the Model Zoo for MindSpore
|
||||
|
||||
To help developers enjoy the benefits of the MindSpore framework, we will continue to add typical networks and related pre-trained models. If you need a model that is missing from the model zoo, you can file an issue on [gitee](https://gitee.com/mindspore/mindspore/issues) or [MindSpore](https://bbs.huaweicloud.com/forum/forum-1076-1.html), and we will consider it promptly.
|
||||
|
||||
|
@ -10,23 +10,23 @@ In order to facilitate developers to enjoy the benefits of MindSpore framework,
|
|||
|
||||
- Officially maintained and supported
|
||||
|
||||
# Table of Contents
|
||||
## Table of Contents
|
||||
|
||||
- [Official](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official)
|
||||
- [Computer Vision](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv)
|
||||
- [Image Classification](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv)
|
||||
- [GoogleNet](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/googlenet/README.md)
|
||||
- [ResNet50[benchmark]](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/resnet/README.md)
|
||||
- [ResNet50_Quant](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/resnet/resnet_quant/README.md)
|
||||
- [ResNet50_Quant](https://gitee.com/mindspore/mindspore/blob/master/model_zoo/official/cv/resnet50_quant/README.md)
|
||||
- [ResNet101](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/resnet/README.md)
|
||||
- [ResNext50](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/resnext50/README.md)
|
||||
- [VGG16](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/vgg16/README.md)
|
||||
- [AlexNet](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/alexnet/README.md)
|
||||
- [LeNet](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/lenet/README.md)
|
||||
- [LeNet_Quant](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/lenet_quant/README.md)
|
||||
- [LeNet_Quant](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/lenet_quant/Readme.md)
|
||||
- [MobileNetV2](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/mobilenetv2/README.md)
|
||||
- [MobileNetV2_Quant](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/mobilenetv2_quant/README.md)
|
||||
- [MobileNetV3](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/mobilenetv3/README.md)
|
||||
- [MobileNetV2_Quant](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/mobilenetv2_quant/Readme.md)
|
||||
- [MobileNetV3](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/mobilenetv3/Readme.md)
|
||||
- [InceptionV3](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/inceptionv3/README.md)
|
||||
- [InceptionV4](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/inceptionv4/README.md)
|
||||
- [Object Detection and Segmentation](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv)
|
||||
|
@ -59,9 +59,9 @@ In order to facilitate developers to enjoy the benefits of MindSpore framework,
|
|||
|
||||
- [Research](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research)
|
||||
- [Computer Vision](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv)
|
||||
- [GhostNet](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv/ghostnet/README.md)
|
||||
- [GhostNet_Quant](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv/ghostnet_quant/README.md)
|
||||
- [ResNet50-0.65x](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv/resnet50_adv_pruning/README.md)
|
||||
- [GhostNet](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv/ghostnet/Readme.md)
|
||||
- [GhostNet_Quant](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv/ghostnet_quant/Readme.md)
|
||||
- [ResNet50-0.65x](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv/resnet50_adv_pruning/Readme.md)
|
||||
- [SSD_GhostNet](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv/ssd_ghostnet/README.md)
|
||||
- [TinyNet](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv/tinynet/README.md)
|
||||
- [CycleGAN](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv/cycle_gan/README.md)
|
||||
|
@ -80,7 +80,7 @@ In order to facilitate developers to enjoy the benefits of MindSpore framework,
|
|||
|
||||
- [Community](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/community)
|
||||
|
||||
# Announcements
|
||||
## Announcements
|
||||
|
||||
| Date | News |
|
||||
| ------------ | ------------------------------------------------------------ |
|
||||
|
@ -88,7 +88,7 @@ In order to facilitate developers to enjoy the benefits of MindSpore framework,
|
|||
| September 01, 2020 | Support [MindSpore v0.7.0-beta](https://www.mindspore.cn/news/newschildren/en?id=246) |
|
||||
| July 31, 2020 | Support [MindSpore v0.6.0-beta](https://www.mindspore.cn/news/newschildren/en?id=237) |
|
||||
|
||||
# Disclaimers
|
||||
## Disclaimers
|
||||
|
||||
MindSpore only provides scripts that download and preprocess public datasets. We do not own these datasets and are not responsible for their quality or maintenance. Please make sure you have permission to use the dataset under the dataset’s license. The models trained on these datasets are for non-commercial research and educational purposes only.
|
||||
|
||||
|
@ -96,6 +96,6 @@ To dataset owners: we will remove or update all public content upon request if y
|
|||
|
||||
MindSpore is Apache 2.0 licensed. Please see the LICENSE file.
|
||||
|
||||
# License
|
||||
## License
|
||||
|
||||
[Apache License 2.0](https://gitee.com/mindspore/mindspore/blob/master/LICENSE)
|
||||
|
|
|
@ -79,7 +79,7 @@ For FP16 operators, if the input data type is FP32, the backend of MindSpore wil
|
|||
- Framework
|
||||
- [MindSpore](https://www.mindspore.cn/install/en)
|
||||
- Download the VGG19 model of the MindSpore version:
|
||||
- [vgg19-0-97_5004.ckpt](http://10.154.33.38:51203/tutorials/image_classification.html)
|
||||
- vgg19-0-97_5004.ckpt
|
||||
- For more information, please check the resources below:
|
||||
- [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
|
||||
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
|
||||
|
@ -218,8 +218,3 @@ For more configuration details, please refer the script `config.py`.
|
|||
| Speed | 1pcs: 35fps, 8pcs: 230fps
|
||||
| Total time | 1pcs: 22.5h, 8pcs: 5.1h
|
||||
| Checkpoint for Fine tuning | 602.33M (.ckpt file)
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
|
@ -1,6 +1,4 @@
|
|||
![](https://www.mindspore.cn/static/img/logo.a3e472c9.png)
|
||||
|
||||
<!-- TOC -->
|
||||
# Contents
|
||||
|
||||
- [MASS: Masked Sequence to Sequence Pre-training for Language Generation Description](#googlenet-description)
|
||||
- [Model architecture](#model-architecture)
|
||||
|
@ -35,12 +33,9 @@
|
|||
- [others](#others)
|
||||
- [ModelZoo Homepage](#modelzoo-homepage)
|
||||
|
||||
<!-- /TOC -->
|
||||
|
||||
|
||||
# MASS: Masked Sequence to Sequence Pre-training for Language Generation Description
|
||||
|
||||
[MASS: Masked Sequence to Sequence Pre-training for Language Generation](https://www.microsoft.com/en-us/research/uploads/prod/2019/06/MASS-paper-updated-002.pdf) was released by Microsoft in June 2019.
|
||||
[MASS: Masked Sequence to Sequence Pre-training for Language Generation](https://www.microsoft.com/en-us/research/uploads/prod/2019/06/MASS-paper-updated-002.pdf) was released by Microsoft in June 2019.
|
||||
|
||||
BERT (Devlin et al., 2018) has achieved SOTA in natural language understanding by pre-training the encoder part of Transformer (Vaswani et al., 2017) with masked rich-resource text. Likewise, GPT (Radford et al., 2018) pre-trains the decoder part of Transformer with masked (encoder inputs are masked) rich-resource text. Both build a robust language model by pre-training with masked rich-resource text.
|
||||
|
||||
|
@ -50,7 +45,6 @@ Inspired by BERT, GPT and other language models, MicroSoft addressed [MASS: Mask
|
|||
|
||||
[Paper](https://www.microsoft.com/en-us/research/uploads/prod/2019/06/MASS-paper-updated-002.pdf): Song, Kaitao, Xu Tan, Tao Qin, Jianfeng Lu and Tie-Yan Liu. “MASS: Masked Sequence to Sequence Pre-training for Language Generation.” ICML (2019).
|
||||
|
||||
|
||||
# Model architecture
|
||||
|
||||
The overall network architecture of MASS is shown below, which is Transformer(Vaswani et al., 2017):
|
||||
|
@ -59,22 +53,21 @@ MASS is consisted of 6-layer encoder and 6-layer decoder with 1024 embedding/hid
|
|||
|
||||
# Dataset
|
||||
|
||||
Dataset used:
|
||||
Dataset used:
|
||||
|
||||
- monolingual English data from News Crawl dataset(WMT 2019) for pre-training.
|
||||
- Gigaword Corpus(Graff et al., 2003) for Text Summarization.
|
||||
- Cornell movie dialog corpus(DanescuNiculescu-Mizil & Lee, 2011).
|
||||
|
||||
Details about these datasets can be found in [MASS: Masked Sequence to Sequence Pre-training for Language Generation](https://www.microsoft.com/en-us/research/uploads/prod/2019/06/MASS-paper-updated-002.pdf).
|
||||
|
||||
|
||||
# Features
|
||||
|
||||
MASS is designed to jointly pre-train the encoder and decoder for language generation tasks.
|
||||
MASS is designed to jointly pre-train the encoder and decoder for language generation tasks.
|
||||
First, within a sequence-to-sequence framework, MASS predicts only the masked tokens, which forces the encoder to understand the meaning of the unmasked tokens and encourages the decoder to extract useful information from the encoder.
|
||||
Second, by predicting consecutive tokens on the decoder side, the decoder builds better language modeling ability than it would by predicting only discrete tokens.
|
||||
Third, by further masking the decoder input tokens that are not masked in the encoder, the decoder is encouraged to extract more useful information from the encoder side rather than relying on the rich information in the preceding tokens.
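The masking scheme described in these three points can be sketched as follows. This is a minimal illustration, not the repository's implementation: the `[MASK]` symbol and the helper name `mass_mask` are assumptions made for the example.

```python
MASK = "[MASK]"  # hypothetical mask symbol, not taken from the repository


def mass_mask(tokens, start, length):
    """Build one MASS-style training example from a list of tokens.

    The encoder input has a contiguous span replaced by MASK.
    The decoder input keeps only the span tokens, shifted right by one
    position, and masks everything the encoder could already see.
    The target is the masked span itself.
    """
    end = start + length
    encoder_input = [MASK if start <= i < end else tok
                     for i, tok in enumerate(tokens)]
    decoder_input = [MASK] * len(tokens)
    # Position i in the span is predicted from the previous span token.
    for i in range(start + 1, end):
        decoder_input[i] = tokens[i - 1]
    target = tokens[start:end]
    return encoder_input, decoder_input, target
```

For an eight-token sentence with `start=2` and `length=4`, the encoder sees the first two and last two tokens, while the decoder sees only the first three tokens of the masked span.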
|
||||
|
||||
|
||||
# Script description
|
||||
|
||||
MASS script and code structure are as follows:
|
||||
|
@ -86,7 +79,7 @@ MASS script and code structure are as follow:
|
|||
│ ├──config.py // Configuration instance definition.
|
||||
│ ├──config.json // Configuration file.
|
||||
├── src
|
||||
│ ├──dataset
|
||||
│ ├──dataset
|
||||
│ ├──bi_data_loader.py // Dataset loader for fine-tune or inferring.
|
||||
│ ├──mono_data_loader.py // Dataset loader for pre-training.
|
||||
│ ├──language_model
|
||||
|
@ -125,7 +118,7 @@ MASS script and code structure are as follow:
|
|||
│ ├──run_gpu.sh // GPU train & evaluate model script.
|
||||
│ ├──learn_subword.sh // Learn BPE codes.
|
||||
│ ├──stop_training.sh // Stop training.
|
||||
├── requirements.txt // Requirements of third party package.
|
||||
├── requirements.txt // Requirements of third party package.
|
||||
├── train.py // Train API entry.
|
||||
├── eval.py // Infer API entry.
|
||||
├── tokenize_corpus.py // Corpus tokenization.
|
||||
|
@ -137,40 +130,40 @@ MASS script and code structure are as follow:
|
|||
|
||||
```
|
||||
|
||||
|
||||
## Data Preparation
|
||||
|
||||
The data preparation of a natural language processing task contains data cleaning, tokenization, encoding and vocabulary generation steps.
|
||||
|
||||
In our experiments, using [Byte Pair Encoding(BPE)](https://arxiv.org/abs/1508.07909) reduced the size of the vocabulary and effectively relieved the influence of out-of-vocabulary (OOV) words.
|
||||
|
||||
A vocabulary can be created using `src/utils/dictionary.py` from the text dictionary learned by BPE.
|
||||
A vocabulary can be created using `src/utils/dictionary.py` from the text dictionary learned by BPE.
|
||||
For more detail about BPE, please refer to [Subword-nmt lib](https://www.cnpython.com/pypi/subword-nmt) or [paper](https://arxiv.org/abs/1508.07909).
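To make the idea of BPE concrete, here is a small sketch of a single merge step: count adjacent symbol pairs, then merge the most frequent pair everywhere. It is a simplified illustration and does not reproduce `subword-nmt`; the function names and the toy vocabulary format (tuples of symbols mapped to word counts) are assumptions.

```python
from collections import Counter


def most_frequent_pair(vocab):
    """Return the adjacent symbol pair with the highest total frequency.

    `vocab` maps a word, stored as a tuple of symbols, to its count.
    """
    pairs = Counter()
    for symbols, freq in vocab.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return max(pairs, key=pairs.get)


def merge_pair(pair, vocab):
    """Replace every occurrence of `pair` with one merged symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged
```

Repeating these two steps a fixed number of times yields the list of merge operations, which is what a BPE codes file records.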
|
||||
|
||||
In our experiments, vocabulary was learned based on 1.9M sentences from News Crawl Dataset, size of vocabulary is 45755.
|
||||
|
||||
Here, we have a brief introduction of data preparation scripts.
|
||||
|
||||
|
||||
### Tokenization
|
||||
|
||||
`tokenize_corpus.py` tokenizes a corpus whose text files are in `.txt` format.
|
||||
|
||||
Major parameters in `tokenize_corpus.py`:
|
||||
|
||||
```bash
|
||||
--corpus_folder: Corpus folder path, if multi-folders are provided, use ',' split folders.
|
||||
--output_folder: Output folder path.
|
||||
--corpus_folder: Corpus folder path, if multi-folders are provided, use ',' split folders.
|
||||
--output_folder: Output folder path.
|
||||
--tokenizer: Tokenizer to be used, nltk or jieba, if nltk is not installed fully, use jieba instead.
|
||||
--pool_size: Processes pool size.
|
||||
```
|
||||
|
||||
Sample code:
|
||||
|
||||
```bash
|
||||
python tokenize_corpus.py --corpus_folder /{path}/corpus --output_folder /{path}/tokenized_corpus --tokenizer {nltk|jieba} --pool_size 16
|
||||
```
|
||||
|
||||
|
||||
### Byte Pair Encoding
|
||||
|
||||
After tokenization, BPE is applied to tokenized corpus with provided `all.bpe.codes`.
|
||||
|
||||
Apply BPE script can be found in `apply_bpe_encoding.py`.
|
||||
|
@ -188,6 +181,7 @@ Major parameters in `apply_bpe_encoding.py`:
|
|||
```
|
||||
|
||||
Sample code:
|
||||
|
||||
```bash
|
||||
python apply_bpe_encoding.py --codes /{path}/all.bpe.codes \
|
||||
--src_folder /{path}/tokenized_corpus \
|
||||
|
@ -197,9 +191,10 @@ python tokenize_corpus.py --codes /{path}/all.bpe.codes \
|
|||
--processes 32
|
||||
```
|
||||
|
||||
|
||||
### Build Vocabulary
|
||||
|
||||
Suppose you want to create a new vocabulary. There are two options, plus an optional third step:
|
||||
|
||||
1. Learn BPE codes from scratch, and create the vocabulary from the multiple vocabulary files produced by `subword-nmt`.
|
||||
2. Create it from an existing vocabulary file whose lines are in the format `word frequency`.
|
||||
3. *Optional*: create a small vocabulary based on `vocab/all_en.dict.bin` with the `shink` method from `src/utils/dictionary.py`.
|
||||
|
@ -213,6 +208,7 @@ Major interface of `src/utils/dictionary.py` are as follow:
|
|||
4. `persistence(self, path)`: Save vocabulary object to binary file.
|
||||
|
||||
Sample code:
|
||||
|
||||
```python
|
||||
from src.utils import Dictionary
|
||||
|
||||
|
@ -228,11 +224,12 @@ print([vocabulary.index[s] for s in sentence])
|
|||
|
||||
For more detail, please refer to the source file.
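For readers who only need the general idea of building a vocabulary from a `word frequency` file, the following sketch shows one way to do it in plain Python. It does not use the `Dictionary` class; the function name `build_vocab` and the special tokens `<pad>` and `<unk>` are assumptions for illustration.

```python
def build_vocab(lines, specials=("<pad>", "<unk>")):
    """Map words to indices from lines in the form `word frequency`."""
    counts = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        word, freq = parts
        counts[word] = counts.get(word, 0) + int(freq)
    # Most frequent words first; ties are broken alphabetically
    # so the result is deterministic.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: idx for idx, word in enumerate(list(specials) + ordered)}
```

Sorting by frequency keeps the most common words at the lowest indices, which makes it simple to truncate the vocabulary to a fixed size.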
|
||||
|
||||
|
||||
### Generate Dataset
|
||||
|
||||
As mentioned above, three corpora are used with the MASS model, and dataset generation scripts are provided for each of them.
|
||||
|
||||
#### News Crawl Corpus
|
||||
|
||||
Script can be found in `news_crawl.py`.
|
||||
|
||||
Major parameters in `news_crawl.py`:
|
||||
|
@ -261,8 +258,8 @@ python news_crawl.py --src_folder /{path}/news_crawl \
|
|||
--processes 32
|
||||
```
|
||||
|
||||
|
||||
#### Gigaword Corpus
|
||||
|
||||
Script can be found in `gigaword.py`.
|
||||
|
||||
Major parameters in `gigaword.py`:
|
||||
|
@ -292,8 +289,8 @@ python gigaword.py --train_src /{path}/gigaword/train_src.txt \
|
|||
--max_len 64
|
||||
```
|
||||
|
||||
|
||||
#### Cornell Movie Dialog Corpus
|
||||
|
||||
Script can be found in `cornell_dialog.py`.
|
||||
|
||||
Major parameters in `cornell_dialog.py`:
|
||||
|
@ -320,32 +317,37 @@ python cornell_dialog.py --src_folder /{path}/cornell_dialog \
|
|||
--max_len 64
|
||||
```
|
||||
|
||||
|
||||
## Configuration
|
||||
The JSON file under `config/` is the template configuration file.
|
||||
Almost all required options and arguments can be set there, including the training platform, dataset and model configuration, and optimizer arguments. Optional features such as loss scale and checkpointing are enabled by setting the corresponding options.
|
||||
|
||||
The JSON file under `config/` is the template configuration file.
|
||||
Almost all required options and arguments can be set there, including the training platform, dataset and model configuration, and optimizer arguments. Optional features such as loss scale and checkpointing are enabled by setting the corresponding options.
|
||||
For more detailed information about the attributes, refer to the file `config/config.py`.
|
||||
|
||||
## Training & Evaluation process
|
||||
|
||||
To train a model, the shell script `run_ascend.sh` or `run_gpu.sh` is all you need. These scripts set the environment variables and execute the training script `train.py` under `mass`.
|
||||
You can start training on a single device or on multiple devices by assigning the options and running the command in bash:
|
||||
|
||||
Ascend:
|
||||
|
||||
```ascend
|
||||
sh run_ascend.sh [--options]
|
||||
```
|
||||
|
||||
GPU:
|
||||
|
||||
```gpu
|
||||
sh run_gpu.sh [--options]
|
||||
```
|
||||
|
||||
The usage of `run_ascend.sh` is shown below:
|
||||
|
||||
```text
|
||||
Usage: run_ascend.sh [-h, --help] [-t, --task <CHAR>] [-n, --device_num <N>]
|
||||
[-i, --device_id <N>] [-j, --hccl_json <FILE>]
|
||||
[-c, --config <FILE>] [-o, --output <FILE>]
|
||||
[-v, --vocab <FILE>]
|
||||
|
||||
|
||||
options:
|
||||
-h, --help show usage
|
||||
-t, --task select task: CHAR, 't' for train and 'i' for inference".
|
||||
|
@ -357,14 +359,16 @@ options:
|
|||
-v, --vocab set the vocabulary.
|
||||
-m, --metric set the metric.
|
||||
```
|
||||
|
||||
Note: be sure to assign the hccl_json file when running distributed training.
|
||||
|
||||
The usage of `run_gpu.sh` is shown below:
|
||||
|
||||
```text
|
||||
Usage: run_gpu.sh [-h, --help] [-t, --task <CHAR>] [-n, --device_num <N>]
|
||||
[-i, --device_id <N>] [-c, --config <FILE>]
|
||||
[-o, --output <FILE>] [-v, --vocab <FILE>]
|
||||
|
||||
|
||||
options:
|
||||
-h, --help show usage
|
||||
-t, --task select task: CHAR, 't' for train and 'i' for inference".
|
||||
|
@ -378,28 +382,32 @@ options:
|
|||
|
||||
The following command shows an example of training with 2 devices.
|
||||
Ascend:
|
||||
|
||||
```ascend
|
||||
sh run_ascend.sh --task t --device_num 2 --hccl_json /{path}/rank_table.json --config /{path}/config.json
|
||||
```
|
||||
|
||||
Note: discontinuous device IDs are not currently supported in `run_ascend.sh`; device IDs in `rank_table.json` must start from 0.
|
||||
|
||||
GPU:
|
||||
|
||||
```gpu
|
||||
sh run_gpu.sh --task t --device_num 2 --config /{path}/config.json
|
||||
```
|
||||
|
||||
If you use a single chip, it would look like this:
|
||||
Ascend:
|
||||
|
||||
```ascend
|
||||
sh run_ascend.sh --task t --device_num 1 --device_id 0 --config /{path}/config.json
|
||||
```
|
||||
|
||||
GPU:
|
||||
|
||||
```gpu
|
||||
sh run_gpu.sh --task t --device_num 1 --device_id 0 --config /{path}/config.json
|
||||
```
|
||||
|
||||
|
||||
## Weights average
|
||||
|
||||
```python
|
||||
|
@ -407,6 +415,7 @@ python weights_average.py --input_files your_checkpoint_list --output_file model
|
|||
```
|
||||
|
||||
`input_files` is a list of your checkpoint files. To use model.npz as the weights, add its path in config.json at "existed_ckpt".
|
||||
|
||||
```json
|
||||
{
|
||||
...
|
||||
|
@ -419,7 +428,6 @@ The input_files is a list of you checkpoints file. To use model.npz as the weigh
|
|||
}
|
||||
```
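As a rough illustration of what weight averaging does, the sketch below averages parameters element-wise across checkpoints. It assumes each checkpoint has already been loaded into a dict that maps a parameter name to a flat list of floats; real checkpoints hold multi-dimensional arrays. The function name `average_weights` is hypothetical and this is not the code in `weights_average.py`.

```python
def average_weights(checkpoints):
    """Average parameters element-wise across checkpoints.

    `checkpoints` is a list of dicts mapping parameter name -> list of
    floats. All dicts are assumed to share the same names and lengths.
    """
    names = checkpoints[0].keys()
    count = len(checkpoints)
    return {
        name: [sum(values) / count
               for values in zip(*(ckpt[name] for ckpt in checkpoints))]
        for name in names
    }
```

Averaging the last few checkpoints of a run is a common way to smooth out noise from individual training steps.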
|
||||
|
||||
|
||||
## Learning rate scheduler
|
||||
|
||||
Two learning rate schedulers are provided in our model:
|
||||
|
@ -430,6 +438,7 @@ Two learning rate scheduler are provided in our model:
|
|||
The LR scheduler can be configured in `config/config.json`.
|
||||
|
||||
For the polynomial decay scheduler, the config could look like:
|
||||
|
||||
```json
|
||||
{
|
||||
...
|
||||
|
@ -447,6 +456,7 @@ For Polynomial decay scheduler, config could be like:
|
|||
```
|
||||
|
||||
For the inverse square root scheduler, the config could look like:
|
||||
|
||||
```json
|
||||
{
|
||||
...
|
||||
|
@ -464,22 +474,21 @@ For Inverse square root scheduler, config could be like:
|
|||
|
||||
More details about the LR schedulers can be found in `src/utils/lr_scheduler.py`.
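The two schedules named in this section can be written in a few lines each. The sketch below uses the textbook formulas; the function and parameter names are assumptions and may not match `src/utils/lr_scheduler.py` or the keys in `config.json`.

```python
def polynomial_decay_lr(step, base_lr, end_lr, decay_steps, power=1.0):
    """Decay from base_lr to end_lr over decay_steps, then hold end_lr."""
    step = min(step, decay_steps)
    return (base_lr - end_lr) * (1 - step / decay_steps) ** power + end_lr


def inverse_sqrt_lr(step, base_lr, warmup_steps):
    """Linear warmup to base_lr, then decay proportional to 1/sqrt(step).

    Assumes warmup_steps >= 1.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (warmup_steps / step) ** 0.5
```

Polynomial decay reaches a fixed final rate after a known number of steps, while the inverse square root schedule keeps decaying slowly and so does not need the total step count in advance.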
|
||||
|
||||
|
||||
# Model description
|
||||
|
||||
The MASS network is implemented by Transformer, which has multi-encoder layers and multi-decoder layers.
|
||||
For pre-training, we use the Adam optimizer and loss-scale to get the pre-trained model.
|
||||
During fine-tuning, we fine-tune this pre-trained model on different datasets according to the task.
|
||||
During testing, we use the fine-tuned model to predict the result, and adopt a beam search algorithm to
|
||||
The MASS network is implemented by Transformer, which has multi-encoder layers and multi-decoder layers.
|
||||
For pre-training, we use the Adam optimizer and loss-scale to get the pre-trained model.
|
||||
During fine-tuning, we fine-tune this pre-trained model on different datasets according to the task.
|
||||
During testing, we use the fine-tuned model to predict the result, and adopt a beam search algorithm to
|
||||
get the most probable prediction results.
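Beam search keeps the best few partial sequences at each step instead of only the single best one. The sketch below is a generic, framework-free version for illustration; `step_fn`, which returns log-probabilities for the next token given a prefix, is a hypothetical stand-in for the model, and none of this is taken from the repository's decoder.

```python
def beam_search(step_fn, start, end, beam_size=2, max_len=10):
    """Return the highest-scoring sequence found by beam search.

    step_fn(prefix) -> {token: log_probability} for the next token.
    """
    beams = [([start], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for token, logp in step_fn(seq).items():
                candidates.append((seq + [token], score + logp))
        # Keep only the beam_size best extensions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_size]:
            (finished if seq[-1] == end else beams).append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # keep unfinished hypotheses if max_len was hit
    return max(finished, key=lambda c: c[1])[0]
```

With `beam_size=1` this reduces to greedy decoding; a wider beam can recover sequences whose first token is not the locally most likely one.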
|
||||
|
||||
|
||||
## Performance
|
||||
|
||||
### Results
|
||||
|
||||
#### Fine-Tuning on Text Summarization
|
||||
The comparisons between MASS and two other pre-training methods in terms of ROUGE score on the text summarization task
|
||||
|
||||
The comparisons between MASS and two other pre-training methods in terms of ROUGE score on the text summarization task
|
||||
with 3.8M training data are as follows:
|
||||
|
||||
| Method | RG-1(F) | RG-2(F) | RG-L(F) |
|
||||
|
@ -487,6 +496,7 @@ with 3.8M training data are as follows:
|
|||
| MASS | Ongoing | Ongoing | Ongoing |
|
||||
|
||||
#### Fine-Tuning on Conversational Response Generation
|
||||
|
||||
The comparisons between MASS and other baseline methods in terms of PPL on Cornell Movie Dialog corpus are as follows:
|
||||
|
||||
| Method | Data = 10K | Data = 110K |
|
||||
|
@ -515,7 +525,6 @@ The comparisons between MASS and other baseline methods in terms of PPL on Corne
|
|||
| Model for inference | ---Mb, --, [A link]() |
|
||||
| Scripts | [A link]() |
|
||||
|
||||
|
||||
#### Inference Performance
|
||||
|
||||
| Parameters | Masked Sequence to Sequence Pre-training for Language Generation |
|
||||
|
@ -532,18 +541,17 @@ The comparisons between MASS and other baseline methods in terms of PPL on Corne
|
|||
| Total time | --/-- |
|
||||
| Model for inference | ---Mb, --, [A link]() |
|
||||
|
||||
|
||||
# Environment Requirements
|
||||
|
||||
## Platform
|
||||
|
||||
- Hardware(Ascend)
|
||||
- Prepare hardware environment with Ascend processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you could get the resources for trial.
|
||||
- Prepare hardware environment with Ascend processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you could get the resources for trial.
|
||||
- Framework
|
||||
- [MindSpore](http://10.90.67.50/mindspore/archive/20200506/OpenSource/me_vm_x86/)
|
||||
- [MindSpore](https://www.mindspore.cn/install/en)
|
||||
- For more information, please check the resources below:
|
||||
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/zh-CN/master/index.html)
|
||||
- [MindSpore API](https://www.mindspore.cn/api/zh-CN/master/index.html)
|
||||
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/zh-CN/master/index.html)
|
||||
- [MindSpore API](https://www.mindspore.cn/api/zh-CN/master/index.html)
|
||||
|
||||
## Requirements
|
||||
|
||||
|
@ -554,19 +562,22 @@ subword-nmt
|
|||
rouge
|
||||
```
|
||||
|
||||
https://www.mindspore.cn/tutorial/zh-CN/master/advanced_use/network_migration.html
|
||||
|
||||
<https://www.mindspore.cn/tutorial/zh-CN/master/advanced_use/network_migration.html>
|
||||
|
||||
# Get started
|
||||
|
||||
MASS pre-trains a sequence-to-sequence model by predicting the masked fragments in an input sequence. After this, downstream tasks such as text summarization and conversational response generation are candidates for fine-tuning the model and for inference.
|
||||
Here we provide a practical example to demonstrate the basic usage of MASS for pre-training, fine-tuning a model, and the inference process. The overall process is as follows:
|
||||
|
||||
1. Download and process the dataset.
|
||||
2. Modify the `config.json` to config the network.
|
||||
3. Run a task for pre-training and fine-tuning.
|
||||
4. Perform inference and validation.
|
||||
|
||||
## Pre-training
|
||||
|
||||
To pre-train a model, first configure the options in `config.json`:
|
||||
|
||||
- Assign the `pre_train_dataset` under `dataset_config` node to the dataset path.
|
||||
- Choose the optimizer('momentum/adam/lamb' is available).
|
||||
- Assign the 'ckpt_prefix' and 'ckpt_path' under `checkpoint_path` to save the model files.
|
||||
|
@ -588,7 +599,9 @@ sh run_gpu.sh -t t -n 1 -i 1 -c /mass/config/config.json
|
|||
Get the log and output files under the path `./train_mass_*/`, and the model file under the path assigned in the `config/config.json` file.
|
||||
|
||||
## Fine-tuning
|
||||
|
||||
To fine-tune a model, first configure the options in `config.json`:
|
||||
|
||||
- Assign the `fine_tune_dataset` under `dataset_config` node to the dataset path.
|
||||
- Assign the `existed_ckpt` under `checkpoint_path` node to the existing model file generated by pre-training.
|
||||
- Choose the optimizer('momentum/adam/lamb' is available).
|
||||
|
@ -610,8 +623,10 @@ sh run_gpu.sh -t t -n 1 -i 1 -c config/config.json
|
|||
Get the log and output files under the path `./train_mass_*/`, and the model file under the path assigned in the `config/config.json` file.
|
||||
|
||||
## Inference
|
||||
|
||||
If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorial/zh-CN/master/advanced_use/network_migration.html).
|
||||
For inference, first configure the options in `config.json`:
|
||||
|
||||
- Assign the `test_dataset` under `dataset_config` node to the dataset path.
|
||||
- Assign the `existed_ckpt` under `checkpoint_path` node to the model file produced by fine-tuning.
|
||||
- Choose the optimizer('momentum/adam/lamb' is available).
|
||||
|
@ -632,12 +647,12 @@ sh run_gpu.sh -t i -n 1 -i 1 -c config/config.json -o {outputfile}
|
|||
|
||||
# Description of random situation
|
||||
|
||||
The MASS model contains dropout operations. To disable dropout, set the related dropout_rate to 0 in `config/config.json`.
|
||||
|
||||
The MASS model contains dropout operations. To disable dropout, set the related dropout_rate to 0 in `config/config.json`.
|
||||
|
||||
# others
|
||||
The model has been validated in the Ascend environment; it has not been validated on CPU or GPU.
|
||||
|
||||
The model has been validated in the Ascend environment; it has not been validated on CPU or GPU.
|
||||
|
||||
# ModelZoo Homepage
|
||||
|
||||
[Link](https://gitee.com/mindspore/mindspore/tree/master/mindspore/model_zoo)
|
||||
|
|
|
@ -88,7 +88,7 @@ We use about 91K face images as training dataset and 11K as evaluating dataset i
|
|||
- Hardware(Ascend)
|
||||
- Prepare hardware environment with Ascend processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
|
||||
- Framework
|
||||
- [MindSpore](http://10.90.67.50/mindspore/archive/20200506/OpenSource/me_vm_x86/)
|
||||
- [MindSpore](https://www.mindspore.cn/install/en)
|
||||
- For more information, please check the resources below:
|
||||
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/zh-CN/master/index.html)
|
||||
- [MindSpore API](https://www.mindspore.cn/api/zh-CN/master/index.html)
|
||||
|
@ -268,4 +268,4 @@ sh run_export.sh [BATCH_SIZE] [USE_DEVICE_ID] [PRETRAINED_BACKBONE]
|
|||
|
||||
# [ModelZoo Homepage](#contents)
|
||||
|
||||
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
|
||||
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
|
||||
|
|
|
@ -63,7 +63,7 @@ We use about 13K images as training dataset and 3K as evaluating dataset in this
|
|||
|
||||
If your dataset is too big to convert at one time, you can append data to an existing mindrecord in turn:
|
||||
|
||||
```
|
||||
```shell
|
||||
python data_to_mindrecord_train_append.py
|
||||
```
|
||||
|
||||
|
@ -72,7 +72,7 @@ We use about 13K images as training dataset and 3K as evaluating dataset in this
|
|||
- Hardware(Ascend)
|
||||
- Prepare hardware environment with Ascend processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
|
||||
- Framework
|
||||
- [MindSpore](http://10.90.67.50/mindspore/archive/20200506/OpenSource/me_vm_x86/)
|
||||
- [MindSpore](https://www.mindspore.cn/install/en)
|
||||
- For more information, please check the resources below:
|
||||
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/zh-CN/master/index.html)
|
||||
- [MindSpore API](https://www.mindspore.cn/api/zh-CN/master/index.html)
|
||||
|
@ -242,4 +242,4 @@ sh run_export.sh [BATCH_SIZE] [USE_DEVICE_ID] [PRETRAINED_BACKBONE]
|
|||
|
||||
# [ModelZoo Homepage](#contents)
|
||||
|
||||
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
|
||||
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
|
||||
|
|
|
@ -70,7 +70,7 @@ We use about 122K face images as training dataset and 2K as evaluating dataset i
|
|||
- Hardware(Ascend)
|
||||
- Prepare hardware environment with Ascend processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
|
||||
- Framework
|
||||
- [MindSpore](http://10.90.67.50/mindspore/archive/20200506/OpenSource/me_vm_x86/)
|
||||
- [MindSpore](https://www.mindspore.cn/install/en)
|
||||
- For more information, please check the resources below:
|
||||
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/zh-CN/master/index.html)
|
||||
- [MindSpore API](https://www.mindspore.cn/api/zh-CN/master/index.html)
|
||||
|
@ -234,4 +234,4 @@ sh run_export.sh [BATCH_SIZE] [USE_DEVICE_ID] [PRETRAINED_BACKBONE]
|
|||
|
||||
# [ModelZoo Homepage](#contents)
|
||||
|
||||
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
|
||||
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
|
||||
|
|
|
@ -58,7 +58,7 @@ The directory structure is as follows:
|
|||
- Hardware(Ascend)
|
||||
- Prepare hardware environment with Ascend processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
|
||||
- Framework
|
||||
- [MindSpore](http://10.90.67.50/mindspore/archive/20200506/OpenSource/me_vm_x86/)
|
||||
- [MindSpore](https://www.mindspore.cn/install/en)
|
||||
- For more information, please check the resources below:
|
||||
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/zh-CN/master/index.html)
|
||||
- [MindSpore API](https://www.mindspore.cn/api/zh-CN/master/index.html)
|
||||
|
@ -225,4 +225,4 @@ sh run_export.sh [BATCH_SIZE] [USE_DEVICE_ID] [PRETRAINED_BACKBONE]
|
|||
|
||||
# [ModelZoo Homepage](#contents)
|
||||
|
||||
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
|
||||
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
|
||||
|
|