Contents
- TextCNN Description
- Model Architecture
- Dataset
- Environment Requirements
- Quick Start
- Script Description
- Model Description
- ModelZoo Homepage
TextCNN Description
TextCNN is an algorithm that uses convolutional neural networks to classify text. It was proposed by Yoon Kim in the 2014 paper "Convolutional Neural Networks for Sentence Classification". It is widely used in text classification tasks such as sentiment analysis and has become a standard baseline for new text classification frameworks. Each module of TextCNN can complete a text classification task independently, which makes it convenient for distributed configuration and parallel execution. TextCNN is well suited to semantic analysis of short texts such as Weibo posts, news headlines, e-commerce reviews, and video bullet-screen comments.
Paper: Kim Y. Convolutional neural networks for sentence classification[J]. arXiv preprint arXiv:1408.5882, 2014.
Model Architecture
The network structure of TextCNN follows the paper "Convolutional Neural Networks for Sentence Classification". Take the sentence "I like this movie very much!" as an example. First, the sentence is tokenized into 7 words, and each word is mapped to a 5-dimensional vector by the embedding layer, yielding a 7x5 sentence matrix. Convolution kernels of different sizes ([3,4,5]*5, i.e., each kernel spans the full embedding width), with 2 kernels per size by default, are then applied to produce feature maps. A max-pooling operation reduces each feature map to a single value, and the pooled results are concatenated into a one-dimensional feature vector. Finally, a softmax layer classifies this vector into 2 categories, producing the positive/negative sentiment.
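To make the structure concrete, here is a minimal sketch of that forward pass in MindSpore. The vocabulary size and layer details are illustrative assumptions; the actual network lives in src/textcnn.py.

```python
import mindspore.nn as nn
import mindspore.ops as ops

class TextCNNSketch(nn.Cell):
    """Illustrative TextCNN: embedding -> parallel convs -> max-pool -> dense."""
    def __init__(self, vocab_size=20000, vec_length=40, num_classes=2, num_filters=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, vec_length)
        # One convolution branch per kernel height in (3, 4, 5); every kernel
        # spans the full embedding width, so it slides over words only.
        self.convs = nn.CellList([
            nn.Conv2d(1, num_filters, (k, vec_length), pad_mode='valid')
            for k in (3, 4, 5)])
        self.relu = nn.ReLU()
        self.expand = ops.ExpandDims()
        self.reduce_max = ops.ReduceMax()
        self.concat = ops.Concat(axis=1)
        self.fc = nn.Dense(num_filters * 3, num_classes)

    def construct(self, ids):                 # ids: (batch, word_len)
        x = self.embedding(ids)               # (batch, word_len, vec_length)
        x = self.expand(x, 1)                 # add a channel dim for Conv2d
        pooled = []
        for conv in self.convs:
            fmap = self.relu(conv(x))         # (batch, filters, word_len-k+1, 1)
            pooled.append(self.reduce_max(fmap, (2, 3)))  # max over time
        return self.fc(self.concat(pooled))   # logits over num_classes
```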
Dataset
Note that you can run the scripts with the dataset mentioned in the original paper or a dataset widely used in the relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the dataset below.
Dataset used: Movie Review Data
- Dataset size: 1.18M, 5331 positive and 5331 negative processed sentences/snippets
- Train: 1.06M, 9596 sentences/snippets
- Test: 0.12M, 1066 sentences/snippets
- Data format: text
- Please click here to download the data and convert the files to UTF-8 encoding (a conversion sketch follows this list). Then put them into the `data` directory.
- Note: Data will be processed in src/dataset.py.
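A minimal sketch of the encoding-conversion step, assuming the raw Movie Review files are named rt-polarity.pos/rt-polarity.neg and are latin-1 encoded (both are assumptions about the downloaded archive):

```python
# Convert the raw MR files to UTF-8 in place; the file names and the latin-1
# source encoding are assumptions about the downloaded archive.
from pathlib import Path

for name in ("rt-polarity.pos", "rt-polarity.neg"):
    path = Path("data") / name
    text = path.read_text(encoding="latin-1")   # raw files are not valid UTF-8
    path.write_text(text, encoding="utf-8")
```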
Environment Requirements
- Hardware (Ascend)
  - Prepare hardware environment with Ascend processor.
- Framework
  - MindSpore
- For more information, please check the resources below:
  - MindSpore Tutorials
  - MindSpore Python API
Quick Start
After installing MindSpore via the official website, you can start training and evaluation as follows:
- running on Ascend

```shell
# run training example
python train.py > train.log 2>&1 &
# OR
sh scripts/run_train.sh

# run evaluation example
python eval.py > eval.log 2>&1 &
# OR
sh scripts/run_eval.sh ckpt_path
```
If you want to run on ModelArts, please check the official documentation of ModelArts; you can start training and evaluation as follows:
```shell
# run distributed training on ModelArts example
# (1) Perform a or b.
#       a. Set "enable_modelarts=True" in the yaml file.
#          Set other parameters you need in the yaml file.
#       b. Add "enable_modelarts=True" on the website UI interface.
#          Add other parameters on the website UI interface.
# (2) Set the code directory to "/path/textcnn" on the website UI interface.
# (3) Set the startup file to "train.py" on the website UI interface.
# (4) Set the "Dataset path", "Output file path", and "Job log path" to your paths on the website UI interface.
# (5) Create your job.

# run evaluation on ModelArts example
# (1) Copy or upload your trained model to an S3 bucket.
# (2) Perform a or b.
#       a. Set "checkpoint_file_path='/cache/checkpoint_path/model.ckpt'" in the yaml file.
#          Set "checkpoint_url=/The path of checkpoint in S3/" in the yaml file.
#       b. Add "checkpoint_file_path='/cache/checkpoint_path/model.ckpt'" on the website UI interface.
#          Add "checkpoint_url=/The path of checkpoint in S3/" on the website UI interface.
# (3) Set the code directory to "/path/textcnn" on the website UI interface.
# (4) Set the startup file to "eval.py" on the website UI interface.
# (5) Set the "Dataset path", "Output file path", and "Job log path" to your paths on the website UI interface.
# (6) Create your job.
```
Script Description
Script and Sample Code
```text
├── model_zoo
    ├── README.md                    // descriptions about all the models
    ├── textcnn
        ├── README.md                // descriptions about textcnn
        ├── scripts
        │   ├── run_train.sh         // shell script for distributed training on Ascend
        │   ├── run_eval.sh          // shell script for evaluation on Ascend
        ├── src
        │   ├── dataset.py           // dataset processing
        │   ├── textcnn.py           // textcnn architecture
        ├── model_utils
        │   ├── device_adapter.py    // device adapter
        │   ├── local_adapter.py     // local adapter
        │   ├── moxing_adapter.py    // moxing adapter
        │   ├── config.py            // parameter parsing
        ├── mr_config.yaml           // parameter configuration (MR dataset)
        ├── sst2_config.yaml         // parameter configuration (SST-2 dataset)
        ├── subj_config.yaml         // parameter configuration (Subj dataset)
        ├── train.py                 // training script
        ├── eval.py                  // evaluation script
        ├── export.py                // export checkpoint to other format files
```
Script Parameters
Parameters for both training and evaluation can be set in config.py.

- config for movie review dataset

```python
'pre_trained': 'False'                      # whether to train based on a pre-trained model
'num_classes': 2                            # number of classes in the dataset
'batch_size': 64                            # training batch size
'epoch_size': 4                             # total training epochs
'weight_decay': 3e-5                        # weight decay value
'data_path': './data/'                      # absolute full path to the train and evaluation datasets
'device_target': 'Ascend'                   # device running the program
'device_id': 0                              # device ID used to train or evaluate the dataset; ignore it when you use run_train.sh for distributed training
'keep_checkpoint_max': 1                    # only keep the last keep_checkpoint_max checkpoints
'checkpoint_path': './train_textcnn.ckpt'   # absolute full path of the checkpoint file to save
'word_len': 51                              # sentence length in words
'vec_length': 40                            # length of the word embedding vector
'base_lr': 1e-3                             # base learning rate
```

For more configuration details, please refer to the script config.py.
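As a rough illustration of how such a configuration could be consumed, here is a minimal, hypothetical loader; the repo's model_utils/config.py does the real parsing and also merges command-line arguments.

```python
import yaml

def load_config(yaml_path, **overrides):
    """Load key/value pairs from a config yaml and apply manual overrides."""
    with open(yaml_path, "r", encoding="utf-8") as f:
        cfg = yaml.safe_load(f)   # e.g. {'batch_size': 64, 'epoch_size': 4, ...}
    cfg.update(overrides)         # overrides win, mimicking CLI flags
    return cfg

# Usage: cfg = load_config("mr_config.yaml", device_id=1)
```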
Training Process
- running on Ascend

```shell
python train.py > train.log 2>&1 &
# OR
sh scripts/run_train.sh
```

The python command above runs in the background; you can view the results through the file `train.log`. After training, you'll get some checkpoint files in the `ckpt` directory. The loss values will look as follows:

```shell
# grep "loss is " train.log
epoch: 1 step 149, loss is 0.6194226145744324
epoch: 2 step 149, loss is 0.38729554414749146
...
```

The model checkpoint will be saved in the `ckpt` directory.
Evaluation Process
- evaluation on the Movie Review dataset when running on Ascend

Before running the command below, check the checkpoint path used for evaluation. Please set the checkpoint path to an absolute full path, e.g., "username/textcnn/ckpt/train_textcnn.ckpt".

```shell
python eval.py --checkpoint_path=ckpt_path > eval.log 2>&1 &
# OR
sh scripts/run_eval.sh ckpt_path
```

The above python command runs in the background. You can view the results through the file "eval.log". The accuracy on the test dataset will be as follows:

```shell
# grep "accuracy: " eval.log
accuracy: {'acc': 0.7971428571428572}
```
Export MindIR
```shell
python export.py --ckpt_file [CKPT_PATH] --file_name [FILE_NAME] --file_format [FILE_FORMAT]
```

The ckpt_file parameter is required, and FILE_FORMAT should be in ["AIR", "MINDIR"].
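For orientation, here is a minimal sketch of what the export step does. The TextCNN constructor arguments mirror the configuration above but are assumptions; check src/textcnn.py for the real signature.

```python
import numpy as np
from mindspore import Tensor
from mindspore.train.serialization import export, load_checkpoint, load_param_into_net

from src.textcnn import TextCNN  # model definition from this repo

# Constructor arguments are assumptions based on the configuration above;
# vocab_len depends on the dataset's vocabulary.
net = TextCNN(vocab_len=20000, word_len=51, num_classes=2, vec_length=40)
load_param_into_net(net, load_checkpoint("train_textcnn.ckpt"))

# A dummy input fixes the exported graph's input shape: (batch_size, word_len).
fake_input = Tensor(np.zeros((64, 51), np.int32))
export(net, fake_input, file_name="textcnn", file_format="MINDIR")
```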
Inference Process
Usage
Before performing inference, the MindIR file must be exported by export.py. Input files must be in bin format.

```shell
# Ascend 310 inference
bash run_infer_310.sh [MINDIR_PATH] [DATASET_PATH] [DATASET_NAME] [NEED_PREPROCESS] [DEVICE_ID]
```

- DATASET_NAME must be chosen from ['MR', 'SUBJ', 'SST2'].
- NEED_PREPROCESS indicates whether the dataset needs preprocessing; its value is 'y' or 'n'. A sketch of the bin input format follows this list.
- DEVICE_ID is optional; the default value is 0.
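The bin inputs that preprocess.py produces (when NEED_PREPROCESS is 'y') are essentially raw token-id arrays written to disk. The file name and exact layout below are assumptions for illustration, not the script's actual output:

```python
import numpy as np

# One preprocessed batch of token ids, shape (batch_size, word_len) = (64, 51).
ids = np.zeros((64, 51), dtype=np.int32)  # replace with real tokenized data
ids.tofile("textcnn_bs64_0.bin")          # raw int32 bytes in native byte order
```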
result
The inference result is saved in the current path; you can find it in the acc.log file.

```shell
# grep "accuracy: " acc.log
accuracy: 0.7971428571428572
```
Model Description
Performance
TextCNN on Movie Review Dataset
| Parameters          | Ascend                                                       |
| ------------------- | ------------------------------------------------------------ |
| Model Version       | TextCNN                                                      |
| Resource            | Ascend 910; CPU 2.60GHz, 192 cores; Memory 755G; OS Euler2.8 |
| Uploaded Date       | 11/10/2020 (month/day/year)                                  |
| MindSpore Version   | 1.0.1                                                        |
| Dataset             | Movie Review Data                                            |
| Training Parameters | epoch=4, steps=149, batch_size=64                            |
| Optimizer           | Adam                                                         |
| Loss Function       | Softmax Cross Entropy                                        |
| Outputs             | probability                                                  |
| Loss                | 0.1724                                                       |
| Speed               | 1pc: 12.069 ms/step                                          |
| Total time          | 1pc: 13s                                                     |
| Scripts             | textcnn script                                               |
ModelZoo Homepage
Please check the official homepage.