TextRCNN

Contents

  • TextRCNN Description
  • Model Architecture
  • Dataset
  • Environment Requirements
  • Quick Start
  • Script Description
  • Performance
  • ModelZoo Homepage

TextRCNN Description

TextRCNN is a model for text classification proposed by the Chinese Academy of Sciences in 2015. It combines an RNN and a CNN: a bidirectional RNN first captures the contextual semantic and syntactic information of the input text, max pooling then automatically selects the most important features, and a fully connected layer finally performs the classification.

The TextCNN network consists of a convolutional layer and a pooling layer. In RCNN, the feature-extraction role of the convolutional layer is taken over by an RNN, so the overall structure consists of an RNN followed by a pooling layer, hence the name RCNN (recurrent convolutional neural network).

Paper: Siwei Lai, Liheng Xu, Kang Liu, Jun Zhao: Recurrent Convolutional Neural Networks for Text Classification. AAAI 2015: 2267-2273

Model Architecture

Specifically, TextRCNN is composed of three parts: a recurrent structure layer, a max-pooling layer, and a fully connected layer. In the paper, the word-vector size is |e| = 50, the context-vector size is |c| = 50, the hidden-layer size is H = 100, the learning rate is α = 0.01, and |V| is the vocabulary size. The input is a sequence of words and the output is a vector of class scores.
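
A minimal sketch of this structure, assuming the MindSpore nn API; the class and helper names below are illustrative and do not reproduce the repository's src/textrcnn.py:

    import mindspore.nn as nn
    import mindspore.ops as ops

    class TextRCNNSketch(nn.Cell):
        """Recurrent structure layer -> max pooling over time -> fully connected layer."""
        def __init__(self, vocab_size, embed_size=300, hidden_size=100, num_classes=2):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, embed_size)
            # the bidirectional RNN captures left and right context for every word
            self.rnn = nn.LSTM(embed_size, hidden_size,
                               batch_first=True, bidirectional=True)
            # max pooling over the time axis keeps the strongest feature per channel
            self.max_over_time = ops.ReduceMax(keep_dims=False)
            self.fc = nn.Dense(2 * hidden_size, num_classes)

        def construct(self, x):
            emb = self.embedding(x)              # (batch, seq_len, embed_size)
            # initial RNN states default to zeros in recent MindSpore versions
            out, _ = self.rnn(emb)               # (batch, seq_len, 2 * hidden_size)
            pooled = self.max_over_time(out, 1)  # (batch, 2 * hidden_size)
            return self.fc(pooled)               # (batch, num_classes)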

Dataset

Dataset used: Sentence polarity dataset v1.0

  • Dataset size: 10662 movie comments in 2 classes; 9596 comments for the training set and 1066 comments for the test set (see the split sketch below).
  • Data format: text files. The processed data is in ./data/.
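
The 9596/1066 counts correspond to a 90/10 train/test split of the 10662 comments. A minimal sketch of such a split (the actual logic lives in data_helpers.py and may differ, e.g. in shuffling and file layout):

    import random

    def split_lines(lines, test_ratio=0.1, seed=0):
        """Shuffle and split into (train, test); 10662 lines -> 9596/1066."""
        lines = list(lines)
        random.Random(seed).shuffle(lines)
        n_test = int(len(lines) * test_ratio)  # int(10662 * 0.1) == 1066
        return lines[n_test:], lines[:n_test]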

Environment Requirements

Quick Start

  • Preparing the environment
  # download the pretrained GoogleNews-vectors-negative300.bin and put it into /tmp
  # you can download it from https://code.google.com/archive/p/word2vec/,
  # or from https://pan.baidu.com/s/1NC2ekA_bJ0uSL7BF3SjhIg, code: yk9a

  mkdir -p word2vec
  mv /tmp/GoogleNews-vectors-negative300.bin ./word2vec/
  • Preparing data
  # split the dataset with the following commands
  # (replace dataset_dir with the path to the source data)
  mkdir -p data/test && mkdir -p data/train
  python data_helpers.py --task dataset_split --data_dir dataset_dir

  • Running on Ascend
# run training
DEVICE_ID=7 python train.py
# or use the shell script to train in the background
bash scripts/run_train.sh

# run evaluation
DEVICE_ID=7 python eval.py --ckpt_path {checkpoint path}
# or use the shell script to evaluate in the background
bash scripts/run_eval.sh
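
The pretrained word2vec binary downloaded in the environment step is used to initialize the embedding table. A hedged sketch of one way to build such a table with gensim (data_helpers.py may do this differently; build_embedding_table is a hypothetical helper, not a repository function):

    import numpy as np
    from gensim.models import KeyedVectors

    def build_embedding_table(vocab, bin_path='./word2vec/GoogleNews-vectors-negative300.bin'):
        """Map each word index to its 300-d word2vec vector; random-init OOV words."""
        w2v = KeyedVectors.load_word2vec_format(bin_path, binary=True)
        table = np.random.uniform(-0.25, 0.25, (len(vocab), 300)).astype(np.float32)
        for word, idx in vocab.items():
            if word in w2v:
                table[idx] = w2v[word]
        return table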

Script Description

Script and Sample Code

├── model_zoo
    ├── README.md                     // descriptions of all the models
    ├── textrcnn
        ├── README.md                 // descriptions of TextRCNN
        ├── data_src
           ├── rt-polaritydata        // directory holding the source data
           ├── rt-polaritydata.README.1.0.txt    // readme file of the dataset
        ├── scripts
           ├── run_train.sh           // shell script for training on Ascend
           ├── run_eval.sh            // shell script for evaluation on Ascend
           ├── sample.txt             // example shell commands for running the two scripts above
        ├── src
           ├── dataset.py             // dataset creation
           ├── textrcnn.py            // TextRCNN architecture
           ├── config.py              // parameter configuration
        ├── train.py                  // training script
        ├── export.py                 // export script
        ├── eval.py                   // evaluation script
        ├── data_helpers.py           // dataset split script
        ├── sample.txt                // shell commands to train and evaluate the model without the scripts

Script Parameters

Parameters for both training and evaluation can be set in config.py.

  • config for TextRCNN, Sentence polarity dataset v1.0.

    'num_epochs': 10, # total training epochs
    'lstm_num_epochs': 15, # total training epochs when using lstm
    'batch_size': 64, # training batch size
    'cell': 'gru', # the RNN architecture, one of 'vanilla', 'gru' and 'lstm'
    'ckpt_folder_path': './ckpt', # the path to save the checkpoints
    'preprocess_path': './preprocess', # the directory to save the preprocessed data
    'preprocess': 'false', # whether to preprocess the data
    'data_path': './data/', # the path to store the split data
    'lr': 1e-3, # the training learning rate
    'lstm_lr_init': 2e-3, # initial learning rate when using lstm
    'lstm_lr_end': 5e-4, # final learning rate when using lstm
    'lstm_lr_max': 3e-3, # maximum learning rate when using lstm
    'lstm_lr_warm_up_epochs': 2, # number of warm-up epochs when using lstm
    'lstm_lr_adjust_epochs': 9, # the lr is adjusted during these epochs; afterwards it stays at lstm_lr_end
    'emb_path': './word2vec', # the directory holding the embedding file
    'embed_size': 300, # the dimension of the word embedding
    'save_checkpoint_steps': 149, # interval (in steps) for saving checkpoints
    'keep_checkpoint_max': 10 # maximum number of checkpoints to keep

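The lstm lr parameters describe a warm-up phase followed by a decay. A hedged sketch of one schedule consistent with the comments above, assuming linear warm-up to lstm_lr_max and cosine decay to lstm_lr_end (train.py may implement the exact curve differently):

    import math

    def lstm_lr(epoch, lr_init=2e-3, lr_max=3e-3, lr_end=5e-4,
                warmup_epochs=2, adjust_epochs=9):
        """Per-epoch learning rate: warm up, decay, then hold at lr_end."""
        if epoch < warmup_epochs:
            # linear warm-up from lr_init to lr_max
            return lr_init + (lr_max - lr_init) * epoch / warmup_epochs
        if epoch < adjust_epochs:
            # cosine decay from lr_max down to lr_end
            t = (epoch - warmup_epochs) / (adjust_epochs - warmup_epochs)
            return lr_end + (lr_max - lr_end) * 0.5 * (1.0 + math.cos(math.pi * t))
        # after adjust_epochs the learning rate stays at lr_end
        return lr_end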

Performance

Model        MindSpore + Ascend               TensorFlow + GPU
Resource     Ascend 910                       NV SMX2 V100-32G
Version      1.0.1                            1.4.0
Dataset      Sentence polarity dataset v1.0   Sentence polarity dataset v1.0
batch_size   64                               64
Accuracy     0.78                             0.78
Speed        35 ms/step                       77 ms/step

ModelZoo Homepage

Please check the official homepage.