TextRCNN
Contents
- TextRCNN Description
- Model Architecture
- Dataset
- Environment Requirements
- Quick Start
- Script Description
- Performance
- ModelZoo Homepage
TextRCNN Description
TextRCNN is a model for text classification proposed by the Chinese Academy of Sciences in 2015. It combines an RNN and a CNN: a bidirectional RNN first captures the contextual semantic and syntactic information of the input text, max pooling then automatically selects the most salient features, and a fully connected layer finally performs the classification.
The TextCNN network structure contains a convolutional layer and a pooling layer. In TextRCNN, the feature extraction done by the convolutional layer is replaced by an RNN, so the overall structure consists of an RNN followed by a pooling layer, hence the name RCNN.
Paper: Siwei Lai, Liheng Xu, Kang Liu, Jun Zhao: Recurrent Convolutional Neural Networks for Text Classification. AAAI 2015: 2267-2273
Model Architecture
Specifically, TextRCNN is mainly composed of three parts: a recurrent structure layer, a max-pooling layer, and a fully connected layer. In the paper, the word vector size is |e| = 50, the context vector size is |c| = 50, the hidden layer size is H = 100, the learning rate is α = 0.01, and |V| is the vocabulary size. The input is a sequence of words, and the output is a vector of class scores.
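This computation can be illustrated with a minimal NumPy sketch using the dimensions above. This is only an illustration: the weights and vocabulary size are random placeholders, and the actual MindSpore implementation lives in src/textrcnn.py.
# Minimal NumPy sketch of the TextRCNN forward pass; illustrative only,
# the real MindSpore model is in src/textrcnn.py.
import numpy as np

V, E, C, H, num_classes = 10000, 50, 50, 100, 2    # |V|, |e|, |c|, H as in the paper
rng = np.random.default_rng(0)
emb = rng.normal(scale=0.1, size=(V, E))           # word embeddings e(w_i)
W_l = rng.normal(scale=0.1, size=(C, C))           # left-context recurrence weights
W_sl = rng.normal(scale=0.1, size=(C, E))          # left-context input weights
W_r = rng.normal(scale=0.1, size=(C, C))           # right-context recurrence weights
W_sr = rng.normal(scale=0.1, size=(C, E))          # right-context input weights
W2 = rng.normal(scale=0.1, size=(H, C + E + C))    # projection to the latent semantic vector
W4 = rng.normal(scale=0.1, size=(num_classes, H))  # fully connected classifier

def forward(word_ids):
    x = emb[word_ids]                                        # (T, E)
    T = len(word_ids)
    cl, cr = np.zeros((T, C)), np.zeros((T, C))
    for t in range(1, T):                                    # left contexts, scanned forward
        cl[t] = np.tanh(W_l @ cl[t - 1] + W_sl @ x[t - 1])
    for t in range(T - 2, -1, -1):                           # right contexts, scanned backward
        cr[t] = np.tanh(W_r @ cr[t + 1] + W_sr @ x[t + 1])
    xi = np.concatenate([cl, x, cr], axis=1)                 # [c_l(w_i); e(w_i); c_r(w_i)]
    y2 = np.tanh(xi @ W2.T)                                  # latent semantic vectors, (T, H)
    y3 = y2.max(axis=0)                                      # element-wise max pooling over time
    return W4 @ y3                                           # class scores, shape (num_classes,)

print(forward(rng.integers(0, V, size=12)))                  # scores for a random 12-word input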
Dataset
Dataset used: Sentence polarity dataset v1.0
- Dataset size: 10662 movie reviews in 2 classes, 9596 reviews in the train set and 1066 reviews in the test set.
- Data format: text files. The processed data is stored in ./data/.
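For reference, the 9596/1066 figures correspond to a per-class 90%/10% split of the 5331 positive and 5331 negative reviews. A hedged sketch of how such a split could be produced follows; the repository's actual logic lives in data_helpers.py, and the output layout here is only an assumption.
# Hedged sketch of a per-class 90/10 split of the polarity data; the real
# split logic is in data_helpers.py and the ./data/ layout is an assumption.
import os, random

def split_polarity(data_dir, out_dir='./data', test_ratio=0.1, seed=0):
    for label in ('pos', 'neg'):
        src = os.path.join(data_dir, 'rt-polarity.' + label)   # file names from the dataset
        with open(src, encoding='latin-1') as f:               # the raw files are latin-1 encoded
            lines = [line.strip() for line in f if line.strip()]
        random.Random(seed).shuffle(lines)
        n_test = int(len(lines) * test_ratio)                  # 5331 * 0.1 -> 533 per class
        for subset, rows in (('test', lines[:n_test]), ('train', lines[n_test:])):
            os.makedirs(os.path.join(out_dir, subset), exist_ok=True)
            with open(os.path.join(out_dir, subset, label + '.txt'), 'w', encoding='utf-8') as out:
                out.write('\n'.join(rows) + '\n')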
Environment Requirements
- Hardware: Ascend
- Framework: MindSpore
- For more information, please check the resources below: MindSpore tutorials, MindSpore Python API.
Quick Start
- Preparing environment
# download the pretrained GoogleNews-vectors-negative300.bin and put it into /tmp
# you can download it from https://code.google.com/archive/p/word2vec/,
# or from https://pan.baidu.com/s/1NC2ekA_bJ0uSL7BF3SjhIg, code: yk9a
# (a loading sanity check is sketched after this list)
mv /tmp/GoogleNews-vectors-negative300.bin ./word2vec/
- Preparing data
# split the dataset with the following script.
mkdir -p data/test && mkdir -p data/train
python data_helpers.py --task dataset_split --data_dir dataset_dir
- Running on Ascend
# run training
DEVICE_ID=7 python train.py
# or you can use the shell script to train in background
bash scripts/run_train.sh
# run evaluation
DEVICE_ID=7 python eval.py --ckpt_path {checkpoint path}
# or you can use the shell script to evaluate in background
bash scripts/run_eval.sh
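After the environment-preparation step above, the pretrained embeddings can be sanity-checked, for example with gensim (an extra dependency assumed here; the repository's own preprocessing may load the file differently):
# Optional check that the pretrained word2vec binary loads correctly.
# gensim is assumed to be installed; it is not required by the repository itself.
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format(
    './word2vec/GoogleNews-vectors-negative300.bin', binary=True)
print(vectors.vector_size)    # 300, matching 'embed_size' in config.py
print(vectors['movie'][:5])   # first five dimensions of one embedding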
Script Description
Script and Sample Code
├── model_zoo
├── README.md // descriptions about all the models
├── textrcnn
├── README.md // descriptions about TextRCNN
├── data_src
│ ├──rt-polaritydata // directory to save the source data
│ ├──rt-polaritydata.README.1.0.txt // readme file of dataset
├── scripts
│ ├──run_train.sh // shell script for train on Ascend
│ ├──run_eval.sh // shell script for evaluation on Ascend
│ ├──sample.txt // example shell commands to run the above two scripts
├── src
│ ├──dataset.py // creating dataset
│ ├──textrcnn.py // textrcnn architecture
│ ├──config.py // parameter configuration
├── train.py // training script
├── export.py // export script
├── eval.py // evaluation script
├── data_helpers.py // dataset split script
├── sample.txt // the shell to train and eval the model without scripts
Script Parameters
Parameters for both training and evaluation can be set in config.py.
- config for TextRCNN, Sentence polarity dataset v1.0.
'num_epochs': 10,                   # total training epochs
'lstm_num_epochs': 15,              # total training epochs when using lstm
'batch_size': 64,                   # training batch size
'cell': 'gru',                      # the RNN architecture, can be 'vanilla', 'gru' or 'lstm'
'ckpt_folder_path': './ckpt',       # the path to save the checkpoints
'preprocess_path': './preprocess',  # the directory to save the processed data
'preprocess': 'false',              # whether to preprocess the data
'data_path': './data/',             # the path to store the split data
'lr': 1e-3,                         # the training learning rate
'lstm_lr_init': 2e-3,               # initial learning rate when using lstm
'lstm_lr_end': 5e-4,                # final learning rate when using lstm
'lstm_lr_max': 3e-3,                # max learning rate when using lstm
'lstm_lr_warm_up_epochs': 2,        # number of warm-up epochs when using lstm
'lstm_lr_adjust_epochs': 9,         # the lr is adjusted over these epochs, after which it stays at lr_end, when using lstm
'emb_path': './word2vec',           # the directory where the embedding file is saved
'embed_size': 300,                  # the dimension of the word embedding
'save_checkpoint_steps': 149,       # interval (in steps) for saving checkpoints
'keep_checkpoint_max': 10           # max number of checkpoints to keep
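The lstm_lr_* values describe a warm-up-then-decay schedule: the learning rate ramps from lstm_lr_init to lstm_lr_max over the warm-up epochs, is adjusted down to lstm_lr_end over the adjust epochs, and stays at lstm_lr_end afterwards. A hedged sketch of such a schedule follows; a linear shape is assumed here, and the exact curve used by the training script may differ.
# Hedged sketch of a warm-up/decay learning-rate schedule built from the
# lstm_lr_* values above; the exact curve used by the repository may differ.
def lstm_lr_schedule(steps_per_epoch, num_epochs=15, lr_init=2e-3, lr_end=5e-4,
                     lr_max=3e-3, warm_up_epochs=2, adjust_epochs=9):
    warm_up_steps = warm_up_epochs * steps_per_epoch
    adjust_steps = adjust_epochs * steps_per_epoch
    lrs = []
    for step in range(num_epochs * steps_per_epoch):
        if step < warm_up_steps:
            # linear warm-up from lr_init to lr_max
            lr = lr_init + (lr_max - lr_init) * step / warm_up_steps
        elif step < warm_up_steps + adjust_steps:
            # linear decay from lr_max to lr_end over the adjust window
            lr = lr_max - (lr_max - lr_end) * (step - warm_up_steps) / adjust_steps
        else:
            lr = lr_end  # stay at lr_end for the remaining epochs
        lrs.append(lr)
    return lrs

# 9596 train samples / batch_size 64 gives roughly 149 steps per epoch
lrs = lstm_lr_schedule(steps_per_epoch=149)
print(lrs[0], max(lrs), lrs[-1])  # 2e-3 at start, peak 3e-3, ends at 5e-4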
Performance
Model | MindSpore + Ascend | TensorFlow + GPU |
---|---|---|
Resource | Ascend 910 | NV SMX2 V100-32G |
Framework Version | 1.0.1 | 1.4.0 |
Dataset | Sentence polarity dataset v1.0 | Sentence polarity dataset v1.0 |
batch_size | 64 | 64 |
Accuracy | 0.78 | 0.78 |
Speed | 35 ms/step | 77 ms/step |
ModelZoo Homepage
Please check the official homepage.