DeepText for Ascend

DeepText Description

DeepText is a convolutional neural network architecture for text detection in non-specific scenarios. The DeepText system is based on the elegant framework of Faster R-CNN. This idea was proposed in the paper "DeepText: A new approach for text proposal generation and text detection in natural images.", published in 2017.

Paper Zhuoyao Zhong, Lianwen Jin, Shuangping Huang, South China University of Technology (SCUT), Published in ICASSP 2017.

Model architecture

The overall network architecture of DeepText is shown below:

Link

Dataset

Here we use 4 datasets for training and 1 dataset for evaluation.

  • Dataset1: ICDAR 2013: Focused Scene Text
    • Train: 142MB, 229 images
    • Test: 110MB, 233 images
  • Dataset2: ICDAR 2013: Born-Digital Images
    • Train: 27.7MB, 410 images
  • Dataset3: SCUT-FORU: Flickr OCR Universal Database
    • Train: 388MB, 1715 images
  • Dataset4: CocoText v2 (subset of MSCOCO2017):
    • Train: 13GB, 63686 images

Features

Environment Requirements

Script description

Script and sample code

.
└─deeptext
  ├─README.md
  ├─scripts
    ├─run_standalone_train_ascend.sh    # launch standalone training on the Ascend platform (1p)
    ├─run_distribute_train_ascend.sh    # launch distributed training on the Ascend platform (8p)
    └─run_eval_ascend.sh                # launch evaluation on the Ascend platform
  ├─src
    ├─DeepText
      ├─__init__.py                     # package init file
      ├─anchor_genrator.py              # anchor generator
      ├─bbox_assign_sample.py           # proposal layer for stage 1
      ├─bbox_assign_sample_stage2.py    # proposal layer for stage 2
      ├─deeptext_vgg16.py               # main network definition
      ├─proposal_generator.py           # proposal generator
      ├─rcnn.py                         # rcnn
      ├─roi_align.py                    # roi_align cell wrapper
      ├─rpn.py                          # region-proposal network
      └─vgg16.py                        # backbone
    ├─config.py                       # training configuration
    ├─dataset.py                      # data preprocessing
    ├─lr_schedule.py                  # learning rate scheduler
    ├─network_define.py               # network definition
    └─utils.py                        # commonly used utility functions
  ├─eval.py                           # eval net
  ├─export.py                         # export checkpoint; supports .onnx, .air and .mindir conversion
  └─train.py                          # train net

Training process

Usage

  • Ascend:
# distribute training example(8p)
sh run_distribute_train_ascend.sh [IMGS_PATH] [ANNOS_PATH] [RANK_TABLE_FILE] [PRETRAINED_PATH] [COCO_TEXT_PARSER_PATH]
# standalone training
sh run_standalone_train_ascend.sh [IMGS_PATH] [ANNOS_PATH] [PRETRAINED_PATH] [COCO_TEXT_PARSER_PATH] [DEVICE_ID]
# evaluation:
sh run_eval_ascend.sh [IMGS_PATH] [ANNOS_PATH] [CHECKPOINT_PATH] [COCO_TEXT_PARSER_PATH] [DEVICE_ID]

Notes: For RANK_TABLE_FILE, refer to Link, and the device_ip can be obtained as described in Link. For large models, it is better to export the environment variable export HCCL_CONNECT_TIMEOUT=600 to extend the HCCL connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection may time out, because compilation time increases with model size.

The scripts bind processor cores according to device_num and the total number of processor cores. If you do not want this, remove the taskset operations in scripts/run_distribute_train_ascend.sh.

PRETRAINED_PATH should be a checkpoint of VGG16 trained on ImageNet2012. The weight names in the checkpoint dict must match those expected by the network exactly, and batch normalization must have been enabled when training VGG16; otherwise later steps will fail. For COCO_TEXT_PARSER_PATH (coco_text.py), refer to Link.
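
The snippet below is a minimal sketch (not part of this repository) for inspecting such a checkpoint before training; the checkpoint file name is a placeholder.

```python
# Minimal sketch (not part of this repository): list the parameter names stored in a
# VGG16 checkpoint so they can be compared against the backbone defined in
# src/DeepText/vgg16.py. The checkpoint file name below is a placeholder.
from mindspore.train.serialization import load_checkpoint

param_dict = load_checkpoint("vgg16_imagenet2012_bn.ckpt")  # placeholder path
for name in sorted(param_dict):
    print(name)
# Every backbone weight name must match exactly, and batch-norm parameters
# (e.g. gamma, beta, moving_mean, moving_variance) must be present, because the
# pretrained VGG16 is expected to have been trained with batch normalization enabled.
```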

Launch

# training example
  shell:
    Ascend:
      # distribute training example(8p)
      sh run_distribute_train_ascend.sh [IMGS_PATH] [ANNOS_PATH] [RANK_TABLE_FILE] [PRETRAINED_PATH] [COCO_TEXT_PARSER_PATH]
      # standalone training
      sh run_standalone_train_ascend.sh [IMGS_PATH] [ANNOS_PATH] [PRETRAINED_PATH] [COCO_TEXT_PARSER_PATH] [DEVICE_ID]

Result

Training results will be stored in the example path. Checkpoints will be stored at ckpt_path by default, the training log will be redirected to ./log, and the loss will be redirected to ./loss_0.log as shown below.

469 epoch: 1 step: 982 ,rpn_loss: 0.03940, rcnn_loss: 0.48169, rpn_cls_loss: 0.02910, rpn_reg_loss: 0.00344, rcnn_cls_loss: 0.41943, rcnn_reg_loss: 0.06223, total_loss: 0.52109
659 epoch: 2 step: 982 ,rpn_loss: 0.03607, rcnn_loss: 0.32129, rpn_cls_loss: 0.02916, rpn_reg_loss: 0.00230, rcnn_cls_loss: 0.25732, rcnn_reg_loss: 0.06390, total_loss: 0.35736
847 epoch: 3 step: 982 ,rpn_loss: 0.07074, rcnn_loss: 0.40527, rpn_cls_loss: 0.03494, rpn_reg_loss: 0.01193, rcnn_cls_loss: 0.30591, rcnn_reg_loss: 0.09937, total_loss: 0.47601
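
If you want to track the training curve, the per-step losses can be extracted from loss_0.log; the following is only an illustrative sketch based on the log format shown above.

```python
# Illustrative sketch (not part of the repository): parse ./loss_0.log, whose lines
# follow the format shown above, and collect the total_loss value per epoch/step.
import re

pattern = re.compile(r"epoch:\s*(\d+)\s+step:\s*(\d+).*total_loss:\s*([0-9.]+)")
records = []
with open("loss_0.log") as log_file:
    for line in log_file:
        match = pattern.search(line)
        if match:
            epoch, step = int(match.group(1)), int(match.group(2))
            total_loss = float(match.group(3))
            records.append((epoch, step, total_loss))

for epoch, step, total_loss in records:
    print(f"epoch {epoch}, step {step}: total_loss {total_loss:.5f}")
```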

Eval process

Usage

You can start evaluation using Python or shell scripts. The usage of shell scripts is as follows:

  • Ascend:
  sh run_eval_ascend.sh [IMGS_PATH] [ANNOS_PATH] [CHECKPOINT_PATH] [COCO_TEXT_PARSER_PATH] [DEVICE_ID]

Launch

# eval example
  shell:
      Ascend:
            sh run_eval_ascend.sh [IMGS_PATH] [ANNOS_PATH] [CHECKPOINT_PATH] [COCO_TEXT_PARSER_PATH] [DEVICE_ID]

The checkpoint can be produced during the training process.

Result

Evaluation results will be stored in the example path. You can find results like the following in the log.

========================================

class 1 precision is 88.01%, recall is 82.77%
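
For reference, an F1-score can be derived from the reported precision and recall; the snippet below is just a small arithmetic sketch using the numbers above.

```python
# Sketch: derive an F1-score from the precision and recall reported above.
precision, recall = 0.8801, 0.8277
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.4f}")  # ~0.8531
```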

Model description

Performance

Training Performance

| Parameters          | Ascend                                                                              |
| ------------------- | ----------------------------------------------------------------------------------- |
| Model Version       | DeepText                                                                            |
| Resource            | Ascend 910; CPU 2.60 GHz, 192 cores; memory 755 GB                                  |
| Uploaded Date       | 12/26/2020                                                                          |
| MindSpore Version   | 1.1.0                                                                               |
| Dataset             | 66040 images                                                                        |
| Batch_size          | 2                                                                                   |
| Training Parameters | src/config.py                                                                       |
| Optimizer           | Momentum                                                                            |
| Loss Function       | SoftmaxCrossEntropyWithLogits for classification, SmoothL2Loss for bbox regression |
| Loss                | ~0.008                                                                              |
| Total time (8p)     | 4h                                                                                  |
| Scripts             | deeptext script                                                                     |

Inference Performance

| Parameters          | Ascend                                              |
| ------------------- | --------------------------------------------------- |
| Model Version       | DeepText                                            |
| Resource            | Ascend 910; CPU 2.60 GHz, 192 cores; memory 755 GB  |
| Uploaded Date       | 12/26/2020                                          |
| MindSpore Version   | 1.1.0                                               |
| Dataset             | 229 images                                          |
| Batch_size          | 2                                                   |
| Accuracy            | precision = 0.8801, recall = 0.8277                 |
| Total time          | 1 min                                               |
| Model for inference | 3492M (.ckpt file)                                  |

Training performance results

| Ascend | train performance |
| ------ | ----------------- |
| 1p     | 14 img/s          |
| 8p     | 50 img/s          |

Description of Random Situation

We set the seed to 1 in train.py.
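
A minimal illustration of how this fixed seed is applied in a MindSpore training script (the set_seed call below reflects the common MindSpore helper; the exact code in train.py may differ):

```python
# Minimal sketch: fix the global random seed, as train.py does with seed = 1,
# so that weight initialization and dataset shuffling are reproducible.
from mindspore.common import set_seed

set_seed(1)
```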

ModelZoo Homepage

Please check the official homepage.