!8461 add hccl timeout annotation in README for maskrcnn

From: @gengdongjie
Reviewed-by: @yingjy,@c_34
Signed-off-by: @yingjy
This commit is contained in:
mindspore-ci-bot 2020-11-12 16:44:00 +08:00 committed by Gitee
commit f1f92fa4f9
1 changed file with 2 additions and 0 deletions


@@ -101,6 +101,8 @@ pip install mmcv==0.2.14
1. To speed up data preprocessing, MindSpore provides a data format named MindRecord, so the first step is to generate MindRecord files from the COCO2017 dataset before training. Converting the raw COCO2017 dataset to MindRecord format may take about 4 hours.
2. For distributed training, an [hccl configuration file](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools) in JSON format needs to be created in advance.
3. PRETRAINED_CKPT is a resnet50 checkpoint trained on ImageNet2012.
4. For large models like MaskRCNN, it is better to export the environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend the HCCL connection check time from the default 120 seconds to 600 seconds. Otherwise, the connection could time out, since compilation time grows with model size.
4. Execute eval script.
After training, you can start evaluation as follows:
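The timeout setting from note 4 above can be applied in the launch shell before starting distributed training. A minimal sketch (the `env | grep` check is illustrative, not part of the commit):

```shell
#!/bin/sh
# Extend the HCCL connection check window from the default 120 s to 600 s,
# so device handshakes do not time out while a large model is still compiling.
export HCCL_CONNECT_TIMEOUT=600

# Verify the variable is visible to child processes (e.g. the training script).
env | grep HCCL_CONNECT_TIMEOUT
```

The variable must be exported in the same shell (or wrapper script) that launches each training process, so every rank inherits it.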