Graph Attention Networks Description
Model architecture
Dataset
Features
- Mixed Precision
Environment Requirements
Quick Start
Script Description
Model Description
- Performance
  - Evaluation Performance
  - Inference Performance
Description of random situation
ModelZoo Homepage

Graph Attention Networks Description

Graph Attention Networks (GAT) were proposed in 2017 by Petar Veličković et al. By leveraging masked self-attentional layers to address the shortcomings of prior graph-based methods, GAT achieved or matched state-of-the-art performance on both transductive datasets such as Cora and inductive datasets such as PPI. This is an example of training GAT on the Cora dataset in MindSpore.

Paper: Veličković, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., & Bengio, Y. (2017). Graph attention networks. arXiv preprint arXiv:1710.10903.

Model architecture

Note that the node update function depends on whether the attention layer is the output layer of the network: hidden layers concatenate the outputs of their attention heads, while the output layer averages them.
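The two ways of combining attention heads can be illustrated with a small sketch. It assumes, as in the GAT paper, that hidden layers concatenate head outputs and the output layer averages them. The function names and sample values are illustrative only and are not taken from src/gat.py.

```python
from typing import List

Vector = List[float]


def concat_heads(head_outputs: List[Vector]) -> Vector:
    """Hidden-layer update: join the K head outputs into one long vector."""
    return [value for head in head_outputs for value in head]


def average_heads(head_outputs: List[Vector]) -> Vector:
    """Output-layer update: average the K head outputs element-wise."""
    num_heads = len(head_outputs)
    return [sum(values) / num_heads for values in zip(*head_outputs)]


# Two hypothetical attention heads, each producing 3 features for one node.
heads = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
hidden = concat_heads(heads)    # 6 features: feature size grows with the head count
output = average_heads(heads)   # 3 features: size stays equal to the class count
print(hidden, output)
```

Averaging keeps the output dimension equal to the number of classes regardless of how many heads the final layer uses, which is why concatenation is not a sensible choice there.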

Dataset

Note that you can run the scripts on the datasets used in the original paper or on those widely used in the relevant domain or network architecture. The following sections describe how to run the scripts with the datasets below.

Dataset size: the statistics of the datasets used are summarized below:

Statistic	Cora	Citeseer
Task	Transductive	Transductive
# Nodes	2708 (1 graph)	3327 (1 graph)
# Edges	5429	4732
# Features/Node	1433	3703
# Classes	7	6
# Training Nodes	140	120
# Validation Nodes	500	500
# Test Nodes	1000	1000

Data Preparation
- Place the dataset in any path you like. The folder should include the following files (we use the Cora dataset as an example):

  .
  └─data
      ├─ind.cora.allx
      ├─ind.cora.ally
      ├─ind.cora.graph
      ├─ind.cora.test.index
      ├─ind.cora.tx
      ├─ind.cora.ty
      ├─ind.cora.x
      └─ind.cora.y
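Before converting the data, it can help to confirm that the folder is complete. The following is a minimal sketch, not part of the repository scripts. The helper name is hypothetical, and it assumes that Citeseer files follow the same `ind.<dataset>.<suffix>` naming as the Cora listing above.

```python
import os
from typing import List

# File suffixes taken from the Cora folder listing above.
REQUIRED_SUFFIXES = ["allx", "ally", "graph", "test.index", "tx", "ty", "x", "y"]


def missing_dataset_files(data_dir: str, dataset_name: str) -> List[str]:
    """Return the expected file names that are absent from data_dir."""
    expected = [f"ind.{dataset_name}.{suffix}" for suffix in REQUIRED_SUFFIXES]
    return [
        name for name in expected
        if not os.path.isfile(os.path.join(data_dir, name))
    ]


if __name__ == "__main__":
    missing = missing_dataset_files("./data", "cora")
    print("missing files:", missing or "none")
```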

Generate the dataset in MindRecord format for Cora or Citeseer.

cd ./scripts
# SRC_PATH is the path of the dataset files you downloaded, DATASET_NAME is cora or citeseer
sh run_process_data_ascend.sh [SRC_PATH] [DATASET_NAME]

Launch

# Generate the dataset in MindRecord format for Cora
./run_process_data_ascend.sh ./data cora
# Generate the dataset in MindRecord format for Citeseer
./run_process_data_ascend.sh ./data citeseer

Features

Mixed Precision

To utilize the strong computation power of the Ascend chip and accelerate the training process, mixed precision training is used. MindSpore is able to cope with FP32 inputs and FP16 operators. In the GAT example, the model is set to FP16 mode except for the loss calculation part.
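The README does not say why the loss is excluded from FP16. One general limitation of FP16 that makes such an exception plausible is that accumulating many terms can stall once the running total outgrows the format's precision. The sketch below illustrates that limitation with Python's standard `struct` module. It is an illustration under that assumption, not MindSpore's implementation.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def accumulate(values, cast):
    """Sum values, rounding the running total to the given precision each step."""
    total = cast(0.0)
    for value in values:
        total = cast(total + cast(value))
    return total


ones = [1.0] * 4096
fp16_sum = accumulate(ones, to_fp16)  # stalls at 2048.0: 2048 + 1 rounds back to 2048
fp32_sum = accumulate(ones, to_fp32)  # 4096.0
print(fp16_sum, fp32_sum)
```

FP16 has a 10-bit mantissa, so above 2048 it can no longer represent every integer. A reduction such as a loss summed over many nodes is therefore a natural candidate for FP32.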

Environment Requirements

Hardware (Ascend)
Framework
- MindSpore
For more information, please check the resources below:
- MindSpore Tutorials
- MindSpore Python API

Quick Start

After installing MindSpore via the official website and generating the dataset correctly, you can start training and evaluation as follows.

Running on Ascend

# run training example with cora dataset, DATASET_NAME is cora
sh run_train_ascend.sh [DATASET_NAME]

Running on ModelArts


# Train/eval 1p with Ascend
# (1) Perform a or b.
#       a. Set "enable_modelarts=True" in the default_config.yaml file.
#          Set "data_dir='/cache/data'" in the default_config.yaml file.
#          (optional) Set "checkpoint_url='s3://dir_to_your_pretrained/'" in the default_config.yaml file.
#          Set any other parameters you need in the default_config.yaml file.
#       b. Add "enable_modelarts=True" on the website UI.
#          Add "data_dir='/cache/data'" on the website UI.
#          (optional) Add "checkpoint_url='s3://dir_to_your_pretrained/'" on the website UI.
#          Add any other parameters you need on the website UI.
# (3) Upload or copy your pretrained model to the S3 bucket if you want to fine-tune.
# (4) Upload the original Cora/Citeseer dataset to the S3 bucket.
# (5) Set the code directory to "/path/gat" on the website UI.
# (6) Set the startup file to "train.py" on the website UI.
# (7) Set the "Dataset path", "Output file path" and "Job log path" to your paths on the website UI.
# (8) Create your job.
#
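Options a and b set the same keys, either in the YAML file or as job parameters. The sketch below shows one way such parameters could be merged over file defaults using `argparse`. The `DEFAULTS` values are hypothetical stand-ins for default_config.yaml, and the assumption that job parameters arrive as `--key=value` arguments is not confirmed by this README. The actual logic lives in src/model_utils/config.py and may differ.

```python
import argparse

# Hypothetical defaults standing in for the contents of default_config.yaml.
DEFAULTS = {
    "enable_modelarts": False,
    "data_dir": "./data",
    "checkpoint_url": "",
}


def str_to_bool(text: str) -> bool:
    """Interpret common textual spellings of a boolean flag."""
    return text.strip().lower() in ("true", "1", "yes")


def parse_overrides(argv):
    """Build one argument per default key so job parameters can override it."""
    parser = argparse.ArgumentParser(description="merge job parameters over defaults")
    for key, default in DEFAULTS.items():
        arg_type = str_to_bool if isinstance(default, bool) else type(default)
        parser.add_argument(f"--{key}", type=arg_type, default=default)
    return vars(parser.parse_args(argv))


config = parse_overrides(["--enable_modelarts=True", "--data_dir=/cache/data"])
print(config)
```

Keys that are not overridden keep their default, so option a (editing the file) and option b (adding parameters) end in the same configuration.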

Script Description

Script and Sample Code

.
└─gat
  ├─README.md
  ├─scripts
  | ├─run_process_data_ascend.sh  # Generate dataset in mindrecord format
  | └─run_train_ascend.sh         # Launch training
  |
  ├─src
  | ├─dataset.py           # Data preprocessing
  | ├─gat.py               # GAT model
  | ├─utils.py             # Utils for training gat
  | └─model_utils
  |   ├─config.py          # Processing configuration parameters
  |   ├─device_adapter.py  # Get cloud ID
  |   ├─local_adapter.py   # Get local ID
  |   └─moxing_adapter.py  # Parameter processing
  |
  ├─default_config.yaml    # Training parameter profile
  └─train.py               # Train net

Script Parameters

Parameters for both training and evaluation can be set in default_config.yaml.

Config for GAT, Cora dataset

"learning_rate": 0.005,            # Learning rate
"num_epochs": 200,                 # Number of training epochs
"hid_units": [8],                  # Hidden units for attention head at each layer
"n_heads": [8, 1],                 # Number of heads for each layer
"early_stopping": 100,             # Early stopping patience
"l2_coeff": 0.0005,                # L2 coefficient
"attn_dropout": 0.6,               # Attention dropout ratio
"feature_dropout": 0.6             # Feature dropout ratio
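The early_stopping value is a patience: training stops once the monitored metric has failed to improve for that many consecutive epochs. The following is a minimal sketch of the idea, assuming validation loss is the monitored quantity. The class name is hypothetical, and the exact criterion used by train.py is not shown in this README.

```python
class EarlyStopping:
    """Stop when validation loss has not improved for `patience` epochs in a row."""

    def __init__(self, patience: int):
        self.patience = patience
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def update(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up losses with a small patience so the stop is easy to follow.
stopper = EarlyStopping(patience=2)
stop_epoch = None
for epoch, loss in enumerate([1.0, 0.9, 0.95, 0.97, 0.8]):
    if stopper.update(loss):
        stop_epoch = epoch
        break
print(stop_epoch, stopper.best_loss)
```

With a patience of 100 and 200 epochs, as in the config above, early stopping only triggers if validation stalls for half of the run.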

Training Process

Training

Running on Ascend

sh run_train_ascend.sh [DATASET_NAME]

The training result is stored in the scripts path, in a folder whose name begins with "train". The log contains results like the following.

Epoch:0, train loss=1.98498 train acc=0.17143 | val loss=1.97946 val acc=0.27200
Epoch:1, train loss=1.98345 train acc=0.15000 | val loss=1.97233 val acc=0.32600
Epoch:2, train loss=1.96968 train acc=0.21429 | val loss=1.96747 val acc=0.37400
Epoch:3, train loss=1.97061 train acc=0.20714 | val loss=1.96410 val acc=0.47600
Epoch:4, train loss=1.96864 train acc=0.13571 | val loss=1.96066 val acc=0.59600
...
Epoch:195, train loss=1.45111 train_acc=0.56429 | val_loss=1.44325 val_acc=0.81200
Epoch:196, train loss=1.52476 train_acc=0.52143 | val_loss=1.43871 val_acc=0.81200
Epoch:197, train loss=1.35807 train_acc=0.62857 | val_loss=1.43364 val_acc=0.81400
Epoch:198, train loss=1.47566 train_acc=0.51429 | val_loss=1.42948 val_acc=0.81000
Epoch:199, train loss=1.56411 train_acc=0.55000 | val_loss=1.42632 val_acc=0.80600
Test loss=1.5366285, test acc=0.84199995
...
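To pick the best epoch out of such a log, the lines can be parsed with a regular expression. The sketch below is a hypothetical helper, not part of the repository. It accepts both separators seen in the sample output (`train acc` and `train_acc`), and the sample lines are copied from the log above.

```python
import re

# Accept "val acc" as well as "val_acc": both spellings occur in the sample log.
EPOCH_PATTERN = re.compile(
    r"Epoch:(?P<epoch>\d+), train[ _]loss=(?P<train_loss>[\d.]+) "
    r"train[ _]acc=(?P<train_acc>[\d.]+) \| "
    r"val[ _]loss=(?P<val_loss>[\d.]+) val[ _]acc=(?P<val_acc>[\d.]+)"
)


def parse_epoch_line(line: str):
    """Return the metrics of one epoch line as a dict, or None if it does not match."""
    match = EPOCH_PATTERN.search(line)
    if match is None:
        return None
    metrics = {key: float(value) for key, value in match.groupdict().items()}
    metrics["epoch"] = int(match.group("epoch"))
    return metrics


def best_validation_epoch(lines):
    """Return the parsed epoch with the highest validation accuracy."""
    parsed = [m for m in map(parse_epoch_line, lines) if m is not None]
    return max(parsed, key=lambda m: m["val_acc"])


sample = [
    "Epoch:0, train loss=1.98498 train acc=0.17143 | val loss=1.97946 val acc=0.27200",
    "Epoch:196, train loss=1.52476 train_acc=0.52143 | val_loss=1.43871 val_acc=0.81200",
    "Epoch:197, train loss=1.35807 train_acc=0.62857 | val_loss=1.43364 val_acc=0.81400",
    "Test loss=1.5366285, test acc=0.84199995",
]
best = best_validation_epoch(sample)
print(best["epoch"], best["val_acc"])
```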

Inference Process

Export MindIR

python export.py --config_path [CONFIG_PATH] --ckpt_file [CKPT_PATH] --file_name [FILE_NAME] --file_format [FILE_FORMAT]

The ckpt_file parameter is required, and FILE_FORMAT must be one of ["AIR", "MINDIR"].
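When the export is driven from another script, the constraints on ckpt_file and the file format can be checked before export.py is launched. The sketch below is a hypothetical wrapper that only assembles the command line shown above. The sample file names are placeholders.

```python
import shlex
from typing import List

SUPPORTED_FORMATS = ("AIR", "MINDIR")


def build_export_command(config_path: str, ckpt_file: str,
                         file_name: str, file_format: str = "MINDIR") -> List[str]:
    """Validate the arguments and return the export.py command as an argument list."""
    if not ckpt_file:
        raise ValueError("ckpt_file is required")
    if file_format not in SUPPORTED_FORMATS:
        raise ValueError(f"file_format must be one of {SUPPORTED_FORMATS}")
    return [
        "python", "export.py",
        "--config_path", config_path,
        "--ckpt_file", ckpt_file,
        "--file_name", file_name,
        "--file_format", file_format,
    ]


# Placeholder paths for illustration only.
command = build_export_command("default_config.yaml", "gat.ckpt", "gat")
print(" ".join(shlex.quote(part) for part in command))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting problems with paths that contain spaces.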

Infer on Ascend310

Before performing inference, the MINDIR file must be exported with the export.py script. We only provide an example of inference using a MINDIR model.

# Ascend310 inference
bash run_infer_310.sh [MINDIR_PATH] [DATASET_NAME] [DATASET_PATH] [NEED_PREPROCESS] [DEVICE_ID]

DATASET_NAME must be one of ['cora', 'citeseer'].
NEED_PREPROCESS indicates whether preprocessing is needed; its value is 'y' or 'n'.
DEVICE_ID is optional; the default value is 0.

Result

The inference result is saved in the current path. You can find a result like the following in the acc.log file.

test acc=0.84199995

Model Description

Performance

Parameter	GAT
Resource	Ascend 910; OS Euler2.8
Uploaded Date	06/16/2020 (month/day/year)
MindSpore Version	1.0.0
Dataset	Cora/Citeseer
Training Parameter	epoch=200
Optimizer	Adam
Loss Function	Softmax Cross Entropy
Accuracy	83.0/72.5
Speed	0.195s/epoch
Total time	39s
Scripts	GAT Script
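The speed and total time in the table are consistent with each other, as a quick check shows (values copied from the table):

```python
epochs = 200               # Training Parameter: epoch=200
seconds_per_epoch = 0.195  # Speed: 0.195s/epoch

# Round to one decimal to hide floating-point noise in the product.
total_seconds = round(epochs * seconds_per_epoch, 1)
print(total_seconds)  # 39.0
```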

Description of random situation

The GAT model contains many dropout operations. To disable dropout, set attn_dropout and feature_dropout to 0 in default_config.yaml. Note that doing so causes the accuracy to drop to approximately 80%.
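Dropout is the source of run-to-run variation here, and a rate of 0 turns it into the identity. The sketch below uses the common "inverted dropout" formulation in plain Python to show both points. It is an assumption-level illustration and does not reproduce MindSpore's dropout operator.

```python
import random
from typing import List


def dropout(values: List[float], rate: float, rng: random.Random) -> List[float]:
    """Zero each value with probability `rate` and rescale the survivors."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if rate == 0.0:
        return list(values)  # dropout disabled: output equals input
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


features = [1.0, 2.0, 3.0, 4.0]

# With rate 0 the result is deterministic regardless of the seed.
print(dropout(features, 0.0, random.Random(0)))

# With rate 0.6 the result depends on the random state; fixing the seed
# makes a run repeatable.
print(dropout(features, 0.6, random.Random(1)))
print(dropout(features, 0.6, random.Random(1)))
```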

ModelZoo Homepage

Please check the official homepage.