Results of STAR + LGESQL

This section presents the results on the CoSQL and SParC datasets, obtained by fine-tuning STAR with the LGESQL parser.

Create conda environment

Run the following commands to set up the environment.

Create conda environment lgesql:

  • In our experiments, we use torch==1.8.0 with CUDA version 11.1:

    conda create -n lgesql python=3.6
    source activate lgesql
    pip install torch==1.8.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
    pip install -r requirements.txt
    
  • Next, download dependencies:

    python -c "import nltk; nltk.download('punkt')"
    python -c "import stanza; stanza.download('en')"
    python -c "import nltk; nltk.download('stopwords')"
    
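  • Optionally, run a quick sanity check that PyTorch sees the GPU and that the NLTK/Stanza resources above were downloaded (a minimal check, assuming a CUDA-capable GPU is available):

    python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
    python -c "import nltk; nltk.data.find('tokenizers/punkt'); nltk.data.find('corpora/stopwords'); print('nltk resources ok')"
    python -c "import stanza; stanza.Pipeline('en', processors='tokenize'); print('stanza ok')"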

Using our checkpoints for evaluation:

  • Download our processed datasets for CoSQL or SParC and unzip them into cosql/data and sparc/data, respectively. Make sure the datasets are correctly located as follows (an example of the unzip step is sketched after the listing):
data
├── database
├── dev_electra.json
├── dev_electra.bin
├── dev_electra.lgesql.bin
├── dev_gold.txt
├── label.json
├── tables_electra.bin
├── tables.json
├── train_electra.bin
├── train_electra.json
└── train_electra.lgesql.bin
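For example, assuming the downloaded archives are named cosql_data.zip and sparc_data.zip (hypothetical names; use whatever the download provides), they can be unpacked and checked like this:

    unzip cosql_data.zip -d cosql/data    # hypothetical archive name
    unzip sparc_data.zip -d sparc/data    # hypothetical archive name
    ls cosql/data                         # should match the listing above

If an archive already contains a top-level data/ folder, extract it one level higher so the files end up directly under cosql/data and sparc/data.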
  • Download our processed checkpoints for CoSQL or SParC and unzip them into cosql/checkpoints and sparc/checkpoints, respectively. Make sure the checkpoints are correctly located as follows:
checkpoints
├── model_IM.bin
└── params.json
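Similarly for the checkpoints (archive name again hypothetical); reading params.json back is a quick check that the files landed in the right place:

    unzip cosql_checkpoints.zip -d cosql/checkpoints    # hypothetical archive name
    python -c "import json; print(json.load(open('cosql/checkpoints/params.json')))"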
  • Execute the following command; the results are recorded in result_XXX.txt (it will take 10 to 30 minutes on one Tesla V100-PCIE-32GB GPU):

    sh run/run_evaluation.sh
    
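Once the script finishes, the recorded results can be inspected directly (the exact file name and location depend on the dataset and the script's working directory):

    cat result_*.txt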

Train from scratch

  • You can train STAR yourself by following the process in the pretrain file, or download our pre-trained STAR and unzip it into the pretrained_models/sss directory. Make sure the STAR model is correctly located as follows:
pretrained_models
└── sss
      ├── config.json
      ├── pytorch_model.bin
      └── vocab.txt
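As a quick check that the model files are usable, they can be loaded with Hugging Face transformers. This is only a sketch and assumes the checkpoint follows the standard transformers layout, which the config.json / pytorch_model.bin / vocab.txt files suggest:

    python -c "from transformers import AutoModel; m = AutoModel.from_pretrained('pretrained_models/sss'); print(type(m).__name__)"
    python -c "from transformers import AutoTokenizer; t = AutoTokenizer.from_pretrained('pretrained_models/sss'); print(type(t).__name__, len(t))"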
  • You can preprocess the data with the process_data&&label.py file, following the methods in LGESQL, or directly download our processed data as described above.

  • Training (it will take 4 days on one Tesla V100-PCIE-32GB GPU):

    sh run/run_lgesql_plm.sh