- First, create the conda environment for `data_construction`:

      conda create -n text2sql python=3.6
      source activate text2sql
      pip install grakel
      python -c "import nltk; nltk.download('punkt')"
      cd snowball
      pip install -r requirements.txt
- Download raw data and unzip it into the `raw_data` directory. Make sure the datasets are correctly located as:

      data
      ├── database
      ├── tables.json
      └── text_to_sql_data.json
- Execute the commands in `preprocess.ipynb` to generate three data files, `alldata.json`, `logic.json`, and `question_sql.json`, in the `preprocessed` directory.
- Following the paper *Logic-Consistency Text Generation from Semantic Parses*, either train a snowball model from scratch or download our pre-trained snowball checkpoint and unzip it into the `saves/checkpoint-epoch-10.0` directory. Then run the following commands to generate the `final_generation.json` file:

      cd snowball
      python eval.py
- Run the commands in `convert2id.ipynb` to generate the final pre-training data `alltask_final.txt` in the `final_data` directory.
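
Because each step depends on files produced by the previous one, it can help to check which expected files are still missing before moving on. The following is a minimal sketch, not part of the repository: the helper `missing_artifacts` and the `EXPECTED` table are hypothetical, all paths are assumed to be relative to the directory containing this README, and the raw files are assumed to sit directly under `raw_data`. The locations of `final_generation.json` and the checkpoint are not stated above, so they are left out.

```python
from pathlib import Path
from typing import Dict, List

# Hypothetical table of the paths named in the steps above, keyed by the
# step that needs or produces them. Assumption: paths are relative to the
# directory containing this README, and raw files sit directly in raw_data/.
EXPECTED = {
    "raw data": [
        "raw_data/database",
        "raw_data/tables.json",
        "raw_data/text_to_sql_data.json",
    ],
    "preprocess.ipynb": [
        "preprocessed/alldata.json",
        "preprocessed/logic.json",
        "preprocessed/question_sql.json",
    ],
    "convert2id.ipynb": ["final_data/alltask_final.txt"],
}


def missing_artifacts(root: str = ".") -> Dict[str, List[str]]:
    """Return, per step, the expected paths that do not exist under root."""
    base = Path(root)
    report = {}
    for step, paths in EXPECTED.items():
        absent = [p for p in paths if not (base / p).exists()]
        if absent:
            # Only steps with at least one missing path are reported.
            report[step] = absent
    return report


if __name__ == "__main__":
    for step, paths in missing_artifacts().items():
        print(f"{step}: missing {', '.join(paths)}")
```

An empty result means every listed path exists; it says nothing about whether the file contents are valid.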