From cfb64add91a54955d8d878ca0b2f032e160e2954 Mon Sep 17 00:00:00 2001
From: Yi Dai <38886373+debby1103@users.noreply.github.com>
Date: Wed, 8 Nov 2023 15:20:43 +0800
Subject: [PATCH 1/3] Update oltqa/README.md

---
 oltqa/README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/oltqa/README.md b/oltqa/README.md
index 23065f1..3a9366c 100644
--- a/oltqa/README.md
+++ b/oltqa/README.md
@@ -6,7 +6,7 @@ The PyTorch implementation of paper [Long-Tailed Question Answering in an Open W
 cd LongTailQA
 pip install -r r.txt
 ```
-The raw dataset is available [here](https://drive.google.com/file/d/12-w1bMAevcXmjsl8DKcCrJe1kI-pyZTd/view?usp=share_link) to be placed in data_process/data
+The raw dataset is available [here](https://drive.google.com/file/d/1yvWuMYKSEoeutA-o_1VviuD_lZKRorBP/view?usp=sharing) to be placed in data_process/data
 
 # Construct Pareto Long-Tail subset of raw data
 ```bash
@@ -44,7 +44,7 @@ python gen_seltest.py
 ```bash
 bash ./train_stage1.sh ${train batch size}
 ```
-For a quickstart, pre-trained [bi-encoder](https://drive.google.com/file/d/1RRau7Y7PX2rv3CxVHK5FGM4aJPhsLbiv/view?usp=share_link) and [cross-encoder](https://drive.google.com/file/d/1YNi5TSBvo4eevdw7DPcLApX96LSJlu4p/view?usp=share_link) checkpoints are available.
+For a quickstart, pre-trained [bi-encoder](https://drive.google.com/file/d/1j_i28_zvBuhcRE--Lr_PkIUPrYUZKB5O/view?usp=sharing) and [cross-encoder](https://drive.google.com/file/d/1S6Aa_8SSShz5EhwjTlsurGkH7gfhlfh5/view?usp=sharing) checkpoints are available.
 
 ## train and evaluate the framework
 ```bash
@@ -66,4 +66,4 @@ and
 ```bash
 #w/o knowledge mining
 bash ./ablationknowledge.sh ${train batch size} ${eval batch size} ${epoch}
-```
\ No newline at end of file
+```

From 9faac71108e29dc14441eaff98a62030ca7014b3 Mon Sep 17 00:00:00 2001
From: Yi Dai <38886373+debby1103@users.noreply.github.com>
Date: Mon, 13 Nov 2023 13:16:30 +0800
Subject: [PATCH 2/3] replace invalid links

---
 oltqa/README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/oltqa/README.md b/oltqa/README.md
index 3a9366c..61205cd 100644
--- a/oltqa/README.md
+++ b/oltqa/README.md
@@ -13,7 +13,7 @@ The raw dataset is available [here](https://drive.google.com/file/d/1yvWuMYKSEoe
 python gen_lt.py
 ```
 
-# large PLM inference
+# preprocessing: large PLM inference
 ## BM25 candidates
 use [This Repo](https://github.com/OhadRubin/EPR) to select BM25 examples for PLM inference
 ```bash
@@ -26,10 +26,10 @@ Install [GLM-10B](https://github.com/THUDM/GLM) or [GLM-130B](https://github.com
 ```bash
 cd plm
 bash ./install_glm.sh
-bash ./run.sh ${input_file}
+bash scripts/generate_block.sh \
+ config_tasks/model_blocklm_10B_chinese.sh
 ```
-
 # Two-stage Training
 ## generate dataset for example selection

From 33c94b51b56e9dfd6a7611467c7a9c911f3fddba Mon Sep 17 00:00:00 2001
From: Yi Dai <38886373+debby1103@users.noreply.github.com>
Date: Mon, 13 Nov 2023 13:18:12 +0800
Subject: [PATCH 3/3] replace invalid links

---
 oltqa/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/oltqa/README.md b/oltqa/README.md
index 61205cd..edafc41 100644
--- a/oltqa/README.md
+++ b/oltqa/README.md
@@ -15,7 +15,7 @@ python gen_lt.py
 
 # preprocessing: large PLM inference
 ## BM25 candidates
-use [This Repo](https://github.com/OhadRubin/EPR) to select BM25 examples for PLM inference
+use [This Repo](https://github.com/OhadRubin/EPR) to select BM25 examples for PLM inference (to construct a candidate pool for further selection)
 ```bash
 python find_bm25.py output_path=$PWD/data/{compute_bm25_outfile} \
  dataset_split=train setup_type={bm25_setup_type} task_name={dataset} +ds_size={ds_size} L={finder_L}
@@ -26,7 +26,7 @@ Install [GLM-10B](https://github.com/THUDM/GLM) or [GLM-130B](https://github.com
 ```bash
 cd plm
 bash ./install_glm.sh
-bash scripts/generate_block.sh \
+bash ./scripts/generate_block.sh \
  config_tasks/model_blocklm_10B_chinese.sh
 ```