starlee ce85ae5782 done extracting commit and diff 2018-01-29 10:53:16 +08:00
.gitignore init 2018-01-05 10:29:43 +08:00
0_create_table.py able to get user email 2018-01-25 20:54:56 +08:00
0_push_token.py add tokens 2018-01-06 11:06:47 +08:00
1_1_push_repos.py init 2018-01-05 10:29:43 +08:00
1_2_get_pr_lst.py change send_headers for diff 2018-01-11 22:36:05 +08:00
2_pr_push.py alter sleep time 2018-01-06 12:40:19 +08:00
3_pr_info_download.py typo 2018-01-11 22:38:29 +08:00
4_1_push_error_page.py typos 2018-01-20 10:02:04 +08:00
4_2_comp_error_page.py ok 2018-01-26 14:02:54 +08:00
5_0_create_str_table.py done extracting commit and diff 2018-01-29 10:53:16 +08:00
5_1_extract_pr_info.py able to get user email 2018-01-25 20:54:56 +08:00
5_2_extract_cmit_diff_info.py done extracting commit and diff 2018-01-29 10:53:16 +08:00
6_0_create_table.py done extracting commit and diff 2018-01-29 10:53:16 +08:00
6_1_push_user.py ok 2018-01-26 14:02:54 +08:00
6_2_get_gh_user_email.py able to get user email 2018-01-25 20:54:56 +08:00
db_cfg.py init 2018-01-05 10:29:43 +08:00
readme.txt ok 2018-01-26 14:02:54 +08:00
repos_list.csv init 2018-01-05 10:29:43 +08:00
test_token.py typo 2018-01-06 16:52:33 +08:00
tokens.txt add tokens 2018-01-06 11:06:47 +08:00

readme.txt

0. On a new server, run 0_create_table.py to create the table schema
0. Create the log directory
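
The log directory can also be created from Python so that a fresh server needs no manual step. This is a minimal sketch, assuming Python 3 and assuming the scripts expect a directory named `log` relative to the working directory; `ensure_log_dir` is a hypothetical helper, not part of this repository.

```python
import os


def ensure_log_dir(path="log"):
    """Create the log directory if it is missing and return its absolute path."""
    # exist_ok=True makes the call safe to repeat on an already prepared server.
    os.makedirs(path, exist_ok=True)
    return os.path.abspath(path)
```

Calling `ensure_log_dir()` once before starting any of the background jobs is enough; repeated calls are harmless.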

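1. Put all the tokens into the pool
    python 0_push_token.py 
2. Put the repositories to be crawled into the message queue
    python 1_1_push_repos.py 
3. Run in the background: process the repository message queue in parallel and fetch each repository's PR list (d is the index number of the instance)
    nohup python 1_2_get_pr_lst.py [d] &
4. Run in the background: push the PRs onto the message queue
    nohup python 2_pr_push.py [commit|diff] &
5. Run in the background: fetch the PR details in parallel
    nohup python 3_pr_info_download.py [commit|diff] [d] &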
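
The flow in steps 1 to 5 (a token pool, a queue of work items, and several parallel workers) can be sketched in a single process. This is only an illustration under stated assumptions: it uses the standard library's in-memory `queue.Queue` and threads as stand-ins, whereas the real scripts run as separate processes and their actual queue backend and token storage are not described here. All function names below are hypothetical.

```python
import itertools
import queue
import threading


def make_token_pool(tokens):
    """Hand out tokens round-robin (stand-in for the real token pool)."""
    if not tokens:
        raise ValueError("at least one token is required")
    return itertools.cycle(tokens)


def worker(worker_id, repo_queue, token_pool, pool_lock, results):
    """Take repositories off the queue until it is empty, pairing each with a token."""
    while True:
        try:
            repo = repo_queue.get_nowait()
        except queue.Empty:
            return
        with pool_lock:  # the shared iterator is advanced by one thread at a time
            token = next(token_pool)
        # A real worker would make its API request here; the sketch only records the pairing.
        results.append((worker_id, repo, token))
        repo_queue.task_done()


def run_pipeline(repos, tokens, n_workers=2):
    """Fill the queue, start n_workers workers, and return their (id, repo, token) records."""
    repo_queue = queue.Queue()
    for repo in repos:
        repo_queue.put(repo)
    token_pool = make_token_pool(tokens)
    pool_lock = threading.Lock()
    results = []
    threads = [
        threading.Thread(target=worker, args=(d, repo_queue, token_pool, pool_lock, results))
        for d in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each repository is processed exactly once regardless of the number of workers, which is the property the message queue provides in the real pipeline. Rotating tokens spreads requests across them, presumably to stay within per-token rate limits.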

##### Backfill the error pages
6. Run the 4_* files

##### Extract fields
7. Run 5_0*, 5_1*


##### User email and related information
8. Create the tables
    6_0*
9.  python 0_push_token.py 
10. nohup python 6_1* &
11. nohup python 6_2* [d] &
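
Several of the commands above take an index `[d]` and are meant to be started once per parallel instance. A small launcher can build those command lines instead of typing them by hand. This is a hedged sketch, assuming Python 3 on a POSIX system; `worker_commands`, `launch`, the log file name `log/workers.log`, and numbering from 0 are all assumptions, not something the scripts prescribe.

```python
import subprocess
import sys


def worker_commands(script, n_workers, mode=None, first_index=0):
    """Build one argv list per instance: python <script> [commit|diff] <d>."""
    if mode is not None and mode not in ("commit", "diff"):
        raise ValueError("mode must be 'commit' or 'diff'")
    commands = []
    for d in range(first_index, first_index + n_workers):
        argv = [sys.executable, script]
        if mode is not None:
            argv.append(mode)
        argv.append(str(d))
        commands.append(argv)
    return commands


def launch(commands, log_path="log/workers.log"):
    """Start every command in its own session, appending all output to one log file."""
    procs = []
    with open(log_path, "ab") as log:
        for argv in commands:
            procs.append(
                subprocess.Popen(
                    argv,
                    stdout=log,
                    stderr=subprocess.STDOUT,
                    start_new_session=True,  # keeps running after the terminal closes, like nohup
                )
            )
    return procs
```

For example, `launch(worker_commands("3_pr_info_download.py", 4, mode="diff"))` would start four instances with the indices 0 to 3, provided the `log` directory already exists.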