zhangxunhui 81abe9b497 remove other files 2022-10-18 18:45:10 +08:00
bin add similarity to clone_relations 2022-08-13 16:24:04 +08:00
models add id attribute to RepoInfo class 2022-08-17 10:26:21 +08:00
sql_templates Merge pull request 'Tasks completed as of August 21' (#43) from zy into master 2022-09-02 23:13:41 +08:00
.flake8 update the method function relation extraction & final clone relation extraction 2022-08-14 16:05:37 +08:00
.gitignore finish fix the method function relations and get all popular java projects 2022-08-31 22:51:34 +08:00
.isort.cfg fix bug 2022-09-17 00:13:25 +08:00
.pre-commit-config.yaml Added a pre-commit hook with flake8, black, and isort checks 2022-08-01 08:44:04 +08:00
BlobCommitRelationExtractor.py update all the tables and break points 2022-08-17 18:40:20 +08:00
BlobExtractor.py update all the tables and break points 2022-08-17 18:40:20 +08:00
BlobMethodExtractor.py update all the tables and break points 2022-08-17 18:40:20 +08:00
CloneOperator.py fix the riskevaluator 2022-09-02 23:11:16 +08:00
CloneRelationFunctionExtractor.py merge 2022-09-05 15:53:49 +08:00
CommitExtractor.py fix the bug of repeat commit and author name 2022-09-19 15:08:31 +08:00
CommitRelationExtractor.py merge 2022-09-05 15:53:49 +08:00
ConfigOperator.py update the extraction of blob_methods and clone_relations 2022-08-13 15:45:26 +08:00
FileOperator.py insert id 2022-08-19 13:08:19 +08:00
FunctionIdUpdater.py update all the tables and break points 2022-08-17 18:40:20 +08:00
GitOperator.py fix some problems 2022-08-27 22:10:44 +08:00
GlobalConstants.py Merge pull request 'Added riskevaluation as a step in the steps' (#49) from zy into master 2022-09-03 21:50:41 +08:00
GraphExtractor.py fix bug 2022-08-29 15:39:41 +08:00
MethodFunctionRelationExtractor.py finish fix the method function relations and get all popular java projects 2022-08-31 22:51:34 +08:00
MySQLOperator.py Removed the asynchronous refresh operation 2022-09-03 23:54:26 -04:00
PointerOperator.py add repositories table for recording all the repositories 2022-08-17 09:56:18 +08:00
RCDMain.py change delete project operation from ownername reponame to id 2022-08-17 11:04:36 +08:00
README.md fix the repository 2022-08-19 01:00:38 +08:00
RepoExecutor.py fix the riskevaluation 2022-09-05 16:56:41 +08:00
RiskEvaluator.py add the handled in risk evaluate 2022-09-13 08:40:03 +08:00
TimeOperator.py change timestamp to int 2022-08-15 09:34:58 +08:00
config.template.yml merge 2022-09-05 15:53:49 +08:00
deleteProject.py fix the risk and delete 2022-09-05 17:15:57 +08:00
delete_handled_blobs.py delete handled blobs 2022-09-05 11:53:06 +08:00
delete_repos.example insert id 2022-08-19 13:08:19 +08:00
factorExtractor.py modified the factorextractor 2022-09-19 16:38:48 +08:00
repos.example add the risk evaluation as a step 2022-09-03 11:26:23 +08:00
requirements.txt add asyncio executor ofmysql connection refresher 2022-09-03 10:15:44 +08:00
utils.py Consolidated file operations 2022-08-05 21:38:39 +08:00

README.md

RCD: Risky Clone Detection

RCD is a project for identifying factors associated with bad (risky) code clones.

Set up the environment:

  • Git
  • Python:
    • create a Python virtual environment with Anaconda using the command conda create -n bad_clone python=3.7.11
    • activate the environment using the command conda activate bad_clone
    • install the required Python packages using the command pip install -r requirements.txt
  • MySQL:
    • this project uses MySQL 8.0.30
    • copy the configuration template to a new file named config.yml using the command cp ./config.template.yml ./config.yml
    • fill in each section of config.yml, following the hints in the template
  • Java:
    • JDK 1.8 or later is required to run the clone detector NIL
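
Before the first run, it can help to confirm that the tools listed above are actually reachable. The following is a minimal sketch, not part of the repository: the function name check_prerequisites is hypothetical, and it assumes that git, java, and conda are expected on PATH and that config.yml sits in the current directory. It does not verify tool versions or MySQL connectivity.

```python
import os
import shutil


def check_prerequisites(config_path="config.yml", tools=("git", "java", "conda")):
    """Hypothetical helper (not part of RCD): list missing prerequisites.

    Returns a list of human-readable problems; an empty list means
    every check passed. Versions are not inspected.
    """
    problems = []
    # shutil.which returns None when an executable is not found on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append("'{}' was not found on PATH".format(tool))
    # The README asks for config.template.yml to be copied to config.yml.
    if not os.path.isfile(config_path):
        problems.append(
            "{} is missing; copy config.template.yml first".format(config_path)
        )
    return problems
```

Calling check_prerequisites() from the repository root returns an empty list when all three tools and config.yml are present; otherwise each entry names one thing to fix.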

Run the project:

  1. Start collecting data for repositories by running the following commands:
git clone https://gitlink.org.cn/MillerEvan/bad_clone_prediction.git
cd bad_clone_prediction
cp repos.example repos # Add your own repositories to the repos file
conda activate bad_clone
python RCDMain.py
  2. Delete the data for specific repositories:
cp delete_repos.example delete_repos # Add the repositories to delete to the delete_repos file
python deleteProject.py