llvm-project/compiler-rt/test/fuzzer/merge_two_step.test

RUN: %cpp_compiler %S/FullCoverageSetTest.cpp -o %t-FullCoverageSetTest

RUN: rm -rf %t/T0 %t/T1 %t/T2
RUN: mkdir -p %t/T0 %t/T1 %t/T2
RUN: echo F..... > %t/T1/1
RUN: echo .U.... > %t/T1/2
RUN: echo ..Z... > %t/T1/3

# T1 has 3 elements, T0 is empty.
RUN: rm -f %t/MCF
RUN: %run %t-FullCoverageSetTest -merge=1 -merge_control_file=%t/MCF %t/T0 %t/T1 2>&1 | FileCheck %s --check-prefix=CHECK1
CHECK1: MERGE-OUTER: 3 files, 0 in the initial corpus
CHECK1: MERGE-OUTER: 3 new files with {{.*}} new features added; 11 new coverage edges

RUN: echo ...Z.. > %t/T2/1
RUN: echo ....E. > %t/T2/2
RUN: echo .....R > %t/T2/3
RUN: echo F..... > %t/T2/a

RUN: rm -rf %t/T0
RUN: mkdir -p %t/T0

# T1 has 3 elements, T2 has 4 elements, T0 is empty.
RUN: %run %t-FullCoverageSetTest -merge=1 -merge_control_file=%t/MCF %t/T0 %t/T1 %t/T2 2>&1 | FileCheck %s --check-prefix=CHECK2
CHECK2: MERGE-OUTER: non-empty control file provided
CHECK2: MERGE-OUTER: control file ok, 3 files total, first not processed file 3
CHECK2: MERGE-OUTER: starting merge from scratch, but reusing coverage information from the given control file
CHECK2: MERGE-OUTER: 7 files, 0 in the initial corpus, 3 processed earlier
CHECK2: MERGE-INNER: using the control file 
CHECK2: MERGE-INNER: 4 total files; 0 processed earlier; will process 4 files now
CHECK2: MERGE-OUTER: 6 new files with {{.*}} new features added; 14 new coverage edges
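The file counts expected by CHECK2 can be sanity-checked without running the fuzzer. The sketch below recreates the seven inputs from the test in a scratch directory and counts the files and their distinct contents. The `mktemp` location and the variable names are assumptions for illustration; the test itself works under `%t`. Reading "6 new files" as "seven inputs, one of which (`T2/a`) repeats the content of `T1/1`" is an inference from the inputs, not something the test states.

```shell
#!/bin/sh
# Recreate the test inputs in a scratch directory (hypothetical location;
# the lit test uses %t instead).
work=$(mktemp -d)
mkdir -p "$work/T1" "$work/T2"
echo F..... > "$work/T1/1"
echo .U.... > "$work/T1/2"
echo ..Z... > "$work/T1/3"
echo ...Z.. > "$work/T2/1"
echo ....E. > "$work/T2/2"
echo .....R > "$work/T2/3"
echo F..... > "$work/T2/a"

# Total number of input files across both corpora.
total=$(find "$work/T1" "$work/T2" -type f | wc -l | tr -d ' ')

# Number of distinct file contents (every file holds exactly one line).
distinct=$(cat "$work"/T1/* "$work"/T2/* | sort -u | wc -l | tr -d ' ')

echo "total=$total distinct=$distinct"
rm -rf "$work"
```

This prints `total=7 distinct=6`, matching the "7 files" and "6 new files" figures in the CHECK2 lines.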
[libFuzzer] Make -merge=1 to reuse coverage information from the control file.

Summary:
This change allows to perform corpus merging in two steps. This is useful when
the user wants to address the following two points simultaneously:

1) Get trustworthy incremental stats for the coverage and corpus size changes
   when adding new corpus units.
2) Make sure the shorter units will be preferred when two or more units give
   the same unique signal (equivalent to the `REDUCE` logic).

This solution was brainstormed together with @kcc, hopefully it looks good to
the other people too. The proposed use case scenario:

1) We have a `fuzz_target` binary and `existing_corpus` directory.
2) We do fuzzing and write new units into the `new_corpus` directory.
3) We want to merge the new corpus into the existing corpus and satisfy the
   points mentioned above.
4) We create an empty directory `merged_corpus` and run the first merge step:

       ./fuzz_target -merge=1 -merge_control_file=MCF ./merged_corpus ./existing_corpus

   this provides the initial stats for `existing_corpus`, e.g. from the output:

       MERGE-OUTER: 3 new files with 11 new features added; 11 new coverage edges

5) We recreate `merged_corpus` directory and run the second merge step:

       ./fuzz_target -merge=1 -merge_control_file=MCF ./merged_corpus ./existing_corpus ./new_corpus

   this provides the final stats for the merged corpus, e.g. from the output:

       MERGE-OUTER: 6 new files with 14 new features added; 14 new coverage edges

Alternative solutions to this approach are:

A) Store precise coverage information for every unit (not only unique signal).
B) Execute the same two steps without reusing the control file.

Either of these would be suboptimal as it would impose an extra disk or CPU
load respectively, which is bad given the quadratic complexity in the worst
case.

Tested on Linux, Mac, Windows.

Reviewers: morehouse, metzman, hctim, kcc
Reviewed By: morehouse
Subscribers: JDevlieghere, delcypher, mgrang, #sanitizers, llvm-commits, kcc
Tags: #llvm, #sanitizers
Differential Revision: https://reviews.llvm.org/D66107
llvm-svn: 371620
2019-09-11 22:11:08 +08:00
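The use case scenario in the commit message above can be wrapped in a small script. This is a minimal sketch, not part of the commit: the `run` and `merge_step` helpers and the `DRY_RUN` switch are hypothetical, and `./fuzz_target`, `./merged_corpus`, `./existing_corpus`, and `./new_corpus` are the placeholder names from the commit message. Only the `-merge=1` and `-merge_control_file=` flags come from the source. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Placeholder names taken from the commit message; override as needed.
FUZZ_TARGET=${FUZZ_TARGET:-./fuzz_target}
MCF=${MCF:-./MCF}
# Hypothetical switch: 1 prints the commands only, anything else also runs them.
DRY_RUN=${DRY_RUN:-1}

# Print a command line and, unless in dry-run mode, execute it.
run() {
  echo "$*"
  if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

# Recreate the output directory, then merge the given input corpora into it,
# reusing whatever coverage information the control file already holds.
merge_step() {
  dest=$1
  shift
  if [ "$DRY_RUN" != "1" ]; then
    rm -rf "$dest" && mkdir -p "$dest"   # destructive: dest is recreated empty
  fi
  run "$FUZZ_TARGET" -merge=1 -merge_control_file="$MCF" "$dest" "$@"
}

# Start from a fresh control file, as the test does before its first step.
if [ "$DRY_RUN" != "1" ]; then rm -f "$MCF"; fi

# Step 1: baseline stats for the existing corpus.
merge_step ./merged_corpus ./existing_corpus

# Step 2: final stats for the existing corpus plus the new units.
merge_step ./merged_corpus ./existing_corpus ./new_corpus
```

Comparing the `MERGE-OUTER: ... new files with ... new features added` line printed by each step gives the incremental stats the commit message describes. Set `DRY_RUN=0` to execute the commands against a real fuzz target.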

[libFuzzer] Remove hardcoded number of new features in merge_two_step.test.

Summary:
The number of features can be different on different platforms. This should
fix the broken builders, e.g.
http://lab.llvm.org:8011/builders/clang-cmake-aarch64-full/builds/7946

Reviewers: Dor1s
Reviewed By: Dor1s
Subscribers: kristof.beyls, delcypher, #sanitizers, llvm-commits
Tags: #llvm, #sanitizers
Differential Revision: https://reviews.llvm.org/D67458
llvm-svn: 371647
2019-09-12 03:43:03 +08:00