Summary:
This test was broken by the change to the tail duplication logic in
r311139. Update the test values and add a note about how to properly run
a benchmark to verify that the values are safe to update.
Reviewers: vitalybuka
Reviewed By: vitalybuka
Subscribers: dvyukov, kubamracek
Differential Revision: https://reviews.llvm.org/D36889
llvm-svn: 311189
r307338 enabled a new optimization that reduces the number of operations in the tested functions.
No performance regression is detectable with the TsanRtlTest DISABLED_BENCH.Mop* tests.
llvm-svn: 307739
r299658 fixed a case where InstCombine was replicating instructions instead of combining. Fixing this reduced the number of pushes and pops in the __tsan_read and __tsan_write functions.
Adjust the expectations to account for this after talking to Dmitry Vyukov.
llvm-svn: 299661
We used to depend on the host gcc. But some distributions recently
got a new gcc, which broke the check. In general, we can't rely on
an arbitrary host gcc generating stable output.
Switch to clang.
This has an additional advantage of catching regressions in
clang codegen.
llvm-svn: 268382
r260695 caused an extra push/pop instruction pair in the __tsan_read1
implementation. Still, that change in InstCombine is believed to
be good, as it reduces the number of instructions executed.
Adjust the expectations to match the newly generated code.
llvm-svn: 260775
De-hardcode path to TSan-ified executable: pass it as an input to
the scripts. Fix them so that they don't write to the current
directory. Remove their invocation from Makefile.old: they are
broken there anyway, as check_analyze.sh now matches trunk Clang.
llvm-svn: 254936
The optimization is twofold:
First, the algorithm now uses SSE instructions to
process all 4 shadow slots at once. This makes
shadow processing faster.
Second, if the shadow already contains the same access, we do not
store the event into the trace. This increases the effective
trace size, that is, tsan can remember up to 10x more
previous memory accesses.
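
For illustration, here is a minimal standalone sketch of both ideas.
This is not the actual tsan runtime code: RawShadow, the "equal 64-bit
values mean the same access" encoding, and the demo values are
assumptions made for this sketch (the real shadow packs thread id,
epoch and access size/offset into each slot).

  #include <smmintrin.h>  // SSE4.1 (_mm_cmpeq_epi64); build with -msse4.1

  #include <cstdint>
  #include <cstdio>

  // Hypothetical packed access descriptor.
  using RawShadow = uint64_t;

  // Check all 4 shadow slots for an access identical to `cur` using two
  // 128-bit loads and compares instead of a scalar loop over the slots.
  static bool ContainsSameAccess(const RawShadow* shadow, RawShadow cur) {
    const __m128i want = _mm_set1_epi64x(static_cast<long long>(cur));
    const __m128i s01 =
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(shadow));
    const __m128i s23 =
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(shadow + 2));
    const __m128i eq = _mm_or_si128(_mm_cmpeq_epi64(s01, want),
                                    _mm_cmpeq_epi64(s23, want));
    // Equal lanes compare to all-ones, so any set byte means a match.
    return _mm_movemask_epi8(eq) != 0;
  }

  int main() {
    RawShadow shadow[4] = {0, 0x42, 0, 0};
    // A duplicate access skips both the shadow update and the trace
    // event, so the trace only spends space on accesses that add
    // information.
    std::printf("dup: %d\n", ContainsSameAccess(shadow, 0x42));  // prints 1
    std::printf("new: %d\n", ContainsSameAccess(shadow, 0x43));  // prints 0
  }

The vector comparison is branch-free across all 4 slots, so the common
"same access already recorded" case is decided in a couple of
instructions before falling through to the slower store-and-trace path.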
Performance impact:
Before:
[ OK ] DISABLED_BENCH.Mop8Read (2461 ms)
[ OK ] DISABLED_BENCH.Mop8Write (1836 ms)
After:
[ OK ] DISABLED_BENCH.Mop8Read (1204 ms)
[ OK ] DISABLED_BENCH.Mop8Write (976 ms)
But this measures only fast-path.
On large real applications the speedup is ~20%.
Trace size impact:
On app1:
Memory accesses : 1163265870
Including same : 791312905 (68%)
On app2:
Memory accesses : 166875345
Including same : 150449689 (90%)
90% of events filtered means that only ~10% of accesses are stored, so the trace is effectively 10x larger.
llvm-svn: 209897