Commit Graph

9 Commits

Author SHA1 Message Date
Dehao Chen f03f51555a Using branch probability to guide critical edge splitting.
Summary:
The original heuristic to break critical edge during machine sink is relatively conservertive: when there is only one instruction sinkable to the critical edge, it is likely that the machine sink pass will not break the critical edge. This leads to many speculative instructions executed at runtime. However, with profile info, we could model the splitting benefits: if the critical edge has 50% taken rate, it would always be beneficial to split the critical edge to avoid the speculated runtime instructions. This patch uses profile to guide critical edge splitting in machine sink pass.

The performance impact on speccpu2006 on Intel sandybridge machines:

spec/2006/fp/C++/444.namd                  25.3  +0.26%
spec/2006/fp/C++/447.dealII               45.96  -0.10%
spec/2006/fp/C++/450.soplex               41.97  +1.49%
spec/2006/fp/C++/453.povray               36.83  -0.96%
spec/2006/fp/C/433.milc                   23.81  +0.32%
spec/2006/fp/C/470.lbm                    41.17  +0.34%
spec/2006/fp/C/482.sphinx3                48.13  +0.69%
spec/2006/int/C++/471.omnetpp             22.45  +3.25%
spec/2006/int/C++/473.astar               21.35  -2.06%
spec/2006/int/C++/483.xalancbmk           36.02  -2.39%
spec/2006/int/C/400.perlbench              33.7  -0.17%
spec/2006/int/C/401.bzip2                  22.9  +0.52%
spec/2006/int/C/403.gcc                   32.42  -0.54%
spec/2006/int/C/429.mcf                   39.59  +0.19%
spec/2006/int/C/445.gobmk                 26.98  -0.00%
spec/2006/int/C/456.hmmer                 24.52  -0.18%
spec/2006/int/C/458.sjeng                 28.26  +0.02%
spec/2006/int/C/462.libquantum            55.44  +3.74%
spec/2006/int/C/464.h264ref               46.67  -0.39%

geometric mean                                   +0.20%

Manually checked 473 and 471 to verify the diff is in the noise range.

Reviewers: rengolin, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24818

llvm-svn: 284757
2016-10-20 18:06:52 +00:00
Dehao Chen 95fc43143d Revert r284545 again as the regression in ppc still exists. There is bug in MBPI exposed by th patch.
Also update the section.ll to fix non-x86 failure.

llvm-svn: 284563
2016-10-19 01:18:25 +00:00
Dehao Chen f8ac3d26d5 Using branch probability to guide critical edge splitting.
Summary:
The original heuristic to break critical edge during machine sink is relatively conservertive: when there is only one instruction sinkable to the critical edge, it is likely that the machine sink pass will not break the critical edge. This leads to many speculative instructions executed at runtime. However, with profile info, we could model the splitting benefits: if the critical edge has 50% taken rate, it would always be beneficial to split the critical edge to avoid the speculated runtime instructions. This patch uses profile to guide critical edge splitting in machine sink pass.

The performance impact on speccpu2006 on Intel sandybridge machines:

spec/2006/fp/C++/444.namd                  25.3  +0.26%
spec/2006/fp/C++/447.dealII               45.96  -0.10%
spec/2006/fp/C++/450.soplex               41.97  +1.49%
spec/2006/fp/C++/453.povray               36.83  -0.96%
spec/2006/fp/C/433.milc                   23.81  +0.32%
spec/2006/fp/C/470.lbm                    41.17  +0.34%
spec/2006/fp/C/482.sphinx3                48.13  +0.69%
spec/2006/int/C++/471.omnetpp             22.45  +3.25%
spec/2006/int/C++/473.astar               21.35  -2.06%
spec/2006/int/C++/483.xalancbmk           36.02  -2.39%
spec/2006/int/C/400.perlbench              33.7  -0.17%
spec/2006/int/C/401.bzip2                  22.9  +0.52%
spec/2006/int/C/403.gcc                   32.42  -0.54%
spec/2006/int/C/429.mcf                   39.59  +0.19%
spec/2006/int/C/445.gobmk                 26.98  -0.00%
spec/2006/int/C/456.hmmer                 24.52  -0.18%
spec/2006/int/C/458.sjeng                 28.26  +0.02%
spec/2006/int/C/462.libquantum            55.44  +3.74%
spec/2006/int/C/464.h264ref               46.67  -0.39%

geometric mean                                   +0.20%

Manually checked 473 and 471 to verify the diff is in the noise range.

Reviewers: rengolin, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24818

llvm-svn: 284545
2016-10-18 23:24:02 +00:00
Dehao Chen 62d0e64e9e revert r284541.
llvm-svn: 284544
2016-10-18 23:11:20 +00:00
Dehao Chen ea62ae9844 Using branch probability to guide critical edge splitting.
Summary:
The original heuristic to break critical edge during machine sink is relatively conservertive: when there is only one instruction sinkable to the critical edge, it is likely that the machine sink pass will not break the critical edge. This leads to many speculative instructions executed at runtime. However, with profile info, we could model the splitting benefits: if the critical edge has 50% taken rate, it would always be beneficial to split the critical edge to avoid the speculated runtime instructions. This patch uses profile to guide critical edge splitting in machine sink pass.

The performance impact on speccpu2006 on Intel sandybridge machines:

spec/2006/fp/C++/444.namd                  25.3  +0.26%
spec/2006/fp/C++/447.dealII               45.96  -0.10%
spec/2006/fp/C++/450.soplex               41.97  +1.49%
spec/2006/fp/C++/453.povray               36.83  -0.96%
spec/2006/fp/C/433.milc                   23.81  +0.32%
spec/2006/fp/C/470.lbm                    41.17  +0.34%
spec/2006/fp/C/482.sphinx3                48.13  +0.69%
spec/2006/int/C++/471.omnetpp             22.45  +3.25%
spec/2006/int/C++/473.astar               21.35  -2.06%
spec/2006/int/C++/483.xalancbmk           36.02  -2.39%
spec/2006/int/C/400.perlbench              33.7  -0.17%
spec/2006/int/C/401.bzip2                  22.9  +0.52%
spec/2006/int/C/403.gcc                   32.42  -0.54%
spec/2006/int/C/429.mcf                   39.59  +0.19%
spec/2006/int/C/445.gobmk                 26.98  -0.00%
spec/2006/int/C/456.hmmer                 24.52  -0.18%
spec/2006/int/C/458.sjeng                 28.26  +0.02%
spec/2006/int/C/462.libquantum            55.44  +3.74%
spec/2006/int/C/464.h264ref               46.67  -0.39%

geometric mean                                   +0.20%

Manually checked 473 and 471 to verify the diff is in the noise range.

Reviewers: rengolin, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24818

llvm-svn: 284541
2016-10-18 21:36:11 +00:00
Ahmed Bougacha e81610fabb [ARM] Don't generate clrex for pre-v7 targets.
Since r248294, we emit clrex, but it doesn't exist on v6.

llvm-svn: 248640
2015-09-26 00:14:02 +00:00
Ahmed Bougacha 81616a72ea [ARM] Emit clrex in the expanded cmpxchg fail block.
ARM counterpart to r248291:

In the comparison failure block of a cmpxchg expansion, the initial
ldrex/ldxr will not be followed by a matching strex/stxr.
On ARM/AArch64, this unnecessarily ties up the execution monitor,
which might have a negative performance impact on some uarchs.

Instead, release the monitor in the failure block.
The clrex instruction was designed for this: use it.

Also see ARMARM v8-A B2.10.2:
"Exclusive access instructions and Shareable memory locations".

Differential Revision: http://reviews.llvm.org/D13033

llvm-svn: 248294
2015-09-22 17:22:58 +00:00
Jonathan Roelofs 44937d98a3 Lower thumbv4t & thumbv5 lo->lo copies through a push-pop sequence
On pre-v6 hardware, 'MOV lo, lo' gives undefined results, so such copies need to
be avoided. This patch trades simplicity for implementation time at the expense
of performance... As they say: correctness first, then performance.

See http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075998.html for a few
ideas on how to make this better.

llvm-svn: 216138
2014-08-20 23:38:50 +00:00
Logan Chien 63bee2a2bb Replace the result usages while legalizing cmpxchg.
We should update the usages to all of the results;
otherwise, we might get assertion failure or SEGV during
the type legalization of ATOMIC_CMP_SWAP_WITH_SUCCESS
with two or more illegal types.

For example, in the following sequence, both i8 and i1
might be illegal in some target, e.g. armv5, mipsel, mips64el,

    %0 = cmpxchg i8* %ptr, i8 %desire, i8 %new monotonic monotonic
    %1 = extractvalue { i8, i1 } %0, 1

Since both i8 and i1 should be legalized, the corresponding
ATOMIC_CMP_SWAP_WITH_SUCCESS dag will be checked/replaced/updated
twice.

If we don't update the usage to *ALL* of the results in the
first round, the DAG for extractvalue might be processed earlier.
The GetPromotedInteger() will result in assertion failure,
because its operand (i.e. the success bit of cmpxchg) is not
promoted beforehand.

llvm-svn: 213569
2014-07-21 17:33:44 +00:00