Use llvm::make_unique to avoid ambiguity with MSVC.
This patch adds a generic MacroFusion pass, that is used on X86 and
AArch64, which both define target-specific shouldScheduleAdjacent
functions. This generic pass should make it easier for other targets to
implement macro fusion and I intend to add macro fusion for ARM shortly.
Differential Revision: https://reviews.llvm.org/D34144
llvm-svn: 305690
Summary:
This patch adds a generic MacroFusion pass, that is used on X86 and
AArch64, which both define target-specific shouldScheduleAdjacent
functions. This generic pass should make it easier for other targets to
implement macro fusion and I intend to add macro fusion for ARM shortly.
Reviewers: craig.topper, evandro, t.p.northover, atrick, MatzeB
Reviewed By: MatzeB
Subscribers: atrick, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D34144
llvm-svn: 305677
Summary:
This patch makes instruction fusion more aggressive by
* adding artificial edges between the successors of FirstSU and
SecondSU, similar to BaseMemOpClusterMutation::clusterNeighboringMemOps.
* updating PostGenericScheduler::tryCandidate to keep clusters together,
similar to GenericScheduler::tryCandidate.
This change increases the number of AES instruction pairs generated on
Cortex-A57 and Cortex-A72. This doesn't change code at all in
most benchmarks or general code, but we've seen improvement on kernels
using AESE/AESMC and AESD/AESIMC.
Reviewers: evandro, kristof.beyls, t.p.northover, silviu.baranga, atrick, rengolin, MatzeB
Reviewed By: evandro
Subscribers: aemerson, rengolin, MatzeB, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33230
llvm-svn: 303618
This patch assumes that the dependents to be scanned for the ExitSU are its
predecessors; otherwise, the successors of the instr are scanned.
Furthermore, sometimes the ExitSU was being fused twice, since it may be
fused once when scanning the successors from the beginning of the BB and
then again when scanning the predecessors of ExitSU. Thus, when scanning
the successors of an instr, skip the ExitSU.
llvm-svn: 299974
In order to make it easier to parse information about the performance of
MacroFusion, this patch adds the function and the instruction names to the
debug output of this pass.
llvm-svn: 297504
This feature enables the fusion of such operations on Cortex A57, as
recommended in its Software Optimisation Guide, sections 4.14 and 4.15.
Differential revision: https://reviews.llvm.org/D28698
llvm-svn: 293739
This feature enables the fusion of such operations on Cortex A57, as
recommended in its Software Optimisation Guide, section 4.13, and on Exynos
M1.
Differential revision: https://reviews.llvm.org/D28491
llvm-svn: 293738
This patch moves the class for scheduling adjacent instructions,
MacroFusion, to the target.
In AArch64, it also expands the fusion to all instructions pairs in a
scheduling block, beyond just among the predecessors of the branch at the
end.
Differential revision: https://reviews.llvm.org/D28489
llvm-svn: 293737