D124631 added special processing for STATEPOINT instructions.
It appears that assertion added there is too strong. We can get two
tied operands with the same register tied to different defs. If we
hit such case, do not process it in statepoint-specific code and
delegate it to common case.
STATEPOINT is a special pseudo instruction which represent Moving GC semantic to LLVM.
Every tied def/use VReg pair in STATEPOINT represent same physical register which can
'magically' change during call wrapped by statepoint.
(By construction, tied use operand is not live across STATEPOINT).
This means that when converting into two-address form, there is not need to insert COPY
instruction before stateppoint, what TwoAddressInstruction pass does for 'regular'
instructions.
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D124631
Currently we create register mappings for registers used only once in current
MBB. For registers with multiple uses, when all the uses are in the current MBB,
we can also create mappings for them similarly according to the last use.
For example
%reg101 = ...
= ... reg101
%reg103 = ADD %reg101, %reg102
We can create mapping between %reg101 and %reg103.
Differential Revision: https://reviews.llvm.org/D113193
Delegate updating of LiveIntervals to each target's
convertToThreeAddress implementation, instead of repairing LiveIntervals
after the fact in TwoAddressInstruction::convertInstTo3Addr.
Differential Revision: https://reviews.llvm.org/D113493
In TwoAddressInstructionPass::processTiedPairs when updating live
intervals after moving the last use of RegB back to the newly inserted
copy, update any affected subranges as well as the main range.
Differential Revision: https://reviews.llvm.org/D110411
This fixes an assertion failure with -early-live-intervals when trying
to update the live intervals for a debug instruction, which don't even
have slot indexes.
Differential Revision: https://reviews.llvm.org/D113116
In TwoAddressInstructionPass::processTiedPairs with
-early-live-intervals, update any preexisting physreg live intervals,
as well as virtreg live intervals. By default (without
-precompute-phys-liveness) physreg live intervals only exist for
registers that are live-in to some basic block.
Differential Revision: https://reviews.llvm.org/D113191
In TwoAddressInstructionPass::processTiedPairs with
-early-live-intervals, update any preexisting physreg live intervals,
as well as virtreg live intervals. By default (without
-precompute-phys-liveness) physreg live intervals only exist for
registers that are live-in to some basic block.
Differential Revision: https://reviews.llvm.org/D113191
In function convertInstTo3Addr, after converting a two address instruction into
three address instruction, only the last new instruction is inserted into
DistanceMap. This is wrong, DistanceMap should track all instructions from the
beginning of current MBB to the working instruction. When a two address
instruction is converted to three address instruction, multiple instructions may
be generated (usually an extra COPY is generated), all of them should be
inserted into DistanceMap.
Similarly when unfolding memory operand in function tryInstructionTransform
DistanceMap is not maintained correctly.
Differential Revision: https://reviews.llvm.org/D111857
With the -early-live-intervals command line flag,
TwoAddressInstructionPass::runOnMachineFunction would call
MachineFunction::verify before returning to check the live intervals.
But there was not much benefit to doing this since -verify-machineinstrs
and LLVM_ENABLE_EXPENSIVE_CHECKS provide a more general way of
scheduling machine verification after every pass.
Also it caused problems on targets like Lanai which are marked as "not
machine verifier clean", since verification would fail for known
target-specific problems which are nothing to do with LiveIntervals.
Differential Revision: https://reviews.llvm.org/D111618
This patch contains following enhancements to SrcRegMap and DstRegMap:
1 In findOnlyInterestingUse not only check if the Reg is two address usage,
but also check after commutation can it be two address usage.
2 If a physical register is clobbered, remove SrcRegMap entries that are
mapped to it.
3 In processTiedPairs, when create a new COPY instruction, add a SrcRegMap
entry only when the COPY instruction is coalescable. (The COPY src is
killed)
With these enhancements isProfitableToCommute can do better commute decision,
and finally more register copies are removed.
Differential Revision: https://reviews.llvm.org/D108731
Fix the calculation of ReplacedAllUntiedUses when any of the tied defs
are early-clobber. The effect of this is to fix the placement of kill
flags on an instruction like this (from @f2 in
test/CodeGen/SystemZ/asm-18.ll):
INLINEASM &"stepb $1, $2" [attdialect], $0:[regdef-ec:GRH32Bit], def early-clobber %3:grh32bit, $1:[reguse tiedto:$0], killed %4:grh32bit(tied-def 3), $2:[reguse:GRH32Bit], %4:grh32bit
After TwoAddressInstruction without this patch:
%3:grh32bit = COPY killed %4:grh32bit
INLINEASM &"stepb $1, $2" [attdialect], $0:[regdef-ec:GRH32Bit], def early-clobber %3:grh32bit, $1:[reguse tiedto:$0], %3:grh32bit(tied-def 3), $2:[reguse:GRH32Bit], %4:grh32bit
Note that the COPY kills %4, even though there is a later use of %4 in
the INLINEASM. This fails machine verification if you force it to run
after TwoAddressInstruction (currently it is disabled for other
reasons).
After TwoAddressInstruction with this patch:
%3:grh32bit = COPY %4:grh32bit
INLINEASM &"stepb $1, $2" [attdialect], $0:[regdef-ec:GRH32Bit], def early-clobber %3:grh32bit, $1:[reguse tiedto:$0], %3:grh32bit(tied-def 3), $2:[reguse:GRH32Bit], %4:grh32bit
Differential Revision: https://reviews.llvm.org/D110848
In collectTiedOperands, when handling an undef use that is tied to a
def, constrain the dst reg with the actual register class of the src
reg, instead of with the register class from the instructions's
MCInstrDesc. This makes a difference in some AMDGPU test cases like
this, before:
%16:sgpr_96 = INSERT_SUBREG undef %15:sgpr_96_with_sub0_sub1(tied-def 0), killed %11:sreg_64_xexec, %subreg.sub0_sub1
After, without this patch:
undef %16.sub0_sub1:sgpr_96 = COPY killed %11:sreg_64_xexec
This fails machine verification if you force it to run after
TwoAddressInstruction (currently it is disabled) with:
*** Bad machine code: Invalid register class for subregister index ***
- function: s_load_constant_v3i32_align4
- basic block: %bb.0 (0xa011a88)
- instruction: undef %16.sub0_sub1:sgpr_96 = COPY killed %11:sreg_64_xexec
- operand 0: undef %16.sub0_sub1:sgpr_96
Register class SGPR_96 does not fully support subreg index 4
After, with this patch:
undef %16.sub0_sub1:sgpr_96_with_sub0_sub1 = COPY killed %11:sreg_64_xexec
See also svn r159120 which introduced the code to handle tied undef
uses.
Differential Revision: https://reviews.llvm.org/D110944
Call RemoveMachineInstrFromMaps before erasing instrs.
repairIntervalsInRange will do this for you after erasing the
instruction, but it's not safe to rely on it because assertions in
SlotIndexes::removeMachineInstrFromMaps refer to fields in the erased
instruction.
This fixes asan buildbot failures caused by D110335.
In TwoAddressInstructionPass::processTiedPairs, update subranges of the
live interval for RegB as well as the main range.
This is a small step towards switching TwoAddressInstructionPass over
from LiveVariables to LiveIntervals. Currently this path is only tested
if you explicitly enable -early-live-intervals.
Differential Revision: https://reviews.llvm.org/D110526
After TwoAddressInstructionPass calls
TargetInstrInfo::convertToThreeAddress, improve the LiveIntervals repair
to cope with convertToThreeAddress creating more than one new
instruction.
This mostly seems to benefit X86. For example in
test/CodeGen/X86/zext-trunc.ll it converts:
%4:gr32 = ADD32rr %3:gr32(tied-def 0), %2:gr32, implicit-def dead $eflags
to:
undef %6.sub_32bit:gr64 = COPY %3:gr32
undef %7.sub_32bit:gr64_nosp = COPY %2:gr32
%4:gr32 = LEA64_32r killed %6:gr64, 1, killed %7:gr64_nosp, 0, $noreg
Differential Revision: https://reviews.llvm.org/D110335
Call RemoveMachineInstrFromMaps before erasing instrs.
repairIntervalsInRange will do this for you after erasing the
instruction, but it's not safe to rely on it because assertions in
SlotIndexes::removeMachineInstrFromMaps refer to fields in the erased
instruction.
This fixes asan buildbot failures caused by D110328.
This simplifies the API and addresses a FIXME in
TwoAddressInstructionPass::convertInstTo3Addr.
Differential Revision: https://reviews.llvm.org/D110229
This patch transforms the sequence
lea (reg1, reg2), reg3
sub reg3, reg4
to two sub instructions
sub reg1, reg4
sub reg2, reg4
Similar optimization can also be applied to LEA/ADD sequence.
The modifications to TwoAddressInstructionPass is to ensure the operands of ADD
instruction has expected order (the dest register of LEA should be src register
of ADD).
Differential Revision: https://reviews.llvm.org/D104684
This patch transforms the sequence
lea (reg1, reg2), reg3
sub reg3, reg4
to two sub instructions
sub reg1, reg4
sub reg2, reg4
Similar optimization can also be applied to LEA/ADD sequence.
The modifications to TwoAddressInstructionPass is to ensure the operands of ADD
instruction has expected order (the dest register of LEA should be src register of ADD).
Differential Revision: https://reviews.llvm.org/D101970
This patch transforms the sequence
lea (reg1, reg2), reg3
sub reg3, reg4
to two sub instructions
sub reg1, reg4
sub reg2, reg4
Similar optimization can also be applied to LEA/ADD sequence.
The modifications to TwoAddressInstructionPass is to ensure the operands of ADD
instruction has expected order (the dest register of LEA should be src register
of ADD).
Differential Revision: https://reviews.llvm.org/D101970
The IR/MIR pseudo probe intrinsics don't get materialized into real machine instructions and therefore they don't incur runtime cost directly. However, they come with indirect cost by blocking certain optimizations. Some of the blocking are intentional (such as blocking code merge) for better counts quality while the others are accidental. This change unblocks perf-critical optimizations that do not affect counts quality. They include:
1. IR InstCombine, sinking load operation to shorten lifetimes.
2. MIR LiveRangeShrink, similar to #1
3. MIR TwoAddressInstructionPass, i.e, opeq transform
4. MIR function argument copy elision
5. IR stack protection. (though not perf-critical but nice to have).
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D95982
This patch touches two optimizations, TwoAddressInstruction and X86's
FixupLEAs pass, both of which optimize by re-creating instructions. For
LEAs, various bits of arithmetic are better represented as LEAs on X86,
while TwoAddressInstruction sometimes converts instrs into three address
instructions if it's profitable.
For debug instruction referencing, both of these require substitutions to
be created -- the old instruction number must be pointed to the new
instruction number, as illustrated in the added test. If this isn't done,
any variable locations based on the optimized instruction are
conservatively dropped.
Differential Revision: https://reviews.llvm.org/D85756
In `rescheduleKillAboveMI`, current implementation uses `SmallSet` to track reg's defs and uses. When comparing, use `SmallSet.count` to find out if it's clobbered or used. It's not correct if involving subregisters. This patch uses `regOverlapsSet` already used by `rescheduleMIBelowKill` to fix the issue.
Fixed https://bugs.llvm.org/show_bug.cgi?id=47707.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D88716
This function has a bug which will incorrectly reschedule instructions
after an INLINEASM_BR (which can branch). (The bug may also allow
scheduling past a throwing-CALL, I'm not certain.)
I could fix that bug, but, as the removed FIXME notes, it's better to
attempt rescheduling before converting to 3-addr form, as that may
remove the need to convert in the first place. In fact, the code to do
such reordering was added to this pass only a few months later, in
2011, via the addition of the function rescheduleMIBelowKill. That
code does not contain the same bug.
The removal of the sink3AddrInstruction function is not a no-op: in
some cases it would move an instruction post-conversion, when
rescheduleMIBelowKill would not move the instruction pre-converison.
However, this does not appear to be important: the machine instruction
scheduler can reorder the after-conversion instructions, in any case.
This patch fixes a kernel panic 4.4 LTS x86_64 Linux kernels, when
built with clang after 4b0aa5724f.
Link: https://github.com/ClangBuiltLinux/linux/issues/1085
Differential Revision: https://reviews.llvm.org/D83708
Summary:
Currently, MachineVerifier will attempt to verify that tied operands
satisfy register constraints as soon as the function is no longer in
SSA form. However, PHIElimination will take the function out of SSA
form while TwoAddressInstructionPass will actually rewrite tied operands
to match the constraints. PHIElimination runs first in the pipeline.
Therefore, whenever the MachineVerifier is run after PHIElimination,
it will encounter verification errors on any tied operands.
This patch adds a function property called TiedOpsRewritten that will be
set by TwoAddressInstructionPass and will control when the verifier checks
tied operands.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D80538
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).
Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor
Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&
Depends on D65919
Reviewers: arsenm, bogner, craig.topper, RKSimon
Reviewed By: arsenm
Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65962
llvm-svn: 369041
The new instruciton might have less operands than the original instruction. If we don't resample, the next loop iteration might read an operand that doesn't exist.
X86 can commute blends to movss/movsd which reduces from 4 operands to 3. This happened in the test case that caused r354363 & company to be reverted. A reduced version of that has been committed here.
Really this whole checking for more commutable operands is a little fragile. It assumes that the new instructions operands are the same order and positions as the original except for the pair that was swapped. I don't know of anything that breaks this assumption today, but I've left a fixme. Fixing this will likely require an interface change.
llvm-svn: 354738
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
A DBG_VALUE between a two-address instruction and a following COPY
would prevent rescheduleMIBelowKill optimization inside
TwoAddressInstructionPass.
Differential Revision: https://reviews.llvm.org/D55987
llvm-svn: 350289