r363757 renamed ExpandISelPseudo to FinalizeISel, so the RUN line in
select-must-add-unconditional-jump.mir needed updating to refer to finalize-isel.
llvm-svn: 365108
When these tests were originally written, the middle end would introduce
an unnecessary copy from r24:r23->GPR16->r24:r23, and these tests
mistakenly relied on it.
The most optimal codegen for the functions in the test cases before this patch
would be NOPs. This is because the first i16 argument always gets the same register
allocation as an i16 return value in the AVR calling convention.
These tests broke in r362963 when the codegen was improved and the
redundant copy was eliminated. After this, the test functions
were lowered to their optimal form - a 'ret' and nothing else.
This patch prepends an extra i16 operand to each of the test functions
so that a 16-bit copy must be inserted for the program to be correct.
llvm-svn: 363131
Summary:
LDWRdPtr would be expanded to ld+ldd. ldd only accepts the pointer register is Y or Z.
So the register class of pointer of LDWRdPtr should be PTRDISPREGS instead of PTRREGS.
Reviewers: dylanmckay
Reviewed By: dylanmckay
Subscribers: dylanmckay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62300
llvm-svn: 362351
If we would allow register coalescing on PTRDISPREGS class then register
allocator can lock Z register to some virtual register. Larger instructions
requiring a memory acces then fail during the register allocation phase since
there is no available register to hold a pointer if Y register was already
taken for a stack frame. This patch prevents it by keeping Z register
spillable. It does it by not allowing coalescer to lock it.
Original discussion on https://github.com/avr-rust/rust/issues/128.
llvm-svn: 362298
Summary:
The endianess used in the calling convention does not always match the
endianess of the target on all architectures, namely AVR.
When an argument is too large to be legalised by the architecture and is
split for the ABI, a new hook TargetLoweringInfo::shouldSplitFunctionArgumentsAsLittleEndian
is queried to find the endianess that function arguments must be laid
out in.
This approach was recommended by Eli Friedman.
Originally reported in https://github.com/avr-rust/rust/issues/129.
Patch by Carl Peto.
Reviewers: bogner, t.p.northover, RKSimon, niravd, efriedma
Reviewed By: efriedma
Subscribers: JDevlieghere, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62003
llvm-svn: 361222
Summary:
A number of optimizations are inhibited by single-use TokenFactors not
being merged into the TokenFactor using it. This makes we consider if
we can do the merge immediately.
Most tests changes here are due to the change in visitation causing
minor reorderings and associated reassociation of paired memory
operations.
CodeGen tests with non-reordering changes:
X86/aligned-variadic.ll -- memory-based add folded into stored leaq
value.
X86/constant-combiners.ll -- Optimizes out overlap between stores.
X86/pr40631_deadstore_elision -- folds constant byte store into
preceding quad word constant store.
Reviewers: RKSimon, craig.topper, spatel, efriedma, courbet
Reviewed By: courbet
Subscribers: dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, eraman, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59260
llvm-svn: 356068
This updates the AVR Select8/Select16 expansion code so that, when
inserting the two basic blocks for true and false conditions, any
existing fallthrough on the previous block is preserved.
Prior to this patch, if the block before the Select pseudo fell through
to the subsequent block, two new basic blocks would be inserted at the
prior fallthrough point, changing the fallthrough destination.
The predecessor or successor lists were not updated, causing the
BranchFolding pass at -O1 and above the rearrange basic blocks, causing
an infinite loop. Not to mention the unconditional fallthrough to the
true block is incorrect in of itself.
This patch modifies the Select8/16 expansion so that, if inserting true
and false basic blocks at a fallthrough point, the implicit branch is
preserved by means of an explicit, unconditional branch to the previous
fallthrough destination.
Thanks to Carl Peto for reporting this bug.
This fixes avr-rust bug https://github.com/avr-rust/rust/issues/123.
llvm-svn: 351721
This reverts commit r351718.
Carl pointed out that the unit test could be improved.
This patch will be recommitted once the test is made more resilient.
llvm-svn: 351719
This updates the AVR Select8/Select16 expansion code so that, when
inserting the two basic blocks for true and false conditions, any
existing fallthrough on the previous block is preserved.
Prior to this patch, if the block before the Select pseudo fell through
to the subsequent block, two new basic blocks would be inserted at the
prior fallthrough point, changing the fallthrough destination.
The predecessor or successor lists were not updated, causing the
BranchFolding pass at -O1 and above the rearrange basic blocks, causing
an infinite loop. Not to mention the unconditional fallthrough to the
true block is incorrect in of itself.
This patch modifies the Select8/16 expansion so that, if inserting true
and false basic blocks at a fallthrough point, the implicit branch is
preserved by means of an explicit, unconditional branch to the previous
fallthrough destination.
Thanks to Carl Peto for reporting this bug.
This fixes avr-rust bug https://github.com/avr-rust/rust/issues/123.
llvm-svn: 351718
Prior to this patch, the AVR::LDWRdPtr instruction was always lowered to
instructions of this pattern:
ld $GPR8, [PTR:XYZ]+
ld $GPR8, [PTR]+1
This has a problem; the [PTR] is incremented in-place once, but never
decremented.
Future uses of the same pointer will use the now clobbered value,
leading to the pointer being incorrect by an offset of one.
This patch modifies the expansion code of the LDWRdPtr pseudo
instruction so that the pointer variable is not silently clobbered in
future uses in the same live range.
Bug first reported by Keshav Kini.
Patch by Kaushik Phatak.
llvm-svn: 351673
This reverts commit r351544.
In that commit, I had mistakenly misattributed the issue submitter as
the patch author, Kaushik Phatak.
The patch will be recommitted immediately with the correct attribution.
llvm-svn: 351672
Prior to this patch, the AVR::LDWRdPtr instruction was always lowered to
instructions of this pattern:
ld $GPR8, [PTR:XYZ]+
ld $GPR8, [PTR]+1
This has a problem; the [PTR] is incremented in-place once, but never
decremented.
Future uses of the same pointer will use the now clobbered value,
leading to the pointer being incorrect by an offset of one.
This patch modifies the expansion code of the LDWRdPtr pseudo
instruction so that the pointer variable is not silently clobbered in
future uses in the same live range.
Patch by Keshav Kini.
llvm-svn: 351544
We should not pre-scheduled the node has ADJCALLSTACKDOWN parent,
or else, when bottom-up scheduling, ADJCALLSTACKDOWN and
ADJCALLSTACKUP may hold CallResource too long and make other
calls can't be scheduled. If there's no other available node
to schedule, the scheduler will try to rename the register by
creating copy to avoid the conflict which will fail because
CallResource is not a real physical register.
llvm-svn: 351527
This change modifies the LLVM ISel lowering settings so that
8-bit/16-bit multiplication is expanded to calls into the compiler
runtime library if the MCU being targeted does not support
multiplication in hardware.
Before this, MUL instructions would be generated on CPUs like the
ATtiny85, triggering a CPU reset due to an illegal instruction at
runtime.
First raised in https://github.com/avr-rust/rust/issues/124.
llvm-svn: 351523
In r346432 ("[DAGCombine] Improve alias analysis for chain of independent stores"),
the order of ldi/sts blocks changed.
The new IR is equivalent to the old IR.
This patch updates the test to fix the test suite.
llvm-svn: 346565
This patch fixes a bug in the AVR FRMIDX expansion logic.
The expansion would leave a leftover operand from the original FRMIDX,
but now attached to a MOVWRdRr instruction. The MOVWRdRr instruction
did not expect this operand and so LLVM rejected the machine
instruction.
This would trigger an assertion:
Assertion failed: ((isImpReg || Op.isRegMask() || MCID->isVariadic() ||
OpNo < MCID->getNumOperands() || isMetaDataOp) &&
"Trying to add an operand to a machine instr that is already done!"),
function addOperand, file llvm/lib/CodeGen/MachineInstr.cpp
Tim fixed this so that now the FRMIDX is expanded correctly into
a well-formed MOVWRdRr.
Patch by Tim Neumann
llvm-svn: 346117
This is an AVR-specific workaround for a limitation of the register
allocator that only exposes itself on targets with high register
contention like AVR, which only has three pointer registers.
The three pointer registers are X, Y, and Z.
In most nontrivial functions, Y is reserved for the frame pointer,
as per the calling convention. This leaves X and Z. Some instructions,
such as LPM ("load program memory"), are only defined for the Z
register. Sometimes this just leaves X.
When the backend generates a LDDWRdPtrQ instruction with Z as the
destination pointer, it usually trips up the register allocator
with this error message:
LLVM ERROR: ran out of registers during register allocation
This patch is a hacky workaround. We ban the LDDWRdPtrQ instruction
from ever using the Z register as an operand. This gives the
register allocator a bit more space to allocate, fixing the
regalloc exhaustion error.
Here is a description from the patch author Peter Nimmervoll
As far as I understand the problem occurs when LDDWRdPtrQ uses
the ptrdispregs register class as target register. This should work, but
the allocator can't deal with this for some reason. So from my testing,
it seams like (and I might be totally wrong on this) the allocator reserves
the Z register for the ICALL instruction and then the register class
ptrdispregs only has 1 register left and we can't use Y for source and
destination. Removing the Z register from DREGS fixes the problem but
removing Y register does not.
More information about the bug can be found on the avr-rust issue
tracker at https://github.com/avr-rust/rust/issues/37.
A bug has raised to track the removal of this workaround and a proper
fix; PR39553 at https://bugs.llvm.org/show_bug.cgi?id=39553.
Patch by Peter Nimmervoll
llvm-svn: 346114
Commit r343851 changed the format of the generated instructions.
An unnecessary load has been removed. Previously, a value would be moved
from r24 into a temporary register just to be copied into r30 before the
indirect call. Now, codegen immediately loads r24 into r30, saving a
MOVW instruction.
llvm-svn: 344111
The 'rol Rd' instruction is equivalent to 'adc Rd'.
This caused compile warnings from tablegen because of conflicting bits
shared between each instruction.
llvm-svn: 341275
This sets trackLivenessAfterRegAlloc on AVRRegisterInfo.
Most existing targets set this flag. Without it, specific IR inputs
cause LLVM to fail with:
Assertion failed: (getParent()->getProperties().hasProperty( MachineFunctionProperties::Property::TracksLiveness) &&
"Liveness information is accurate"), function livein_begin
file MachineBasicBlock.cpp, line 1354.
With this commit, this no longer happens.
Patch by Peter Nimmervoll.
llvm-svn: 334409
The test is taken from
https://github.com/avr-rust/rust/issues/57
The originally implementation of struct return lowering was made in
r325474.
Patch by Peter Nimmervoll
llvm-svn: 327967
This patch adds i128 division support by instruction LLVM to lower
128-bit divisions to the __udivmodti4 and __divmodti4 rtlib functions.
This also adds test for 64-bit division and 128-bit division.
Patch by Peter Nimmervoll.
llvm-svn: 327814
Before I started maintaining the AVR backend, this instruction
never originally used to have an earlyclobber flag.
Some time afterwards (years ago), I must've added it back in, not realising that it
was left out for a reason.
This pseudo instrction exists solely to work around a long standing bug
in the register allocator.
Before this commit, the LDDWRdYQ pseudo was not actually working around
any bug. With the earlyclobber flag removed again, the LDDWRdYQ pseudo
now correctly works around PR13375 again.
llvm-svn: 326774
Instead of:
%bb.1: derived from LLVM BB %for.body
print:
bb.1.for.body:
Also use MIR syntax for MBB attributes like "align", "landing-pad", etc.
llvm-svn: 324563
Summary:
This relaxes an assertion inside SelectionDAGBuilder which is overly
restrictive on targets which have no concept of alignment (such as AVR).
In these architectures, all types are aligned to 8-bits.
After this, LLVM will only assert that accesses are aligned on targets
which actually require alignment.
This patch follows from a discussion on llvm-dev a few months ago
http://llvm.1065342.n5.nabble.com/llvm-dev-Unaligned-atomic-load-store-td112815.html
Reviewers: bogner, nemanjai, joerg, efriedma
Reviewed By: efriedma
Subscribers: efriedma, cactus, llvm-commits
Differential Revision: https://reviews.llvm.org/D39946
llvm-svn: 320243
As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.
The MIR printer prints the IR name of a MBB only for block definitions.
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix
Differential Revision: https://reviews.llvm.org/D40422
llvm-svn: 319665
As part of the unification of the debug format and the MIR format, avoid
printing "vreg" for virtual registers (which is one of the current MIR
possibilities).
Basically:
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g"
* grep -nr '%vreg' . and fix if needed
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g"
* grep -nr 'vreg[0-9]\+' . and fix if needed
Differential Revision: https://reviews.llvm.org/D40420
llvm-svn: 319427
This test was originally added when an old bug was fixed that caused
broken iterator code to break basic block placement.
The issue has an extremely low chance of every being a problem again.
This specific test is very flaky and fails often due to upstream
changes.
I have removed this test because it negates more value than it returns.
llvm-svn: 318134
Previously, on long branches (relative jumps of >4 kB), an assertion
failure was hit, as AVRInstrInfo::insertIndirectBranch was not
implemented. Despite its name, it is called by the branch relaxator
for *all* unconditional jumps.
Patch by Thomas Backman.
llvm-svn: 314891
In some cases, the code generator attempts to generate instructions such as:
lddw r24, Y+63
which expands to:
ldd r24, Y+63
ldd r25, Y+64 # Oops! This is actually ld r25, Y in the binary
This commit limits the first offset to 62, and thus the second to 63.
It also updates some asserts in AVRExpandPseudoInsts.cpp, including for
INW and OUTW, which appear to be unused.
Patch by Thomas Backman.
llvm-svn: 314890
Also enables '__do_clear_bss'.
These functions are automaticalled called by the CRT if they are
declared.
We need these to be called otherwise RAM will start completely
uninitialised, even though we need to copy RAM variables from progmem to
RAM.
llvm-svn: 312905