to instruction right after the last instruction in the bundle.
- Add a finalizeBundle() variant that doesn't specify LastMI. Instead, the code
will find the last instruction in the bundle by following the 'InsideBundle'
marker. This is useful in case bundles are formed early (i.e. during MI
scheduling) but finalized later (i.e. after register allocator has finished
rewriting virtual registers with physical registers).
llvm-svn: 148444
Load/store instructions w/ a fixup to be relative a function marked as thumb
don't use the low bit to specify thumb vs. non-thumb like interworking
branches do, so don't set it when dealing with those fixups.
rdar://10348687.
llvm-svn: 148366
When set, this bit indicates that a register is completely defined by
the value of its sub-registers.
Use the CoveredBySubRegs property to infer which super-registers are
call-preserved given a list of callee-saved registers. For example, the
ARM registers D8-D15 are callee-saved. This now automatically implies
that Q4-Q7 are call-preserved.
Conversely, Win64 callees save XMM6-XMM15, but the corresponding
YMM6-YMM15 registers are not call-preserved because they are not fully
defined by their sub-registers.
llvm-svn: 148363
(This time I believe I've checked all the -Wreturn-type warnings from GCC & added the couple of llvm_unreachables necessary to silence them. If I've missed any, I'll happily fix them as soon as I know about them)
llvm-svn: 148262
live across BBs before register allocation. This miscompiled 197.parser
when a cmp + b are optimized to a cbnz instruction even though the CPSR def
is live-in a successor.
cbnz r6, LBB89_12
...
LBB89_12:
ble LBB89_1
The fix consists of two parts. 1) Teach LiveVariables that some unallocatable
registers might be liveouts so don't mark their last use as kill if they are.
2) ARM constantpool island pass shouldn't form cbz / cbnz if the conditional
branch does not kill CPSR.
rdar://10676853
llvm-svn: 148168
The QQ and QQQQ registers are not 'real', they are pseudo-registers used
to model some vld and vst instructions.
This makes the call clobber lists longer, but I intend to get rid of
those soon.
llvm-svn: 148151
Allow LDRD to be formed from pairs with different LDR encodings. This was the original intention of the pass. Somewhere along the way, the LDR opcodes were refined which broke the optimization. We really don't care what the original opcodes are as long as they both map to the same LDRD and the immediate still fits.
Fixes rdar://10435045 ARMLoadStoreOptimization cannot handle mixed LDRi8/LDRi12
llvm-svn: 147922
This function runs after all constant islands have been placed, and may
shrink some instructions to their 2-byte forms. This can actually cause
some constant pool entries to move out of range because of growing
alignment padding.
Treat instructions that may be shrunk the same as inline asm - they
erode the known alignment bits.
Also reinstate an old assertion in verify(). It is correct now that
basic block offsets include alignments.
Add a single large test case that will hopefully exercise many parts of
the constant island pass.
<rdar://problem/10670199>
llvm-svn: 147885
On Thumb, the displacement computation hardware uses the address of the
current instruction rouned down to a multiple of 4. Include this
rounding in the UserOffset we compute for each instruction.
When inline asm is present, the instruction alignment may not be known.
Constrain the maximum displacement instead in that case.
This makes it possible for CreateNewWater() and OffsetIsInRange() to
agree about the valid displacements. When they disagree, infinite
looping happens.
As always, test cases for this stuff are insane.
<rdar://problem/10660175>
llvm-svn: 147825
This eliminates a lot of constant pool entries for -O0 builds of code
with many global variable accesses.
This speeds up -O0 codegen of consumer-typeset by 2x because the
constant island pass no longer has to look at thousands of constant pool
entries.
<rdar://problem/10629774>
llvm-svn: 147712
Now that canRealignStack() understands frozen reserved registers, it is
safe to use it for aligned spill instructions.
It will only return true if the registers reserved at the beginning of
register allocation allow for dynamic stack realignment.
<rdar://problem/10625436>
llvm-svn: 147579
Once register allocation has started the reserved registers are frozen.
Fix the ARM canRealignStack() hook to respect the frozen register state.
Now the hook returns false if register allocation was started with frame
pointer elimination enabled.
It also returns false if register allocation started without a reserved
base pointer, and stack realignment would require a base pointer. This
bug was breaking oggenc on armv6.
No test case, an upcoming patch will use this functionality to realign
the stack for spill slots when possible.
llvm-svn: 147578
This patch caused a miscompilation of oggenc because a frame pointer was
suddenly needed halfway through register allocation.
<rdar://problem/10625436>
llvm-svn: 147487
If anybody has strong feelings about 'default: assert(0 && "blah")' vs
'default: llvm_unreachable("blah")', feel free to regularize the instances of
each in this file.
llvm-svn: 147459
ARM targets with NEON units have access to aligned vector loads and
stores that are potentially faster than unaligned operations.
Add support for spilling the callee-saved NEON registers to an aligned
stack area using 16-byte aligned NEON loads and store.
This feature is off by default, controlled by an -align-neon-spills
command line option.
llvm-svn: 147211
My change r146949 added register clobbers to the eh_sjlj_dispatchsetup pseudo
instruction, but on Thumb1 some of those registers cannot be used. This
caused massive failures on the testsuite when compiling for Thumb1. While
fixing that, I noticed that the eh_sjlj_setjmp instruction has a "nofp"
variant, and I realized that dispatchsetup needs the same thing, so I have
added that as well.
llvm-svn: 147204
The value from the operands isn't right yet, but we weren't encoding it at
all previously. The parser needs to twiddle the values when building the
instruction.
Partial for: rdar://10558523
llvm-svn: 147170
Rather than require the symbol to be explicitly an argument of the directive,
allow it to look ahead and grab the symbol from the next non-whitespace
line.
rdar://10611140
llvm-svn: 147100
Use the spill slot alignment as well as the local variable alignment to
determine when the stack needs to be realigned. This works now that the
ARM target can always realign the stack by using a base pointer.
Still respect the ARMBaseRegisterInfo::canRealignStack() function
vetoing a realigned stack. Don't use aligned spill code in that case.
llvm-svn: 146997
We used to rely on the *eh_sjlj_setjmp instructions to mark that a function
with setjmp/longjmp exception handling clobbers all the registers. But with
the recent reorganization of ARM EH, those eh_sjlj_setjmp instructions are
expanded away earlier, before PEI can see them to determine what registers to
save and restore. Mark the dispatchsetup instruction in the same way, since
that instruction cannot be expanded early. This also more accurately reflects
when the registers are clobbered.
llvm-svn: 146949
"mov r1, r2, lsl #0" should assemble as "mov r1, r2" even though it's
not strictly legal UAL syntax. It's a common extension and the friendly
thing to do.
rdar://10604663
llvm-svn: 146937
Use information computed while inferring new register classes to emit
accurate, table-driven implementations of getMatchingSuperRegClass().
Delete the old manual, error-prone implementations in the targets.
llvm-svn: 146873
This adjustment is already included in the block offsets computed by
BasicBlockInfo, and adjusting again here can cause the pass to loop.
When CreateNewWater splits a basic block, OffsetIsInRange would reject
the new CPE on the next pass because of the too conservative alignment
adjustment. This caused the block to be split again, and so on.
llvm-svn: 146751
The command line option should be removed, but not until the feature has
gotten a lot of testing. The ARMConstantIslandPass tends to have subtle
bugs that only show up after a while.
llvm-svn: 146739
An aligned constant pool entry may require extra alignment padding where
the new water is created. Take that into account when computing offset.
Also consider the alignment of other constant pool entries when
splitting a basic block. Alignment padding may make it necessary to
move the split point higher.
llvm-svn: 146609
r0 = mov #0
r0 = moveq #1
Then the second instruction has an implicit data dependency on the first
instruction. Sadly I have yet to come up with a small test case that
demonstrate the post-ra scheduler taking advantage of this.
llvm-svn: 146583
Work in progress. Parsing for non-writeback, single spaced register lists
works now. The rest have the representations better factored, but still
need more to be able to parse properly.
llvm-svn: 146579
When 'cmp rn #imm' doesn't match due to the immediate not being representable,
but 'cmn rn, #-imm' does match, use the latter in place of the former, as
it's equivalent.
rdar://10552389
llvm-svn: 146567
to finalize MI bundles (i.e. add BUNDLE instruction and computing register def
and use lists of the BUNDLE instruction) and a pass to unpack bundles.
- Teach more of MachineBasic and MachineInstr methods to be bundle aware.
- Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to
prevent IT blocks from being broken apart.
llvm-svn: 146542
undefined result. This adds new ISD nodes for the new semantics,
selecting them when the LLVM intrinsic indicates that the undef behavior
is desired. The new nodes expand trivially to the old nodes, so targets
don't actually need to do anything to support these new nodes besides
indicating that they should be expanded. I've done this for all the
operand types that I could figure out for all the targets. Owners of
various targets, please review and let me know if any of these are
incorrect.
Note that the expand behavior is *conservatively correct*, and exactly
matches LLVM's current behavior with these operations. Ideally this
patch will not change behavior in any way. For example the regtest suite
finds the exact same instruction sequences coming out of the code
generator. That's why there are no new tests here -- all of this is
being exercised by the existing test suite.
Thanks to Duncan Sands for reviewing the various bits of this patch and
helping me get the wrinkles ironed out with expanding for each target.
Also thanks to Chris for clarifying through all the discussions that
this is indeed the approach he was looking for. That said, there are
likely still rough spots. Further review much appreciated.
llvm-svn: 146466
Constant pool entries with different alignment may cause more alignment
padding to be inserted. Compute the amount of padding needed, and try to
pick the location that requires the least amount of padding.
Also take the extra padding into account when the water is above the
use.
llvm-svn: 146458
subdirectories to traverse into.
- Originally I wanted to avoid this and just autoscan, but this has one key
flaw in that new subdirectories can not automatically trigger a rerun of the
llvm-build tool. This is particularly a pain when switching back and forth
between trees where one has added a subdirectory, as the dependencies will
tend to be wrong. This will also eliminates FIXME implicitly.
llvm-svn: 146436
These modifiers simply select either the low or high D subregister of a Neon
Q register. I've also removed the unimplemented 'p' modifier, which turns out
to be a bit different than the comment here suggests and as far as I can tell
was only intended for internal use in Apple's version of gcc.
llvm-svn: 146417
Downgrade the alignment of the initial constant island when constant
pool entries are moved elsewhere.
This is all gated by -arm-align-constant-islands.
llvm-svn: 146391
Order constant pool entries by descending alignment in the initial
island to ensure packing and correct alignment. When the command line
flag is set, also align the basic block containing the constant pool
entries.
This is only a partial implementation of constant island alignment. More
to come.
llvm-svn: 146375
The split point is picked such that the newly created water has the same
alignment as the function. This makes the island suitable for constant
pool entries with potentially higher alignment.
This also fixes an issue where the basic block was split one instruction
too late, causing nonconvergence of the algorithm.
<rdar://problem/10550705>
There is still an issue with correctly packing differently aligned
entries in the island.
llvm-svn: 146314