Commit Graph

344 Commits

Author SHA1 Message Date
Jim Grosbach 7471937ad7 Make tBX_RET and tBX_RET_vararg predicable.
The normal tBX instruction is predicable, so there's no reason the
pseudos for using it as a return shouldn't be. Gives us some nice code-gen
improvements as can be seen by the test changes. In particular, several
tests now have to disable if-conversion because it works too well and defeats
the test.

llvm-svn: 134746
2011-07-08 21:50:04 +00:00
Jakob Stoklund Olesen bbe2a5cfff Fix more register allocation sensitive tests.
llvm-svn: 134667
2011-07-08 00:24:06 +00:00
Evan Cheng 8b2bda09a5 Change some ARM subtarget features to be single bit yes/no in order to sink them down to MC layer. Also fix tests.
llvm-svn: 134590
2011-07-07 03:55:05 +00:00
Chandler Carruth a6e593b4eb FileCheck-ize another test. Reduces the llc invocations from 8 to 1, and
makes one of the tests actually mean something (as the string 'add' will
always appear in the output of this file).

llvm-svn: 134358
2011-07-02 21:34:52 +00:00
Jim Grosbach cf1464d943 ARMv7M vs. ARMv7E-M support.
The DSP instructions in the Thumb2 instruction set are an optional extension
in the Cortex-M* archtitecture. When present, the implementation is considered
an "ARMv7E-M implementation," and when not, an "ARMv7-M implementation."

Add a subtarget feature hook for the v7e-m instructions and hook it up. The
cortex-m3 cpu is an example of a v7m implementation, while the cortex-m4 is
a v7e-m implementation.

rdar://9572992

llvm-svn: 134261
2011-07-01 21:12:19 +00:00
Jim Grosbach b98ab91e39 Thumb1 register to register MOV instruction is predicable.
Fix a FIXME and allow predication (in Thumb2) for the T1 register to
register MOV instructions. This allows some better codegen with
if-conversion (as seen in the test updates), plus it lays the groundwork
for pseudo-izing the tMOVCC instructions.

llvm-svn: 134197
2011-06-30 22:10:46 +00:00
Jim Grosbach 353da73186 Pseudo-ize the t2LDMIA_RET instruction.
It's just a t2LDMIA_UPD instruction with extra codegen properties, so it
doesn't need the encoding information. As a side-benefit, we now correctly
recognize for instruction printing as a 'pop' instruction.

llvm-svn: 134173
2011-06-30 18:25:42 +00:00
Benjamin Kramer d2a84f6a63 Don't depend on the optimization reverted in r134067.
llvm-svn: 134068
2011-06-29 14:07:18 +00:00
Chris Lattner 8936d2bfbc Remove support for parsing the "type i32" syntax for defining a numbered
top level type without a specified number.  This syntax isn't documented
and blocks forward progress.

llvm-svn: 133371
2011-06-19 00:03:46 +00:00
Chris Lattner 80ed9dc9e5 rip out a ton of intrinsic modernization logic from AutoUpgrade.cpp, which is
for pre-2.9 bitcode files.  We keep x86 unaligned loads, movnt, crc32, and the
target indep prefetch change.

As usual, updating the testsuite is a PITA.

llvm-svn: 133337
2011-06-18 06:05:24 +00:00
Jakob Stoklund Olesen 831ae0105a Switch ARM to using AltOrders instead of MethodBodies.
This slightly changes the GPR allocation order on Darwin where R9 is not
a callee-saved register:

Before: %R0 %R1 %R2 %R3 %R12 %R9 %LR %R4 %R5 %R6 %R8 %R10 %R11
After:  %R0 %R1 %R2 %R3 %R9 %R12 %LR %R4 %R5 %R6 %R8 %R10 %R11
llvm-svn: 133326
2011-06-18 01:14:46 +00:00
Chris Lattner b90ed2233c manually upgrade a bunch of tests to modern syntax, and remove some that
are either unreduced or only test old syntax.

llvm-svn: 133228
2011-06-17 03:14:27 +00:00
Rafael Espindola 844485af13 Implement Jakob's suggestion on how to detect fall thought without calling
AnalyzeBranch.

llvm-svn: 132981
2011-06-14 06:08:32 +00:00
Rafael Espindola defd4b0875 AnalyzeBranch doesn't change which successors a bb has, just the order
we try to branch to them.

Before we were creating successor lists with duplicated entries. Fixing that
found a bug in isBlockOnlyReachableByFallthrough that would causes it to
return the wrong answer for

-----------
...
jne foo
jmp bar

foo:
----------

llvm-svn: 132882
2011-06-12 03:20:32 +00:00
Cameron Zwarich 2e252de512 Fix an issue where the two-address conversion pass incorrectly rewrites untied
operands to an early clobber register. This fixes <rdar://problem/9566076>.

llvm-svn: 132738
2011-06-07 23:54:00 +00:00
Jakob Stoklund Olesen b8bf3c0f8b Switch AllocationOrder to using RegisterClassInfo instead of a BitVector
of reserved registers.

Use RegisterClassInfo in RABasic as well. This slightly changes som
allocation orders because RegisterClassInfo puts CSR aliases last.

llvm-svn: 132581
2011-06-03 20:34:53 +00:00
Stuart Hastings aa02c0847d Since I can't reproduce the failures from 131261, re-trying with a
simplified version.  <rdar://problem/9298790>

llvm-svn: 131274
2011-05-13 00:51:54 +00:00
Stuart Hastings 8d57d8ea64 Revert 131266 and 131261 due to buildbot complaints.
rdar://problem/9298790

llvm-svn: 131269
2011-05-13 00:15:17 +00:00
Stuart Hastings ef4940254f Tweak 131261 (thumb2-cbnz.ll) to generate the intended cbnz.
rdar://problem/9298790

llvm-svn: 131266
2011-05-13 00:10:03 +00:00
Stuart Hastings 89f1b47e3a Non-fast-isel followup to 129634; correctly handle branches controlled
by non-CMP expressions.  The executable test case (129821) would test
this as well, if we had an "-O0 -disable-arm-fast-isel" LLVM-GCC
tester.  Alas, the ARM assembly would be very difficult to check with
FileCheck.

The thumb2-cbnz.ll test is affected; it generates larger code (tst.w
vs. cmp #0), but I believe the new version is correct.
rdar://problem/9298790

llvm-svn: 131261
2011-05-12 23:36:41 +00:00
Eli Friedman 5401962643 Re-revert r130877; it's apparently causing a regression on 197.parser,
possibly related to cbnz formation.

llvm-svn: 130977
2011-05-06 05:23:07 +00:00
Eli Friedman 0fe4608af2 Re-commit r130862 with a minor change to avoid an iterator running off the edge in some cases.
Original message:

Teach MachineCSE how to do simple cross-block CSE involving physregs.  This allows, for example, eliminating duplicate cmpl's on x86. Part of rdar://problem/8259436 .

llvm-svn: 130877
2011-05-04 22:10:36 +00:00
Eli Friedman 3bd79ba856 Back out r130862; it appears to be breaking bootstrap.
llvm-svn: 130867
2011-05-04 20:48:42 +00:00
Eli Friedman a16fc2fec0 Teach MachineCSE how to do simple cross-block CSE involving physregs. This allows, for example, eliminating duplicate cmpl's on x86. Part of rdar://problem/8259436 .
llvm-svn: 130862
2011-05-04 19:54:24 +00:00
Jakob Stoklund Olesen 28a93a49bb Fix more register and coalescing dependencies.
llvm-svn: 130859
2011-05-04 19:02:11 +00:00
Jakob Stoklund Olesen d7fd7bfc31 Explicitly request physreg coalesing for a bunch of Thumb2 unit tests.
These tests all follow the same pattern:

	mov	r2, r0
	movs	r0, #0
	$CMP	r2, r1
	it	eq
	moveq	r0, #1
	bx	lr

The first 'mov' can be eliminated by rematerializing 'movs r0, #0' below the
test instruction:

	$CMP	r0, r1
	mov.w	r0, #0
	it	eq
	moveq	r0, #1
	bx	lr

So far, only physreg coalescing can do that. The register allocators won't yet
split live ranges just to eliminate copies. They can learn, but this particular
problem is not likely to show up in real code. It only appears because r0 is
used for both the function argument and return value.

llvm-svn: 130858
2011-05-04 19:02:07 +00:00
Jakob Stoklund Olesen edfabc9aad Weekly fix of register allocation dependent unit tests.
llvm-svn: 130567
2011-04-30 01:37:52 +00:00
Andrew Trick e794e17524 Teach Thumb2 isel to fold and->rotr ==> ROR.
Generalization of Nate Begeman's patch!

llvm-svn: 130502
2011-04-29 14:18:15 +00:00
Andrew Trick 65266ed4d7 Combine thumb2-ror tests.
llvm-svn: 130498
2011-04-29 14:02:41 +00:00
Evan Cheng 1355bbdd11 Be careful about scheduling nodes above previous calls. It increase usages of
more callee-saved registers and introduce copies. Only allows it if scheduling
a node above calls would end up lessen register pressure.

Call operands also has added ABI restrictions for register allocation, so be
extra careful with hoisting them above calls.

rdar://9329627

llvm-svn: 130245
2011-04-26 21:31:35 +00:00
Benjamin Kramer ba446cc12a Make tests more useful.
lit needs a linter ...

llvm-svn: 130126
2011-04-25 10:12:01 +00:00
Andrew Trick 76dca78cb4 Accidental function name mangling.
llvm-svn: 130050
2011-04-23 04:08:15 +00:00
Andrew Trick 0ed5778a1e Thumb2 and ARM add/subtract with carry fixes.
Fixes Thumb2 ADCS and SBCS lowering: <rdar://problem/9275821>.
t2ADCS/t2SBCS are now pseudo instructions, consistent with ARM, so the
assembly printer correctly prints the 's' suffix.

Fixes Thumb2 adde -> SBC matching to check for live/dead carry flags.

Fixes the internal ARM machine opcode mnemonic for ADCS/SBCS.
Fixes ARM SBC lowering to check for live carry (potential bug).

llvm-svn: 130048
2011-04-23 03:55:32 +00:00
Andrew Trick 1a1f8d4640 whitespace
llvm-svn: 130046
2011-04-23 03:24:11 +00:00
Evan Cheng c0d2004e3c In Thumb2 mode, lower frame indix references to:
add <rd>, sp, #<imm8>
ldr <rd>, [sp, #<imm8>]
When the offset from sp is multiple of 4 and in range of 0-1020.
This saves code size by utilizing 16-bit instructions.

rdar://9321541

llvm-svn: 129971
2011-04-22 01:42:52 +00:00
Andrew Trick b53a00d2cb Recommit r129383. PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency.
Additional fixes:
Do something reasonable for subtargets with generic
itineraries by handle node latency the same as for an empty
itinerary. Now nodes default to unit latency unless an itinerary
explicitly specifies a zero cycle stage or it is a TokenFactor chain.

Original fixes:
UnitsSharePred was a source of randomness in the scheduler: node
priority depended on the queue data structure. I rewrote the recent
VRegCycle heuristics to completely replace the old heuristic without
any randomness. To make the ndoe latency adjustments work, I also
needed to do something a little more reasonable with TokenFactor. I
gave it zero latency to its consumers and always schedule it as low as
possible.

llvm-svn: 129421
2011-04-13 00:38:32 +00:00
Chris Lattner 418b1037b0 fix two completely broken tests, which were matching due to PR9629.
llvm-svn: 129195
2011-04-09 06:34:38 +00:00
Jakob Stoklund Olesen 100f53fd25 Fix Thumb and Thumb2 tests to be register allocator independent.
llvm-svn: 128690
2011-03-31 23:31:50 +00:00
Eric Christopher d553096688 Fix the bfi handling for or (and a mask) (and b mask). We need the two
masks to match inversely for the code as is to work. For the example given
we actually want:

bfi r0, r2, #1, #1

not #0, however, given the way the pattern is written it's not possible
at the moment.

Fixes rdar://9177502

llvm-svn: 128320
2011-03-26 01:21:03 +00:00
Cameron Zwarich 338d362200 Roll r127459 back in:
Optimize trivial branches in CodeGenPrepare, which often get created from the
lowering of objectsize intrinsics. Unfortunately, a number of tests were relying
on llc not optimizing trivial branches, so I had to add an option to allow them
to continue to test what they originally tested.

This fixes <rdar://problem/8785296> and <rdar://problem/9112893>.

llvm-svn: 127498
2011-03-11 21:52:04 +00:00
Daniel Dunbar 94ccb27b43 Revert r127459, "Optimize trivial branches in CodeGenPrepare, which often get
created from the", it broke some GCC test suite tests.

llvm-svn: 127477
2011-03-11 19:30:30 +00:00
Cameron Zwarich cc27b3acc4 Optimize trivial branches in CodeGenPrepare, which often get created from the
lowering of objectsize intrinsics. Unfortunately, a number of tests were relying
on llc not optimizing trivial branches, so I had to add an option to allow them
to continue to test what they originally tested.

This fixes <rdar://problem/8785296> and <rdar://problem/9112893>.

llvm-svn: 127459
2011-03-11 04:54:27 +00:00
Bob Wilson 43dff0f4b4 Move a test that ended up in the wrong place.
llvm-svn: 124933
2011-02-05 04:15:50 +00:00
Evan Cheng 2f2435d026 Last round of fixes for movw + movt global address codegen.
1. Fixed ARM pc adjustment.
2. Fixed dynamic-no-pic codegen
3. CSE of pc-relative load of global addresses.

It's now enabled by default for Darwin.

llvm-svn: 123991
2011-01-21 18:55:51 +00:00
Andrew Trick bd428ec50f Enable support for precise scheduling of the instruction selection
DAG. Disable using "-disable-sched-cycles".

For ARM, this enables a framework for modeling the cpu pipeline and
counting stalls. It also activates several heuristics to drive
scheduling based on the model. Scheduling is inherently imprecise at
this stage, and until spilling is improved it may defeat attempts to
schedule. However, this framework provides greater control over
tuning codegen.

Although the flag is not target-specific, it should have very little
affect on the default scheduler used by x86. The only two changes that
affect x86 are:
- scheduling a high-latency operation bumps the current cycle so independent
  operations can have their latency covered. i.e. two independent 4
  cycle operations can produce results in 4 cycles, not 8 cycles.
- Two operations with equal register pressure impact and no
  latency-based stalls on their uses will be prioritized by depth before height
  (height is irrelevant if no stalls occur in the schedule below this point).

llvm-svn: 123971
2011-01-21 06:19:05 +00:00
Bob Wilson 8265d56638 Add ARM patterns to match EXTRACT_SUBVECTOR nodes.
Also fix an off-by-one in SelectionDAGBuilder that was preventing shuffle
vectors from being translated to EXTRACT_SUBVECTOR.
Patch by Tim Northover.

The test changes are needed to keep those spill-q tests from testing aligned
spills and restores.  If the only aligned stack objects are spill slots, we
no longer realign the stack frame.  Prior to this patch, an EXTRACT_SUBVECTOR
was legalized by loading from the stack, which created an aligned frame index.
Now, however, there is nothing except the spill slot in the stack frame, so
I added an aligned alloca.

llvm-svn: 122995
2011-01-07 04:59:04 +00:00
Bob Wilson 651eaa02b8 Remove the rest of the *_sfp Neon instruction patterns.
Use the same COPY_TO_REGCLASS approach as for the 2-register *_sfp instructions.
This change made a big difference in the code generated for the
CodeGen/Thumb2/cross-rc-coalescing-2.ll test: The coalescer is still doing
a fine job, but some instructions that were previously moved outside the loop
are not moved now.  It's using fewer VFP registers now, which is generally
a good thing, so I think the estimates for register pressure changed and that
affected the LICM behavior.  Since that isn't obviously wrong, I've just
changed the test file.  This completes the work for Radar 8711675.

llvm-svn: 121730
2010-12-13 23:02:37 +00:00
Evan Cheng 3434575704 (or (and (shl A, #shamt), mask), B) => ARMbfi B, A, ~mask where lsb(mask) == #shamt. rdar://8752056
llvm-svn: 121606
2010-12-11 04:11:38 +00:00
Jim Grosbach 5fccad84a3 ARM stm/ldm instructions require more than one register in the register list.
Otherwise, a plain str/ldr should be used instead. Make sure we account for
that in prologue/epilogue code generation.
rdar://8745460

llvm-svn: 121391
2010-12-09 18:31:13 +00:00
Bob Wilson ed854baad5 The Thumb tADDrSPi instruction is not valid when the destination is SP.
Check for that and try narrowing it to tADDspi instead.  Radar 8724703.

llvm-svn: 120892
2010-12-04 04:40:19 +00:00