3017 Commits

Author SHA1 Message Date
Eric Christopher 6a0333c1ed One definition of isThumb is plenty, thanks.
llvm-svn: 112793
2010-09-02 01:39:14 +00:00
Jim Grosbach 8ee5cd99ef Remove trailing whitespace
llvm-svn: 112790
2010-09-02 01:02:06 +00:00
Eric Christopher 74487fcbe7 Rework arm fast-isel load and store handling. Move offset computation
into the "address selection" routine and handle constant materialization
for stores.

llvm-svn: 112788
2010-09-02 00:53:56 +00:00
Jim Grosbach 6f2067659d trivial cleanup
llvm-svn: 112779
2010-09-02 00:02:26 +00:00
Jim Grosbach dffc9d328d Simplify the tGPR register class now that the register allocators know not
to try to allocate reserved registers.

llvm-svn: 112774
2010-09-01 23:50:23 +00:00
Bob Wilson 38ab35a911 Remove NEON vmull, vmlal, and vmlsl intrinsics, replacing them with multiply,
add, and subtract operations with zero-extended or sign-extended vectors.
Update tests.  Add auto-upgrade support for the old intrinsics.

llvm-svn: 112773
2010-09-01 23:50:19 +00:00
Eric Christopher fde5a3d494 Some basic store support.
llvm-svn: 112752
2010-09-01 22:16:27 +00:00
Eric Christopher 3ce9c4a65f Add some more load types in.
llvm-svn: 112721
2010-09-01 18:01:32 +00:00
Chris Lattner 94f834348f zap dead code.
llvm-svn: 112712
2010-09-01 16:04:34 +00:00
Chris Lattner 39eccb4754 temporarily revert r112664, it is causing a decoding conflict, and
the testcases should be merged.

llvm-svn: 112711
2010-09-01 16:00:50 +00:00
Bill Wendling 6789f8b6ae We have a chance for an optimization. Consider this code:
int x(int t) {
  if (t & 256)
    return -26;
  return 0;
}

We generate this:

     tst.w   r0, #256
     mvn     r0, #25
     it      eq
     moveq   r0, #0

while gcc generates this:

     ands    r0, r0, #256
     it      ne
     mvnne   r0, #25
     bx      lr

Scandalous really!

During ISel time, we can look for this particular pattern. One where we have a
"MOVCC" that uses the flag off of a CMPZ that itself is comparing an AND
instruction to 0. Something like this (greatly simplified):

  %r0 = ISD::AND ...
  ARMISD::CMPZ %r0, 0         @ sets [CPSR]
  %r0 = ARMISD::MOVCC 0, -26  @ reads [CPSR]

All we have to do is convert the "ISD::AND" into an "ARM::ANDS" that sets [CPSR]
when it's zero. The zero value will already be in the %r0 register and we only
need to change it if the AND wasn't zero. Easy!
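
As a sanity check on the reasoning above, the following Python sketch models the C function and both instruction sequences on 32-bit values and confirms that they agree. The helper names (`c_reference`, `tst_sequence`, `ands_sequence`) and the flag modelling are illustrative assumptions, not code from the LLVM tree.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit pattern as a signed two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def c_reference(t):
    """The C function from the commit message."""
    return -26 if (t & 256) else 0


def tst_sequence(t):
    """tst.w r0, #256 ; mvn r0, #25 ; it eq ; moveq r0, #0"""
    z_flag = (t & 256) == 0      # TST sets Z from the AND result, r0 unchanged
    r0 = ~25 & MASK32            # MVN writes the bitwise NOT of 25, i.e. -26
    if z_flag:                   # MOVEQ executes only when Z is set
        r0 = 0
    return to_signed(r0)


def ands_sequence(t):
    """ands r0, r0, #256 ; it ne ; mvnne r0, #25"""
    r0 = t & 256                 # ANDS writes the result and sets Z
    z_flag = r0 == 0
    if not z_flag:               # MVNNE executes only when Z is clear
        r0 = ~25 & MASK32
    return to_signed(r0)
```

When the AND result is zero, the ANDS form leaves the wanted return value in r0 already, which is why it needs one instruction fewer.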

llvm-svn: 112664
2010-08-31 22:41:22 +00:00
Bill Wendling d657d82597 Add ANDS pattern to match the t2ANDS pattern.
llvm-svn: 112654
2010-08-31 22:05:37 +00:00
Jim Grosbach 9ce9210e47 SP relative offsets need to be adjusted by the local allocation size when
determining if they're likely to be in range of the SP when resolving
frame references.

llvm-svn: 112624
2010-08-31 18:52:31 +00:00
Jim Grosbach 6f6b590b99 this assert should just be a condition, since this function is just asking if
the offset is legally encodable, not actually trying to do the encoding.

llvm-svn: 112622
2010-08-31 18:49:31 +00:00
Bill Wendling b70dc8777e - Cleanup some whitespaces.
- Convert {0,1} and friends into 0b01, which is identical and more consistent.

llvm-svn: 112593
2010-08-31 07:50:46 +00:00
Eric Christopher 901176a755 Rewrite slightly so we can expand for floating point types easier.
llvm-svn: 112568
2010-08-31 01:28:42 +00:00
Eric Christopher bbd1098989 If we have an unhandled type then assert, we shouldn't get here for
things we can't handle.

llvm-svn: 112559
2010-08-30 23:48:26 +00:00
Anton Korobeynikov 48043d0173 Expand MOVi32imm in ARM mode after regalloc. This provides
scheduling opportunities (extra instruction can go in between
MOVT / MOVW pair removing the stall).
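
For illustration, a 32-bit immediate splits into the two 16-bit halves that a MOVW / MOVT pair carries. This Python sketch is only a model of that split; `split_movi32` and `emulate_movw_movt` are hypothetical names, not LLVM functions.

```python
def split_movi32(imm):
    """Split a 32-bit immediate into (low 16 bits, high 16 bits)."""
    imm &= 0xFFFFFFFF
    return imm & 0xFFFF, imm >> 16


def emulate_movw_movt(lo16, hi16):
    """Model the register contents after MOVW followed by MOVT."""
    reg = lo16 & 0xFFFF                          # MOVW: low half, top half zeroed
    reg = (reg & 0xFFFF) | ((hi16 & 0xFFFF) << 16)  # MOVT: top half, low half kept
    return reg
```

Because MOVT only depends on the register written by MOVW, an unrelated instruction can be scheduled between the two once they are separate instructions.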

llvm-svn: 112546
2010-08-30 22:50:36 +00:00
Bill Wendling 87bb14c566 Use the existing T2I_bin_s_irs pattern instead of creating T2I_bin_sw_irs, which
is meant to do exactly the same thing. Thanks to Jim Grosbach for pointing this
out! :-)

llvm-svn: 112538
2010-08-30 22:05:23 +00:00
Jakob Stoklund Olesen 4d30f90e35 Remember to clear the shadow kill flag at the same time as clearing the real
kill flag.

This could cause duplicate kill flags when the same register was used twice in a
continuous sequence of STRs.

There is no small test case. <rdar://problem/8218046>

llvm-svn: 112534
2010-08-30 21:52:40 +00:00
Bob Wilson 4cd8a126c3 Remove NEON vmovn intrinsic, replacing it with vector truncate operations.
Auto-upgrade the old intrinsic and update tests.

llvm-svn: 112507
2010-08-30 20:02:30 +00:00
Jim Grosbach fef37287a8 Make ARM add rN, sp, #imm instructions rematerializable.
That's how the address of locals is calculated, so this should help relieve
register pressure a bit. Recalculating the local address is almost always going
to be better than spilling.

llvm-svn: 112503
2010-08-30 19:49:58 +00:00
Bob Wilson e2f8bdac14 When expanding NEON VST pseudo instructions, if the original super-register
operand is killed, add it to the expanded instruction as an implicit kill
operand instead of marking the individual subregs with kill flags.  This
should work better in general and also handles the case for VST3 where one
of the subregs was not referenced in the expanded instruction and so was
not marked killed.

llvm-svn: 112494
2010-08-30 18:10:48 +00:00
Bill Wendling f8dfa461fa Create Thumb2sI_cpsr and T2sI_cpsr. These new classes indicate that CPSR is the
optional modified register (instead of reg0). Along with r112461 it will make
sure that the optional define of CPSR is marked as "def" and will thus mark the
instructions using these classes (t2ANDS*) as setting the 's' flag.

llvm-svn: 112462
2010-08-30 01:47:35 +00:00
Bill Wendling 8fc2b590b9 Fix whitespaces. No functionality changes.
llvm-svn: 112421
2010-08-29 11:31:07 +00:00
Bob Wilson d0c054886c Remove NEON vaddl, vaddw, vsubl, and vsubw intrinsics. Instead, use llvm
IR add/sub operations with one or both operands sign- or zero-extended.
Auto-upgrade the old intrinsics.

llvm-svn: 112416
2010-08-29 05:57:34 +00:00
Bill Wendling df9ec17d53 - Add a parameter to T2I_bin_irs for those patterns which set the S bit.
- Create T2I_bin_sw_irs to be like T2I_bin_w_irs, but that it sets the S bit.

llvm-svn: 112399
2010-08-29 03:55:31 +00:00
Bill Wendling b0dc465c04 Rename ANDflag to ANDS, which is less stupid.
llvm-svn: 112395
2010-08-29 03:06:09 +00:00
Bill Wendling ac64ed0923 File missing from last commit.
llvm-svn: 112394
2010-08-29 03:02:28 +00:00
Bill Wendling 0a65116cce Create an ARMISD::AND node. This node is exactly like the "ARM::AND" node, but
it sets the CPSR register.

llvm-svn: 112393
2010-08-29 03:02:11 +00:00
Bob Wilson 950882be07 Use pseudo instructions for VST1 and VST2.
llvm-svn: 112357
2010-08-28 05:12:57 +00:00
Bob Wilson 8ee9394750 We don't need to custom-select VLDMQ and VSTMQ anymore.
llvm-svn: 112336
2010-08-28 00:20:11 +00:00
Bob Wilson ca5af12920 When merging Thumb2 loads/stores, do not give up when the offset is one of
the special values that for ARM would be used with IB or DA modes.  Fall
through and consider materializing a new base address if it would be
profitable.

llvm-svn: 112329
2010-08-27 23:57:52 +00:00
Bob Wilson 13ce07fa92 Change ARM VFP VLDM/VSTM instructions to use addressing mode #4, just like
all the other LDM/STM instructions.  This fixes asm printer crashes when
compiling with -O0.  I've changed one of the NEON tests (vst3.ll) to run
with -O0 to check this in the future.

Prior to this change VLDM/VSTM used addressing mode #5, but not really.
The offset field was used to hold a count of the number of registers being
loaded or stored, and the AM5 opcode field was expanded to specify the IA
or DB mode, instead of the standard ADD/SUB specifier.  Much of the backend
was not aware of these special cases.  The crashes occurred when rewriting
a frameindex caused the AM5 offset field to be changed so that it did not
have a valid submode.  I don't know exactly what changed to expose this now.
Maybe we've never done much with -O0 and NEON.  Regardless, there's no longer
any reason to keep a count of the VLDM/VSTM registers, so we can use
addressing mode #4 and clean things up in a lot of places.

llvm-svn: 112322
2010-08-27 23:18:17 +00:00
Bob Wilson af371b49a8 Unsigned value cannot be < 0.
llvm-svn: 112300
2010-08-27 21:44:35 +00:00
Jim Grosbach 6a77066913 Simplify eliminateFrameIndex() interface back down now that PEI doesn't need
to try to re-use scavenged frame index reference registers. rdar://8277890

llvm-svn: 112241
2010-08-26 23:32:16 +00:00
Jim Grosbach e82d5b4aaf tidy up a bit. no functional change.
llvm-svn: 112228
2010-08-26 21:56:30 +00:00
Jim Grosbach 17da935964 Turn off the scavenging based frame reg reuse briefly to measure whether it's
still having a significant effect. It shouldn't be now that the pre-RA
virtual base reg stuff is in. Assuming that's validated by the nightly
testers, we can simplify a lot of the PEI frame index code.

llvm-svn: 112220
2010-08-26 21:29:54 +00:00
Bob Wilson 97919e9c59 Use pseudo instructions for VST3.
llvm-svn: 112208
2010-08-26 18:51:29 +00:00
Bill Wendling a9c03f4fae Reapply r112176 without removing the other CMN patterns (that was unintentional).
llvm-svn: 112206
2010-08-26 18:33:51 +00:00
Jim Grosbach 074d22e1ac Restrict the register to tGPR to make sure the str instruction will be
encodable as a 16-bit wide instruction.

llvm-svn: 112195
2010-08-26 17:02:47 +00:00
Dan Gohman 10b20b2b81 Revert r112176; it broke test/CodeGen/Thumb2/thumb2-cmn.ll.
llvm-svn: 112191
2010-08-26 15:50:25 +00:00
Bill Wendling a9a0599b39 There seems to be a (potential) hardware bug with the CMN instruction and
comparison with 0. These two pieces of code should give identical results:

  rsbs r1, r1, #0
  cmp  r0, r1
  mov  r0, #0
  it   ls
  mov  r0, #1

and:

  cmn  r0, r1
  mov  r0, #0
  it   ls
  mov  r0, #1

However, the CMN gives the *opposite* result when r1 is 0. This is because the
carry flag is set in the CMP case but not in the CMN case. In short, the CMP
instruction computes AddWithCarry(r0, NOT(0), 1). The carry flag is taken from
the sum before it is truncated to 32 bits, and because the "carry bit" parameter
is 1, r0 + 0xFFFFFFFF + 1 always overflows, so the carry flag is always set. The
CMN instruction computes AddWithCarry(r0, 0, 0): there is no NOT of 0 and the
"carry bit" parameter is 0, so there is never a carry.

The AddWithCarry in the CMP case seems to be relying upon the identity:

  ~x + 1 = -x

However when x is 0 and unsigned, this doesn't hold:

   x = 0
  ~x = 0xFFFF FFFF
  ~x + 1 = 0x1 0000 0000
  (-x = 0) != (0x1 0000 0000 = ~x + 1)

Therefore, we should disable *all* versions of CMN, especially when comparing
against zero, until we can limit when the CMN instruction is used (when we know
that the RHS is not 0) or when we have a hardware fix for this.

(See the ARM docs for the "AddWithCarry" pseudo-code.)
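
The difference can be reproduced with a small Python model of AddWithCarry and the LS condition (C clear or Z set). This is a sketch assuming the usual 32-bit definition of AddWithCarry; `via_cmp` and `via_cmn` are hypothetical helpers standing in for the two sequences above, returning the final value of r0.

```python
MASK32 = 0xFFFFFFFF


def add_with_carry(x, y, carry_in):
    """Return (result, carry_out, zero) for a 32-bit add with carry-in."""
    unsigned_sum = (x & MASK32) + (y & MASK32) + carry_in
    result = unsigned_sum & MASK32
    carry_out = 1 if result != unsigned_sum else 0
    zero = 1 if result == 0 else 0
    return result, carry_out, zero


def ls_condition(carry, zero):
    """LS (unsigned lower or same) holds when C == 0 or Z == 1."""
    return carry == 0 or zero == 1


def via_cmp(r0, r1):
    """rsbs r1, r1, #0 ; cmp r0, r1 ; r0 = LS ? 1 : 0"""
    negated = (-r1) & MASK32
    # CMP is a subtraction: AddWithCarry(r0, NOT(operand), 1)
    _, carry, zero = add_with_carry(r0, ~negated & MASK32, 1)
    return 1 if ls_condition(carry, zero) else 0


def via_cmn(r0, r1):
    """cmn r0, r1 ; r0 = LS ? 1 : 0"""
    # CMN is an addition: AddWithCarry(r0, operand, 0)
    _, carry, zero = add_with_carry(r0, r1, 0)
    return 1 if ls_condition(carry, zero) else 0
```

In this model the two helpers agree for every nonzero r1 and disagree when r1 is 0 and r0 is nonzero, which matches the behaviour described above.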

This is related to <rdar://problem/7569620>.

llvm-svn: 112176
2010-08-26 09:07:33 +00:00
Bob Wilson 4cec44975e Use pseudo instructions for VST1d64Q.
llvm-svn: 112170
2010-08-26 05:33:30 +00:00
Jim Grosbach 08da771ec3 Enable pre-RA virtual frame base register allocation. rdar://8277890
llvm-svn: 112127
2010-08-26 00:58:06 +00:00
Bob Wilson 4629f423f8 Revert svn 107892 (with changes to work with trunk). It caused a crash if
a VLD result was not used (Radar 8355607).  It should also fix pr7988, but
I haven't verified that yet.

llvm-svn: 112118
2010-08-26 00:13:36 +00:00
Bob Wilson 9392b0e960 Start converting NEON load/stores to use pseudo instructions, beginning here
with the VST4 instructions.  Until after register allocation, we want to
represent sets of adjacent registers by a single super-register.  These
VST4 pseudo instructions have a single QQ or QQQQ source register operand.
They get expanded to the real VST4 instructions with 4 separate D register
operands.  Once this conversion is complete, we'll be able to remove the
NEONPreAllocPass and avoid some fragile and hacky code elsewhere.

llvm-svn: 112108
2010-08-25 23:27:42 +00:00
Jim Grosbach 0a84487fa7 Don't override the var from the enclosing scope.
When doing copy/paste/modify, it's apparently rather important to remember
the 'modify' bit...

llvm-svn: 112075
2010-08-25 19:11:34 +00:00
Daniel Dunbar a54a1b0edf ARM/Thumb2: Fix a misselect in getARMCmp, when attempting to adjust a signed
comparison that would overflow.
 - The other under/overflow cases can't actually happen because the immediates
   which would trigger them are legal (so we don't enter this code), but
   adjusted the style to make it clear the transform is always valid.
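
The kind of overflow involved can be illustrated with a generic example: rewriting a signed `x < C` as `x <= C - 1` is only valid when `C - 1` does not wrap. The Python sketch below is an assumption-laden illustration of that guard, not the getARMCmp code itself; `adjust_signed_lt` and `wrap32` are hypothetical names.

```python
INT_MIN = -(1 << 31)
INT_MAX = (1 << 31) - 1


def wrap32(value):
    """Wrap an integer to the signed 32-bit range, as hardware would."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def adjust_signed_lt(c):
    """Return c - 1 so that `x < c` becomes `x <= c - 1`.

    Returns None when c is INT_MIN, because c - 1 would wrap to INT_MAX
    and turn an always-false comparison into an always-true one.
    """
    if c == INT_MIN:
        return None
    return c - 1
```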

llvm-svn: 112053
2010-08-25 16:58:05 +00:00
Eric Christopher 7a0d8c69cb Do type checks before we bother to do everything else.
llvm-svn: 112039
2010-08-25 08:43:57 +00:00
Eric Christopher 761e7fb605 Reorganize load mechanisms. Handle types in a little less fixed way.
Fix some todos.  No functional change.

llvm-svn: 112031
2010-08-25 07:23:49 +00:00
Eric Christopher 15b182f4d4 Fix predicate and add a comment.
llvm-svn: 111981
2010-08-24 22:34:11 +00:00
Eric Christopher 236ec8f3b5 Rework braindead conditionals I put in yesterday.
llvm-svn: 111974
2010-08-24 22:07:27 +00:00
Eric Christopher 6c99ebf5b0 Fix thumb2 mode loads to have the correct operand ordering. Add a todo
to fix this in the port.

llvm-svn: 111973
2010-08-24 22:03:02 +00:00
Jim Grosbach 2eedb7949e Add ARM heuristic for when to allocate a virtual base register for stack
access. rdar://8277890&7352504

llvm-svn: 111968
2010-08-24 21:19:33 +00:00
Jim Grosbach b77d67f318 Move enabling the local stack allocation pass into the target where it belongs.
For now it's still a command line option, but the interface to the generic
code doesn't need to know that.

llvm-svn: 111942
2010-08-24 19:05:43 +00:00
Jim Grosbach 35b7c033d4 add ARM cmd line option to force always using virtual base regs when possible.
Intended to help ease reproducing problems by increasing base register usage
after heuristics for only using them when needed are in place.

llvm-svn: 111930
2010-08-24 18:04:52 +00:00
Bill Wendling 2c64ba63a1 Add comments for what the condition code symbols mean.
llvm-svn: 111889
2010-08-24 01:11:30 +00:00
Eric Christopher 46d3a56e5d Update comment.
llvm-svn: 111887
2010-08-24 01:10:52 +00:00
Eric Christopher c0c00ca33f Fix the opcode and the operands for the load instruction.
llvm-svn: 111885
2010-08-24 01:10:04 +00:00
Eric Christopher eb47692c22 Add register class hack that needs to go away, but makes it more obvious
that it needs to go away.  Use loadRegFromStackSlot where possible.

Also, remember to update the value map.

llvm-svn: 111883
2010-08-24 00:50:47 +00:00
Eric Christopher 9d4e471cc2 Add some more debugging code, make it more obvious that RegOffset is
getting an address for an object and select some default values.

llvm-svn: 111871
2010-08-24 00:07:24 +00:00
Eric Christopher e3107d6283 Don't need the extra register here.
llvm-svn: 111864
2010-08-23 23:28:04 +00:00
Eric Christopher 414501c511 Add some more "get address into register" code and more TODOs/FIXMEs.
llvm-svn: 111860
2010-08-23 23:14:31 +00:00
Eric Christopher 8d03b8a8ce Add an ARMFunctionInfo member and use it.
llvm-svn: 111854
2010-08-23 22:32:45 +00:00
Eric Christopher 00202ee329 Start getting ARM loads/address computation going.
llvm-svn: 111850
2010-08-23 21:44:12 +00:00
Bob Wilson 9a511c07e4 Replace the arm.neon.vmovls and vmovlu intrinsics with vector sign-extend and
zero-extend operations.

llvm-svn: 111614
2010-08-20 04:54:02 +00:00
Eric Christopher 985d9e4ea8 Fix loop conditionals (MO.isDef() asserts that it's a reg) and
move some constraints around.

llvm-svn: 111594
2010-08-20 00:36:24 +00:00
Eric Christopher d8e8a2945e Add a couple of random comments.
llvm-svn: 111592
2010-08-20 00:20:31 +00:00
Jim Grosbach 56e56323c8 Better handling of offsets on frame index references. rdar://8277890
llvm-svn: 111585
2010-08-19 23:52:25 +00:00
Jim Grosbach 8c58bd30dc Add Thumb1 support for virtual frame indices.
rdar://8277890

llvm-svn: 111533
2010-08-19 17:52:13 +00:00
Eric Christopher a5d60c62b1 Silence warning.
llvm-svn: 111518
2010-08-19 15:35:27 +00:00
Eric Christopher 0d274a0258 Add an AddOptionalDefs method and use it.
llvm-svn: 111489
2010-08-19 00:37:05 +00:00
Bill Wendling 768d3b510c Add the "isCompare" attribute to the defm instead of each individual instr.
llvm-svn: 111481
2010-08-19 00:05:48 +00:00
Eric Christopher 8a70781cac Remove extra header.
llvm-svn: 111456
2010-08-18 23:38:16 +00:00
Jim Grosbach dbfc2ce95d Enable ARM base register reuse to local stack slot allocation. Whenever a new
frame index reference to an object in the local block is seen, check if
it's near enough to any previously allocated base register to re-use.

rdar://8277890

llvm-svn: 111443
2010-08-18 22:44:49 +00:00
Bill Wendling ad2aa57774 Minor simplification. Gets rid of a needless temporary.
llvm-svn: 111430
2010-08-18 21:32:07 +00:00
Jim Grosbach e0e9b3013f Add hook for re-using virtual base registers for local stack slot access.
Nothing fancy, just ask the target if any currently available base reg
is in range for the instruction under consideration and use the first one
that is. Placeholder ARM implementation simply returns false for now.

ongoing saga of rdar://8277890

llvm-svn: 111374
2010-08-18 17:57:37 +00:00
Bob Wilson fb7eaff759 Expand ZERO_EXTEND operations for NEON vector types.
Testcase from Nick Lewycky.

llvm-svn: 111341
2010-08-18 01:45:52 +00:00
Jim Grosbach 3cf08661f4 Add materialization of virtual base registers for frame indices allocated into
the local block. Resolve references to those indices to a new base register.
For simplification and testing purposes, a new virtual base register is
allocated for each frame index being resolved. The result is truly horrible,
but correct, code that's good for exercising the new code paths.

Next up is adding thumb1 support, which should be very simple. Following that
will be adding base register re-use and implementing a reasonable ARM
heuristic for when a virtual base register should be generated at all.

llvm-svn: 111315
2010-08-17 22:41:55 +00:00
Jakob Stoklund Olesen e2cbaf6ed7 Don't call tablegen'ed Predicate_* functions in the ARM target.
llvm-svn: 111277
2010-08-17 20:39:04 +00:00
Jim Grosbach 62800a990b 80 column cleanup.
llvm-svn: 111266
2010-08-17 18:39:16 +00:00
Jim Grosbach c252ee2375 Add hook to examine an instruction referencing a frame index to determine
whether to allocate a virtual frame base register to resolve the frame
index reference in it. Implement a simple version for ARM to aid debugging.

In LocalStackSlotAllocation, scan the function for frame index references
to local frame indices and ask the target whether to allocate virtual
frame base registers for any it encounters. Purely infrastructural for
debug output. Next step is to actually allocate base registers, then add
intelligent re-use of them.

rdar://8277890

llvm-svn: 111262
2010-08-17 18:13:53 +00:00
Jim Grosbach 8995a1018c explicitly handle no-op cases for clarity. Fixes clang warning.
llvm-svn: 111260
2010-08-17 18:00:41 +00:00
Bob Wilson 942b10f511 Change ARM PKHTB and PKHBT instructions to use a shift_imm operand to avoid
printing "lsl #0".  This fixes the remaining parts of pr7792.  Make
corresponding changes for encoding/decoding these instructions.

llvm-svn: 111251
2010-08-17 17:23:19 +00:00
Chris Lattner 72a364c107 fix emacs language spec's, patch by Edmund Grimley-Evans!
llvm-svn: 111241
2010-08-17 16:20:04 +00:00
Bob Wilson 411dfad981 Allow more cases of undef shuffle indices and add tests for them.
llvm-svn: 111226
2010-08-17 05:54:34 +00:00
Eric Christopher 09f757d4bc Copy over some overridden MI wrappers for ARM fast-isel. This is where
we're adding predicates and optional defs to the MachineInstrs.

llvm-svn: 111222
2010-08-17 01:25:29 +00:00
Eric Christopher 663f49900d Make arm fast-isel possible to enable via command line.
llvm-svn: 111219
2010-08-17 00:46:57 +00:00
Bob Wilson c350e7a509 Ignore undef shuffle indices when checking for a VTRN shuffle. Radar 8290937.
llvm-svn: 111208
2010-08-16 23:37:17 +00:00
Bob Wilson 804f6159f1 Generalize a pattern for PKHTB: an SRL of 16-31 bits will guarantee
that the high halfword is zero.  The shift need not be exactly 16 bits.
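
The claim about the shift amount is easy to check numerically. This Python sketch models a 32-bit logical shift right; `srl32` and `high_halfword` are illustrative helper names, not LLVM functions.

```python
def srl32(value, amount):
    """Logical shift right of a 32-bit value."""
    return (value & 0xFFFFFFFF) >> amount


def high_halfword(value):
    """Bits 31..16 of a 32-bit value."""
    return (value >> 16) & 0xFFFF


def shift_clears_high_half(amount):
    """True if an SRL by `amount` zeroes the high halfword for any input.

    All-ones is the worst case: if its high halfword ends up zero, so does
    the high halfword of every other 32-bit value.
    """
    return high_halfword(srl32(0xFFFFFFFF, amount)) == 0
```

Any shift of 16 to 31 bits passes this check, while a shift of 15 bits leaves bit 16 set for an all-ones input.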

llvm-svn: 111196
2010-08-16 22:26:55 +00:00
Bob Wilson 481d7a9ab4 Rename sat_shift operand to shift_imm, in preparation for using it for other
instructions besides saturate instructions.  No functional changes.

llvm-svn: 111168
2010-08-16 18:27:34 +00:00
Bob Wilson 8303fbbcf9 Remove unused code.
llvm-svn: 111154
2010-08-16 17:06:03 +00:00
Bob Wilson bffc757df7 T2I_rbin_irs rr variant is for disassembly only, so don't provide a pattern.
llvm-svn: 111068
2010-08-14 03:18:29 +00:00
Bob Wilson 4577f37d49 Add a Thumb2 t2RSBrr instruction for disassembly only.
This fixes another part of PR7792.

llvm-svn: 111057
2010-08-13 23:24:25 +00:00
Bob Wilson 3c9ed76ba5 Temporarily disable tail calls on ARM to work around some linker problems.
llvm-svn: 111050
2010-08-13 22:43:33 +00:00
Bob Wilson 15b3c3d0ac Move the Thumb2 SSAT and USAT optional shift operator out of the
instruction opcode.  This fixes part of PR7792.

llvm-svn: 111047
2010-08-13 21:48:10 +00:00
Bob Wilson d3a828ce68 Refactor the code for disassembling Thumb2 saturate instructions along the
same lines as the change I made for ARM saturate instructions.

llvm-svn: 111029
2010-08-13 19:04:21 +00:00
Johnny Chen 8e8f1c133a Cleaned up the for-disassembly-only entries in the arm instruction table so that
the memory barrier variants (other than 'SY' full system domain read and write)
are treated as one instruction with option operand.

llvm-svn: 110951
2010-08-12 20:46:17 +00:00
Evan Cheng 44a320dafa Make sure ARM constant island pass does not break up an IT block. If the split point is in the middle of an IT block, it should move it up to just above the IT instruction. rdar://8302637
llvm-svn: 110947
2010-08-12 20:30:05 +00:00