Commit Graph

332 Commits

Author SHA1 Message Date
Jim Grosbach 7ef7ddd2df Clean up a few 80 column violations.
llvm-svn: 132946
2011-06-13 22:54:22 +00:00
Tanya Lattner f0759ef271 Fix encoding for VEXTdf.
llvm-svn: 132486
2011-06-02 21:25:24 +00:00
Mon P Wang 92ff16b7bb Fixed MC encoding for index_align for VLD1/VST1 (single element from one lane) for size 32
llvm-svn: 131085
2011-05-09 17:47:27 +00:00
Mon P Wang 27f3330132 Fixed encoding for VEXTqf
llvm-svn: 129101
2011-04-07 19:56:12 +00:00
Owen Anderson abda3caf67 Somehow we managed to forget to encode the lane index for a large swathe of NEON instructions. With this fix, the entire test-suite passes with the Thumb integrated assembler.
llvm-svn: 128587
2011-03-30 23:45:29 +00:00
Cameron Zwarich 53dd03d537 Add a ARM-specific SD node for VBSL so that forms with a constant first operand
can be recognized. This fixes <rdar://problem/9183078>.

llvm-svn: 128584
2011-03-30 23:01:21 +00:00
Owen Anderson d6c5a741b5 Get rid of the non-writeback versions VLDMDB and VSTMDB, which don't actually exist.
llvm-svn: 128461
2011-03-29 16:45:53 +00:00
Jim Grosbach 59eea670f8 ARM VDUPfd and VDUPfq can just be patterns. The instruction is the same
as for VDUP32d and VDUP32q, respectively.

llvm-svn: 127489
2011-03-11 20:44:08 +00:00
Jim Grosbach c77dea7f55 ARM VDUPLNfq and VDUPLNfd definitions can just be Pat<>s for VDUPLN32q
and VDUPLN32d, respectively.

llvm-svn: 127486
2011-03-11 20:31:17 +00:00
Jim Grosbach 24fe5e36ea ARM VREV64df and VREV64qf can just be patterns. The instruction is the same
as for VREV64d32 and VREV64q32, respectively.

llvm-svn: 127485
2011-03-11 20:18:05 +00:00
Bill Wendling 5e57137e87 * Correct encoding for VSRI.
* Add tests for VSRI and VSLI.

llvm-svn: 127297
2011-03-09 00:33:17 +00:00
Bill Wendling a7f303de71 Correct the encoding for VRSRA and VSRA instructions.
llvm-svn: 127294
2011-03-09 00:00:35 +00:00
Bill Wendling e313f16ad9 * Fix VRSHR and VSHR to have the correct encoding for the immediate.
* Update the NEON shift instruction test to expect what 'as' produces.

llvm-svn: 127293
2011-03-08 23:48:09 +00:00
Bill Wendling 77ad1dc56d Rename the narrow shift right immediate operands to "shr_imm*" operands. Also
expand the testing of the narrowing shift right instructions.

No functionality change.

llvm-svn: 127193
2011-03-07 23:38:41 +00:00
Bill Wendling 3b1459b810 Narrow right shifts need to encode their immediates differently from a normal
shift.

   16-bit: imm6<5:3> = '001', 8 - <imm> is encded in imm6<2:0>
   32-bit: imm6<5:4> = '01',16 - <imm> is encded in imm6<3:0>
   64-bit: imm6<5> = '1', 32 - <imm> is encded in imm6<4:0>

llvm-svn: 126723
2011-03-01 01:00:59 +00:00
Bob Wilson e3ecd5fb9b Add patterns to use post-increment addressing for Neon VST1-lane instructions.
llvm-svn: 126477
2011-02-25 06:42:42 +00:00
Bob Wilson a609b8954e Change VLD3/4 and VST3/4 for quad registers to not update the address register.
These operations are expanded to pairs of loads or stores, and the first one
uses the address register update to produce the address for the second one.
So far, the second load/store has also updated the address register, just
for convenience, since that output has never been used.  In anticipation of
actually supporting post-increment updates for these operations, this changes
the non-updating operations to use a non-updating load/store for the second
instruction.

llvm-svn: 125013
2011-02-07 17:43:15 +00:00
Bob Wilson 42e67b5f73 Fix some NEON instruction itineraries.
llvm-svn: 125012
2011-02-07 17:43:12 +00:00
Bob Wilson 8265d56638 Add ARM patterns to match EXTRACT_SUBVECTOR nodes.
Also fix an off-by-one in SelectionDAGBuilder that was preventing shuffle
vectors from being translated to EXTRACT_SUBVECTOR.
Patch by Tim Northover.

The test changes are needed to keep those spill-q tests from testing aligned
spills and restores.  If the only aligned stack objects are spill slots, we
no longer realign the stack frame.  Prior to this patch, an EXTRACT_SUBVECTOR
was legalized by loading from the stack, which created an aligned frame index.
Now, however, there is nothing except the spill slot in the stack frame, so
I added an aligned alloca.

llvm-svn: 122995
2011-01-07 04:59:04 +00:00
Bob Wilson eda2a9ec89 Rearrange some Neon multiclasses. No functional changes.
llvm-svn: 122119
2010-12-18 00:42:58 +00:00
Bob Wilson 00871c71e9 Fix result type of Neon floating-point comparisons against zero.
The result vector elements are always integers.  Radar 8782191.

llvm-svn: 122112
2010-12-18 00:04:33 +00:00
Bob Wilson fa27a8621c Add Neon VCVT instructions for f32 <-> f16 conversions.
Clang is now providing intrinsics for these and so we need to support them
in the backend.  Radar 8068427.

llvm-svn: 121902
2010-12-15 22:14:12 +00:00
Bob Wilson 651eaa02b8 Remove the rest of the *_sfp Neon instruction patterns.
Use the same COPY_TO_REGCLASS approach as for the 2-register *_sfp instructions.
This change made a big difference in the code generated for the
CodeGen/Thumb2/cross-rc-coalescing-2.ll test: The coalescer is still doing
a fine job, but some instructions that were previously moved outside the loop
are not moved now.  It's using fewer VFP registers now, which is generally
a good thing, so I think the estimates for register pressure changed and that
affected the LICM behavior.  Since that isn't obviously wrong, I've just
changed the test file.  This completes the work for Radar 8711675.

llvm-svn: 121730
2010-12-13 23:02:37 +00:00
Bob Wilson aae0862172 Simplify N2VSPat, removing some unnecessary type arguments.
llvm-svn: 121729
2010-12-13 23:02:31 +00:00
Bob Wilson 9c00c014ab Delete a line that I forgot to revert previously.
llvm-svn: 121719
2010-12-13 22:05:55 +00:00
Bob Wilson 9b3546d877 Use COPY_TO_REGCLASS instead of pseudo instructions for Neon FP patterns.
Jakob Olesen suggested that we can avoid the need for separate pseudo
instructions here by using COPY_TO_REGCLASS in the patterns.  The pattern
gets pretty ugly but it seems to work well.  Partial fix for Radar 8711675.

llvm-svn: 121718
2010-12-13 21:58:05 +00:00
Bob Wilson 157fec42c9 Use pseudo instructions for 2-register Neon instructions for scalar FP.
Partial fix for Radar 8711675.

llvm-svn: 121716
2010-12-13 21:05:52 +00:00
Bob Wilson 52f522720e Remove unused instruction class arguments.
llvm-svn: 121715
2010-12-13 21:05:44 +00:00
Bob Wilson 9375d27460 Add float patterns for Neon vld1-lane/dup and vst1-lane operations.
llvm-svn: 121583
2010-12-10 22:13:32 +00:00
Bob Wilson e1d3322111 Remove unused arguments.
llvm-svn: 121582
2010-12-10 22:13:24 +00:00
Evan Cheng 62c7b5bf76 Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
   additional pipeline stall. So it's frequently better to single codegen
   vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Work in progress, only A+B are enabled.

llvm-svn: 120960
2010-12-05 22:04:16 +00:00
Jim Grosbach 371e586544 Fix copy/pasto in vmin.f32 encoding.
llvm-svn: 120709
2010-12-02 16:30:58 +00:00
Owen Anderson 4472801765 Use by-name rather than by-order matching for NEON operands.
llvm-svn: 120507
2010-12-01 00:28:25 +00:00
Bob Wilson 318ce7cb3f Fix the encoding of VLD4-dup alignment.
The only reasonable way I could find to do this is to provide an alternate
version of the addrmode6 operand with a different encoding function.  Use it
for all the VLD-dup instructions for the sake of consistency.

llvm-svn: 120358
2010-11-30 00:00:42 +00:00
Bob Wilson 0b27b68164 Rename VLDnDUP instructions with double-spaced registers
in an attempt to make things a little more consistent.

llvm-svn: 120357
2010-11-30 00:00:38 +00:00
Bob Wilson 431ac4ef50 Add support for NEON VLD3-dup instructions.
The encoding for alignment in VLD4-dup instructions is still a work in progress.

llvm-svn: 120356
2010-11-30 00:00:35 +00:00
Bob Wilson 77ab165afe Add support for NEON VLD3-dup instructions.
llvm-svn: 120312
2010-11-29 19:35:29 +00:00
Bob Wilson 2d790df105 Add support for NEON VLD2-dup instructions.
llvm-svn: 120236
2010-11-28 06:51:26 +00:00
Bob Wilson 04b2c94205 Another minor refactoring for VLD1DUP instructions.
The op11_8 field is the same for all of them so put it in the instruction
classes instead of specifying it separately for each instruction.

llvm-svn: 120234
2010-11-28 06:51:15 +00:00
Bob Wilson d74cf2c8f6 Refactor. Set alignment bit in VLD1-dup instruction classes.
llvm-svn: 120197
2010-11-27 07:12:02 +00:00
Bob Wilson c92eea0175 Add NEON VLD1-dup instructions (load 1 element to all lanes).
llvm-svn: 120194
2010-11-27 06:35:16 +00:00
Owen Anderson 7e484e0be7 Use by-name rather than by-order operand matching for some NEON encodings.
llvm-svn: 119923
2010-11-21 06:47:06 +00:00
Owen Anderson b4fd2c90e9 The Vm and Vn register fields must be the same for a register-register vmov.
llvm-svn: 119867
2010-11-19 23:12:43 +00:00
Jim Grosbach 785952e5ac Operand names
llvm-svn: 119864
2010-11-19 22:43:08 +00:00
Jim Grosbach 7d8df3185f Clarify operand names.
llvm-svn: 119858
2010-11-19 22:36:02 +00:00
Jim Grosbach 9c335bf977 Remove trailing whitespace.
llvm-svn: 119608
2010-11-18 01:39:50 +00:00
Jim Grosbach a74c7ccd59 ARM PseudoInst instructions don't need or use an assembler string. Get rid of
the operand to the pattern.

llvm-svn: 119607
2010-11-18 01:38:26 +00:00
Bill Wendling a68e3a5397 Encode the multi-load/store instructions with their respective modes ('ia',
'db', 'ib', 'da') instead of having that mode as a separate field in the
instruction. It's more convenient for the asm parser and much more readable for
humans.
<rdar://problem/8654088>

llvm-svn: 119310
2010-11-16 01:16:36 +00:00
Owen Anderson c7baee31ad Add support for ARM's specialized vector-compare-against-zero instructions.
llvm-svn: 118453
2010-11-08 23:21:22 +00:00
Owen Anderson 30c4892ea5 Add codegen and encoding support for the immediate form of vbic.
llvm-svn: 118291
2010-11-05 19:27:46 +00:00