Commit Graph

20459 Commits

Author SHA1 Message Date
Craig Topper 6c08930c5e Change FMA4 memory forms to use memopv* instead of alignedloadv*. No need to force alignment on these instructions. Add a couple testcases for memory forms.
llvm-svn: 147361
2011-12-30 02:18:36 +00:00
Craig Topper 2ca79b9d4b Fix load size for FMA4 SS/SD instructions. They need to use f32 and f64 size, but with the special handling to be compatible with the intrinsic expecting a vector. Similar handling is already used elsewhere.
llvm-svn: 147360
2011-12-30 01:49:53 +00:00
Hal Finkel 692d1fb355 Cleanup stack/frame register define/kill states. This fixes two bugs:
1. The ST*UX instructions that store and update the stack pointer did not set define/kill on R1. This became a problem when I activated post-RA scheduling (and had incorrectly adjusted the Frames-large test).

2. eliminateFrameIndex did not kill its scavenged temporary register, and this could cause the scavenger to exhaust all available registers (and its emergency spill slot) when there were a lot of CR values to spill. The 2010-02-12-saveCR test has been adjusted to check for this.

llvm-svn: 147359
2011-12-30 00:34:00 +00:00
Craig Topper d773607eee Fix execution domains for PS/PD FMA3 instructions. Add SS/SD forms o FMA3 instructions.
llvm-svn: 147353
2011-12-29 20:43:40 +00:00
Craig Topper 8cab06a214 Expose FMA3 instructions to the disassembler.
llvm-svn: 147351
2011-12-29 20:03:14 +00:00
Craig Topper e1bd05128e Make FMA3 imply AVX needs to be enabled. Particularly because 256-bit types aren't valid unless AVX is enabled.
llvm-svn: 147349
2011-12-29 19:46:19 +00:00
Craig Topper dd286a5201 Change XOP detection to use the correct CPUID bit instead of using the FMA4 bit.
llvm-svn: 147348
2011-12-29 19:25:56 +00:00
Craig Topper a060afb5ba Add FeaturePOPCNT to all CPU types that lost it was removed from SSE42/SSE4A in r147339.
llvm-svn: 147347
2011-12-29 18:47:31 +00:00
Craig Topper 97f05c5768 Mark non-VEX forms of PCLMUL instructions as requiring SSE2 to be enabled along with CLMUL. That's required for the XMM registers to be valid for integer data. Doesn't change any behavior since the CLMUL instructions don't have patterns yet.
llvm-svn: 147345
2011-12-29 18:08:36 +00:00
Craig Topper 1559123c77 Mark non-VEX forms of AES instructions as requiring SSE2 to be enabled along with AES. Since that's required for the XMM registers to be valid for integer data. Doesn't change any behavior though since you can't use an intrinsic with an illegal type anyway. Just makes it consistent with the VEX forms.
llvm-svn: 147344
2011-12-29 18:00:08 +00:00
Craig Topper 9e61291bf5 Remove the separate explicit AES instruction patterns. They are equivalent to the patterns specified by the instructions. Also remove unnecessary bitconverts from the AES patterns.
llvm-svn: 147342
2011-12-29 17:41:56 +00:00
Craig Topper 7bd3305f3e Make SSE42 and SSE4A not imply POPCNT. POPCNT should be able to be disabled on its own without disabling SSE4.2 or SSE4A.
llvm-svn: 147339
2011-12-29 15:51:45 +00:00
Craig Topper 0fdf720ded Make LowerBUILD_VECTOR keep node vector types consistent when creating MOVL for v16i16 and v32i8.
llvm-svn: 147337
2011-12-29 03:34:54 +00:00
Craig Topper 862c9b65be Remove some elses after returns.
llvm-svn: 147336
2011-12-29 03:20:51 +00:00
Craig Topper 274e20a499 Remove trailing spaces. Fix an assert to use && instead of || before string. Add same assert on similar code path.
llvm-svn: 147335
2011-12-29 03:09:33 +00:00
Eli Friedman 3a01ddb7e9 Fix type-checking for load transformation which is not legal on floating-point types. PR11674.
llvm-svn: 147323
2011-12-28 21:24:44 +00:00
Elena Demikhovsky b3515a8d4b Fixed a bug in LowerVECTOR_SHUFFLE and LowerBUILD_VECTOR.
Matching MOVLP mask for AVX (265-bit vectors) was wrong.
The failure was detected by conformance tests.

llvm-svn: 147308
2011-12-28 08:14:01 +00:00
Benjamin Kramer b668401b2e Clean up some Release build warnings.
llvm-svn: 147289
2011-12-27 11:41:05 +00:00
Craig Topper df34d152bd Add handling of x86_avx2_pmovmskb to computeMaskedBitsForTargetNode for consistency. Add comments and an assert for BMI instructions to PerformXorCombine since the enabling of the combine is conditional on it, but the function itself isn't.
llvm-svn: 147287
2011-12-27 06:27:23 +00:00
Venkatraman Govindaraju 1fc8263b4d Sparc: Implement emitFrameIndexDebugValue and getDebugValue Location hooks.
llvm-svn: 147269
2011-12-25 18:50:24 +00:00
Rafael Espindola a56ab0ede7 Section relative fixups are a coff concept, not a x86 one. Replace the
x86 specific reloc_coff_secrel32 with a generic FK_SecRel_4.

llvm-svn: 147252
2011-12-24 14:47:52 +00:00
Chandler Carruth a3d54fe0ae Use standard promotion for i8 CTTZ nodes and i8 CTLZ nodes when the
LZCNT instructions are available. Force promotion to i32 to get
a smaller encoding since the fix-ups necessary are just as complex for
either promoted type

We can't do standard promotion for CTLZ when lowering through BSR
because it results in poor code surrounding the 'xor' at the end of this
instruction. Essentially, if we promote the entire CTLZ node to i32, we
end up doing the xor on a 32-bit CTLZ implementation, and then
subtracting appropriately to get back to an i8 value. Instead, our
custom logic just uses the knowledge of the incoming size to compute
a perfect xor. I'd love to know of a way to fix this, but so far I'm
drawing a blank. I suspect the legalizer could be more clever and/or it
could collude with the DAG combiner, but how... ;]

llvm-svn: 147251
2011-12-24 12:12:34 +00:00
Chandler Carruth 38ce24455d Add systematic testing for cttz as well, and fix the bug I spotted by
inspection earlier.

llvm-svn: 147250
2011-12-24 11:46:10 +00:00
Benjamin Kramer 767bbe48c1 Chandler fixed this.
llvm-svn: 147247
2011-12-24 11:23:32 +00:00
Chandler Carruth c9fcde2347 Expand more when we have a nice 'tzcnt' instruction, to avoid generating
'bsf' instructions here.

This one is actually debatable to my eyes. It's not clear that any chip
implementing 'tzcnt' would have a slow 'bsf' for any reason, and unless
EFLAGS or a zero input matters, 'tzcnt' is just a longer encoding.
Still, this restores the old behavior with 'tzcnt' enabled for now.

llvm-svn: 147246
2011-12-24 11:11:38 +00:00
Chandler Carruth 7e9453e916 Switch the lowering of CTLZ_ZERO_UNDEF from a .td pattern back to the
X86ISelLowering C++ code. Because this is lowered via an xor wrapped
around a bsr, we want the dagcombine which runs after isel lowering to
have a chance to clean things up. In particular, it is very common to
see code which looks like:

  (sizeof(x)*8 - 1) ^ __builtin_clz(x)

Which is trying to compute the most significant bit of 'x'. That's
actually the value computed directly by the 'bsr' instruction, but if we
match it too late, we'll get completely redundant xor instructions.

The more naive code for the above (subtracting rather than using an xor)
still isn't handled correctly due to the dagcombine getting confused.

Also, while here fix an issue spotted by inspection: we should have been
expanding the zero-undef variants to the normal variants when there is
an 'lzcnt' instruction. Do so, and test for this. We don't want to
generate unnecessary 'bsr' instructions.

These two changes fix some regressions in encoding and decoding
benchmarks. However, there is still a *lot* to be improve on in this
type of code.

llvm-svn: 147244
2011-12-24 10:55:54 +00:00
Jakob Stoklund Olesen 103318e9ea Fix Comments.
llvm-svn: 147238
2011-12-24 04:17:01 +00:00
Akira Hatanaka 1cf7576707 Add MachineMemOperands to instructions generated in storeRegToStackSlot or
loadRegFromStackSlot. 

llvm-svn: 147235
2011-12-24 03:11:18 +00:00
Akira Hatanaka 6f54a46133 Detect unaligned loads/stores that have been added for Mips64 support.
llvm-svn: 147234
2011-12-24 03:07:37 +00:00
Akira Hatanaka 695d113adc If target ABI is N64, LEA should be daddiu.
llvm-svn: 147232
2011-12-24 02:59:27 +00:00
Rafael Espindola 908d2ed14e Move x86 specific bits of the COFF writer to lib/Target/X86.
llvm-svn: 147231
2011-12-24 02:14:02 +00:00
Jakob Stoklund Olesen 0965585cb1 Experimental support for aligned NEON spills.
ARM targets with NEON units have access to aligned vector loads and
stores that are potentially faster than unaligned operations.

Add support for spilling the callee-saved NEON registers to an aligned
stack area using 16-byte aligned NEON loads and store.

This feature is off by default, controlled by an -align-neon-spills
command line option.

llvm-svn: 147211
2011-12-23 00:36:18 +00:00
Bob Wilson 1a74de9504 Add variants of the dispatchsetup pseudo for Thumb and !VFP. <rdar://10620138>
My change r146949 added register clobbers to the eh_sjlj_dispatchsetup pseudo
instruction, but on Thumb1 some of those registers cannot be used.  This
caused massive failures on the testsuite when compiling for Thumb1.  While
fixing that, I noticed that the eh_sjlj_setjmp instruction has a "nofp"
variant, and I realized that dispatchsetup needs the same thing, so I have
added that as well.

llvm-svn: 147204
2011-12-22 23:39:48 +00:00
Chad Rosier 00bbedff03 Fix 80-column violations.
llvm-svn: 147192
2011-12-22 22:35:21 +00:00
Jim Grosbach ea2319112f ARM VFP assembly parsing and encoding for VCVT(float <--> fixed point).
rdar://10558523

llvm-svn: 147189
2011-12-22 22:19:05 +00:00
Bob Wilson 268d2599e0 Add missing usesCustomInserter flag on Int_eh_sjlj_setjmp_nofp.
Noticed by inspection; I don't have a testcase for this.

llvm-svn: 147188
2011-12-22 22:12:44 +00:00
Jim Grosbach c4d8d2f155 Tidy up. Use predicate function a bit more liberally.
llvm-svn: 147184
2011-12-22 22:02:35 +00:00
Rafael Espindola 6ca42c5be3 Fix incorrect relocation generation. Patch by Kristof Beyls.
Fixes PR11214.

llvm-svn: 147180
2011-12-22 21:36:43 +00:00
Jim Grosbach f0d25117c6 ARM VFP add encoding of the bitcount to fixed-point<-->floating point. insns.
The value from the operands isn't right yet, but we weren't encoding it at
all previously. The parser needs to twiddle the values when building the
instruction.

Partial for: rdar://10558523

llvm-svn: 147170
2011-12-22 19:55:21 +00:00
Jim Grosbach b65dd04923 Remove some bogus comments.
llvm-svn: 147169
2011-12-22 19:45:01 +00:00
Jim Grosbach 489ed5929e ARM pre-UAL aliases. fcmp[sd].
llvm-svn: 147158
2011-12-22 19:20:45 +00:00
Rafael Espindola 250096233b Fix an incomplete refactoring of the ppc backend. Thanks to rdivacky for reporting
it. It does need some some tests...

llvm-svn: 147154
2011-12-22 18:38:06 +00:00
Jim Grosbach 12ccf45bbb ARM assembler should accept shift-by-zero for any shifted-immediate operand.
Just treat it as-if the shift wasn't there at all. 'as' compatibility.

rdar://10604767

llvm-svn: 147153
2011-12-22 18:04:04 +00:00
Jim Grosbach 21488b8839 ARM assembly parser canonicallize on 'lsl' for shift-by-zero form.
llvm-svn: 147152
2011-12-22 17:37:00 +00:00
Jim Grosbach 3794d82af5 Tidy up. Trailing whitespace.
llvm-svn: 147151
2011-12-22 17:17:10 +00:00
Jim Grosbach 62bffd8827 Nuke invalid comment from copy/paste.
llvm-svn: 147150
2011-12-22 17:04:50 +00:00
Rafael Espindola 1dc45d8df4 Move the Mips only bits of the ELF writer to lib/Target/Mips.
llvm-svn: 147133
2011-12-22 03:03:17 +00:00
Rafael Espindola 84d00f11cd Make the virtual methods in ARMELFObjectWriter public.
llvm-svn: 147132
2011-12-22 02:58:12 +00:00
Rafael Espindola cc369ac0a2 Move the MBlaze ELF writer bits to lib/Target/MBlaze.
llvm-svn: 147129
2011-12-22 02:28:24 +00:00
Rafael Espindola 428b9ee036 Fix cmake.
llvm-svn: 147126
2011-12-22 02:06:17 +00:00
Rafael Espindola 38a400df3b Move PPC bits to lib/Target/PowerPC.
llvm-svn: 147124
2011-12-22 01:57:09 +00:00
Rafael Espindola 2da9777cef Hopefully fix the cmake build.
llvm-svn: 147121
2011-12-22 01:11:01 +00:00
Rafael Espindola 4449b21294 Fix name in comments.
llvm-svn: 147119
2011-12-22 01:06:53 +00:00
Akira Hatanaka e2eed9649e Local dynamic TLS model for direct object output. Create the correct TLS MIPS
ELF relocations.

Patch by Jack Carter.

llvm-svn: 147118
2011-12-22 01:05:17 +00:00
Richard Smith 32a756b7ce Unbreak cmake build after r147115.
llvm-svn: 147117
2011-12-22 01:03:35 +00:00
Rafael Espindola a0124055b1 Move the ARM specific parts of the ELF writer to Target/ARM.
llvm-svn: 147115
2011-12-22 00:37:50 +00:00
Jim Grosbach 2b80dad572 ARM NEON mnemonic aliase for vrecpeq.
llvm-svn: 147109
2011-12-21 23:52:37 +00:00
Jim Grosbach 7869d8c01e ARM VFP optional data type on VMOV GPR<-->SPR.
llvm-svn: 147104
2011-12-21 23:24:15 +00:00
Jim Grosbach 260b4b336a ARM NEON optional data type on VSWP instructions.
llvm-svn: 147103
2011-12-21 23:09:28 +00:00
Jim Grosbach a50e24fcb3 ARM NEON mnemonic aliases for vzipq and vswpq.
llvm-svn: 147102
2011-12-21 23:04:33 +00:00
Jim Grosbach 1152cc0cad ARM asm parser should be more lenient w/ .thumb_func directive.
Rather than require the symbol to be explicitly an argument of the directive,
allow it to look ahead and grab the symbol from the next non-whitespace
line.

rdar://10611140

llvm-svn: 147100
2011-12-21 22:30:16 +00:00
Jim Grosbach 8c59bbc1ed Thumb2 assembly parsing of 'mov rd, rn, rrx'.
Maps to the RRX instruction. Missed this case earlier.

rdar://10615373

llvm-svn: 147096
2011-12-21 21:04:19 +00:00
Chad Rosier 3172488cc0 Fix 80-column violations.
llvm-svn: 147095
2011-12-21 20:59:09 +00:00
Jim Grosbach b3ef713e44 Thumb2 assembly parsing of 'mov(register shifted register)' aliases.
These map to the ASR, LSR, LSL, ROR instruction definitions.

rdar://10615373

llvm-svn: 147094
2011-12-21 20:54:00 +00:00
Jakob Stoklund Olesen 3588a43e3a Move common code into an MRI function.
llvm-svn: 147071
2011-12-21 19:50:05 +00:00
Jim Grosbach c80a264386 ARM NEON assmebly parsing for VLD2 to all lanes instructions.
llvm-svn: 147069
2011-12-21 19:40:55 +00:00
Chad Rosier 3ede414127 No case stmt for BUILD_VECTOR in PerformDAGCombine(), so I assume this isn't
necessary.  Please chime in if I'm mistaken.

llvm-svn: 147065
2011-12-21 19:14:52 +00:00
Chad Rosier 7248bda595 Fix a couple of copy-n-paste bugs. Noticed by George Russell!
llvm-svn: 147064
2011-12-21 18:56:22 +00:00
Rafael Espindola b264d33854 Move the X86 specific bits of the ELF writer to the Target/X86 directory.
Other targets will follow shortly.

llvm-svn: 147060
2011-12-21 17:30:17 +00:00
Rafael Espindola 1ad4095d6b Reduce the exposure of Triple::OSType in the ELF object writer. This will
avoid including ADT/Triple.h in many places when the target specific bits are
moved.

llvm-svn: 147059
2011-12-21 17:00:36 +00:00
Craig Topper b8b1b4c1de Remove mode specific disassembler classes and just call X86GenericDisassembler constructor with appropriate argument in the creation functions. This removes a few tables that needed to be anchored.
llvm-svn: 147046
2011-12-21 08:06:52 +00:00
Craig Topper f30188418b Fix typo in a couple comments
llvm-svn: 147045
2011-12-21 06:30:53 +00:00
Evan Cheng dc8a1aaea6 Fix a couple of copy-n-paste bugs. Noticed by George Russell.
llvm-svn: 147032
2011-12-21 03:04:10 +00:00
Jim Grosbach 7de7ab83fa ARM assembly parsing allows constant expressions for lane indices.
llvm-svn: 147028
2011-12-21 01:19:23 +00:00
Jim Grosbach c5af54ec89 ARM NEON VLD2 assembly parsing for structure to all lanes, non-writeback.
llvm-svn: 147025
2011-12-21 00:38:54 +00:00
Akira Hatanaka 964c891e61 Fix bug in zero-store peephole pattern reported in pr11615.
The patch and test case were originally written by Mans Rullgard.

llvm-svn: 147024
2011-12-21 00:31:10 +00:00
Akira Hatanaka 1d8efaba7e Expand 64-bit CTLZ nodes if target architecture does not support it. Add test
case for DCLO and DCLZ.

llvm-svn: 147022
2011-12-21 00:20:27 +00:00
Akira Hatanaka 410ce9cb44 Expand 64-bit CTPOP and CTTZ.
llvm-svn: 147021
2011-12-21 00:14:05 +00:00
Akira Hatanaka 91c052c4d8 Expand 64-bit atomic load and store.
llvm-svn: 147019
2011-12-21 00:02:58 +00:00
Akira Hatanaka 4706ac9715 Add definition of DSBH (Double Swap Bytes within Halfwords) and
DSHD (Double Swap Halfwords within Doublewords). Add a pattern which replaces
64-bit bswap with a DSBH and DSHD pair.

llvm-svn: 147017
2011-12-20 23:56:43 +00:00
Akira Hatanaka 43c1ff4db3 Add definition of WSBH (Word Swap Bytes within Halfwords), which is an
instruction supported by mips32r2, and add a pattern which replaces bswap with
a ROTR and WSBH pair.
 
WSBW is removed since it is not an instruction the current architectures
support.

llvm-svn: 147015
2011-12-20 23:47:44 +00:00
Akira Hatanaka 79aed157e7 64-bit uint-fp conversion nodes are expanded.
llvm-svn: 147014
2011-12-20 23:40:56 +00:00
Akira Hatanaka 2bb8d068f5 Enable custom lowering DYNAMIC_STACKALLOC nodes.
llvm-svn: 147013
2011-12-20 23:35:46 +00:00
Akira Hatanaka 8e2c02e2d6 Set the correct stack pointer register that should be saved or restored.
llvm-svn: 147012
2011-12-20 23:28:36 +00:00
Jim Grosbach cd22e4a81e ARM .req register name aliases are case insensitive, just like regnames.
llvm-svn: 147009
2011-12-20 23:11:00 +00:00
Akira Hatanaka cb2a85bc22 Add function MipsDAGToDAGISel::SelectMULT and factor out code that generates
nodes needed for multiplication. Add code for selecting 64-bit MULHS and MULHU
nodes.

llvm-svn: 147008
2011-12-20 23:10:57 +00:00
Akira Hatanaka 2c8d1734f8 Fix indentation.
llvm-svn: 147007
2011-12-20 22:58:01 +00:00
Akira Hatanaka cf10f08825 64-bit data directive.
llvm-svn: 147005
2011-12-20 22:52:19 +00:00
Akira Hatanaka 494fdf1499 32-to-64-bit sext_inreg pattern.
llvm-svn: 147004
2011-12-20 22:40:40 +00:00
Akira Hatanaka 8756816e6f Add 64-bit extload patterns.
llvm-svn: 147003
2011-12-20 22:36:08 +00:00
Akira Hatanaka 0cee2045c9 Add patterns for matching extloads with 64-bit address. The patterns are enabled
only when the target ABI is N64.

llvm-svn: 147001
2011-12-20 22:33:53 +00:00
Jim Grosbach 4eda145c7f Move comment to appropriate place.
llvm-svn: 147000
2011-12-20 22:26:38 +00:00
Akira Hatanaka dac1d48d8d Add code in MipsDAGToDAGISel for selecting constant +0.0.
MIPS64 can generate constant +0.0 with a single DMTC1 instruction. 

llvm-svn: 146999
2011-12-20 22:25:50 +00:00
Jakob Stoklund Olesen b95c102c2f Heed spill slot alignment on ARM.
Use the spill slot alignment as well as the local variable alignment to
determine when the stack needs to be realigned. This works now that the
ARM target can always realign the stack by using a base pointer.

Still respect the ARMBaseRegisterInfo::canRealignStack() function
vetoing a realigned stack.  Don't use aligned spill code in that case.

llvm-svn: 146997
2011-12-20 22:15:04 +00:00
Akira Hatanaka 14468c6cb6 Revert part of r146995 that was accidentally commmitted.
llvm-svn: 146996
2011-12-20 22:09:36 +00:00
Akira Hatanaka 4e210691c0 32-to-64-bit sign extension pattern.
llvm-svn: 146995
2011-12-20 22:06:20 +00:00
Akira Hatanaka 9b9bd1cc15 Add a pattern for matching zero-store with 64-bit address. The pattern is enabled
only when the target ABI is N64. 

llvm-svn: 146992
2011-12-20 21:50:49 +00:00
Jim Grosbach 2c59052984 ARM assembly parsing and encoding for VST2 single-element, double spaced.
llvm-svn: 146990
2011-12-20 20:46:29 +00:00
Jim Grosbach 75e2ab5db2 ARM assembly parsing and encoding for VLD2 single-element, double spaced.
llvm-svn: 146983
2011-12-20 19:21:26 +00:00
Evan Cheng 68132d8093 ARM target code clean up. Check for iOS, not Darwin where it makes sense.
llvm-svn: 146981
2011-12-20 18:26:50 +00:00
Jason W Kim 135d244b56 First steps in ARM AsmParser support for .eabi_attribute and .arch
(Both used for Linux gnueabi)
No behavioral change yet (no tests need so far)

llvm-svn: 146977
2011-12-20 17:38:12 +00:00
Elena Demikhovsky ec7e6e0946 This is the second fix related to VZEXT_MOVL node.
The failure that I see in the current version is:

LLVM ERROR: Cannot select: 0x18b8f70: v4i64 = X86ISD::VZEXT_MOVL 0x18beee0 [ID=14]
  0x18beee0: v4i64 = insert_subvector 0x18b8c70, 0x18b9170, 0x18b9570 [ID=13]
    0x18b8c70: v4i64 = insert_subvector 0x18b9870, 0x18bf4e0, 0x18b9970 [ID=12]
      0x18b9870: v4i64 = undef [ID=4]
      0x18bf4e0: v2i64 = bitcast 0x18bf3e0 [ID=10]
        0x18bf3e0: v4i32 = BUILD_VECTOR 0x18b9770, 0x18b9770, 0x18b9770, 0x18b9770 [ID=8]
          0x18b9770: i32 = TargetConstant<0> [ID=6]
          0x18b9770: i32 = TargetConstant<0> [ID=6]
          0x18b9770: i32 = TargetConstant<0> [ID=6]
          0x18b9770: i32 = TargetConstant<0> [ID=6]
      0x18b9970: i32 = Constant<0> [ID=3]
    0x18b9170: v2i64 = undef [ORD=1] [ID=1]
    0x18b9570: i32 = Constant<2> [ID=5]

llvm-svn: 146975
2011-12-20 13:34:28 +00:00
Chandler Carruth 24680c24d8 Begin teaching the X86 target how to efficiently codegen patterns that
use the zero-undefined variants of CTTZ and CTLZ. These are just simple
patterns for now, there is more to be done to make real world code using
these constructs be optimized and codegen'ed properly on X86.

The existing tests are spiffed up to check that we no longer generate
unnecessary cmov instructions, and that we generate the very important
'xor' to transform bsr which counts the index of the most significant
one bit to the number of leading (most significant) zero bits. Also they
now check that when the variant with defined zero result is used, the
cmov is still produced.

llvm-svn: 146974
2011-12-20 11:19:37 +00:00
Chandler Carruth e805b16e3d Fix up the CMake build for the new files added in r146960, they're
likely to stay either way that discussion ends up resolving itself.

llvm-svn: 146966
2011-12-20 08:42:11 +00:00
David Blaikie a379b18173 Unweaken vtables as per http://llvm.org/docs/CodingStandards.html#ll_virtual_anch
llvm-svn: 146960
2011-12-20 02:50:00 +00:00
Bob Wilson 75f12cc3fe Mark ARM eh_sjlj_dispatchsetup as clobbering all registers. Radar 10567930.
We used to rely on the *eh_sjlj_setjmp instructions to mark that a function
with setjmp/longjmp exception handling clobbers all the registers.  But with
the recent reorganization of ARM EH, those eh_sjlj_setjmp instructions are
expanded away earlier, before PEI can see them to determine what registers to
save and restore.  Mark the dispatchsetup instruction in the same way, since
that instruction cannot be expanded early.  This also more accurately reflects
when the registers are clobbered.

llvm-svn: 146949
2011-12-20 01:29:27 +00:00
Jim Grosbach e2ca9e5b5f ARM assembly shifts by zero should be plain 'mov' instructions.
"mov r1, r2, lsl #0" should assemble as "mov r1, r2" even though it's
not strictly legal UAL syntax. It's a common extension and the friendly
thing to do.

rdar://10604663

llvm-svn: 146937
2011-12-20 00:59:38 +00:00
Dan Gohman 94580ab375 Add basic generic CodeGen support for half.
llvm-svn: 146927
2011-12-20 00:02:33 +00:00
Jim Grosbach 045b6c71a6 ARM NEON assembly aliases for VMOV<-->VMVN for i32 immediates.
e.g., "vmov.i32 d4, #-118" can be assembled as "vmvn.i32 d4, #117"

rdar://10603913

llvm-svn: 146925
2011-12-19 23:51:07 +00:00
Jim Grosbach 8648c10184 ARM assembly parsing and encoding support for LDRD(label).
rdar://9932658

llvm-svn: 146921
2011-12-19 23:06:24 +00:00
Akira Hatanaka db47e0c49d Add patterns for matching immediates whose lower 16-bit is cleared. These
patterns emit a single LUi instruction instead of a pair of LUi and ORi.

llvm-svn: 146900
2011-12-19 20:21:18 +00:00
Akira Hatanaka 9e1d369e3c Tidy up. Simplify logic. No functional change intended.
llvm-svn: 146896
2011-12-19 19:52:25 +00:00
Jim Grosbach 64f4de29e0 ARM NEON two-operand aliases for VPADD.
rdar://10602276

llvm-svn: 146895
2011-12-19 19:51:03 +00:00
Akira Hatanaka 2a232d81f6 Remove definitions of double word shift plus 32 instructions. Assembler or
direct-object emitter should emit the appropriate shift instruction depending
on the shift amount.

llvm-svn: 146893
2011-12-19 19:44:09 +00:00
Jim Grosbach e16acacc3a ARM VFP pre-UAL mnemonic aliases for fmul[sd].
llvm-svn: 146892
2011-12-19 19:43:50 +00:00
Akira Hatanaka c4db30e358 Remove unused predicate.
llvm-svn: 146889
2011-12-19 19:32:20 +00:00
Akira Hatanaka 3c9f336361 Remove the restriction on the first operand of the add node in SelectAddr.
This change reduces the number of instructions generated.

For example, 
(load (add (sub $n0, $n1), (MipsLo got(s))))

results in the following sequence of instructions:
1. sub $n2, $n0, $n1
2. lw got(s)($n2)

Previously, three instructions were needed.
1. sub $n2, $n0, $n1
2. addiu $n3, $n2, got(s)
3. lw 0($n3)

llvm-svn: 146888
2011-12-19 19:28:37 +00:00
Jim Grosbach 92a939ae73 ARM VFP pre-UAL mnemonic aliases for fcpy[sd] and fdiv[sd].
llvm-svn: 146887
2011-12-19 19:02:41 +00:00
Jim Grosbach 9ae4fc035b ARM NEON implied destination aliases for VMAX/VMIN.
llvm-svn: 146885
2011-12-19 18:57:38 +00:00
Jim Grosbach cef98cddbe ARM NEON relax parse time diagnostics for alignment specifiers.
There's more variation that we need to handle. Error checking will need
to be on operand predicates.

llvm-svn: 146884
2011-12-19 18:31:43 +00:00
Jim Grosbach a7d2421603 Tidy up.
llvm-svn: 146882
2011-12-19 18:11:17 +00:00
Jakob Stoklund Olesen 24159e346d Remove a register class that can just as well be synthesized.
Add the new TableGen register class synthesizer feature to the release
notes.

llvm-svn: 146875
2011-12-19 16:53:40 +00:00
Jakob Stoklund Olesen c7b437ae34 Emit a getMatchingSuperRegClass() implementation for every target.
Use information computed while inferring new register classes to emit
accurate, table-driven implementations of getMatchingSuperRegClass().

Delete the old manual, error-prone implementations in the targets.

llvm-svn: 146873
2011-12-19 16:53:34 +00:00
Benjamin Kramer 1b54835a10 Another variadics tweak.
llvm-svn: 146852
2011-12-18 20:51:31 +00:00
Benjamin Kramer 530b820500 Use the fancy new VariadicFunction template instead of a plain variadic function.
Some compilers were complaining about passing StringRef to it.

llvm-svn: 146850
2011-12-18 19:59:20 +00:00
Benjamin Kramer 32481916eb Hexagon: Remove unused variables.
llvm-svn: 146846
2011-12-18 12:00:09 +00:00
Craig Topper a913dde0ef Remove an unused X86ISD node type.
llvm-svn: 146833
2011-12-17 19:16:44 +00:00
Benjamin Kramer 792edd3c75 X86: Factor the bswap asm matching to be slightly less horrible to read.
llvm-svn: 146831
2011-12-17 14:36:05 +00:00
Evan Cheng 903231bc58 Fix a CPSR liveness tracking bug introduced when I converted IT block to bundle.
llvm-svn: 146805
2011-12-17 01:25:34 +00:00
Rafael Espindola d3df3d3527 Add back the MC bits of 126425. Original patch by Nathan Jeffords. I added the
asm parsing and testcase.

llvm-svn: 146801
2011-12-17 01:14:52 +00:00
Lang Hames da07b3ad42 Make sure that the lower bits on the VSELECT condition are properly set.
llvm-svn: 146800
2011-12-17 01:08:46 +00:00
Jakob Stoklund Olesen 465cdf3ba4 Preserve more memory operands in ARMExpandPseudo.
I don't think this affects anything but verbose assembly.

llvm-svn: 146787
2011-12-17 00:07:02 +00:00
Jakob Stoklund Olesen 9790187b6c Fix off-by-one error in bucket sort.
The bad sorting caused a misaligned basic block when building 176.vpr in
ARM mode.

<rdar://problem/10594653>

llvm-svn: 146767
2011-12-16 23:00:05 +00:00
Jakob Stoklund Olesen 5af144809e Don't adjust for alignment padding in OffsetIsInRange.
This adjustment is already included in the block offsets computed by
BasicBlockInfo, and adjusting again here can cause the pass to loop.

When CreateNewWater splits a basic block, OffsetIsInRange would reject
the new CPE on the next pass because of the too conservative alignment
adjustment. This caused the block to be split again, and so on.

llvm-svn: 146751
2011-12-16 19:10:00 +00:00
Benjamin Kramer 9ca2e7293b Hexagon: Fix a nasty order-of-initialization bug.
Reenable the tests.

llvm-svn: 146750
2011-12-16 19:08:59 +00:00
Jakob Stoklund Olesen 2a05f691ab Note ARM constant island alignment in the release notes.
The command line option should be removed, but not until the feature has
gotten a lot of testing. The ARMConstantIslandPass tends to have subtle
bugs that only show up after a while.

llvm-svn: 146739
2011-12-16 16:07:41 +00:00
Craig Topper a4d411cb1b Don't try to match 'unpackl/h v, v' for 32xi8 and 16xi16 when only AVX1 is supported. Fix 'unpackh v, v' for 256-bit types to understand 128-bit lanes.
llvm-svn: 146726
2011-12-16 08:06:31 +00:00
NAKAMURA Takumi 93d990bd61 Target/Hexagon: Fix CMake build.
llvm-svn: 146724
2011-12-16 06:21:02 +00:00
Jim Grosbach 4a29971f02 ARM NEON aliases for vmovq.f*
llvm-svn: 146714
2011-12-16 00:12:22 +00:00
Jim Grosbach 66886253a7 Thumb2 ADR assembly parsing w/o the .w suffix.
llvm-svn: 146710
2011-12-15 23:52:17 +00:00
Eli Friedman 64944090ff Make sure we correctly note the existence of an i8 immediate for vblendvps and friends, so we compute fixups correctly. PR11586.
llvm-svn: 146709
2011-12-15 23:46:18 +00:00
Nick Lewycky c9e935c7e2 Move parts of lib/Target that use CodeGen into lib/CodeGen.
llvm-svn: 146702
2011-12-15 22:58:58 +00:00
Eli Friedman c9bf1b1bff Make check a bit more strict so we don't call ARM_AM::getFP32Imm with a value that isn't a 32-bit value. (This is just to be safe; I don't think this actually causes any issues in practice.)
llvm-svn: 146700
2011-12-15 22:56:53 +00:00
Jim Grosbach a47294e24d ARM NEON VCLE is an alias for VCGE w/ the source operands reversed.
llvm-svn: 146699
2011-12-15 22:56:33 +00:00
Tony Linthicum b3705e0b9e Add MCTargetDesc library to Hexagon target
llvm-svn: 146692
2011-12-15 22:29:08 +00:00
Jim Grosbach 4a5c887370 ARM NEON VTBL/VTBX assembly parsing and encoding.
llvm-svn: 146691
2011-12-15 22:27:11 +00:00
Jakob Stoklund Olesen cba8e8c3e0 Enable proper constant island alignment by default.
The code size increase is tiny (< 0.05%) because so little code uses
16-byte constant pool entries.

llvm-svn: 146690
2011-12-15 22:14:45 +00:00
Chad Rosier 41dbf59e12 Add missing zmovl AVX patterns which were causing crashes.
Patch by Elena Demikhovsky <elena.demikhovsky@intel.com>!

llvm-svn: 146689
2011-12-15 22:11:31 +00:00
Jim Grosbach c2f16a3499 Silence warning.
llvm-svn: 146686
2011-12-15 21:54:55 +00:00
Jim Grosbach 2f50e92f40 ARM NEON two-register double spaced register list parsing support.
llvm-svn: 146685
2011-12-15 21:44:33 +00:00
Chad Rosier 75ed9dcbc6 Fix assert in LowerBUILD_VECTOR for v16i16 type on AVX.
Patch by Elena Demikhovsky <elena.demikhovsky@intel.com>!

llvm-svn: 146684
2011-12-15 21:34:44 +00:00
Lang Hames c44b5e469b Fix VSELECT operand order. Was previously backwards, causing bogus vector shift results - <rdar://problem/10559581>.
llvm-svn: 146671
2011-12-15 18:57:27 +00:00
Hal Finkel 9dd3f62b38 Ensure that the nop that should follow a bl call in PPC64 ELF actually does
llvm-svn: 146664
2011-12-15 17:54:01 +00:00
Richard Osborne 275e874c67 Pass optLevel to XCoreDAGToDAGISel.
Patch by Kyriakos Georgiou.

llvm-svn: 146656
2011-12-15 15:18:35 +00:00
Chad Rosier b7a0b89ff0 Use SmallVector/assign(), rather than std::vector/push_back().
llvm-svn: 146627
2011-12-15 01:16:09 +00:00
Chad Rosier 1940baa76b Add support for lowering fneg when AVX is enabled.
rdar://10566486

llvm-svn: 146625
2011-12-15 01:02:25 +00:00
Bill Wendling ae94fb4009 The saved registers weren't being processed in the correct order. This lead to
the compact unwind claiming that one register was saved before another, which
isn't all that great in general. Process them in the natural order. Reverse the
list only when necessary for the algorithm.

llvm-svn: 146612
2011-12-14 23:53:24 +00:00
Jakob Stoklund Olesen 9efd7ebf0a Consider CPE alignment in CreateNewWater().
An aligned constant pool entry may require extra alignment padding where
the new water is created.  Take that into account when computing offset.

Also consider the alignment of other constant pool entries when
splitting a basic block.  Alignment padding may make it necessary to
move the split point higher.

llvm-svn: 146609
2011-12-14 23:48:54 +00:00
Jim Grosbach da51104282 ARM NEON better assembly operand range checking for lane indices of VLD/VST.
llvm-svn: 146608
2011-12-14 23:35:06 +00:00
Jim Grosbach a8aa30b620 ARM NEON VLD2/VST2 lane indexed assembly parsing and encoding.
llvm-svn: 146605
2011-12-14 23:25:46 +00:00
Jim Grosbach bb18fb4f52 ARM NEON fix alignment encoding for VST2 w/ writeback.
Add tests for w/ writeback instruction parsing and encoding.

llvm-svn: 146594
2011-12-14 21:49:24 +00:00
Jim Grosbach 8e987f5e25 Nuke old code. Missed in last commit.
llvm-svn: 146590
2011-12-14 21:41:32 +00:00
Jim Grosbach 88ac761aa4 ARM NEON refactor VST2 w/ writeback instructions.
In addition to improving the representation, this adds support for assembly
parsing of these instructions.

llvm-svn: 146588
2011-12-14 21:32:11 +00:00
Jim Grosbach b7ec06c5c9 ARM NEON improve factoring a bit. No functional change.
llvm-svn: 146585
2011-12-14 20:59:15 +00:00
Evan Cheng da103bf9ec Model ARM predicated write as read-mod-write. e.g.
r0 = mov #0
r0 = moveq #1

Then the second instruction has an implicit data dependency on the first
instruction. Sadly I have yet to come up with a small test case that
demonstrate the post-ra scheduler taking advantage of this.

llvm-svn: 146583
2011-12-14 20:00:08 +00:00
Jim Grosbach 8d24618975 ARM NEON VST2 assembly parsing and encoding.
Work in progress. Parsing for non-writeback, single spaced register lists
works now. The rest have the representations better factored, but still
need more to be able to parse properly.

llvm-svn: 146579
2011-12-14 19:35:22 +00:00
Jakob Stoklund Olesen e5585e8fed Fix speling and 80-col.
llvm-svn: 146575
2011-12-14 18:49:13 +00:00
Akira Hatanaka bff84e1914 Add support for local dynamic TLS model in LowerGlobalTLSAddress. Direct object
emission is not supported yet, but a patch that adds the support should follow
soon.

llvm-svn: 146572
2011-12-14 18:26:41 +00:00
Jim Grosbach 4288b9786f Fix copy/pasto that skipped the 'modify' step.
llvm-svn: 146571
2011-12-14 18:12:37 +00:00
Jim Grosbach 1bb6e066f6 ARM/Thumb2 mov vs. mvn alias goes both ways.
llvm-svn: 146570
2011-12-14 17:56:51 +00:00
Chad Rosier ded6160473 VFP2 is required for FP loads. Noticed by inspection.
llvm-svn: 146569
2011-12-14 17:55:03 +00:00
Chad Rosier fce28914ea Tidy up.
llvm-svn: 146568
2011-12-14 17:32:02 +00:00
Jim Grosbach a342667fd0 ARM/Thumb2 'cmp rn, #imm' alias to cmn.
When 'cmp rn #imm' doesn't match due to the immediate not being representable,
but 'cmn rn, #-imm' does match, use the latter in place of the former, as
it's equivalent.

rdar://10552389

llvm-svn: 146567
2011-12-14 17:30:24 +00:00
Chad Rosier a26979be29 Fix 80-column violation and extraneous brackets.
llvm-svn: 146566
2011-12-14 17:26:05 +00:00
Jim Grosbach ab5830e51b ARM assembler support for the target-specific .req directive.
rdar://10549683

llvm-svn: 146543
2011-12-14 02:16:11 +00:00
Evan Cheng 7fae11b231 - Add MachineInstrBundle.h and MachineInstrBundle.cpp. This includes a function
to finalize MI bundles (i.e. add BUNDLE instruction and computing register def
  and use lists of the BUNDLE instruction) and a pass to unpack bundles.
- Teach more of MachineBasic and MachineInstr methods to be bundle aware.
- Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to
  prevent IT blocks from being broken apart.

llvm-svn: 146542
2011-12-14 02:11:42 +00:00
Jim Grosbach 485e5622f4 Thumb2 assembler aliases for "mov(shifted register)"
rdar://10549767

llvm-svn: 146520
2011-12-13 22:45:11 +00:00
Jim Grosbach 18bf363078 ARM LDM/STM system instruction variants.
rdar://10550269

llvm-svn: 146519
2011-12-13 21:48:29 +00:00
Jim Grosbach 6eb142a616 Thumb2 pre/post indexed stores can be from any non-PC GPR.
rdar://10549786

llvm-svn: 146518
2011-12-13 21:10:25 +00:00
Jim Grosbach 5ac89675a0 Thumb2 tweak for ccout handling in RSB parsing.
llvm-svn: 146516
2011-12-13 21:06:41 +00:00
Jim Grosbach 1f1a3598c2 ARM thumb2 parsing of "rsb rd, rn, #0".
rdar://10549741

llvm-svn: 146515
2011-12-13 20:50:38 +00:00
Jim Grosbach 4b0844e191 ARM NEON two-operand aliases for VQDMULH.
llvm-svn: 146514
2011-12-13 20:40:37 +00:00
Jim Grosbach 561e4e18cf ARM pre-UAL NEG mnemonic for convenience when porting old code.
llvm-svn: 146511
2011-12-13 20:23:22 +00:00
Jim Grosbach 2a2348e6c2 ARM add some more pre-UAL VFP mnemonics for convenience when porting old code.
llvm-svn: 146508
2011-12-13 20:13:48 +00:00
Jim Grosbach 9227f39c53 ARM add more 'gas' compatibility aliases for NEON instructions.
llvm-svn: 146507
2011-12-13 20:08:32 +00:00
Chad Rosier 563de603f7 [fast-isel] Unaligned loads of floats are not supported. Therefore, convert to a regular
load and then move the result from a GPR to a FPR.

llvm-svn: 146502
2011-12-13 19:22:14 +00:00
Akira Hatanaka 5e9d16cb53 Expand .cprestore directive to multiple instructions if the offset does not fit
in a 16-bit field.

llvm-svn: 146469
2011-12-13 03:09:05 +00:00
Chandler Carruth 637cc6a8aa Initial CodeGen support for CTTZ/CTLZ where a zero input produces an
undefined result. This adds new ISD nodes for the new semantics,
selecting them when the LLVM intrinsic indicates that the undef behavior
is desired. The new nodes expand trivially to the old nodes, so targets
don't actually need to do anything to support these new nodes besides
indicating that they should be expanded. I've done this for all the
operand types that I could figure out for all the targets. Owners of
various targets, please review and let me know if any of these are
incorrect.

Note that the expand behavior is *conservatively correct*, and exactly
matches LLVM's current behavior with these operations. Ideally this
patch will not change behavior in any way. For example the regtest suite
finds the exact same instruction sequences coming out of the code
generator. That's why there are no new tests here -- all of this is
being exercised by the existing test suite.

Thanks to Duncan Sands for reviewing the various bits of this patch and
helping me get the wrinkles ironed out with expanding for each target.
Also thanks to Chris for clarifying through all the discussions that
this is indeed the approach he was looking for. That said, there are
likely still rough spots. Further review much appreciated.

llvm-svn: 146466
2011-12-13 01:56:10 +00:00
Jakob Stoklund Olesen bfa576fe8e Account for CPE alignment when searching for new water.
Constant pool entries with different alignment may cause more alignment
padding to be inserted. Compute the amount of padding needed, and try to
pick the location that requires the least amount of padding.

Also take the extra padding into account when the water is above the
use.

llvm-svn: 146458
2011-12-13 00:44:30 +00:00
NAKAMURA Takumi 4ea3c8f54a Target/Hexagon: Fix CMake build. We don't use add_llvm_library_dependencies().
llvm-svn: 146457
2011-12-13 00:36:04 +00:00
Daniel Dunbar 8889bb08b8 LLVMBuild: Introduce a common section which currently has a list of the
subdirectories to traverse into.
 - Originally I wanted to avoid this and just autoscan, but this has one key
   flaw in that new subdirectories can not automatically trigger a rerun of the
   llvm-build tool. This is particularly a pain when switching back and forth
   between trees where one has added a subdirectory, as the dependencies will
   tend to be wrong. This will also eliminates FIXME implicitly.

llvm-svn: 146436
2011-12-12 22:45:54 +00:00
Akira Hatanaka 5d5e0d819d Emit B (unconditional branch) when -relocation-model=pic and J (jump) when
-relocation-model=static.

llvm-svn: 146432
2011-12-12 22:39:35 +00:00
Akira Hatanaka faa88c0add Fix indentation.
llvm-svn: 146431
2011-12-12 22:38:19 +00:00
Tony Linthicum 36e0519ca2 fix warning
llvm-svn: 146420
2011-12-12 21:52:59 +00:00
Bob Wilson fadc2c83e5 Implement 'e' and 'f' modifiers for Neon inline asm. <rdar://problem/10551006>
These modifiers simply select either the low or high D subregister of a Neon
Q register.  I've also removed the unimplemented 'p' modifier, which turns out
to be a bit different than the comment here suggests and as far as I can tell
was only intended for internal use in Apple's version of gcc.

llvm-svn: 146417
2011-12-12 21:45:15 +00:00
Tony Linthicum 1213a7a57f Hexagon backend support
llvm-svn: 146412
2011-12-12 21:14:40 +00:00
Daniel Dunbar 27a7489a03 LLVMBuild: Remove trailing newline, which irked me.
llvm-svn: 146409
2011-12-12 19:48:00 +00:00
Jan Sjödin 7c0face455 XOP instructions and encoding tests.
llvm-svn: 146407
2011-12-12 19:37:49 +00:00
Jakob Stoklund Olesen 91a7bcbb9b Add a postOffset() alignment argument.
This computes the offset of the layout sucessor block, considering its
alignment as well.

llvm-svn: 146401
2011-12-12 19:25:54 +00:00
Jakob Stoklund Olesen 0863de458d Fix typo.
llvm-svn: 146400
2011-12-12 19:25:51 +00:00