llvm-project

Commit Graph

Author	SHA1	Message	Date
Chad Rosier	d6641af80c	With the fix in r138164: "Add <imp-def> operands to QQ and QQQQ stack loads." -verify-machineinstrs can be enabled for this test case. llvm-svn: 138171	2011-08-20 00:34:45 +00:00
Chad Rosier	be7625161e	VMOVQQQQs pseudo instructions are only created by ARMBaseInstrInfo::copyPhysReg. Therefore, rather than generate a pseudo instruction, which is later expanded, generate the necessary instructions in place. llvm-svn: 138163	2011-08-20 00:17:25 +00:00
Devang Patel	59e27c5f12	Do not use named md nodes to track variables that are completely optimized. This does not scale while doing LTO with debug info. New approach is to include list of variables in the subprogram info directly. llvm-svn: 138145	2011-08-19 23:28:12 +00:00
Jim Grosbach	8d77bb5f06	Use regex to remove false dependencies on register allocation. llvm-svn: 138137	2011-08-19 23:10:31 +00:00
Jim Grosbach	066e9ec1e4	Update tests. llvm-svn: 138116	2011-08-19 22:19:48 +00:00
Jakob Stoklund Olesen	90b6018c8f	Add test case for r138018. llvm-svn: 138033	2011-08-19 04:30:24 +00:00
Akira Hatanaka	fb4161ae88	Use subword loads instead of a 4-byte load when the size of a structure (or a piece of it) that is being passed by value is smaller than a word. llvm-svn: 138007	2011-08-18 23:39:37 +00:00
Ivan Krasin	d7cbd4c518	FastISel: avoid function calls between the materialization of the constant and its use. llvm-svn: 137993	2011-08-18 22:06:10 +00:00
Jim Grosbach	90103ccc05	Thumb assembly parsing and encoding for LDM instruction. Fix base register type and canonicalize to the "ldm" spelling rather than "ldmia." Add diagnostics for incorrect writeback token and out-of-range registers. llvm-svn: 137986	2011-08-18 21:50:53 +00:00
Richard Osborne	56f3b70225	Add intrinsics for SETEV, GETED, GETET. llvm-svn: 137938	2011-08-18 13:00:48 +00:00
Bruno Cardoso Lopes	3c7d6eb64c	Cleanup vector logical ops in AVX and add use int versions for simple v2i64 llvm-svn: 137919	2011-08-18 02:11:34 +00:00
Bruno Cardoso Lopes	1a87fcb9ba	Fix PR10688. Add support for splitting 256-bit vector shifts when the shift amount is variable llvm-svn: 137885	2011-08-17 22:12:20 +00:00
Jim Grosbach	e2a0404a69	Thumb assembly parsing and encoding for ADR. llvm-svn: 137864	2011-08-17 20:37:40 +00:00
Bruno Cardoso Lopes	be5e987379	Introduce matching patterns for vbroadcast AVX instruction. The idea is to match splats in the form (splat (scalar_to_vector (load ...))) whenever the load can be folded. All the logic and instruction emission is working but because of PR8156, there are no ways to match loads, cause they can never be folded for splats. Thus, the tests are XFAILed, but I've tested and exercised all the logic using a relaxed version for checking the foldable loads, as if the bug was already fixed. This should work out of the box once PR8156 gets fixed since MayFoldLoad will work as expected. llvm-svn: 137810	2011-08-17 02:29:19 +00:00
Bruno Cardoso Lopes	3400825b41	Update test to not use the scalar type to splat from a load llvm-svn: 137809	2011-08-17 02:29:15 +00:00
Bruno Cardoso Lopes	ed786a346e	Now that we have a canonical way to handle 256-bit splats: vinsertf128 $1 + vpermilps $0, remove the old code that used to first do the splat in a 128-bit vector and then insert it into a larger one. This is better because the handling code gets simpler and also makes a better room for the upcoming vbroadcast! llvm-svn: 137807	2011-08-17 02:29:10 +00:00
Akira Hatanaka	5360f88355	Add support for ext and ins. llvm-svn: 137804	2011-08-17 02:05:42 +00:00
Bruno Cardoso Lopes	2e99f1b3aa	Instead of always leaving the work to the generic legalizer when there is no support for native 256-bit shuffles, be more smart in some cases, for example, when you can extract specific 128-bit parts and use regular 128-bit shuffles for them. Example: For this shuffle: shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32> <i32 1, i32 0, i32 7, i32 6> This was expanded to: vextractf128 $1, %ymm1, %xmm2 vpextrq $0, %xmm2, %rax vmovd %rax, %xmm1 vpextrq $1, %xmm2, %rax vmovd %rax, %xmm2 vpunpcklqdq %xmm1, %xmm2, %xmm1 vpextrq $0, %xmm0, %rax vmovd %rax, %xmm2 vpextrq $1, %xmm0, %rax vmovd %rax, %xmm0 vpunpcklqdq %xmm2, %xmm0, %xmm0 vinsertf128 $1, %xmm1, %ymm0, %ymm0 ret Now we get: vshufpd $1, %xmm0, %xmm0, %xmm0 vextractf128 $1, %ymm1, %xmm1 vshufpd $1, %xmm1, %xmm1, %xmm1 vinsertf128 $1, %xmm1, %ymm0, %ymm0 llvm-svn: 137733	2011-08-16 18:21:54 +00:00
Akira Hatanaka	7d7bec5acf	Add test case for r137711. llvm-svn: 137725	2011-08-16 17:32:01 +00:00
Akira Hatanaka	2263c10946	Fix handling of double precision loads and stores when Mips1 is targeted. Mips1 does not support double precision loads or stores, therefore two single precision loads or stores must be used in place of these instructions. This patch treats double precision loads and stores as if they are legal instructions until MCInstLowering, instead of generating the single precision instructions during instruction selection or Prolog/Epilog code insertion. Without the changes made in this patch, llc produces code that has the same problem described in r137484 or bails out when MipsInstrInfo::storeRegToStackSlot or loadRegFromStackSlot is called before register allocation. llvm-svn: 137711	2011-08-16 03:51:51 +00:00
Bruno Cardoso Lopes	cbe7feeab9	Fix PR10656. It's only profitable to use 128-bit inserts and extracts when AVX mode is on. Otherwise it is just more work for the type legalizer. llvm-svn: 137661	2011-08-15 21:45:54 +00:00
Eric Christopher	8c5f3f7624	Fix this test to avoid leaving a temporary file behind. llvm-svn: 137651	2011-08-15 20:55:03 +00:00
Bob Wilson	d1de7764be	Expand VMOVQQQQ pseudo instructions. Apparently we never added code to expand these pseudo instructions, and in over a year, no one has noticed. Our register allocator must be awesome! llvm-svn: 137551	2011-08-13 05:14:55 +00:00
Bruno Cardoso Lopes	f15dfe5818	The VPERM2F128 is an AVX instruction which permutes between two 256-bit vectors. It operates on 128-bit elements instead of regular scalar types. Recognize shuffles that are suitable for VPERM2F128 and teach the x86 legalizer how to handle them. llvm-svn: 137519	2011-08-12 21:48:26 +00:00
Akira Hatanaka	2fcc1cfdce	Define unaligned load and store. llvm-svn: 137515	2011-08-12 21:30:06 +00:00
Akira Hatanaka	2f6b944f56	Test case for 137484 llvm-svn: 137486	2011-08-12 18:12:06 +00:00
Akira Hatanaka	79d60d0e94	Enclose directive .cprestore with .set macro and nomacro to silence assembler warning. llvm-svn: 137378	2011-08-11 22:42:31 +00:00
Bruno Cardoso Lopes	8fbf023c9b	Add a dag combine to xform 256-bit shuffles into simple vector inserts and extracts. This simple combine makes us generate only 1 instruction instead of 11 in the v8 case. llvm-svn: 137362	2011-08-11 21:50:44 +00:00
Bruno Cardoso Lopes	9eb3762e08	Fix the test added by Nadav in r137308. Make it more strict: 1) check for the "v" version of movaps 2) add a couple of CHECK-NOT to guarantee the behavior 3) move to a more appropriate test file llvm-svn: 137361	2011-08-11 21:50:35 +00:00
Bruno Cardoso Lopes	043c820800	Fix PR10492 by teaching MOVHLPS and MOVLPS mask matching to be more strict. llvm-svn: 137324	2011-08-11 18:59:13 +00:00
Jim Grosbach	27ad83d8a9	ARM push of a single register encodes as pre-indexed STR. Per the ARM ARM, a 'push' of a single register encodes as an STR, not an STM. llvm-svn: 137318	2011-08-11 18:07:11 +00:00
Jim Grosbach	8ba76c6d5c	ARM pop of a single register encodes as post-indexed LDR. Per the ARM ARM, a 'pop' of a single register encodes as an LDR, not an LDM. llvm-svn: 137316	2011-08-11 17:35:48 +00:00
Nadav Rotem	1542d5a00a	[AVX] If the data which is going to be saved is already in two XMM registers (for example, after integer operation), do not pack the registers into a YMM before saving. It's better to save as two XMM registers. Before: vinsertf128 $1, %xmm3, %ymm0, %ymm3 vinsertf128 $0, %xmm1, %ymm3, %ymm1 vmovaps %ymm1, 416(%rsp) After: vmovaps %xmm3, 416+16(%rsp) vmovaps %xmm1, 416(%rsp) llvm-svn: 137308	2011-08-11 16:41:21 +00:00
Chris Lattner	5a2c70cc8f	add missing colon, thanks peter. llvm-svn: 137306	2011-08-11 16:15:10 +00:00
Chris Lattner	96710b4308	fix PR10605 / rdar://9930964 by adding a pretty scary missed check. It's somewhat surprising anything works without this. Before we would compile the testcase into: test: # @test movl $4, 8(%rdi) movl 8(%rdi), %eax orl %esi, %eax cmpl $32, %edx movl %eax, -4(%rsp) # 4-byte Spill je .LBB0_2 now we produce: test: # @test movl 8(%rdi), %eax movl $4, 8(%rdi) orl %esi, %eax cmpl $32, %edx movl %eax, -4(%rsp) # 4-byte Spill je .LBB0_2 llvm-svn: 137303	2011-08-11 06:26:54 +00:00
Bruno Cardoso Lopes	a2d8bb97b9	Splats for v8i32/v8f32 can be handled by VPERMILPSY. This was causing infinite recursive calls in legalize. Fix PR10562 llvm-svn: 137296	2011-08-11 02:49:44 +00:00
Bruno Cardoso Lopes	572c9aaf53	Use the splat index to generate the desired shuffle. Otherwise we could only get undefs and the vector shuffle becomes an undef, generating wrong code. llvm-svn: 137295	2011-08-11 02:49:41 +00:00
Eli Friedman	3ae39f8ad1	Fix X86TargetLowering::LowerExternalSymbol so that it actually works in non-trivial cases. This hasn't been an issue before because the function isn't normally called (but apparently is used to generate a tail-call to sin() on ELF x86-32 with PIC and SSE2). Fixes PR9693. llvm-svn: 137292	2011-08-11 01:48:05 +00:00
NAKAMURA Takumi	504769fc2f	test/CodeGen/X86/opt-shuff-tstore.ll: Add explicit -mtriple=x86_64-linux. llvm-svn: 137262	2011-08-10 22:52:48 +00:00
Devang Patel	37a62058fe	While extending definition range of a debug variable, consult lexical scopes also. There is no point extending debug variable outside its lexical block. This provides 6x compile time speedup in some cases. llvm-svn: 137250	2011-08-10 21:25:34 +00:00
Nadav Rotem	d2b071f562	Fix the test. Add cpu target. llvm-svn: 137241	2011-08-10 19:49:19 +00:00
Nadav Rotem	410a11fe82	When performing a truncating store, it is sometimes possible to rearrange the data in-register prior to saving to memory. When we reorder the data in memory we prevent the need to save multiple scalars to memory, making a single regular store. llvm-svn: 137238	2011-08-10 19:30:14 +00:00
Bruno Cardoso Lopes	3ff111c12d	The following X86 pattern is incorrect: def : Pat<(X86Movss VR128:$src1, (bc_v4i32 (v2i64 (load addr:$src2)))), (MOVLPSrm VR128:$src1, addr:$src2)>; This matches a MOVSS dag with a MOVLPS instruction. However, MOVSS will replace only the low 32 bits of the register, while the MOVLPS instruction will replace the low 64 bits. A testcase is added and illustrates the bug and also modified the one that was already present. Patch by Tanya Lattner. llvm-svn: 137227	2011-08-10 17:45:17 +00:00
Rafael Espindola	36a3abc671	Add support for the R and Q constraints. llvm-svn: 137217	2011-08-10 16:26:42 +00:00
Bruno Cardoso Lopes	278ffd7d8e	Fix a bug in vpermilps mask checking. Fix PR10560 llvm-svn: 137194	2011-08-10 01:54:17 +00:00
Bruno Cardoso Lopes	72323966c8	Add 256-bit support for v8i32, v4i64 and v4f64 ISD::SELECT. Fix PR10556 llvm-svn: 137179	2011-08-09 23:27:13 +00:00
Bruno Cardoso Lopes	fc481959d2	Add v16i16 and v32i8 store patterns llvm-svn: 137166	2011-08-09 22:39:53 +00:00
Bruno Cardoso Lopes	6963062a99	Use fp unpack instructions to unpack int types. Until we have AVX2, this is the best we can do for these patterns. This fixes PR10554. llvm-svn: 137161	2011-08-09 22:18:37 +00:00
Eli Friedman	4ef2426b87	Fix a couple ridiculous copy-paste errors. rdar://9914773 . llvm-svn: 137160	2011-08-09 22:17:39 +00:00
Bill Wendling	d7f41b7f66	Revert r137134. It breaks some code as Eli pointed out. llvm-svn: 137135	2011-08-09 18:56:35 +00:00
Bill Wendling	84ec8f65d1	Print out the variable declaration only if it is a declaration. Otherwise, a 'static' variable will be emitted twice. PR10081 llvm-svn: 137134	2011-08-09 18:31:50 +00:00
Jakob Stoklund Olesen	53910d6aae	Inflate register classes after coalescing. Coalescing can remove copy-like instructions with sub-register operands that constrained the register class. Examples are: x86: GR32_ABCD:sub_8bit_hi -> GR32 arm: DPR_VFP2:ssub0 -> DPR Recompute the register class of any virtual registers that are used by fewer instructions after coalescing. This affects code generation for the Cortex-A8 where we use NEON instructions for f32 operations, cf. fp_convert.ll: vadd.f32 d16, d1, d0 vcvt.s32.f32 d0, d16 The register allocator is now free to use d16 for the temporary, and that comes first in the allocation order because it doesn't interfere with any s-registers. llvm-svn: 137133	2011-08-09 18:19:41 +00:00
Bruno Cardoso Lopes	bed48dc8ff	Reapply a more appropriate solution than in r137114. AVX supports v4f64 = sitofp v4i32. This fixes PR10559. Also add support for v4i32 = fptosi v4f64. llvm-svn: 137128	2011-08-09 17:39:13 +00:00
Bruno Cardoso Lopes	24dd1d4a27	Revert r137114 llvm-svn: 137127	2011-08-09 17:39:01 +00:00
Justin Holewinski	db05c2b963	PTX: Add initial support for device function calls - Calls are supported on SM 2.0+ for function with no return values llvm-svn: 137125	2011-08-09 17:36:31 +00:00
Bruno Cardoso Lopes	ad3453cf2d	Handle sitofp between v4f64 <- v4i32. Fix PR10559 llvm-svn: 137114	2011-08-09 05:48:01 +00:00
Bruno Cardoso Lopes	1155b1eafa	Add support for avx vector fextend llvm-svn: 137105	2011-08-09 03:04:29 +00:00
Bruno Cardoso Lopes	337a7fdb13	Rename and tidy up tests llvm-svn: 137103	2011-08-09 03:04:23 +00:00
Bruno Cardoso Lopes	2fc107365b	Add two patterns to match special vmovss and vmovsd cases. Also fix the patterns already there to be more strict regarding the predicate. This fixes PR10558 llvm-svn: 137100	2011-08-09 01:43:09 +00:00
Bruno Cardoso Lopes	af6a85484c	Make LowerVSETCC aware of AVX types and add patterns to match them. llvm-svn: 137090	2011-08-09 00:46:57 +00:00
Bruno Cardoso Lopes	c96953c12a	Add support for several vector shifts operations while in AVX mode. Fix PR10581 llvm-svn: 137067	2011-08-08 21:31:08 +00:00
Eli Friedman	a27da98921	Fix up the patterns for SXTB, SXTH, UXTB, and UXTH so that they are correctly active without HasT2ExtractPack. PR10611. llvm-svn: 137061	2011-08-08 19:49:37 +00:00
Jakob Stoklund Olesen	4f0ace5674	Don't clobber pending ST regs when FP regs are killed. X86FloatingPoint keeps track of pending ST registers for an upcoming inline asm instruction with fixed stack register constraints. It does this by remembering which FP register holds the value that should appear at a fixed stack position for the inline asm. When that FP register is killed before the inline asm, make sure to duplicate it to a scratch register, so the ST register still has a live FP reference. This could happen when the same FP register was copied to two ST registers, or when a spill instruction is inserted between the ST copy and the inline asm. This fixes PR10602. llvm-svn: 137050	2011-08-08 17:15:43 +00:00
Rafael Espindola	9bc32a96be	print st_shndx with the correct number of bits. llvm-svn: 136880	2011-08-04 15:50:13 +00:00
Rafael Espindola	9528995e3f	print st_other with the correct number of bits. llvm-svn: 136877	2011-08-04 15:38:19 +00:00
Rafael Espindola	96df560ce1	print st_type with the correct number of bits. llvm-svn: 136875	2011-08-04 15:24:00 +00:00
Rafael Espindola	79ef75dc49	Print st_bind with the correct number of bits. llvm-svn: 136874	2011-08-04 15:10:35 +00:00
Rafael Espindola	1848231ad1	Print r_sym with the correct number of bits. llvm-svn: 136873	2011-08-04 14:48:27 +00:00
Rafael Espindola	260af5cef6	Print r_type with the correct number of bits. llvm-svn: 136872	2011-08-04 14:39:30 +00:00
Rafael Espindola	65c559c5fb	Change another counter to decimal. llvm-svn: 136870	2011-08-04 14:01:03 +00:00
Rafael Espindola	cad9e7f094	Don't print a counter in hex. llvm-svn: 136869	2011-08-04 13:39:15 +00:00
Bill Wendling	e234f6ae0c	Only access both operands of an INSERT_SUBVECTOR if it is an INSERT_SUBVECTOR. Fixes PR10527. llvm-svn: 136853	2011-08-04 00:32:58 +00:00
Benjamin Kramer	3c7e9ee480	Remove underscore that's breaking linux buildbots. llvm-svn: 136833	2011-08-03 23:13:01 +00:00
Jakub Staszak	15e5b742ad	Use MachineBranchProbabilityInfo in If-Conversion instead of its own heuristics. llvm-svn: 136826	2011-08-03 22:34:43 +00:00
Jakob Stoklund Olesen	da618420ee	Handle IMPLICIT_DEF instructions in X86FloatingPoint. This fixes PR10575. llvm-svn: 136787	2011-08-03 16:33:19 +00:00
Devang Patel	dc9cbaaf23	Use byte offset, instead of element number, to access merged global. llvm-svn: 136759	2011-08-03 01:25:46 +00:00
Rafael Espindola	c48e10cd54	Assume .cfi_startproc is the first thing in a function. If the function is externally visible, create a local symbol to use in the CFE. If not, use the function label itself. Fixes PR10420. llvm-svn: 136716	2011-08-02 20:24:22 +00:00
Bruno Cardoso Lopes	5ada908140	Make this kind of lowering to be supported by 256-bit instructions: shuffle (scalar_to_vector (load (ptr + 4))), undef, <0, 0, 0, 0> To: shuffle (vload ptr)), undef, <1, 1, 1, 1> Fix PR10494 llvm-svn: 136691	2011-08-02 16:06:18 +00:00
Bruno Cardoso Lopes	a8e3673816	Add v4f64 -> v2f32 fp_round support. Also add a testcase to exercise the legalizer. This commit together with the two previous ones fixes PR10495. llvm-svn: 136654	2011-08-01 21:54:09 +00:00
Bruno Cardoso Lopes	7513939ddd	Since vectors with all ones can't be created with a 256-bit instruction, avoid returning early for v8i32 types, which would only be valid for vector with all zeros. Also split the handling of zeros and ones into separate checking logic since they are handled differently. This fixes PR10547 llvm-svn: 136642	2011-08-01 19:51:53 +00:00
Richard Osborne	0cc000ef29	Fix crash with varargs function with no named parameters. llvm-svn: 136623	2011-08-01 16:45:59 +00:00
Jakob Stoklund Olesen	5670f850c6	Revert "Don't check liveness of unallocatable registers." The ARM target depends on CPSR liveness being tracked after register allocation. llvm-svn: 136548	2011-07-30 00:57:25 +00:00
Jakob Stoklund Olesen	95cc5440e9	Don't check liveness of unallocatable registers. This includes registers like EFLAGS and ST0-ST7. We don't check for liveness issues in the verifier and scavenger because registers will never be allocated from these classes. While in SSA form, we do care about the liveness of unallocatable unreserved registers. Liveness of EFLAGS and ST0 needs to be correct for MachineDCE and MachineSinking. llvm-svn: 136541	2011-07-29 23:36:21 +00:00
Eric Christopher	aa5030066f	Add support for the 'Q' constraint. Fixes rdar://9866494 llvm-svn: 136523	2011-07-29 21:18:58 +00:00
Bruno Cardoso Lopes	65ce5ea3ba	Fix two tests that I crashed in the previous commits. The mask elts on the second half must be reindexed. llvm-svn: 136454	2011-07-29 02:05:28 +00:00
Bruno Cardoso Lopes	81eb193f2e	Match VPERMIL masks more strictly and update the target specific mask generation to always catch the weird cases. llvm-svn: 136453	2011-07-29 01:31:15 +00:00
Bruno Cardoso Lopes	d23709b18c	Add v8i32 and v4i64 vpermil patterns llvm-svn: 136451	2011-07-29 01:31:07 +00:00
Jakob Stoklund Olesen	b28ee4115d	Transfer implicit operands in NEONMoveFixPass. Later passes /are/ using this information when running the register scavenger. This fixes the second problem in PR10520. llvm-svn: 136440	2011-07-29 00:27:35 +00:00
Jakob Stoklund Olesen	9c3badceba	Add -verify-arm-pseudo-expand. This hidden llc option runs the machine code verifier after expanding ARM pseudo-instructions, but before if-conversion. The machine code verifier is much better at pointing out liveness errors that can trip up the register scavenger. llvm-svn: 136439	2011-07-29 00:27:32 +00:00
Jakob Stoklund Olesen	b16081ce8c	Handle REG_SEQUENCE with implicitly defined operands. Code like that would only be produced by bugpoint, but we should still handle it correctly. When a register is defined by a REG_SEQUENCE of undefs, the register itself is undef. Previously, we would create a register with uses but no defs. Fixes part of PR10520. llvm-svn: 136401	2011-07-28 21:38:51 +00:00
Bruno Cardoso Lopes	76bc28bac6	Add patterns to generate copies for extract_subvector instead of using vextractf128. This will reduce the number of issued instructions for several avx codes. llvm-svn: 136323	2011-07-28 01:26:50 +00:00
Bruno Cardoso Lopes	eca99c4b5a	Add a few patterns to match allzeros without having to use the fp unit. Take advantage that the 128-bit vpxor zeros the higher part and use it. This also fixes PR10491 llvm-svn: 136321	2011-07-28 01:26:43 +00:00
Bruno Cardoso Lopes	9e2a301216	Add SINT_TO_FP and FP_TO_SINT support for v8i32 types. Also move a convert pattern close to the instruction definition. llvm-svn: 136320	2011-07-28 01:26:39 +00:00
Bruno Cardoso Lopes	27a30a7792	The vpermilps and vpermilpd have different behaviour regarding the usage of the shuffle bitmask. Both work in 128-bit lanes without crossing, but in the former the mask of the high part is the same used by the low part while in the latter both lanes have independent masks. Handle this properly and add support for vpermilpd. llvm-svn: 136200	2011-07-27 00:56:34 +00:00
Devang Patel	f098ce2757	It is quite possible that inlined function body is split into multiple chunks of consecutive instructions. But, there is not any way to describe this in .debug_inline accelerator table used by gdb. However, describe non contiguous ranges of inlined function body appropriately using AT_range of DW_TAG_inlined_subroutine debug info entry. llvm-svn: 136196	2011-07-27 00:34:13 +00:00
Jakob Stoklund Olesen	c3bcb02154	Eliminate copies of undefined values during coalescing. These copies would coalesce easily, but the resulting value would be defined by a deleted instruction. Now we also remove the undefined value number from the destination register. This fixes PR10503. llvm-svn: 136174	2011-07-26 23:00:24 +00:00
Benjamin Kramer	a79c1e0589	Update test. llvm-svn: 136170	2011-07-26 22:45:39 +00:00
Benjamin Kramer	124ac2b997	Add a neat little two's complement hack for x86. On x86 we can't encode an immediate LHS of a sub directly. If the RHS comes from a XOR with a constant we can fold the negation into the xor and add one to the immediate of the sub. Then we can turn the sub into an add, which can be commuted and encoded efficiently. This code is generated for __builtin_clz and friends. llvm-svn: 136167	2011-07-26 22:42:13 +00:00
Bruno Cardoso Lopes	f8fe47bd2b	Recognize unpckh* masks and match 256-bit versions. The new versions are different from the previous 128-bit because they work in lanes. Update a few comments and add testcases llvm-svn: 136157	2011-07-26 22:03:40 +00:00
Eli Friedman	93dc04d5ca	Prevent x86-specific DAGCombine from creating nodes with illegal type (which could not be selected). Fixes a minor isel issue that was breaking the testcase from r136130. llvm-svn: 136148	2011-07-26 21:02:58 +00:00
Jim Grosbach	73a8393a47	FileCheck'ize test. llvm-svn: 136135	2011-07-26 20:49:44 +00:00
Eli Friedman	747430417b	XFAIL this test while I investigate it; it's failing for an unexpected reason. llvm-svn: 136131	2011-07-26 20:41:03 +00:00
Eli Friedman	06b8b571b2	Add obvious missing case to switch. PR10497. llvm-svn: 136130	2011-07-26 20:38:49 +00:00
Bruno Cardoso Lopes	d600a0f878	Add 256-bit isel for movsldup/movshdup llvm-svn: 136051	2011-07-26 02:39:32 +00:00
Bruno Cardoso Lopes	9212bf275d	Codegen allonesvector better while using AVX: vpcmpeqd + vinsertf128 This also fixes PR10452 llvm-svn: 136004	2011-07-25 23:05:32 +00:00
Bruno Cardoso Lopes	123dff0f58	- Handle special scalar_to_vector case: splats. Using a native 128-bit shuffle before inserting on a 256-bit vector. - Add AVX versions of movd/movq instructions - Introduce a few COPY patterns to match insert_subvector instructions. This turns a trivial insert_subvector instruction into a register copy, coalescing the xmm into a ymm and avoiding emitting one more instruction. llvm-svn: 136002	2011-07-25 23:05:25 +00:00
Eli Friedman	442d1b199f	Attempt to fix test failure reported on llvm-commits. llvm-svn: 135995	2011-07-25 22:28:51 +00:00
Eli Friedman	cbd3ba91b7	Make sure this DAGCombine actually returns an UNDEF of the correct type; PR10476. llvm-svn: 135993	2011-07-25 22:25:42 +00:00
Eli Friedman	ea8c66fea5	Get rid of an incorrect optimization for shuffles with PALIGNR and simplify isPALIGNRMask. Addresses PR10466, although the crash from that PR only triggers in cases where DAGCombine misses optimizing a shuffle. llvm-svn: 135980	2011-07-25 21:36:45 +00:00
Jakob Stoklund Olesen	56a56eb80e	Correctly handle <undef> tied uses when rewriting after a split. This fixes PR10463. A two-address instruction with an <undef> use operand was incorrectly rewritten so the def and use no longer used the same register, violating the tie constraint. Fix this by always rewriting <undef> operands with the register a def operand would use. llvm-svn: 135885	2011-07-24 20:23:50 +00:00
Bruno Cardoso Lopes	7a2075511b	Fix test check! llvm-svn: 135802	2011-07-22 20:55:28 +00:00
Bruno Cardoso Lopes	a89039998d	Fix PR10422 by adding the necessary AVX UCOMISD memory versions to load folding logic llvm-svn: 135801	2011-07-22 20:53:20 +00:00
Rafael Espindola	77242dd537	Turn shuffles into unpacks for VT == MVT::v2i64 and MVT::v2f64 too. Patch by Jeff Muizelaar. llvm-svn: 135789	2011-07-22 18:56:05 +00:00
Bruno Cardoso Lopes	612e56174b	- Inspected an AVX code block added by someone in early Feb. This was never used and was actually very wrong, fix it and make it simpler. Also remove the ConcatVectors function, which is unused now. - Fix an introduction of useless nodes in r126664 and r126264. The VUNPCKL* should never be introduced cause we don't want duplicate nodes for 128 AVX and non-AVX modes, the actual instruction difference only exists during isel, but not for target specific DAG nodes. We only introduce V* target nodes when there is no 128-bit version already there. - Fix a fragile test and make it more useful. llvm-svn: 135729	2011-07-22 00:15:07 +00:00
Bruno Cardoso Lopes	14a95bda04	Although we already support this, add testcases for consistency llvm-svn: 135728	2011-07-22 00:15:03 +00:00
Bruno Cardoso Lopes	91eff5140f	Add a DAGCombine for transforming 128->256 casts into a simple vxorps + vinsertf128 pair of instructions llvm-svn: 135727	2011-07-22 00:15:00 +00:00
Bruno Cardoso Lopes	178fb40612	- Register v16i16 as valid VR256 register class - Add more bitcasts for v16i16 - Since 135661 and 135662 already added the splat logic, just add one more splat test for v16i16 llvm-svn: 135663	2011-07-21 02:24:08 +00:00
Bruno Cardoso Lopes	b878caa5e2	Add support for 256-bit versions of VPERMIL instruction. This is a new instruction introduced in AVX, which can operate on 128 and 256-bit vectors. It considers a 256-bit vector as two independent 128-bit lanes. It can permute any 32 or 64 elements inside a lane, and restricts the second lane to have the same permutation of the first one. With the improved splat support introduced earlier today, adding codegen for this instruction enables more efficient 256-bit code: Instead of: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vextractf128 $1, %ymm0, %xmm1 shufps $1, %xmm1, %xmm1 movss %xmm1, 28(%rsp) movss %xmm1, 24(%rsp) movss %xmm1, 20(%rsp) movss %xmm1, 16(%rsp) vextractf128 $0, %ymm0, %xmm0 shufps $1, %xmm0, %xmm0 movss %xmm0, 12(%rsp) movss %xmm0, 8(%rsp) movss %xmm0, 4(%rsp) movss %xmm0, (%rsp) vmovaps (%rsp), %ymm0 We get: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vpermilps $85, %ymm0, %ymm0 llvm-svn: 135662	2011-07-21 01:55:47 +00:00
Devang Patel	bcd50a10d5	While emitting constant value, look through derived type and use underlying basic type to determine size and signedness of the constant value. llvm-svn: 135627	2011-07-20 21:57:04 +00:00
Eli Friedman	6ed783228d	PR10421: Fix a straightforward bug in the widening logic for CONCAT_VECTORS. llvm-svn: 135595	2011-07-20 18:14:33 +00:00
Evan Cheng	76792992d6	Add MCObjectFileInfo and sink the MCSections initialization code from TargetLoweringObjectFileImpl down to MCObjectFileInfo. TargetAsmInfo is down to one last method. It's almost gone! llvm-svn: 135569	2011-07-20 05:58:47 +00:00
Eric Christopher	60648578ba	New pointer rotate test. llvm-svn: 135562	2011-07-20 03:09:11 +00:00
Akira Hatanaka	a4c09bce9b	Lower memory barriers to sync instructions. llvm-svn: 135537	2011-07-19 23:30:50 +00:00
Evan Cheng	ccf243d56b	Fix an obvious typo that's preventing x86 (32-bit) from using .literal16. llvm-svn: 135535	2011-07-19 23:14:32 +00:00
Akira Hatanaka	f3b29992d5	Use the correct opcodes: SLLV/SRLV or AND must be used instead of SLL/SRL or ANDi, when the instruction does not have any immediate operands. llvm-svn: 135520	2011-07-19 20:34:00 +00:00
Akira Hatanaka	e450358a21	Remove redundant instructions. - In EmitAtomicBinaryPartword, mask incr in loopMBB only if atomic.swap is the instruction being expanded, instead of masking it in thisMBB. - Remove redundant Or in EmitAtomicCmpSwap. llvm-svn: 135495	2011-07-19 18:14:26 +00:00
Richard Osborne	f1b800998a	Add intrinsics for the zext / sext instructions. llvm-svn: 135476	2011-07-19 13:28:50 +00:00
Richard Osborne	252c43ee88	Add intrinsics for the testct, testwct instructions. llvm-svn: 135475	2011-07-19 13:00:40 +00:00
Richard Osborne	707f0beae1	Add intrinsics for the peek and endin instructions. llvm-svn: 135474	2011-07-19 12:50:25 +00:00
Evan Cheng	2129f59637	Introduce MCCodeGenInfo, which keeps information that can affect codegen (including compilation, assembly). Move relocation model Reloc::Model from TargetMachine to MCCodeGenInfo so it's accessible even without TargetMachine. llvm-svn: 135468	2011-07-19 06:37:02 +00:00
Devang Patel	9ab3cac694	Revert r135423. llvm-svn: 135454	2011-07-19 00:28:24 +00:00
Eli Friedman	4d5532a085	FileCheck-ize a couple tests. llvm-svn: 135427	2011-07-18 21:23:42 +00:00
Devang Patel	4dc76f2438	During bottom up fast-isel, instructions emitted to materialize registers are at top of basic block and do not have debug location. This may misguide debugger while entering the basic block and sometimes debugger provides semi useful view of current location to developer by picking up previous known location as current location. Assign a sensible location to the first instruction in a basic block, if it does not have one location derived from source file, so that debugger can provide meaningful user experience to developers in edge cases. [take 2] llvm-svn: 135423	2011-07-18 20:55:23 +00:00
Akira Hatanaka	338879a7f4	Do not treat atomic.load.sub differently than other atomic binary intrinsics. llvm-svn: 135418	2011-07-18 19:58:59 +00:00
Akira Hatanaka	27292638bd	Set mayLoad or mayStore flags for SC and LL in order to prevent LICM from moving them out of the loop. Previously, stores and loads to a stack frame object were inserted to accomplish this. Remove the code that was needed to do this. Patch by Sasa Stankovic. llvm-svn: 135415	2011-07-18 18:52:12 +00:00
Jakob Stoklund Olesen	c45d38e14a	Fix a crash when building 177.mesa for armv6. When splitting a live range immediately before an LDR_POST instruction that redefines the address register, make sure to use the correct value number in leaveIntvBefore. We need the value number entering the instruction. <rdar://problem/9793765> llvm-svn: 135413	2011-07-18 18:47:13 +00:00
Bruno Cardoso Lopes	4208cace5f	Add AVX 128-bit sqrt versions llvm-svn: 135404	2011-07-18 17:51:40 +00:00
Nick Lewycky	d8921f939c	Delete empty unused file. llvm-svn: 135379	2011-07-18 05:54:06 +00:00
Bruno Cardoso Lopes	4480040191	Add AVX 128-bit patterns for sint_to_fp llvm-svn: 135332	2011-07-16 00:50:20 +00:00
Bruno Cardoso Lopes	8df9cfc279	Fix a couple of things: 1) Make non-legal 256-bit loads to be promoted to v4i64. This lets us canonicalize the loads and handle things the same way we used to handle for 128-bit registers. Despite what one of the removed comments explained, the load promotion would not mess with VPERM, it's only a matter of doing the appropriate bitcasts when these instructions are introduced. Also make LOAD v8i32 legal. 2) Doing 1) exposed two bugs: - v4i64 was being promoted to itself for several opcodes (introduced in r124447 by David Greene) causing endless recursion and the stack to explode. - there was no support for allOnes BUILD_VECTORs and ANDNP would fail to match because it was generating early target constant pools during lowering. 3) The testcases are already checked-in, doing 1) exposed the bugs in the current testcases. 4) Tidy up code to be more clear and explicit about AVX. llvm-svn: 135313	2011-07-15 22:24:33 +00:00
Owen Anderson	454e1c7abb	Remove VMOVDneon and VMOVQ, which are just aliases for VORR. This continues to simplify the path towards an auto-generated disassembler. llvm-svn: 135290	2011-07-15 18:46:47 +00:00
Eric Christopher	92464be28c	Check register class matching instead of width of type matching when determining validity of matching constraint. Allow i1 types access to the GR8 reg class for x86. Fixes PR10352 and rdar://9777108 llvm-svn: 135180	2011-07-14 20:13:52 +00:00
Bruno Cardoso Lopes	6778597deb	Add 256-bit load/store recognition and matching in several places. llvm-svn: 135171	2011-07-14 18:50:58 +00:00
Eric Christopher	0c666b4664	Add a testcase for r135123. Part of rdar://9761830 llvm-svn: 135133	2011-07-14 06:23:09 +00:00
Benjamin Kramer	15cd5a3f12	Don't emit a bit test if there is only one case the test can yield false. A simple SETNE is sufficient. llvm-svn: 135126	2011-07-14 01:38:42 +00:00
Bruno Cardoso Lopes	3c4d652210	We already support 256-bit packed ADD, SUB, DIV, MUL. Add testcases. llvm-svn: 135099	2011-07-13 22:28:55 +00:00
Bruno Cardoso Lopes	9613b64916	Make X86ISD::ANDNP more general and Codegen 256-bit VANDNP. A more general version of X86ISD::ANDNP also opened the room for a little bit of refactoring. llvm-svn: 135088	2011-07-13 21:36:51 +00:00
Eli Friedman	344ec79715	Make sure we don't combine a large displacement and a frame index in the same addressing mode on x86-64. It can overflow, leading to a crash/miscompile. <rdar://problem/9763308> llvm-svn: 135084	2011-07-13 21:29:53 +00:00
Bruno Cardoso Lopes	1021b4a9dd	AVX Codegen support for 256-bit versions of vandps, vandpd, vorps, vorpd, vxorps, vxorpd llvm-svn: 135023	2011-07-13 01:15:33 +00:00
Evan Cheng	f863e3fb73	Improve codegen for select's: if (x != 0) x = 1 if (x == 1) x = 1 Previous codegen looks like this: mov r1, r0 cmp r1, #1 mov r0, #0 moveq r0, #1 The naive lowering selects between two different values. It should recognize that the test is an equality test, so it's more a conditional move than a select: cmp r0, #1 movne r0, #0 rdar://9758317 llvm-svn: 135017	2011-07-13 00:42:17 +00:00
Jim Grosbach	ade1fb17d4	Improve test cases from r134746. Use memory barriers to force if-conversion off for these tests instead of the internal llc command line option ifcvt-limit. llvm-svn: 134986	2011-07-12 16:06:01 +00:00
Andrew Trick	1b9d9b6b7a	Comment correction. llvm-svn: 134958	2011-07-12 03:39:22 +00:00
Jim Grosbach	581da64241	Simplify printing of ARM shifted immediates. Print shifted immediate values directly rather than as a payload+shifter value pair. This makes for more readable output assembly code, simplifies the instruction printer, and is consistent with how Thumb immediates are displayed. llvm-svn: 134902	2011-07-11 16:48:36 +00:00
NAKAMURA Takumi	5b304432bb	test/CodeGen/PowerPC/vector.ll: Tweak redirection >%t >%t to >%t >>%t. See also r134814 (test/CodeGen/X86/vector.ll). llvm-svn: 134900	2011-07-11 16:21:52 +00:00
Cameron Zwarich	61715740cd	Add a missing test for r134882. llvm-svn: 134889	2011-07-11 08:35:17 +00:00
Chris Lattner	b1ed91f397	Land the long talked about "type system rewrite" patch. This patch brings numerous advantages to LLVM. One way to look at it is through diffstat: 109 files changed, 3005 insertions(+), 5906 deletions(-) Removing almost 3K lines of code is a good thing. Other advantages include: 1. Value::getType() is a simple load that can be CSE'd, not a mutating union-find operation. 2. Types are uniqued and never move once created, defining away PATypeHolder. 3. Structs can be "named" now, and their name is part of the identity that uniques them. This means that the compiler doesn't merge them structurally which makes the IR much less confusing. 4. Now that there is no way to get a cycle in a type graph without a named struct type, "upreferences" go away. 5. Type refinement is completely gone, which should make LTO much MUCH faster in some common cases with C++ code. 6. Types are now generally immutable, so we can use "Type *" instead of "const Type *" everywhere. Downsides of this patch are that it removes some functions from the C API, so people using those will have to upgrade to (not yet added) new API. "LLVM 3.0" is the right time to do this. There are still some cleanups pending after this, this patch is large enough as-is. llvm-svn: 134829	2011-07-09 17:41:24 +00:00
Chris Lattner	2522d2df06	more tests not making the jump into the brave new world. llvm-svn: 134820	2011-07-09 16:57:10 +00:00
NAKAMURA Takumi	9c6a679d3b	test/CodeGen/X86/vector.ll: Tweak temporary output to appease Win32 hosts. With Lit (not bash) in a test, multiple redirects >%t might open(%t, "w") multiple times. It can be avoided if the latter redirect is >>%t. It might work even if ">/dev/null" were used. llvm-svn: 134814	2011-07-09 10:22:28 +00:00
Jakob Stoklund Olesen	bf6afec312	Hoist spills within a basic block. Try to move spills as early as possible in their basic block. This can help eliminate interferences by shortening the live range being spilled. This fixes PR10221. llvm-svn: 134776	2011-07-09 00:25:03 +00:00
Evan Cheng	fd7e3fcad3	Fix broken x86_64 tests which specify non-64-bit cpu's. llvm-svn: 134756	2011-07-08 22:29:33 +00:00
Eli Friedman	e2f76c4ade	Default 64-bit target features and SSE2 on when a triple specifies x86-64. Clean up all the other hacks which are now unnecessary. llvm-svn: 134753	2011-07-08 22:16:47 +00:00
Jim Grosbach	7471937ad7	Make tBX_RET and tBX_RET_vararg predicable. The normal tBX instruction is predicable, so there's no reason the pseudos for using it as a return shouldn't be. Gives us some nice code-gen improvements as can be seen by the test changes. In particular, several tests now have to disable if-conversion because it works too well and defeats the test. llvm-svn: 134746	2011-07-08 21:50:04 +00:00
Julien Lerouge	112fcc164a	Add _allrem, _aullrem and _allmul to the runtime for MSVC. http://llvm.org/bugs/show_bug.cgi?id=10305 llvm-svn: 134744	2011-07-08 21:40:25 +00:00
Cameron Zwarich	f03fa189ca	Add an intrinsic and codegen support for fused multiply-accumulate. The intent is to use this for architectures that have a native FMA instruction. llvm-svn: 134742	2011-07-08 21:39:21 +00:00
Jakob Stoklund Olesen	4931bbc671	Be more aggressive about following hints. RAGreedy::tryAssign will now evict interference from the preferred register even when another register is free. To support this, add the EvictionCost struct that counts how many hints are broken by an eviction. We don't want to break one hint just to satisfy another. Rename canEvict to shouldEvict, and add the first bit of eviction policy that doesn't depend on spill weights: Always make room in the preferred register as long as the evictees can be split and aren't already assigned to their preferred register. Also make the CSR avoidance more accurate. When looking for a cheaper register it is OK to use a new volatile register. Only CSR aliases that have never been used before should be avoided. llvm-svn: 134735	2011-07-08 20:46:18 +00:00
Jim Grosbach	dbfb29d6c0	Use ARMPseudoExpand for ARM tail calls. llvm-svn: 134719	2011-07-08 18:50:22 +00:00
Benjamin Kramer	9960a25006	Emit a more efficient magic number multiplication for exact sdivs. We have to do this in DAGBuilder instead of DAGCombiner, because the exact bit is lost after building. struct foo { char x[24]; }; long bar(struct foo *a, struct foo *b) { return a-b; } is now compiled into movl 4(%esp), %eax subl 8(%esp), %eax sarl $3, %eax imull $-1431655765, %eax, %eax instead of movl 4(%esp), %eax subl 8(%esp), %eax movl $715827883, %ecx imull %ecx movl %edx, %eax shrl $31, %eax sarl $2, %edx addl %eax, %edx movl %edx, %eax llvm-svn: 134695	2011-07-08 10:31:30 +00:00
Jakob Stoklund Olesen	bbe2a5cfff	Fix more register allocation sensitive tests. llvm-svn: 134667	2011-07-08 00:24:06 +00:00
Jakob Stoklund Olesen	b138c44276	Remove a test that no longer makes sense. It was testing a linear scan feature: Test if linearscan is unfavoring registers for allocation to allow more reuse of reloads from stack slots. The greedy register allocator doesn't access any stack slots in this function, so the linear scan feature was not being tested. llvm-svn: 134666	2011-07-08 00:24:03 +00:00
Nick Lewycky	9badf60203	Let the inline asm 'q' constraint match float, and on 64-bit double too. Fixes PR9602! llvm-svn: 134665	2011-07-08 00:19:27 +00:00
Eric Christopher	7a2a0f80de	Go ahead and emit the barrier on x86-64 even without sse2. The processor supports it just fine. Fixes PR9675 and rdar://9740801 llvm-svn: 134664	2011-07-08 00:04:56 +00:00
Eric Christopher	9721396dab	Add support for the X86 'l' constraint. Fixes PR10149 and rdar://9738585 llvm-svn: 134648	2011-07-07 22:29:07 +00:00
Evan Cheng	13bcc6c1c7	Add Mode64Bit feature and sink it down to MC layer. llvm-svn: 134641	2011-07-07 21:06:52 +00:00
Evan Cheng	8b2bda09a5	Change some ARM subtarget features to be single bit yes/no in order to sink them down to MC layer. Also fix tests. llvm-svn: 134590	2011-07-07 03:55:05 +00:00
Lang Hames	2bbdc0bcda	Added a testcase for PR10220. llvm-svn: 134573	2011-07-07 00:36:02 +00:00
Jakub Staszak	3f158fdf6e	Introduce "expect" intrinsic instructions. llvm-svn: 134516	2011-07-06 18:22:43 +00:00
Dan Gohman	ad3e8fda3b	Revert r134366 and add an explicit triple to make this test host-independent. llvm-svn: 134447	2011-07-05 22:09:19 +00:00
Jakob Stoklund Olesen	bbad3bceb7	Fix PR10277. Remat during spilling triggers dead code elimination. If a phi-def becomes unused, that may also cause live ranges to split into separate connected components. This type of splitting is different from normal live range splitting. In particular, there may not be a common original interval. When the split range is its own original, make sure that the new siblings are also their own originals. The range being split cannot be used as an original since it doesn't cover the new siblings. llvm-svn: 134413	2011-07-05 15:38:41 +00:00
NAKAMURA Takumi	bb2f28f41c	test/CodeGen/X86/lsr-nonaffine.ll: Relax expressions for Win64 CC to appease Win32 hosts. llvm-svn: 134366	2011-07-03 09:26:14 +00:00
Chandler Carruth	a6e593b4eb	FileCheck-ize another test. Reduces the llc invocations from 8 to 1, and makes one of the tests actually mean something (as the string 'add' will always appear in the output of this file). llvm-svn: 134358	2011-07-02 21:34:52 +00:00
Chandler Carruth	a33e630c55	FileCheck-ize another X86 test, making it more precisely verify the desired result based on the comments in the file. llvm-svn: 134354	2011-07-02 20:43:16 +00:00
Chandler Carruth	959fe548d7	FileCheck-ize and simplify RUN lines. llvm-svn: 134352	2011-07-02 20:43:11 +00:00
Chandler Carruth	02bece4957	FileCheck-ize llvm-svn: 134351	2011-07-02 20:43:08 +00:00
Chandler Carruth	91c4008373	FileCheck-ize and tighten up assertions to only check the relevant sections. llvm-svn: 134350	2011-07-02 20:43:04 +00:00
Chandler Carruth	4b15dd38a8	FileCheck-ize and cleanup IR. llvm-svn: 134349	2011-07-02 20:43:01 +00:00
Chandler Carruth	a7440ff322	FileCheck-ize llvm-svn: 134348	2011-07-02 20:42:59 +00:00
Chandler Carruth	5c87df3179	Remove a grep that is already checked with FileCheck. llvm-svn: 134346	2011-07-02 20:42:56 +00:00
Chandler Carruth	106ae72933	FileCheck-ize llvm-svn: 134345	2011-07-02 20:42:53 +00:00
Chandler Carruth	fbb7c9ba06	FileCheck-ize and modernize IR. llvm-svn: 134344	2011-07-02 20:42:50 +00:00
Chandler Carruth	81275c3029	FileCheck-ize and simplify RUNs. llvm-svn: 134343	2011-07-02 20:42:48 +00:00
Chandler Carruth	f0e2e37518	FileCheck-ize and modernize the RUN line. llvm-svn: 134342	2011-07-02 20:42:44 +00:00
Chandler Carruth	abd8a8f3ae	FileCheck-ize, tightening checks and avoiding a temporary file. llvm-svn: 134341	2011-07-02 20:42:42 +00:00
Chandler Carruth	6e4d90f7c6	FileCheck-ize, tightening checks and avoiding a temporary file. llvm-svn: 134340	2011-07-02 20:42:39 +00:00
Chandler Carruth	ff0e32536e	FileCheck-ize llvm-svn: 134339	2011-07-02 20:42:36 +00:00
Chandler Carruth	d954bb7ebb	FileCheck-ize llvm-svn: 134338	2011-07-02 20:42:33 +00:00
Chandler Carruth	362bff3bd3	FileCheck-ize a test, avoiding a temporary file. llvm-svn: 134337	2011-07-02 20:42:31 +00:00
Chandler Carruth	f2a29b726f	FileCheck-ize and simplify this test. llvm-svn: 134336	2011-07-02 20:42:28 +00:00
Chandler Carruth	7e44b420e1	FileCheck-ize llvm-svn: 134335	2011-07-02 20:42:25 +00:00
Chandler Carruth	144cf1a974	FileCheck-ize another codegen test. llvm-svn: 134334	2011-07-02 20:42:22 +00:00
Chandler Carruth	1d815f5373	Partially FileCheck-ize a test to remove a weird quoting situation. llvm-svn: 134333	2011-07-02 20:42:20 +00:00
Chandler Carruth	8a3f20abac	FileCheck-ize another test, and upgrade its syntax a bit. llvm-svn: 134332	2011-07-02 20:42:17 +00:00
Chandler Carruth	38d367e473	FileCheck-ize another codegen test, tightening it up. llvm-svn: 134331	2011-07-02 20:42:14 +00:00
Chandler Carruth	bf252382b8	FileCheck-ize another test, making it much more precise for testing the individual cases, while hard coding less about registers in use. llvm-svn: 134330	2011-07-02 20:42:11 +00:00
Chandler Carruth	308b3b66b1	FileCheck-ize another test. This one is more clear and runs fewer commands as a result. llvm-svn: 134329	2011-07-02 20:42:08 +00:00
Chandler Carruth	334faf8f1a	FileCheck-ize a test, no functionality changed. llvm-svn: 134328	2011-07-02 20:42:06 +00:00
Jakob Stoklund Olesen	54f7c59c1a	Better diagnostics when inline asm fails to allocate. asm.c:2:7: error: ran out of registers during register allocation asm(""::"r"(0), "r"(1), "r"(2), "r"(3), "r"(4), "r"(5), "r"(6), "r"(7), "r"(8), "r"(9)); ^ llvm-svn: 134310	2011-07-02 07:17:37 +00:00
Eric Christopher	2eca9d5ddf	Be less specific about register allocation ordering. llvm-svn: 134308	2011-07-02 04:06:41 +00:00
Eric Christopher	a8a56f7e5c	TargetConstant immediates won't be placed into registers so tighten up the valid constant check earlier. rdar://9692967 llvm-svn: 134286	2011-07-01 23:04:38 +00:00
Dan Gohman	a293f24a0d	Teach IVUsers to stop at non-affine expressions unless they are both outside the loop and reducible. This more completely hides them from LSR, which isn't usually able to do anything meaningful with non-affine expressions anyway, and this consequently hides them from SCEVExpander, which is acutely unprepared for non-affine expressions. Replace test/CodeGen/X86/lsr-nonaffine.ll with a new test that tests the new behavior. This works around the bug in PR10117 / rdar://problem/9633149, and is generally an improvement besides. llvm-svn: 134268	2011-07-01 22:05:19 +00:00
Jim Grosbach	cf1464d943	ARMv7M vs. ARMv7E-M support. The DSP instructions in the Thumb2 instruction set are an optional extension in the Cortex-M* architecture. When present, the implementation is considered an "ARMv7E-M implementation," and when not, an "ARMv7-M implementation." Add a subtarget feature hook for the v7e-m instructions and hook it up. The cortex-m3 cpu is an example of a v7m implementation, while the cortex-m4 is a v7e-m implementation. rdar://9572992 llvm-svn: 134261	2011-07-01 21:12:19 +00:00
Eric Christopher	29f1db85dd	Add support for the 'j' immediate constraint. This is conditionalized on supporting the instruction that the constraint is for 'movw'. Part of rdar://9119939 llvm-svn: 134222	2011-07-01 01:00:07 +00:00
Eric Christopher	c011d31543	Add support for the ARM 't' register constraint. And another testcase for the 'x' register constraint. Part of rdar://9119939 llvm-svn: 134220	2011-07-01 00:30:46 +00:00
Eric Christopher	f1c74595aa	Add support for the 'x' constraint. Part of rdar://9307836 and rdar://9119939 llvm-svn: 134215	2011-07-01 00:14:47 +00:00
Jakob Stoklund Olesen	d0e2352b65	Fix a problem with fast-isel return values introduced in r134018. We would put the return value from long double functions in the wrong register. This fixes gcc.c-torture/execute/conversion.c llvm-svn: 134205	2011-06-30 23:42:18 +00:00
Eric Christopher	f45daac30f	Add support for the 'h' constraint. Part of rdar://9119939 llvm-svn: 134203	2011-06-30 23:23:01 +00:00
Jim Grosbach	b98ab91e39	Thumb1 register to register MOV instruction is predicable. Fix a FIXME and allow predication (in Thumb2) for the T1 register to register MOV instructions. This allows some better codegen with if-conversion (as seen in the test updates), plus it lays the groundwork for pseudo-izing the tMOVCC instructions. llvm-svn: 134197	2011-06-30 22:10:46 +00:00
Jim Grosbach	353da73186	Pseudo-ize the t2LDMIA_RET instruction. It's just a t2LDMIA_UPD instruction with extra codegen properties, so it doesn't need the encoding information. As a side-benefit, we now correctly recognize it for instruction printing as a 'pop' instruction. llvm-svn: 134173	2011-06-30 18:25:42 +00:00
Eric Christopher	c932173773	Fix a small thinko for constant i64 lock/orq optimization where we didn't have an opcode for 64-bit constant or expressions. Fixes rdar://9692967 llvm-svn: 134121	2011-06-30 00:48:30 +00:00
Devang Patel	0eada03216	Revert r133953 for now. llvm-svn: 134116	2011-06-29 23:50:13 +00:00
Cameron Zwarich	34c8f51d65	In the ARM global merging pass, allow extraneous alignment specifiers. This pass already makes the assumption, which is correct on ARM, that a type's alignment is less than its alloc size. This improves codegen with Clang (which inserts a lot of extraneous alignment specifiers) and fixes <rdar://problem/9695089>. llvm-svn: 134106	2011-06-29 22:24:25 +00:00
Benjamin Kramer	d2a84f6a63	Don't depend on the optimization reverted in r134067. llvm-svn: 134068	2011-06-29 14:07:18 +00:00
Benjamin Kramer	8665f8d916	Revert a part of r126557 which could create unschedulable DAGs. llvm-svn: 134067	2011-06-29 13:47:25 +00:00
Jakob Stoklund Olesen	7297e7e223	Clean up the handling of the x87 fp stack to make it more robust. Drop the FpMov instructions, use plain COPY instead. Drop the FpSET/GET instruction for accessing fixed stack positions. Instead use normal COPY to/from ST registers around inline assembly, and provide a single new FpPOP_RETVAL instruction that can access the return value(s) from a call. This is still necessary since you cannot tell from the CALL instruction alone if it returns anything on the FP stack. Teach fast isel to use this. This provides a much more robust way of handling fixed stack registers - we can tolerate arbitrary FP stack instructions inserted around calls and inline assembly. Live range splitting could sometimes break x87 code by inserting spill code in unfortunate places. As a bonus we handle floating point inline assembly correctly now. llvm-svn: 134018	2011-06-28 18:32:28 +00:00
Roman Divacky	4394e68c24	Implement ISD::VAARG lowering on PPC32. llvm-svn: 134005	2011-06-28 15:30:42 +00:00
Jakob Stoklund Olesen	74dd400410	FileCheckize a couple of tests. Also add a test for popping dead return values and avoid testing the spill precision. llvm-svn: 133997	2011-06-28 06:25:03 +00:00
Chandler Carruth	e2a1b16963	FileCheck-ize a test that had the strangest TCL quote I've seen yet: an opening single quote with no closing single quote, and with {} quotes "inside" of it. This broke some of our tools that scrape test cases. Also, while here, make the test actually assert what the comment says it asserts. This was essentially authored by Nick Lewycky, and merely typed in by myself. Let me know if this is still missing the mark, but the previous test only succeeded due to the improper quoting preventing anything from matching the grep -- it had a '4(%...)' sequence in the output! llvm-svn: 133980	2011-06-28 02:03:10 +00:00
Evan Cheng	b7d00313dc	Remove the experimental (and unused) pre-ra splitting pass. Greedy regalloc can split live ranges. llvm-svn: 133962	2011-06-27 23:40:45 +00:00
Devang Patel	4dc034df1d	During bottom up fast-isel, instructions emitted to materialize registers are at the top of the basic block and do not have a debug location. This may misguide the debugger while entering the basic block, and sometimes the debugger provides a semi-useful view of the current location to the developer by picking up the previous known location as the current location. Assign a sensible location to the first instruction in a basic block, if it does not have a location derived from the source file, so that the debugger can provide a meaningful user experience to developers in edge cases. llvm-svn: 133953	2011-06-27 22:32:04 +00:00
Eric Christopher	aff7ed55bc	Allow lr in the register options here. llvm-svn: 133935	2011-06-27 20:31:01 +00:00
Jakob Stoklund Olesen	f68976071d	Move all inline-asm-fpstack tests to a single file. Also fix some of the tests that were actually testing wrong behavior - An input operand in {st} is only popped by the inline asm when {st} is also in the clobber list. The original bug reports all had ~{st} clobbers as they should. llvm-svn: 133916	2011-06-27 17:27:37 +00:00
Dan Bailey	b49b736519	PTX: corrected tests that were failing llvm-svn: 133875	2011-06-25 19:41:17 +00:00
Dan Bailey	b7ee561399	PTX: Reverting implementation of i8. The .b8 operations in PTX are far more limiting than I first thought. The mov operation isn't even supported, so there's no way of converting a .pred value into a .b8 without going via .b16, which is not sensible. An improved implementation needs to use the fact that loads and stores automatically extend and truncate to implement support for EXTLOAD and TRUNCSTORE in order to correctly support boolean values. llvm-svn: 133873	2011-06-25 18:16:28 +00:00
Chad Rosier	f3e11190f3	Test case for r133858 (tail call optimize in the presence of byval). llvm-svn: 133863	2011-06-25 02:44:56 +00:00
Devang Patel	f071d72c44	Handle debug info for i128 constants. llvm-svn: 133821	2011-06-24 20:46:11 +00:00
Dan Bailey	dd01c4ac7a	PTX: Add support for i8 type and introduce associated .b8 registers The i8 type is required for boolean values, but can only use ld, st and mov instructions. The i1 type continues to be used for predicates. llvm-svn: 133814	2011-06-24 19:27:10 +00:00
Chad Rosier	fa8d89327f	The Neon VCVT (between floating-point and fixed-point, Advanced SIMD) instructions can be used to match combinations of multiply/divide and VCVT (between floating-point and integer, Advanced SIMD). Basically the VCVT immediate operand that specifies the number of fraction bits corresponds to a floating-point multiply or divide by the corresponding power of 2. For example, VCVT (floating-point to fixed-point, Advanced SIMD) can replace a combination of VMUL and VCVT (floating-point to integer) as follows: Example (assume d17 = <float 8.000000e+00, float 8.000000e+00>): vmul.f32 d16, d17, d16 vcvt.s32.f32 d16, d16 becomes: vcvt.s32.f32 d16, d16, #3 Similarly, VCVT (fixed-point to floating-point, Advanced SIMD) can replace a combination of VCVT (integer to floating-point) and VDIV as follows: Example (assume d17 = <float 8.000000e+00, float 8.000000e+00>): vcvt.f32.s32 d16, d16 vdiv.f32 d16, d17, d16 becomes: vcvt.f32.s32 d16, d16, #3 llvm-svn: 133813	2011-06-24 19:23:04 +00:00
Akira Hatanaka	35792089e7	Change the chain input of nodes that load the address of a function. This change enables SelectionDAG::getLoad at MipsISelLowering.cpp:1914 to return a pre-existing node instead of redundantly creating a new node every time it is called. llvm-svn: 133811	2011-06-24 19:01:25 +00:00
Akira Hatanaka	ca88b4abec	Prevent generation of redundant addiu instructions that compute address of static variables or functions. llvm-svn: 133803	2011-06-24 17:55:19 +00:00
Justin Holewinski	6cdd72a9ca	PTX: Always use registers for return values, but use .param space for device parameters if SM >= 2.0 - Update test cases to be more robust against register allocation changes - Bump up the number of registers to 128 per type - Include Python script to re-generate register file with any number of registers llvm-svn: 133736	2011-06-23 18:10:13 +00:00
Justin Holewinski	a4aecf3dc4	PTX: Fixup test cases for device param changes llvm-svn: 133735	2011-06-23 18:10:08 +00:00
Andrew Trick	67ff0718a4	lit support for REQUIRES: asserts. Take #2. Don't piggyback on the existing config.build_mode. Instead, define a new lit feature for each build feature we need (currently just "asserts"). Teach both autoconf'd and cmake'd Makefiles to define this feature within test/lit.site.cfg. This doesn't require any lit harness changes and should be more robust across build systems. llvm-svn: 133664	2011-06-22 23:23:19 +00:00
Rafael Espindola	2496c1f1f8	Reenable tail duplication of bb with just an unconditional jump, but don't remove blocks that have their address taken. llvm-svn: 133659	2011-06-22 22:31:57 +00:00
Nick Lewycky	90e6a4e5d5	Needs a triple. llvm-svn: 133634	2011-06-22 19:42:14 +00:00
Nick Lewycky	6208a2fd66	Emit trailing padding on constant vectors when TargetData says that the vector is larger than the sum of the elements (including per-element padding). llvm-svn: 133631	2011-06-22 18:55:03 +00:00
Justin Holewinski	6fafebfb6a	PTX: Add signed integer comparisons llvm-svn: 133599	2011-06-22 02:09:50 +00:00
Justin Holewinski	54e3c0f5d9	PTX: Add .address_size directive if PTX version >= 2.3 Patch by Wei-Ren Chen llvm-svn: 133589	2011-06-22 00:43:56 +00:00
Devang Patel	c93ef81e24	Test case for r133560. llvm-svn: 133585	2011-06-22 00:03:42 +00:00
Bob Wilson	646dd0f4d1	Revert r133452: "Emit movq for 64-bit register to XMM register moves..." This is breaking compiler-rt and llvm-gcc builds on MacOSX when not using the integrated assembler. llvm-svn: 133524	2011-06-21 17:35:13 +00:00
Anna Zaks	083f0b5a7e	Add support for sadd.with.overflow and uadd.with.overflow intrinsics to the CBackend by emitting definitions for each intrinsic that occurs in the module. llvm-svn: 133522	2011-06-21 17:18:15 +00:00
Evan Cheng	4c0bd9629d	Teach dag combine to match halfword byteswap patterns. 1. (((x) & 0xFF00) >> 8) | (((x) & 0x00FF) << 8) => (bswap x) >> 16 2. ((x&0xff)<<8)|((x&0xff00)>>8)|((x&0xff000000)>>8)|((x&0x00ff0000)<<8)) => (rotl (bswap x) 16) This allows us to eliminate most of the def : Pat patterns for ARM rev16 revsh instructions. It catches many more cases for ARM and x86. rdar://9609108 llvm-svn: 133503	2011-06-21 06:01:08 +00:00
Akira Hatanaka	4c406e7457	Re-apply 132758 and 132768 which were speculatively reverted in 132777. llvm-svn: 133494	2011-06-21 00:40:49 +00:00
Justin Holewinski	cd4484d25d	PTX: Fix conversion between predicates and value types llvm-svn: 133454	2011-06-20 18:42:48 +00:00
Nick Lewycky	c7df192279	Emit movq for 64-bit register to XMM register moves, but continue to accept movd when assembling. llvm-svn: 133452	2011-06-20 18:33:26 +00:00
Roman Divacky	254f82112d	Don't apply on PPC64 the 32bit ADDIC optimizations as there's no overflow with 32bit values. llvm-svn: 133439	2011-06-20 15:28:39 +00:00
Nadav Rotem	d34ce4344b	Fix PromoteIntRes_TRUNCATE: Add support for cases where the source vector type is to be split while the target vector is to be promoted. (eg: <4 x i64> -> <4 x i8> ) llvm-svn: 133424	2011-06-20 07:15:58 +00:00
Benjamin Kramer	f3c6d6def5	Update test. llvm-svn: 133390	2011-06-19 12:14:34 +00:00
Nadav Rotem	6d0036e259	Reduce the runtime of the test. Keep only the interesting cases. llvm-svn: 133381	2011-06-19 08:12:43 +00:00
Chris Lattner	8936d2bfbc	Remove support for parsing the "type i32" syntax for defining a numbered top level type without a specified number. This syntax isn't documented and blocks forward progress. llvm-svn: 133371	2011-06-19 00:03:46 +00:00
Chris Lattner	80ed9dc9e5	rip out a ton of intrinsic modernization logic from AutoUpgrade.cpp, which is for pre-2.9 bitcode files. We keep x86 unaligned loads, movnt, crc32, and the target indep prefetch change. As usual, updating the testsuite is a PITA. llvm-svn: 133337	2011-06-18 06:05:24 +00:00
Jakob Stoklund Olesen	831ae0105a	Switch ARM to using AltOrders instead of MethodBodies. This slightly changes the GPR allocation order on Darwin where R9 is not a callee-saved register: Before: %R0 %R1 %R2 %R3 %R12 %R9 %LR %R4 %R5 %R6 %R8 %R10 %R11 After: %R0 %R1 %R2 %R3 %R9 %R12 %LR %R4 %R5 %R6 %R8 %R10 %R11 llvm-svn: 133326	2011-06-18 01:14:46 +00:00
Galina Kistanova	197ea52d4b	Moved to the right place. llvm-svn: 133324	2011-06-18 00:59:37 +00:00
Eric Christopher	e4a1266a9a	Fix UMULO support for 2x register width to allow the full range without a libcall to a new mulo<mode> libcall that we'd have to create. Finishes the rest of rdar://9090077 and rdar://9210061 llvm-svn: 133318	2011-06-18 00:09:57 +00:00
Nadav Rotem	ea7822685a	Fix a bug in the type-lowering of integer-promoted elements. Add a check that the newly created simple type is valid before checking its legality. Re-commit the test file. llvm-svn: 133291	2011-06-17 20:54:12 +00:00
Evan Cheng	7552a62af5	Add an alternative rev16 pattern. We should figure out a better way to handle these complex rev patterns. rdar://9609108 llvm-svn: 133289	2011-06-17 20:47:21 +00:00
Eric Christopher	5bbb2bdb46	Lower multiply with overflow checking to __mulo<mode> calls if we haven't been able to lower them any other way. Fixes rdar://9090077 and rdar://9210061 llvm-svn: 133288	2011-06-17 20:41:29 +00:00
Galina Kistanova	ac6bc75030	Test 2008-06-04-indirectmem.ll is X86-specific. Move to X86 folder. llvm-svn: 133275	2011-06-17 18:26:23 +00:00
Chris Lattner	6bc5c89093	Stop accepting and ignoring attributes in function types. Attributes are applied to functions and call/invokes, not to types. llvm-svn: 133266	2011-06-17 17:37:13 +00:00
Roman Divacky	d041962c20	Fix a few places where 32bit instructions/registerset were used on PPC64. llvm-svn: 133260	2011-06-17 15:21:10 +00:00
Justin Holewinski	3604d9a421	PTX: Adjust rounding modes * rounding modes for fp add, mul, sub now use .rn * float -> int rounding correctly uses .rzi not .rni * 32bit fdiv for sm13 uses div.rn (instead of div.approx) * 32bit fdiv for sm10 now uses div (instead of div.approx) Approx is not IEEE 754 compatible (and should be optionally set by a flag to the backend instead). The .rn rounding modifier is the PTX default anyway, but it's better to be explicit. All these modifiers should be available by using __fmul_rz functions for example, but support will need to be added for this in the backend. Patch by Dan Bailey llvm-svn: 133253	2011-06-17 12:12:42 +00:00
Chris Lattner	5756c16cdf	make the asmparser reject function and type redefinitions. 'Merging' hasn't been needed since llvm-gcc 3.4 days. llvm-svn: 133248	2011-06-17 07:06:44 +00:00
Chris Lattner	59345c8b65	remove asmparser support for the old getresult instruction, which has been subsumed by extractvalue. llvm-svn: 133247	2011-06-17 06:57:15 +00:00
Chris Lattner	33de427cd6	remove parser support for the obsolete "multiple return values" syntax, which was replaced with return of a "first class aggregate". llvm-svn: 133245	2011-06-17 06:49:41 +00:00
Chris Lattner	def1949c00	Remove support for using "foo" as symbols instead of %"foo". This is ancient syntax and has been long obsolete. As usual, updating the tests is the nasty part of this. llvm-svn: 133242	2011-06-17 06:36:20 +00:00
Chris Lattner	b90ed2233c	manually upgrade a bunch of tests to modern syntax, and remove some that are either unreduced or only test old syntax. llvm-svn: 133228	2011-06-17 03:14:27 +00:00
Cameron Zwarich	033026ffc0	Update an insertion point iterator after replacing a return instruction with a tail call pseudoinstruction. This fixes <rdar://problem/9624333>. llvm-svn: 133227	2011-06-17 02:16:43 +00:00
Jakob Stoklund Olesen	c826df9506	Don't use register classes larger than TLI->getRegClassFor(VT). In Thumb mode we cannot handle GPR virtual registers, even though some instructions can. When isel is lowering a CopyFromReg, it should limit itself to subclasses of getRegClassFor(VT). <rdar://problem/9624323> llvm-svn: 133210	2011-06-16 22:50:38 +00:00
Nick Lewycky	65c47187f2	There's no need to be so picky about the particular register. llvm-svn: 133189	2011-06-16 21:00:00 +00:00
Justin Holewinski	7f191b2a3b	PTX: Finish new calling convention implementation llvm-svn: 133172	2011-06-16 17:50:00 +00:00
Bruno Cardoso Lopes	bbf2ab990f	Add AVX support for fpextend. Original patch by Syoyo Fujita with more comments by me. llvm-svn: 133153	2011-06-16 07:03:21 +00:00
Eli Friedman	9b3d779537	FileCheck-ize test, and make it work on EABI hosts, like clang-native-arm-cortex-a9. llvm-svn: 133139	2011-06-16 02:36:32 +00:00
Eli Friedman	575d0163bb	Force a triple here so this test doesn't fail on EABI hosts (like clang-native-arm-cortex-a9). llvm-svn: 133134	2011-06-16 01:49:31 +00:00
Nick Lewycky	27d604cbf3	Commit the right set of tests for r133124. Sorry 'bout that! llvm-svn: 133133	2011-06-16 01:35:45 +00:00
Andrew Trick	41369d5e8a	Reenabling this test with REQUIRES: Asserts llvm-svn: 133132	2011-06-16 01:34:41 +00:00
Chad Rosier	aed609da92	Typos. llvm-svn: 133128	2011-06-16 01:24:24 +00:00
Chad Rosier	2730162bee	Revision r128665 added an optimization to make use of NEON multiplier accumulator forwarding. Specifically (from SVN log entry): Distribute (A + B) * C to (A * C) + (B * C) to make use of NEON multiplier accumulator forwarding: "vadd d3, d0, d1; vmul d3, d3, d2" => "vmul d3, d0, d2; vmla d3, d1, d2". Make sure it catches cases where operand 1 is add/fadd/sub/fsub, which was intended in the original revision. llvm-svn: 133127	2011-06-16 01:21:54 +00:00
Nick Lewycky	6d677cfdd8	Add a DAGCombine for (ext (binop (load x), cst)). llvm-svn: 133124	2011-06-16 01:15:49 +00:00
Anna Zaks	a56e84e439	Rename the test. Thanks Cameron! Use shorter/generic names. llvm-svn: 133115	2011-06-16 00:34:10 +00:00
Anna Zaks	2c2aa9a9be	Function::getNumBlockIDs() should be used instead of Function::size() to set the upper limit on the block IDs since basic blocks might get removed (simplified away) after being initially numbered. Plus the test case, in which SelectionDAGBuilder::visitBr() calls llvm::MachineFunction::removeFromMBBNumbering(), which introduces the hole in numbering leading to an assert in llc (prior to the fix). llvm-svn: 133113	2011-06-16 00:03:21 +00:00
Rafael Espindola	10028230cf	Testcase for previous commit. llvm-svn: 133089	2011-06-15 21:18:51 +00:00
John McCall	4b7a8d68ae	Add a new function attribute, nonlazybind, which inhibits lazy-loading optimizations when emitting calls to the function; instead those calls may use faster relocations which require the function to be immediately resolved upon loading the dynamic object featuring the call. This is useful when it is known that the function will be called frequently and pervasively and therefore there is no merit in delaying binding of the function. Currently only implemented for x86-64, where it turns into a call through the global offset table. Patch by Dan Gohman, who assures me that he's going to add LangRef documentation for this once it's committed. llvm-svn: 133080	2011-06-15 20:36:13 +00:00
Andrew Trick	967d584a3a	Disabling this test until I can figure out the right lit flags. llvm-svn: 133068	2011-06-15 18:25:38 +00:00
Jakob Stoklund Olesen	5977109f14	Remove custom allocation orders in SystemZ. Note that this actually changes code generation, and someone who understands this target better should check the changes. - R12Q is now allocatable. I think it was omitted from the allocation order by mistake since it isn't reserved. It is apparently used as a GOT pointer sometimes, and it should probably be reserved if that is the case. - The GR64 registers are allocated in a different order now. The register allocator will automatically put the CSRs last. There were other changes to the order that may have been significant. The test fix is because r0 and r1 swapped places in the allocation order. llvm-svn: 133067	2011-06-15 18:02:56 +00:00
Evan Cheng	678b691aa3	Another revsh pattern. rdar://9609059 llvm-svn: 133064	2011-06-15 17:17:48 +00:00
Andrew Trick	3013b6ae4a	Added -stress-sched flag in the Asserts build. Added a test case for handling physreg aliases during pre-RA-sched. llvm-svn: 133063	2011-06-15 17:16:12 +00:00
Chad Rosier	19a1f425a7	TargetLoweringOpt is a struct used by DAGCombine, not a pass. llvm-svn: 133062	2011-06-15 16:48:02 +00:00
Nadav Rotem	24c6558865	This test was failing on X86 machines which do not have SSE4. Fixed the test by specifying that the target CPU is corei7. llvm-svn: 133053	2011-06-15 12:26:53 +00:00
Evan Cheng	6d02d9044b	PerformBFICombine - (bfi A, (and B, Mask1), Mask2) -> (bfi A, B, Mask2) iff the bits being cleared by the AND are not demanded by the BFI. The previous BFI dag combine rule was actually incorrect (or used to be correct until BFI representation changed). rdar://9609030 llvm-svn: 133034	2011-06-15 01:12:31 +00:00
Tanya Lattner	e9e6705cf9	Add an optimization that looks for a specific pair-wise add pattern and generates a vpaddl instruction instead of scalarizing the add. Includes a test case. llvm-svn: 133027	2011-06-14 23:48:48 +00:00
Rafael Espindola	2efebb3610	Add triple. llvm-svn: 133026	2011-06-14 23:47:36 +00:00
Chad Rosier	818e116723	When pattern matching during instruction selection, make sure shl x,1 is not converted to add x,x if x is an undef. add undef, undef does not guarantee that the resulting low order bit is zero. Fixes <rdar://problem/9453156> and <rdar://problem/9487392>. llvm-svn: 133022	2011-06-14 22:29:10 +00:00


5147 Commits