llvm-project

Author	SHA1	Message	Date
Craig Topper	ef7f5bf8c9	Clean up patterns for MOVNT*. Not sure why there were floating point types on MOVNTPS and MOVNTDQ. And v4i64 was completely missing. llvm-svn: 147767	2012-01-09 06:52:46 +00:00
Rafael Espindola	f28213ca01	Don't print an unused label before .cfi_endproc. llvm-svn: 147763	2012-01-09 00:17:29 +00:00
Craig Topper	744f6311d3	Don't disable MMX support when AVX is enabled. Fix predicates for MMX instructions that were added along with SSE instructions to check for AVX in addition to SSE level. llvm-svn: 147762	2012-01-09 00:11:29 +00:00
Victor Umansky	540651cf59	Reverted commit #147601 upon Evan's request. llvm-svn: 147748	2012-01-08 17:20:33 +00:00
Rafael Espindola	382412032c	Don't print a label before .cfi_startproc when we don't need to. This makes the produced assembly when using CFI just a bit more readable. llvm-svn: 147743	2012-01-07 22:42:19 +00:00
Jakob Stoklund Olesen	8cdce7e690	Use getRegForValue() to materialize the address of ARM globals. This enables basic local CSE, giving us 20% smaller code for consumer-typeset in -O0 builds. <rdar://problem/10658692> llvm-svn: 147720	2012-01-07 04:07:22 +00:00
Evan Cheng	00b1a3cd7e	Added a late machine instruction copy propagation pass. This catches opportunities that only present themselves after late optimizations such as tail duplication, e.g. ## BB#1: movl %eax, %ecx movl %ecx, %eax ret The register allocator also leaves some of them around (due to false dep between copies from phi-elimination, etc.). This required some changes in codegen passes. Post-ra scheduler and the pseudo-instruction expansion passes have been moved after branch folding and tail merging. They previously ran before branch folding because it did not always update block livein's. That's fixed now. The pass change makes sense independently since we want to properly schedule instructions after branch folding / tail duplication. rdar://10428165 rdar://10640363 llvm-svn: 147716	2012-01-07 03:02:36 +00:00
Jakob Stoklund Olesen	68f034ee1a	Use movw+movt in ARMFastISel::ARMMaterializeGV. This eliminates a lot of constant pool entries for -O0 builds of code with many global variable accesses. This speeds up -O0 codegen of consumer-typeset by 2x because the constant island pass no longer has to look at thousands of constant pool entries. <rdar://problem/10629774> llvm-svn: 147712	2012-01-07 01:47:05 +00:00
Eric Christopher	c206d46709	Make the 'x' constraint work for AVX registers as well. Fixes rdar://10614894 llvm-svn: 147704	2012-01-07 01:02:09 +00:00
Jakob Stoklund Olesen	68a922c0e9	Enable aligned NEON spilling by default. Experiments show this to be a small speedup for modern ARM cores. llvm-svn: 147689	2012-01-06 22:19:37 +00:00
Chandler Carruth	e041a30bb9	Prevent a DAGCombine from firing where there are two uses of a combined-away node and the result of the combine isn't substantially smaller than the input, it's just canonicalized. This is the first part of a significant (7%) performance gain for Snappy's hot decompression loop. llvm-svn: 147604	2012-01-05 11:05:55 +00:00
Chandler Carruth	6bc151f5d4	Cleanup and FileCheck-ize a test. llvm-svn: 147603	2012-01-05 11:05:47 +00:00
Victor Umansky	9255b6d9fe	Peephole optimization of ptest-conditioned branch in X86 arch. Performs instruction combining of sequences generated by ptestz/ptestc intrinsics to ptest+jcc pair for SSE and AVX. Testing: passed 'make check' including LIT tests for all sequences being handled (both SSE and AVX) Reviewers: Evan Cheng, David Blaikie, Bruno Lopes, Elena Demikhovsky, Chad Rosier, Anton Korobeynikov llvm-svn: 147601	2012-01-05 08:46:19 +00:00
Benjamin Kramer	aca1885695	FileCheck hygiene. llvm-svn: 147580	2012-01-05 00:43:34 +00:00
Jakob Stoklund Olesen	d110e2a83f	Reapply r146997, "Heed spill slot alignment on ARM." Now that canRealignStack() understands frozen reserved registers, it is safe to use it for aligned spill instructions. It will only return true if the registers reserved at the beginning of register allocation allow for dynamic stack realignment. <rdar://problem/10625436> llvm-svn: 147579	2012-01-05 00:26:57 +00:00
NAKAMURA Takumi	91a3f886ef	test/CodeGen/X86/jump_sign.ll: Add -mcpu=pentiumpro for non-x86 hosts. It uses "cmov". llvm-svn: 147521	2012-01-04 03:52:23 +00:00
Akira Hatanaka	c669d7a6db	Have getRegForInlineAsmConstraint return the correct register class when target is Mips64. llvm-svn: 147516	2012-01-04 02:45:01 +00:00
Evan Cheng	801d98b3f0	Fix more places which should be checking for iOS, not darwin. llvm-svn: 147513	2012-01-04 01:55:04 +00:00
Evan Cheng	104dbb0fd1	For x86, canonicalize max (x > y) ? x : y => (x >= y) ? x : y. So for something like (x - y) > 0 ? (x - y) : 0, it will be (x - y) >= 0 ? (x - y) : 0. This makes it possible to test the sign bit and eliminate a comparison against zero, e.g. subl %esi, %edi testl %edi, %edi movl $0, %eax cmovgl %edi, %eax => xorl %eax, %eax subl %esi, %edi cmovsl %eax, %edi rdar://10633221 llvm-svn: 147512	2012-01-04 01:41:39 +00:00
Jakob Stoklund Olesen	1b7f2a7638	Revert r146997, "Heed spill slot alignment on ARM." This patch caused a miscompilation of oggenc because a frame pointer was suddenly needed halfway through register allocation. <rdar://problem/10625436> llvm-svn: 147487	2012-01-03 22:34:35 +00:00
Nadav Rotem	6d31bac85e	Revert 147426 because it caused pr11696. llvm-svn: 147485	2012-01-03 22:19:42 +00:00
Nadav Rotem	1e7dda13c8	Fix incorrect widening of the bitcast sdnode in case the incoming operand is integer-promoted. llvm-svn: 147484	2012-01-03 22:12:28 +00:00
Chad Rosier	493c1b3152	Enhance DAGCombine for transforming 128->256 casts into a vmovaps, rather than a vxorps + vinsertf128 pair if the original vector came from a load. rdar://10594409 llvm-svn: 147481	2012-01-03 21:05:52 +00:00
Elena Demikhovsky	8ec21a2801	Fixed a bug in SelectionDAG.cpp. The failure seen on win32, when i64 type is illegal. It happens on stage of conversion VECTOR_SHUFFLE to BUILD_VECTOR. The failure message is: llc: SelectionDAG.cpp:784: void VerifyNodeCommon(llvm::SDNode*): Assertion `(I->getValueType() == EltVT || (EltVT.isInteger() && I->getValueType().isInteger() && EltVT.bitsLE(I->getValueType()))) && "Wrong operand type!"' failed. I added a special test that checks vector shuffle on win32. llvm-svn: 147445	2012-01-03 11:59:04 +00:00
Nadav Rotem	6c7a0e6c8b	Optimize the sequence blend(sign_extend(x)) to blend(shl(x)) since SSE blend instructions only look at the highest bit. llvm-svn: 147426	2012-01-02 08:05:46 +00:00
Craig Topper	b910984458	Allow CRC32 instructions to be selected when AVX is enabled. llvm-svn: 147411	2012-01-01 19:51:58 +00:00
Craig Topper	1c064e0a89	Fix sfence, lfence, mfence, and clflush to be able to be selected when AVX is enabled. Fix monitor and mwait to require SSE3 or AVX, previously they worked even if SSE3 was disabled. Make prefetch instructions not set the execution domain since they don't use XMM registers. llvm-svn: 147409	2012-01-01 19:40:22 +00:00
Rafael Espindola	d3df940169	Revert 147399. It broke CodeGen/ARM/vext.ll. llvm-svn: 147400	2012-01-01 17:36:23 +00:00
Elena Demikhovsky	67f80c3432	Fixed a bug in SelectionDAG.cpp. The failure seen on win32, when i64 type is illegal. It happens on stage of conversion VECTOR_SHUFFLE to BUILD_VECTOR. The failure message is: llc: SelectionDAG.cpp:784: void VerifyNodeCommon(llvm::SDNode*): Assertion `(I->getValueType() == EltVT || (EltVT.isInteger() && I->getValueType().isInteger() && EltVT.bitsLE(I->getValueType()))) && "Wrong operand type!"' failed. I added a special test that checks vector shuffle on win32. llvm-svn: 147399	2012-01-01 16:22:47 +00:00
Craig Topper	d51092d93a	Add patterns for integer forms of SHUFPD/VSHUFPD with a memory load. llvm-svn: 147393	2011-12-31 23:24:49 +00:00
Craig Topper	0e796fee11	Fix typo in a SHUFPD and VSHUFPD pattern that prevented SHUFPD/VSHUFPD with a load from being selected. llvm-svn: 147392	2011-12-31 23:15:11 +00:00
Craig Topper	6c08930c5e	Change FMA4 memory forms to use memopv* instead of alignedloadv*. No need to force alignment on these instructions. Add a couple testcases for memory forms. llvm-svn: 147361	2011-12-30 02:18:36 +00:00
Craig Topper	2ca79b9d4b	Fix load size for FMA4 SS/SD instructions. They need to use f32 and f64 size, but with the special handling to be compatible with the intrinsic expecting a vector. Similar handling is already used elsewhere. llvm-svn: 147360	2011-12-30 01:49:53 +00:00
Hal Finkel	692d1fb355	Cleanup stack/frame register define/kill states. This fixes two bugs: 1. The ST*UX instructions that store and update the stack pointer did not set define/kill on R1. This became a problem when I activated post-RA scheduling (and had incorrectly adjusted the Frames-large test). 2. eliminateFrameIndex did not kill its scavenged temporary register, and this could cause the scavenger to exhaust all available registers (and its emergency spill slot) when there were a lot of CR values to spill. The 2010-02-12-saveCR test has been adjusted to check for this. llvm-svn: 147359	2011-12-30 00:34:00 +00:00
Eli Friedman	3a01ddb7e9	Fix type-checking for load transformation which is not legal on floating-point types. PR11674. llvm-svn: 147323	2011-12-28 21:24:44 +00:00
Nadav Rotem	3c3dd6e588	PR11662. Promotion of the mask operand needs to be done using PromoteTargetBoolean, and not padded with garbage. llvm-svn: 147309	2011-12-28 13:08:20 +00:00
Elena Demikhovsky	b3515a8d4b	Fixed a bug in LowerVECTOR_SHUFFLE and LowerBUILD_VECTOR. Matching MOVLP mask for AVX (256-bit vectors) was wrong. The failure was detected by conformance tests. llvm-svn: 147308	2011-12-28 08:14:01 +00:00
Eli Friedman	e96286cdf2	Make sure DAGCombiner doesn't introduce multiple loads from the same memory location. PR10747, part 2. llvm-svn: 147283	2011-12-26 22:49:32 +00:00
Chandler Carruth	a3d54fe0ae	Use standard promotion for i8 CTTZ nodes and i8 CTLZ nodes when the LZCNT instructions are available. Force promotion to i32 to get a smaller encoding since the fix-ups necessary are just as complex for either promoted type. We can't do standard promotion for CTLZ when lowering through BSR because it results in poor code surrounding the 'xor' at the end of this instruction. Essentially, if we promote the entire CTLZ node to i32, we end up doing the xor on a 32-bit CTLZ implementation, and then subtracting appropriately to get back to an i8 value. Instead, our custom logic just uses the knowledge of the incoming size to compute a perfect xor. I'd love to know of a way to fix this, but so far I'm drawing a blank. I suspect the legalizer could be more clever and/or it could collude with the DAG combiner, but how... ;] llvm-svn: 147251	2011-12-24 12:12:34 +00:00
Chandler Carruth	38ce24455d	Add systematic testing for cttz as well, and fix the bug I spotted by inspection earlier. llvm-svn: 147250	2011-12-24 11:46:10 +00:00
Chandler Carruth	103ca80f59	Add i8 and i64 testing for ctlz on x86. Also simplify the i16 test. llvm-svn: 147249	2011-12-24 11:26:59 +00:00
Chandler Carruth	44cf07228b	Tidy up this rather crufty test. Put the declarations at the top to make my C-brain happy. Remove the unnecessary bits of pedantic IR fluff like nounwind. Remove stray uses comments. Name things semantically rather than tN so that adding a new test in the middle doesn't cause pain, and so that new tests can be grouped semantically. This exposes how little systematic testing is going on here. I noticed this by finding several bugs via inspection and wondering why this test wasn't catching any of them. =[ llvm-svn: 147248	2011-12-24 11:26:57 +00:00
Chandler Carruth	c9fcde2347	Expand more when we have a nice 'tzcnt' instruction, to avoid generating 'bsf' instructions here. This one is actually debatable to my eyes. It's not clear that any chip implementing 'tzcnt' would have a slow 'bsf' for any reason, and unless EFLAGS or a zero input matters, 'tzcnt' is just a longer encoding. Still, this restores the old behavior with 'tzcnt' enabled for now. llvm-svn: 147246	2011-12-24 11:11:38 +00:00
Chandler Carruth	eeb3a1ce3e	Tidy up some of these tests. llvm-svn: 147245	2011-12-24 11:11:36 +00:00
Chandler Carruth	7e9453e916	Switch the lowering of CTLZ_ZERO_UNDEF from a .td pattern back to the X86ISelLowering C++ code. Because this is lowered via an xor wrapped around a bsr, we want the dagcombine which runs after isel lowering to have a chance to clean things up. In particular, it is very common to see code which looks like: (sizeof(x)*8 - 1) ^ __builtin_clz(x) Which is trying to compute the most significant bit of 'x'. That's actually the value computed directly by the 'bsr' instruction, but if we match it too late, we'll get completely redundant xor instructions. The more naive code for the above (subtracting rather than using an xor) still isn't handled correctly due to the dagcombine getting confused. Also, while here fix an issue spotted by inspection: we should have been expanding the zero-undef variants to the normal variants when there is an 'lzcnt' instruction. Do so, and test for this. We don't want to generate unnecessary 'bsr' instructions. These two changes fix some regressions in encoding and decoding benchmarks. However, there is still a lot to be improved on in this type of code. llvm-svn: 147244	2011-12-24 10:55:54 +00:00
Chandler Carruth	15075d4b19	Cleanup this test a bit, sorting things and grouping them more clearly. llvm-svn: 147243	2011-12-24 10:55:42 +00:00
Akira Hatanaka	79329ce425	Test case for r147232. llvm-svn: 147233	2011-12-24 03:05:43 +00:00
Jakob Stoklund Olesen	0965585cb1	Experimental support for aligned NEON spills. ARM targets with NEON units have access to aligned vector loads and stores that are potentially faster than unaligned operations. Add support for spilling the callee-saved NEON registers to an aligned stack area using 16-byte aligned NEON loads and store. This feature is off by default, controlled by an -align-neon-spills command line option. llvm-svn: 147211	2011-12-23 00:36:18 +00:00
Chad Rosier	7248bda595	Fix a couple of copy-n-paste bugs. Noticed by George Russell! llvm-svn: 147064	2011-12-21 18:56:22 +00:00
Evan Cheng	dc8a1aaea6	Fix a couple of copy-n-paste bugs. Noticed by George Russell. llvm-svn: 147032	2011-12-21 03:04:10 +00:00
Akira Hatanaka	964c891e61	Fix bug in zero-store peephole pattern reported in pr11615. The patch and test case were originally written by Mans Rullgard. llvm-svn: 147024	2011-12-21 00:31:10 +00:00
Akira Hatanaka	1d8efaba7e	Expand 64-bit CTLZ nodes if target architecture does not support it. Add test case for DCLO and DCLZ. llvm-svn: 147022	2011-12-21 00:20:27 +00:00
Akira Hatanaka	bd95275f7a	Test case for r147017. llvm-svn: 147018	2011-12-20 23:58:36 +00:00
Akira Hatanaka	cb2a85bc22	Add function MipsDAGToDAGISel::SelectMULT and factor out code that generates nodes needed for multiplication. Add code for selecting 64-bit MULHS and MULHU nodes. llvm-svn: 147008	2011-12-20 23:10:57 +00:00
Akira Hatanaka	cf10f08825	64-bit data directive. llvm-svn: 147005	2011-12-20 22:52:19 +00:00
Akira Hatanaka	494fdf1499	32-to-64-bit sext_inreg pattern. llvm-svn: 147004	2011-12-20 22:40:40 +00:00
Akira Hatanaka	dac1d48d8d	Add code in MipsDAGToDAGISel for selecting constant +0.0. MIPS64 can generate constant +0.0 with a single DMTC1 instruction. llvm-svn: 146999	2011-12-20 22:25:50 +00:00
Jakob Stoklund Olesen	b95c102c2f	Heed spill slot alignment on ARM. Use the spill slot alignment as well as the local variable alignment to determine when the stack needs to be realigned. This works now that the ARM target can always realign the stack by using a base pointer. Still respect the ARMBaseRegisterInfo::canRealignStack() function vetoing a realigned stack. Don't use aligned spill code in that case. llvm-svn: 146997	2011-12-20 22:15:04 +00:00
Evan Cheng	68132d8093	ARM target code clean up. Check for iOS, not Darwin where it makes sense. llvm-svn: 146981	2011-12-20 18:26:50 +00:00
Elena Demikhovsky	ec7e6e0946	This is the second fix related to VZEXT_MOVL node. The failure that I see in the current version is: LLVM ERROR: Cannot select: 0x18b8f70: v4i64 = X86ISD::VZEXT_MOVL 0x18beee0 [ID=14] 0x18beee0: v4i64 = insert_subvector 0x18b8c70, 0x18b9170, 0x18b9570 [ID=13] 0x18b8c70: v4i64 = insert_subvector 0x18b9870, 0x18bf4e0, 0x18b9970 [ID=12] 0x18b9870: v4i64 = undef [ID=4] 0x18bf4e0: v2i64 = bitcast 0x18bf3e0 [ID=10] 0x18bf3e0: v4i32 = BUILD_VECTOR 0x18b9770, 0x18b9770, 0x18b9770, 0x18b9770 [ID=8] 0x18b9770: i32 = TargetConstant<0> [ID=6] 0x18b9770: i32 = TargetConstant<0> [ID=6] 0x18b9770: i32 = TargetConstant<0> [ID=6] 0x18b9770: i32 = TargetConstant<0> [ID=6] 0x18b9970: i32 = Constant<0> [ID=3] 0x18b9170: v2i64 = undef [ORD=1] [ID=1] 0x18b9570: i32 = Constant<2> [ID=5] llvm-svn: 146975	2011-12-20 13:34:28 +00:00
Chandler Carruth	24680c24d8	Begin teaching the X86 target how to efficiently codegen patterns that use the zero-undefined variants of CTTZ and CTLZ. These are just simple patterns for now, there is more to be done to make real world code using these constructs be optimized and codegen'ed properly on X86. The existing tests are spiffed up to check that we no longer generate unnecessary cmov instructions, and that we generate the very important 'xor' to transform bsr which counts the index of the most significant one bit to the number of leading (most significant) zero bits. Also they now check that when the variant with defined zero result is used, the cmov is still produced. llvm-svn: 146974	2011-12-20 11:19:37 +00:00
Bob Wilson	75f12cc3fe	Mark ARM eh_sjlj_dispatchsetup as clobbering all registers. Radar 10567930. We used to rely on the *eh_sjlj_setjmp instructions to mark that a function with setjmp/longjmp exception handling clobbers all the registers. But with the recent reorganization of ARM EH, those eh_sjlj_setjmp instructions are expanded away earlier, before PEI can see them to determine what registers to save and restore. Mark the dispatchsetup instruction in the same way, since that instruction cannot be expanded early. This also more accurately reflects when the registers are clobbered. llvm-svn: 146949	2011-12-20 01:29:27 +00:00
Evan Cheng	3bfaefe9e7	Move tests to FileCheck. llvm-svn: 146923	2011-12-19 23:26:44 +00:00
Akira Hatanaka	37c45db189	Add a test case for r146900. llvm-svn: 146901	2011-12-19 20:24:28 +00:00
Akira Hatanaka	db47e0c49d	Add patterns for matching immediates whose lower 16-bit is cleared. These patterns emit a single LUi instruction instead of a pair of LUi and ORi. llvm-svn: 146900	2011-12-19 20:21:18 +00:00
Akira Hatanaka	2a232d81f6	Remove definitions of double word shift plus 32 instructions. Assembler or direct-object emitter should emit the appropriate shift instruction depending on the shift amount. llvm-svn: 146893	2011-12-19 19:44:09 +00:00
Akira Hatanaka	3c9f336361	Remove the restriction on the first operand of the add node in SelectAddr. This change reduces the number of instructions generated. For example, (load (add (sub $n0, $n1), (MipsLo got(s)))) results in the following sequence of instructions: 1. sub $n2, $n0, $n1 2. lw got(s)($n2) Previously, three instructions were needed. 1. sub $n2, $n0, $n1 2. addiu $n3, $n2, got(s) 3. lw 0($n3) llvm-svn: 146888	2011-12-19 19:28:37 +00:00
Evan Cheng	903231bc58	Fix a CPSR liveness tracking bug introduced when I converted IT block to bundle. llvm-svn: 146805	2011-12-17 01:25:34 +00:00
Lang Hames	da07b3ad42	Make sure that the lower bits on the VSELECT condition are properly set. llvm-svn: 146800	2011-12-17 01:08:46 +00:00
Jakob Stoklund Olesen	9790187b6c	Fix off-by-one error in bucket sort. The bad sorting caused a misaligned basic block when building 176.vpr in ARM mode. <rdar://problem/10594653> llvm-svn: 146767	2011-12-16 23:00:05 +00:00
Benjamin Kramer	9ca2e7293b	Hexagon: Fix a nasty order-of-initialization bug. Reenable the tests. llvm-svn: 146750	2011-12-16 19:08:59 +00:00
Craig Topper	a4d411cb1b	Don't try to match 'unpackl/h v, v' for 32xi8 and 16xi16 when only AVX1 is supported. Fix 'unpackh v, v' for 256-bit types to understand 128-bit lanes. llvm-svn: 146726	2011-12-16 08:06:31 +00:00
Chad Rosier	41dbf59e12	Add missing zmovl AVX patterns which were causing crashes. Patch by Elena Demikhovsky <elena.demikhovsky@intel.com>! llvm-svn: 146689	2011-12-15 22:11:31 +00:00
Chad Rosier	75ed9dcbc6	Fix assert in LowerBUILD_VECTOR for v16i16 type on AVX. Patch by Elena Demikhovsky <elena.demikhovsky@intel.com>! llvm-svn: 146684	2011-12-15 21:34:44 +00:00
Lang Hames	918f976e66	Set specific target cpu for testcase. llvm-svn: 146678	2011-12-15 20:22:34 +00:00
Lang Hames	2d6d3a2f96	Added test case for r146671. llvm-svn: 146675	2011-12-15 19:56:07 +00:00
Hal Finkel	750366f014	Add a test case to make sure that the nop really does follow the bl on ppc64 elf llvm-svn: 146666	2011-12-15 17:59:23 +00:00
Eli Friedman	2ec824966d	Don't try to form FGETSIGN after legalization; it is possible in some cases, but the existing code can't do it correctly. PR11570. llvm-svn: 146630	2011-12-15 02:07:20 +00:00
Chad Rosier	1940baa76b	Add support for lowering fneg when AVX is enabled. rdar://10566486 llvm-svn: 146625	2011-12-15 01:02:25 +00:00
Devang Patel	c268688643	Do not sink instruction, if it is not profitable. On ARM, peephole optimization for ABS creates a trivial cfg triangle which tempts machine sink to sink instructions in code which is really straight line code. Sometimes this sinking may alter register allocator input such that use and def of a reg is divided by a branch in between, which may result in extra spills. Now machine sink avoids sinking if final sink destination is post dominator. Radar 10266272. llvm-svn: 146604	2011-12-14 23:20:38 +00:00
Akira Hatanaka	bff84e1914	Add support for local dynamic TLS model in LowerGlobalTLSAddress. Direct object emission is not supported yet, but a patch that adds the support should follow soon. llvm-svn: 146572	2011-12-14 18:26:41 +00:00
Evan Cheng	7fae11b231	- Add MachineInstrBundle.h and MachineInstrBundle.cpp. This includes a function to finalize MI bundles (i.e. add BUNDLE instruction and computing register def and use lists of the BUNDLE instruction) and a pass to unpack bundles. - Teach more of MachineBasic and MachineInstr methods to be bundle aware. - Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to prevent IT blocks from being broken apart. llvm-svn: 146542	2011-12-14 02:11:42 +00:00
Chad Rosier	4020ae75ea	Add newline at EOF. llvm-svn: 146538	2011-12-14 01:34:39 +00:00
Chad Rosier	563de603f7	[fast-isel] Unaligned loads of floats are not supported. Therefore, convert to a regular load and then move the result from a GPR to a FPR. llvm-svn: 146502	2011-12-13 19:22:14 +00:00
Akira Hatanaka	341850fdc6	Move direct object emitter test to directory test/MC/Mips. Rename it to elf-relsym.ll. llvm-svn: 146470	2011-12-13 03:50:34 +00:00
Akira Hatanaka	e41963ce47	Relocation against a symbol, instead of against section. We had some extreme test cases where there were a lot of relocations applied relative to a large rodata section. Gas would create a symbol for each of these whereas we would be relative to the beginning of the rodata section. This change mimics what gas does. Patch by Jack Carter. llvm-svn: 146468	2011-12-13 02:27:40 +00:00
Tony Linthicum	525ca5fc69	Temporarily disable Hexagon tests. They are failing on OS X llvm-svn: 146455	2011-12-13 00:33:45 +00:00
Akira Hatanaka	9e5908ae3a	Test case for r146432 by Jack Carter. llvm-svn: 146433	2011-12-12 22:41:39 +00:00
Bob Wilson	fadc2c83e5	Implement 'e' and 'f' modifiers for Neon inline asm. <rdar://problem/10551006> These modifiers simply select either the low or high D subregister of a Neon Q register. I've also removed the unimplemented 'p' modifier, which turns out to be a bit different than the comment here suggests and as far as I can tell was only intended for internal use in Apple's version of gcc. llvm-svn: 146417	2011-12-12 21:45:15 +00:00
Tony Linthicum	1213a7a57f	Hexagon backend support llvm-svn: 146412	2011-12-12 21:14:40 +00:00
Chandler Carruth	6b0e34c445	Manually upgrade the test suite to specify the flag to cttz and ctlz. I followed three heuristics for deciding whether to set 'true' or 'false': - Everything target independent got 'true' as that is the expected common output of the GCC builtins. - If the target arch only has one way of implementing this operation, set the flag in the way that exercises the most of codegen. For most architectures this is also the likely path from a GCC builtin, with 'true' being set. It will (eventually) require lowering away that difference, and then lowering to the architecture's operation. - Otherwise, set the flag differently depending on which target operation should be tested. Let me know if anyone has any issue with this pattern or would like specific tests of another form. This should allow the x86 codegen to just iteratively improve as I teach the backend how to differentiate between the two forms, and everything else should remain exactly the same. llvm-svn: 146370	2011-12-12 11:59:10 +00:00
Stepan Dyatkovskiy	4683740967	Fixed bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). Third attempt: simplified checks in test for armv7-apple-darwin11. llvm-svn: 146341	2011-12-11 14:35:48 +00:00
Chad Rosier	1c468af854	Revert associated SelectInsertValue test as well. llvm-svn: 146332	2011-12-10 21:34:28 +00:00
Chad Rosier	6641294e3b	Revert r146322 to appease buildbots. Original commit message: Fixed bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). Second attempt. llvm-svn: 146328	2011-12-10 19:55:03 +00:00
Stepan Dyatkovskiy	df0b779e9f	Fixed bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). Second attempt. llvm-svn: 146322	2011-12-10 08:42:24 +00:00
Hal Finkel	67a7f18faf	Make CR spill and restore use a reserved register. These operations cannot use the register scavenger because the scavenger can only scavenge one register and frame-index elimination may have already grabbed it. llvm-svn: 146318	2011-12-10 04:50:53 +00:00
Eli Friedman	4e36a934dc	Splats can contain undef's; make sure to handle them correctly. PR11526. llvm-svn: 146299	2011-12-09 23:54:42 +00:00
Evan Cheng	1d54d2210a	Update test to something more sensible. llvm-svn: 146282	2011-12-09 21:54:10 +00:00
Chad Rosier	dd998ff4df	[fast-isel] Add support for selecting insertvalue. rdar://10530851 llvm-svn: 146276	2011-12-09 20:09:54 +00:00
Benjamin Kramer	16bbfbec66	X86: Add patterns for the various rounding ops for SSE4.1 and AVX. llvm-svn: 146257	2011-12-09 15:44:03 +00:00
Evan Cheng	5895fa79d6	Forgot setting -march. llvm-svn: 146244	2011-12-09 06:15:00 +00:00
Akira Hatanaka	8e16aac534	jalr should use t9 ($25) for indirect calls regardless of the relocation model specified. llvm-svn: 146229	2011-12-09 01:45:12 +00:00
Eli Friedman	053a724483	Fix a couple of logic bugs in TargetLowering::SimplifyDemandedBits. PR11514. llvm-svn: 146219	2011-12-09 01:16:26 +00:00
Evan Cheng	b96bca81e7	Add 256-bit variant vmovss and vmovsd patterns. rdar://10538417 llvm-svn: 146196	2011-12-08 22:30:45 +00:00
Evan Cheng	2a217be25f	Add various missing AVX patterns which were causing crashes. Sadly, the generated code looks pretty bad compared to SSE. rdar://10538793 llvm-svn: 146191	2011-12-08 22:05:28 +00:00
Owen Anderson	0b9b9da6c8	Teach SelectionDAG to match more calls to libm functions onto existing SDNodes. Mark these nodes as illegal by default, unless the target declares otherwise. llvm-svn: 146171	2011-12-08 19:32:14 +00:00
Evan Cheng	3294538546	Add test for r146163. llvm-svn: 146167	2011-12-08 19:21:39 +00:00
Daniel Dunbar	c09e4593b2	Revert r146143, "Fix bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2).", it is failing tests. llvm-svn: 146157	2011-12-08 17:32:18 +00:00
NAKAMURA Takumi	0faa233439	test/CodeGen/X86/vec_compare-2.ll: Add explicit -mtriple=i686-linux. llvm-svn: 146152	2011-12-08 15:24:09 +00:00
Nadav Rotem	26edb291ac	Fix a bug in the integer-promotion of bitcast operations on vector types. We must not issue a bitcast operation for integer-promotion of vector types, because the location of the values in the vector may be different. llvm-svn: 146150	2011-12-08 13:10:01 +00:00
Stepan Dyatkovskiy	a4bcf27dae	Fix bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). llvm-svn: 146143	2011-12-08 07:55:03 +00:00
Akira Hatanaka	ae378af667	32 to 64-bit zext pattern. llvm-svn: 146096	2011-12-07 23:14:41 +00:00
Akira Hatanaka	b2e05cb6b1	64-bit WrapperPICPat patterns. llvm-svn: 146086	2011-12-07 22:11:43 +00:00
Akira Hatanaka	c5b5a8d8b1	Modify LowerFCOPYSIGN to handle Mips64. llvm-svn: 146080	2011-12-07 21:48:50 +00:00
Akira Hatanaka	4a04a56a36	Fix 64-bit immediate patterns. llvm-svn: 146059	2011-12-07 20:10:24 +00:00
Eli Friedman	ed8b3e38ec	Support vector bitcasts in the AsmPrinter. PR11495. llvm-svn: 146001	2011-12-07 00:50:54 +00:00
Eli Friedman	0e58cba286	Fix an optimization involving EXTRACT_SUBVECTOR in DAGCombine so it behaves correctly. PR11494. llvm-svn: 145996	2011-12-07 00:11:56 +00:00
Hal Finkel	0fc34bc2d3	delaying restore-cr changed assigned registers in some tests llvm-svn: 145963	2011-12-06 20:55:46 +00:00
Hal Finkel	0702bc1b28	add a test case that uses RESTORE_CR llvm-svn: 145962	2011-12-06 20:55:41 +00:00
Justin Holewinski	04424665c3	PTX: Continue to fix up the register mess. llvm-svn: 145947	2011-12-06 17:39:48 +00:00
Craig Topper	6572e0f203	Fix a bunch of SSE/AVX patterns to use v2i64/v4i64 loads since all other integer vector loads are promoted to those. llvm-svn: 145927	2011-12-06 09:04:59 +00:00
Craig Topper	bf41eb3a98	Merge isSHUFPMask and isCommutedSHUFPMask into single function that can do both. Do the same for the 256-bit version. Use loops to reduce size of isVSHUFPYMask. Fix test cases that were incorrectly passing due to isCommutedSHUFPMask not checking for the vector being 128-bit. This caused some 256-bit shuffles to be incorrectly commuted. llvm-svn: 145921	2011-12-06 04:59:07 +00:00
Chad Rosier	c77830d21e	[arm-fast-isel] Doublewords only require word-alignment. rdar://10528060 llvm-svn: 145891	2011-12-06 01:44:17 +00:00
Jakob Stoklund Olesen	2e05db2fa0	Align ARM constant pool islands via their basic block. Previously, all ARM::CONSTPOOL_ENTRY instructions had a hardwired alignment of 4 bytes emitted by ARMAsmPrinter. Now the same alignment is set on the basic block. This is in preparation of supporting ARM constant pool islands with different alignments. llvm-svn: 145890	2011-12-06 01:43:02 +00:00
Akira Hatanaka	20cee2eba1	Add definitions of 64-bit extract and insert instructions and make PerformANDCombine and PerformOrCombine aware of them. Test cases are included too. llvm-svn: 145853	2011-12-05 21:26:34 +00:00
Akira Hatanaka	34e3df76f9	Have LowerJumpTable support Mips64. Modify 2010-07-20-Switch.ll to test N64 and O32 with relocation-model=pic too. llvm-svn: 145850	2011-12-05 21:03:03 +00:00
Hal Finkel	97a6028b3a	Add test case - this input used to crash because of duplicate generation of SPILL_CRs llvm-svn: 145820	2011-12-05 17:55:22 +00:00
Hal Finkel	8f6834dfa5	enable PPC register scavenging by default (update tests and remove some FIXMEs) llvm-svn: 145819	2011-12-05 17:55:17 +00:00
Hal Finkel	e18c72689c	remove wasted space for extra bit copies of CR2 subregs llvm-svn: 145817	2011-12-05 17:55:06 +00:00
NAKAMURA Takumi	e6efe405de	test/CodeGen/X86/pointer-vector.ll: Add explicit -mtriple=i686-linux. llvm-svn: 145805	2011-12-05 07:54:57 +00:00
Nadav Rotem	3924cb0267	Add support for vectors of pointers. llvm-svn: 145801	2011-12-05 06:29:09 +00:00
Anton Korobeynikov	965e0c6de2	Emit the ctors in the proper order on ARM/EABI. Maybe some targets should use this as well. Patch by Evgeniy Stepanov! llvm-svn: 145781	2011-12-03 23:49:37 +00:00
Venkatraman Govindaraju	6dae604f50	Sparc CodeGen: Fix AnalyzeBranch for PR 10282. Removing addSuccessor() since AnalyzeBranch doesn't change the successor, just the order. llvm-svn: 145779	2011-12-03 21:24:48 +00:00
Sanjoy Das	006e43bcc0	Check for stack space more intelligently. libgcc sets the stack limit field in TCB to 256 bytes above the actual allocated stack limit. This means if the function's stack frame needs less than 256 bytes, we can just compare the stack pointer with the stack limit. This should result in fewer calls to __morestack. llvm-svn: 145766	2011-12-03 09:32:07 +00:00
Sanjoy Das	165ca1d4ba	Fix a bug in the x86-32 code generated for segmented stacks. Currently LLVM pads the call to __morestack with an add and sub of 8 bytes to esp. This isn't correct since __morestack expects the call to be followed directly by a ret. This commit also adjusts the relevant test-case. llvm-svn: 145765	2011-12-03 09:21:07 +00:00
Chad Rosier	ec3b77e00d	[arm-fast-isel] Unaligned stores of floats require special care. rdar://10510150 llvm-svn: 145742	2011-12-03 02:21:57 +00:00
Akira Hatanaka	430f917fbe	Test cases for 64-bit multiplication and division. llvm-svn: 145717	2011-12-02 22:31:36 +00:00
Akira Hatanaka	bbc5555bee	Fix test cases to use FileCheck. llvm-svn: 145716	2011-12-02 22:28:09 +00:00
Chad Rosier	9fd0e55e91	[arm-fast-isel] After promoting a function parameter be sure to update the argument value type. Otherwise, the sign/zero-extend has no effect on arguments passed via the stack (i.e., undefined high-order bits). rdar://10515467 llvm-svn: 145701	2011-12-02 20:25:18 +00:00
Hal Finkel	d87f7af1f3	specify cpu for test to fix failure on some darwin systems with a g4+ cpu llvm-svn: 145699	2011-12-02 19:38:17 +00:00
Craig Topper	abeb79eee3	Add instruction selection support for horizontal add/sub of 256-bit floating point vectors. Also add the test case for 256-bit integer vectors. llvm-svn: 145680	2011-12-02 07:16:01 +00:00
Hal Finkel	9286705955	adjust the instruction ordering in some PPC tests: changes due to postRA haz. rec. llvm-svn: 145678	2011-12-02 04:58:12 +00:00
Eric Christopher	9da7f305a4	For 64-bit the rest of the general regs are ok for the q constraint. Make sure we can emit both the high and low versions of those registers. Fixes rdar://10392864 llvm-svn: 145579	2011-12-01 08:12:41 +00:00
Eli Friedman	d61887dd0a	Pass AVX vectors which are arguments to varargs functions on the stack. <rdar://problem/10463281>. llvm-svn: 145573	2011-12-01 04:49:21 +00:00
Jan Sjödin	9430e284a9	Support for encoding all FMA4 instructions and tablegen patterns for all remaining FMA4 instructions and intrinsics with tests. llvm-svn: 145525	2011-11-30 22:09:42 +00:00
Eli Friedman	6cff9df298	Make GlobalMerge honor the preferred alignment on globals without an explicitly specified alignment. <rdar://problem/10497732>. llvm-svn: 145523	2011-11-30 21:54:15 +00:00
Nadav Rotem	0a1801015c	Add test arch to make it pass on non x86 targets llvm-svn: 145498	2011-11-30 17:34:28 +00:00
Nadav Rotem	66427bcce9	Add a triple to the test llvm-svn: 145489	2011-11-30 11:20:56 +00:00
Nadav Rotem	96923cc2bb	X86: PerformOrCombine introduced a vselect node with a wrong order of operands. This bug was introduced when a dedicated blend sdnode was replaced with the vselect node (in 139479). llvm-svn: 145488	2011-11-30 10:13:37 +00:00
Jakob Stoklund Olesen	f50d2eafdb	FileCheckize. llvm-svn: 145452	2011-11-29 23:09:16 +00:00

5602 Commits