llvm-project

Commit Graph

Author	SHA1	Message	Date
Bruno Cardoso Lopes	fa1ca3070b	Change all checks regarding the presence of any SSE level to always take into consideration the presence of AVX. This change, together with the SSEDomainFix enabled for AVX, makes AVX codegen to always (hopefully) emit the same code as SSE for 128-bit vector ops. I don't have a testcase for this, but AVX now beats SSE in performance for 128-bit ops in the majority of programas in the llvm testsuite llvm-svn: 139817	2011-09-15 18:27:36 +00:00
Andrew Trick	76a86d3d4c	[regcoalescing] bug fix for RegistersDefinedFromSameValue. An improper SlotIndex->VNInfo lookup was leading to unsafe copy removal. Fixes PR10920 401.bzip2 miscompile with no IV rewrite. llvm-svn: 139765	2011-09-15 01:09:33 +00:00
Nadav Rotem	d748dbacb0	Add integer promotion support for vselect llvm-svn: 139692	2011-09-14 14:42:15 +00:00
Bruno Cardoso Lopes	333a59eced	Vector shuffle mask <i32 4, i32 5, i32 2, i32 3> should yield "movsd", not "movss". llvm-svn: 139686	2011-09-14 02:36:14 +00:00
Devang Patel	edc216515e	Remove ancient debug info constructs from test cases, they are not relevant to test case's main objective. llvm-svn: 139675	2011-09-14 00:29:50 +00:00
Devang Patel	9cd29103aa	Remove unnecessary old test. llvm-svn: 139674	2011-09-14 00:28:54 +00:00
Akira Hatanaka	73b5d6ddc1	Delete test cases that generate code for allegrex/psp and cannot be repurposed. llvm-svn: 139652	2011-09-13 22:29:13 +00:00
Eli Friedman	f1518216fd	Error out on CodeGen of unaligned load/store. Fix test so it isn't accidentally testing that case. llvm-svn: 139641	2011-09-13 20:50:54 +00:00
Akira Hatanaka	fba4bd62b1	Add pattern used to match MipsLo, which is needed when the instruction selector tries to match a dead MipsLo node (explanation in the link below). http://article.gmane.org/gmane.comp.compilers.llvm.devel/42757/match=dagcombiner+dead llvm-svn: 139634	2011-09-13 20:13:58 +00:00
Akira Hatanaka	f58d6812a9	Disable tests which generate code for allegrex or psp. llvm-svn: 139632	2011-09-13 20:00:35 +00:00
Nadav Rotem	1af0c538e0	update checked pattern llvm-svn: 139631	2011-09-13 19:59:18 +00:00
Nadav Rotem	52202fbf2d	Add vselect target support for targets that do not support blend but do support xor/and/or (For example SSE2). llvm-svn: 139623	2011-09-13 19:17:42 +00:00
Andrew Trick	1191773a62	Generalize this test's CHECK statements to handle different indvars modes. llvm-svn: 139577	2011-09-13 02:46:27 +00:00
Bruno Cardoso Lopes	bf6e1e2717	Change testcase commandline to be more strict and silence buildbots llvm-svn: 139554	2011-09-12 22:59:26 +00:00
Bruno Cardoso Lopes	ff8d8a830e	Fix PR10845. SUBREG_TO_REG shouldn't be used when the input and destination types are equal! llvm-svn: 139553	2011-09-12 22:59:23 +00:00
Bruno Cardoso Lopes	973d2921e8	Revert the wrong part of r139528, and fix testcases. llvm-svn: 139541	2011-09-12 21:24:07 +00:00
Bruno Cardoso Lopes	be7a086f58	Not sure how CMPPS and CMPPD had already ever worked, I guess it didn't. However with this fix it does now. Basically the operand order for the x86 target specific node is not the same as the instruction, but since the intrinsic need that specific order at the instruction definition, just change the order during legalization. Also, there were some wrong invertions of condition codes, such as GE => LE, GT => LT, fix that too. Fix PR10907. llvm-svn: 139528	2011-09-12 19:30:40 +00:00
Eli Friedman	57ca95961b	Fix mistake in test runline. llvm-svn: 139505	2011-09-12 17:32:58 +00:00
Richard Osborne	97a2a5c4dc	Associate a MemOperand with LDWCP nodes introduced during ISel. This information is required if we want LDWCP to be hoisted out of loops. llvm-svn: 139495	2011-09-12 14:43:23 +00:00
Eli Friedman	501f541b45	Really un-XFAIL the testcase, like I said I would in r139458. llvm-svn: 139459	2011-09-10 02:02:27 +00:00
Richard Trieu	d9917bef6c	Fixed an assert from: assert("not implemented for target shuffle node"); to: assert(0 && "not implemented for target shuffle node"); This causes a test failure in CodeGen/X86/palignr.ll which has been marked as XFAIL for the time being. Test failure filed at PR10901. llvm-svn: 139454	2011-09-10 01:26:21 +00:00
Akira Hatanaka	5624707684	Fix test cases. Generate code for Mips32r1 unless a Mips32r2 feature is tested. llvm-svn: 139433	2011-09-09 23:14:58 +00:00
Eli Friedman	b7910b79f5	Make the SelectionDAG verify that all the operands of BUILD_VECTOR have the same type. Teach DAGCombiner::visitINSERT_VECTOR_ELT not to make invalid BUILD_VECTORs. Fixes PR10897. llvm-svn: 139407	2011-09-09 21:04:06 +00:00
Akira Hatanaka	4444daeec5	Drop support for Mips1 and Mips2. llvm-svn: 139405	2011-09-09 20:45:50 +00:00
Nadav Rotem	de838daefd	Implement vector-select support for avx256. Refactor the vblend implementation to have tablegen match the instruction by the node type llvm-svn: 139400	2011-09-09 20:29:17 +00:00
Akira Hatanaka	d22a1c6c95	Drop support for Allegrex. Allegrex implements a variant of Mips2. llvm-svn: 139383	2011-09-09 19:00:51 +00:00
Akira Hatanaka	df1df7edf1	Change default target architecture from Mips1 to Mips32r1 in preparation for removing support for Mips1 and Mips2. This change and the ones that follow have been discussed with and approved by Bruno. llvm-svn: 139344	2011-09-09 01:13:27 +00:00
Devang Patel	9d904e1a97	Directly point debug info to the stack slot of the arugment, instead of trying to keep track of vreg in which it the arugment is copied. The LiveDebugVariable can keep track of variable's ranges. llvm-svn: 139330	2011-09-08 22:59:09 +00:00
Bruno Cardoso Lopes	46b9cde019	Add a AVX version of a simple i64 -> f64 bitcast. This could be triggered using llc with -O0, which wouldn't let it be folded and expose the lack of this pattern. llvm-svn: 139320	2011-09-08 21:52:33 +00:00
Bruno Cardoso Lopes	51920a6191	Reapply testcase from r139309! llvm-svn: 139318	2011-09-08 21:05:43 +00:00
Bruno Cardoso Lopes	f483c081b6	Remove this crashing test, until I figure out what's going wrong here llvm-svn: 139309	2011-09-08 18:32:36 +00:00
Bruno Cardoso Lopes	fb113a0051	Add AVX versions of blend vector operations and fix some issues noticed in Nadav's r139285 and r139287 commits. 1) Rename vsel.ll to a more descriptive name 2) Change the order of BLEND operands to "Op1, Op2, Cond", this is necessary because PBLENDVB is already used in different places with this order, and it was being emitted in the wrong way for vselect 3) Add AVX patterns and tests for the same SSE41 instructions llvm-svn: 139305	2011-09-08 18:05:08 +00:00
Bruno Cardoso Lopes	ea8d803bb0	Fix PR10844: Add patterns to cover non foldable versions of X86vzmovl. Triggered using llc -O0. Also fix some SET0PS patterns to their AVX forms and test it on the testcase. llvm-svn: 139304	2011-09-08 18:05:02 +00:00
Nadav Rotem	e114ba4465	This test is already covered by llvm/trunk/test/CodeGen/X86/vsel.ll llvm-svn: 139288	2011-09-08 08:43:23 +00:00
Nadav Rotem	354b7585de	add a testcase for the previous patch llvm-svn: 139287	2011-09-08 08:31:31 +00:00
Nadav Rotem	2550ba2a27	Add X86-SSE4 codegen support for vector-select. llvm-svn: 139285	2011-09-08 08:11:19 +00:00
Eli Friedman	02f2f89a98	Fix atomic load and store on x86 to pass -verify-machineinstrs (and possibly fix some subtle bugs involving passes which check mayStore()). This isn't exactly ideal, but it is good enough for the moment. llvm-svn: 139245	2011-09-07 18:48:32 +00:00
Duncan Sands	1257042b70	Another forgotten trampoline testcase. llvm-svn: 139230	2011-09-07 10:05:14 +00:00
Eli Friedman	e978d2f644	Relax the MemOperands on atomics a bit. Fixes -verify-machineinstrs failures for atomic laod/store on ARM. (The fix for the related failures on x86 is going to be nastier because we actually need Acquire memoperands attached to the atomic load instrs, etc.) llvm-svn: 139221	2011-09-07 02:23:42 +00:00
Devang Patel	9de7a7db26	While sinking machine instructions, sink matching DBG_VALUEs also otherwise live debug variable pass will drop DBG_VALUEs on the floor. llvm-svn: 139208	2011-09-07 00:07:58 +00:00
Nick Lewycky	474c455060	Disable these tests harder. They're XFAIL'd, but that means they still run, and these tests all infinitely recurse, bringing my system down into swapping hell. llvm-svn: 139192	2011-09-06 22:08:18 +00:00
Evan Cheng	0b758ed6ba	Fix fall outs from my recent change on how carry bit is modeled during isel. Now the 'S' instructions, e.g. ADDS, treat S bit as optional operand as well. Also fix isel hook to correctly set the optional operand. rdar://10073745 llvm-svn: 139157	2011-09-06 18:52:20 +00:00
Jakob Stoklund Olesen	50ef7611aa	Atomic pseudos don't use (as in read) CPSR. They clobber it. llvm-svn: 139148	2011-09-06 17:40:35 +00:00
Duncan Sands	a098436b32	Split the init.trampoline intrinsic, which currently combines GCC's init.trampoline and adjust.trampoline intrinsics, into two intrinsics like in GCC. While having one combined intrinsic is tempting, it is not natural because typically the trampoline initialization needs to be done in one function, and the result of adjust trampoline is needed in a different (nested) function. To get around this llvm-gcc hacks the nested function lowering code to insert an additional parent variable holding the adjust.trampoline result that can be accessed from the child function. Dragonegg doesn't have the luxury of tweaking GCC code, so it stored the result of adjust.trampoline in the memory GCC set aside for the trampoline itself (this is always available in the child function), and set up some new memory (using an alloca) to hold the trampoline. Unfortunately this breaks Go which allocates trampoline memory on the heap and wants to use it even after the parent has exited (!). Rather than doing even more hacks to get Go working, it seemed best to just use two intrinsics like in GCC. Patch mostly by Sanjoy Das. llvm-svn: 139140	2011-09-06 13:37:06 +00:00
Dan Gohman	5423017526	Revert r129875, XFAILing this test for arm, since the fix was reverted. llvm-svn: 139058	2011-09-03 00:14:24 +00:00
Jakob Stoklund Olesen	1f72dd40c7	Pseudo CMOV instructions don't clobber EFLAGS. The explanation about a 0 argument being materialized as xor is no longer valid. Rematerialization will check if EFLAGS is live before clobbering it. The code produced by X86TargetLowering::EmitLoweredSelect does not clobber EFLAGS. This causes one less testb instruction to be generated in the cmov.ll test case. llvm-svn: 139057	2011-09-02 23:52:55 +00:00
Bill Wendling	4aa2573748	Try to eliminate the use of the 'unwind' instruction. llvm-svn: 139046	2011-09-02 22:41:11 +00:00
Eli Friedman	f3dd6da7a8	Don't fast-isel for atomic load/store; some cases require extra handling missing from fast-isel. llvm-svn: 139044	2011-09-02 22:33:24 +00:00
Bill Wendling	912668d998	Better fix for this testcase. Update it to the new EH scheme entirely. llvm-svn: 139039	2011-09-02 21:27:08 +00:00
Bill Wendling	17706bcffb	Update for new EH stuff. (I'm not sure if this is 100% correct.) llvm-svn: 139038	2011-09-02 21:24:17 +00:00
Duncan Sands	5c04c62765	Darwin wants ctors/dtors to be ordered the other way round to linux. llvm-svn: 139015	2011-09-02 18:07:19 +00:00
Kalle Raiskila	f5769c1070	Pass signed (not unsigned) 10 bit field to SPU 'ori' instruction. llvm-svn: 139004	2011-09-02 10:05:01 +00:00
Dan Gohman	3767be9aee	Revert r131152, r129796, r129761. This code is currently considered to be unreliable on platforms which require memcpy calls, and it is complicating broader legalize cleanups. It is hoped that these cleanups will make memcpy byval easier to implement in the future. llvm-svn: 138977	2011-09-01 23:07:08 +00:00
Benjamin Kramer	6397051ece	Don't drop alignment info on local common symbols. - On COFF the .lcomm directive has an alignment argument. - On ELF we fall back to .local + .comm Based on a patch by NAKAMURA Takumi. Fixes PR9337, PR9483 and PR10128. llvm-svn: 138976	2011-09-01 23:04:27 +00:00
Benjamin Kramer	c032617581	XFAIL this test on arm until the backend is fixed. llvm-svn: 138955	2011-09-01 18:40:03 +00:00
Benjamin Kramer	0f6ff8cb2b	This test depends on cmov being available. llvm-svn: 138954	2011-09-01 18:40:01 +00:00
Jakob Stoklund Olesen	5dc87d0f4d	Permit remat of partial register defs when it is safe. An instruction may define part of a register where the other bits are undefined. In that case, it is safe to rematerialize the instruction. For example: %vreg2:ssub_0<def> = VLDRS <cp#0>, 0, pred:14, pred:%noreg, %vreg2<imp-def> The extra <imp-def> operand indicates that the instruction does not read the other parts of the virtual register, so a remat is safe. This patch simply allows multiple def operands for the virtual register. It is MI->readsVirtualRegister() that determines if we depend on a previous value so remat is impossible. llvm-svn: 138953	2011-09-01 18:27:51 +00:00
Bruno Cardoso Lopes	f61d1c072e	Fix vbroadcast matching logic to early unmatch if the node doesn't have only one use. Fix PR10825. llvm-svn: 138951	2011-09-01 18:15:06 +00:00
Jakob Stoklund Olesen	6357fa2f06	Prevent remat of partial register redefinitions. An instruction that redefines only part of a larger register can never be rematerialized since the virtual register value depends on the old value in other parts of the register. This was fixed for the inline spiller in r138794. This patch fixes the problem for all register allocators, and includes a small test case. <rdar://problem/10032939> llvm-svn: 138944	2011-09-01 17:18:50 +00:00
Andrew Trick	832a6a1909	PreRA scheduler should avoid cloning compares. Added canClobberReachingPhysRegUse() to handle a particular pattern in which a two-address instruction could be forced to interfere with EFLAGS, causing a compare to be unnecessarilly cloned. Fixes rdar://problem/5875261 llvm-svn: 138924	2011-09-01 00:54:31 +00:00
Bill Wendling	55fb73a6e0	Remove old declare statements. llvm-svn: 138905	2011-08-31 21:41:20 +00:00
Bill Wendling	22055c713f	Update more tests to the new EH scheme. llvm-svn: 138904	2011-08-31 21:40:15 +00:00
Bill Wendling	d4e871404d	Update more tests to the new EH scheme. llvm-svn: 138903	2011-08-31 21:39:05 +00:00
Bill Wendling	c4c24f03e9	Revert r138894. This was failing on cmake-clang-i686-msvc10. llvm-svn: 138900	2011-08-31 21:20:25 +00:00
Bill Wendling	e6174a2c85	Update more tests to the new EH scheme. llvm-svn: 138894	2011-08-31 21:04:11 +00:00
Eli Friedman	7c3bdede25	Generic expansion for atomic load/store into cmpxchg/atomicrmw xchg; implements 64-bit atomic load/store for ARM. llvm-svn: 138872	2011-08-31 18:26:09 +00:00
Eli Friedman	1ccecbb9d3	64-bit atomic cmpxchg for ARM. llvm-svn: 138868	2011-08-31 17:52:22 +00:00
David Greene	cdef71f4f3	Compress Repeated Byte Output Emit a repeated sequence of bytes using .zero. This saves an enormous amount of asm file space for certain programs. llvm-svn: 138864	2011-08-31 17:30:56 +00:00
Benjamin Kramer	5247ca0ae5	This test requires sse, otherwise x87 ops will block tailcall optimization llvm-svn: 138859	2011-08-31 16:49:05 +00:00
Bruno Cardoso Lopes	9fc6b8be03	- Move all MOVSS and MOVSD patterns close to their definitions - Duplicate some store patterns to their AVX forms! - Catched a bug while restricting the patterns subtarget, fix it and update a testcase to check it properly llvm-svn: 138851	2011-08-31 03:04:20 +00:00
Evan Cheng	cb1e5bae4c	Fix (movhps load) lowering / pattern to match more cases. rdar://10050549 llvm-svn: 138848	2011-08-31 02:05:24 +00:00
Eli Friedman	2c7bb52f56	Some minor cleanups for r138845. llvm-svn: 138846	2011-08-31 00:41:05 +00:00
Eli Friedman	c3f9c4a852	Some 64-bit atomic operations on ARM. 64-bit cmpxchg coming next. llvm-svn: 138845	2011-08-31 00:31:29 +00:00
Benjamin Kramer	50cabb5de4	Fix test typo. llvm-svn: 138843	2011-08-31 00:02:59 +00:00
Rafael Espindola	1450f61e8f	Add a triple. llvm-svn: 138831	2011-08-30 21:19:37 +00:00
Rafael Espindola	9f2edc8d2c	Some test code to check if correct code is being generated. Patch by Sanjoy Das. llvm-svn: 138820	2011-08-30 19:51:29 +00:00
Roman Divacky	71038e7021	Set CR1EQ only when lowering vararg floating arguments (not any vararg arguments as before), unset CR1EQ otherwise. llvm-svn: 138802	2011-08-30 17:04:16 +00:00
Evan Cheng	e891654a58	Change ARM / Thumb2 addc / adde and subc / sube modeling to use physical register dependency (rather than glue them together). This is general goodness as it gives scheduler more freedom. However it is motivated by a nasty bug in isel. When a i64 sub is expanded to subc + sube. libcall #1 \ \ subc \ / \ \ / \ \ / libcall #2 sube If the libcalls are not serialized (i.e. both have chains which are dag entry), legalizer can serialize them in arbitrary orders. If it's unlucky, it can force libcall #2 before libcall #1 in the above case. subc \| libcall #2 \| libcall #1 \| sube However since subc and sube are "glued" together, this ends up being a cycle when the scheduler combine subc and sube as a single scheduling unit. The right solution is to fix LegalizeType too chains the libcalls together. However, LegalizeType is not processing nodes in order so that's harder than it should be. For now, the move to physical register dependency will do. rdar://10019576 llvm-svn: 138791	2011-08-30 01:34:54 +00:00
Eli Friedman	850b9a9a84	Explicitly zero out parts of a vector which are required to be zero by the algorithm in LowerUINT_TO_FP_i32. This only has a substantial effect on the generated code when the input is extracted from a vector register; other ways of loading an i32 do the appropriate zeroing implicitly. Fixes PR10802. llvm-svn: 138768	2011-08-29 21:15:46 +00:00
Owen Anderson	baa7edb06a	Add testcase for r138746. llvm-svn: 138747	2011-08-29 18:02:40 +00:00
Duncan Sands	4d63542b82	Fix PR5329: pay attention to constructor/destructor priority when outputting them. With this, the entire LLVM testsuite passes when built with dragonegg. llvm-svn: 138724	2011-08-28 13:17:22 +00:00
Bill Wendling	2f92d2cdae	Update to new EH scheme. llvm-svn: 138699	2011-08-27 04:53:41 +00:00
Bill Wendling	46c720994a	Cannot have an llvm.eh.exception call in a non-landing pad block. llvm-svn: 138698	2011-08-27 04:53:28 +00:00
Eli Friedman	5e5704277f	Add support for generating CMPXCHG16B on x86-64 for the cmpxchg IR instruction. llvm-svn: 138660	2011-08-26 21:21:21 +00:00
Bill Wendling	ac88ab7cce	Revert r138606 until LowerInvoke has been converted to the new EH scheme. llvm-svn: 138656	2011-08-26 21:11:23 +00:00
Eli Friedman	452aae6202	Atomic load/store on ARM/Thumb. I don't really like the patterns, but I'm having trouble coming up with a better way to handle them. I plan on making other targets use the same legalization ARM-without-memory-barriers is using... it's not especially efficient, but if anyone cares, it's not that hard to fix for a given target if there's some better lowering. llvm-svn: 138621	2011-08-26 02:59:24 +00:00
Bill Wendling	62fe9e9aa6	Update to the new EH scheme. llvm-svn: 138606	2011-08-25 23:48:37 +00:00
Bruno Cardoso Lopes	8347b86293	Add support for AVX 256-bit version of MOVDDUP! llvm-svn: 138588	2011-08-25 21:40:37 +00:00
Andrew Trick	6446bf780a	ARM fix for missing implicit operands on ldmia_ret. rdar://10005094: miscompile of 176.gcc llvm-svn: 138568	2011-08-25 17:50:53 +00:00
Bill Wendling	3fb137f7ef	LSR wants to split the landing pad's critical edge. Let it do it, but use the proper function to do it. llvm-svn: 138550	2011-08-25 05:55:40 +00:00
Bruno Cardoso Lopes	296256fb32	Add support for 256-bit versions of VSHUFPD and VSHUFPS. llvm-svn: 138546	2011-08-25 02:58:26 +00:00
Eli Friedman	9c73a57b20	Hook up 64-bit atomic load/store on x86-32. I plan to write more efficient implementations eventually. llvm-svn: 138505	2011-08-24 22:33:28 +00:00
Eli Friedman	5aabaaa367	Basic tests for atomic load and store on x86. llvm-svn: 138486	2011-08-24 21:16:59 +00:00
Richard Osborne	6e3c83eb1c	Add Uses=[SP] to call instructions. This fixes a miscompilation with a variable sized alloca. llvm-svn: 138433	2011-08-24 13:32:43 +00:00
Craig Topper	de92622aa5	Break 256-bit vector int add/sub/mul into two 128-bit operations to avoid costly scalarization. Fixes PR10711. llvm-svn: 138427	2011-08-24 06:14:18 +00:00
Bruno Cardoso Lopes	9e9f2ce32d	Fix a nasty bug where a v4i64 was being wrong emitted with 32-bit permutations. Also tidy up some patterns and make them close to their instruction definition! llvm-svn: 138392	2011-08-23 22:06:37 +00:00
Nick Lewycky	4c8ff77f1b	PerformSubCombine to work on integers larger than i128. Fixes a crasher. llvm-svn: 138354	2011-08-23 19:01:24 +00:00
Craig Topper	6612e35b0d	Add support for breaking 256-bit v16i16 and v32i8 VSETCC into two 128-bit ones, avoiding sclarization. Add vex form of pcmpeqq and pcmpgtq. Fixes more cases for PR10712. llvm-svn: 138321	2011-08-23 04:36:33 +00:00
Bruno Cardoso Lopes	2a3ffb5d97	Introduce a pass to insert vzeroupper instructions to avoid AVX to SSE transition penalty. The pass is enabled through the "x86-use-vzeroupper" llc command line option. This is only the first step (very naive and conservative one) to sketch out the idea, but proper DFA is coming next to allow smarter decisions. Comments and ideas now and in further commits will be very appreciated. llvm-svn: 138317	2011-08-23 01:14:17 +00:00
Bruno Cardoso Lopes	74f090d44c	Add support for breaking 256-bit int VETCC into two 128-bit ones, avoding scalarization of the compare. Reduces code from 59 to 6 instructions. Fix PR10712. llvm-svn: 138271	2011-08-22 20:31:04 +00:00
Chad Rosier	d6641af80c	With the fix in r138164: "Add <imp-def> operands to QQ and QQQQ stack loads." -verify-machineinstrs can be enabled for this test case. llvm-svn: 138171	2011-08-20 00:34:45 +00:00
Chad Rosier	be7625161e	VMOVQQQQs pseudo instructions are only created by ARMBaseInstrInfo::copyPhysReg. Therefore, rather then generate a pseudo instruction, which is later expanded, generate the necessary instructions in place. llvm-svn: 138163	2011-08-20 00:17:25 +00:00
Devang Patel	59e27c5f12	Do not use named md nodes to track variables that are completely optimized. This does not scale while doing LTO with debug info. New approach is to include list of variables in the subprogram info directly. llvm-svn: 138145	2011-08-19 23:28:12 +00:00
Jim Grosbach	8d77bb5f06	Use regex to remove false dependencies on register allocation. llvm-svn: 138137	2011-08-19 23:10:31 +00:00
Jim Grosbach	066e9ec1e4	Update tests. llvm-svn: 138116	2011-08-19 22:19:48 +00:00
Jakob Stoklund Olesen	90b6018c8f	Add test case for r138018. llvm-svn: 138033	2011-08-19 04:30:24 +00:00
Akira Hatanaka	fb4161ae88	Use subword loads instead of a 4-byte load when the size of a structure (or a piece of it) that is being passed by value is smaller than a word. llvm-svn: 138007	2011-08-18 23:39:37 +00:00
Ivan Krasin	d7cbd4c518	FastISel: avoid function calls between the materialization of the constant and its use. llvm-svn: 137993	2011-08-18 22:06:10 +00:00
Jim Grosbach	90103ccc05	Thumb assembly parsing and encoding for LDM instruction. Fix base register type and canonicallize to the "ldm" spelling rather than "ldmia." Add diagnostics for incorrect writeback token and out-of-range registers. llvm-svn: 137986	2011-08-18 21:50:53 +00:00
Richard Osborne	56f3b70225	Add intrinsics for SETEV, GETED, GETET. llvm-svn: 137938	2011-08-18 13:00:48 +00:00
Bruno Cardoso Lopes	3c7d6eb64c	Cleanup vector logical ops in AVX and add use int versions for simple v2i64 llvm-svn: 137919	2011-08-18 02:11:34 +00:00
Bruno Cardoso Lopes	1a87fcb9ba	Fix PR10688. Add support for spliting 256-bit vector shifts when the shift amount is variable llvm-svn: 137885	2011-08-17 22:12:20 +00:00
Jim Grosbach	e2a0404a69	Thumb assembly parsing and encoding for ADR. llvm-svn: 137864	2011-08-17 20:37:40 +00:00
Bruno Cardoso Lopes	be5e987379	Introduce matching patterns for vbroadcast AVX instruction. The idea is to match splats in the form (splat (scalar_to_vector (load ...))) whenever the load can be folded. All the logic and instruction emission is working but because of PR8156, there are no ways to match loads, cause they can never be folded for splats. Thus, the tests are XFAILed, but I've tested and exercised all the logic using a relaxed version for checking the foldable loads, as if the bug was already fixed. This should work out of the box once PR8156 gets fixed since MayFoldLoad will work as expected. llvm-svn: 137810	2011-08-17 02:29:19 +00:00
Bruno Cardoso Lopes	3400825b41	Update test to not use the scalar type to splat from a load llvm-svn: 137809	2011-08-17 02:29:15 +00:00
Bruno Cardoso Lopes	ed786a346e	Now that we have a canonical way to handle 256-bit splats: vinsertf128 $1 + vpermilps $0, remove the old code that used to first do the splat in a 128-bit vector and then insert it into a larger one. This is better because the handling code gets simpler and also makes a better room for the upcoming vbroadcast! llvm-svn: 137807	2011-08-17 02:29:10 +00:00
Akira Hatanaka	5360f88355	Add support for ext and ins. llvm-svn: 137804	2011-08-17 02:05:42 +00:00
Bruno Cardoso Lopes	2e99f1b3aa	Instead of always leaving the work to the generic legalizer when there is no support for native 256-bit shuffles, be more smart in some cases, for example, when you can extract specific 128-bit parts and use regular 128-bit shuffles for them. Example: For this shuffle: shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32> <i32 1, i32 0, i32 7, i32 6> This was expanded to: vextractf128 $1, %ymm1, %xmm2 vpextrq $0, %xmm2, %rax vmovd %rax, %xmm1 vpextrq $1, %xmm2, %rax vmovd %rax, %xmm2 vpunpcklqdq %xmm1, %xmm2, %xmm1 vpextrq $0, %xmm0, %rax vmovd %rax, %xmm2 vpextrq $1, %xmm0, %rax vmovd %rax, %xmm0 vpunpcklqdq %xmm2, %xmm0, %xmm0 vinsertf128 $1, %xmm1, %ymm0, %ymm0 ret Now we get: vshufpd $1, %xmm0, %xmm0, %xmm0 vextractf128 $1, %ymm1, %xmm1 vshufpd $1, %xmm1, %xmm1, %xmm1 vinsertf128 $1, %xmm1, %ymm0, %ymm0 llvm-svn: 137733	2011-08-16 18:21:54 +00:00
Akira Hatanaka	7d7bec5acf	Add test case for r137711. llvm-svn: 137725	2011-08-16 17:32:01 +00:00
Akira Hatanaka	2263c10946	Fix handling of double precision loads and stores when Mips1 is targeted. Mips1 does not support double precision loads or stores, therefore two single precision loads or stores must be used in place of these instructions. This patch treats double precision loads and stores as if they are legal instructions until MCInstLowering, instead of generating the single precision instructions during instruction selection or Prolog/Epilog code insertion. Without the changes made in this patch, llc produces code that has the same problem described in r137484 or bails out when MipsInstrInfo::storeRegToStackSlot or loadRegFromStackSlot is called before register allocation. llvm-svn: 137711	2011-08-16 03:51:51 +00:00
Bruno Cardoso Lopes	cbe7feeab9	Fix PR10656. It's only profitable to use 128-bit inserts and extracts when AVX mode is one. Otherwise is just more work for the type legalizer. llvm-svn: 137661	2011-08-15 21:45:54 +00:00
Eric Christopher	8c5f3f7624	Fix this test to avoid leaving a temporary file behind. llvm-svn: 137651	2011-08-15 20:55:03 +00:00
Bob Wilson	d1de7764be	Expand VMOVQQQQ pseudo instructions. Apparently we never added code to expand these pseudo instructions, and in over a year, no one has noticed. Our register allocator must be awesome! llvm-svn: 137551	2011-08-13 05:14:55 +00:00
Bruno Cardoso Lopes	f15dfe5818	The VPERM2F128 is a AVX instruction which permutes between two 256-bit vectors. It operates on 128-bit elements instead of regular scalar types. Recognize shuffles that are suitable for VPERM2F128 and teach the x86 legalizer how to handle them. llvm-svn: 137519	2011-08-12 21:48:26 +00:00
Akira Hatanaka	2fcc1cfdce	Define unaligned load and store. llvm-svn: 137515	2011-08-12 21:30:06 +00:00
Akira Hatanaka	2f6b944f56	Test case for 137484 llvm-svn: 137486	2011-08-12 18:12:06 +00:00
Akira Hatanaka	79d60d0e94	Enclose directive .cprestore with .set macro and nomacro to silence assembler warning. llvm-svn: 137378	2011-08-11 22:42:31 +00:00
Bruno Cardoso Lopes	8fbf023c9b	Add a dag combine to xform 256-bit shuffles into simple vector inserts and extracts. This simple combine makes us generate only 1 instruction instead of 11 in the v8 case. llvm-svn: 137362	2011-08-11 21:50:44 +00:00
Bruno Cardoso Lopes	9eb3762e08	Fix the test added by Nadav in r137308. Make it more strict: 1) check for the "v" version of movaps 2) add a couple of CHECK-NOT to guarantee the behavior 3) move to a more appropriate test file llvm-svn: 137361	2011-08-11 21:50:35 +00:00
Bruno Cardoso Lopes	043c820800	Fix PR10492 by teaching MOVHLPS and MOVLPS mask matching to be more strict. llvm-svn: 137324	2011-08-11 18:59:13 +00:00
Jim Grosbach	27ad83d8a9	ARM push of a single register encodes as pre-indexed STR. Per the ARM ARM, a 'push' of a single register encodes as an STR, not an STM. llvm-svn: 137318	2011-08-11 18:07:11 +00:00
Jim Grosbach	8ba76c6d5c	ARM pop of a single register encodes as post-indexed LDR. Per the ARM ARM, a 'pop' of a single register encodes as an LDR, not an LDM. llvm-svn: 137316	2011-08-11 17:35:48 +00:00
Nadav Rotem	1542d5a00a	[AVX] If the data which is going to be saved is already in two XMM registers (for example, after integer operation), do not pack the registers into a YMM before saving. Its better to save as two XMM registers. Before: vinsertf128 $1, %xmm3, %ymm0, %ymm3 vinsertf128 $0, %xmm1, %ymm3, %ymm1 vmovaps %ymm1, 416(%rsp) After: vmovaps %xmm3, 416+16(%rsp) vmovaps %xmm1, 416(%rsp) llvm-svn: 137308	2011-08-11 16:41:21 +00:00
Chris Lattner	5a2c70cc8f	add missing colon, thanks peter. llvm-svn: 137306	2011-08-11 16:15:10 +00:00
Chris Lattner	96710b4308	fix PR10605 / rdar://9930964 by adding a pretty scary missed check. It's somewhat surprising anything works without this. Before we would compile the testcase into: test: # @test movl $4, 8(%rdi) movl 8(%rdi), %eax orl %esi, %eax cmpl $32, %edx movl %eax, -4(%rsp) # 4-byte Spill je .LBB0_2 now we produce: test: # @test movl 8(%rdi), %eax movl $4, 8(%rdi) orl %esi, %eax cmpl $32, %edx movl %eax, -4(%rsp) # 4-byte Spill je .LBB0_2 llvm-svn: 137303	2011-08-11 06:26:54 +00:00
Bruno Cardoso Lopes	a2d8bb97b9	Splats for v8i32/v8f32 can be handled by VPERMILPSY. This was causing infinite recursive calls in legalize. Fix PR10562 llvm-svn: 137296	2011-08-11 02:49:44 +00:00
Bruno Cardoso Lopes	572c9aaf53	Use the splat index to generate the desired shuffle. Otherwise we could only get undefs and the vector shuffle becomes an undef, generating wrong code. llvm-svn: 137295	2011-08-11 02:49:41 +00:00
Eli Friedman	3ae39f8ad1	Fix X86TargetLowering::LowerExternalSymbol so that it actually works in non-trivial cases. This hasn't been an issue before because the function isn't normally called (but apparently is used to generate a tail-call to sin() on ELF x86-32 with PIC and SSE2). Fixes PR9693. llvm-svn: 137292	2011-08-11 01:48:05 +00:00
NAKAMURA Takumi	504769fc2f	test/CodeGen/X86/opt-shuff-tstore.ll: Add explicit -mtriple=x86_64-linux. llvm-svn: 137262	2011-08-10 22:52:48 +00:00
Devang Patel	37a62058fe	While extending definition range of a debug variable, consult lexical scopes also. There is no point extending debug variable out side its lexical block. This provides 6x compile time speedup in some cases. llvm-svn: 137250	2011-08-10 21:25:34 +00:00
Nadav Rotem	d2b071f562	Fix the test. Add cpu target. llvm-svn: 137241	2011-08-10 19:49:19 +00:00
Nadav Rotem	410a11fe82	When performing a truncating store, it is sometimes possible to rearrange the data in-register prior to saving to memory. When we reorder the data in memory we prevent the need to save multiple scalars to memory, making a single regular store. llvm-svn: 137238	2011-08-10 19:30:14 +00:00
Bruno Cardoso Lopes	3ff111c12d	The following X86 pattern is incorrect: def : Pat<(X86Movss VR128:$src1, (bc_v4i32 (v2i64 (load addr:$src2)))), (MOVLPSrm VR128:$src1, addr:$src2)>; This matches a MOVSS dag with a MOVLPS instruction. However, MOVSS will replace only the low 32 bits of the register, while the MOVLPS instruction will replace the low 64 bits. A testcase is added and illustrates the bug and also modified the one that was already present. Patch by Tanya Lattner. llvm-svn: 137227	2011-08-10 17:45:17 +00:00
Rafael Espindola	36a3abc671	Add support for the R and Q constraints. llvm-svn: 137217	2011-08-10 16:26:42 +00:00
Bruno Cardoso Lopes	278ffd7d8e	Fix a bug in vpermilps mask checking. Fix PR10560 llvm-svn: 137194	2011-08-10 01:54:17 +00:00
Bruno Cardoso Lopes	72323966c8	Add 256-bit support for v8i32, v4i64 and v4f64 ISD::SELECT. Fix PR10556 llvm-svn: 137179	2011-08-09 23:27:13 +00:00
Bruno Cardoso Lopes	fc481959d2	Add v16i16 and v32i8 store patterns llvm-svn: 137166	2011-08-09 22:39:53 +00:00
Bruno Cardoso Lopes	6963062a99	Use fp unpack instructions to unpack int types. Until we have AVX2, this is the best we can do for these patterns. This fix PR10554. llvm-svn: 137161	2011-08-09 22:18:37 +00:00
Eli Friedman	4ef2426b87	Fix a couple ridiculous copy-paste errors. rdar://9914773 . llvm-svn: 137160	2011-08-09 22:17:39 +00:00
Bill Wendling	d7f41b7f66	Revert r137134. It breaks some code as Eli pointed out. llvm-svn: 137135	2011-08-09 18:56:35 +00:00
Bill Wendling	84ec8f65d1	Print out the variable declaration only if it is a declaration. Otherwise, a 'static' variable will be emitted twice. PR10081 llvm-svn: 137134	2011-08-09 18:31:50 +00:00
Jakob Stoklund Olesen	53910d6aae	Inflate register classes after coalescing. Coalescing can remove copy-like instructions with sub-register operands that constrained the register class. Examples are: x86: GR32_ABCD:sub_8bit_hi -> GR32 arm: DPR_VFP2:ssub0 -> DPR Recompute the register class of any virtual registers that are used by less instructions after coalescing. This affects code generation for the Cortex-A8 where we use NEON instructions for f32 operations, c.f. fp_convert.ll: vadd.f32 d16, d1, d0 vcvt.s32.f32 d0, d16 The register allocator is now free to use d16 for the temporary, and that comes first in the allocation order because it doesn't interfere with any s-registers. llvm-svn: 137133	2011-08-09 18:19:41 +00:00
Bruno Cardoso Lopes	bed48dc8ff	Reapply a more appropriate solution than in r137114. AVX supports v4f64 = sitofp v4i32. This fix PR10559. Also add support for v4i32 = fptosi v4f64. llvm-svn: 137128	2011-08-09 17:39:13 +00:00
Bruno Cardoso Lopes	24dd1d4a27	Revert r137114 llvm-svn: 137127	2011-08-09 17:39:01 +00:00
Justin Holewinski	db05c2b963	PTX: Add initial support for device function calls - Calls are supported on SM 2.0+ for function with no return values llvm-svn: 137125	2011-08-09 17:36:31 +00:00
Bruno Cardoso Lopes	ad3453cf2d	Handle sitofp between v4f64 <- v4i32. Fix PR10559 llvm-svn: 137114	2011-08-09 05:48:01 +00:00
Bruno Cardoso Lopes	1155b1eafa	Add support for avx vector fextend llvm-svn: 137105	2011-08-09 03:04:29 +00:00
Bruno Cardoso Lopes	337a7fdb13	Rename and tidy up tests llvm-svn: 137103	2011-08-09 03:04:23 +00:00
Bruno Cardoso Lopes	2fc107365b	Add two patterns to match special vmovss and vmovsd cases. Also fix the patterns already there to be more strict regarding the predicate. This fixes PR10558 llvm-svn: 137100	2011-08-09 01:43:09 +00:00
Bruno Cardoso Lopes	af6a85484c	Make LowerVSETCC aware of AVX types and add patterns to match them. llvm-svn: 137090	2011-08-09 00:46:57 +00:00
Bruno Cardoso Lopes	c96953c12a	Add support for several vector shifts operations while in AVX mode. Fix PR10581 llvm-svn: 137067	2011-08-08 21:31:08 +00:00
Eli Friedman	a27da98921	Fix up the patterns for SXTB, SXTH, UXTB, and UXTH so that they are correctly active without HasT2ExtractPack. PR10611. llvm-svn: 137061	2011-08-08 19:49:37 +00:00
Jakob Stoklund Olesen	4f0ace5674	Don't clobber pending ST regs when FP regs are killed. X86FloatingPoint keeps track of pending ST registers for an upcoming inline asm instruction with fixed stack register constraints. It does this by remembering which FP register holds the value that should appear at a fixed stack position for the inline asm. When that FP register is killed before the inline asm, make sure to duplicate it to a scratch register, so the ST register still has a live FP reference. This could happen when the same FP register was copied to two ST registers, or when a spill instruction is inserted between the ST copy and the inline asm. This fixes PR10602. llvm-svn: 137050	2011-08-08 17:15:43 +00:00
Rafael Espindola	9bc32a96be	print st_shndx with the correct number of bits. llvm-svn: 136880	2011-08-04 15:50:13 +00:00
Rafael Espindola	9528995e3f	print st_other with the correct number of bits. llvm-svn: 136877	2011-08-04 15:38:19 +00:00
Rafael Espindola	96df560ce1	print st_type with the correct number of bits. llvm-svn: 136875	2011-08-04 15:24:00 +00:00
Rafael Espindola	79ef75dc49	Print st_bind with the correct number of bits. llvm-svn: 136874	2011-08-04 15:10:35 +00:00
Rafael Espindola	1848231ad1	Print r_sym with the correct number of bits. llvm-svn: 136873	2011-08-04 14:48:27 +00:00
Rafael Espindola	260af5cef6	Print r_type with the correct number of bits. llvm-svn: 136872	2011-08-04 14:39:30 +00:00
Rafael Espindola	65c559c5fb	Change anther counter to decimal. llvm-svn: 136870	2011-08-04 14:01:03 +00:00
Rafael Espindola	cad9e7f094	Don't print a counter in hex. llvm-svn: 136869	2011-08-04 13:39:15 +00:00
Bill Wendling	e234f6ae0c	Only access both operands of an INSERT_SUBVECTOR if it is an INSERT_SUBVECTOR. Fixes PR10527. llvm-svn: 136853	2011-08-04 00:32:58 +00:00
Benjamin Kramer	3c7e9ee480	Remove underscore that's breaking linux buildbots. llvm-svn: 136833	2011-08-03 23:13:01 +00:00
Jakub Staszak	15e5b742ad	Use MachineBranchProbabilityInfo in If-Conversion instead of its own heuristics. llvm-svn: 136826	2011-08-03 22:34:43 +00:00
Jakob Stoklund Olesen	da618420ee	Handle IMPLICIT_DEF instructions in X86FloatingPoint. This fixes PR10575. llvm-svn: 136787	2011-08-03 16:33:19 +00:00
Devang Patel	dc9cbaaf23	Use byte offset, instead of element number, to access merged global. llvm-svn: 136759	2011-08-03 01:25:46 +00:00
Rafael Espindola	c48e10cd54	Assume .cfi_startproc is the first thing in a function. If the function is externally visable, create a local symbol to use in the CFE. If not, use the function label itself. Fixes PR10420. llvm-svn: 136716	2011-08-02 20:24:22 +00:00
Bruno Cardoso Lopes	5ada908140	Make this kind of lowering to be supported by 256-bit instructions: shuffle (scalar_to_vector (load (ptr + 4))), undef, <0, 0, 0, 0> To: shuffle (vload ptr)), undef, <1, 1, 1, 1> Fix PR10494 llvm-svn: 136691	2011-08-02 16:06:18 +00:00
Bruno Cardoso Lopes	a8e3673816	Add v4f64 -> v2f32 fp_round support. Also add a testcase to exercise the legalizer. This commit together with the two previous ones fixes PR10495. llvm-svn: 136654	2011-08-01 21:54:09 +00:00
Bruno Cardoso Lopes	7513939ddd	Since vectors with all ones can't be created with a 256-bit instruction, avoid returning early for v8i32 types, which would only be valid for vector with all zeros. Also split the handling of zeros and ones into separate checking logic since they are handled differently. This fixes PR10547 llvm-svn: 136642	2011-08-01 19:51:53 +00:00
Richard Osborne	0cc000ef29	Fix crash with varargs function with no named parameters. llvm-svn: 136623	2011-08-01 16:45:59 +00:00
Jakob Stoklund Olesen	5670f850c6	Revert "Don't check liveness of unallocatable registers." The ARM target depends on CPSR liveness being tracked after register allocation. llvm-svn: 136548	2011-07-30 00:57:25 +00:00
Jakob Stoklund Olesen	95cc5440e9	Don't check liveness of unallocatable registers. This includes registers like EFLAGS and ST0-ST7. We don't check for liveness issues in the verifier and scavenger because registers will never be allocated from these classes. While in SSA form, we do care about the liveness of unallocatable unreserved registers. Liveness of EFLAGS and ST0 neds to be correct for MachineDCE and MachineSinking. llvm-svn: 136541	2011-07-29 23:36:21 +00:00
Eric Christopher	aa5030066f	Add support for the 'Q' constraint. Fixes rdar://9866494 llvm-svn: 136523	2011-07-29 21:18:58 +00:00
Bruno Cardoso Lopes	65ce5ea3ba	Fix two tests that I crashed in the previous commits. The mask elts on the second half must be reindexed. llvm-svn: 136454	2011-07-29 02:05:28 +00:00
Bruno Cardoso Lopes	81eb193f2e	Match VPERMIL masks more strictly and update the target specific mask generation to always catch the weird cases. llvm-svn: 136453	2011-07-29 01:31:15 +00:00
Bruno Cardoso Lopes	d23709b18c	Add v8i32 and v4i64 vpermil patterns llvm-svn: 136451	2011-07-29 01:31:07 +00:00
Jakob Stoklund Olesen	b28ee4115d	Transfer implicit operands in NEONMoveFixPass. Later passes /are/ using this information when running the register scavenger. This fixes the second problem in PR10520. llvm-svn: 136440	2011-07-29 00:27:35 +00:00
Jakob Stoklund Olesen	9c3badceba	Add -verify-arm-pseudo-expand. This hidden llc option runs the machine code verifier after expanding ARM pseudo-instructions, but before if-conversion. The machine code verifier is much better at pointing out liveness errors that can trip up the register scavenger. llvm-svn: 136439	2011-07-29 00:27:32 +00:00
Jakob Stoklund Olesen	b16081ce8c	Handle REG_SEQUENCE with implicitly defined operands. Code like that would only be produced by bugpoint, but we should still handle it correctly. When a register is defined by a REG_SEQUENCE of undefs, the register itself is undef. Previously, we would create a register with uses but no defs. Fixes part of PR10520. llvm-svn: 136401	2011-07-28 21:38:51 +00:00
Bruno Cardoso Lopes	76bc28bac6	Add patterns to generate copies for extract_subvector instead of using vextractf128. This will reduce the number of issued instruction for several avx codes. llvm-svn: 136323	2011-07-28 01:26:50 +00:00
Bruno Cardoso Lopes	eca99c4b5a	Add a few patterns to match allzeros without having to use the fp unit. Take advantage that the 128-bit vpxor zeros the higher part and use it. This also fixes PR10491 llvm-svn: 136321	2011-07-28 01:26:43 +00:00
Bruno Cardoso Lopes	9e2a301216	Add SINT_TO_FP and FP_TO_SINT support for v8i32 types. Also move a convert pattern close to the instruction definition. llvm-svn: 136320	2011-07-28 01:26:39 +00:00
Bruno Cardoso Lopes	27a30a7792	The vpermilps and vpermilpd have different behaviour regarding the usage of the shuffle bitmask. Both work in 128-bit lanes without crossing, but in the former the mask of the high part is the same used by the low part while in the later both lanes have independent masks. Handle this properly and and add support for vpermilpd. llvm-svn: 136200	2011-07-27 00:56:34 +00:00
Devang Patel	f098ce2757	It is quiet possible that inlined function body is split into multiple chunks of consequtive instructions. But, there is not any way to describe this in .debug_inline accelerator table used by gdb. However, describe non contiguous ranges of inlined function body appropriately using AT_range of DW_TAG_inlined_subroutine debug info entry. llvm-svn: 136196	2011-07-27 00:34:13 +00:00
Jakob Stoklund Olesen	c3bcb02154	Eliminate copies of undefined values during coalescing. These copies would coalesce easily, but the resulting value would be defined by a deleted instruction. Now we also remove the undefined value number from the destination register. This fixes PR10503. llvm-svn: 136174	2011-07-26 23:00:24 +00:00
Benjamin Kramer	a79c1e0589	Update test. llvm-svn: 136170	2011-07-26 22:45:39 +00:00
Benjamin Kramer	124ac2b997	Add a neat little two's complement hack for x86. On x86 we can't encode an immediate LHS of a sub directly. If the RHS comes from a XOR with a constant we can fold the negation into the xor and add one to the immediate of the sub. Then we can turn the sub into an add, which can be commuted and encoded efficiently. This code is generated for __builtin_clz and friends. llvm-svn: 136167	2011-07-26 22:42:13 +00:00
Bruno Cardoso Lopes	f8fe47bd2b	Recognize unpckh* masks and match 256-bit versions. The new versions are different from the previous 128-bit because they work in lanes. Update a few comments and add testcases llvm-svn: 136157	2011-07-26 22:03:40 +00:00
Eli Friedman	93dc04d5ca	Prevent x86-specific DAGCombine from creating nodes with illegal type (which could not be selected). Fixes a minor isel issue that was breaking the testcase from r136130. llvm-svn: 136148	2011-07-26 21:02:58 +00:00
Jim Grosbach	73a8393a47	FileCheck'ize test. llvm-svn: 136135	2011-07-26 20:49:44 +00:00
Eli Friedman	747430417b	XFAIL this test while I investigate it; it's failing for an unexpected reason. llvm-svn: 136131	2011-07-26 20:41:03 +00:00
Eli Friedman	06b8b571b2	Add obvious missing case to switch. PR10497. llvm-svn: 136130	2011-07-26 20:38:49 +00:00
Bruno Cardoso Lopes	d600a0f878	Add 256-bit isel for movsldup/movshdup llvm-svn: 136051	2011-07-26 02:39:32 +00:00
Bruno Cardoso Lopes	9212bf275d	Codegen allonesvector better while using AVX: vpcmpeqd + vinsertf128 This also fixes PR10452 llvm-svn: 136004	2011-07-25 23:05:32 +00:00
Bruno Cardoso Lopes	123dff0f58	- Handle special scalar_to_vector case: splats. Using a native 128-bit shuffle before inserting on a 256-bit vector. - Add AVX versions of movd/movq instructions - Introduce a few COPY patterns to match insert_subvector instructions. This turns a trivial insert_subvector instruction into a register copy, coalescing the xmm into a ymm and avoid emiting on more instruction. llvm-svn: 136002	2011-07-25 23:05:25 +00:00
Eli Friedman	442d1b199f	Attempt to fix test failure reported on llvm-commits. llvm-svn: 135995	2011-07-25 22:28:51 +00:00
Eli Friedman	cbd3ba91b7	Make sure this DAGCombine actually returns an UNDEF of the correct type; PR10476. llvm-svn: 135993	2011-07-25 22:25:42 +00:00
Eli Friedman	ea8c66fea5	Get rid of an incorrect optimization for shuffles with PALIGNR and simplify isPALIGNRMask. Addresses PR10466, although the crash from that PR only triggers in cases where DAGCombine misses optimizing a shuffle. llvm-svn: 135980	2011-07-25 21:36:45 +00:00
Jakob Stoklund Olesen	56a56eb80e	Correctly handle <undef> tied uses when rewriting after a split. This fixes PR10463. A two-address instruction with an <undef> use operand was incorrectly rewritten so the def and use no longer used the same register, violating the tie constraint. Fix this by always rewriting <undef> operands with the register a def operand would use. llvm-svn: 135885	2011-07-24 20:23:50 +00:00
Bruno Cardoso Lopes	7a2075511b	Fix test check! llvm-svn: 135802	2011-07-22 20:55:28 +00:00
Bruno Cardoso Lopes	a89039998d	Fix PR10422 by adding the necessary AVX UCOMISD memory versions to load folding logic llvm-svn: 135801	2011-07-22 20:53:20 +00:00
Rafael Espindola	77242dd537	Turn shuffles into unpacks for VT == MVT::v2i64 and MVT::v2f64 too. Patch by Jeff Muizelaar. llvm-svn: 135789	2011-07-22 18:56:05 +00:00
Bruno Cardoso Lopes	612e56174b	-Inspected a AVX code block added by someone in early Feb. This was never used and was actually very wrong, fix it and make it simpler. Also remove the ConcatVectors function, which is unused now. - Fix a introduction of useless nodes in r126664 and r126264. The VUNPCKL* should never be introduced cause we don't want duplicate nodes for 128 AVX and non-AVX modes, the actual instruction difference only exists during isel, but not for target specific DAG nodes. We only introduce V* target nodes when there is no 128-bit version already there. - Fix a fragile test and make it more useful. llvm-svn: 135729	2011-07-22 00:15:07 +00:00
Bruno Cardoso Lopes	14a95bda04	Although we already support this, add testcases for consistency llvm-svn: 135728	2011-07-22 00:15:03 +00:00
Bruno Cardoso Lopes	91eff5140f	Add a DAGCombine for transforming 128->256 casts into a simple vxorps + vinsertf128 pair of instructions llvm-svn: 135727	2011-07-22 00:15:00 +00:00
Bruno Cardoso Lopes	178fb40612	- Register v16i16 as valid VR256 register class - Add more bitcasts for v16i16 - Since 135661 and 135662 already added the splat logic, just add one more splat test for v16i16 llvm-svn: 135663	2011-07-21 02:24:08 +00:00
Bruno Cardoso Lopes	b878caa5e2	Add support for 256-bit versions of VPERMIL instruction. This is a new instruction introduced in AVX, which can operate on 128 and 256-bit vectors. It considers a 256-bit vector as two independent 128-bit lanes. It can permute any 32 or 64 elements inside a lane, and restricts the second lane to have the same permutation of the first one. With the improved splat support introduced early today, adding codegen for this instruction enable more efficient 256-bit code: Instead of: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vextractf128 $1, %ymm0, %xmm1 shufps $1, %xmm1, %xmm1 movss %xmm1, 28(%rsp) movss %xmm1, 24(%rsp) movss %xmm1, 20(%rsp) movss %xmm1, 16(%rsp) vextractf128 $0, %ymm0, %xmm0 shufps $1, %xmm0, %xmm0 movss %xmm0, 12(%rsp) movss %xmm0, 8(%rsp) movss %xmm0, 4(%rsp) movss %xmm0, (%rsp) vmovaps (%rsp), %ymm0 We get: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vpermilps $85, %ymm0, %ymm0 llvm-svn: 135662	2011-07-21 01:55:47 +00:00
Devang Patel	bcd50a10d5	While emitting constant value, look through derived type and use underlying basic type to determine size and signness of the constant value. llvm-svn: 135627	2011-07-20 21:57:04 +00:00
Eli Friedman	6ed783228d	PR10421: Fix a straightforward bug in the widening logic for CONCAT_VECTORS. llvm-svn: 135595	2011-07-20 18:14:33 +00:00
Evan Cheng	76792992d6	Add MCObjectFileInfo and sink the MCSections initialization code from TargetLoweringObjectFileImpl down to MCObjectFileInfo. TargetAsmInfo is done to one last method. It's almost gone! llvm-svn: 135569	2011-07-20 05:58:47 +00:00
Eric Christopher	60648578ba	New pointer rotate test. llvm-svn: 135562	2011-07-20 03:09:11 +00:00
Akira Hatanaka	a4c09bce9b	Lower memory barriers to sync instructions. llvm-svn: 135537	2011-07-19 23:30:50 +00:00
Evan Cheng	ccf243d56b	Fix an obvious typo that's preventing x86 (32-bit) from using .literal16. llvm-svn: 135535	2011-07-19 23:14:32 +00:00
Akira Hatanaka	f3b29992d5	Use the correct opcodes: SLLV/SRLV or AND must be used instead of SLL/SRL or ANDi, when the instruction does not have any immediate operands. llvm-svn: 135520	2011-07-19 20:34:00 +00:00
Akira Hatanaka	e450358a21	Remove redundant instructions. - In EmitAtomicBinaryPartword, mask incr in loopMBB only if atomic.swap is the instruction being expanded, instead of masking it in thisMBB. - Remove redundant Or in EmitAtomicCmpSwap. llvm-svn: 135495	2011-07-19 18:14:26 +00:00
Richard Osborne	f1b800998a	Add intrinsics for the zext / sext instructions. llvm-svn: 135476	2011-07-19 13:28:50 +00:00
Richard Osborne	252c43ee88	Add intrinsics for the testct, testwct instructions. llvm-svn: 135475	2011-07-19 13:00:40 +00:00
Richard Osborne	707f0beae1	Add intrinsics for the peek and endin instructions. llvm-svn: 135474	2011-07-19 12:50:25 +00:00
Evan Cheng	2129f59637	Introduce MCCodeGenInfo, which keeps information that can affect codegen (including compilation, assembly). Move relocation model Reloc::Model from TargetMachine to MCCodeGenInfo so it's accessible even without TargetMachine. llvm-svn: 135468	2011-07-19 06:37:02 +00:00
Devang Patel	9ab3cac694	Revert r135423. llvm-svn: 135454	2011-07-19 00:28:24 +00:00
Eli Friedman	4d5532a085	FileCheck-ize a couple tests. llvm-svn: 135427	2011-07-18 21:23:42 +00:00
Devang Patel	4dc76f2438	During bottom up fast-isel, instructions emitted to materalize registers are at top of basic block and do not have debug location. This may misguide debugger while entering the basic block and sometimes debugger provides semi useful view of current location to developer by picking up previous known location as current location. Assign a sensible location to the first instruction in a basic block, if it does not have one location derived from source file, so that debugger can provide meaningful user experience to developers in edge cases. [take 2] llvm-svn: 135423	2011-07-18 20:55:23 +00:00
Akira Hatanaka	338879a7f4	Do not treat atomic.load.sub differently than other atomic binary intrinsics. llvm-svn: 135418	2011-07-18 19:58:59 +00:00
Akira Hatanaka	27292638bd	Set mayLoad or mayStore flags for SC and LL in order to prevent LICM from moving them out of the loop. Previously, stores and loads to a stack frame object were inserted to accomplish this. Remove the code that was needed to do this. Patch by Sasa Stankovic. llvm-svn: 135415	2011-07-18 18:52:12 +00:00
Jakob Stoklund Olesen	c45d38e14a	Fix a crash when building 177.mesa for armv6. When splitting a live range immediately before an LDR_POST instruction that redefines the address register, make sure to use the correct value number in leaveIntvBefore. We need the value number entering the instruction. <rdar://problem/9793765> llvm-svn: 135413	2011-07-18 18:47:13 +00:00
Bruno Cardoso Lopes	4208cace5f	Add AVX 128-bit sqrt versions llvm-svn: 135404	2011-07-18 17:51:40 +00:00
Nick Lewycky	d8921f939c	Delete empty unused file. llvm-svn: 135379	2011-07-18 05:54:06 +00:00
Bruno Cardoso Lopes	4480040191	Add AVX 128-bit patterns for sint_to_fp llvm-svn: 135332	2011-07-16 00:50:20 +00:00
Bruno Cardoso Lopes	8df9cfc279	Fix a couple of things: 1) Make non-legal 256-bit loads to be promoted to v4i64. This lets us canonize the loads and handle things the same way we use to handle for 128-bit registers. Despite of what one of the removed comments explained, the load promotion would not mess with VPERM, it's only a matter of doing the appropriate bitcasts when this instructions comes to be introduced. Also make LOAD v8i32 legal. 2) Doing 1) exposed two bugs: - v4i64 was being promoted to itself for several opcodes (introduced in r124447 by David Greene) causing endless recursion and the stack to explode. - there was no support for allOnes BUILD_VECTORs and ANDNP would fail to match because it was generating early target constant pools during lowering. 3) The testcases are already checked-in, doing 1) exposed the bugs in the current testcases. 4) Tidy up code to be more clear and explicit about AVX. llvm-svn: 135313	2011-07-15 22:24:33 +00:00
Owen Anderson	454e1c7abb	Remove VMOVDneon and VMOVQ, which are just aliases for VORR. This continues to simplify the path towards an auto-generated disassembler. llvm-svn: 135290	2011-07-15 18:46:47 +00:00
Eric Christopher	92464be28c	Check register class matching instead of width of type matching when determining validity of matching constraint. Allow i1 types access to the GR8 reg class for x86. Fixes PR10352 and rdar://9777108 llvm-svn: 135180	2011-07-14 20:13:52 +00:00
Bruno Cardoso Lopes	6778597deb	Add 256-bit load/store recognition and matching in several places. llvm-svn: 135171	2011-07-14 18:50:58 +00:00
Eric Christopher	0c666b4664	Add a testcase for r135123. Part of rdar://9761830 llvm-svn: 135133	2011-07-14 06:23:09 +00:00
Benjamin Kramer	15cd5a3f12	Don't emit a bit test if there is only one case the test can yield false. A simple SETNE is sufficient. llvm-svn: 135126	2011-07-14 01:38:42 +00:00
Bruno Cardoso Lopes	3c4d652210	We already support 256-bit packed ADD, SUB, DIV, MUL. Add testcases. llvm-svn: 135099	2011-07-13 22:28:55 +00:00
Bruno Cardoso Lopes	9613b64916	Make X86ISD::ANDNP more general and Codegen 256-bit VANDNP. A more general version of X86ISD::ANDNP also opened the room for a little bit of refactoring. llvm-svn: 135088	2011-07-13 21:36:51 +00:00
Eli Friedman	344ec79715	Make sure we don't combine a large displacement and a frame index in the same addressing mode on x86-64. It can overflow, leading to a crash/miscompile. <rdar://problem/9763308> llvm-svn: 135084	2011-07-13 21:29:53 +00:00
Bruno Cardoso Lopes	1021b4a9dd	AVX Codegen support for 256-bit versions of vandps, vandpd, vorps, vorpd, vxorps, vxorpd llvm-svn: 135023	2011-07-13 01:15:33 +00:00
Evan Cheng	f863e3fb73	Improve codegen for select's: if (x != 0) x = 1 if (x == 1) x = 1 Previous codegen looks like this: mov r1, r0 cmp r1, #1 mov r0, #0 moveq r0, #1 The naive lowering select between two different values. It should recognize the test is equality test so it's more a conditional move rather than a select: cmp r0, #1 movne r0, #0 rdar://9758317 llvm-svn: 135017	2011-07-13 00:42:17 +00:00

... 3 4 5 6 7 ...

5197 Commits