llvm-project

Commit Graph

Author	SHA1	Message	Date
Eric Christopher	7b92c2a9a0	Revert r116220 - thus turning arm fast isel back on by default. llvm-svn: 116762	2010-10-18 22:53:53 +00:00
Kalle Raiskila	5f2034c455	Improve lowering of sext to i128 on SPU. The old algorithm inserted a 'rotqmbyi' instruction which was both redundant and wrong - it made shufb select bytes from the wrong end of the input quad. llvm-svn: 116701	2010-10-18 09:34:19 +00:00
Michael J. Spencer	5e683250ee	X86-Windows: Emit an undefined global __fltused symbol when targeting Windows if any floating point arguments are passed to an external function. llvm-svn: 116665	2010-10-16 08:25:41 +00:00
Bob Wilson	59351844e1	ARM instructions that are both predicated and set the condition codes have been printed with the "S" modifier after the predicate. With ARM's unified syntax, they are supposed to go in the other order. We fixed this for Thumb when we switched to unified syntax but missed changing it for ARM. Apparently we don't generate these instructions often because no one noticed until now. Thanks to Bill Wendling for the testcase! llvm-svn: 116563	2010-10-15 03:23:44 +00:00
Jim Grosbach	8b6a9c1574	Refactor the MOVsr[al]_flag and RRX pseudo-instructions to really be pseudos and let the ARMExpandPseudoInsts pass fix them up into the real (MOVs) instruction form. llvm-svn: 116534	2010-10-14 22:57:13 +00:00
Jim Grosbach	062749cb25	Tweak the ARM backend to use the RRX mnemonic instead of the 'mov a, b, rrx' pseudonym. llvm-svn: 116512	2010-10-14 20:43:44 +00:00
Rafael Espindola	2216af3fa8	Fix another case where we were preferring instructions with large immediates instead of 8 bits ones. llvm-svn: 116410	2010-10-13 17:14:25 +00:00
Rafael Espindola	8ea9b0eb32	Fix PR8365 by adding a more specialized Pat that checks if an 'and' with 8 bit constants can be used. llvm-svn: 116403	2010-10-13 13:31:20 +00:00
Eric Christopher	a237bdbe52	FileCheckize this in a hope to quiet a valgrind warning on grep. llvm-svn: 116376	2010-10-12 23:47:58 +00:00
Andrew Trick	3e02306fed	PR8297 llvm-svn: 116223	2010-10-11 21:08:42 +00:00
Jakob Stoklund Olesen	6c4353ecee	PowerPC varargs functions store live-in registers on the stack. Make sure we use virtual registers for those stores since RegAllocFast requires that each live physreg only be used once. This fixes PR8357. llvm-svn: 116222	2010-10-11 20:43:09 +00:00
Eric Christopher	e2a0b6841a	Found a bug turning this on by default. Disable again for now. llvm-svn: 116220	2010-10-11 20:26:21 +00:00
Eric Christopher	6002b3b3e1	Remove now non-existent option. llvm-svn: 116219	2010-10-11 20:21:21 +00:00
Andrew Trick	e01c9001c9	Fixes bug 8297: i386 cmpxchg8b, missing MachineMemOperand llvm-svn: 116214	2010-10-11 19:02:04 +00:00
Chris Lattner	1ef5e84c31	Per discussion with Sanjiv, remove the PIC16 target from mainline. When/if it comes back, it will be largely a rewrite, so keeping the old codebase in tree isn't helping anyone. llvm-svn: 116190	2010-10-11 05:44:40 +00:00
Michael J. Spencer	00765e5be0	X86: MinGW should always use libgcc on Windows. llvm-svn: 116177	2010-10-10 23:11:06 +00:00
Michael J. Spencer	7a573a5e1f	X86: Call _alldiv instead of __divdi3 on Windows (excluding cygwin). llvm-svn: 116174	2010-10-10 22:04:34 +00:00
Chris Lattner	f8f7537a77	force a triple, varargs isn't supported with the SVR4 ABI the buildbot tells me. llvm-svn: 116170	2010-10-10 18:59:01 +00:00
Chris Lattner	d10babfd65	fix the expansion of va_arg instruction on PPC to know the arg alignment for PPC32/64, avoiding some masking operations. llvm-gcc expands vaarg inline instead of using the instruction so it has never hit this. llvm-svn: 116168	2010-10-10 18:34:00 +00:00
Evan Cheng	05f13e94bf	Correct some load / store instruction itinerary mistakes: 1. Cortex-A8 load / store multiples can only issue on ALU0. 2. Eliminate A8_Issue, A8_LSPipe will correctly limit the load / store issues. 3. Correctly model all vld1 and vld2 variants. llvm-svn: 116134	2010-10-09 01:03:04 +00:00
Bill Wendling	748265b0da	Simplify test and move into a generic "crash" ll file. llvm-svn: 116130	2010-10-09 00:29:04 +00:00
Bill Wendling	59ebe44049	Check to make sure that the iterator isn't at the beginning of the basic block before decrementing. <rdar://problem/8529919> llvm-svn: 116126	2010-10-09 00:03:48 +00:00
Cameron Esfahani	d57f9ecd4a	Recommit 116056, now with the missing file... llvm-svn: 116083	2010-10-08 19:24:18 +00:00
Andrew Trick	cf97db2402	reverting 116056: win64_params.ll may need to be conditionalized? llvm-svn: 116063	2010-10-08 17:22:42 +00:00
Cameron Esfahani	a07b5c291d	Small patch to restore home register stack space allocation for the Win64 case. Add test case. This code eventually needs to be tighter, since it's always allocating it, even in leaf routines. llvm-svn: 116056	2010-10-08 10:31:30 +00:00
Bob Wilson	056b694de1	Change register allocation order for ARM VFP and NEON registers to put the callee-saved registers at the end of the lists. Also prefer to avoid using the low registers that are in register subclasses required by certain instructions, so that those registers will more likely be available when needed. This change makes a huge improvement in spilling in some cases. Thanks to Jakob for helping me realize the problem. Most of this patch is fixing the testsuite. There are quite a few places where we're checking for specific registers. I changed those to wildcards in places where that doesn't weaken the tests. The spill-q.ll and thumb2-spill-q.ll tests stopped spilling with this change, so I added a bunch of live values to force spills on those tests. llvm-svn: 116055	2010-10-08 06:15:13 +00:00
Chris Lattner	3e210eb398	testcase that goes with r116053 llvm-svn: 116054	2010-10-08 05:12:30 +00:00
Chris Lattner	8ed76f87cf	rename test llvm-svn: 116052	2010-10-08 05:05:06 +00:00
Chris Lattner	420cf26d99	merge tests llvm-svn: 116051	2010-10-08 05:04:58 +00:00
Chris Lattner	6a8a65cb43	filecheckize. llvm-svn: 116050	2010-10-08 05:02:29 +00:00
Chris Lattner	dd77477690	reapply: Use the new TB_NOT_REVERSABLE flag instead of special reapply: reimplement the second half of the or/add optimization. We should now with no changes. Turns out that one missing "Defs = [EFLAGS]" can upset things a bit. llvm-svn: 116040	2010-10-08 03:57:25 +00:00
Daniel Dunbar	efdf08b5b8	Revert "reimplement the second half of the or/add optimization. We should now", which depends on r116007, which I am about to revert. llvm-svn: 116031	2010-10-08 02:07:26 +00:00
Chris Lattner	134f415bf8	reimplement the second half of the or/add optimization. We should now only end up emitting LEA instead of OR. If we aren't able to promote something into an LEA, we should never be emitting it as an ADD. Add some testcases that we emit "or" in cases where we used to produce an "add". llvm-svn: 116026	2010-10-08 01:05:10 +00:00
Chris Lattner	ae8d67d3bb	convert cmp to use a multipattern llvm-svn: 115978	2010-10-07 20:56:25 +00:00
Evan Cheng	5c31bf0619	Canonicalize X86ISD::MOVDDUP nodes to v2f64 to make sure all cases match. Also eliminate unneeded isel patterns. rdar://8520311 llvm-svn: 115977	2010-10-07 20:50:20 +00:00
Jim Grosbach	742adc328a	Allow use of the 16-bit literal move instruction in CMOVs for ARM mode. llvm-svn: 115884	2010-10-07 00:42:42 +00:00
Evan Cheng	49d4c0bd18	- Add TargetInstrInfo::getOperandLatency() to compute operand latencies. This allows targets to correctly compute latency for cases where static scheduling itineraries aren't sufficient. e.g. variable_ops instructions such as ARM::ldm. This also allows targets without scheduling itineraries to compute operand latencies. e.g. X86 can return (approximated) latencies for high latency instructions such as division. - Compute operand latencies for those defined by load multiple instructions, e.g. ldm and those used by store multiple instructions, e.g. stm. llvm-svn: 115755	2010-10-06 06:27:31 +00:00
Bill Wendling	10a0fdeab5	PSHUFW is in SSE, not SSSE3. llvm-svn: 115691	2010-10-05 21:58:12 +00:00
Owen Anderson	d8d1dcc09a	Use a more efficient lowering of uint64_t --> float that can take advantage of hardware signed integer conversion without having to do a double cast (uint64_t --> double --> float). This is based on the algorithm from compiler_rt's __floatundisf for X86-64. llvm-svn: 115634	2010-10-05 17:24:05 +00:00
NAKAMURA Takumi	7681b41720	test/CodeGen/X86/atomic_op.ll: Rename @main to @func. Extra sequences will be inserted to @main as prologue on cygming, to fail. llvm-svn: 115611	2010-10-05 11:16:24 +00:00
Anton Korobeynikov	d77a443631	va_args support for Win64. Patch by Cameron! llvm-svn: 115480	2010-10-03 22:52:07 +00:00
Anton Korobeynikov	ff85688559	Properly emit stack probe on win64 (for non-mingw targets). Based on the patch by Cameron Esfahani! llvm-svn: 115479	2010-10-03 22:02:38 +00:00
Chris Lattner	f909b07340	unbreak buildbot llvm-svn: 115476	2010-10-03 20:02:48 +00:00
Bill Wendling	5d9089ae14	Add test to make sure that the MMX intrinsic calls make it out the other end intact. llvm-svn: 115458	2010-10-03 03:30:30 +00:00
Bill Wendling	bf73fe5e8d	Need to specify SSE4 for machines which don't have SSE4. The code checked for is generated by SSE4. Otherwise, we get something else. llvm-svn: 115352	2010-10-01 21:39:35 +00:00
Bill Wendling	a39904e6b9	We must check for something. llvm-svn: 115309	2010-10-01 10:20:10 +00:00
Bill Wendling	0e5e4b7b76	Disable tests until I can figure out why they're failing on just two machines but not others. llvm-svn: 115308	2010-10-01 10:01:10 +00:00
Bill Wendling	b3a1022572	Try adding an mtriple. llvm-svn: 115307	2010-10-01 09:40:50 +00:00
Kalle Raiskila	56f7cd255b	Zap some redundant 'ori $?, $?, 0' from SPU. Also remove some code that died in the process. One now non-existent ori is checked for. llvm-svn: 115306	2010-10-01 09:20:01 +00:00
Bill Wendling	3b2b1e7942	FileCheck-ize this test. llvm-svn: 115304	2010-10-01 08:55:48 +00:00
Bill Wendling	9b6853c6eb	FileCheck-ize this test. llvm-svn: 115303	2010-10-01 08:50:12 +00:00
Chris Lattner	a205055857	fix rdar://8494845 + PR8244 - a miscompile exposed by my patch in r101350 llvm-svn: 115294	2010-10-01 05:36:09 +00:00
Dale Johannesen	f419de0852	One more +sse2. llvm-svn: 115293	2010-10-01 05:08:18 +00:00
Dale Johannesen	bb6b961867	Mark all these as needing SSE2. Should fix PPC and maybe even Linux. llvm-svn: 115291	2010-10-01 04:17:55 +00:00
Dale Johannesen	ab60ae3cf3	Disable these tests for now; it's not obvious why they fail on Linux. llvm-svn: 115257	2010-10-01 00:59:21 +00:00
Dale Johannesen	c6f17f7420	Make test not sensitive to register choice. llvm-svn: 115250	2010-10-01 00:16:17 +00:00
Dale Johannesen	dd224d2333	Massive rewrite of MMX: The x86_mmx type is used for MMX intrinsics, parameters and return values where these use MMX registers, and is also supported in load, store, and bitcast. Only the above operations generate MMX instructions, and optimizations do not operate on or produce MMX intrinsics. MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into smaller pieces. Optimizations may occur on these forms and the result cast back to x86_mmx, provided the result feeds into a previously existing x86_mmx operation. The point of all this is to prevent optimizations from introducing MMX operations, which is unsafe due to the EMMS problem. llvm-svn: 115243	2010-09-30 23:57:10 +00:00
NAKAMURA Takumi	bb995ae261	test/CodeGen/X86/sibcall.ll: Add explicit triplets and remove XFAIL: apple-darwin8. llvm-svn: 115215	2010-09-30 22:02:06 +00:00
Jakob Stoklund Olesen	eb12f49fb7	Try again to disable critical edge splitting in CodeGenPrepare. The bug that broke i386 linux has been fixed in r115191. llvm-svn: 115204	2010-09-30 20:51:52 +00:00
Jakob Stoklund Olesen	665aa6efcc	When isel is emitting instructions for an x86 target without CMOV, the CFG is edited during emission. If the basic block ends in a switch that gets lowered to a jump table, any phis at the default edge were getting updated wrong. The jump table data structure keeps a pointer to the header blocks that wasn't getting updated after the MBB is split. This bug was exposed on 32-bit Linux when disabling critical edge splitting in codegen prepare. The fix is to update stale MBB pointers whenever a block is split during emission. llvm-svn: 115191	2010-09-30 19:44:31 +00:00
Jason W Kim	645f6c2bef	Tiny patch for proof-of-concept cleanup of ARMAsmPrinter::EmitStartOfAsmFile() Small test for sanity check of resulting ARM .s file. Tested against -r115129. llvm-svn: 115133	2010-09-30 02:45:56 +00:00
Bob Wilson	97bf273870	Increase ARM APCS preferred alignment for i64 and f64 from 32 bits to 64 bits. LDM/STM instructions can run one cycle faster on some ARM processors if the memory address is 64-bit aligned. Radar 8489376. llvm-svn: 115047	2010-09-29 17:54:10 +00:00
Gabor Greif	c2eb72dc2a	do not compare actual branch labels; this may fix llvm-gcc-x86_64-darwin10-cross-mingw32 buildbot too llvm-svn: 115034	2010-09-29 10:45:43 +00:00
Gabor Greif	d36e3e8850	improve heuristics to find the 'and' corresponding to 'tst' to also catch opportunities on thumb2 added some doxygen on the way llvm-svn: 115033	2010-09-29 10:12:08 +00:00
Bill Wendling	cc91601211	And remove r114997's test. llvm-svn: 115003	2010-09-28 23:24:18 +00:00
Bill Wendling	b0b2c57149	Revert r114997. It was causing a failure on darwin10-selfhost. llvm-svn: 115002	2010-09-28 23:11:55 +00:00
Bill Wendling	d848beb1e5	Fix a FIXME. _foo.eh symbols are currently always exported so that the linker knows about them. This is not necessary on 10.6 and later. llvm-svn: 114997	2010-09-28 22:36:56 +00:00
Owen Anderson	a3181e2d79	Add a subtarget hook for reporting the misprediction penalty. Use this to provide more precise cost modeling for if-conversion. Now if only we had a way to estimate the misprediction probability. Adjust CodeGen/ARM/ifcvt10.ll. The pipeline on Cortex-A8 is long enough that it is still profitable to predicate an ldm, but the shorter pipeline on Cortex-A9 makes it unprofitable. llvm-svn: 114995	2010-09-28 21:57:50 +00:00
Anton Korobeynikov	81bdc93bbb	Use proper libcall names & condcodes while compiling for ARM EABI. Patch by Evzen Muller! llvm-svn: 114991	2010-09-28 21:39:26 +00:00
Owen Anderson	88af7d00fc	Part one of switching to using a more sane heuristic for determining if-conversion profitability. Rather than having arbitrary cutoffs, actually try to cost model the conversion. For now, the constants are tuned to more or less match our existing behavior, but these will be changed to reflect realistic values as this work proceeds. llvm-svn: 114973	2010-09-28 18:32:13 +00:00
Bob Wilson	3dc97324c1	Add a command line option "-arm-strict-align" to disallow unaligned memory accesses for ARM targets that would otherwise allow it. Radar 8465431. llvm-svn: 114941	2010-09-28 04:09:35 +00:00
Jakob Stoklund Olesen	415a7a6fec	Revert "Disable codegen prepare critical edge splitting. Machine instruction passes now" This reverts revision 114633. It was breaking llvm-gcc-i386-linux-selfhost. It seems there is a downstream bug that is exposed by -cgp-critical-edge-splitting=0. When that bug is fixed, this patch can go back in. Note that the changes to tailcallfp2.ll are not reverted. They were good and are required. llvm-svn: 114859	2010-09-27 18:43:48 +00:00
Jakob Stoklund Olesen	4f3443e74d	Explicitly disable CGP critical edge splitting for this test so it won't break by reenabling it temporarily. llvm-svn: 114858	2010-09-27 18:43:43 +00:00
Jakob Stoklund Olesen	f2a279b902	Don't depend on basic block numbering. llvm-svn: 114857	2010-09-27 18:43:40 +00:00
Chris Lattner	9f06f911d1	the latest assembler that runs on powerpc 10.4 machines doesn't support aligned comm. Detect when compiling for 10.4 and don't emit an alignment for comm. This will hopefully fix PR8198. llvm-svn: 114817	2010-09-27 06:44:54 +00:00
Che-Liang Chiou	d6142976de	Add test case for PTX ret instruction llvm-svn: 114789	2010-09-25 07:49:54 +00:00
Che-Liang Chiou	299479020a	Add ret instruction to PTX backend llvm-svn: 114788	2010-09-25 07:46:17 +00:00
Evan Cheng	dbcc4b4d4d	Enable code placement optimization pass for ARM. llvm-svn: 114746	2010-09-24 19:07:23 +00:00
Bob Wilson	7fbbe9a43a	Set alignment operand for NEON VST instructions. llvm-svn: 114709	2010-09-23 23:42:37 +00:00
Bob Wilson	9eeb890172	Set alignment operand for NEON VLD instructions. llvm-svn: 114696	2010-09-23 21:43:54 +00:00
Evan Cheng	794aaa79e2	Disable codegen prepare critical edge splitting. Machine instruction passes now break critical edges on demand. llvm-svn: 114633	2010-09-23 06:55:34 +00:00
Owen Anderson	3231d13ddd	A select between a constant and zero, when fed by a bit test, can be efficiently lowered using a series of shifts. Fixes <rdar://problem/8285015>. llvm-svn: 114599	2010-09-22 22:58:22 +00:00
Cameron Esfahani	bbb9287080	Fix PR8201: Update the code to call via X86::CALL64pcrel32 in the 64-bit case. llvm-svn: 114597	2010-09-22 22:35:21 +00:00
Chris Lattner	bd85725341	Fix an inconsistency in the x86 backend that led it to reject "calll foo" on x86-32: 32-bit calls were named "call" not "calll". 64-bit calls were correctly named "callq", so this only impacted x86-32. This fixes rdar://8456370 - llvm-mc rejects 'calll' This also exposes that mingw/64 is generating a 32-bit call instead of a 64-bit call, I will file a bugzilla. llvm-svn: 114534	2010-09-22 05:49:14 +00:00
Chris Lattner	8a236b63d8	reimplement elf TLS support in terms of addressing modes, eliminating SegmentBaseAddress. llvm-svn: 114529	2010-09-22 04:39:11 +00:00
Chris Lattner	505af598d0	linux has a different stack alignment than the mac, relax this a bit. llvm-svn: 114519	2010-09-22 00:46:26 +00:00
Chris Lattner	54e5329545	give VZEXT_LOAD a memory operand, it now works with segment registers. llvm-svn: 114515	2010-09-22 00:34:38 +00:00
Chris Lattner	07827ba978	revert r114386 now that address modes work correctly, we get a nice call through gs-relative memory now. llvm-svn: 114510	2010-09-22 00:11:31 +00:00
Chris Lattner	e479e9643b	give LCMPXCHG_DAG[8] a memory operand, allowing it to work with addrspace 256/257 llvm-svn: 114508	2010-09-21 23:59:42 +00:00
Chris Lattner	0cefa51114	filecheckize llvm-svn: 114507	2010-09-21 23:57:27 +00:00
Evan Cheng	d757c88bba	OptimizeCompareInstr should avoid iterating past the beginning of the MBB when the 'and' instruction is after the comparison. llvm-svn: 114506	2010-09-21 23:49:07 +00:00
Owen Anderson	61158f98ab	Enable target-specific mul-lowering on ARM, even at -Os. Remove a test that this makes irrelevant, but add a new test for the new, improved functionality. llvm-svn: 114494	2010-09-21 22:51:46 +00:00
Devang Patel	d92f42d1d0	Use FileCheck llvm-svn: 114475	2010-09-21 20:50:32 +00:00
Owen Anderson	f4b1a5bdc4	When adding the carry bit to another value on X86, exploit the fact that the carry-materialization (sbbl x, x) sets the registers to 0 or ~0. Combined with two's complement arithmetic, we can fold the intermediate AND and the ADD into a single SUB. This fixes <rdar://problem/8449754>. llvm-svn: 114460	2010-09-21 18:41:19 +00:00
Chris Lattner	bb0a1c44bf	fix rdar://8453210, a crash handling a call through a GS relative load. For now, just disable folding the load into the call. llvm-svn: 114386	2010-09-21 03:37:00 +00:00
Evan Cheng	f3e9a48584	Enable machine sinking critical edge splitting. e.g. define double @foo(double %x, double %y, i1 %c) nounwind { %a = fdiv double %x, 3.2 %z = select i1 %c, double %a, double %y ret double %z } Was: _foo: divsd LCPI0_0(%rip), %xmm0 testb $1, %dil jne LBB0_2 movaps %xmm1, %xmm0 LBB0_2: ret Now: _foo: testb $1, %dil je LBB0_2 divsd LCPI0_0(%rip), %xmm0 ret LBB0_2: movaps %xmm1, %xmm0 ret This avoids the divsd when early exit is taken. rdar://8454886 llvm-svn: 114372	2010-09-20 22:52:00 +00:00
Owen Anderson	5e4734245d	CombinerAA is now reordering these stores. llvm-svn: 114354	2010-09-20 20:56:29 +00:00
Owen Anderson	272ff94916	When TCO is turned on, it is possible to end up with aliasing FrameIndex's. Therefore, CombinerAA cannot assume that different FrameIndex's never alias, but can instead use MachineFrameInfo to get the actual offsets of these slots and check for actual aliasing. This fixes CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll and CodeGen/X86/tailcallstack64.ll when CombinerAA is enabled, modulo a different register allocation sequence. llvm-svn: 114348	2010-09-20 20:39:59 +00:00
Jim Grosbach	94dfd6fc4f	Simplify ARM callee-saved register handling by removing the distinction between the high and low registers for prologue/epilogue code. This was a Darwin-only thing that wasn't providing a realistic benefit anymore. Combining the save areas simplifies the compiler code and results in better ARM/Thumb2 codegen. For example, previously we would generate code like: push {r4, r5, r6, r7, lr} add r7, sp, #12 stmdb sp!, {r8, r10, r11} With this change, we combine the register saves and generate: push {r4, r5, r6, r7, r8, r10, r11, lr} add r7, sp, #12 rdar://8445635 llvm-svn: 114340	2010-09-20 19:32:20 +00:00
NAKAMURA Takumi	b912c27fc9	test/CodeGen/X86: Add explicit triplet -mtriple=i686-linux to 3 tests incompatible to Win32 codegen. r114297 raises 3 failures. They might fail also on mingw. llvm-svn: 114317	2010-09-19 21:58:55 +00:00
Eric Christopher	dbb199d89b	Add the exit instruction to the PTX target. Patch by Che-Liang Chiou <clchiou@gmail.com>! llvm-svn: 114294	2010-09-18 18:52:28 +00:00
Owen Anderson	b92b13d8a0	Invert the logic of reachesChainWithoutSideEffects(). What we want to check is that there is NO path to the destination containing side effects, not that SOME path contains no side effects. In practice, this only manifests with CombinerAA enabled, because otherwise the chain has little to no branching, so "any" is effectively equivalent to "all". llvm-svn: 114268	2010-09-18 04:45:14 +00:00
Bob Wilson	cb6db98897	Add target-specific DAG combiner for BUILD_VECTOR and VMOVRRD. An i64 value should be in GPRs when it's going to be used as a scalar, and we use VMOVRRD to make that happen, but if the value is converted back to a vector we need to fold to a simple bit_convert. Radar 8407927. llvm-svn: 114233	2010-09-17 22:59:05 +00:00
Jim Grosbach	7a6c37d3e7	Teach the (non-MC) instruction printer to use the canonical names for push/pop, and shift instructions on ARM. Update the tests to match. llvm-svn: 114230	2010-09-17 22:36:38 +00:00
Evan Cheng	e53ab6dffc	Teach machine sink to 1) Do forward copy propagation. This makes it easier to estimate the cost of the instruction being sunk. 2) Break critical edges on demand, including cases where the value is used by PHI nodes. Critical edge splitting is not yet enabled by default. llvm-svn: 114227	2010-09-17 22:28:18 +00:00
Jim Grosbach	6d800f88da	Update tests to handle MC-inst instruction printing of shift operations. The legacy asm printer uses instructions of the form, "mov r0, r0, lsl #3", while the MC-instruction printer uses the form "lsl r0, r0, #3". The latter mnemonic is correct and preferred according to the ARM documentation (A8.6.98). The former are pseudo-instructions for the latter. llvm-svn: 114221	2010-09-17 21:58:46 +00:00
Jim Grosbach	4a5e54021a	FileCheck-ize llvm-svn: 114218	2010-09-17 21:46:16 +00:00
Jim Grosbach	20da4e360b	Move thumb2 tests to the thumb2 directory llvm-svn: 114206	2010-09-17 20:34:09 +00:00
Jim Grosbach	9b0cd20f72	tweak test to check instructions rather than relying on the comment string llvm-svn: 114204	2010-09-17 20:27:26 +00:00
Dan Gohman	534db8a5c8	Avoid emitting a PIC base register if no PIC addresses are needed. This fixes rdar://8396318. llvm-svn: 114201	2010-09-17 20:24:24 +00:00
Jim Grosbach	f3ceecec7e	tweak test to check instructions rather than relying on the comment string llvm-svn: 114200	2010-09-17 20:21:03 +00:00
Jim Grosbach	c18a460adc	tweak test to check instructions rather than relying on the comment string llvm-svn: 114199	2010-09-17 20:17:41 +00:00
Dale Johannesen	f95f59a0c2	When substituting sunkaddrs into indirect arguments of an asm, we were walking the asm arguments once and stashing their Values. This is wrong because the same memory location can be in the list twice, and if the first one has a sunkaddr substituted, the stashed value for the second one will be wrong (use-after-free). PR 8154. llvm-svn: 114104	2010-09-16 18:30:55 +00:00
Kalle Raiskila	c0e9b8d8bb	Change SPU register re-interpretations from OR to COPY_TO_REGCLASS instruction. This cleans up after the mess r108567 left in the CellSPU backend. ORCvt-instruction were used to reinterpret registers, and the ORs were then removed by isMoveInstr(). This patch now removes 350 instructions of format: or $3, $3, $3 (from the 52 testcases in CodeGen/CellSPU). One case of a nonexistent or is checked for. Some moves of the form 'ori $., $., 0' and 'ai $., $., 0' still remain. llvm-svn: 114074	2010-09-16 12:29:33 +00:00
Bob Wilson	660d7ecf32	Reapply Gabor's 113839, 113840, and 113876 with a fix for a problem encountered while building llvm-gcc for arm. This is probably the same issue that the ppc buildbot hit. llvm::prior works on a MachineBasicBlock::iterator, not a plain MachineInstr. llvm-svn: 113983	2010-09-15 17:12:08 +00:00
Gabor Greif	9ae4b271f2	the darwin9-powerpc buildbot keeps consistently crashing, backing out following to get it back to green, so I can investigate in peace: svn merge -c -113840 llvm/test/CodeGen/ARM/arm-and-tst-peephole.ll svn merge -c -113876 -c -113839 llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp llvm-svn: 113980	2010-09-15 16:53:07 +00:00
Gabor Greif	00e34f4b32	forgot the testcase change for r113839 llvm-svn: 113840	2010-09-14 09:30:17 +00:00
Gabor Greif	5dbe800203	test for and-tst peephole optimization documents the status-quo with its opportunities llvm-svn: 113838	2010-09-14 08:50:43 +00:00
Owen Anderson	c237a849e3	Re-apply r113679, which was reverted in r113720, which added a pair of new instcombine transforms to expose greater opportunities for store narrowing in codegen. This patch fixes a potential infinite loop in instcombine caused by one of the introduced transforms being overly aggressive. llvm-svn: 113763	2010-09-13 17:59:27 +00:00
Eric Christopher	26abd3e0c2	Revert 113679, it was causing an infinite loop in a testcase that I've sent on to Owen. llvm-svn: 113720	2010-09-12 06:09:23 +00:00
Evan Cheng	1d6aa46cd7	Fix test so it passes on non-Darwin hosts. llvm-svn: 113577	2010-09-10 06:20:01 +00:00
Bob Wilson	8617234658	Fix merging base-updates for VLDM/VSTM: Before I switched these instructions to use AddrMode4, there was a count of the registers stored in one of the operands. I changed that to just count the operands but forgot to adjust for the size of D registers. This was noticed by Evan as a performance problem but it is a potential correctness bug as well, since it is possible that this could merge a base update with a non-matching immediate. llvm-svn: 113576	2010-09-10 05:15:04 +00:00
Evan Cheng	bf4070756f	Teach if-converter to be more careful with predicating instructions that would take multiple cycles to decode. For the current if-converter clients (actually only ARM), the instructions that are predicated on false are not nops. They would still take machine cycles to decode. Micro-coded instructions such as LDM / STM can potentially take multiple cycles to decode. If-converter should take treat them as non-micro-coded simple instructions. llvm-svn: 113570	2010-09-10 01:29:16 +00:00
Bruno Cardoso Lopes	e8501a468c	Add one more pattern to fallback movddup llvm-svn: 113522	2010-09-09 18:48:34 +00:00
Bob Wilson	4adbaf1843	Fix NEON VLD pseudo instruction itineraries that were incorrectly copied from the VST pseudos. The VLD/VST scheduling still needs work (see pr6722), but at least we shouldn't confuse the loads with the stores. llvm-svn: 113473	2010-09-09 05:40:26 +00:00
Jim Grosbach	504d23bd05	Re-enable usage of the ARM base pointer. r113394 fixed the known failures. Re-running some nightly testers w/ it enabled to verify. llvm-svn: 113399	2010-09-08 20:12:02 +00:00
Eric Christopher	ca2ec95154	Remove ssp from this test. llvm-svn: 113392	2010-09-08 19:32:34 +00:00
Kalle Raiskila	e542972828	Fix CellSPU vector shuffles, again. Some cases of lowering to rotate were miscompiled. llvm-svn: 113355	2010-09-08 11:53:38 +00:00
Jim Grosbach	261df12f64	disable for the moment while tracking down a few Thumb2-O0 failures that look related. (attempt deux, complete w/ test update this time) llvm-svn: 113333	2010-09-08 02:00:34 +00:00
Devang Patel	3f4abf397c	remove these tests for now. llvm-svn: 113293	2010-09-07 22:03:44 +00:00
Devang Patel	b0af23a1f6	There is no need to force target if the test is going to run on other x86 platforms. llvm-svn: 113285	2010-09-07 20:59:09 +00:00
Devang Patel	e50b23e223	Fix command line used to link these test cases. llvm-svn: 113237	2010-09-07 18:17:56 +00:00
Devang Patel	9dc0e5be58	Reintroduce dbg-declare tests. llvm-svn: 113232	2010-09-07 18:01:49 +00:00
Devang Patel	688338eec3	Remove last three tests. I need to make them independent of my setup. llvm-svn: 113213	2010-09-07 17:08:57 +00:00
Devang Patel	55a3bab0d2	Add a test case to check handling of dbg-declare during hybrid mode where we begin using fast-isel but switch back to DAG building at some point. llvm-svn: 113210	2010-09-07 17:03:44 +00:00
Devang Patel	29a775adf1	Add a test case to check handling of dbg-declare by selection DAG builder. llvm-svn: 113209	2010-09-07 16:56:35 +00:00
Devang Patel	184c81c3e2	Add a test case to check handling of dbg-declare by fast-isel. llvm-svn: 113208	2010-09-07 16:40:53 +00:00
Chris Lattner	eeba0c73e5	implement rdar://6653118 - fastisel should fold loads where possible. Since mem2reg isn't run at -O0, we get a ton of reloads from the stack, for example, before, this code: int foo(int x, int y, int z) { return x+y+z; } used to compile into: _foo: ## @foo subq $12, %rsp movl %edi, 8(%rsp) movl %esi, 4(%rsp) movl %edx, (%rsp) movl 8(%rsp), %edx movl 4(%rsp), %esi addl %edx, %esi movl (%rsp), %edx addl %esi, %edx movl %edx, %eax addq $12, %rsp ret Now we produce: _foo: ## @foo subq $12, %rsp movl %edi, 8(%rsp) movl %esi, 4(%rsp) movl %edx, (%rsp) movl 8(%rsp), %edx addl 4(%rsp), %edx ## Folded load addl (%rsp), %edx ## Folded load movl %edx, %eax addq $12, %rsp ret Fewer instructions and less register use = faster compiles. llvm-svn: 113102	2010-09-05 02:18:34 +00:00
Dale Johannesen	367afb5a00	Remove the rest of the nonexistent 64-bit AVX instructions. Bruno, please review. llvm-svn: 113014	2010-09-03 21:23:00 +00:00
Jim Grosbach	03f4be86ba	Re-apply r112883: "For ARM stack frames that utilize variable sized objects and have either large local stack areas or require dynamic stack realignment, allocate a base register via which to access the local frame. This allows efficient access to frame indices not accessible via the FP (either due to being out of range or due to dynamic realignment) or the SP (due to variable sized object allocation). In particular, this greatly improves efficiency of access to spill slots in Thumb functions which contain VLAs." r112986 fixed a latent bug exposed by the above. llvm-svn: 112989	2010-09-03 18:37:12 +00:00
Daniel Dunbar	2ac3386ef3	Revert "For ARM stack frames that utilize variable sized objects and have either", it is breaking oggenc with Clang for ARMv6. This reverts commit 8d6e29cfda270be483abf638850311670829ee65. llvm-svn: 112962	2010-09-03 15:26:42 +00:00
NAKAMURA Takumi	24d039ebe3	test/CodeGen/X86: Add explicit -mtriple=(i686|x86_64)-linux for Win32 host. llvm-svn: 112947	2010-09-03 03:24:08 +00:00
Bruno Cardoso Lopes	d6634a5b2e	AVX doesn't support mm operations nor its intrinsics. The AVX versions of PALIGN and PABS* should only exist for 128-bit. Remove the unnecessary stuff. llvm-svn: 112944	2010-09-03 02:08:45 +00:00
Bob Wilson	f65c9ef720	Replace NEON vabdl, vaba, and vabal intrinsics with combinations of the vabd intrinsic and add and/or zext operations. In the case of vaba, this also avoids the need for a DAG combine pattern to combine vabd with add. Update tests. Auto-upgrade the old intrinsics. llvm-svn: 112941	2010-09-03 01:35:08 +00:00
Anton Korobeynikov	a5a645559c	Properly emit __chkstk call instead of __alloca on non-mingw windows targets. Patch by Cameron Esfahani! llvm-svn: 112902	2010-09-02 23:03:46 +00:00
Jim Grosbach	7fd9aea67c	For ARM stack frames that utilize variable sized objects and have either large local stack areas or require dynamic stack realignment, allocate a base register via which to access the local frame. This allows efficient access to frame indices not accessible via the FP (either due to being out of range or due to dynamic realignment) or the SP (due to variable sized object allocation). In particular, this greatly improves efficiency of access to spill slots in Thumb functions which contain VLAs. rdar://7352504 rdar://8374540 rdar://8355680 llvm-svn: 112883	2010-09-02 22:29:01 +00:00
Dan Gohman	3c9b5f394b	Don't narrow the load and store in a load+twiddle+store sequence unless there are clearly no stores between the load and the store. This fixes this miscompile reported as PR7833. This breaks the test/CodeGen/X86/narrow_op-2.ll optimization, which is safe, but awkward to prove safe. Move it to X86's README.txt. llvm-svn: 112861	2010-09-02 21:18:42 +00:00
Sandeep Patel	0ca17f7e8a	Fix an unnecessary XFAIL llvm-svn: 112853	2010-09-02 20:19:24 +00:00
Jim Grosbach	66c681a644	Now that register allocation properly considers reserved regs, simplify the ARM register class allocation order functions to take advantage of that. llvm-svn: 112841	2010-09-02 18:14:29 +00:00
Bob Wilson	75a6408f88	Convert VLD1 and VLD2 instructions to use pseudo-instructions until after regalloc. llvm-svn: 112825	2010-09-02 16:00:54 +00:00
NAKAMURA Takumi	a224e5563e	test/loop-strength-reduce4: Add explicit triplet for Win32 host. llvm-svn: 112802	2010-09-02 03:45:58 +00:00
NAKAMURA Takumi	54ce546865	test/twoaddr-coalesce: Do not use @main. Win32 codegen emits an implicit call to __main into it, causing the test to fail. llvm-svn: 112801	2010-09-02 03:45:51 +00:00
Bob Wilson	38ab35a911	Remove NEON vmull, vmlal, and vmlsl intrinsics, replacing them with multiply, add, and subtract operations with zero-extended or sign-extended vectors. Update tests. Add auto-upgrade support for the old intrinsics. llvm-svn: 112773	2010-09-01 23:50:19 +00:00
Bruno Cardoso Lopes	fea81b4831	Using target specific nodes for shuffle nodes makes the mask check more strict, breaking some cases not checked in the testsuite, but also exposes some foldings not done before, as this example: movaps (%rdi), %xmm0 movaps (%rax), %xmm1 movaps %xmm0, %xmm2 movss %xmm1, %xmm2 shufps $36, %xmm2, %xmm0 now is generated as: movaps (%rdi), %xmm0 movaps %xmm0, %xmm1 movlps (%rax), %xmm1 shufps $36, %xmm1, %xmm0 llvm-svn: 112753	2010-09-01 22:33:20 +00:00
Jakob Stoklund Olesen	4b6fd48bba	Teach RemoveCopyByCommutingDef to check all aliases, not just subregisters. This caused a miscompilation in WebKit where %RAX had conflicting defs when RemoveCopyByCommutingDef was commuting a %EAX use. llvm-svn: 112751	2010-09-01 22:15:35 +00:00
Chris Lattner	39eccb4754	temporarily revert r112664, it is causing a decoding conflict, and the testcases should be merged. llvm-svn: 112711	2010-09-01 16:00:50 +00:00
Dan Gohman	110ed64fbb	Revert 112442 and 112440 until the compile time problems introduced by 112440 are resolved. llvm-svn: 112692	2010-09-01 01:45:53 +00:00
Bill Wendling	6789f8b6ae	We have a chance for an optimization. Consider this code: int x(int t) { if (t & 256) return -26; return 0; } We generate this: tst.w r0, #256 mvn r0, #25 it eq moveq r0, #0 while gcc generates this: ands r0, r0, #256 it ne mvnne r0, #25 bx lr Scandalous really! During ISel time, we can look for this particular pattern. One where we have a "MOVCC" that uses the flag off of a CMPZ that itself is comparing an AND instruction to 0. Something like this (greatly simplified): %r0 = ISD::AND ... ARMISD::CMPZ %r0, 0 @ sets [CPSR] %r0 = ARMISD::MOVCC 0, -26 @ reads [CPSR] All we have to do is convert the "ISD::AND" into an "ARM::ANDS" that sets [CPSR] when it's zero. The zero value will already be in the %r0 register and we only need to change it if the AND wasn't zero. Easy! llvm-svn: 112664	2010-08-31 22:41:22 +00:00
Jim Grosbach	ad9b6de3b6	Update test for 112609 llvm-svn: 112610	2010-08-31 17:58:47 +00:00
Anton Korobeynikov	3a1d87a7ba	Fix borken test llvm-svn: 112555	2010-08-30 23:41:49 +00:00
Bob Wilson	4cd8a126c3	Remove NEON vmovn intrinsic, replacing it with vector truncate operations. Auto-upgrade the old intrinsic and update tests. llvm-svn: 112507	2010-08-30 20:02:30 +00:00
Chris Lattner	34bfab0ad5	two changes: 1) nuke ConstDataCoalSection, which is dead. 2) revise my previous patch for rdar://8018335, which was completely wrong. Specifically, it doesn't make sense to mark __TEXT,__const_coal as PURE_INSTRUCTIONS, because it is for readonly data. templates (it turns out) go to const_coal_nt. The real fix for rdar://8018335 was to give ConstTextCoalSection a section kind of ReadOnly instead of Text. llvm-svn: 112496	2010-08-30 18:12:35 +00:00
Duncan Sands	68c30907cc	Correct bogus module triple specifications. llvm-svn: 112469	2010-08-30 10:48:29 +00:00
Dan Gohman	3a08ed7904	Make IVUsers iterative instead of recursive. This has the side effect of reversing the order of most of IVUser's results. llvm-svn: 112442	2010-08-29 16:40:03 +00:00
Dan Gohman	6665550bca	Make this test less dependent on register allocation choices. llvm-svn: 112426	2010-08-29 14:49:42 +00:00
Kalle Raiskila	1e616572d9	Fix lowering of INSERT_VECTOR_ELT in SPU. The IDX was treated as byte index, not element index. llvm-svn: 112422	2010-08-29 12:41:50 +00:00
Bob Wilson	d0c054886c	Remove NEON vaddl, vaddw, vsubl, and vsubw intrinsics. Instead, use llvm IR add/sub operations with one or both operands sign- or zero-extended. Auto-upgrade the old intrinsics. llvm-svn: 112416	2010-08-29 05:57:34 +00:00
Chris Lattner	c2887bc283	merge a bunch of shuffle tests into sse2.ll llvm-svn: 112398	2010-08-29 03:19:04 +00:00
Chris Lattner	b1ff978406	add some nounwind's llvm-svn: 112396	2010-08-29 03:07:47 +00:00
Chris Lattner	94656b1c8c	fix the buildvector->insertp[sd] logic to not always create a redundant insertp[sd] $0, which is a noop. Before: _f32: ## @f32 pshufd $1, %xmm1, %xmm2 pshufd $1, %xmm0, %xmm3 addss %xmm2, %xmm3 addss %xmm1, %xmm0 ## kill: XMM0<def> XMM0<kill> XMM0<def> insertps $0, %xmm0, %xmm0 insertps $16, %xmm3, %xmm0 ret after: _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm3 addss %xmm1, %xmm3 movdqa %xmm2, %xmm0 insertps $16, %xmm3, %xmm0 ret The extra movs are due to a random (poor) scheduling decision. llvm-svn: 112379	2010-08-28 17:59:08 +00:00
Chris Lattner	bcb6090ad0	fix the BuildVector -> unpcklps logic to not do pointless shuffles when the top elements of a vector are undefined. This happens all the time for X86-64 ABI stuff because only the low 2 elements of a 4 element vector are defined. For example, on: _Complex float f32(_Complex float A, _Complex float B) { return A+B; } We used to produce (with SSE2, SSE4.1+ uses insertps): _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $16, %xmm2, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm0 addss %xmm1, %xmm0 pshufd $16, %xmm0, %xmm1 movdqa %xmm2, %xmm0 unpcklps %xmm1, %xmm0 ret We now produce: _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm3 addss %xmm1, %xmm3 movaps %xmm2, %xmm0 unpcklps %xmm3, %xmm0 ret This implements rdar://8368414 llvm-svn: 112378	2010-08-28 17:28:30 +00:00
Dan Gohman	e06905d1f0	Completely disable tail calls when fast-isel is enabled, as fast-isel doesn't currently support dealing with this. llvm-svn: 112341	2010-08-28 00:51:03 +00:00
Bob Wilson	13ce07fa92	Change ARM VFP VLDM/VSTM instructions to use addressing mode #4, just like all the other LDM/STM instructions. This fixes asm printer crashes when compiling with -O0. I've changed one of the NEON tests (vst3.ll) to run with -O0 to check this in the future. Prior to this change VLDM/VSTM used addressing mode #5, but not really. The offset field was used to hold a count of the number of registers being loaded or stored, and the AM5 opcode field was expanded to specify the IA or DB mode, instead of the standard ADD/SUB specifier. Much of the backend was not aware of these special cases. The crashes occurred when rewriting a frameindex caused the AM5 offset field to be changed so that it did not have a valid submode. I don't know exactly what changed to expose this now. Maybe we've never done much with -O0 and NEON. Regardless, there's no longer any reason to keep a count of the VLDM/VSTM registers, so we can use addressing mode #4 and clean things up in a lot of places. llvm-svn: 112322	2010-08-27 23:18:17 +00:00
Chris Lattner	7413e87b6d	get this test passing on linux builders. llvm-svn: 112280	2010-08-27 18:49:08 +00:00
Bob Wilson	edf722add3	Add alignment arguments to all the NEON load/store intrinsics. Update all the tests using those intrinsics and add support for auto-upgrading bitcode files with the old versions of the intrinsics. llvm-svn: 112271	2010-08-27 17:13:24 +00:00
Daniel Dunbar	1844a71e66	X86: Fix an encoding issue with LOCK_ADD64mr, which could lead to very hard to find miscompiles with the integrated assembler. llvm-svn: 112250	2010-08-27 01:30:14 +00:00
Chris Lattner	af23e9a798	Add a hackaround for PR7993 which is causing failures on x86 builders that lack sse2. llvm-svn: 112175	2010-08-26 06:57:07 +00:00
Chris Lattner	66afba7aa4	I think enough general codegen bugs are fixed to allow this to work on random hosts, lets see! llvm-svn: 112172	2010-08-26 05:52:42 +00:00
Chris Lattner	eb2cc0ce0e	implement SplitVecOp_CONCAT_VECTORS, fixing the included testcase with SSE1. llvm-svn: 112171	2010-08-26 05:51:22 +00:00
Chris Lattner	825294b85f	Make sure this forces the x86 targets llvm-svn: 112169	2010-08-26 05:25:05 +00:00
Chris Lattner	cc60609cb4	fix sse1 only codegen in x86-64 mode, which is something we apparently try to support. llvm-svn: 112168	2010-08-26 05:24:29 +00:00
Jim Grosbach	08da771ec3	Enable pre-RA virtual frame base register allocation. rdar://8277890 llvm-svn: 112127	2010-08-26 00:58:06 +00:00
Bob Wilson	4629f423f8	Revert svn 107892 (with changes to work with trunk). It caused a crash if a VLD result was not used (Radar 8355607). It should also fix pr7988, but I haven't verified that yet. llvm-svn: 112118	2010-08-26 00:13:36 +00:00
Chris Lattner	c7fb446a9d	temporarily disable this, which started failing on the llvm-i686-linux builder. I will investigate tonight. llvm-svn: 112113	2010-08-25 23:43:14 +00:00
Chris Lattner	75ff053497	Change handling of illegal vector types to widen when possible instead of expanding: e.g. <2 x float> -> <4 x float> instead of -> 2 floats. This affects two places in the code: handling cross block values and handling function return and arguments. Since vectors are already widened by legalizetypes, this gives us much better code and unblocks x86-64 abi and SPU abi work. For example, this (which is a silly example of a cross-block value): define <4 x float> @test2(<4 x float> %A) nounwind { %B = shufflevector <4 x float> %A, <4 x float> undef, <2 x i32> <i32 0, i32 1> %C = fadd <2 x float> %B, %B br label %BB BB: %D = fadd <2 x float> %C, %C %E = shufflevector <2 x float> %D, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef> ret <4 x float> %E } Now compiles into: _test2: ## @test2 ## BB#0: addps %xmm0, %xmm0 addps %xmm0, %xmm0 ret previously it compiled into: _test2: ## @test2 ## BB#0: addps %xmm0, %xmm0 pshufd $1, %xmm0, %xmm1 ## kill: XMM0<def> XMM0<kill> XMM0<def> insertps $0, %xmm0, %xmm0 insertps $16, %xmm1, %xmm0 addps %xmm0, %xmm0 ret This implements rdar://8230384 llvm-svn: 112101	2010-08-25 22:49:25 +00:00
Daniel Dunbar	a54a1b0edf	ARM/Thumb2: Fix a misselect in getARMCmp, when attempting to adjust a signed comparison that would overflow. - The other under/overflow cases can't actually happen because the immediates which would trigger them are legal (so we don't enter this code), but adjusted the style to make it clear the transform is always valid. llvm-svn: 112053	2010-08-25 16:58:05 +00:00
Eric Christopher	6b1533a1a9	Add another basic test cribbed from the x86 fast-isel tests. llvm-svn: 112036	2010-08-25 07:57:29 +00:00
Eric Christopher	37d547aee6	Run this on thumb and arm. llvm-svn: 112035	2010-08-25 07:53:15 +00:00
Eric Christopher	e58c03698e	Make this testcase actually executed with fast-isel on arm. llvm-svn: 112033	2010-08-25 07:47:00 +00:00
Bruno Cardoso Lopes	0bc919fa35	Convert test to use filecheck and make it more specific llvm-svn: 112016	2010-08-25 01:47:16 +00:00
Dan Gohman	c88fda477a	Fix X86's isLegalAddressingMode to recognize that static addresses need not be RIP-relative in small mode. llvm-svn: 111917	2010-08-24 15:55:12 +00:00
Kalle Raiskila	7e25bc4145	Fix SPU BE to use all the available return registers. llc used to assert on the added testcase. llvm-svn: 111911	2010-08-24 11:50:48 +00:00
Chris Lattner	58bd73a5a7	Add a new llvm.x86.int intrinsic, allowing access to the x86 int and int3 instructions. Patch by Peter Housel! llvm-svn: 111831	2010-08-23 19:39:25 +00:00
Dan Gohman	42ef669d81	Fix x86 fast-isel's cmp+branch folding to avoid folding when the comparison is in a different basic block from the branch. In such cases, the comparison's operands may not have initialized virtual registers available. llvm-svn: 111709	2010-08-21 02:32:36 +00:00
Bob Wilson	be745d8c00	Replace some NEON vmovl intrinsic that I missed earlier. llvm-svn: 111696	2010-08-20 23:22:43 +00:00
Bob Wilson	9a511c07e4	Replace the arm.neon.vmovls and vmovlu intrinsics with vector sign-extend and zero-extend operations. llvm-svn: 111614	2010-08-20 04:54:02 +00:00
Evan Cheng	361b9be7c6	It's possible to sink a def if its local uses are PHI's. llvm-svn: 111537	2010-08-19 18:33:29 +00:00
Dan Gohman	82656fb0e1	When sending stats output to stdout for grepping, don't emit normal output to standard output also. llvm-svn: 111435	2010-08-18 22:22:44 +00:00
Dan Gohman	2470818942	When sending stats output to stdout for grepping, don't emit normal output to standard output also. llvm-svn: 111401	2010-08-18 20:32:46 +00:00
Kalle Raiskila	e60b5161d1	Fix a bug with insertelement on SPU. The previous algorithm in LowerVECTOR_SHUFFLE didn't check all requirements for "monotonic" shuffles. llvm-svn: 111361	2010-08-18 10:20:29 +00:00
Kalle Raiskila	ab49360f59	Remove all traces of v2[i,f]32 on SPU. The "half vectors" are now widened to full size by the legalizer. The only exception is in parameter passing, where half vectors are expanded. This causes changes to some dejagnu tests. llvm-svn: 111360	2010-08-18 10:04:39 +00:00
Kalle Raiskila	f3984d1ef6	Change SPU C calling convention to match that described in "SPU Application Binary Interface Specification, v1.9" by IBM. Specifically: use r3-r74 to pass parameters and the return value. llvm-svn: 111358	2010-08-18 09:50:30 +00:00
Bob Wilson	fb7eaff759	Expand ZERO_EXTEND operations for NEON vector types. Testcase from Nick Lewycky. llvm-svn: 111341	2010-08-18 01:45:52 +00:00
Dan Gohman	ed2b005842	Tweak IVUsers' concept of "interesting" to exclude add recurrences where the step value is an induction variable from an outer loop, to avoid trouble trying to re-expand such expressions. This effectively hides such expressions from indvars and lsr, which prevents them from getting into trouble. llvm-svn: 111317	2010-08-17 22:50:37 +00:00
Evan Cheng	efdc74ea59	Add nounwind. llvm-svn: 111312	2010-08-17 22:35:20 +00:00
Dale Johannesen	16f96445c3	Make fast scheduler handle asm clobbers correctly. PR 7882. Follows suggestion by Amaury Pouly, thanks. llvm-svn: 111306	2010-08-17 22:17:24 +00:00
Bob Wilson	942b10f511	Change ARM PKHTB and PKHBT instructions to use a shift_imm operand to avoid printing "lsl #0". This fixes the remaining parts of pr7792. Make corresponding changes for encoding/decoding these instructions. llvm-svn: 111251	2010-08-17 17:23:19 +00:00
Bob Wilson	411dfad981	Allow more cases of undef shuffle indices and add tests for them. llvm-svn: 111226	2010-08-17 05:54:34 +00:00
Evan Cheng	f259efde47	PHI elimination should not break back edge. It can cause some significant code placement issues. rdar://8263994 good: LBB0_2: mov r2, r0 . . . mov r1, r2 bne LBB0_2 bad: LBB0_2: mov r2, r0 . . . @ BB#3: mov r1, r2 b LBB0_2 llvm-svn: 111221	2010-08-17 01:20:36 +00:00
Bob Wilson	eee4824f74	Add a testcase for svn 111208. llvm-svn: 111212	2010-08-16 23:44:29 +00:00
Bob Wilson	804f6159f1	Generalize a pattern for PKHTB: an SRL of 16-31 bits will guarantee that the high halfword is zero. The shift need not be exactly 16 bits. llvm-svn: 111196	2010-08-16 22:26:55 +00:00
Bob Wilson	3fd1e0dcda	Convert test to FileCheck. llvm-svn: 111195	2010-08-16 22:21:13 +00:00
Bob Wilson	8f553757c4	Convert a test to use FileCheck. llvm-svn: 111153	2010-08-16 17:05:27 +00:00
Benjamin Kramer	cbc55d9dc0	Test expects SSE, give him SSE. llvm-svn: 111115	2010-08-15 23:32:03 +00:00
Benjamin Kramer	4566466b7f	Restore arch on these tests, they fail on arm. llvm-svn: 111109	2010-08-15 20:42:56 +00:00
Dale Johannesen	339423c460	Mark as XFAIL on darwin 8. PR 7886. llvm-svn: 111108	2010-08-15 19:40:29 +00:00
Bob Wilson	3c9ed76ba5	Temporarily disable tail calls on ARM to work around some linker problems. llvm-svn: 111050	2010-08-13 22:43:33 +00:00
Dale Johannesen	8d3c89e765	Revert 110491. While not wrong, it was based on a misanalysis and is undesirable. llvm-svn: 111028	2010-08-13 18:43:45 +00:00
Bruno Cardoso Lopes	7f704b31a9	- Teach SSEDomainFix to switch between different levels of AVX instructions. Here we guess that AVX will have domain issues, so just implement them for consistency and in the future we remove if it's unnecessary. - Make foldMemoryOperandImpl aware of 256-bit zero vectors folding and support the 128-bit counterparts of AVX too. - Make sure MOV[AU]PS instructions are only selected when SSE1 is enabled, and duplicate the patterns to match AVX. - Add a testcase for a simple 128-bit zero vector creation. llvm-svn: 110946	2010-08-12 20:20:53 +00:00
Bruno Cardoso Lopes	7306c86886	Begin to support some vector operations for AVX 256-bit instructions. The long term goal here is to be able to match enough of vector_shuffle and build_vector so all avx intrinsics which aren't mapped to their own built-ins but to shufflevector calls can be codegen'd. This is the first (baby) step, support building zeroed vectors. llvm-svn: 110897	2010-08-12 02:06:36 +00:00
Devang Patel	48595bf2bc	This is x86 only test. llvm-svn: 110887	2010-08-12 00:17:38 +00:00
Bruno Cardoso Lopes	1675ee7a02	Add testcases for all AVX 256-bit intrinsics added in the last couple days llvm-svn: 110854	2010-08-11 21:12:09 +00:00
Bruno Cardoso Lopes	29c8818ad9	Reapply r109881 using a more strict command line for llc. llvm-svn: 110833	2010-08-11 17:39:23 +00:00
Jim Grosbach	a5f923b1a1	fix silly typo llvm-svn: 110831	2010-08-11 17:32:46 +00:00
Jim Grosbach	2bf8bd1e19	Add a target triple, as the runtime library invocation varies a bit by platform. It's apparently "bl __muldf3" on linux, for example. Since that's not what we're checking here, it's more robust to just force a triple. We just want to check that the inline FP instructions are only generated on cpus that have them. llvm-svn: 110830	2010-08-11 17:31:12 +00:00
Evan Cheng	b0276814d5	Fix test and re-enable it. llvm-svn: 110829	2010-08-11 17:25:51 +00:00
Dan Gohman	4df4114870	Temporarily disable some failing tests, until they can be properly investigated. llvm-svn: 110825	2010-08-11 16:36:07 +00:00
Jim Grosbach	4d5dc3e7e5	cortex m4 has floating point support, but only single precision. llvm-svn: 110810	2010-08-11 15:44:15 +00:00
Dan Gohman	f3d783a6d2	Temporarily disable some failing tests, until they can be properly investigated. llvm-svn: 110808	2010-08-11 15:09:00 +00:00
Bill Wendling	6a98131468	Consider this code snippet: float t1(int argc) { return (argc == 1123) ? 1.234f : 2.38213f; } We would generate truly awful code on ARM (those with a weak stomach should look away): _t1: movw r1, #1123 movs r2, #1 movs r3, #0 cmp r0, r1 mov.w r0, #0 it eq moveq r0, r2 movs r1, #4 cmp r0, #0 it ne movne r3, r1 adr r0, #LCPI1_0 ldr r0, [r0, r3] bx lr The problem was that legalization was creating a cascade of SELECT_CC nodes, for the comparison of "argc == 1123" which was fed into a SELECT node for the ?: statement which was itself converted to a SELECT_CC node. This is because the ARM back-end doesn't have custom lowering for SELECT nodes, so it used the default "Expand". I added a fairly simple "LowerSELECT" to the ARM back-end. It takes care of this testcase, but can obviously be expanded to include more cases. Now we generate this, which looks optimal to me: _t1: movw r1, #1123 movs r2, #0 cmp r0, r1 adr r0, #LCPI0_0 it eq moveq r2, #4 ldr r0, [r0, r2] bx lr .align 2 LCPI0_0: .long 1075344593 @ float 2.382130e+00 .long 1067316150 @ float 1.234000e+00 llvm-svn: 110799	2010-08-11 08:43:16 +00:00
Evan Cheng	5190f09291	Report error if codegen tries to instantiate an ARM target when the cpu does not support it, e.g. cortex-m* processors. llvm-svn: 110798	2010-08-11 07:17:46 +00:00
Evan Cheng	40921a4e62	Add ARM Archv6M and let it imply FeatureDB (having dmb, etc.) llvm-svn: 110795	2010-08-11 06:51:54 +00:00
Evan Cheng	49e02fc414	Add Cortex-M0 support. It's a ARMv6m device (no ARM mode) with some 32-bit instructions: dmb, dsb, isb, msr, and mrs. llvm-svn: 110786	2010-08-11 06:30:38 +00:00
Evan Cheng	6e809de90c	- Add subtarget feature -mattr=+db which determines whether an ARM cpu has the memory and synchronization barrier dmb and dsb instructions. - Change instruction names to something more sensible (matching name of actual instructions). - Added tests for memory barrier codegen. llvm-svn: 110785	2010-08-11 06:22:01 +00:00
Bill Wendling	79937dfc5b	Update test to match output of optimize compares for ARM. llvm-svn: 110765	2010-08-11 01:05:02 +00:00
Bill Wendling	871d4e1170	The optimize comparisons pass removes the "cmp" instruction this is checking for. llvm-svn: 110739	2010-08-10 22:16:05 +00:00
Evan Cheng	3f251fb26e	Re-apply r110655 with fixes. Epilogue must restore sp from fp if the function stack frame has a var-sized object. Also added a test case to check for the added benefit of this patch: it's optimizing away the unnecessary restore of sp from fp for some non-leaf functions. llvm-svn: 110707	2010-08-10 19:30:19 +00:00
Daniel Dunbar	0dd47bfca3	Revert r110655, "Fix ARM hasFP() semantics. It should return true whenever FP register is", it breaks a couple test-suite tests. llvm-svn: 110701	2010-08-10 18:32:02 +00:00
Jakob Stoklund Olesen	5730846c2f	Fix test for more architectures. Patch by Tobias Grosser. llvm-svn: 110685	2010-08-10 16:48:24 +00:00
Tobias Grosser	fedeff8015	Fix failing testcase. Those look like typos to me. llvm-svn: 110664	2010-08-10 09:54:29 +00:00
Devang Patel	b219746c80	Handle TAG_constant for integers. llvm-svn: 110656	2010-08-10 07:11:13 +00:00
Evan Cheng	8d5d1c1331	Fix ARM hasFP() semantics. It should return true whenever FP register is reserved, not available for general allocation. This eliminates all the extra checks for Darwin. This change also fixes the use of FP to access frame indices in leaf functions and cleaned up some confusing code in epilogue emission. llvm-svn: 110655	2010-08-10 06:26:49 +00:00
Kalle Raiskila	999da1f3a0	Have SPU handle halfvec stores aligned by 8 bytes. llvm-svn: 110576	2010-08-09 16:33:00 +00:00
Dale Johannesen	a3bd31a923	Use sdmem and sse_load_f64 (etc.) for the vector form of CMPSD (etc.) Matching a 128-bit memory operand is wrong, the instruction uses only 64 bits (same as ADDSD etc.) 8193553. llvm-svn: 110491	2010-08-07 00:33:42 +00:00
Rafael Espindola	027d5bcf89	Fix eabi calling convention when a 64 bit value shadows r3. Without this what was happening was: * R3 is not marked as "used" * ARM backend thinks it has to save it to the stack because of vaarg * Offset computation correctly ignores it * Offsets are wrong llvm-svn: 110446	2010-08-06 15:35:32 +00:00
Eric Christopher	e1fb772aa5	Add an option to always emit realignment code for a particular module. llvm-svn: 110404	2010-08-05 23:57:43 +00:00
Devang Patel	cc3f3b341d	Move x86 specific tests into test/CodeGen/X86. llvm-svn: 110372	2010-08-05 20:25:37 +00:00
Dan Gohman	c53ee449a5	Move x86-specific tests out of test/Transforms/LoopStrengthReduce and into test/CodeGen/X86, so that they aren't run when the x86 target is not enabled. Fix uglygep.ll to not be x86-specific. llvm-svn: 110343	2010-08-05 17:04:15 +00:00
Daniel Dunbar	e62e664656	tests: CodeGen/X86/GC tests require X86. llvm-svn: 110338	2010-08-05 15:45:33 +00:00
Bill Wendling	ca1cb13646	The lower invoke pass needs to have unreachable code elimination run after it because it could create such things. This fixes a MingW buildbot test failure. llvm-svn: 110279	2010-08-04 23:36:02 +00:00

3908 Commits