llvm-project

Commit Graph

Author	SHA1	Message	Date
Chris Lattner	bdd4c8b04d	Treat clobber operands like early clobbers: if we have any, we force sdisel to do all regalloc for an asm. This leads to gross but correct codegen. This fixes the rest of PR2078. llvm-svn: 47454	2008-02-21 19:43:13 +00:00
Tanya Lattner	3cdf542f5a	Remove llvm-upgrade and update tests. llvm-svn: 47432	2008-02-21 07:42:26 +00:00
Chris Lattner	83c93d5afd	Fix a (harmless) bug where vregs were added to the used reg lists for inline asms. Fix PR2078 by marking aliases of registers used when a register is marked used. This prevents EAX from being allocated when AX is listed in the clobber set for the asm. llvm-svn: 47426	2008-02-21 04:55:52 +00:00
Evan Cheng	0aa9f2a7f3	XFAIL this for now. llvm-svn: 47355	2008-02-20 02:38:58 +00:00
Chris Lattner	a0b1cc41ef	this test requires sse2 llvm-svn: 47331	2008-02-19 18:07:46 +00:00
Chris Lattner	97b9662f78	Don't fold and's into test instructions if they have multiple uses. This compiles test-nofold.ll into: _test: movl $15, %ecx andl 4(%esp), %ecx testl %ecx, %ecx movl $42, %eax cmove %ecx, %eax ret instead of: _test: movl 4(%esp), %eax movl %eax, %ecx andl $15, %ecx testl $15, %eax movl $42, %eax cmove %ecx, %eax ret llvm-svn: 47330	2008-02-19 17:37:35 +00:00
Chris Lattner	08162d9515	rename tests to avoid a test- prefix when they aren't related to the test instruction. llvm-svn: 47329	2008-02-19 17:33:52 +00:00
Nick Lewycky	0e2e21b8b9	Don't spew stats to stderr. llvm-svn: 47308	2008-02-19 03:11:47 +00:00
Nick Lewycky	b54a803a2e	Fix up the run line for this new test. llc: for the -info-output-file option: requires a value! llvm-svn: 47306	2008-02-19 02:58:36 +00:00
Evan Cheng	634a8f9275	New test. llvm-svn: 47302	2008-02-19 02:09:58 +00:00
Evan Cheng	6200c225e0	- When DAG combiner is folding a bit convert into a BUILD_VECTOR, it should check if it's essentially a SCALAR_TO_VECTOR. Avoid turning (v8i16) <10, u, u, u> to <10, 0, u, u, u, u, u, u>. Instead, simply convert it to a SCALAR_TO_VECTOR of the proper type. - X86 now normalize SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC. llvm-svn: 47290	2008-02-18 23:04:32 +00:00
Dan Gohman	a589ee11bb	Don't mark scalar integer multiplication as Expand on x86, since x86 has plain one-result scalar integer multiplication instructions. This avoids expanding such instructions into MUL_LOHI sequences that must be special-cased at isel time, and avoids the problem with that code that prevented memory operands from being folded. This fixes PR1874, addressing the most common case. The uncommon cases of optimizing multiply-high operations will require work in DAGCombiner. llvm-svn: 47277	2008-02-18 17:55:26 +00:00
Andrew Lenharth	9b254eed32	llvm.memory.barrier, and impl for x86 and alpha llvm-svn: 47204	2008-02-16 01:24:58 +00:00
Evan Cheng	6edbbe0c25	This test is not interesting. llvm-svn: 47189	2008-02-15 23:06:21 +00:00
Chris Lattner	558a3ba17f	Fix a miscompilation from Dan's recent apintification. llvm-svn: 47128	2008-02-14 18:48:56 +00:00
Chris Lattner	3bd37f549a	This readme entry is done, testcase here: CodeGen/X86/zero-remat.ll llvm-svn: 47106	2008-02-14 05:39:46 +00:00
Evan Cheng	a4621f04bb	Fix test. llvm-svn: 47102	2008-02-14 01:32:53 +00:00
Chris Lattner	a08af08a88	In SDISel, for targets that support FORMAL_ARGUMENTS nodes, lower this node as soon as we create it in SDISel. Previously we would lower it in legalize. The problem with this is that it only exposes the argument loads implied by FORMAL_ARGUMENTs after legalize, so that only dag combine 2 can hack on them. This causes us to miss some optimizations because datatype expansion also happens here. Exposing the loads early allows us to do optimizations on them. For example we now compile arg-cast.ll to: _foo: movl $2147483647, %eax andl 8(%esp), %eax ret where we previously produced: _foo: subl $12, %esp movsd 16(%esp), %xmm0 movsd %xmm0, (%esp) movl $2147483647, %eax andl 4(%esp), %eax addl $12, %esp ret It might also make sense to do this for ISD::CALL nodes, which have implicit stores on many targets. llvm-svn: 47054	2008-02-13 07:39:09 +00:00
Evan Cheng	ea8530d82c	New tests. llvm-svn: 47047	2008-02-13 03:23:53 +00:00
Evan Cheng	724029151b	Don't mask the isel bug. llvm-svn: 47018	2008-02-12 19:11:29 +00:00
Evan Cheng	3069a26f63	This test assumes no SSE4.1. llvm-svn: 47017	2008-02-12 19:11:08 +00:00
Evan Cheng	b21301fbe7	Fix some test cases. llvm-svn: 46998	2008-02-12 07:22:46 +00:00
Dale Johannesen	43a2ed8611	Alignment of struct containing vectors depends on whether SSE is present, on Darwin anyway. Make it explicit. llvm-svn: 46909	2008-02-09 19:04:25 +00:00
Evan Cheng	3b3286d4bc	It's not always safe to fold movsd into xorpd, etc. Check the alignment of the load address first to make sure it's 16 byte aligned. llvm-svn: 46893	2008-02-08 21:20:40 +00:00
Evan Cheng	8d59dd119b	Added missing entries in X86 load / store folding tables. llvm-svn: 46866	2008-02-08 00:12:56 +00:00
Evan Cheng	a20a773654	Fix a x86-64 codegen deficiency. Allow gv + offset when using rip addressing mode. Before: _main: subq $8, %rsp leaq _X(%rip), %rax movsd 8(%rax), %xmm1 movss _X(%rip), %xmm0 call _t xorl %ecx, %ecx movl %ecx, %eax addq $8, %rsp ret Now: _main: subq $8, %rsp movsd _X+8(%rip), %xmm1 movss _X(%rip), %xmm0 call _t xorl %ecx, %ecx movl %ecx, %eax addq $8, %rsp ret Notice there is another idiotic codegen issue that needs to be fixed asap: xorl %ecx, %ecx movl %ecx, %eax llvm-svn: 46850	2008-02-07 08:53:49 +00:00
Evan Cheng	87fbd66f9f	Fix PR1975: dag isel emitter produces patterns that isel wrong flag result. llvm-svn: 46776	2008-02-05 22:50:29 +00:00
Chris Lattner	f4e5e556fd	Add target triples to these so they don't fail on linux. llvm-svn: 46496	2008-01-29 06:26:07 +00:00
Chris Lattner	888560d62c	Implement some dag combines that allow doing fneg/fabs/fcopysign in integer registers if used by a bitconvert or using a bitconvert. This allows us to avoid constant pool loads and use cheaper integer instructions when the values come from or end up in integer regs anyway. For example, we now compile CodeGen/X86/fp-in-intregs.ll to: _test1: movl $2147483648, %eax xorl 4(%esp), %eax ret _test2: movl $1065353216, %eax orl 4(%esp), %eax andl $3212836864, %eax ret Instead of: _test1: movss 4(%esp), %xmm0 xorps LCPI2_0, %xmm0 movd %xmm0, %eax ret _test2: movss 4(%esp), %xmm0 andps LCPI3_0, %xmm0 movss LCPI3_1, %xmm1 andps LCPI3_2, %xmm1 orps %xmm0, %xmm1 movd %xmm1, %eax ret bitconverts can happen due to various calling conventions that require fp values to passed in integer regs in some cases, e.g. when returning a complex. llvm-svn: 46414	2008-01-27 17:42:27 +00:00
Chris Lattner	596704405f	New test to verify that "merging 4 loads into a vec load" continues to work and continues to infer alignment info. llvm-svn: 46403	2008-01-26 20:06:45 +00:00
Chris Lattner	e30e33af4f	Infer alignment of loads and increase their alignment when we can tell they are from the stack. This allows us to compile stack-align.ll to: _test: movsd LCPI1_0, %xmm0 movapd %xmm0, %xmm1 * andpd 4(%esp), %xmm1 andpd _G, %xmm0 addsd %xmm1, %xmm0 movl 20(%esp), %eax movsd %xmm0, (%eax) ret instead of: _test: movsd LCPI1_0, %xmm0 movsd 4(%esp), %xmm1 ** andpd %xmm0, %xmm1 andpd _G, %xmm0 addsd %xmm1, %xmm0 movl 20(%esp), %eax movsd %xmm0, (%eax) ret llvm-svn: 46401	2008-01-26 19:45:50 +00:00
Chris Lattner	364963d41c	remove a useless xfailed test. llvm-svn: 46400	2008-01-26 19:35:46 +00:00
Bill Wendling	1a17ef02c8	If there are no instructions being emitted on X86 for a function, emit a nop. Emit the nop directly for PPC. llvm-svn: 46398	2008-01-26 09:03:52 +00:00
Chris Lattner	84ab724e06	Add target-specific dag combines for FAND(x,0) and FOR(x,0). This allows us to compile: double test(double X) { return copysign(0.0, X); } into: _test: andpd LCPI1_0(%rip), %xmm0 ret instead of: _test: pxor %xmm1, %xmm1 andpd LCPI1_0(%rip), %xmm1 movapd %xmm0, %xmm2 andpd LCPI1_1(%rip), %xmm2 movapd %xmm1, %xmm0 orpd %xmm2, %xmm0 ret llvm-svn: 46344	2008-01-25 05:46:26 +00:00
Chris Lattner	a91f77eaac	Significantly simplify and improve handling of FP function results on x86-32. This case returns the value in ST(0) and then has to convert it to an SSE register. This causes significant codegen ugliness in some cases. For example in the trivial fp-stack-direct-ret.ll testcase we used to generate: _bar: subl $28, %esp call L_foo$stub fstpl 16(%esp) movsd 16(%esp), %xmm0 movsd %xmm0, 8(%esp) fldl 8(%esp) addl $28, %esp ret because we move the result of foo() into an XMM register, then have to move it back for the return of bar. Instead of hacking ever-more special cases into the call result lowering code we take a much simpler approach: on x86-32, fp return is modeled as always returning into an f80 register which is then truncated to f32 or f64 as needed. Similarly for a result, we model it as an extension to f80 + return. This exposes the truncate and extensions to the dag combiner, allowing target independent code to hack on them, eliminating them in this case. This gives us this code for the example above: _bar: subl $12, %esp call L_foo$stub addl $12, %esp ret The nasty aspect of this is that these conversions are not legal, but we want the second pass of dag combiner (post-legalize) to be able to hack on them. To handle this, we lie to legalize and say they are legal, then custom expand them on entry to the isel pass (PreprocessForFPConvert). This is gross, but less gross than the code it is replacing :) This also allows us to generate better code in several other cases. 
For example on fp-stack-ret-conv.ll, we now generate: _test: subl $12, %esp call L_foo$stub fstps 8(%esp) movl 16(%esp), %eax cvtss2sd 8(%esp), %xmm0 movsd %xmm0, (%eax) addl $12, %esp ret where before we produced (incidentally, the old bad code is identical to what gcc produces): _test: subl $12, %esp call L_foo$stub fstpl (%esp) cvtsd2ss (%esp), %xmm0 cvtss2sd %xmm0, %xmm0 movl 16(%esp), %eax movsd %xmm0, (%eax) addl $12, %esp ret Note that we generate slightly worse code on pr1505b.ll due to a scheduling deficiency that is unrelated to this patch. llvm-svn: 46307	2008-01-24 08:07:48 +00:00
Chris Lattner	001d781c41	take these with a pr # llvm-svn: 46303	2008-01-24 06:35:44 +00:00
Evan Cheng	35abd840a6	Let each target decide byval alignment. For X86, it's 4-byte unless the aggregate contains SSE vector(s). For x86-64, it's max of 8 or alignment of the type. llvm-svn: 46286	2008-01-23 23:17:41 +00:00
Evan Cheng	1e0d4d2aa8	SSE varargs arguments are passed in memory. llvm-svn: 46262	2008-01-22 23:26:53 +00:00
Dale Johannesen	4768c3c9b6	Test is correct again for the moment. llvm-svn: 46172	2008-01-18 19:53:31 +00:00
Chris Lattner	1ea55cf816	This commit changes: 1. Legalize now always promotes truncstore of i1 to i8. 2. Remove patterns and gunk related to truncstore i1 from targets. 3. Rename the StoreXAction stuff to TruncStoreAction in TLI. 4. Make the TLI TruncStoreAction table a 2d table to handle from/to conversions. 5. Mark a wide variety of invalid truncstores as such in various targets, e.g. X86 currently doesn't support truncstore of any of its integer types. 6. Add legalize support for truncstores with invalid value input types. 7. Add a dag combine transform to turn store(truncate) into truncstore when safe. The latter allows us to compile CodeGen/X86/storetrunc-fp.ll to: _foo: fldt 20(%esp) fldt 4(%esp) faddp %st(1) movl 36(%esp), %eax fstps (%eax) ret instead of: _foo: subl $4, %esp fldt 24(%esp) fldt 8(%esp) faddp %st(1) fstps (%esp) movl 40(%esp), %eax movss (%esp), %xmm0 movss %xmm0, (%eax) addl $4, %esp ret llvm-svn: 46140	2008-01-17 19:59:44 +00:00
Evan Cheng	54c20b559e	When a live virtual register is being clobbered by an implicit def, it is spilled and the spill is its kill. However, if the local allocator has determined the register has not been modified (possible when its value was reloaded), it would not issue a restore. In that case, mark the last use of the virtual register as kill. llvm-svn: 46111	2008-01-17 02:08:17 +00:00
Evan Cheng	7be1528004	Fixes a nasty dag combiner bug that causes a bunch of tests to fail at -O0. It's not safe to use the two value CombineTo variant to combine away a dead load. e.g. v1, chain2 = load chain1, loc v2, chain3 = load chain2, loc v3 = add v2, c Now we replace use of v1 with undef, use of chain2 with chain1. ReplaceAllUsesWith() will iterate through uses of the first load and update operands: v1, chain2 = load chain1, loc v2, chain3 = load chain1, loc v3 = add v2, c Now the second load is the same as the first load, SelectionDAG cse will ensure the use of second load is replaced with the first load. v1, chain2 = load chain1, loc v3 = add v1, c Then v1 is replaced with undef and bad things happen. llvm-svn: 46099	2008-01-16 23:11:54 +00:00
Duncan Sands	32b0ff6814	Trampoline support for x86-64. This looks like it should work, but I have no machine to test it on. Committed because it will at least cause no harm, and maybe someone can test it for me! llvm-svn: 46098	2008-01-16 22:55:25 +00:00
Chris Lattner	6e3379c07b	make sure to use a cpu that has sse. llvm-svn: 46060	2008-01-16 06:32:02 +00:00
Chris Lattner	8f7cec859e	My previous commit had an incomplete message, it should have been: make the 'fp return in ST(0)' optimization smart enough to look through token factor nodes. This allows us to compile testcases like CodeGen/X86/fp-stack-retcopy.ll into: _carg: subl $12, %esp call L_foo$stub fstpl (%esp) fldl (%esp) addl $12, %esp ret instead of: _carg: subl $28, %esp call L_foo$stub fstpl 16(%esp) movsd 16(%esp), %xmm0 movsd %xmm0, 8(%esp) fldl 8(%esp) addl $28, %esp ret Still not optimal, but much better and this is a trivial patch. Fixing the rest requires invasive surgery that is not llvm 2.2 material. llvm-svn: 46054	2008-01-16 05:56:59 +00:00
Chris Lattner	915ec14073	verify x86 generates ud2 for llvm.trap llvm-svn: 46023	2008-01-15 22:22:02 +00:00
Dale Johannesen	04b99780cf	Disable for now. llvm-svn: 45881	2008-01-11 20:47:33 +00:00
Duncan Sands	53c954fa86	Output sinl for a long double FSIN node, not sin. Likewise fix up a bunch of other libcalls. While there I remove NEG_F32 and NEG_F64 since they are not used anywhere. This fixes 9 Ada ACATS failures. llvm-svn: 45833	2008-01-10 10:28:30 +00:00
Evan Cheng	0f8c7c4a73	Codegen improvement has reduced one spill. llvm-svn: 45814	2008-01-10 02:54:40 +00:00
Evan Cheng	0e400d4cb7	Special copy SUnit's do not have SDNode's. llvm-svn: 45787	2008-01-09 23:01:55 +00:00
Evan Cheng	a31824a08e	Fix sse2.psrl.w and sse2.psrl.q definitions. llvm-svn: 45772	2008-01-09 02:16:44 +00:00
Chris Lattner	51b01bf8a5	Make load->store deletion a bit smarter. This allows us to compile this: void test(long long *P) { *P ^= 1; } into just: _test: movl 4(%esp), %eax xorl $1, (%eax) ret instead of code like this: _test: movl 4(%esp), %ecx xorl $1, (%ecx) movl 4(%ecx), %edx movl %edx, 4(%ecx) ret llvm-svn: 45762	2008-01-08 23:08:06 +00:00
Duncan Sands	7b1460cca4	Crashes llc when using Chris's new legalization logic. llvm-svn: 45758	2008-01-08 21:51:53 +00:00
Nate Begeman	d3d49df3f1	Update test to catch recent x86 insert regression and improvements llvm-svn: 45705	2008-01-07 17:49:23 +00:00
Chris Lattner	41e423a6f5	fix this to use a valid triple. llvm-svn: 45509	2008-01-02 22:21:45 +00:00
Chris Lattner	5d998c5712	verify that aligned common support doesn't break. llvm-svn: 45495	2008-01-02 19:48:24 +00:00
Chris Lattner	d2b8a36f0e	One readme entry is done, one is really easy (Evan, want to investigate eliminating the llvm.x86.sse2.loadl.pd intrinsic?), one shuffle optzn may be done (if shufps is better than pinsw, Evan, please review), and we already know about LICM of simple instructions. llvm-svn: 45407	2007-12-29 19:31:47 +00:00
Chris Lattner	0d90c8f016	upgrade this test llvm-svn: 45406	2007-12-29 19:24:06 +00:00
Chris Lattner	3b6a82118b	Fold comparisons against a constant nan, and optimize ORD/UNORD comparisons with a constant. This allows us to compile isnan to: _foo: fcmpu cr7, f1, f1 mfcr r2 rlwinm r3, r2, 0, 31, 31 blr instead of: LCPI1_0: ; float .space 4 _foo: lis r2, ha16(LCPI1_0) lfs f0, lo16(LCPI1_0)(r2) fcmpu cr7, f1, f0 mfcr r2 rlwinm r3, r2, 0, 31, 31 blr llvm-svn: 45405	2007-12-29 08:37:08 +00:00
Chris Lattner	33de0c6e92	this xform is implemented. llvm-svn: 45404	2007-12-29 08:19:39 +00:00
Chris Lattner	07ccbfa64a	Codegen: as: _bar: pushl %esi subl $8, %esp movl 16(%esp), %esi call L_foo$stub fstps (%esi) addl $8, %esp popl %esi #FP_REG_KILL ret instead of: _bar: pushl %esi subl $8, %esp movl 16(%esp), %esi call L_foo$stub fstpl (%esi) cvtsd2ss (%esi), %xmm0 movss %xmm0, (%esi) addl $8, %esp popl %esi #FP_REG_KILL ret llvm-svn: 45401	2007-12-29 06:57:38 +00:00
Chris Lattner	8013bd339b	avoid going through a stack slot to convert from fpstack to xmm reg if we are just going to store it back anyway. This improves things like: double foo(); void bar(double *P) { *P = foo(); } llvm-svn: 45399	2007-12-29 06:41:28 +00:00
Chris Lattner	bc13df19a8	one fewer uncond branch with my codegenprepare hack for single-mbb backedges. llvm-svn: 45360	2007-12-26 17:23:47 +00:00
Evan Cheng	483a969ece	Fix PR1872: SrcValue and SrcValueOffset should not be used to compute load / store node id. llvm-svn: 45167	2007-12-18 19:38:14 +00:00
Evan Cheng	91e0fc9cb4	FIX for PR1799: When a load is unfolded from an instruction, check if it is a new node. If not, do not create a new SUnit. llvm-svn: 45157	2007-12-18 08:42:10 +00:00
Evan Cheng	23d2d4dc6c	Make better use of instructions that clear high bits; fix various 2-wide shuffle bugs. llvm-svn: 45058	2007-12-15 03:00:47 +00:00
Evan Cheng	0e6408124e	Fix ctlz and cttz. The LLVM definition requires them to return the number of bits in the src type when the value is zero. llvm-svn: 45029	2007-12-14 08:30:15 +00:00
Evan Cheng	e9fbc3f014	Implement ctlz and cttz with bsr and bsf. llvm-svn: 45024	2007-12-14 02:13:44 +00:00
Evan Cheng	37c36ed79a	Be extra careful with extension use optimization. Now turned on by default. llvm-svn: 44981	2007-12-13 03:32:53 +00:00
Evan Cheng	827d30db19	Fold some and + shift in x86 addressing mode. llvm-svn: 44970	2007-12-13 00:43:27 +00:00
Evan Cheng	6e68381e02	Implicit def instructions, e.g. X86::IMPLICIT_DEF_GR32, are always re-materializable and they should not be spilled. llvm-svn: 44960	2007-12-12 23:12:09 +00:00
Dan Gohman	7a7742c2fe	Allow vector integer constants to be created with SelectionDAG::getConstant, in the same way as vector floating-point constants. This allows the legalize expansion code for @llvm.ctpop and friends to be usable with vector types. llvm-svn: 44954	2007-12-12 22:21:26 +00:00
Evan Cheng	0f42730722	Use shuffles to implement insert_vector_elt for i32, i64, f32, and f64. llvm-svn: 44929	2007-12-12 07:55:34 +00:00
Evan Cheng	0a1254f634	Add a test case for -optimize-ext-uses. llvm-svn: 44928	2007-12-12 07:54:08 +00:00
Evan Cheng	2a98956796	Lower a build_vector with all constants into a constpool load unless it can be done with a move to low part. llvm-svn: 44921	2007-12-12 06:45:40 +00:00
Evan Cheng	4fbf459549	- Improved v8i16 shuffle lowering. It now uses pshuflw and pshufhw as much as possible before resorting to pextrw and pinsrw. - Better codegen for v4i32 shuffles masquerading as v8i16 or v16i8 shuffles. - Improves (i16 extract_vector_element 0) codegen by recognizing (i32 extract_vector_element 0) does not require a pextrw. llvm-svn: 44836	2007-12-11 01:46:18 +00:00
Christopher Lamb	d202e03fe5	Improve branch folding by recognizing that explicit successor relationships impact the value of fall-through choices. llvm-svn: 44785	2007-12-10 07:24:06 +00:00
Evan Cheng	bfd373a53e	Much improved v8i16 shuffles. (Step 1). llvm-svn: 44676	2007-12-07 08:07:39 +00:00
Evan Cheng	26593a04db	New test case. llvm-svn: 44672	2007-12-07 01:48:46 +00:00
Evan Cheng	5cb41390ab	Fix a bogus test case. llvm-svn: 44668	2007-12-06 22:12:45 +00:00
Evan Cheng	8393dc7378	Turning simple splitting on. Start testing new coalescer heuristics as new llcbeta. llvm-svn: 44660	2007-12-06 08:54:31 +00:00
Chris Lattner	eedaf92fcf	third time around: instead of disabling this completely, only disable it if we don't know it will be obviously profitable. Still fixme, but less so. :) llvm-svn: 44658	2007-12-06 07:47:55 +00:00
Chris Lattner	b5fdfb9612	Actually, disable this code for now. More analysis and improvements to the X86 backend are needed before this should be enabled by default. llvm-svn: 44657	2007-12-06 07:44:31 +00:00
Chris Lattner	7c709a5d08	implement a readme entry, compiling the code into: _foo: movl $12, %eax andl 4(%esp), %eax movl _array(%eax), %eax ret instead of: _foo: movl 4(%esp), %eax shrl $2, %eax andl $3, %eax movl _array(,%eax,4), %eax ret As it turns out, this triggers all the time, in a wide variety of situations, for example, I see diffs like this in various programs: - movl 8(%eax), %eax - shll $2, %eax - andl $1020, %eax - movl (%esi,%eax), %eax + movzbl 8(%eax), %eax + movl (%esi,%eax,4), %eax - shll $2, %edx - andl $1020, %edx - movl (%edi,%edx), %edx + andl $255, %edx + movl (%edi,%edx,4), %edx Unfortunately, I also see stuff like this, which can be fixed in the X86 backend: - andl $85, %ebx - addl _bit_count(,%ebx,4), %ebp + shll $2, %ebx + andl $340, %ebx + addl _bit_count(%ebx), %ebp llvm-svn: 44656	2007-12-06 07:33:36 +00:00
Chris Lattner	dfa39289a5	fix this when run on non x86 hosts. llvm-svn: 44645	2007-12-06 01:05:52 +00:00
Evan Cheng	69fda0a716	Allow some reloads to be folded in multi-use cases. Specifically testl r, r -> cmpl [mem], 0. llvm-svn: 44479	2007-12-01 02:07:52 +00:00
Evan Cheng	b10dc27b20	Do not fold reload into an instruction with multiple uses. It issues one extra load. llvm-svn: 44467	2007-11-30 21:23:43 +00:00
Dan Gohman	f151c8e760	Remove unnecessary && from the RUN lines of this test. llvm-svn: 44342	2007-11-27 00:03:38 +00:00
Dan Gohman	9a69341725	Don't lower srem/urem X%C to X-X/C*C unless the division is actually optimized. This avoids creating illegal divisions when the combiner is running after legalize; this fixes PR1815. Also, it produces better code in the included testcase by avoiding the subtract and multiply when the division isn't optimized. llvm-svn: 44341	2007-11-26 23:46:11 +00:00
Chris Lattner	5728bdd4db	Fix a long standing deficiency in the X86 backend: we would sometimes emit "zero" and "all one" vectors multiple times, for example: _test2: pcmpeqd %mm0, %mm0 movq %mm0, _M1 pcmpeqd %mm0, %mm0 movq %mm0, _M2 ret instead of: _test2: pcmpeqd %mm0, %mm0 movq %mm0, _M1 movq %mm0, _M2 ret This patch fixes this by always arranging for zero/one vectors to be defined as v4i32 or v2i32 (SSE/MMX) instead of letting them be any random type. This ensures they get trivially CSE'd on the dag. This fix is also important for LegalizeDAGTypes, as it gets unhappy when the x86 backend wants BUILD_VECTOR(i64 0) to be legal even when 'i64' isn't legal. This patch makes the following changes: 1) X86TargetLowering::LowerBUILD_VECTOR now lowers 0/1 vectors into their canonical types. 2) The now-dead patterns are removed from the SSE/MMX .td files. 3) All the patterns in the .td file that referred to immAllOnesV or immAllZerosV in the wrong form now use *_bc to match them with a bitcast wrapped around them. 4) X86DAGToDAGISel::SelectScalarSSELoad is generalized to handle bitcast'd zero vectors, which simplifies the code actually. 5) getShuffleVectorZeroOrUndef is updated to generate a shuffle that is legal, instead of generating one that is illegal and expecting a later legalize pass to clean it up. 6) isZeroShuffle is generalized to handle bitcast of zeros. 7) several other minor tweaks. This patch is definite goodness, but has the potential to cause random code quality regressions. Please be on the lookout for these and let me know if they happen. llvm-svn: 44310	2007-11-25 00:24:49 +00:00
Chris Lattner	f5dfd15e98	upgrade this test llvm-svn: 44298	2007-11-24 05:39:29 +00:00
Dan Gohman	36347a26f9	Add support in SplitVectorOp for remainder operators. llvm-svn: 44233	2007-11-19 15:15:03 +00:00
Chris Lattner	861302e264	fix bogus test that the more strict lexer is finding. llvm-svn: 44216	2007-11-18 18:26:45 +00:00
Evan Cheng	13e8b022f5	Typo. llvm-svn: 44196	2007-11-16 23:55:08 +00:00
Evan Cheng	2c1a50455c	Fix a thinko in post-allocation coalescer. llvm-svn: 44166	2007-11-15 08:13:29 +00:00
Anton Korobeynikov	2c6387803e	Fix PIC jump table codegen on x86-32/linux. In fact, this should be applied to all targets that use GOT-relative offsets for PIC (Alpha?) llvm-svn: 44108	2007-11-14 09:18:41 +00:00
Arnold Schwaighofer	d2c16ff905	Update tailcall code to include inline attribute operand for memcpy. llvm-svn: 43978	2007-11-10 10:48:01 +00:00
Evan Cheng	05b94b8c13	Fix tests. llvm-svn: 43961	2007-11-09 20:46:00 +00:00
Evan Cheng	ece4c68b82	If both parts of smul_lohi, etc. are used, don't simplify. If only one part is used, try to simplify it. llvm-svn: 43888	2007-11-08 09:25:29 +00:00
Evan Cheng	2dbffa4e76	Add pseudo dependency to force two-address instruction to be scheduled after other uses. There was an overly restrictive check that prevented some obvious cases. llvm-svn: 43762	2007-11-06 08:44:59 +00:00
Dan Gohman	08143e397d	Add support for vector remainder operations. llvm-svn: 43744	2007-11-05 23:35:22 +00:00
Dale Johannesen	4646aa3e33	Make labels work in asm blocks; allow labels as parameters. Rename ValueRefList to ParamList in AsmParser, since its only use is for parameters. llvm-svn: 43734	2007-11-05 21:20:28 +00:00
Evan Cheng	a406b47f14	Handle cases where a register and one of its super-register are both marked as defined on the same instruction. This fixes PR1767. llvm-svn: 43699	2007-11-05 03:11:55 +00:00
Evan Cheng	e12363dac5	Fix test case. Chris didn't do make check. :-) llvm-svn: 43698	2007-11-05 03:04:26 +00:00
Evan Cheng	c68023a955	Doh. PR1187 -> PR1766. llvm-svn: 43693	2007-11-05 01:00:44 +00:00
Evan Cheng	a8044084ac	Fix PR1187. llvm-svn: 43692	2007-11-05 00:59:10 +00:00
Chris Lattner	9329e780cd	Fix PR1761 by not printing (rip) suffix when in -static mode. Evan, please review this. llvm-svn: 43680	2007-11-04 19:23:28 +00:00
Chris Lattner	296160d443	Fix PR1763 by allowing the 'q' constraint to work with 64-bit regs on x86-64. llvm-svn: 43669	2007-11-04 06:51:12 +00:00
Evan Cheng	66298e226f	There are times when the coalescer would not coalesce away a copy but the copy can be eliminated by the allocator if the destination and source target the same register. The most common case is when the source and destination registers are in different classes. For example, on x86 mov32to32_ targets GR32_ which contains a subset of the registers in GR32. The allocator can do 2 things: 1. Set the preferred allocation for the destination of a copy to that of its source. 2. After allocation is done, change the allocation of a copy destination (if legal) so the copy can be eliminated. This eliminates 443 extra moves from 403.gcc. llvm-svn: 43662	2007-11-03 07:20:12 +00:00
Evan Cheng	0442889b18	Add run line. llvm-svn: 43645	2007-11-02 17:36:58 +00:00
Evan Cheng	f851163c53	One more extract_subreg coalescing bug. llvm-svn: 43644	2007-11-02 17:35:08 +00:00
Evan Cheng	e453ff4913	Missing a getNumOperands check. llvm-svn: 43630	2007-11-02 01:26:22 +00:00
Dale Johannesen	440f9abab4	Test that expand_vector_elt(v2i64) works in 32-bit mode. llvm-svn: 43598	2007-11-01 02:38:24 +00:00
Evan Cheng	c2dbfee43f	It's not safe to tell SplitCriticalEdge to merge identical edges. It may delete the phi instruction that's being processed. llvm-svn: 43524	2007-10-30 22:27:26 +00:00
Evan Cheng	b024c4c81d	- Bug fixes. - Allow icmp rewrite using an iv / stride of a smaller integer type. llvm-svn: 43480	2007-10-29 22:07:18 +00:00
Dan Gohman	ae95d72a52	Fix a DAGCombiner abort on a bitcast from a scalar to a vector. llvm-svn: 43470	2007-10-29 20:44:42 +00:00
Evan Cheng	e106e2f142	Enable more fold (sext (load x)) -> (sext (truncate (sextload x))) transformation. Previously, it's restricted by ensuring the number of load uses is one. Now the restriction is loosened up by allowing setcc uses to be "extended" (e.g. setcc x, c, eq -> setcc sext(x), sext(c), eq). llvm-svn: 43465	2007-10-29 19:58:20 +00:00
Chris Lattner	5e99fd8c0d	Add support for the x86-64 'q' register modifier, and add support for the b/h/w/k/q inline asm memory modifiers, which are just ignored. This fixes PR1748 and CodeGen/X86/2007-10-28-inlineasm-q-modifier.ll llvm-svn: 43430	2007-10-29 03:09:07 +00:00
Evan Cheng	7f3d02471d	Loosen up iv reuse to allow reuse of the same stride but a larger type when truncating from the larger type to smaller type is free. e.g. Turns this loop: LBB1_1: # entry.bb_crit_edge xorl %ecx, %ecx xorw %dx, %dx movw %dx, %si LBB1_2: # bb movl L_X$non_lazy_ptr, %edi movw %si, (%edi) movl L_Y$non_lazy_ptr, %edi movw %dx, (%edi) addw $4, %dx incw %si incl %ecx cmpl %eax, %ecx jne LBB1_2 # bb into LBB1_1: # entry.bb_crit_edge xorl %ecx, %ecx xorw %dx, %dx LBB1_2: # bb movl L_X$non_lazy_ptr, %esi movw %cx, (%esi) movl L_Y$non_lazy_ptr, %esi movw %dx, (%esi) addw $4, %dx incl %ecx cmpl %eax, %ecx jne LBB1_2 # bb llvm-svn: 43375	2007-10-26 01:56:11 +00:00
Evan Cheng	133694db06	If a loop termination compare instruction is the only use of its stride, and the comparison is against a constant value, try to eliminate the stride by moving the compare instruction to another stride and changing its constant operand accordingly. e.g. loop: ... v1 = v1 + 3 v2 = v2 + 1 if (v2 < 10) goto loop => loop: ... v1 = v1 + 3 if (v1 < 30) goto loop llvm-svn: 43336	2007-10-25 09:11:16 +00:00
Dale Johannesen	52bbe1b171	This was failing on Darwin, which defaults to PIC; no lea was generated. I think this follows the intent. llvm-svn: 43312	2007-10-24 20:58:14 +00:00
Dan Gohman	e0c3d9f338	Strength reduction improvements. - Avoid attempting stride-reuse in the case that there are users that aren't addresses. In that case, there will be places where the multiplications won't be folded away, so it's better to try to strength-reduce them. - Several SSE intrinsics have operands that strength-reduction can treat as addresses. The previous item makes this more visible, as any non-address use of an IV can inhibit stride-reuse. - Make ValidStride aware of whether there's likely to be a base register in the address computation. This prevents it from thinking that things like stride 9 are valid on x86 when the base register is already occupied. Also, XFAIL the 2007-08-10-LEA16Use32.ll test; the new logic to avoid stride-reuse eliminates the LEA in the loop, so the test is no longer testing what it was intended to test. llvm-svn: 43231	2007-10-22 20:40:42 +00:00
Dan Gohman	bf474959a3	Fix the folding of multiplication into addresses on x86, which was broken by the recent {U,S}MUL_LOHI changes. llvm-svn: 43230	2007-10-22 20:22:24 +00:00
Evan Cheng	f52a6fc50c	New test case. llvm-svn: 43193	2007-10-19 22:05:00 +00:00
Rafael Espindola	813a0b1d29	Test byval with a 8 bit aligned struct llvm-svn: 43173	2007-10-19 11:29:21 +00:00
Rafael Espindola	846c19dd70	Add support for byval function whose argument is not 32 bit aligned. To do this it is necessary to add a "always inline" argument to the memcpy node. For completeness I have also added this node to memmove and memset. I have also added getMem* functions, because the extra argument makes it cumbersome to use getNode and because I get confused by it :-) llvm-svn: 43172	2007-10-19 10:41:11 +00:00
Evan Cheng	e6a41c066a	Really fix PR1734. Carefully track which register uses are sub-register uses by traversing inverse register coalescing map. llvm-svn: 43118	2007-10-18 07:49:59 +00:00
Dan Gohman	8f518b9875	Add support for ISD::SELECT in SplitVectorOp. llvm-svn: 43072	2007-10-17 14:48:28 +00:00
Evan Cheng	7587d1bd19	Yet another test case for extract_subreg coalescing crash. llvm-svn: 43063	2007-10-17 02:15:06 +00:00
Evan Cheng	fab7ca89d5	Fix PR1734. llvm-svn: 43035	2007-10-16 19:29:47 +00:00
Dale Johannesen	e43b960d3b	New test for svn rev 43033, radar 5538745. llvm-svn: 43034	2007-10-16 18:10:14 +00:00
Evan Cheng	7bcfd8f880	LowerFP_TO_SINT must not create a stack object if it's not needed. llvm-svn: 43004	2007-10-15 20:11:21 +00:00
Dan Gohman	e862243e1c	Reapply the fix in 42908 for this file. This changes the function names from "test" to "foo" so that they don't match the grep -i ST. llvm-svn: 43001	2007-10-15 19:22:17 +00:00
Evan Cheng	a5abba65b6	Fix PR1729: watch out for val# with no def. llvm-svn: 42996	2007-10-15 18:33:50 +00:00
Tanya Lattner	9486b19066	Fix run line. llvm-svn: 42990	2007-10-15 16:35:13 +00:00
Evan Cheng	d8771e915c	New test case. llvm-svn: 42963	2007-10-14 10:15:03 +00:00
Evan Cheng	cdf3609130	Revert 42908 for now. llvm-svn: 42960	2007-10-14 05:57:21 +00:00
Evan Cheng	f86204baf4	Fix test case. llvm-svn: 42949	2007-10-13 03:14:06 +00:00
Evan Cheng	54bec86754	New tests. llvm-svn: 42948	2007-10-13 03:10:54 +00:00
Dan Gohman	e0ad9ea7cd	Fix this test to not depend on the assembly output containing something that includes the string "st". This probably fixes the regression on Darwin. llvm-svn: 42932	2007-10-12 20:42:14 +00:00
Dan Gohman	dc35bd79ca	Change the names used for internal labels to use the current function symbol name instead of a codegen-assigned function number. Thanks Evan! :-) llvm-svn: 42908	2007-10-12 14:53:36 +00:00
Evan Cheng	f8e28b152a	Doh. llvm-svn: 42901	2007-10-12 09:10:27 +00:00
Evan Cheng	b83a379f4f	EXTRACT_SUBREG test case. llvm-svn: 42900	2007-10-12 09:03:31 +00:00
Arnold Schwaighofer	9653e677d5	Added missing -march=x86 flag. llvm-svn: 42893	2007-10-12 07:49:48 +00:00
Dan Gohman	be37007e64	Add intrinsics for sin, cos, and pow. These use llvm_anyfloat_ty, and so may be overloaded with vector types. And add a testcase for codegen for these. llvm-svn: 42885	2007-10-12 00:01:22 +00:00
Dan Gohman	3554448947	Add an explicit target triple to make this test behave as expected on non-Apple hosts. And use the count script instead of wc + grep. llvm-svn: 42878	2007-10-11 23:04:36 +00:00
Arnold Schwaighofer	9ccea99165	Added tail call optimization to the x86 back end. It can be enabled by passing -tailcallopt to llc. The optimization is performed if the following conditions are satisfied: * caller/callee are fastcc * elf/pic is disabled OR elf/pic enabled + callee is in module + callee has visibility protected or hidden llvm-svn: 42870	2007-10-11 19:40:01 +00:00
Dan Gohman	678387a299	These two tests now require only two multiply instructions, instead of four. llvm-svn: 42784	2007-10-09 15:39:37 +00:00
Evan Cheng	3b3e6097a3	Update test. llvm-svn: 42775	2007-10-08 22:20:32 +00:00
Dan Gohman	a24b431b27	These two tests now require only three multiply instructions, instead of four. llvm-svn: 42765	2007-10-08 20:48:12 +00:00
Dale Johannesen	bcfa7c1255	Make test work on non-x86 hosts. llvm-svn: 42671	2007-10-06 01:22:39 +00:00
Evan Cheng	5ee9cf6bca	Test case for 3-address conversion. llvm-svn: 42664	2007-10-05 23:33:09 +00:00
Evan Cheng	484cab7a2f	Enable convertToThreeAddress for X86 by default. llvm-svn: 42655	2007-10-05 22:31:10 +00:00
Evan Cheng	90a4185b5f	New test case. llvm-svn: 42628	2007-10-05 01:44:22 +00:00
Evan Cheng	89ca5b091f	-pre-RA-sched=none, simple, simple-noitin are gone. llvm-svn: 42505	2007-10-01 22:17:20 +00:00
Dan Gohman	a90183e7d1	Teach SplitVectorOp how to split INSERT_VECTOR_ELT. llvm-svn: 42457	2007-09-28 23:53:40 +00:00
Rafael Espindola	6c04ac1db0	Refactor the memcpy lowering for the x86 target. The only generated code difference is that now we call memcpy when the size of the array is unknown. This matches GCC behavior and is better since the run time value can be arbitrarily large. llvm-svn: 42433	2007-09-28 12:53:01 +00:00
Dale Johannesen	25a00a63eb	Add sqrt and powi intrinsics for long double. llvm-svn: 42423	2007-09-28 01:08:20 +00:00
Dale Johannesen	b805d35d16	Modernize fabs.ll, add long double. Add tests for direct codegen of fsin/fcos. llvm-svn: 42369	2007-09-26 21:12:10 +00:00
Dan Gohman	31599685c7	When both x/y and x%y are needed (x and y both scalar integer), compute both results with a single div or idiv instruction. This uses new X86ISD nodes for DIV and IDIV which are introduced during the legalize phase so that the SelectionDAG's CSE can automatically eliminate redundant computations. llvm-svn: 42308	2007-09-25 18:23:27 +00:00
Dale Johannesen	97d4bf2c41	Some tests for APFloat conversions. llvm-svn: 42303	2007-09-25 17:50:55 +00:00
Evan Cheng	6cb71f7fe0	Forgot to check in the changes. Fix test case so it doesn't break with any scheduling changes. llvm-svn: 42302	2007-09-25 17:47:38 +00:00
Dan Gohman	6002818999	Use the correct result value type instead of using getValueType(0) in ExpandEXTRACT_VECTOR_ELT and SplitVectorOp. This fixes an abort in the included testcase. llvm-svn: 42264	2007-09-24 15:54:53 +00:00
Dale Johannesen	ae4bb05103	Implementation of +sse -sse2 has changed; add -sse to preserve intent of this test. llvm-svn: 42247	2007-09-23 14:58:14 +00:00
Rafael Espindola	4730c04904	Don't add a default STACK_ALIGN (use the generic ABI alignment) Implement calls to functions with byval arguments on X86 llvm-svn: 42192	2007-09-21 15:50:22 +00:00
Dan Gohman	4dbc582a36	Fix several more entries in the x86 reload/remat folding tables. llvm-svn: 42162	2007-09-20 14:17:21 +00:00
Evan Cheng	e7ff9da64b	Clean up. llvm-svn: 42112	2007-09-18 22:56:31 +00:00
Evan Cheng	e2e8f2d96b	Fix a bogus splat xform: shuffle <undef, undef, x, undef>, <undef, undef, undef, undef>, <2, 2, 2, 2> != <undef, undef, x, undef> llvm-svn: 42111	2007-09-18 21:54:37 +00:00
Bill Wendling	067f1d8e95	Objective-C was generating EH frame info like this: "_-[NSString(local) isNullOrNil]".eh = 0 .no_dead_strip "_-[NSString(local) isNullOrNil]".eh The ".eh" should be inside the quotes. llvm-svn: 42074	2007-09-18 01:47:22 +00:00
Dan Gohman	863bdc332d	Emit integer x<1 as x<=0, as comparisons with zero (now including 64-bit) can use test instead of cmp with an immediate. llvm-svn: 42026	2007-09-17 14:49:27 +00:00
Dan Gohman	51d1929b9e	Use "test reg,reg" in place of "cmp reg,0" for 64-bit operands. This was previously only done for 32-bit and smaller operands. llvm-svn: 42024	2007-09-17 14:35:24 +00:00
Rafael Espindola	272f7304f0	Add support for functions with byval arguments on x86 llvm-svn: 41953	2007-09-14 15:48:13 +00:00
Dan Gohman	a95cbb0007	Avoid storing and reloading zeros and other constants from stack slots by flagging the associated instructions as being trivially rematerializable. llvm-svn: 41775	2007-09-07 21:32:51 +00:00
Rafael Espindola	1de0c86717	Add support for having different alignment for objects on call frames. The x86-64 ABI states that objects passed on the stack have 8 byte alignment. Implement that. llvm-svn: 41768	2007-09-07 14:52:14 +00:00
Anton Korobeynikov	122bf4be7e	Split eh.select / eh.typeid.for intrinsics into i32/i64 versions. This is needed, because they just "mark" register liveins and we let frontend solve type issue, not lowering code :) llvm-svn: 41763	2007-09-07 11:39:35 +00:00
Anton Korobeynikov	a07765b8f4	Proper handle case, when aliasee is external weak symbol referenced only by alias itself. Also, fix a case, when target doesn't have weak symbols supported. llvm-svn: 41746	2007-09-06 17:21:48 +00:00
Evan Cheng	189df733ed	Fix a bug in X86InstrInfo::convertToThreeAddress that caused it to codegen: leal (,%rcx,8), %rcx It should be leal (,%rcx,8), %ecx llvm-svn: 41735	2007-09-06 00:14:41 +00:00
Dale Johannesen	6480cc6f8c	Change all floating constants that are not exactly representable to use hex format. llvm-svn: 41722	2007-09-05 17:50:36 +00:00
Evan Cheng	e0cb6bb8da	Fix for PR1632. EHSELECTION always produces a i32 value. llvm-svn: 41712	2007-09-04 20:39:26 +00:00
Rafael Espindola	e636fc05d6	Initial support for calling functions with byval arguments on x86-64 llvm-svn: 41643	2007-08-31 15:06:30 +00:00
Evan Cheng	2e9d48aa0d	Update test case to reflect Dale's change. llvm-svn: 41639	2007-08-31 06:29:32 +00:00
Tanya Lattner	ffb806cf0e	Do not run on darwin. llvm-svn: 41608	2007-08-30 16:07:20 +00:00
Evan Cheng	ebb8540067	Added support to fold X86 load / store instructions. This allow rematerialized loads to be folded into their uses. llvm-svn: 41599	2007-08-30 05:54:07 +00:00
Dan Gohman	312b70a970	Add explicit triples to avoid default behavior that varies by host. llvm-svn: 41510	2007-08-27 20:54:48 +00:00
Dan Gohman	8dc0b93151	If the source and destination pointers in an llvm.memmove are known to not alias each other, it can be translated as an llvm.memcpy. llvm-svn: 41489	2007-08-27 16:26:13 +00:00
Rafael Espindola	ff33241e16	call libc memcpy/memset if array size is bigger than threshold. Copying 100MB array (after a warmup) shows that glibc 2.6.1 implementation on x86-64 (core 2) is 30% faster (from 0.270917s to 0.188079s) llvm-svn: 41479	2007-08-27 10:18:20 +00:00
Evan Cheng	595401079e	Test dag xform: Fold C ? 0 : 1 to ~C or zext(~C) or trunc(~C) llvm-svn: 41164	2007-08-18 06:11:57 +00:00
Evan Cheng	2c4ea1e411	New test. Make sure dynamic_stackalloc size is rounded up. llvm-svn: 41135	2007-08-16 23:52:23 +00:00
Evan Cheng	1393d88cc6	Update test: dynamic_stackalloc size must be rounded to ensure stack ptr be left in a valid state. llvm-svn: 41134	2007-08-16 23:51:28 +00:00
Rafael Espindola	4ba05408ac	add byval test llvm-svn: 41123	2007-08-16 13:09:02 +00:00
Dan Gohman	ada7205b76	Convert tests using "grep -c ... \| grep ..." to use the count script. llvm-svn: 41100	2007-08-15 13:49:33 +00:00
Dan Gohman	85c1e51b34	Delete extraneous uses of wc -l. llvm-svn: 41099	2007-08-15 13:45:35 +00:00
Dan Gohman	5327cf7b48	Convert another test to use the count script. This one didn't fit the regex used to convert all the others because the first '\|' was on a separate line. llvm-svn: 41098	2007-08-15 13:42:36 +00:00
Dan Gohman	f9dd170e36	Convert tests using "\| wc -l \| grep ..." to use the count script. llvm-svn: 41097	2007-08-15 13:36:28 +00:00
Chris Lattner	687dbf1a99	tcl seems to hate \|& for some reason. llvm-svn: 41073	2007-08-14 16:19:35 +00:00
Chris Lattner	0e92458068	switch this to use fastcc to avoid fpstack traffic on x86-32. Switch to using the count script instead of wc -l llvm-svn: 41072	2007-08-14 16:14:10 +00:00
Evan Cheng	5c6d53d2ff	Update test case. A spill should now be deleted. llvm-svn: 41070	2007-08-14 09:16:00 +00:00
Evan Cheng	5e221dbe8f	Spiller reuse test case. llvm-svn: 41068	2007-08-14 05:51:03 +00:00
Evan Cheng	2814fe847d	Now capable of rematerializing coalesced live intervals. llvm-svn: 41061	2007-08-13 23:54:16 +00:00
Dan Gohman	ccb3611881	When x86 addresses matching exceeds its recursion limit, check to see if the base register is already occupied before assuming it can be used. This fixes bogus code generation in the accompanying testcase. llvm-svn: 41049	2007-08-13 20:03:06 +00:00
Chris Lattner	4e7f673f65	Fix PR1607 llvm-svn: 41048	2007-08-13 18:42:37 +00:00
Christopher Lamb	030a59d967	Fix test so it passes. llvm-svn: 41012	2007-08-10 22:20:57 +00:00
Christopher Lamb	b372abab14	Increase efficiency of sign_extend_inreg by using subregisters for truncation. As the README suggests sign_extend_subreg is selected to (sext(trunc)). llvm-svn: 41010	2007-08-10 21:48:46 +00:00
Christopher Lamb	d36d30b53c	Add 2-addr to 3-addr promotion code that allows 32-bit LEA to be used via subregisters when 16-bit LEA is disabled. llvm-svn: 41007	2007-08-10 21:18:25 +00:00
Dan Gohman	a17799a3bd	Fix EXTRACT_ELEMENT, EXTRACT_SUBVECTOR, and EXTRACT_VECTOR_ELT to use an intptr ValueType instead of i32 for the index operand in getCopyToParts. llvm-svn: 40987	2007-08-10 14:59:38 +00:00
Chris Lattner	39d751058a	allow this to pass on ppc hosts. llvm-svn: 40846	2007-08-05 18:48:18 +00:00
Dan Gohman	8932bff7fe	Fix the alignment requirements of several unpck and shuf instructions. Generalize isPSHUFDMask and add a unary SHUFPD pattern so that SHUFPD's memory operand alignment can be tested as well, with a fix to avoid breaking MMX's use of isPSHUFDMask. llvm-svn: 40756	2007-08-02 21:17:01 +00:00
Dan Gohman	fa3eeeedc0	Mark the SSE and MMX load instructions that X86InstrInfo::isReallyTriviallyReMaterializable knows how to handle with the isReMaterializable flag so that it is given a chance to handle them. Without hoisting constant-pool loads from loops this isn't very visible, though it does keep CodeGen/X86/constant-pool-remat-0.ll from making a copy of the constant pool on the stack. llvm-svn: 40736	2007-08-02 14:27:55 +00:00
Evan Cheng	824693c87a	Fix test. llvm-svn: 40721	2007-08-02 05:04:16 +00:00
Evan Cheng	41ccce7169	New test. Bogus implicit-def prevented a copy from being coalesced. llvm-svn: 40690	2007-08-01 20:26:40 +00:00
Chris Lattner	9182684222	we're now handling this right :) llvm-svn: 40675	2007-08-01 17:10:30 +00:00
Evan Cheng	09a141df31	Requires SSE2. llvm-svn: 40657	2007-08-01 00:10:12 +00:00
Dan Gohman	54ec4bfa5f	Change the x86 assembly output to use tab characters to separate the mnemonics from their operands instead of single spaces. This makes the assembly output a little more consistent with various other compilers (f.e. GCC), and slightly easier to read. Also, update the regression tests accordingly. llvm-svn: 40648	2007-07-31 20:11:57 +00:00
Evan Cheng	12c6be84ff	Redo and generalize previously removed opt for pinsrw: (vextract (v4i32 bc (v4f32 s2v (f32 load ))), 0) -> (i32 load ) llvm-svn: 40628	2007-07-31 08:04:03 +00:00
Dan Gohman	4788552deb	Re-apply 40504, but with a fix for the segfault it caused in oggenc: Make the alignedload and alignedstore patterns always require 16-byte alignment. This way when they are used in the "Fs" instructions, in which a vector instruction is used for a scalar purpose, they can still require the full vector alignment. And add a regression test for this. llvm-svn: 40555	2007-07-27 17:16:43 +00:00
Evan Cheng	931de40afa	Reverting 40504 for now. It's breaking oggenc. llvm-svn: 40547	2007-07-27 01:37:47 +00:00
Evan Cheng	dfa5d283fd	Test case for PR1573. llvm-svn: 40539	2007-07-26 17:45:57 +00:00
Evan Cheng	e9ba8e0765	Fix test. llvm-svn: 40536	2007-07-26 17:07:03 +00:00
Dan Gohman	8455bd3fae	Remove X86ISD::LOAD_PACK and X86ISD::LOAD_UA and associated code from the x86 target, replacing them with the new alignment attributes on memory references. llvm-svn: 40504	2007-07-26 00:31:09 +00:00
Dan Gohman	f906c7286f	Use movaps to load a v4f32 build_vector of all-constant values into a register instead of loading each element individually. llvm-svn: 40478	2007-07-24 22:55:08 +00:00
Dan Gohman	45863cc202	Update these regression tests to accommodate X86InstrSSE.td now using movups/movaps for everything. llvm-svn: 40101	2007-07-20 16:31:26 +00:00
Evan Cheng	f195429a0e	New test. llvm-svn: 40077	2007-07-20 00:27:56 +00:00
Evan Cheng	a39fd10e32	New test. llvm-svn: 40073	2007-07-19 23:53:50 +00:00
Evan Cheng	8ab393548f	Try fixing it again. llvm-svn: 40072	2007-07-19 23:53:29 +00:00
Reid Spencer	314e1cb7ee	For PR1553: Change the keywords for the zext and sext parameter attributes to be zeroext and signext so they don't conflict with the keywords for the instructions of the same name. This gets around the ambiguity. llvm-svn: 40069	2007-07-19 23:13:04 +00:00
Bill Wendling	dd96b98bf6	Don't need the "&&" to glue lines together. llvm-svn: 40063	2007-07-19 18:06:26 +00:00
Bill Wendling	e8ea3303ce	Testcase for PR1549 llvm-svn: 40041	2007-07-19 06:31:11 +00:00
Evan Cheng	dcc3451f8a	New test. llvm-svn: 40020	2007-07-18 21:39:16 +00:00
Dan Gohman	776962a97a	Implement initial memory alignment awareness for SSE instructions. Vector loads and stores that have a specified alignment of less than 16 bytes now use instructions that support misaligned memory references. llvm-svn: 40015	2007-07-18 20:23:34 +00:00
Dan Gohman	a7b65c30a3	It's not necessary to do rounding for alloca operations when the requested alignment is equal to the stack alignment. llvm-svn: 40004	2007-07-18 16:29:46 +00:00
Evan Cheng	5184c9d787	Fix test. llvm-svn: 39976	2007-07-17 18:16:09 +00:00
Evan Cheng	9ae2eb43d8	Use push / pop for prologues and epilogues. llvm-svn: 39967	2007-07-17 07:59:08 +00:00
Dale Johannesen	2182f06f2d	Skeleton of post-RA scheduler; doesn't do anything yet. Change name of -sched option and DEBUG_TYPE to pre-RA-sched; adjust testcases. llvm-svn: 39816	2007-07-13 17:13:54 +00:00
Evan Cheng	ade8183fe8	Add test case for PR1545. llvm-svn: 39749	2007-07-11 19:29:05 +00:00
Dan Gohman	60d6f96da3	Change the peep for EXTRACT_VECTOR_ELT of BUILD_PAIR to look for the new CONCAT_VECTORS node type instead, as that's what legalize uses now. And add a peep for EXTRACT_VECTOR_ELT of INSERT_VECTOR_ELT. llvm-svn: 38503	2007-07-10 18:20:44 +00:00
Dan Gohman	4bd45b3d53	Add a regression test for folding spill code into scalar min and max. llvm-svn: 38492	2007-07-10 15:34:29 +00:00
Chris Lattner	318ff8dd94	force a cpu without SSE llvm-svn: 38466	2007-07-09 17:35:18 +00:00
Chris Lattner	0937478073	allow this to work on ppc-darwin llvm-svn: 38465	2007-07-09 17:32:28 +00:00
Bill Wendling	3053244b27	Allow a GR64 to be moved into an MMX register via the "movd" instruction. Still need to have JIT generate this code. llvm-svn: 37863	2007-07-04 00:19:54 +00:00
Dale Johannesen	a8bf39ee31	New testcases for rev 37847 (PR's 1489 and 1505). llvm-svn: 37848	2007-07-03 00:58:37 +00:00
Dan Gohman	9ff9908413	Add a basic test-case for passing and returning <4 x double> and <8 x float> values on X86. llvm-svn: 37845	2007-07-02 16:23:47 +00:00
Dan Gohman	11a4008a59	New test case. DAGCombiner should be able to fold -sin(-x) in -enable-unsafe-fp-math mode. llvm-svn: 37841	2007-07-02 15:43:20 +00:00
Evan Cheng	d21e3ab873	New test. llvm-svn: 37823	2007-06-29 23:17:15 +00:00
Evan Cheng	62acd01275	New test. llvm-svn: 37815	2007-06-29 21:40:30 +00:00
John Criswell	2660cef6d7	Convert .cvsignore files llvm-svn: 37801	2007-06-29 16:35:07 +00:00
Evan Cheng	b99f152f29	New tests. llvm-svn: 37787	2007-06-29 00:27:18 +00:00
Evan Cheng	6ba3e8b27d	New test case: identity operation of RHS / LHS of a VECTOR_SHUFFLE. llvm-svn: 37637	2007-06-19 00:06:08 +00:00
Chris Lattner	5f232faf8c	ensure we don't regress on these tests. We generate awful code in x86-32 for these though. llvm-svn: 37619	2007-06-17 23:29:57 +00:00
Bill Wendling	ec6be07c22	XFAILing until I can fix properly. llvm-svn: 37618	2007-06-16 23:57:51 +00:00
Bill Wendling	c8f293b4f9	Testcase for MMX int to MMX register failure. llvm-svn: 37612	2007-06-16 06:31:47 +00:00

562 Commits