llvm-project

Commit Graph

Author	SHA1	Message	Date
Evan Cheng	54c20b559e	When a live virtual register is being clobbered by an implicit def, it is spilled and the spill is its kill. However, if the local allocator has determined the register has not been modified (possible when its value was reloaded), it would not issue a restore. In that case, mark the last use of the virtual register as kill. llvm-svn: 46111	2008-01-17 02:08:17 +00:00
Evan Cheng	7be1528004	Fixes a nasty dag combiner bug that causes a bunch of tests to fail at -O0. It's not safe to use the two value CombineTo variant to combine away a dead load. e.g. v1, chain2 = load chain1, loc v2, chain3 = load chain2, loc v3 = add v2, c Now we replace use of v1 with undef, use of chain2 with chain1. ReplaceAllUsesWith() will iterate through uses of the first load and update operands: v1, chain2 = load chain1, loc v2, chain3 = load chain1, loc v3 = add v2, c Now the second load is the same as the first load, SelectionDAG cse will ensure the use of second load is replaced with the first load. v1, chain2 = load chain1, loc v3 = add v1, c Then v1 is replaced with undef and bad things happen. llvm-svn: 46099	2008-01-16 23:11:54 +00:00
Duncan Sands	32b0ff6814	Trampoline support for x86-64. This looks like it should work, but I have no machine to test it on. Committed because it will at least cause no harm, and maybe someone can test it for me! llvm-svn: 46098	2008-01-16 22:55:25 +00:00
Chris Lattner	6e3379c07b	make sure to use a cpu that has sse. llvm-svn: 46060	2008-01-16 06:32:02 +00:00
Chris Lattner	8f7cec859e	My previous commit had an incomplete message, it should have been: make the 'fp return in ST(0)' optimization smart enough to look through token factor nodes. This allows us to compile testcases like CodeGen/X86/fp-stack-retcopy.ll into: _carg: subl $12, %esp call L_foo$stub fstpl (%esp) fldl (%esp) addl $12, %esp ret instead of: _carg: subl $28, %esp call L_foo$stub fstpl 16(%esp) movsd 16(%esp), %xmm0 movsd %xmm0, 8(%esp) fldl 8(%esp) addl $28, %esp ret Still not optimal, but much better and this is a trivial patch. Fixing the rest requires invasive surgery that is not llvm 2.2 material. llvm-svn: 46054	2008-01-16 05:56:59 +00:00
Chris Lattner	915ec14073	verify x86 generates ud2 for llvm.trap llvm-svn: 46023	2008-01-15 22:22:02 +00:00
Dale Johannesen	04b99780cf	Disable for now. llvm-svn: 45881	2008-01-11 20:47:33 +00:00
Duncan Sands	53c954fa86	Output sinl for a long double FSIN node, not sin. Likewise fix up a bunch of other libcalls. While there I remove NEG_F32 and NEG_F64 since they are not used anywhere. This fixes 9 Ada ACATS failures. llvm-svn: 45833	2008-01-10 10:28:30 +00:00
Evan Cheng	0f8c7c4a73	Codegen improvement has reduced one spill. llvm-svn: 45814	2008-01-10 02:54:40 +00:00
Evan Cheng	0e400d4cb7	Special copy SUnit's do not have SDNode's. llvm-svn: 45787	2008-01-09 23:01:55 +00:00
Evan Cheng	a31824a08e	Fix sse2.psrl.w and sse2.psrl.q definitions. llvm-svn: 45772	2008-01-09 02:16:44 +00:00
Chris Lattner	51b01bf8a5	Make load->store deletion a bit smarter. This allows us to compile this: void test(long long *P) { *P ^= 1; } into just: _test: movl 4(%esp), %eax xorl $1, (%eax) ret instead of code like this: _test: movl 4(%esp), %ecx xorl $1, (%ecx) movl 4(%ecx), %edx movl %edx, 4(%ecx) ret llvm-svn: 45762	2008-01-08 23:08:06 +00:00
Duncan Sands	7b1460cca4	Crashes llc when using Chris's new legalization logic. llvm-svn: 45758	2008-01-08 21:51:53 +00:00
Nate Begeman	d3d49df3f1	Update test to catch recent x86 insert regression and improvements llvm-svn: 45705	2008-01-07 17:49:23 +00:00
Chris Lattner	41e423a6f5	fix this to use a valid triple. llvm-svn: 45509	2008-01-02 22:21:45 +00:00
Chris Lattner	5d998c5712	verify that aligned common support doesn't break. llvm-svn: 45495	2008-01-02 19:48:24 +00:00
Chris Lattner	d2b8a36f0e	One readme entry is done, one is really easy (Evan, want to investigate eliminating the llvm.x86.sse2.loadl.pd intrinsic?), one shuffle optzn may be done (if shufps is better than pinsw, Evan, please review), and we already know about LICM of simple instructions. llvm-svn: 45407	2007-12-29 19:31:47 +00:00
Chris Lattner	0d90c8f016	upgrade this test llvm-svn: 45406	2007-12-29 19:24:06 +00:00
Chris Lattner	3b6a82118b	Fold comparisons against a constant nan, and optimize ORD/UNORD comparisons with a constant. This allows us to compile isnan to: _foo: fcmpu cr7, f1, f1 mfcr r2 rlwinm r3, r2, 0, 31, 31 blr instead of: LCPI1_0: ; float .space 4 _foo: lis r2, ha16(LCPI1_0) lfs f0, lo16(LCPI1_0)(r2) fcmpu cr7, f1, f0 mfcr r2 rlwinm r3, r2, 0, 31, 31 blr llvm-svn: 45405	2007-12-29 08:37:08 +00:00
Chris Lattner	33de0c6e92	this xform is implemented. llvm-svn: 45404	2007-12-29 08:19:39 +00:00
Chris Lattner	07ccbfa64a	Codegen: as: _bar: pushl %esi subl $8, %esp movl 16(%esp), %esi call L_foo$stub fstps (%esi) addl $8, %esp popl %esi #FP_REG_KILL ret instead of: _bar: pushl %esi subl $8, %esp movl 16(%esp), %esi call L_foo$stub fstpl (%esi) cvtsd2ss (%esi), %xmm0 movss %xmm0, (%esi) addl $8, %esp popl %esi #FP_REG_KILL ret llvm-svn: 45401	2007-12-29 06:57:38 +00:00
Chris Lattner	8013bd339b	avoid going through a stack slot to convert from fpstack to xmm reg if we are just going to store it back anyway. This improves things like: double foo(); void bar(double *P) { *P = foo(); } llvm-svn: 45399	2007-12-29 06:41:28 +00:00
Chris Lattner	bc13df19a8	one fewer uncond branch with my codegenprepare hack for single-mbb backedges. llvm-svn: 45360	2007-12-26 17:23:47 +00:00
Evan Cheng	483a969ece	Fix PR1872: SrcValue and SrcValueOffset should not be used to compute load / store node id. llvm-svn: 45167	2007-12-18 19:38:14 +00:00
Evan Cheng	91e0fc9cb4	FIX for PR1799: When a load is unfolded from an instruction, check if it is a new node. If not, do not create a new SUnit. llvm-svn: 45157	2007-12-18 08:42:10 +00:00
Evan Cheng	23d2d4dc6c	Make better use of instructions that clear high bits; fix various 2-wide shuffle bugs. llvm-svn: 45058	2007-12-15 03:00:47 +00:00
Evan Cheng	0e6408124e	Fix ctlz and cttz. llvm definition requires them to return number of bits in the src type when value is zero. llvm-svn: 45029	2007-12-14 08:30:15 +00:00
Evan Cheng	e9fbc3f014	Implement ctlz and cttz with bsr and bsf. llvm-svn: 45024	2007-12-14 02:13:44 +00:00
Evan Cheng	37c36ed79a	Be extra careful with extension use optimization. Now turned on by default. llvm-svn: 44981	2007-12-13 03:32:53 +00:00
Evan Cheng	827d30db19	Fold some and + shift in x86 addressing mode. llvm-svn: 44970	2007-12-13 00:43:27 +00:00
Evan Cheng	6e68381e02	Implicit def instructions, e.g. X86::IMPLICIT_DEF_GR32, are always re-materializable and they should not be spilled. llvm-svn: 44960	2007-12-12 23:12:09 +00:00
Dan Gohman	7a7742c2fe	Allow vector integer constants to be created with SelectionDAG::getConstant, in the same way as vector floating-point constants. This allows the legalize expansion code for @llvm.ctpop and friends to be usable with vector types. llvm-svn: 44954	2007-12-12 22:21:26 +00:00
Evan Cheng	0f42730722	Use shuffles to implement insert_vector_elt for i32, i64, f32, and f64. llvm-svn: 44929	2007-12-12 07:55:34 +00:00
Evan Cheng	0a1254f634	Add a test case for -optimize-ext-uses. llvm-svn: 44928	2007-12-12 07:54:08 +00:00
Evan Cheng	2a98956796	Lower a build_vector with all constants into a constpool load unless it can be done with a move to low part. llvm-svn: 44921	2007-12-12 06:45:40 +00:00
Evan Cheng	4fbf459549	- Improved v8i16 shuffle lowering. It now uses pshuflw and pshufhw as much as possible before resorting to pextrw and pinsrw. - Better codegen for v4i32 shuffles masquerading as v8i16 or v16i8 shuffles. - Improves (i16 extract_vector_element 0) codegen by recognizing (i32 extract_vector_element 0) does not require a pextrw. llvm-svn: 44836	2007-12-11 01:46:18 +00:00
Christopher Lamb	d202e03fe5	Improve branch folding by recognizing that explicit successor relationships impact the value of fall-through choices. llvm-svn: 44785	2007-12-10 07:24:06 +00:00
Evan Cheng	bfd373a53e	Much improved v8i16 shuffles. (Step 1). llvm-svn: 44676	2007-12-07 08:07:39 +00:00
Evan Cheng	26593a04db	New test case. llvm-svn: 44672	2007-12-07 01:48:46 +00:00
Evan Cheng	5cb41390ab	Fix a bogus test case. llvm-svn: 44668	2007-12-06 22:12:45 +00:00
Evan Cheng	8393dc7378	Turning simple splitting on. Start testing new coalescer heuristics as new llcbeta. llvm-svn: 44660	2007-12-06 08:54:31 +00:00
Chris Lattner	eedaf92fcf	third time around: instead of disabling this completely, only disable it if we don't know it will be obviously profitable. Still fixme, but less so. :) llvm-svn: 44658	2007-12-06 07:47:55 +00:00
Chris Lattner	b5fdfb9612	Actually, disable this code for now. More analysis and improvements to the X86 backend are needed before this should be enabled by default. llvm-svn: 44657	2007-12-06 07:44:31 +00:00
Chris Lattner	7c709a5d08	implement a readme entry, compiling the code into: _foo: movl $12, %eax andl 4(%esp), %eax movl _array(%eax), %eax ret instead of: _foo: movl 4(%esp), %eax shrl $2, %eax andl $3, %eax movl _array(,%eax,4), %eax ret As it turns out, this triggers all the time, in a wide variety of situations, for example, I see diffs like this in various programs: - movl 8(%eax), %eax - shll $2, %eax - andl $1020, %eax - movl (%esi,%eax), %eax + movzbl 8(%eax), %eax + movl (%esi,%eax,4), %eax - shll $2, %edx - andl $1020, %edx - movl (%edi,%edx), %edx + andl $255, %edx + movl (%edi,%edx,4), %edx Unfortunately, I also see stuff like this, which can be fixed in the X86 backend: - andl $85, %ebx - addl _bit_count(,%ebx,4), %ebp + shll $2, %ebx + andl $340, %ebx + addl _bit_count(%ebx), %ebp llvm-svn: 44656	2007-12-06 07:33:36 +00:00
Chris Lattner	dfa39289a5	fix this when run on non x86 hosts. llvm-svn: 44645	2007-12-06 01:05:52 +00:00
Evan Cheng	69fda0a716	Allow some reloads to be folded in multi-use cases. Specifically testl r, r -> cmpl [mem], 0. llvm-svn: 44479	2007-12-01 02:07:52 +00:00
Evan Cheng	b10dc27b20	Do not fold reload into an instruction with multiple uses. It issues one extra load. llvm-svn: 44467	2007-11-30 21:23:43 +00:00
Dan Gohman	f151c8e760	Remove unnecessary && from the RUN lines of this test. llvm-svn: 44342	2007-11-27 00:03:38 +00:00
Dan Gohman	9a69341725	Don't lower srem/urem X%C to X-X/C*C unless the division is actually optimized. This avoids creating illegal divisions when the combiner is running after legalize; this fixes PR1815. Also, it produces better code in the included testcase by avoiding the subtract and multiply when the division isn't optimized. llvm-svn: 44341	2007-11-26 23:46:11 +00:00
Chris Lattner	5728bdd4db	Fix a long standing deficiency in the X86 backend: we would sometimes emit "zero" and "all one" vectors multiple times, for example: _test2: pcmpeqd %mm0, %mm0 movq %mm0, _M1 pcmpeqd %mm0, %mm0 movq %mm0, _M2 ret instead of: _test2: pcmpeqd %mm0, %mm0 movq %mm0, _M1 movq %mm0, _M2 ret This patch fixes this by always arranging for zero/one vectors to be defined as v4i32 or v2i32 (SSE/MMX) instead of letting them be any random type. This ensures they get trivially CSE'd on the dag. This fix is also important for LegalizeDAGTypes, as it gets unhappy when the x86 backend wants BUILD_VECTOR(i64 0) to be legal even when 'i64' isn't legal. This patch makes the following changes: 1) X86TargetLowering::LowerBUILD_VECTOR now lowers 0/1 vectors into their canonical types. 2) The now-dead patterns are removed from the SSE/MMX .td files. 3) All the patterns in the .td file that referred to immAllOnesV or immAllZerosV in the wrong form now use *_bc to match them with a bitcast wrapped around them. 4) X86DAGToDAGISel::SelectScalarSSELoad is generalized to handle bitcast'd zero vectors, which simplifies the code actually. 5) getShuffleVectorZeroOrUndef is updated to generate a shuffle that is legal, instead of generating one that is illegal and expecting a later legalize pass to clean it up. 6) isZeroShuffle is generalized to handle bitcast of zeros. 7) several other minor tweaks. This patch is definite goodness, but has the potential to cause random code quality regressions. Please be on the lookout for these and let me know if they happen. llvm-svn: 44310	2007-11-25 00:24:49 +00:00
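
The addressing-mode diffs quoted in commit 7c709a5d08 depend on two bit-level identities between "shift, then mask" and "mask, then scale by 4". The sketch below checks those identities by brute force over small non-negative integers. It is not LLVM code: the function names are hypothetical and chosen only for illustration, and the values are assumed to behave like unsigned machine words.

```python
# Illustrative check of the arithmetic behind commit 7c709a5d08.
# All function names here are hypothetical; none of this is LLVM code.

def offset_shift_then_mask(x: int) -> int:
    """Byte offset as in 'shrl $2; andl $3; movl _array(,%eax,4)'."""
    index = (x >> 2) & 3   # pick bits 2..3 of x as the element index
    return index * 4       # scale by the 4-byte element size


def offset_single_mask(x: int) -> int:
    """Byte offset as in 'andl $12; movl _array(%eax)'."""
    return x & 12          # bits 2..3 of x, left in place


def offset_shl_and(x: int) -> int:
    """Byte offset as in 'shll $2; andl $1020; movl (%esi,%eax)'."""
    return (x << 2) & 1020


def offset_and_scaled(x: int) -> int:
    """Byte offset as in 'andl $255; movl (%esi,%eax,4)'."""
    return (x & 255) * 4   # mask first, let the address scale by 4


def forms_agree(limit: int = 1 << 16) -> bool:
    """Return True if both pairs of forms match for every x in [0, limit)."""
    return all(
        offset_shift_then_mask(x) == offset_single_mask(x)
        and offset_shl_and(x) == offset_and_scaled(x)
        for x in range(limit)
    )
```

Calling `forms_agree()` compares each pair over the first 65536 values. Both identities hold because 12 is 3 shifted left by 2 and 1020 is 255 shifted left by 2, so the mask can be applied either before or after the scaling.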


322 Commits