llvm-project

Commit Graph

Author	SHA1	Message	Date
Dale Johannesen	0e28b3b9ee	Allow for rounding up of stack frame. llvm-svn: 52751	2008-06-26 01:55:32 +00:00
Chris Lattner	b1e66ce3bb	when we know the signbit of an input to uint_to_fp is zero, change it to sint_to_fp on targets where that is cheaper (and vice versa of course). This allows us to compile uint_to_fp to: _test: movl 4(%esp), %eax shrl $23, %eax cvtsi2ss %eax, %xmm0 movl 8(%esp), %eax movss %xmm0, (%eax) ret instead of: .align 3 LCPI1_0: ## double .long 0 ## double least significant word 4.5036e+15 .long 1127219200 ## double most significant word 4.5036e+15 .text .align 4,0x90 .globl _test _test: subl $12, %esp movl 16(%esp), %eax shrl $23, %eax movl %eax, (%esp) movl $1127219200, 4(%esp) movsd (%esp), %xmm0 subsd LCPI1_0, %xmm0 cvtsd2ss %xmm0, %xmm0 movl 20(%esp), %eax movss %xmm0, (%eax) addl $12, %esp ret llvm-svn: 52747	2008-06-26 00:16:49 +00:00
Evan Cheng	3fc2372d3a	- Fix a x86 vector isel bug: illegal transformation of a vector_shuffle into a shift. - Add a readme entry for a missing vector_shuffle optimization that results in awful codegen. llvm-svn: 52740	2008-06-25 20:52:59 +00:00
Mon P Wang	6a490371c9	Added MemOperands to Atomic operations since atomics touch memory. Added abstract class MemSDNode for any Node that has an associated MemOperand. Changed atomic.lcs => atomic.cmp.swap, atomic.las => atomic.load.add, and atomic.lss => atomic.load.sub llvm-svn: 52706	2008-06-25 08:15:39 +00:00
Evan Cheng	73db52ebf8	Enable two-address remat by default. llvm-svn: 52701	2008-06-25 01:16:38 +00:00
Dale Johannesen	6316bc705e	v2f32 is now a valid (MMX) type which breaks this test (doesn't work for any MMX vector types, it's not me). Rewritten to use v2i16 which is generic and going to stay that way; I think that preserves the point of the test. llvm-svn: 52692	2008-06-24 22:03:36 +00:00
Evan Cheng	3f2ceac565	If it's determined safe, remat MOV32r0 (i.e. xor r, r) and others as it is instead of using the longer MOV32ri instruction. llvm-svn: 52670	2008-06-24 07:10:51 +00:00
Bill Wendling	559067c55c	Make test work on non-x86 machines (like my G4 PPC). llvm-svn: 52619	2008-06-23 06:16:31 +00:00
Evan Cheng	f593a65497	Undo spill weight tweak. Need to investigate the performance regressions. llvm-svn: 52572	2008-06-21 06:45:54 +00:00
Eli Friedman	8d66e98c92	Fix a bug with <8 x i16> shuffle lowering on X86 where parts of the shuffle could be skipped. The check is invalid because the loop index i doesn't correspond to the element actually inserted. The correct check is already done a few lines earlier, for whether the element is already in the right spot, so this shouldn't have any effect on the codegen for code that was already correct. llvm-svn: 52486	2008-06-19 06:09:51 +00:00
Evan Cheng	b139ee95b4	New test case. llvm-svn: 52483	2008-06-19 01:50:24 +00:00
Evan Cheng	d6d9cfa72a	This also got better (55 - 51 instructions). But doing one more re-materialization. llvm-svn: 52482	2008-06-19 01:50:13 +00:00
Evan Cheng	1d1907d778	This got better. llvm-svn: 52481	2008-06-19 01:46:43 +00:00
Evan Cheng	1cde1f8d5e	Do not issue identity copies. llvm-svn: 52373	2008-06-16 22:52:53 +00:00
Evan Cheng	49bad4c9e1	- Add "Commutative" property to intrinsics. This allows tblgen to generate the commuted variants for dagisel matching code. - Mark lots of X86 intrinsics as "Commutative" to allow load folding. llvm-svn: 52353	2008-06-16 20:29:38 +00:00
Evan Cheng	fb79059b85	Teach the spiller to commute instructions in order to fold a reload. This hits 410 times on 444.namd and 122 times on 252.eon. llvm-svn: 52266	2008-06-13 23:58:02 +00:00
Duncan Sands	8651e9c584	Disable some DAG combiner optimizations that may be wrong for volatile loads and stores. In fact this is almost all of them! There are three types of problems: (1) it is wrong to change the width of a volatile memory access. These may be used to do memory mapped i/o, in which case a load can have an effect even if the result is not used. Consider loading an i32 but only using the lower 8 bits. It is wrong to change this into a load of an i8, because you are no longer tickling the other three bytes. It is also unwise to make a load/store wider. For example, changing an i16 load into an i32 load is wrong no matter how aligned things are, since the fact of loading an additional 2 bytes can have i/o side-effects. (2) it is wrong to change the number of volatile load/stores: they may be counted by the hardware. (3) it is wrong to change a volatile load/store that requires one memory access into one that requires several. For example on x86-32, you can store a double in one processor operation, but to store an i64 requires two (two i32 stores). In a multi-threaded program you may want to bitcast an i64 to a double and store as a double because that will occur atomically, and be indivisible to other threads. So it would be wrong to convert the store-of-double into a store of an i64, because this will become two i32 stores - no longer atomic. My policy here is to say that the number of processor operations for an illegal operation is undefined. So it is alright to change a store of an i64 (requires at least two stores; but could be validly lowered to memcpy for example) into a store of double (one processor op). In short, if the new store is legal and has the same size then I say that the transform is ok. It would also be possible to say that transforms are always ok if before they were illegal, whether after they are illegal or not, but that's more awkward to do and I doubt it buys us anything much. 
However this exposed an interesting thing - on x86-32 a store of i64 is considered legal! That is because operations are marked legal by default, regardless of whether the type is legal or not. In some ways this is clever: before type legalization this means that operations on illegal types are considered legal; after type legalization there are no illegal types so now operations are only legal if they really are. But I consider this to be too cunning for mere mortals. Better to do things explicitly by testing AfterLegalize. So I have changed things so that operations with illegal types are considered illegal - indeed they can never map to a machine operation. However this means that the DAG combiner is more conservative because before it was "accidentally" performing transforms where the type was illegal because the operation was nonetheless marked legal. So in a few such places I added a check on AfterLegalize, which I suppose was actually just forgotten before. This causes the DAG combiner to do slightly more than it used to, which resulted in the X86 backend blowing up because it got a slightly surprising node it wasn't expecting, so I tweaked it. llvm-svn: 52254	2008-06-13 19:07:40 +00:00
Evan Cheng	2d788ce3fb	Fix some tests. llvm-svn: 52245	2008-06-12 21:23:38 +00:00
Dale Johannesen	01b7cae58d	Fix parameter spelling: sse not sse1 llvm-svn: 52185	2008-06-10 17:57:58 +00:00
Matthijs Kooijman	07f4eecd2a	Fix some more quoting issues in RUN lines, this time regarding unintended variable expansions involving the $ character. This fixes 4 tests that were not running properly before. llvm-svn: 52183	2008-06-10 16:10:32 +00:00
Matthijs Kooijman	400c49c781	Remove double pipes in RUN commandlines. This fixes 5 testcases that were not being run properly before. llvm-svn: 52180	2008-06-10 15:11:36 +00:00
Dan Gohman	1b095b443c	Convert several tests to use temporary files instead of redundantly executing the test commands. llvm-svn: 52163	2008-06-10 00:36:41 +00:00
Rafael Espindola	29479df2ac	add support for PIC on linux x86-64 llvm-svn: 52139	2008-06-09 09:52:31 +00:00
Evan Cheng	976b1eee81	Fix a memcpy lowering bug. Even though the memcpy alignment is smaller than the desired alignment, the frame destination alignment may still be larger than the desired alignment. Don't change its alignment to something smaller. llvm-svn: 51970	2008-06-04 23:37:54 +00:00
Dan Gohman	92d62b43c2	Fix the position of MemOperands in nodes that use variadic_ops in DAGISelEmitter output. This bug was recently uncovered by the addition of patterns for CALL32m and CALL64m, which are nodes that now have both MemOperands and variadic_ops. This bug was especially visible with PIC in various configurations, because the new patterns are matching the indirect call code used in many PIC configurations. llvm-svn: 51877	2008-06-02 17:40:38 +00:00
Dan Gohman	96af4ddb62	Add patterns for CALL32m and CALL64m. They aren't matched in most cases due to an isel deficiency already noted in lib/Target/X86/README.txt, but they can be matched in this fold-call.ll testcase, for example. This is interesting mainly because it exposes a tricky tblgen bug; tblgen was incorrectly computing the starting index for variable_ops in the case of a complex pattern. llvm-svn: 51706	2008-05-29 21:50:34 +00:00
Dan Gohman	714663ab94	Expand small memmovs using inline code. Set the X86 threshold for expanding memmove to a more plausible value, now that it's actually being used. llvm-svn: 51696	2008-05-29 19:42:22 +00:00
Evan Cheng	5e28227dbd	Implement vector shift up / down and insert zero with ps{rl}lq / ps{rl}ldq. llvm-svn: 51667	2008-05-29 08:22:04 +00:00
Evan Cheng	6892c5507f	Add nounwind. llvm-svn: 51665	2008-05-29 07:09:24 +00:00
Evan Cheng	68079268f5	Fix PR2289: vr defined by multiple implicit_def as result of coalescing. llvm-svn: 51648	2008-05-28 17:40:10 +00:00
Evan Cheng	427412e7c8	Teach local register allocator to deal with landing pad MBB's. llvm-svn: 51647	2008-05-28 17:22:32 +00:00
Dan Gohman	221e9d0d22	Specify a target so that this test tests what it's intended to test. llvm-svn: 51600	2008-05-27 17:55:57 +00:00
Dan Gohman	923a375053	Make this test independent of the target-triple; the stack alignment is specifically what this test depends on. llvm-svn: 51599	2008-05-27 17:44:23 +00:00
Nick Lewycky	213e114a2c	The Linux ABI emits an extra "movl %esp, %ebp" in function prologue and sometimes a "mov %ebp, %esp" in the epilogue. Force these tests that rely on counting 'mov' to use i686-apple-darwin8.8.0 where they were written. llvm-svn: 51568	2008-05-26 20:18:56 +00:00
Evan Cheng	948627aadd	New loadl_pd and loadh_pd tests. llvm-svn: 51525	2008-05-24 00:10:02 +00:00
Evan Cheng	04d24edcbb	Use movlps / movhps to modify low / high half of 16-byte memory location. llvm-svn: 51501	2008-05-23 21:23:16 +00:00
Dan Gohman	3388d022ac	Use PMULDQ for v2i64 multiplies when SSE4.1 is available. And add load-folding table entries for PMULDQ and PMULLD. llvm-svn: 51489	2008-05-23 17:49:40 +00:00
Evan Cheng	f3be7a7ea7	Bug: rcpps can only fold a load if the address is 16-byte aligned. Fixed many 'ps' load folding patterns in X86InstrSSE.td which are missing the proper alignment checks. Also fixed some 80 col. violations. llvm-svn: 51462	2008-05-23 00:37:07 +00:00
Evan Cheng	a1100782d5	Add a couple of test cases. llvm-svn: 51441	2008-05-22 21:19:19 +00:00
Evan Cheng	53963b775e	Add missing patterns. llvm-svn: 51435	2008-05-22 18:56:56 +00:00
Chris Lattner	a87f1a568c	testcase for PR2267 llvm-svn: 51408	2008-05-22 04:45:22 +00:00
Evan Cheng	a5d27ae586	Fix PR2343. An interesting coalescer bug. BB1: vr1025 = copy vr1024 .. BB2: vr1024 = op = op vr1025 <loop eventually branch back to BB1> Even though vr1025 is copied from vr1024, it's not safe to coalesce them since the live range of vr1025 intersects the def of vr1024. This happens when vr1025 is assigned the value of the previous iteration of vr1024 in the loop. llvm-svn: 51394	2008-05-21 22:34:12 +00:00
Gabor Greif	1e427c3264	sabre brings to my attention that the 'tr' suffix is also obsolete llvm-svn: 51349	2008-05-20 21:00:03 +00:00
Gabor Greif	f45ff35bfe	Rename the last test with .llx extension to .ll, resolve duplicate test by renaming to isnan2. Now that no test has llx ending there is no need to search for them from dg.exp too. llvm-svn: 51328	2008-05-20 19:52:04 +00:00
Dan Gohman	cd2e772d08	Run vortex-bug as x86-64, which is what the original bug was triggered on. llvm-svn: 51289	2008-05-20 00:54:39 +00:00
Dale Johannesen	0bf92b14f1	Use common where we mean common, not weak. llvm-svn: 51173	2008-05-16 00:52:30 +00:00
Dan Gohman	0a0fa7cf78	Fix a bug in LoopStrengthReduce that caused it to emit IR with use-before-def. The problem comes up in code with multiple PHIs where one PHI is being rewritten in terms of the other, but the other needs to be casted first. LLVM rules require the cast instruction to be inserted after any PHI instructions, but when instructions were inserted to replace the second PHI value with a function of the first, they ended up going before the cast instruction. Avoid this problem by remembering the location of the cast instruction, when one is needed, and inserting the expansion of the new value after it. This fixes a bug that surfaced in 255.vortex on x86-64 when instcombine was removed from the middle of the loop optimization passes. llvm-svn: 51169	2008-05-15 23:26:57 +00:00
Dan Gohman	3ab94df276	When bit-twiddling CondCode values for integer comparisons produces SETOEQ, as it does with (SETEQ & SETULE), map it to SETEQ. llvm-svn: 51112	2008-05-14 18:17:09 +00:00
Evan Cheng	1120279ae6	Instead of a vector load, shuffle and then extract an element. Load the element from address with an offset. pshufd $1, (%rdi), %xmm0 movd %xmm0, %eax => movl 4(%rdi), %eax llvm-svn: 51026	2008-05-13 08:35:03 +00:00
Evan Cheng	3f40c69083	On x86, it's safe to treat i32 load anyext as a normal i32 load. Ditto for i8 anyext load to i16. llvm-svn: 51019	2008-05-13 00:54:02 +00:00
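
Note on b1e66ce3bb (uint_to_fp to sint_to_fp): the rewrite is valid because an integer whose sign bit is known to be zero has the same value whether it is read as signed or unsigned, so both conversions produce the same float. The C sketch below only illustrates that equivalence; the function names are hypothetical and are not taken from the LLVM sources, and the shift by 23 simply mirrors the shrl $23 in the commit's assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned-to-float conversion (what the commit calls uint_to_fp). */
static float convert_as_unsigned(uint32_t x) {
    return (float)x;
}

/* Signed-to-float conversion (sint_to_fp). Only equivalent to the
 * unsigned conversion when the sign bit of x is zero, so that
 * precondition is checked here. */
static float convert_as_signed(uint32_t x) {
    assert((x >> 31) == 0);      /* sign bit must be known zero */
    return (float)(int32_t)x;    /* value is unchanged by the cast */
}

/* After a logical right shift by 23, the top 23 bits are zero, so the
 * sign bit is known zero and the signed conversion is safe to use. */
static float shifted_to_float(uint32_t x) {
    return convert_as_signed(x >> 23);
}
```

On x86 the signed form corresponds to a single cvtsi2ss, which is why the commit prefers it when the sign bit is provably clear.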


562 Commits