Commit Graph

602 Commits

Author SHA1 Message Date
Duncan Sands 53c954fa86 Output sinl for a long double FSIN node, not sin.
Likewise fix up a bunch of other libcalls.  While
there I remove NEG_F32 and NEG_F64 since they are
not used anywhere.  This fixes 9 Ada ACATS failures.

llvm-svn: 45833
2008-01-10 10:28:30 +00:00
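
A minimal C illustration of the libcall distinction (function name hypothetical): a long double sine must lower to sinl; calling sin would compute at only double precision.

#include <math.h>

/* FSIN on a long double must become a call to sinl, not sin. */
long double f(long double x) {
	return sinl(x);
}
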
Evan Cheng 0f8c7c4a73 Codegen improvement has reduced one spill.
llvm-svn: 45814
2008-01-10 02:54:40 +00:00
Chris Lattner e34d7d0e24 new testcase for PR1845
llvm-svn: 45795
2008-01-10 00:30:38 +00:00
Evan Cheng 0e400d4cb7 Special copy SUnits do not have SDNodes.
llvm-svn: 45787
2008-01-09 23:01:55 +00:00
Evan Cheng a31824a08e Fix sse2.psrl.w and sse2.psrl.q definitions.
llvm-svn: 45772
2008-01-09 02:16:44 +00:00
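
For reference, a hedged C sketch of what those intrinsics do, via the SSE2 builtins that correspond to psrlw and psrlq (the mapping is the usual one; treat it as illustrative):

#include <emmintrin.h>

/* Shift each 16-bit lane of v right by the count in c (psrlw). */
__m128i shr16(__m128i v, __m128i c) { return _mm_srl_epi16(v, c); }

/* Shift each 64-bit lane of v right by the count in c (psrlq). */
__m128i shr64(__m128i v, __m128i c) { return _mm_srl_epi64(v, c); }
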
Chris Lattner 51b01bf8a5 Make load->store deletion a bit smarter. This allows us to compile this:
void test(long long *P) { *P ^= 1; }

into just:

_test:
	movl	4(%esp), %eax
	xorl	$1, (%eax)
	ret

instead of code like this:

_test:
	movl	4(%esp), %ecx
	xorl	$1, (%ecx)
	movl	4(%ecx), %edx
	movl	%edx, 4(%ecx)
	ret

llvm-svn: 45762
2008-01-08 23:08:06 +00:00
Duncan Sands 7b1460cca4 Crashes llc when using Chris's new legalization logic.
llvm-svn: 45758
2008-01-08 21:51:53 +00:00
Chris Lattner b17db3afa8 remove darwin/i386 t-t
llvm-svn: 45743
2008-01-08 06:52:51 +00:00
Chris Lattner 89f36e6b21 Finally implement correct ordered comparisons for PPC, even though
the code generated is not wonderful.  This turns a miscompilation into
a code quality bug (noted in the ppc readme).  This fixes PR642, which
is over 2 years old (!).  Nate, please review this.

llvm-svn: 45742
2008-01-08 06:46:30 +00:00
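
The semantic point, sketched in C (example mine, not from the commit): an ordered comparison must be false whenever either operand is NaN, which is what C relational operators require.

/* a < b is an ordered comparison: false if a or b is NaN. An
   unordered less-than would instead be true in that case. */
int lt(double a, double b) {
	return a < b;
}
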
Nate Begeman d3d49df3f1 Update test to catch recent x86 insert regression and improvements
llvm-svn: 45705
2008-01-07 17:49:23 +00:00
Gordon Henriksen c7e991b7c3 Setting GlobalDirective in TargetAsmInfo by default rather than
providing a misleading facility. It's used once in the MIPS backend
and hardcoded as "\t.globl\t" everywhere else.

llvm-svn: 45676
2008-01-07 02:31:11 +00:00
Gordon Henriksen 6047b6e140 With this patch, the LowerGC transformation becomes the
ShadowStackCollector, which additionally has reduced overhead with
no sacrifice in portability.

Considering a function @fun with 8 loop-local roots,
ShadowStackCollector introduces the following overhead
(x86):

; shadowstack prologue
        movl    L_llvm_gc_root_chain$non_lazy_ptr, %eax
        movl    (%eax), %ecx
        movl    $___gc_fun, 20(%esp)
        movl    $0, 24(%esp)
        movl    $0, 28(%esp)
        movl    $0, 32(%esp)
        movl    $0, 36(%esp)
        movl    $0, 40(%esp)
        movl    $0, 44(%esp)
        movl    $0, 48(%esp)
        movl    $0, 52(%esp)
        movl    %ecx, 16(%esp)
        leal    16(%esp), %ecx
        movl    %ecx, (%eax)

; shadowstack loop overhead
        (none)

; shadowstack epilogue
        movl    48(%esp), %edx
        movl    %edx, (%ecx)

; shadowstack metadata
        .align  3
___gc_fun:                              # __gc_fun
        .long   8
        .space  4

In comparison to LowerGC:

; lowergc prologue
        movl    L_llvm_gc_root_chain$non_lazy_ptr, %eax
        movl    (%eax), %ecx
        movl    %ecx, 48(%esp)
        movl    $8, 52(%esp)
        movl    $0, 60(%esp)
        movl    $0, 56(%esp)
        movl    $0, 68(%esp)
        movl    $0, 64(%esp)
        movl    $0, 76(%esp)
        movl    $0, 72(%esp)
        movl    $0, 84(%esp)
        movl    $0, 80(%esp)
        movl    $0, 92(%esp)
        movl    $0, 88(%esp)
        movl    $0, 100(%esp)
        movl    $0, 96(%esp)
        movl    $0, 108(%esp)
        movl    $0, 104(%esp)
        movl    $0, 116(%esp)
        movl    $0, 112(%esp)

; lowergc loop overhead
        leal    44(%esp), %eax
        movl    %eax, 56(%esp)
        leal    40(%esp), %eax
        movl    %eax, 64(%esp)
        leal    36(%esp), %eax
        movl    %eax, 72(%esp)
        leal    32(%esp), %eax
        movl    %eax, 80(%esp)
        leal    28(%esp), %eax
        movl    %eax, 88(%esp)
        leal    24(%esp), %eax
        movl    %eax, 96(%esp)
        leal    20(%esp), %eax
        movl    %eax, 104(%esp)
        leal    16(%esp), %eax
        movl    %eax, 112(%esp)

; lowergc epilogue
        movl    48(%esp), %edx
        movl    %edx, (%ecx)

; lowergc metadata
        (none)

llvm-svn: 45670
2008-01-07 01:30:53 +00:00
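
For context, a C sketch of the runtime structures the shadow stack prologue and epilogue above manipulate (names follow the conventional shadow-stack scheme; details are illustrative, not from the commit):

/* Per-function metadata, like ___gc_fun above (.long 8 = 8 roots). */
struct FrameMap {
	int NumRoots;	/* roots in this function's frame */
	int NumMeta;	/* extra metadata entries; zero here (.space 4) */
};

/* One chain link; the prologue builds this in the frame and threads
   it onto the global chain, and the epilogue unlinks it. */
struct StackEntry {
	struct StackEntry *Next;	/* caller's entry */
	const struct FrameMap *Map;	/* this function's metadata */
	void *Roots[];			/* the roots, zeroed on entry */
};

extern struct StackEntry *llvm_gc_root_chain;

/* A collector can reach every live root by walking the chain. */
static void visit_roots(void (*visit)(void **slot)) {
	for (struct StackEntry *E = llvm_gc_root_chain; E; E = E->Next)
		for (int i = 0; i < E->Map->NumRoots; ++i)
			visit(&E->Roots[i]);
}
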
Chris Lattner 41e423a6f5 fix this to use a valid triple.
llvm-svn: 45509
2008-01-02 22:21:45 +00:00
Chris Lattner 5d998c5712 verify that aligned common support doesn't break.
llvm-svn: 45495
2008-01-02 19:48:24 +00:00
Duncan Sands 57a60f0466 Fix PR1833 - eh.exception and eh.selector return two
values, which means doing extra legalization work.
It would be easier to get this kind of thing right if
there was some documentation...

llvm-svn: 45472
2007-12-31 18:35:50 +00:00
Chris Lattner d2b8a36f0e One readme entry is done, one is really easy (Evan, want to investigate
eliminating the llvm.x86.sse2.loadl.pd intrinsic?), one shuffle optzn
may be done (if shufps is better than pinsw, Evan, please review), and
we already know about LICM of simple instructions.

llvm-svn: 45407
2007-12-29 19:31:47 +00:00
Chris Lattner 0d90c8f016 upgrade this test
llvm-svn: 45406
2007-12-29 19:24:06 +00:00
Chris Lattner 3b6a82118b Fold comparisons against a constant nan, and optimize ORD/UNORD
comparisons with a constant.  This allows us to compile isnan to:

_foo:
	fcmpu cr7, f1, f1
	mfcr r2
	rlwinm r3, r2, 0, 31, 31
	blr 

instead of:

LCPI1_0:					;  float
	.space	4
_foo:
	lis r2, ha16(LCPI1_0)
	lfs f0, lo16(LCPI1_0)(r2)
	fcmpu cr7, f1, f0
	mfcr r2
	rlwinm r3, r2, 0, 31, 31
	blr 

llvm-svn: 45405
2007-12-29 08:37:08 +00:00
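
A plausible source for _foo above (a reconstruction; the commit does not show it): isnan expressed as an unordered comparison against a constant, which now folds to comparing the value with itself, since only NaN is unordered with anything.

/* x is unordered with 0.0f exactly when x is NaN, so the compare
   against the constant-pool load folds to fcmpu of f1 with itself. */
int foo(float x) {
	return __builtin_isunordered(x, 0.0f);
}
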
Chris Lattner 33de0c6e92 this xform is implemented.
llvm-svn: 45404
2007-12-29 08:19:39 +00:00
Chris Lattner 07ccbfa64a Codegen this as:

_bar:
	pushl	%esi
	subl	$8, %esp
	movl	16(%esp), %esi
	call	L_foo$stub
	fstps	(%esi)
	addl	$8, %esp
	popl	%esi
	#FP_REG_KILL
	ret

instead of:

_bar:
	pushl	%esi
	subl	$8, %esp
	movl	16(%esp), %esi
	call	L_foo$stub
	fstpl	(%esi)
	cvtsd2ss	(%esi), %xmm0
	movss	%xmm0, (%esi)
	addl	$8, %esp
	popl	%esi
	#FP_REG_KILL
	ret

llvm-svn: 45401
2007-12-29 06:57:38 +00:00
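
A likely shape for the source of _bar above (reconstructed from the assembly, so treat it as an assumption): a double-returning call stored through a float pointer, so the truncating store can come straight off the x87 stack as fstps.

double foo(void);

/* The double result is rounded and stored in one fstps, instead of a
   double store, a cvtsd2ss reload, and a movss. */
void bar(float *P) {
	*P = foo();
}
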
Chris Lattner 8013bd339b avoid going through a stack slot to convert from fpstack to xmm reg
if we are just going to store it back anyway.  This improves things 
like:
double foo();
void bar(double *P) { *P = foo(); }

llvm-svn: 45399
2007-12-29 06:41:28 +00:00
Chris Lattner bc13df19a8 one fewer uncond branch with my codegenprepare hack for single-mbb backedges.
llvm-svn: 45360
2007-12-26 17:23:47 +00:00
Gordon Henriksen d89e645c38 Tests for changes made in r45356, where IPO optimizations would drop
collector algorithms.

llvm-svn: 45357
2007-12-26 02:47:37 +00:00
Gordon Henriksen b969c5981b GC poses hazards to the inliner. Consider:
    define void @f() {
            ...
            call i32 @g()
            ...
    }

    define void @g() {
            ...
    }

The hazards are:

  - @f and @g both have GC, but their GCs differ. Inlining is invalid. This
    may never occur.
  - @f has no GC, but @g does. @g's GC must be propagated to @f.

The other scenarios are safe:

  - @f and @g have the same GC.
  - @f and @g have no GC.
  - @g has no GC.

This patch adds inliner checks for the former two scenarios.

llvm-svn: 45351
2007-12-25 03:10:07 +00:00
Gordon Henriksen fb56bde933 Noting and enforcing that GC intrinsics are valid only within a
function with GC.

This will catch the error when the inliner inlines a function with
GC into a caller with no GC.

llvm-svn: 45350
2007-12-25 02:31:26 +00:00
Gordon Henriksen 9157c499fc Adjusting verification of "llvm.gc*" intrinsic prototypes to match
LangRef.

llvm-svn: 45349
2007-12-25 02:02:10 +00:00
Evan Cheng ddc9af11f0 Remove xfail. This is fixed.
llvm-svn: 45254
2007-12-20 02:25:21 +00:00
Scott Michel 5f1470f03a More working CellSPU tests:
- vec_const.ll: Vector constant loads
- immed64.ll: i64, f64 constant loads

llvm-svn: 45242
2007-12-20 00:44:13 +00:00
Scott Michel 5ecac82f71 CellSPU testcase, extract_elt.ll: extract vector element.
llvm-svn: 45219
2007-12-19 21:17:42 +00:00
Scott Michel a246e09aa0 More working CellSPU test cases:
- call.ll: Function call
- ctpop.ll: Count population
- dp_farith.ll: DP arithmetic
- eqv.ll: Equivalence primitives
- fcmp.ll: SP comparisons
- fdiv.ll: SP division
- fneg-fabs.ll: SP negation, absolute value
- int2fp.ll: Integer -> SP conversion
- rotate_ops.ll: Rotation primitives
- select_bits.ll: (a & c) | (b & ~c) bit selection
- shift_ops.ll: Shift primitives
- sp_farith.ll: SP arithmetic

llvm-svn: 45217
2007-12-19 20:50:49 +00:00
Scott Michel 098c113bc8 Two more test cases: or_ops.ll (arithmetic OR operations) and vecinsert.ll
(vector insertions)

llvm-svn: 45216
2007-12-19 20:15:47 +00:00
Scott Michel 9b834469e0 Add new immed16.ll test case, fix CellSPU errata to make test case work.
llvm-svn: 45196
2007-12-19 07:35:06 +00:00
Evan Cheng 483a969ece Fix PR1872: SrcValue and SrcValueOffset should not be used to compute load / store node id.
llvm-svn: 45167
2007-12-18 19:38:14 +00:00
Evan Cheng 91e0fc9cb4 FIX for PR1799: When a load is unfolded from an instruction, check if it is a new node. If not, do not create a new SUnit.
llvm-svn: 45157
2007-12-18 08:42:10 +00:00
Scott Michel 8172f85e2f i32 immediate constant test case for CellSPU
llvm-svn: 45134
2007-12-17 23:45:52 +00:00
Scott Michel c5cccb9e60 - Restore some i8 functionality in CellSPU
- New test case: nand.ll

llvm-svn: 45130
2007-12-17 22:32:34 +00:00
Duncan Sands b5a79d0eaa Make invokes of inline asm legal. Teach codegen
how to lower them (with no attempt made to be
efficient, since they should only occur for
unoptimized code).

llvm-svn: 45108
2007-12-17 18:08:19 +00:00
Evan Cheng 23d2d4dc6c Make better use of instructions that clear high bits; fix various 2-wide shuffle bugs.
llvm-svn: 45058
2007-12-15 03:00:47 +00:00
Scott Michel 0aa7133f82 Start committing working test cases for CellSPU.
llvm-svn: 45050
2007-12-15 00:38:50 +00:00
Evan Cheng 0e6408124e Fix ctlz and cttz. The LLVM definition requires them to return the number of bits of the src type when the value is zero.
llvm-svn: 45029
2007-12-14 08:30:15 +00:00
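
The zero-input semantics in a hedged C sketch (guards mine): the intrinsics must yield the bit width for zero, while the bsr/bsf lowering in the next entry leaves the result undefined on zero, hence the explicit handling.

/* llvm.ctlz.i32(0) and llvm.cttz.i32(0) must return 32. The GCC
   builtins (and bsr/bsf) are undefined for a zero input. */
unsigned ctlz32(unsigned x) { return x ? __builtin_clz(x) : 32u; }
unsigned cttz32(unsigned x) { return x ? __builtin_ctz(x) : 32u; }
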
Evan Cheng e9fbc3f014 Implement ctlz and cttz with bsr and bsf.
llvm-svn: 45024
2007-12-14 02:13:44 +00:00
Evan Cheng 37c36ed79a Be extra careful with extension use optimization. Now turned on by default.
llvm-svn: 44981
2007-12-13 03:32:53 +00:00
Evan Cheng 827d30db19 Fold some and + shift in x86 addressing mode.
llvm-svn: 44970
2007-12-13 00:43:27 +00:00
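
A guess at the kind of pattern this fold targets (illustrative only, not taken from the commit): a masked, shifted index that can be folded into x86's scaled addressing instead of being computed separately.

/* The (x >> 4) & 0xff index can feed the address computation directly
   once the and and shift are folded into the addressing mode. */
int load_entry(const int *table, unsigned x) {
	return table[(x >> 4) & 0xff];
}
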
Evan Cheng 6e68381e02 Implicit def instructions, e.g. X86::IMPLICIT_DEF_GR32, are always re-materializable and they should not be spilled.
llvm-svn: 44960
2007-12-12 23:12:09 +00:00
Dan Gohman 7a7742c2fe Allow vector integer constants to be created with
SelectionDAG::getConstant, in the same way as vector floating-point
constants. This allows the legalize expansion code for @llvm.ctpop and
friends to be usable with vector types.

llvm-svn: 44954
2007-12-12 22:21:26 +00:00
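
What ctpop means element-wise, in a small C model (a scalar loop standing in for the <4 x i32> expansion):

/* @llvm.ctpop on a vector is a per-lane population count; with vector
   integer constants available, legalize can expand it for vector types. */
void ctpop_v4i32(unsigned out[4], const unsigned in[4]) {
	for (int i = 0; i < 4; ++i)
		out[i] = __builtin_popcount(in[i]);
}
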
Evan Cheng 0f42730722 Use shuffles to implement insert_vector_elt for i32, i64, f32, and f64.
llvm-svn: 44929
2007-12-12 07:55:34 +00:00
Evan Cheng 0a1254f634 Add a test case for -optimize-ext-uses.
llvm-svn: 44928
2007-12-12 07:54:08 +00:00
Evan Cheng 2a98956796 Lower a build_vector with all constants into a constpool load unless it can be done with a move to low part.
llvm-svn: 44921
2007-12-12 06:45:40 +00:00
Evan Cheng 4fbf459549 - Improved v8i16 shuffle lowering. It now uses pshuflw and pshufhw as much as
possible before resorting to pextrw and pinsrw.
- Better codegen for v4i32 shuffles masquerading as v8i16 or v16i8 shuffles.
- Improves (i16 extract_vector_element 0) codegen by recognizing
  (i32 extract_vector_element 0) does not require a pextrw.

llvm-svn: 44836
2007-12-11 01:46:18 +00:00
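
The lane-0 observation in C (intrinsic choice illustrative): element 0 can be moved out with a plain movd-style transfer, so the i16 extract can reuse the i32 path and skip pextrw.

#include <emmintrin.h>

/* Lane 0 comes out via movd; truncating the i32 to 16 bits afterwards
   is free, so no pextrw is needed for element 0. */
short lane0_i16(__m128i v) {
	return (short)_mm_cvtsi128_si32(v);
}
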
Christopher Lamb d202e03fe5 Improve branch folding by recognizing that explicit successor relationships impact the value of fall-through choices.
llvm-svn: 44785
2007-12-10 07:24:06 +00:00