Commit Graph

1169 Commits

Author SHA1 Message Date
Chris Lattner 84b49d51be Fix CodeGen/Generic/2006-04-28-Sign-extend-bool.ll
llvm-svn: 28017
2006-04-28 21:56:10 +00:00
Nate Begeman 4ca2ea5b43 JumpTable support! This adds working asm and JIT support on
x86 and ppc for 100% dense switch statements when relocations are non-PIC.
This support will be extended and enhanced in the coming days to support
PIC, and less dense forms of jump tables.
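
(Illustrative sketch, not part of the original commit message.)  A 100% dense
switch is one whose case values cover a contiguous range with no holes, e.g.:

int classify(int x) {
  switch (x) {               /* cases 0..3: fully dense */
  case 0: return 10;
  case 1: return 20;
  case 2: return 30;
  case 3: return 40;
  default: return -1;
  }
}

With non-PIC relocations the selector can emit one range check plus an indirect
branch through a table of block addresses, instead of a compare-and-branch chain.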

llvm-svn: 27947
2006-04-22 18:53:45 +00:00
Chris Lattner 518834c67e Fix a crash on:
void foo2(vector float *A, vector float *B) {
  vector float C = (vector float)vec_cmpeq(*A, *B);
  if (!vec_any_eq(*A, *B))
    *B = (vector float){0,0,0,0};
  *A = C;
}

llvm-svn: 27808
2006-04-18 18:28:22 +00:00
Chris Lattner 1e174c87c3 pretty print node name
llvm-svn: 27806
2006-04-18 18:05:58 +00:00
Chris Lattner 9754d142a4 Implement an important entry from README_ALTIVEC:
If an altivec predicate compare is used immediately by a branch, don't
use a (serializing) MFCR instruction to read the CR6 register, which requires
a compare to get it back to CR's.  Instead, just branch on CR6 directly. :)

For example, for:
void foo2(vector float *A, vector float *B) {
  if (!vec_any_eq(*A, *B))
    *B = (vector float){0,0,0,0};
}

We now generate:

_foo2:
        mfspr r2, 256
        oris r5, r2, 12288
        mtspr 256, r5
        lvx v2, 0, r4
        lvx v3, 0, r3
        vcmpeqfp. v2, v3, v2
        bne cr6, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
        vxor v2, v2, v2
        stvx v2, 0, r4
        mtspr 256, r2
        blr
LBB1_2: ; UnifiedReturnBlock
        mtspr 256, r2
        blr

instead of:

_foo2:
        mfspr r2, 256
        oris r5, r2, 12288
        mtspr 256, r5
        lvx v2, 0, r4
        lvx v3, 0, r3
        vcmpeqfp. v2, v3, v2
        mfcr r3, 2
        rlwinm r3, r3, 27, 31, 31
        cmpwi cr0, r3, 0
        beq cr0, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
        vxor v2, v2, v2
        stvx v2, 0, r4
        mtspr 256, r2
        blr
LBB1_2: ; UnifiedReturnBlock
        mtspr 256, r2
        blr

This implements CodeGen/PowerPC/vec_br_cmp.ll.

llvm-svn: 27804
2006-04-18 17:59:36 +00:00
Chris Lattner 96d50487c9 Use vmladduhm to do v8i16 multiplies which is faster and simpler than doing
even/odd halves.  Thanks to Nate telling me what's what.

llvm-svn: 27793
2006-04-18 04:28:57 +00:00
Chris Lattner d6d82aa889 Implement v16i8 multiply with this code:
        vmuloub v5, v3, v2
        vmuleub v2, v3, v2
        vperm v2, v2, v5, v4

This implements CodeGen/PowerPC/vec_mul.ll.  With this, v16i8 multiplies are
6.79x faster than before.

Overall, UnitTests/Vector/multiplies.c is now 2.45x faster with LLVM than with
GCC.

Remove the 'integer multiplies' todo from the README file.

llvm-svn: 27792
2006-04-18 03:57:35 +00:00
Chris Lattner 7e439874cb Lower v8i16 multiply into this code:
        li r5, lo16(LCPI1_0)
        lis r6, ha16(LCPI1_0)
        lvx v4, r6, r5
        vmulouh v5, v3, v2
        vmuleuh v2, v3, v2
        vperm v2, v2, v5, v4

where v4 is:
LCPI1_0:                                        ;  <16 x ubyte>
        .byte   2
        .byte   3
        .byte   18
        .byte   19
        .byte   6
        .byte   7
        .byte   22
        .byte   23
        .byte   10
        .byte   11
        .byte   26
        .byte   27
        .byte   14
        .byte   15
        .byte   30
        .byte   31

This is 5.07x faster on the G5 (measured) than lowering to scalar code +
loads/stores.

llvm-svn: 27789
2006-04-18 03:43:48 +00:00
Chris Lattner a2cae1bb10 Custom lower v4i32 multiplies into a cute sequence, instead of having legalize
scalarize the sequence into 4 mullw's and a bunch of load/store traffic.

This speeds up v4i32 multiplies 4.1x (measured) on a G5.  This implements
PowerPC/vec_mul.ll

llvm-svn: 27788
2006-04-18 03:24:30 +00:00
Chris Lattner e54133cfba Make sure to check splats of every constant we can, handle splat(31) by
being a bit more clever, add support for odd splats from -31 to -17.

llvm-svn: 27764
2006-04-17 18:09:22 +00:00
Chris Lattner 264c908e3a Teach the ppc backend to use rol and vsldoi to generate splatted constants.
This implements vec_constants.ll:test_vsldoi and test_rol

llvm-svn: 27760
2006-04-17 17:55:10 +00:00
Chris Lattner 1b3806ace5 Make some code more general, adding support for constant formation of several
new patterns.

llvm-svn: 27754
2006-04-17 06:58:41 +00:00
Chris Lattner f8dd76df5b Learn how to make odd splatted constants in range [17,29]. This implements
PowerPC/vec_constants.ll:test_29.

llvm-svn: 27752
2006-04-17 06:07:44 +00:00
Chris Lattner 2a099c04c1 Pull some code out into a helper function.
Efficiently codegen even splats in the range [-32,30].

This allows us to codegen <30,30,30,30> as:

        vspltisw v0, 15
        vadduwm v2, v0, v0

instead of as a cp load.

llvm-svn: 27750
2006-04-17 06:00:21 +00:00
Chris Lattner 071ad01ceb Implement a TODO: for any shuffle that can be viewed as a v4[if]32 shuffle,
if it can be implemented in 3 or fewer discrete altivec instructions, codegen
it as such.  This implements Regression/CodeGen/PowerPC/vec_perf_shuffle.ll

llvm-svn: 27748
2006-04-17 05:28:54 +00:00
Chris Lattner 06a21ba96b Implement a TODO: have the legalizer canonicalize a bunch of operations to
one type (v4i32) so that we don't have to write patterns for each type, and
so that more CSE opportunities are exposed.

llvm-svn: 27731
2006-04-16 01:37:57 +00:00
Chris Lattner fa5aa396c2 Make the BUILD_VECTOR lowering code much more aggressive w.r.t constant vectors.
Remove some done items from the todo list.

llvm-svn: 27729
2006-04-16 01:01:29 +00:00
Chris Lattner 24acbe46c0 Fix a crash when faced with a shuffle vector that has an undef in its mask.
llvm-svn: 27726
2006-04-15 23:48:05 +00:00
Chris Lattner 559c8ba466 Allow undef in a shuffle mask
llvm-svn: 27714
2006-04-14 23:19:08 +00:00
Chris Lattner 4211ca9108 Move the rest of the PPCTargetLowering::LowerOperation cases out into
separate functions, for simplicity and code clarity.

llvm-svn: 27693
2006-04-14 06:01:58 +00:00
Chris Lattner 19e9055eb5 Pull the VECTOR_SHUFFLE and BUILD_VECTOR lowering code out into separate
functions, which makes the code much cleaner :)

llvm-svn: 27692
2006-04-14 05:19:18 +00:00
Chris Lattner 883fb053bd Force non-darwin targets to use a static relo model. This fixes PR734,
tested by CodeGen/Generic/vector.ll

llvm-svn: 27657
2006-04-13 17:10:48 +00:00
Chris Lattner 147e50e1c5 Add a new way to match vector constants, which makes it easier to bang bits of
different types.

Codegen spltw(0x7FFFFFFF) and spltw(0x80000000) without a constant pool load,
implementing PowerPC/vec_constants.ll:test1.  This compiles:

typedef float vf __attribute__ ((vector_size (16)));
typedef int vi __attribute__ ((vector_size (16)));
void test(vi *P1, vi *P2, vf *P3) {
  *P1 &= (vi){0x80000000,0x80000000,0x80000000,0x80000000};
  *P2 &= (vi){0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF};
  *P3 = vec_abs((vector float)*P3);
}

to:

_test:
        mfspr r2, 256
        oris r6, r2, 49152
        mtspr 256, r6
        vspltisw v0, -1
        vslw v0, v0, v0
        lvx v1, 0, r3
        vand v1, v1, v0
        stvx v1, 0, r3
        lvx v1, 0, r4
        vandc v1, v1, v0
        stvx v1, 0, r4
        lvx v1, 0, r5
        vandc v0, v1, v0
        stvx v0, 0, r5
        mtspr 256, r2
        blr

instead of (with two constant pool entries):

_test:
        mfspr r2, 256
        oris r6, r2, 49152
        mtspr 256, r6
        li r6, lo16(LCPI1_0)
        lis r7, ha16(LCPI1_0)
        li r8, lo16(LCPI1_1)
        lis r9, ha16(LCPI1_1)
        lvx v0, r7, r6
        lvx v1, 0, r3
        vand v0, v1, v0
        stvx v0, 0, r3
        lvx v0, r9, r8
        lvx v1, 0, r4
        vand v1, v1, v0
        stvx v1, 0, r4
        lvx v1, 0, r5
        vand v0, v1, v0
        stvx v0, 0, r5
        mtspr 256, r2
        blr

GCC produces (with 2 cp entries):

_test:
        mfspr r0,256
        stw r0,-4(r1)
        oris r0,r0,0xc00c
        mtspr 256,r0
        lis r2,ha16(LC0)
        lis r9,ha16(LC1)
        la r2,lo16(LC0)(r2)
        lvx v0,0,r3
        lvx v1,0,r5
        la r9,lo16(LC1)(r9)
        lwz r12,-4(r1)
        lvx v12,0,r2
        lvx v13,0,r9
        vand v0,v0,v12
        stvx v0,0,r3
        vspltisw v0,-1
        vslw v12,v0,v0
        vandc v1,v1,v12
        stvx v1,0,r5
        lvx v0,0,r4
        vand v0,v0,v13
        stvx v0,0,r4
        mtspr 256,r12
        blr

llvm-svn: 27624
2006-04-12 19:07:14 +00:00
Chris Lattner 74cf9ff761 Rename get_VSPLI_elt -> get_VSPLTI_elt
Canonicalize BUILD_VECTOR's that match VSPLTI's into a single type for each
form, eliminating a bunch of Pat patterns in the .td file and allowing us to
CSE stuff more aggressively.  This implements
PowerPC/buildvec_canonicalize.ll:VSPLTI

llvm-svn: 27614
2006-04-12 17:37:20 +00:00
Chris Lattner e318a7574e Ensure that zero vectors are always v4i32, which forces them to CSE with
each other.  This implements CodeGen/PowerPC/vxor-canonicalize.ll

llvm-svn: 27609
2006-04-12 16:53:28 +00:00
Chris Lattner e4db08a2f1 Vector function results go into V2 according to GCC. The darwin ABI doc
doesn't say where they go :-/

llvm-svn: 27579
2006-04-11 01:38:39 +00:00
Chris Lattner 92533cfb4a Move some return-handling code from lowerarguments to the ISD::RET handling stuff.
No functionality change.

llvm-svn: 27577
2006-04-11 01:21:43 +00:00
Chris Lattner 3a68f3c3ca properly mark vector selects as expanded to select_cc
llvm-svn: 27544
2006-04-08 22:59:15 +00:00
Chris Lattner 0a3d1bbca4 Add VRRC select support
llvm-svn: 27543
2006-04-08 22:45:08 +00:00
Chris Lattner d9e80f4516 Implement CodeGen/PowerPC/vec_splat.ll:spltish to use vspltish instead of a
constant pool load.

llvm-svn: 27538
2006-04-08 07:14:26 +00:00
Chris Lattner d71a1f946d Change the interface to the predicate that determines if vsplti* can be used.
No functionality changes.

llvm-svn: 27536
2006-04-08 06:46:53 +00:00
Chris Lattner 466841ddc7 Make sure to return the result in the right type.
llvm-svn: 27469
2006-04-06 23:12:19 +00:00
Chris Lattner a4bbfaed5c Match vpku[hw]um(x,x).
Convert vsldoi(x,x) to work the same way other (x,x) cases work.

llvm-svn: 27467
2006-04-06 22:28:36 +00:00
Chris Lattner f38e033270 Add support for matching vmrg(x,x) patterns
llvm-svn: 27463
2006-04-06 22:02:42 +00:00
Chris Lattner d1dcb52093 Pattern match vmrg* instructions, which are now lowered by the CFE into shuffles.
llvm-svn: 27457
2006-04-06 21:11:54 +00:00
Chris Lattner 1d33819194 Support pattern matching vsldoi(x,y) and vsldoi(x,x), which allows the front-end to
lower it and LLVM to have one fewer intrinsic.  This implements
CodeGen/PowerPC/vec_shuffle.ll

llvm-svn: 27450
2006-04-06 18:26:28 +00:00
Chris Lattner e8b83b4206 Compile the vpkuhum/vpkuwum intrinsics into vpkuhum/vpkuwum instead of into
vperm with a perm mask lvx'd from the constant pool.

llvm-svn: 27448
2006-04-06 17:23:16 +00:00
Chris Lattner 39cc717c65 Fix CodeGen/PowerPC/2006-04-05-splat-ish.ll
llvm-svn: 27439
2006-04-05 17:39:25 +00:00
Evan Cheng 2cf4232ced Fallthrough to expand if a VECTOR_SHUFFLE cannot be custom lowered.
llvm-svn: 27433
2006-04-05 06:09:26 +00:00
Chris Lattner 4a744e5c9d Fix some broken logic that would cause us to codegen {2147483647,2147483647,2147483647,2147483647} as 'vspltisb v0, -1'.
llvm-svn: 27413
2006-04-04 22:28:35 +00:00
Chris Lattner 95c7adc7cb Ask legalize to promote all vector shuffles to be v16i8 instead of having to
handle all 4 PPC vector types.   This simplifies the matching code and allows
us to eliminate a bunch of patterns.  This also adds cases we were missing,
such as CodeGen/PowerPC/vec_splat.ll:splat_h.

llvm-svn: 27400
2006-04-04 17:25:31 +00:00
Chris Lattner 447a7968af Revert accidentally committed hunks.
llvm-svn: 27386
2006-04-03 23:58:04 +00:00
Chris Lattner 533aed9a35 Make sure to mark unsupported SCALAR_TO_VECTOR operations as expand.
llvm-svn: 27385
2006-04-03 23:55:43 +00:00
Chris Lattner c5287c0ece Inform the dag combiner that the predicate compares only return a low bit.
llvm-svn: 27359
2006-04-02 06:26:07 +00:00
Chris Lattner 9b2d6e7886 Custom lower all BUILD_VECTOR's so that we can compile vec_splat_u8(8) into
"vspltisb v0, 8" instead of a constant pool load.

llvm-svn: 27335
2006-04-02 00:43:36 +00:00
Chris Lattner baa73e0d91 Rearrange code a bit
llvm-svn: 27306
2006-03-31 19:52:36 +00:00
Chris Lattner 754b41c84b Add, sub and shuffle are legal for all vector types
llvm-svn: 27305
2006-03-31 19:48:58 +00:00
Chris Lattner 829a061abf note to self: *save* file, then check it in
llvm-svn: 27291
2006-03-31 06:04:53 +00:00
Chris Lattner d4058a59d4 Implement an item from the readme, folding vcmp/vcmp. instructions with
identical instructions into a single instruction.  For example, for:

void test(vector float *x, vector float *y, int *P) {
  int v = vec_any_out(*x, *y);
  *x = (vector float)vec_cmpb(*x, *y);
  *P = v;
}

we now generate:

_test:
        mfspr r2, 256
        oris r6, r2, 49152
        mtspr 256, r6
        lvx v0, 0, r4
        lvx v1, 0, r3
        vcmpbfp. v0, v1, v0
        mfcr r4, 2
        stvx v0, 0, r3
        rlwinm r3, r4, 27, 31, 31
        xori r3, r3, 1
        stw r3, 0(r5)
        mtspr 256, r2
        blr

instead of:

_test:
        mfspr r2, 256
        oris r6, r2, 57344
        mtspr 256, r6
        lvx v0, 0, r4
        lvx v1, 0, r3
        vcmpbfp. v2, v1, v0
        mfcr r4, 2
***     vcmpbfp v0, v1, v0
        rlwinm r4, r4, 27, 31, 31
        stvx v0, 0, r3
        xori r3, r4, 1
        stw r3, 0(r5)
        mtspr 256, r2
        blr

Testcase here: CodeGen/PowerPC/vcmp-fold.ll

llvm-svn: 27290
2006-03-31 06:02:07 +00:00
Chris Lattner d7495ae7e9 Lower vector compares to VCMP nodes, just like we lower vector comparison
predicates to VCMPo nodes.

llvm-svn: 27285
2006-03-31 05:13:27 +00:00
Chris Lattner bca5fbe914 Mark INSERT_VECTOR_ELT as expand
llvm-svn: 27276
2006-03-31 01:48:55 +00:00
Nate Begeman 1b3928765d Add a few more altivec intrinsics
llvm-svn: 27215
2006-03-28 04:15:58 +00:00
Chris Lattner cb5ec07cc3 Use normal lvx for scalar_to_vector instead of lve*x. They do the exact
same thing and we have a dag node for the former.

llvm-svn: 27205
2006-03-28 01:43:22 +00:00
Chris Lattner e55d171ccd Tblgen doesn't like multiple SDNode<> definitions that map to the same enum value. Split them into separate enums.
llvm-svn: 27201
2006-03-28 00:40:33 +00:00
Nate Begeman ed728c1291 SelectionDAGISel can now natively handle Switch instructions, in the same
manner that the LowerSwitch LLVM to LLVM pass does: emitting a binary
search tree of basic blocks.  The new approach has several advantages:
it is faster, it generates significantly smaller code in many cases, and
it paves the way for implementing dense switch tables as a jump table by
handling switches directly in the instruction selector.

This functionality is currently only enabled on x86, but should be safe for
every target.  In anticipation of making it the default, the cfg is now
properly updated in the x86, ppc, and sparc select lowering code.

llvm-svn: 27156
2006-03-27 01:32:24 +00:00
Chris Lattner 6961fc76bb Codegen vector predicate compares.
llvm-svn: 27151
2006-03-26 10:06:40 +00:00
Evan Cheng b1ddc988af Remove PPC:isZeroVector, use ISD::isBuildVectorAllZeros instead
llvm-svn: 27149
2006-03-26 09:52:32 +00:00
Chris Lattner 1cb91b3cd9 Add some basic patterns for other datatypes
llvm-svn: 27116
2006-03-25 07:39:07 +00:00
Chris Lattner 2771e2c960 Codegen things like:
<int -1, int -1, int -1, int -1>
and
 <int 65537, int 65537, int 65537, int 65537>

Using things like:
  vspltisb v0, -1
and:
  vspltish v0, 1

instead of using constant pool loads.

This implements CodeGen/PowerPC/vec_splat.ll:splat_imm_i{32|16}.

llvm-svn: 27106
2006-03-25 06:12:06 +00:00
Chris Lattner a90b7141ed Disable the i32->float G5 optimization. It is unsafe, as documented in the
comment.

This fixes 177.mesa, and McCat/09-vor with the td scheduler.

llvm-svn: 27060
2006-03-24 07:53:47 +00:00
Chris Lattner ab882abce8 add support for using vxor to build zero vectors. This implements
Regression/CodeGen/PowerPC/vec_zero.ll

llvm-svn: 27059
2006-03-24 07:48:08 +00:00
Chris Lattner 4a66d69433 When possible, custom lower 32-bit SINT_TO_FP to this:
_foo2:
        extsw r2, r3
        std r2, -8(r1)
        lfd f0, -8(r1)
        fcfid f0, f0
        frsp f1, f0
        blr

instead of this:

_foo2:
        lis r2, ha16(LCPI2_0)
        lis r4, 17200
        xoris r3, r3, 32768
        stw r3, -4(r1)
        stw r4, -8(r1)
        lfs f0, lo16(LCPI2_0)(r2)
        lfd f1, -8(r1)
        fsub f0, f1, f0
        frsp f1, f0
        blr

This speeds up Misc/pi from 2.44s->2.09s with LLC and from 3.01->2.18s
with llcbeta (16.7% and 38.1% respectively).

llvm-svn: 26943
2006-03-22 05:30:33 +00:00
Chris Lattner 00f4683bf6 These targets don't support EXTRACT_VECTOR_ELT, though, in time, X86 will.
llvm-svn: 26930
2006-03-21 20:51:05 +00:00
Chris Lattner 6d74b09da7 remove dead variable
llvm-svn: 26907
2006-03-20 22:37:23 +00:00
Chris Lattner a1bc294f0c Fix a couple of bugs in permute/splat generation, thanks to Nate for actually
figuring these out! :)

llvm-svn: 26904
2006-03-20 18:26:51 +00:00
Chris Lattner a9a1313386 Add support for generating vspltw, instead of a vperm instruction with a
constant pool load.  This generates significantly nicer code for splats.

When tblgen gets bugfixed, we can remove the custom selection code.
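
(Rough example, not from the commit.)  The kind of splat this covers is:

#include <altivec.h>

vector float splat_elt0(vector float a) {
  return vec_splat(a, 0);    /* can now be a single vspltw instead of
                                vperm + a constant pool mask load */
}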

llvm-svn: 26898
2006-03-20 06:51:10 +00:00
Chris Lattner a8fbb6dd3d Implement PPC::isSplatShuffleMask and PPC::getVSPLTImmediate.
llvm-svn: 26897
2006-03-20 06:37:44 +00:00
Chris Lattner ffc475689b fix duplicate definition errors
llvm-svn: 26896
2006-03-20 06:33:01 +00:00
Chris Lattner a8713b1ee6 Custom lower arbitrary VECTOR_SHUFFLE's to VPERM.
TODO: leave specific ones as VECTOR_SHUFFLE's and turn them into specialized
operations like vsplt*

llvm-svn: 26887
2006-03-20 01:53:53 +00:00
Chris Lattner 7e9440a4fc Custom lower SCALAR_TO_VECTOR into lve*x.
llvm-svn: 26868
2006-03-19 06:55:52 +00:00
Chris Lattner b1ee9c7e24 PPC doesn't have SCALAR_TO_VECTOR
llvm-svn: 26865
2006-03-19 06:17:19 +00:00
Chris Lattner f7b6e7212f rename these nodes
llvm-svn: 26848
2006-03-19 01:13:28 +00:00
Nate Begeman bb01d4f272 Remove BRTWOWAY*
Make the PPC backend not dependent on BRTWOWAY_CC and make the branch
selector smarter about the code it generates, fixing a case in the
readme.

llvm-svn: 26814
2006-03-17 01:40:33 +00:00
Evan Cheng 2dd2c652b2 Added getTargetLowering() to TargetMachine. Refactored targets to support this.
llvm-svn: 26742
2006-03-13 23:20:37 +00:00
Chris Lattner 9c7f50376a Copysign needs to be expanded everywhere. Note that Alpha and IA64 should
implement copysign as a native op if they have it.

llvm-svn: 26541
2006-03-05 05:08:37 +00:00
Chris Lattner 27f5345b1f Compile this:
void foo(float a, int *b) { *b = a; }

to this:

_foo:
        fctiwz f0, f1
        stfiwx f0, 0, r4
        blr

instead of this:

_foo:
        fctiwz f0, f1
        stfd f0, -8(r1)
        lwz r2, -4(r1)
        stw r2, 0(r4)
        blr

This implements CodeGen/PowerPC/stfiwx.ll, and also incidentally does the
right thing for GCC bugzilla 26505.

llvm-svn: 26447
2006-03-01 05:50:56 +00:00
Chris Lattner f418435819 Use a target-specific dag-combine to implement CodeGen/PowerPC/fp-int-fp.ll.
llvm-svn: 26445
2006-03-01 04:57:39 +00:00
Evan Cheng 1926427351 Vector op lowering.
llvm-svn: 26438
2006-03-01 01:11:20 +00:00
Evan Cheng 73136dfecc - Added option -relocation-model to set the relocation model. Valid values
include static, pic, dynamic-no-pic, and default. The PPC and x86 default is
dynamic-no-pic for Darwin and pic for other targets.
- Removed options -enable-pic and -ppc-static.

llvm-svn: 26315
2006-02-22 20:19:42 +00:00
Chris Lattner 7ad77dfc2a split register class handling from explicit physreg handling.
llvm-svn: 26308
2006-02-22 00:56:39 +00:00
Chris Lattner 7bb4696dc3 Updates to match change of getRegForInlineAsmConstraint prototype
llvm-svn: 26305
2006-02-21 23:11:00 +00:00
Evan Cheng 5f99760ae7 Moved PICEnabled to include/llvm/Target/TargetOptions.h
llvm-svn: 26272
2006-02-18 00:08:58 +00:00
Chris Lattner 3a0ad47b39 Switch to using getCALLSEQ_START instead of using our own creation calls
llvm-svn: 26142
2006-02-13 08:55:29 +00:00
Chris Lattner 203b2f1288 Implement getConstraintType for PPC.
llvm-svn: 26042
2006-02-07 20:16:30 +00:00
Chris Lattner 15a6c4c444 Add the simple PPC integer constraints
llvm-svn: 26027
2006-02-07 00:47:13 +00:00
Nate Begeman 7e7f439f85 Fix some of the stuff in the PPC README file, and clean up legalization
of the SELECT_CC, BR_CC, and BRTWOWAY_CC nodes.

llvm-svn: 25875
2006-02-01 07:19:44 +00:00
Evan Cheng 32be2dc0af Allow the specification of explicit alignments for constant pool entries.
llvm-svn: 25855
2006-01-31 22:23:14 +00:00
Chris Lattner 0151361d21 add info about the inline asm register constraints for PPC
llvm-svn: 25853
2006-01-31 19:20:21 +00:00
Nate Begeman a162f208ee Codegen
bool %test(int %X) {
  %Y = seteq int %X, 13
  ret bool %Y
}

as

_test:
        addi r2, r3, -13
        cntlzw r2, r2
        srwi r3, r2, 5
        blr

rather than

_test:
        cmpwi cr7, r3, 13
        mfcr r2
        rlwinm r3, r2, 31, 31, 31
        blr

This has very little effect on most code, but speeds up analyzer 23% and
mason 11%

llvm-svn: 25848
2006-01-31 08:17:29 +00:00
Chris Lattner 32058cfb7b Functions that are lazily streamed in from the .bc file are *not* external.
This fixes llvm-test/SingleSource/UnitTests/2006-01-29-SimpleIndirectCall.c
and PR704

llvm-svn: 25793
2006-01-29 20:49:17 +00:00
Chris Lattner 3072af4d4f Now that OpActions is big enough, we can specify actions for vector types
llvm-svn: 25784
2006-01-29 08:41:37 +00:00
Chris Lattner d7738e6b32 disable this for now
llvm-svn: 25778
2006-01-29 07:31:33 +00:00
Chris Lattner d33c60b52b Request expansion of ConstantVec nodes.
llvm-svn: 25773
2006-01-29 06:32:58 +00:00
Chris Lattner 61c9a8e942 Targets all now request ConstantFP to be legalized into TargetConstantFP.
'fpimm' in .td files is now TargetConstantFP.

llvm-svn: 25771
2006-01-29 06:26:08 +00:00
Chris Lattner 30432e07f0 Fix a bug in my elimination of ISD::CALL this morning. PPC now has to
provide the expansion for i64 calls itself

llvm-svn: 25735
2006-01-28 07:33:03 +00:00
Chris Lattner f424a66524 Use PPCISD::CALL instead of ISD::CALL
llvm-svn: 25717
2006-01-27 23:34:02 +00:00
Chris Lattner 4d967a4cbb Make llvm.frame/returnaddr not crash on ppc
llvm-svn: 25710
2006-01-27 22:25:06 +00:00
Nate Begeman 8c47c3a3b1 Remove TLI.LowerReturnTo, and just let targets custom lower ISD::RET for
the same functionality.  This addresses another piece of bug 680.  Next,
on to fixing Alpha VAARG, which I broke last time.

llvm-svn: 25696
2006-01-27 21:09:22 +00:00
Evan Cheng 030e002fb9 Set SchedulingForLatency to be the default scheduling preference for all.
llvm-svn: 25607
2006-01-25 18:52:42 +00:00
Nate Begeman e74795cd70 First part of bug 680:
Remove TLI.LowerVA* and replace it with SDNodes that are lowered the same
way as everything else.

llvm-svn: 25606
2006-01-25 18:21:52 +00:00
Evan Cheng 1092a02619 Default scheduling preference is SchedulingForLatency.
llvm-svn: 25603
2006-01-25 09:15:54 +00:00
Chris Lattner ce5066c863 Don't assert on 'select_cc SETUO'
llvm-svn: 25423
2006-01-18 19:42:35 +00:00
Chris Lattner 5bd514d7b0 Use the default impl of DYNAMIC_STACKALLOC, allowing us to delete some code.
llvm-svn: 25334
2006-01-15 09:02:48 +00:00
Nate Begeman 2fba8a3aaa bswap implementation
llvm-svn: 25312
2006-01-14 03:14:10 +00:00
Chris Lattner 776c326c96 implement stacksave/stackrestore on PPC
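
(Illustrative trigger, not from the commit.)  The front-end emits
llvm.stacksave/llvm.stackrestore around scopes that contain variable-length
arrays, for example:

extern void use(int *p, int n);

void f(int n) {
  {                          /* scope entry: llvm.stacksave */
    int vla[n];
    use(vla, n);
  }                          /* scope exit: llvm.stackrestore */
}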
llvm-svn: 25277
2006-01-13 17:52:03 +00:00
Chris Lattner 8e2f52e645 expand unsupported stacksave/stackrestore nodes
llvm-svn: 25272
2006-01-13 02:42:53 +00:00
Nate Begeman 1b8121b227 Add bswap, rotl, and rotr nodes
Add dag combiner code to recognize rotl, rotr
Add ppc code to match rotl

Targets should add rotl/rotr patterns if they have them
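
(Sketch, not from the commit.)  The shift/or idiom the combiner now recognizes
looks like:

unsigned rotl32(unsigned x, unsigned n) {
  /* assumes 0 < n < 32; matched as a rotl node, which on PPC can become a
     single rlwnm */
  return (x << n) | (x >> (32 - n));
}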

llvm-svn: 25222
2006-01-11 21:21:00 +00:00
Chris Lattner 602dfea79c Fix calls that need to store values in stack slots, to not copy the stack
pointer.  This allows us to emit stuff like this:

        li r10, 0
        stw r10, 56(r1)
        or r3, r10, r10
        or r4, r10, r10
        or r5, r10, r10
        or r6, r10, r10
        or r7, r10, r10
        or r8, r10, r10
        or r9, r10, r10
        bl L_bar$stub

instead of this:

        or r2, r1, r1     ;; Extraneous copy.
        li r10, 0
        stw r10, 56(r2)
        or r3, r10, r10
        or r4, r10, r10
        or r5, r10, r10
        or r6, r10, r10
        or r7, r10, r10
        or r8, r10, r10
        or r9, r10, r10
        bl L_bar$stub

wowness.

llvm-svn: 25221
2006-01-11 19:55:07 +00:00
Chris Lattner 66f63f72f3 Dead FP arguments still use an incoming FP reg. This fixes
Regression/CodeGen/PowerPC/2006-01-11-darwin-fp-argument.ll, which was
distilled from a miscompilation in 252.eon.

llvm-svn: 25217
2006-01-11 18:21:25 +00:00
Chris Lattner 347ed8a581 Give PPCISD:: nodes legible names in dumps.
llvm-svn: 25166
2006-01-09 23:52:17 +00:00
Chris Lattner b87030358d linkonce symbols have an extra indirection, just like weak ones do. This fixes
Prolangs-C++/family and Prolangs-C++/primes.

llvm-svn: 25119
2006-01-06 01:04:03 +00:00
Jim Laskey deeafa0f00 Had expand logic backward.
llvm-svn: 25105
2006-01-05 01:47:43 +00:00
Jim Laskey 762e9ec06c Added initial support for DEBUG_LABEL allowing debug specific labels to be
inserted in the code.

llvm-svn: 25104
2006-01-05 01:25:28 +00:00
Nate Begeman c2c8a6202f Remove a fixme
llvm-svn: 25045
2005-12-30 00:11:07 +00:00
Nate Begeman 9aea6e4691 Fix one of the things in the todo file, and get a bit closer to folding
constant offsets from statics into the address arithmetic.

llvm-svn: 24999
2005-12-24 01:00:15 +00:00
Chris Lattner c46fc2482c make sure bit_converts are expanded
llvm-svn: 24978
2005-12-23 05:13:35 +00:00
Chris Lattner f474034432 Simplify some code by using BIT_CONVERT
llvm-svn: 24974
2005-12-23 00:59:59 +00:00
Nate Begeman b11b8e44fa Pattern-match return. Includes gross hack!
llvm-svn: 24874
2005-12-20 00:26:01 +00:00
Nate Begeman 8e6a8af205 Convert load/store over to being pattern matched
llvm-svn: 24871
2005-12-19 23:25:09 +00:00
Nate Begeman 4e56db674c Add support for TargetConstantPool nodes to the dag isel emitter, and use
them in the PPC backend, to simplify some logic out of Select and
SelectAddr.

llvm-svn: 24657
2005-12-10 02:36:00 +00:00
Chris Lattner fea33f7e64 Use new PPC-specific nodes to represent shifts which require the 6-bit
amount handling that PPC provides.  These are generated by the lowering code
and prevent the dag combiner from assuming (as it rightfully could for generic
shift nodes) that only the low 5 bits of the shift amount matter.  This fixes a miscompilation of crafty with
the new front-end.

llvm-svn: 24615
2005-12-06 02:10:38 +00:00
Chris Lattner 3713e6b49c Fix Regression/CodeGen/PowerPC/2005-11-30-vastart-crash.ll
llvm-svn: 24547
2005-11-30 20:40:54 +00:00
Nate Begeman 3e7db9c6d5 Hook up one type, v4f32, to the VR RegisterClass for now.
llvm-svn: 24517
2005-11-29 08:17:20 +00:00
Chris Lattner 9c415364cf No targets support line number info yet.
llvm-svn: 24513
2005-11-29 06:16:21 +00:00
Chris Lattner 3570cf456b add an option to generate completely non-pic code, corresponding to what
gcc -static produces on PPC.  This is used for building kexts and other things.

With this, materializing the address of a global looks like:

        lis r2, ha16(L_H$non_lazy_ptr)
        la r3, lo16(L_H$non_lazy_ptr)(r2)

we're still emitting stubs for functions, which is wrong.  That is next.

llvm-svn: 24399
2005-11-17 18:55:48 +00:00
Chris Lattner 8f8ed28a64 Fix a bug that resistor on IRC hit where we tried to create token factor
nodes of load results, not of their chain results.

llvm-svn: 24398
2005-11-17 18:30:17 +00:00
Chris Lattner 5aba6ae3b3 Enable global address legalization, fixing a todo and allowing the removal
of some code.  This exposes the implicit load from the stubs to the DAG, allowing
them to be optimized by the dag combiner.  It also moves darwin specific stuff
out of the isel into the legalizer, and allows more to be moved to the .td file.

llvm-svn: 24397
2005-11-17 18:26:56 +00:00
Chris Lattner 3648c20472 Use the right accessor to create this node
llvm-svn: 24394
2005-11-17 17:51:38 +00:00
Chris Lattner 595088aa0f Add an initial hack at legalizing GlobalAddress into the appropriate nodes
on Darwin to remove smarts from the isel.  This is currently disabled by
default (uncomment setOperationAction(ISD::GlobalAddress to enable it).
tblgen needs to become smarter about tglobaladdr nodes and bigger patterns
needed to be added to the .td file.  However, we can currently emit stuff like
this:  :)

        li r2, lo16(L_x$non_lazy_ptr)
        lis r3, ha16(L_x$non_lazy_ptr)
        lwzx r2, r3, r2

The obvious improvements will follow.

llvm-svn: 24390
2005-11-17 07:30:41 +00:00
Chris Lattner b7025749e1 When lowering direct calls, lower them to use a targetglobaladress directly
instead of a globaladdress.  This has no effect on the generated code at all.

llvm-svn: 24386
2005-11-17 05:56:14 +00:00
Chris Lattner f718a9e17b Fix an assert compiling MallocBench/gs
llvm-svn: 24017
2005-10-26 18:01:11 +00:00
Nate Begeman 762bf809b5 Correctly Expand or Promote FP_TO_UINT based on the capabilities of the
machine.  This allows us to generate great code for i32 FP_TO_UINT now on
targets with 64 bit extensions.
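
(Illustrative sketch; the exact lowering depends on the subtarget.)  The case
in question is a plain unsigned conversion:

unsigned to_u32(double d) {
  /* i32 FP_TO_UINT: on a machine with 64-bit conversion instructions this can
     be promoted to a 64-bit signed conversion (e.g. fctidz on PPC) rather than
     going through the generic compare-and-select expansion */
  return (unsigned) d;
}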

llvm-svn: 23993
2005-10-25 23:48:36 +00:00
Chris Lattner 65845a2f7c Expose the fextend on the DAG instead of doing it in the matcher
llvm-svn: 23986
2005-10-25 20:54:57 +00:00
Nate Begeman 4dd383120f Invert the TargetLowering flag that controls divide by constant expansion.
Add a new flag to TargetLowering indicating if the target has really cheap
  signed division by powers of two, make ppc use it.  This will probably go
  away in the future.
Implement some more ISD::SDIV folds in the dag combiner
Remove now dead code in the x86 backend.
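
(Sketch, not from the commit; register choice is illustrative.)  "Really cheap
signed division by powers of two" on PPC is a shift plus a carry fix-up:

int div_by_8(int x) {
  return x / 8;
}

compiles to something like:

_div_by_8:
        srawi r3, r3, 3
        addze r3, r3
        blr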

llvm-svn: 23853
2005-10-21 00:02:42 +00:00
Nate Begeman c6f067a8c4 Move the target constant divide optimization up into the dag combiner, so
that the nodes can be folded with other nodes, and we don't have to duplicate
code in every backend.  Alpha will probably want this too.

llvm-svn: 23835
2005-10-20 02:15:44 +00:00
Nate Begeman 78afac2ddd Add the ability to lower return instructions to TargetLowering. This
allows us to lower legal return types to something else, to meet ABI
requirements (such as that i64 be returned in two i32 regs on Darwin/ppc).
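
(Concrete instance, for illustration only.)  On 32-bit Darwin/PPC a 64-bit
return value is split across r3 (high word) and r4 (low word):

long long ret_one(void) {
  return 1;                  /* lowered so the high half lands in r3, low half in r4 */
}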

llvm-svn: 23802
2005-10-18 23:23:37 +00:00
Nate Begeman e74dfbb9ce Do the right thing and enable 64 bit regs under the control of a subtarget
option.  Currently the only way to enable this is to specify the
64bitregs mattr flag.  It is not yet enabled by default on any configuration.

llvm-svn: 23779
2005-10-18 00:56:42 +00:00
Nate Begeman 0b71e007ef First bits of 64 bit PowerPC stuff, currently disabled. A lot of this is
purely mechanical.

llvm-svn: 23778
2005-10-18 00:28:58 +00:00
Nate Begeman 6cca84e43c More PPC32 -> PPC changes, as well as merging some classes that were
redundant after the change.

llvm-svn: 23759
2005-10-16 05:39:50 +00:00
Chris Lattner 6f3b954662 Rename PPC32*.h to PPC*.h
This completes the grand PPC file renaming

llvm-svn: 23745
2005-10-14 23:59:06 +00:00
Chris Lattner a17e6c486c fix an f32/f64 type mismatch
llvm-svn: 23587
2005-10-02 06:37:13 +00:00
Chris Lattner d3eee1a09b Modify the ppc backend to use two register classes for FP: F8RC and F4RC.
These are used to represent float and double values, and the two regclasses
contain the same physical registers.

llvm-svn: 23577
2005-10-01 01:35:02 +00:00
Chris Lattner d3ea19b51a Add FP versions of the binary operators, keeping the int and fp worlds separate.
llvm-svn: 23506
2005-09-28 22:29:58 +00:00
Chris Lattner a028e7a39c Darwin, like many BSD systems, has a setjmp/longjmp which saves the signal mask
on setjmp calls and restores it on longjmp calls (both of which require syscalls).

This makes the calls REALLY slow.  Use _setjmp/_longjmp instead.  This speeds up
hexxagon from 120.31s to 15.68s: from 5.53x slower than GCC to 28% faster than GCC.

llvm-svn: 23482
2005-09-27 22:18:25 +00:00
Chris Lattner 0f965a615e Change the arg lowering code to use copyfromreg from vregs associated
with incoming arguments instead of the pregs themselves.  This stops
the scheduler from causing problems by moving a copyfromreg for an argument
to after a select_cc node (now it can, and bad things won't happen).

llvm-svn: 23334
2005-09-13 19:33:40 +00:00
Chris Lattner aa6cbd90c5 Remove some dead vectors
llvm-svn: 23329
2005-09-13 18:47:49 +00:00
Chris Lattner 4309c3a785 PowerPC cannot truncstore i1 natively
llvm-svn: 23304
2005-09-10 00:21:06 +00:00
Nate Begeman 6095214bf0 Implement i64<->fp using the fctidz/fcfid instructions on PowerPC when we
are allowed to generate 64-bit-only PowerPC instructions for 32 bit hosts,
such as the PowerPC 970.

This speeds up 189.lucas from 81.99 to 32.64 seconds.
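
(Rough source-level sketch, not from the commit.)

double i64_to_double(long long x) {
  /* with 970-class instructions enabled: store the two halves to a stack slot,
     reload with lfd, convert with a single fcfid */
  return (double) x;
}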

llvm-svn: 23250
2005-09-06 22:03:27 +00:00
Chris Lattner aa3b1fcc58 Decouple fsqrt from gpul optimizations, implementing fsqrt.ll.
Remove the -enable-gpopt option which is subsumed by feature flags.

llvm-svn: 23218
2005-09-02 18:33:05 +00:00
Chris Lattner 763a3a0fa7 Restore this patch now that the latent bug has been fixed
llvm-svn: 23209
2005-09-02 01:24:55 +00:00
Chris Lattner 06d440f2ee Revert the previous patch which causes a mysterious regression in toast.
llvm-svn: 23207
2005-09-02 00:47:05 +00:00
Chris Lattner 9ee867b93b Implement small-arguments.ll:test3 by teaching the DAG optimizer that
the results of calls to functions returning small values are properly
sign/zero extended.

llvm-svn: 23198
2005-09-01 23:44:32 +00:00
Chris Lattner da2e04c69d Move FCTIWZ handling out of the instruction selectors and into legalization,
getting them out of the business of making stack slots.

llvm-svn: 23180
2005-08-31 21:09:52 +00:00
Chris Lattner e675a08e10 Move SHL,SHR i64 -> legalizer
llvm-svn: 23178
2005-08-31 20:23:54 +00:00
Chris Lattner 2f03896a0f lower sra_parts on the dag, implementing it for the dag isel, and exposing
the ops to dag optimization.

llvm-svn: 23176
2005-08-31 19:09:57 +00:00
Nate Begeman e3287b85b7 Enable generation of AssertSext and AssertZext in the PPC backend.
llvm-svn: 23168
2005-08-31 01:58:39 +00:00
Chris Lattner e75b5e63a7 Fix a bug in my patch for legalizing to fsel. It cannot handle seteq/setne,
which I failed to include when I moved the code over.  This fixes
MallocBench/gs.

llvm-svn: 23140
2005-08-30 00:45:18 +00:00
Chris Lattner 62b9a5d1f8 Fix some really strange indentation that xcode likes to use.
No, xcode, this is not right:

   if (!foo) break;
     X;

llvm-svn: 23138
2005-08-30 00:19:00 +00:00
Chris Lattner 9b577f108a implement SELECT_CC fully for the DAG->DAG isel!
llvm-svn: 23101
2005-08-26 21:23:58 +00:00
Chris Lattner b2854fadda Make fsel emission work with both the pattern and dag-dag selectors, by
giving it a non-instruction opcode.  The dag->dag selector used to not
select the operands of the fsel, because it thought that whole tree was
already selected.

llvm-svn: 23091
2005-08-26 20:25:03 +00:00
Chris Lattner 7f1fa8eaef implement the other half of the select_cc -> fsel lowering, which handles
when the RHS of the comparison is 0.0.  Turn this on by default.
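
(Sketch, not from the commit; argument registers follow the Darwin FP
convention.)  The pattern this covers is:

double sel(double a, double b, double c) {
  return a >= 0.0 ? b : c;
}

which can now be emitted as:

_sel:
        fsel f1, f1, f2, f3
        blr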

llvm-svn: 23083
2005-08-26 17:36:52 +00:00
Chris Lattner f3d06c6417 add initial support for converting select_cc -> fsel in the legalizer
instead of in the backend.  This currently handles fsel cases with registers,
but doesn't have the 0.0 and -0.0 optimization enabled yet.

Once this is finished, special hack for fp immediates can go away.

llvm-svn: 23075
2005-08-26 00:52:45 +00:00
Nate Begeman 65ffd8fbf4 Remove option to make SetCC illegal on PowerPC after long discussion with
Chris.  This will be accomplished through correctly modeling CR's and
subregs.

llvm-svn: 23056
2005-08-25 20:01:10 +00:00
Nate Begeman f3ce09b36e Ack, typo
llvm-svn: 22981
2005-08-23 05:45:10 +00:00
Nate Begeman 7216ad415b Add an option to make SetCC illegal as a beta option
llvm-svn: 22979
2005-08-23 05:42:36 +00:00
Jim Laskey 6267b2c97c Make UINT_TO_FP and SINT_TO_FP use generic expansion.
llvm-svn: 22815
2005-08-17 00:40:22 +00:00
Chris Lattner 79f5ebc7b9 updates for changes in nodes
llvm-svn: 22808
2005-08-16 21:58:15 +00:00
Nate Begeman 371e49515d Implement BR_CC and BRTWOWAY_CC. This allows the removal of a rather nasty
fixme from the PowerPC backend.  Emit slightly better code for legalizing
select_cc.

llvm-svn: 22805
2005-08-16 19:49:35 +00:00
Chris Lattner f22556d3ad Pull the LLVM -> DAG lowering code out of the pattern selector so that it
can be shared with the DAG->DAG selector.

llvm-svn: 22799
2005-08-16 17:14:42 +00:00