llvm-project

Commit Graph

Author	SHA1	Message	Date
Evan Cheng	e2d25a1a50	Fixed an encoding bug: movd from XMM to R32. llvm-svn: 27807	2006-04-18 18:19:00 +00:00
Chris Lattner	1e174c87c3	pretty print node name llvm-svn: 27806	2006-04-18 18:05:58 +00:00
Chris Lattner	9754d142a4	Implement an important entry from README_ALTIVEC: If an altivec predicate compare is used immediately by a branch, don't use a (serializing) MFCR instruction to read the CR6 register, which requires a compare to get it back to CR's. Instead, just branch on CR6 directly. :) For example, for: void foo2(vector float *A, vector float *B) { if (!vec_any_eq(*A, *B)) *B = (vector float){0,0,0,0}; } We now generate: _foo2: mfspr r2, 256 oris r5, r2, 12288 mtspr 256, r5 lvx v2, 0, r4 lvx v3, 0, r3 vcmpeqfp. v2, v3, v2 bne cr6, LBB1_2 ; UnifiedReturnBlock LBB1_1: ; cond_true vxor v2, v2, v2 stvx v2, 0, r4 mtspr 256, r2 blr LBB1_2: ; UnifiedReturnBlock mtspr 256, r2 blr instead of: _foo2: mfspr r2, 256 oris r5, r2, 12288 mtspr 256, r5 lvx v2, 0, r4 lvx v3, 0, r3 vcmpeqfp. v2, v3, v2 mfcr r3, 2 rlwinm r3, r3, 27, 31, 31 cmpwi cr0, r3, 0 beq cr0, LBB1_2 ; UnifiedReturnBlock LBB1_1: ; cond_true vxor v2, v2, v2 stvx v2, 0, r4 mtspr 256, r2 blr LBB1_2: ; UnifiedReturnBlock mtspr 256, r2 blr This implements CodeGen/PowerPC/vec_br_cmp.ll. llvm-svn: 27804	2006-04-18 17:59:36 +00:00
Chris Lattner	11a9ac51e8	new testcase llvm-svn: 27803	2006-04-18 17:56:30 +00:00
Chris Lattner	68c16a201e	move some stuff around, clean things up llvm-svn: 27802	2006-04-18 17:52:36 +00:00
Chris Lattner	bfc2c68386	Teach the codegen about instructions used for SSE spill code, allowing it to optimize cases where it has to spill a lot llvm-svn: 27801	2006-04-18 16:44:51 +00:00
Nate Begeman	f776fc2c98	Fix a copy & paste error from long ago. llvm-svn: 27800	2006-04-18 16:03:18 +00:00
Chris Lattner	89e761c19d	Add some more notes, many still missing llvm-svn: 27799	2006-04-18 06:32:08 +00:00
Reid Spencer	b687ce80cd	Have the AutoRegen.sh script prompt the user for the LLVM src and obj directories if it can't find them. Then, replace those values into the configure.ac script and pass them to the LLVM_CONFIG_PROJECT so that the values become the default for llvm_src and llvm_obj variables. In this way the user is required to input this exactly once, and the scripts take it from there. llvm-svn: 27798	2006-04-18 06:27:47 +00:00
Reid Spencer	c81081ab5e	Make it possible to default the llvm_src and llvm_obj variables based on the arguments to the macro. This better supports the AutoRegen.sh script in projects/sample/autoconf. llvm-svn: 27797	2006-04-18 06:25:37 +00:00
Chris Lattner	9f87173df3	add a bunch of stuff, pieces still missing llvm-svn: 27796	2006-04-18 06:18:36 +00:00
Chris Lattner	9232c8c1c5	Add a warning. llvm-svn: 27795	2006-04-18 05:31:20 +00:00
Chris Lattner	3af67456dd	Add a warning llvm-svn: 27794	2006-04-18 05:26:10 +00:00
Chris Lattner	96d50487c9	Use vmladduhm to do v8i16 multiplies which is faster and simpler than doing even/odd halves. Thanks to Nate telling me what's what. llvm-svn: 27793	2006-04-18 04:28:57 +00:00
Chris Lattner	d6d82aa889	Implement v16i8 multiply with this code: vmuloub v5, v3, v2 vmuleub v2, v3, v2 vperm v2, v2, v5, v4 This implements CodeGen/PowerPC/vec_mul.ll. With this, v16i8 multiplies are 6.79x faster than before. Overall, UnitTests/Vector/multiplies.c is now 2.45x faster with LLVM than with GCC. Remove the 'integer multiplies' todo from the README file. llvm-svn: 27792	2006-04-18 03:57:35 +00:00
Chris Lattner	48786e4887	Add tests for v8i16 and v16i8 llvm-svn: 27791	2006-04-18 03:54:50 +00:00
Evan Cheng	4d36a36900	Correct comments llvm-svn: 27790	2006-04-18 03:45:01 +00:00
Chris Lattner	7e439874cb	Lower v8i16 multiply into this code: li r5, lo16(LCPI1_0) lis r6, ha16(LCPI1_0) lvx v4, r6, r5 vmulouh v5, v3, v2 vmuleuh v2, v3, v2 vperm v2, v2, v5, v4 where v4 is: LCPI1_0: ; <16 x ubyte> .byte 2 .byte 3 .byte 18 .byte 19 .byte 6 .byte 7 .byte 22 .byte 23 .byte 10 .byte 11 .byte 26 .byte 27 .byte 14 .byte 15 .byte 30 .byte 31 This is 5.07x faster on the G5 (measured) than lowering to scalar code + loads/stores. llvm-svn: 27789	2006-04-18 03:43:48 +00:00
Chris Lattner	a2cae1bb10	Custom lower v4i32 multiplies into a cute sequence, instead of having legalize scalarize the sequence into 4 mullw's and a bunch of load/store traffic. This speeds up v4i32 multiplies 4.1x (measured) on a G5. This implements PowerPC/vec_mul.ll llvm-svn: 27788	2006-04-18 03:24:30 +00:00
Chris Lattner	2dea154035	new testcase llvm-svn: 27787	2006-04-18 03:22:16 +00:00
Evan Cheng	0ef233509b	Another entry llvm-svn: 27786	2006-04-18 01:22:57 +00:00
Chris Lattner	3db2056315	Fix a build failure on Vladimir's tester. llvm-svn: 27785	2006-04-18 00:21:25 +00:00
Evan Cheng	e008bd3d27	Another entry. llvm-svn: 27784	2006-04-18 00:21:01 +00:00
Evan Cheng	5421206c4b	Use movss to insert_vector_elt(v, s, 0). llvm-svn: 27782	2006-04-17 22:45:49 +00:00
Chris Lattner	36dd7c98d1	Turn x86 unaligned load/store intrinsics into aligned load/store instructions if the pointer is known aligned. llvm-svn: 27781	2006-04-17 22:26:56 +00:00
Chris Lattner	916ae0775e	Fix handling of calls in functions that use vectors. This fixes a crash on the code in GCC PR26546. llvm-svn: 27780	2006-04-17 22:10:08 +00:00
Evan Cheng	6e5e205841	Use two pinsrw to insert an element into v4i32 / v4f32 vector. llvm-svn: 27779	2006-04-17 22:04:06 +00:00
Chris Lattner	63a5cdc423	remove done item llvm-svn: 27778	2006-04-17 21:52:03 +00:00
Chris Lattner	6bd68ae81e	Don't diddle VRSAVE if no registers need to be added/removed from it. This allows us to codegen functions as: _test_rol: vspltisw v2, -12 vrlw v2, v2, v2 blr instead of: _test_rol: mfvrsave r2, 256 mr r3, r2 mtvrsave r3 vspltisw v2, -12 vrlw v2, v2, v2 mtvrsave r2 blr Testcase here: CodeGen/PowerPC/vec_vrsave.ll llvm-svn: 27777	2006-04-17 21:48:13 +00:00
Chris Lattner	efe2b3f2fc	New testcase, shouldn't touch vrsave llvm-svn: 27776	2006-04-17 21:48:03 +00:00
Chris Lattner	bec79b4a59	Add a MachineInstr::eraseFromParent convenience method. llvm-svn: 27775	2006-04-17 21:35:41 +00:00
Chris Lattner	9fcad09b1b	Add some convenience methods. llvm-svn: 27774	2006-04-17 21:35:08 +00:00
Evan Cheng	22c06f054b	Encoding bug llvm-svn: 27773	2006-04-17 21:33:57 +00:00
Chris Lattner	72d7c27069	Vectors that are known live-in and live-out are clearly already marked in the vrsave register for the caller. This allows us to codegen a function as: _test_rol: mfspr r2, 256 mr r3, r2 mtspr 256, r3 vspltisw v2, -12 vrlw v2, v2, v2 mtspr 256, r2 blr instead of: _test_rol: mfspr r2, 256 oris r3, r2, 40960 mtspr 256, r3 vspltisw v0, -12 vrlw v2, v0, v0 mtspr 256, r2 blr llvm-svn: 27772	2006-04-17 21:22:06 +00:00
Chris Lattner	14c4972b6d	Prefer to allocate V2-V5 before V0,V1. This lets us generate code like this: vspltisw v2, -12 vrlw v2, v2, v2 instead of: vspltisw v0, -12 vrlw v2, v0, v0 when a function is returning a value. llvm-svn: 27771	2006-04-17 21:19:12 +00:00
Chris Lattner	6df094b4ab	Move some knowledge about registers out of the code emitter into the register info. llvm-svn: 27770	2006-04-17 21:07:20 +00:00
Chris Lattner	0f28d48da2	Use a small table instead of macros to do this conversion. llvm-svn: 27769	2006-04-17 20:59:25 +00:00
Evan Cheng	5022b3426e	Implement v8i16, v16i8 splat using unpckl + pshufd. llvm-svn: 27768	2006-04-17 20:43:08 +00:00
Chris Lattner	c070c621ac	implement returns of a vector, testcase here: CodeGen/X86/vec_return.ll llvm-svn: 27767	2006-04-17 20:32:50 +00:00
Chris Lattner	e757ae6534	New testcase llvm-svn: 27766	2006-04-17 20:32:27 +00:00
Chris Lattner	326870b40b	Codegen insertelement with constant insertion points as scalar_to_vector and a shuffle. For this: void %test2(<4 x float>* %F, float %f) { %tmp = load <4 x float>* %F ; <<4 x float>> [#uses=2] %tmp3 = add <4 x float> %tmp, %tmp ; <<4 x float>> [#uses=1] %tmp2 = insertelement <4 x float> %tmp3, float %f, uint 2 ; <<4 x float>> [#uses=2] %tmp6 = add <4 x float> %tmp2, %tmp2 ; <<4 x float>> [#uses=1] store <4 x float> %tmp6, <4 x float>* %F ret void } we now get this on X86 (which will get better): _test2: movl 4(%esp), %eax movaps (%eax), %xmm0 addps %xmm0, %xmm0 movaps %xmm0, %xmm1 shufps $3, %xmm1, %xmm1 movaps %xmm0, %xmm2 shufps $1, %xmm2, %xmm2 unpcklps %xmm1, %xmm2 movss 8(%esp), %xmm1 unpcklps %xmm1, %xmm0 unpcklps %xmm2, %xmm0 addps %xmm0, %xmm0 movaps %xmm0, (%eax) ret instead of: _test2: subl $28, %esp movl 32(%esp), %eax movaps (%eax), %xmm0 addps %xmm0, %xmm0 movaps %xmm0, (%esp) movss 36(%esp), %xmm0 movss %xmm0, 8(%esp) movaps (%esp), %xmm0 addps %xmm0, %xmm0 movaps %xmm0, (%eax) addl $28, %esp ret llvm-svn: 27765	2006-04-17 19:21:01 +00:00
Chris Lattner	e54133cfba	Make sure to check splats of every constant we can, handle splat(31) by being a bit more clever, add support for odd splats from -31 to -17. llvm-svn: 27764	2006-04-17 18:09:22 +00:00
Evan Cheng	bf0d13c54f	Incorrect foldMemoryOperand entries llvm-svn: 27763	2006-04-17 18:06:12 +00:00
Evan Cheng	5112b5c544	Errors in patterns preventing load folding llvm-svn: 27762	2006-04-17 18:05:01 +00:00
Jeff Cohen	e3955a05e4	Add checks for __OpenBSD__. llvm-svn: 27761	2006-04-17 17:55:41 +00:00
Chris Lattner	264c908e3a	Teach the ppc backend to use rol and vsldoi to generate splatted constants. This implements vec_constants.ll:test_vsldoi and test_rol llvm-svn: 27760	2006-04-17 17:55:10 +00:00
Chris Lattner	8cdba16d5e	Some more cases that can be generated with two instructions llvm-svn: 27759	2006-04-17 17:54:18 +00:00
Chris Lattner	26fb8d9393	add a note llvm-svn: 27758	2006-04-17 17:29:41 +00:00
Evan Cheng	b3b41c4f3d	FP SETOLT, SETOLT, SETUGE, SETUGT conditions were implemented incorrectly llvm-svn: 27755	2006-04-17 07:24:10 +00:00
Chris Lattner	1b3806ace5	Make some code more general, adding support for constant formation of several new patterns. llvm-svn: 27754	2006-04-17 06:58:41 +00:00

24238 Commits