llvm-project

Author	SHA1	Message	Date
Ehsan Amiri	631ed04af0	adding another optimization opportunity to readme file llvm-svn: 263775	2016-03-18 04:02:25 +00:00
Nemanja Ivanovic	c09047916a	Add LLVM support for remaining integer divide and permute instructions from ISA 2.06 This is the patch corresponding to review: http://reviews.llvm.org/D8406 It adds some missing instructions from ISA 2.06 to the PPC back end. llvm-svn: 234546	2015-04-09 23:54:37 +00:00
Kit Barton	116e18be41	Updated with list of possible improvements we are tracking internally llvm-svn: 231946	2015-03-11 17:43:43 +00:00
Hal Finkel	b0e9b35bc3	[PowerPC] Transform a README.txt entry into a FIXME Remove the README.txt entry regarding register allocation of CR logical ops, and replace it with a FIXME in PPCInstrInfo.td. The text in the README.txt was not really accurate, and thanks goes to Pat Haugen (and Bill Schmidt) from IBM for clarifying what was intended and highlighting the relevant text in the ISA specification. llvm-svn: 225325	2015-01-07 00:15:29 +00:00
Hal Finkel	ed844c4ad1	[PowerPC] Reuse a load operand in int->fp conversions int->fp conversions on PPC must be done through memory loads and stores. On a modern core, this process begins by storing the int value to memory, then loading it using a (sometimes special) FP load instruction. Unfortunately, we would do this even when the value to be converted was itself a load, and we can just use that same memory location instead of copying it to another first. There is a slight complication when handling int_to_fp(fp_to_int(x)) pairs, because the fp_to_int operand has not been lowered when the int_to_fp is being lowered. We handle this specially by invoking fp_to_int's lowering logic (partially) and getting the necessary memory location (some trivial refactoring was done to make this possible). This is all somewhat ugly, and it would be nice if some later CodeGen stage could just clean this stuff up, but because doing so would involve modifying target-specific nodes (or instructions), it is not immediately clear how that would work. Also, remove a related entry from the README.txt for which we now generate reasonable code. llvm-svn: 225301	2015-01-06 22:31:02 +00:00
Hal Finkel	bde27836ce	[PowerPC] Remove old README.txt entry regarding struct passing Because of how Clang represents structs as arrays (at least on non-Darwin platforms), and what SROA does, etc. this is no longer a problem. llvm-svn: 225251	2015-01-06 07:23:13 +00:00
Hal Finkel	6837077fcf	[PowerPC] Remove old README.txt entry We no longer generate horrible code for the stated function: void f(signed char *a, _Bool b, _Bool c) { signed char t = 0; if (b) t = *a; if (c) *a = t; } for which we now generate: .L.f: andi. 5, 5, 1 cmpldi 1, 4, 0 li 5, 0 beq 1, .LBB0_2 lbz 5, 0(3) .LBB0_2: # %if.end bclr 4, 1, 0 stb 5, 0(3) blr so we don't need the README.txt entry. llvm-svn: 225217	2015-01-05 22:20:22 +00:00
Hal Finkel	9187711f08	[PowerPC] Convert a README.txt entry into a better test We now produce the desired code as noted in the README.txt file (no spurious or). Remove the README entry and improve the regression test. llvm-svn: 225214	2015-01-05 21:53:52 +00:00
Hal Finkel	f4044b02a5	[PowerPC] Remove README.txt entry This entry has been rendered irrelevant now that we have proper CR bit tracking. llvm-svn: 225211	2015-01-05 21:41:26 +00:00
Hal Finkel	c7d35bb5b1	[PowerPC] Add a test for truncating a shifted load We now produce the desired code as noted in the README.txt file. Remove the README entry and add a regression test. llvm-svn: 225209	2015-01-05 21:33:14 +00:00
Hal Finkel	a4750dec99	[PowerPC] Add another test for load/store with update We now produce the desired code as noted in the README.txt file. Remove the README entry and add a regression test. llvm-svn: 225205	2015-01-05 21:22:42 +00:00
Hal Finkel	200d2ad188	[PowerPC] Fold i1 extensions with other ops Consider this function from our README.txt file: int foo(int a, int b) { return (a < b) << 4; } We now explicitly track CR bits by default, so the comment in the README.txt about not really having a SETCC is no longer accurate, but we did generate this somewhat silly code: cmpw 0, 3, 4 li 3, 0 li 12, 1 isel 3, 12, 3, 0 sldi 3, 3, 4 blr which generates the zext as a select between 0 and 1, and then shifts the result by a constant amount. Here we preprocess the DAG in order to fold the results of operations on an extension of an i1 value into the SELECT_I[48] pseudo instruction when the resulting constant can be materialized using one instruction (just like the 0 and 1). This was not implemented as a DAGCombine because the resulting code would have been anti-canonical and depends on replacing chained user nodes, which does not fit well into the lowering paradigm. Now we generate: cmpw 0, 3, 4 li 3, 0 li 12, 16 isel 3, 12, 3, 0 blr which is less silly. llvm-svn: 225203	2015-01-05 21:10:24 +00:00
Hal Finkel	2f61879ff4	[PowerPC] Materialize i64 constants using rotation with masking r225135 added the ability to materialize i64 constants using rotations in order to reduce the instruction count. Sometimes we can use a rotation only with some extra masking, so that we take advantage of the fact that generating a bunch of extra higher-order 1 bits is easy using li/lis. llvm-svn: 225147	2015-01-05 03:41:38 +00:00
Hal Finkel	241ba79f95	[PowerPC] Materialize i64 constants using rotation Materializing full 64-bit constants on PPC64 can be expensive, requiring up to 5 instructions depending on the locations of the non-zero bits. Sometimes materializing a rotated constant, and then applying the inverse rotation, requires fewer instructions than the direct method. If so, do that instead. In r225132, I added support for forming constants using bit inversion. In effect, this reverts that commit and replaces it with rotation support. The bit inversion is useful for turning constants that are mostly ones into ones that are mostly zeros (thus enabling a more-efficient shift-based materialization), but the same effect can be obtained by using negative constants and a rotate, and that is at least as efficient, if not more. llvm-svn: 225135	2015-01-04 15:43:55 +00:00
Hal Finkel	8adf2254ef	[PowerPC] Improve instruction selection bit-permuting operations (32-bit) The PowerPC backend, somewhat embarrassingly, did not generate an optimal-length sequence of instructions for a 32-bit bswap. While adding a pattern for the bswap intrinsic to fix this would not have been terribly difficult, doing so would not have addressed the real problem: we had been generating poor code for many bit-permuting operations (by which I mean things like byte-swap that permute the bits of one or more inputs around in various ways). Here are some initial steps toward solving this deficiency. Bit-permuting operations are represented, at the SDAG level, using ISD::ROTL, SHL, SRL, AND and OR (mostly with constant second operands). Looking back through these operations, we can build up a description of the bits in the resulting value in terms of bits of one or more input values (and constant zeros). For each bit, we compute the rotation amount from the original value, and then group consecutive (value, rotation factor) bits into groups. Groups sharing these attributes are then collected and sorted, and we can then instruction select the entire permutation using a combination of masked rotations (rlwinm), imm ands (andi/andis), and masked rotation inserts (rlwimi). The result is that instead of lowering an i32 bswap as: rlwinm 5, 3, 24, 16, 23 rlwinm 4, 3, 24, 0, 7 rlwimi 4, 3, 8, 8, 15 rlwimi 5, 3, 8, 24, 31 rlwimi 4, 5, 0, 16, 31 we now produce: rlwinm 4, 3, 8, 0, 31 rlwimi 4, 3, 24, 16, 23 rlwimi 4, 3, 24, 0, 7 and for the 'test6' example in the PowerPC/README.txt file: unsigned test6(unsigned x) { return ((x & 0x00FF0000) >> 16) | ((x & 0x000000FF) << 16); } we used to produce: lis 4, 255 rlwinm 3, 3, 16, 0, 31 ori 4, 4, 255 and 3, 3, 4 and now we produce: rlwinm 4, 3, 16, 24, 31 rlwimi 4, 3, 16, 8, 15 and, as a nice bonus, this fixes the FIXME in test/CodeGen/PowerPC/rlwimi-and.ll. This commit does not include instruction-selection for i64 operations, those will come later. llvm-svn: 224318	2014-12-16 05:51:41 +00:00
Hal Finkel	b5e9b0426a	[PowerPC] Better lowering for add/or of a FrameIndex If we have an add (or an or that is really an add), where one operand is a FrameIndex and the other operand is a small constant, we can combine the lowering of the FrameIndex (which is lowered as an add of the FI and a zero offset) with the constant operand. Amusingly, this is an old potential improvement entry from lib/Target/PowerPC/README.txt which had never been resolved. In short, we used to lower: %X = alloca { i32, i32 } %Y = getelementptr {i32,i32}* %X, i32 0, i32 1 ret i32* %Y as: addi 3, 1, -8 ori 3, 3, 4 blr and now we produce: addi 3, 1, -4 blr which is much more sensible. llvm-svn: 224071	2014-12-11 22:51:06 +00:00
Hal Finkel	b5aa7e54d9	Generate PPC early conditional returns PowerPC has a conditional branch to the link register (return) instruction: BCLR. This should be used any time when we'd otherwise have a conditional branch to a return. This adds a small pass, PPCEarlyReturn, which runs just prior to the branch selection pass (and, importantly, after block placement) to generate these conditional returns when possible. It will also eliminate unconditional branches to returns (these happen rarely; most of the time these have already been tail duplicated by the time PPCEarlyReturn is invoked). This is a nice optimization for small functions that do not maintain a stack frame. llvm-svn: 179026	2013-04-08 16:24:03 +00:00
Hal Finkel	2ed21a8ca6	Remove some obsolete PowerPC/README entries llvm-svn: 178657	2013-04-03 14:25:55 +00:00
Hal Finkel	f1af79ab45	Remove "gpr0 allocation" from the PPC README TODO list As Chris pointed out, post r178123, this is now done! llvm-svn: 178165	2013-03-27 18:39:52 +00:00
Hal Finkel	41e6fd1df9	Remove the TODO statement in the PPC README re: CTR loops As Chris points out, this can now be removed! TODO: check if the associated section on viterbi's inner loop can also be removed. llvm-svn: 158224	2012-06-08 20:02:09 +00:00
Wesley Peck	527da1b6e2	Renaming ISD::BIT_CONVERT to ISD::BITCAST to better reflect the LLVM IR concept. llvm-svn: 119990	2010-11-23 03:31:01 +00:00
Chris Lattner	2e040ebe19	add a readme. llvm-svn: 114303	2010-09-19 00:34:58 +00:00
Dan Gohman	6f34abd092	Floating-point add, sub, and mul are now spelled fadd, fsub, and fmul, respectively. llvm-svn: 97531	2010-03-02 01:11:08 +00:00
Dale Johannesen	626b79d6a6	Add the problem I just hacked around in 96015/96020. The solution there produces correct code, but is seriously deficient in several ways. llvm-svn: 96039	2010-02-12 23:16:24 +00:00
Chris Lattner	e0359b4fe7	move PR5945 here. llvm-svn: 94350	2010-01-24 02:27:03 +00:00
Chris Lattner	1b35bbe813	change the canonical form of "cond ? -1 : 0" to be "sext cond" instead of a select. This simplifies some instcombine code, matches the policy for zext (cond ? 1 : 0 -> zext), and allows us to generate better code for a testcase on ppc. llvm-svn: 94339	2010-01-24 00:09:49 +00:00
Chris Lattner	97331ae668	add a note llvm-svn: 94317	2010-01-23 18:42:37 +00:00
Chris Lattner	9e4e45a3b6	constant materialization could be improved. llvm-svn: 92921	2010-01-07 17:53:10 +00:00
Dan Gohman	17151155ed	Remove the IA-64 backend. llvm-svn: 76920	2009-07-24 00:30:09 +00:00
Chris Lattner	4ec83ea628	clarify: stub emission depends on the version of the linker you use, it has nothing to do with the target. Also, the stub elimination optimization requires making the stub explicit. llvm-svn: 74682	2009-07-02 01:24:34 +00:00
Dale Johannesen	4e6044c405	Add darwin stub removal to wishlist. llvm-svn: 74667	2009-07-01 23:36:02 +00:00
Dale Johannesen	aae3a4f864	Move some former testcases (low-probability codegen optimizations) into this wishlist. llvm-svn: 59455	2008-11-17 18:56:34 +00:00
Nate Begeman	f69d13b60a	Implement ISD::TRAP support on PPC llvm-svn: 54644	2008-08-11 17:36:31 +00:00
Chris Lattner	6b0a189225	add a note llvm-svn: 47830	2008-03-02 19:27:34 +00:00
Chris Lattner	bd0bb3f07f	Evan implemented this. llvm-svn: 47827	2008-03-02 17:56:29 +00:00
Nate Begeman	3090b0fbd1	additional missing feature llvm-svn: 46948	2008-02-11 04:16:09 +00:00
Chris Lattner	8e07533f20	If someone wants to implement ppc TRAP, they can go for it :) llvm-svn: 46019	2008-01-15 22:15:02 +00:00
Chris Lattner	89f36e6b21	Finally implement correct ordered comparisons for PPC, even though the code generated is not wonderful. This turns a miscompilation into a code quality bug (noted in the ppc readme). This fixes PR642, which is over 2 years old (!). Nate, please review this. llvm-svn: 45742	2008-01-08 06:46:30 +00:00
Chris Lattner	f6a8156e4f	implement __builtin_return_addr(0) on ppc. llvm-svn: 44700	2007-12-08 06:59:59 +00:00
Chris Lattner	6777b72659	Add some notes about better flag handling. llvm-svn: 41808	2007-09-10 21:43:18 +00:00
Chris Lattner	92c6a65d4e	new example llvm-svn: 41318	2007-08-23 15:16:03 +00:00
Chris Lattner	075b4db621	add a note llvm-svn: 35530	2007-03-31 07:06:25 +00:00
Chris Lattner	26ad3e7191	add a note llvm-svn: 35334	2007-03-25 05:10:46 +00:00
Chris Lattner	9c9e2f1af2	add a note llvm-svn: 35330	2007-03-25 04:46:28 +00:00
Chris Lattner	c9088b4c8e	add a note llvm-svn: 34101	2007-02-09 17:38:01 +00:00
Nate Begeman	ba52b94fa2	Remove fixed item llvm-svn: 34081	2007-02-09 04:19:54 +00:00
Chris Lattner	37ebf9317b	A relatively simple PPC optimization. llvm-svn: 33709	2007-01-31 19:49:20 +00:00
Nate Begeman	17f250005a	Update some of the llvm in the readme llvm-svn: 33630	2007-01-29 21:21:22 +00:00
Chris Lattner	889d934d00	move contents of PR587 to here. llvm-svn: 33333	2007-01-18 07:34:57 +00:00
Chris Lattner	542dfd5510	Rewrite the branch selector to be correct in the face of large functions. The algorithm it used before wasn't 100% correct, we now use an iterative expansion model. This fixes assembler errors when compiling 403.gcc with tail merging enabled. Change the way the branch selector works overall: Now, the isel generates PPC::BCC instructions (as it used to) directly, and these BCC instructions are emitted to the output or jitted directly if branches don't need expansion. Only if branches need expansion are instructions rewritten and created. This should make branch select faster, and eliminates the Bxx instructions from the .td file. llvm-svn: 31837	2006-11-18 00:32:03 +00:00

155 Commits