llvm-project

Commit Graph

Author	SHA1	Message	Date
Nick Lewycky	1a7ed5868b	Make the 'icmp pred trunc(ext(X)), CST --> icmp pred X, ext(trunc(CST))' transformation much more careful. Truncating binary '01' to '1' sounds like it's safe until you realize that it switched from positive to negative under a signed interpretation, and that depends on the icmp predicate. Also a few miscellaneous cleanups. llvm-svn: 97721	2010-03-04 06:54:10 +00:00
Chris Lattner	3afc0721c7	fix incorrect folding of icmp with undef, PR6481. llvm-svn: 97659	2010-03-03 19:46:03 +00:00
Bill Wendling	af13d82945	This test case: long test(long x) { return (x & 123124) \| 3; } Currently compiles to: _test: orl $3, %edi movq %rdi, %rax andq $123127, %rax ret This is because instruction and DAG combiners canonicalize (or (and x, C), D) -> (and (or, D), (C \| D)) However, this is only profitable if (C & D) != 0. It gets in the way of the 3-addressification because the input bits are known to be zero. llvm-svn: 97616	2010-03-03 00:35:56 +00:00
Dan Gohman	6f34abd092	Floating-point add, sub, and mul are now spelled fadd, fsub, and fmul, respectively. llvm-svn: 97531	2010-03-02 01:11:08 +00:00
Dan Gohman	882c95605f	LLVM instruction syntax doesn't have trailing semicolons. llvm-svn: 97456	2010-03-01 17:53:15 +00:00
John McCall	dcb9a7ad3d	Teach APFloat how to create both QNaNs and SNaNs and with arbitrary-width payloads. APFloat's internal folding routines always make QNaNs now, instead of sometimes making QNaNs and sometimes SNaNs depending on the type. llvm-svn: 97364	2010-02-28 02:51:25 +00:00
Dan Gohman	cd4c03e886	Don't do (X != Y) ? X : Y -> X for floating-point values; it doesn't handle NaN properly. Do (X une Y) ? X : Y -> X if one of X and Y is not zero. llvm-svn: 96955	2010-02-23 17:17:57 +00:00
Dan Gohman	8a0eb36d23	Remove the code which constant-folded ptrtoint(inttoptr(x)+c) to getelementptr. Despite only doing so in the case where x is a known array object and c can be converted to an index within range, this could still be invalid if c is actually the address of an object allocated outside of LLVM. Also, SCEVExpander, the original motivation for this code, has since been improved to avoid inttoptr+ptroint in more cases. llvm-svn: 96950	2010-02-23 16:35:41 +00:00
Dan Gohman	e7f6feb469	Convert this test to FileCheck and add a testcase for PR3574. llvm-svn: 96851	2010-02-23 01:28:09 +00:00
Evan Cheng	3688b8fa68	Instcombine constant folding can normalize gep with negative index to index with large offset. When instcombine objsize checking transformation sees these geps where the offset seemingly point out of bound, it should just return "i don't know" rather than asserting. llvm-svn: 96825	2010-02-22 23:34:00 +00:00
Dan Gohman	d8abbf0af6	Add a test for canonicalizing ConstantExpr operands. llvm-svn: 96820	2010-02-22 23:07:52 +00:00
Dan Gohman	754e4a9801	Constant-fold certain comparisons with infinity and negative infinity. llvm-svn: 96777	2010-02-22 04:06:03 +00:00
Dan Gohman	cf39be32bf	Fold bswap(undef) to undef. llvm-svn: 96432	2010-02-17 00:54:58 +00:00
Eric Christopher	843a4cc43c	Fix a problem where we had bitcasted operands that gave us odd offsets since the bitcasted pointer size and the offset pointer size are going to be different types for the GEP vs base object. llvm-svn: 96134	2010-02-13 23:38:01 +00:00
Eric Christopher	cccdc13662	Make sure that ConstantExpr offsets also aren't off of extern symbols. Thanks to Duncan Sands for the testcase! llvm-svn: 95877	2010-02-11 17:44:04 +00:00
Chris Lattner	4e8137d678	Rename ValueRequiresCast to ShouldOptimizeCast, to better reflect what it does. Enhance it to return false to optimizing vector sign extensions from vector comparisions, which is the idiom used to get a splatted vector for a vector comparison. Doing this breaks vector-casts.ll, add some compensating transformations to handle the important case they cover without depending on this canonicalization. This fixes rdar://7434900 a serious pessimization of vector compares. llvm-svn: 95855	2010-02-11 06:26:33 +00:00
Chris Lattner	1d4eb8fac4	convert to filecheck. llvm-svn: 95854	2010-02-11 06:24:37 +00:00
Eric Christopher	531ea566a6	Add ConstantExpr handling to Intrinsic::objectsize lowering. Update testcase accordingly now that we can optimize another section. llvm-svn: 95846	2010-02-11 01:48:54 +00:00
Eric Christopher	7b7028fd24	Move Intrinsic::objectsize lowering back to InstCombineCalls and enable constant 0 offset lowering. llvm-svn: 95691	2010-02-09 21:24:27 +00:00
Eric Christopher	ad1aa86276	Pull these back out, they're a little too aggressive and time consuming for a simple optimization. llvm-svn: 95671	2010-02-09 17:29:18 +00:00
Chris Lattner	9b6a1789e5	fix PR6193, only considering sign extensions from i1 for this xform. llvm-svn: 95642	2010-02-09 01:12:41 +00:00
Eric Christopher	9f85e7eb16	Add a new pass to do llvm.objsize lowering using SCEV. Initial skeleton and SCEVUnknown lowering implemented, the rest should come relatively quickly. Move testcase to new directory. Move pass to right before SimplifyLibCalls - which is moved down a bit so we can take advantage of a few opts. llvm-svn: 95628	2010-02-09 00:35:38 +00:00
Chris Lattner	64ffd11d49	fix logical-select to invoke filecheck right, and fix hte instcombine xform it is checking to actually pass. There is no need to match m_SelectCst<0, -1> since instcombine canonicalizes that into not(sext). Add matches for sext(not(x)) in addition to not(sext(x)). llvm-svn: 95420	2010-02-05 19:53:02 +00:00
Eric Christopher	04371b4f12	Remove this code for now. I have a better idea and will rewrite with that in mind. llvm-svn: 95402	2010-02-05 19:04:06 +00:00
Eric Christopher	107a1fbf61	Temporarily revert this since it appears to have caused a build failure. llvm-svn: 95294	2010-02-04 06:41:27 +00:00
Eric Christopher	42fa84a880	Rework constant expr and array handling for objectsize instcombining. Fix bugs where we would compute out of bounds as in bounds, and where we couldn't know that the linker could override the size of an array. Add a few new testcases, change existing testcase to use a private global array instead of extern. llvm-svn: 95283	2010-02-04 02:55:34 +00:00
Eric Christopher	f12e18db21	If we're dealing with a zero-length array, don't lower to any particular size, we just don't know what the length is yet. llvm-svn: 95266	2010-02-03 23:56:07 +00:00
Eric Christopher	d86233c118	Recommit this, looks like it wasn't the cause. llvm-svn: 95165	2010-02-03 00:21:58 +00:00
Eric Christopher	e67d01a9a8	Hopefully temporarily revert this. llvm-svn: 95154	2010-02-02 23:01:31 +00:00
Eric Christopher	4264e7e46f	Re-add strcmp and known size object size checking optimization. Passed bootstrap and nightly test run here. llvm-svn: 95145	2010-02-02 22:10:43 +00:00
Chris Lattner	8e2c471614	don't turn (A & (C0?-1:0)) \| (B & ~(C0?-1:0)) -> C0 ? A : B for vectors. Codegen is generating awful code or segfaulting in various cases (e.g. PR6204). llvm-svn: 95058	2010-02-02 02:43:51 +00:00
Chris Lattner	94eb4b285b	fix PR6195, a bug constant folding scalar -> vector compares. llvm-svn: 94997	2010-02-01 20:04:40 +00:00
Dan Gohman	e5e1b7b05a	Generalize target-independent folding rules for sizeof to handle more cases, and implement target-independent folding rules for alignof and offsetof. Also, reassociate reassociative operators when it leads to more folding. Generalize ScalarEvolution's isOffsetOf to recognize offsetof on arrays. Rename getAllocSizeExpr to getSizeOfExpr, and getFieldOffsetExpr to getOffsetOfExpr, for consistency with analagous ConstantExpr routines. Make the target-dependent folder promote GEP array indices to pointer-sized integers, to make implicit casting explicit and exposed to subsequent folding. And add a bunch of testcases for this new functionality, and a bunch of related existing functionality. llvm-svn: 94987	2010-02-01 18:27:38 +00:00
Chris Lattner	846a52e228	fix rdar://7590304, a miscompilation of objc apps on arm. The caller of objc message send was getting marked arm_apcscc, but the prototype isn't. This is fine at runtime because objcmsgsend is implemented in assembly. Only turn a mismatched caller and callee into 'unreachable' if the callee is a definition. llvm-svn: 94986	2010-02-01 18:11:34 +00:00
Chris Lattner	2cecedf081	fix rdar://7590304, an infinite loop in instcombine. In the invoke case, instcombine can't zap the invoke for fear of changing the CFG. However, we have to do something to prevent the next iteration of instcombine from inserting another store -> undef before the invoke thereby getting into infinite iteration between dead store elim and store insertion. Just zap the callee to null, which will prevent the next iteration from doing anything. llvm-svn: 94985	2010-02-01 18:04:58 +00:00
Eli Friedman	690c7f4dfd	Remove test which is no longer relevant. llvm-svn: 94944	2010-01-31 04:40:45 +00:00
Eli Friedman	a2cc2875fc	Simplify/generalize the xor+add->sign-extend instcombine. llvm-svn: 94943	2010-01-31 04:29:12 +00:00
Eli Friedman	37a8197b61	Add a small transform: transform -(X<<Y) to (-X<<Y) when the shift has a single use and X is free to negate. llvm-svn: 94941	2010-01-31 02:30:23 +00:00
Bob Wilson	2e51136f80	Remove ARM-specific calling convention from this test. Target data is needed for this test, but otherwise, there's nothing ARM-specific about it and no need to specify the calling convention. llvm-svn: 94862	2010-01-30 00:40:23 +00:00
Eric Christopher	5a0e174863	Revert my last couple of patches. They appear to have broken bison. llvm-svn: 94841	2010-01-29 21:16:24 +00:00
Bob Wilson	7c42b9d51e	Improve isSafeToLoadUnconditionally to recognize that GEPs with constant indices are safe if the result is known to be within the bounds of the underlying object. llvm-svn: 94829	2010-01-29 19:19:08 +00:00
Eric Christopher	997f7ca8c5	Add constant support to object size handling and remove default lowering. We'll either figure it out, or not and be lowered by SelectionDAGBuild. Add test. llvm-svn: 94775	2010-01-29 01:09:57 +00:00
Duncan Sands	3a48b87c54	Fix PR6165. The bug was that LHSKnownZero was being and'd with DemandedMask when it should have been and'd with LowBits. Fix that and while there beef up the logic in the case of a negative LHS. llvm-svn: 94745	2010-01-28 17:22:42 +00:00
Chris Lattner	1b35bbe813	change the canonical form of "cond ? -1 : 0" to be "sext cond" instead of a select. This simplifies some instcombine code, matches the policy for zext (cond ? 1 : 0 -> zext), and allows us to generate better code for a testcase on ppc. llvm-svn: 94339	2010-01-24 00:09:49 +00:00
Chris Lattner	249da5cb73	implement a simple instcombine xform that has been in the readme forever. llvm-svn: 94318	2010-01-23 18:49:30 +00:00
Mon P Wang	e04b4567f8	InstCombine should not fold sext/zext of a vector and a bitcast to a scalar to a sext/zext llvm-svn: 94280	2010-01-23 04:35:57 +00:00
Chris Lattner	18f49ce2d3	optimize ~(~X >>s Y) --> (X >>s Y), patch by Edmund Grimley Evans! llvm-svn: 93884	2010-01-19 18:16:19 +00:00
Chris Lattner	43f2fa6201	my instcombine transformations to make extension elimination more aggressive changed the canonical form from sext(trunc(x)) to ashr(lshr(x)), make sure to transform a couple more things into that canonical form, and catch a case where we missed turning zext/shl/ashr into a single sext. llvm-svn: 93787	2010-01-18 22:19:16 +00:00
Chris Lattner	2b03aaafbf	filecheckize this. llvm-svn: 93776	2010-01-18 22:00:46 +00:00
Chris Lattner	48b753ef9f	remove a redundant test, filecheckize another. llvm-svn: 93774	2010-01-18 21:55:43 +00:00
Bill Wendling	3fbf36d0b4	Reduce fsub-fadd.ll and merge it into fsub-fsub.ll. Rename fsub-fsub.ll to fsub.ll and FileCheckify it. llvm-svn: 93669	2010-01-17 00:21:21 +00:00
Bill Wendling	ad7a5b07a7	When the visitSub method was split into visitSub and visitFSub, this xform was added to the FSub version. However, the original version of this xform guarded against doing this for floating point (!Op0->getType()->isFPOrFPVector()). This is causing LLVM to perform incorrect xforms for code like: void func(double rhi, double rlo, double xh, double xl, double yh, double yl){ double mh, ml; double c = 134217729.0; double up, u1, u2, vp, v1, v2; up = xhc; u1 = (xh - up) + up; u2 = xh - u1; vp = yhc; v1 = (yh - vp) + vp; v2 = yh - v1; mh = xhyh; ml = (((u1v1 - mh) + (u1v2)) + (u2v1)) + (u2v2); ml += xhyl + xlyh; rhi = mh + ml; rlo = (mh - (rhi)) + ml; } The last line was optimized away, but rl is intended to be the difference between the infinitely precise result of mh + ml and after it has been rounded to double precision. llvm-svn: 93369	2010-01-13 23:23:17 +00:00
Chris Lattner	2d2c0a00ad	disable this testcase, PR5997 llvm-svn: 93206	2010-01-11 23:18:33 +00:00
Chris Lattner	9518869423	add one more bitfield optimization, allowing clang to generate good code on PR4216: _test_bitfield: ## @test_bitfield orl $32962, %edi movl $4294941946, %eax andq %rdi, %rax ret instead of: _test_bitfield: movl $4294941696, %ecx movl %edi, %eax orl $194, %edi orl $32768, %eax andq $250, %rdi andq %rax, %rcx movq %rdi, %rax orq %rcx, %rax ret Evan is looking into the remaining andq+imm -> andl optimization. llvm-svn: 93147	2010-01-11 06:55:24 +00:00
Chris Lattner	0a85420409	Extend CanEvaluateZExtd to handle and/or/xor more aggressively in the BitsToClear case. This allows it to promote expressions which have an and/or/xor after the lshr, promoting cases like test2 (from PR4216) and test3 (random extample extracted from a spec benchmark). clang now compiles the code in PR4216 into: _test_bitfield: ## @test_bitfield movl %edi, %eax orl $194, %eax movl $4294902010, %ecx andq %rax, %rcx orl $32768, %edi andq $39936, %rdi movq %rdi, %rax orq %rcx, %rax ret instead of: _test_bitfield: ## @test_bitfield movl %edi, %eax orl $194, %eax movl $4294902010, %ecx andq %rax, %rcx shrl $8, %edi orl $128, %edi shlq $8, %rdi andq $39936, %rdi movq %rdi, %rax orq %rcx, %rax ret which is still not great, but is progress. llvm-svn: 93145	2010-01-11 04:05:13 +00:00
Chris Lattner	12bd8992b3	Remove the dead TD argument to CanEvaluateZExtd, and add a new BitsToClear result which allows us to start promoting expressions that end with a lshr-by-constant. This is conservatively correct and better than what we had before (see testcases) but still needs to be extended further. llvm-svn: 93144	2010-01-11 03:32:00 +00:00
Chris Lattner	7dd540ee24	teach sext optimization to handle truncs from types that are not the dest of the sext. llvm-svn: 93128	2010-01-10 20:30:41 +00:00
Chris Lattner	39d2daa94c	teach zext optimization how to deal with truncs that don't come from the zext dest type. This allows us to handle test52/53 in cast.ll, and allows llvm-gcc to generate much better code for PR4216 in -m64 mode: _test_bitfield: ## @test_bitfield orl $32962, %edi movl %edi, %eax andl $-25350, %eax ret This also fixes a bug handling vector extends, ensuring that the mask produced is a vector constant, not an integer constant. llvm-svn: 93127	2010-01-10 20:25:54 +00:00
Chris Lattner	2fff10c424	now that the cost model has changed, we can always consider elimination of a sign extend to be a win, which simplifies the client of CanEvaluateSExtd, and allows us to eliminate more casts (examples taken from real code). llvm-svn: 93109	2010-01-10 07:40:50 +00:00
Chris Lattner	d8509424a4	change the preferred canonical form for a sign extension to be lshr+ashr instead of trunc+sext. We want to avoid type conversions whenever possible, it is easier to codegen expressions without truncates and extensions. llvm-svn: 93107	2010-01-10 07:08:30 +00:00
Chris Lattner	49d2c9764d	two changes: 1) don't try to optimize a sext or zext that is only used by a trunc, let the trunc get optimized first. This avoids some pointless effort in some common cases since instcombine scans down a block in the first pass. 2) Change the cost model for zext elimination to consider an 'and' cheaper than a zext. This allows us to do it more aggressively, and for the next patch to simplify the code quite a bit. llvm-svn: 93097	2010-01-10 02:39:31 +00:00
Chris Lattner	f0af17dab3	enhance CanEvaluateZExtd to handle shift left and sext, allowing more expressions to be promoted and casts eliminated. llvm-svn: 93096	2010-01-10 02:22:12 +00:00
Chris Lattner	a1e223ea10	teach instcombine to delete sign extending shift pairs (sra(shl X, C), C) when the input is already sign extended. llvm-svn: 93019	2010-01-08 19:04:21 +00:00
Chris Lattner	35d3b9dcd0	teach ComputeNumSignBits to look through PHI nodes. llvm-svn: 92964	2010-01-07 23:44:37 +00:00
Chris Lattner	7b6a068801	filecheckize llvm-svn: 92963	2010-01-07 23:42:23 +00:00
Chris Lattner	3057c37959	Enhance instcombine to reason more strongly about promoting computation that feeds into a zext, similar to the patch I did yesterday for sext. There is a lot of room for extension beyond this patch. llvm-svn: 92962	2010-01-07 23:41:00 +00:00
Chris Lattner	98748c0964	Teach instcombine's sext elimination logic to be more aggressive. Previously, instcombine would only promote an expression tree to the larger type if doing so eliminated two casts. This is because a need to manually do the sign extend after the promoted expression tree with two shifts. Now, we keep track of whether the result of the computation is going to be properly sign extended already. If so, we can unconditionally promote the expression, which allows us to zap more sext's. This implements rdar://6598839 (aka gcc pr38751) llvm-svn: 92815	2010-01-06 01:56:21 +00:00
Chris Lattner	a93c63c22d	more rearrangement and cleanup, fix my test failure. llvm-svn: 92792	2010-01-05 22:21:18 +00:00
Chris Lattner	f88dd5ed64	remove two trunc xforms that are subsumed by EvaluateInDifferentType. The only difference is that EvaluateInDifferentType checks to ensure they are profitable before doing them :) llvm-svn: 92788	2010-01-05 22:01:41 +00:00
Chris Lattner	7c0c64831c	merge some tests. llvm-svn: 92786	2010-01-05 21:54:09 +00:00
Chris Lattner	dec34d8cef	merge cast2 into cast.ll llvm-svn: 92784	2010-01-05 21:48:13 +00:00
Chris Lattner	f312cfa250	remove useless test. llvm-svn: 92782	2010-01-05 21:46:22 +00:00
Chris Lattner	16c7a332d2	another example. llvm-svn: 92781	2010-01-05 21:43:08 +00:00
Chris Lattner	9068136944	remove a useless negative test, add a rdar # to an xfail that I'm working on. llvm-svn: 92777	2010-01-05 21:37:44 +00:00
Chris Lattner	e3e6f4ce62	clean up tests. llvm-svn: 92776	2010-01-05 21:32:59 +00:00
Chris Lattner	44a63815b9	just remove this xform which is subsumed by others. llvm-svn: 92775	2010-01-05 21:16:30 +00:00
Chris Lattner	54f4e39956	optimize comparisons against cttz/ctlz/ctpop, patch by Alastair Lynn! llvm-svn: 92745	2010-01-05 18:09:56 +00:00
Dan Gohman	fb4193625a	Delete useless trailing semicolons. llvm-svn: 92740	2010-01-05 17:55:26 +00:00
Chris Lattner	9da1cb243b	optimize cttz and ctlz when we can prove something about the leading/trailing bits. Patch by Alastair Lynn! llvm-svn: 92706	2010-01-05 07:23:56 +00:00
Chris Lattner	a751d09c08	Truncate GEP indexes larger than the pointer size down to pointer size when doing this transform if the GEP is not inbounds. No testcase because it is very difficult to trigger this: instcombine already canonicalizes GEP indices to pointer size, so it relies specific permutations of the instcombine worklist. Thanks to Duncan for pointing this possible problem out. llvm-svn: 92495	2010-01-04 18:57:15 +00:00
Chris Lattner	2d91231d82	implement an instcombine xform needed by clang's codegen on the example in PR4216. This doesn't trigger in the testsuite, so I'd really appreciate someone scrutinizing the logic for correctness. llvm-svn: 92458	2010-01-04 06:03:59 +00:00
Chris Lattner	fca0c8f93a	generalize the previous transformation to handle indexing into arrays of structs and other arrays, so long as all the subsequent indexes are constants. This triggers frequently for stuff like: @divisions = internal constant [29 x [2 x i32]] [[2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 2], [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2]], align 32 ; <[29 x [2 x i32]]> [#uses=50] %623 = getelementptr inbounds [29 x [2 x i32]] @divisions, i64 0, i64 %619, i64 0 ; <i32*> [#uses=1] %684 = icmp eq i32 %683, 999 also for the "my_defs" table in 'gs', etc. llvm-svn: 92444	2010-01-03 03:03:27 +00:00
Chris Lattner	98ad2b56cc	teach instcombine to optimize idioms like A[i]&42 == 0. This occurs in 403.gcc in mode_mask_array, in safe-ctype.c (which is copied in multiple apps) in _sch_istable, etc. llvm-svn: 92427	2010-01-02 22:08:28 +00:00
Chris Lattner	b56bef45f8	Teach the table lookup optimization to generate range compares when a consequtive sequence of elements all satisfies the predicate. Like the double compare case, this generates better code than the magic constant case and generalizes to more than 32/64 element array lookups. Here are some examples where it triggers. From 403.gcc, most accesses to the rtx_class array are handled, e.g.: @rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]> [#uses=547] %142 = icmp eq i8 %141, 105 @rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]> [#uses=543] %165 = icmp eq i8 %164, 60 Also, most of the 59-element arrays (mode_class/rid_to_yy, etc) optimized before are actually range compares. This lets 32-bit machines optimize them. 400.perlbmk has stuff like this: 400.perlbmk: PL_regkind, even for 32-bit: @PL_regkind = constant [62 x i8] c"\00\00\02\02\02\06\06\06\06\09\09\0B\0B\0D\0E\0E\0E\11\12\12\14\14\16\16\18\18\1A\1A\1C\1C\1E\1F !!!$$&'((((,-.///88886789:;8$", align 32 ; <[62 x i8]> [#uses=4] %811 = icmp ne i8 %810, 33 @PL_utf8skip = constant [256 x i8] c"\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\04\04\04\04\04\04\04\04\05\05\05\05\06\06\07\0D", align 32 ; <[256 x i8]> [#uses=94] %12 = icmp ult i8 %10, 2 etc. llvm-svn: 92426	2010-01-02 21:50:18 +00:00
Nick Lewycky	a67519be12	Fix logic error in previous commit. The != case needs to become an or, not an and. llvm-svn: 92419	2010-01-02 16:14:56 +00:00
Nick Lewycky	357d41b3c1	Optimize pointer comparison into the typesafe form, now that the backends will handle them efficiently. This is the opposite direction of the transformation we used to have here. llvm-svn: 92418	2010-01-02 15:25:44 +00:00
Chris Lattner	cfda435c73	Generalize the previous xform to handle cases where exactly two elements match or don't match with two comparisons. For example, the testcase compiles into: define i1 @test5(i32 %X) { %1 = icmp eq i32 %X, 2 ; <i1> [#uses=1] %2 = icmp eq i32 %X, 7 ; <i1> [#uses=1] %R = or i1 %1, %2 ; <i1> [#uses=1] ret i1 %R } This generalizes the previous xforms when the array is larger than 64 elements (and this case matches) and generates better code for cases where it overlaps with the magic bitshift case. This generalizes more cases than you might expect. For example, 400.perlbmk has: @PL_utf8skip = constant [256 x i8] c"\01\01\01\... %15 = icmp ult i8 %7, 7 403.gcc has: @rid_to_yy = internal constant [114 x i16] [i16 259, i16 260, ... %18 = icmp eq i16 %16, 295 and xalancbmk has a bunch of examples, such as _ZN11xercesc_2_5L15gCombiningCharsE and _ZN11xercesc_2_5L10gBaseCharsE. llvm-svn: 92417	2010-01-02 09:35:17 +00:00
Chris Lattner	935a4a606a	enhance the compare/load/index optimization to work on any load from a global with 32/64 elements or less (depending on whether i64 is native on the target), generating a bitshift idiom to determine the result. For example, on test4 we produce: define i1 @test4(i32 %X) { %1 = lshr i32 933, %X ; <i32> [#uses=1] %2 = and i32 %1, 1 ; <i32> [#uses=1] %R = icmp ne i32 %2, 0 ; <i1> [#uses=1] ret i1 %R } This triggers in a number of interesting cases, for example, here's an fp case: @A.3255 = internal constant [4 x double] [double 4.100000e+00, double -3.900000e+00, double -1.000000e+00, double 1.000000e+00], align 32 ; <[4 x double]> [#uses=7] ... %7 = fcmp olt double %3, 0.000000e+00 In this case we make the slen2_tab global dead, which is nice: @slen2_tab = internal constant [16 x i32] [i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3, i32 1, i32 2, i32 3, i32 1, i32 2, i32 3, i32 2, i32 3], align 32 ; <[16 x i32]> [#uses=1] ... %204 = icmp eq i32 %46, 0 Perl has a bunch of these, also on the 'Perl_regkind' array: @Perl_yygindex = internal constant [51 x i16] [i16 0, i16 0, i16 0, i16 0, i16 374, i16 351, i16 0, i16 -12, i16 0, i16 946, i16 413, i16 -83, i16 0, i16 0, i16 0, i16 -311, i16 -13, i16 4007, i16 2893, i16 0, i16 0, i16 0, i16 0, i16 0, i16 372, i16 -8, i16 0, i16 0, i16 246, i16 -131, i16 43, i16 86, i16 208, i16 -45, i16 -169, i16 987, i16 0, i16 0, i16 0, i16 0, i16 308, i16 0, i16 -271, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0], align 32 ; <[51 x i16]> [#uses=1] ... %1364 = icmp eq i16 %1361, 0 186.crafty really likes this on 64-bit machines, because it triggers on a bunch of globals like this: @white_outpost = internal constant [64 x i8] c"\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\02\02\00\00\00\00\00\04\05\05\04\00\00\00\00\03\06\06\03\00\00\00\00\00\01\01\00\00\00\00\00\00\00\00\00\00\00", align 32 ; <[64 x i8]> [#uses=2] However the big winner is 403.gcc, which triggers hundreds of times, eliminating all the accesses to the 57-element arrays 'mode_class', mode_unit_size, mode_bitsize, regclass_map, etc. go 64-bit machines :) llvm-svn: 92415	2010-01-02 08:56:52 +00:00
Chris Lattner	b1567bd584	enhance the previous optimization to work with fcmp in addition to icmp. llvm-svn: 92412	2010-01-02 08:20:51 +00:00
Chris Lattner	a061859ccc	Teach instcombine to fold compares of loads from constant arrays with variable indices into a comparison of the index with a constant. The most common occurrence of this that I see by far is stuff like: if ("foobar"[i] == '\0') ... which we compile into: if (i == 6), saving a load and materialization of the global address. This also exposes loop trip count information to later passes in many cases. This triggers hundreds of times in xalancbmk, which is where I first noticed it, but it also triggers in many other apps. Here are a few interesting ones from various apps: @must_be_connected_without = internal constant [8 x i8] [i8 getelementptr inbounds ([3 x i8]* @.str64320, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str27283, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str71327, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str72328, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str18274, i64 0, i64 0), i8* getelementptr inbounds ([6 x i8]* @.str11267, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str32288, i64 0, i64 0), i8* null], align 32 ; <[8 x i8]> [#uses=2] %scevgep.i = getelementptr [8 x i8] @must_be_connected_without, i64 0, i64 %indvar.i ; <i8*> [#uses=1] %17 = load ... %18 = icmp eq i8 %17, null ; <i1> [#uses=1] -> icmp eq i64 %indvar.i, 7 @yytable1095 = internal constant [84 x i8] c"\12\01(\05\06\07\08\09\0A\0B\0C\0D\0E1\0F\10\11266\1D: \10\11,-,0\03'\10\11B6\04\17&\18\1945\05\06\07\08\09\0A\0B\0C\0D\0E\1E\0F\10\11\1A\1B\1C$3+>#%;<IJ=ADFEGH9KL\00\00\00C", align 32 ; <[84 x i8]> [#uses=2] %57 = getelementptr inbounds [84 x i8]* @yytable1095, i64 0, i64 %56 ; <i8> [#uses=1] %mode.0.in = getelementptr inbounds [9 x i32] @mb_mode_table, i64 0, i64 %.pn ; <i32> [#uses=1] load ... %64 = icmp eq i8 %58, 4 ; <i1> [#uses=1] -> icmp eq i64 %.pn, 35 ; <i1> [#uses=0] @gsm_DLB = internal constant [4 x i16] [i16 6554, i16 16384, i16 26214, i16 32767] %scevgep.i = getelementptr [4 x i16] @gsm_DLB, i64 0, i64 %indvar.i ; <i16*> [#uses=1] %425 = load %scevgep.i %426 = icmp eq i16 %425, -32768 ; <i1> [#uses=0] -> false llvm-svn: 92411	2010-01-02 08:12:04 +00:00
Chris Lattner	2e4be2c340	remove the instcombine transformations that are inserting nasty pointer to int casts that confuse later optimizations. See PR3351 for details. This improves but doesn't complete fix 483.xalancbmk because llvm-gcc does this xform in GCC's "fold" routine as well. Clang++ will do better I guess. llvm-svn: 92408	2010-01-02 00:31:05 +00:00
Chris Lattner	faf1337acb	add a simple instcombine xform, simplify another one to use hasAllZeroIndices() instead of hand rolling a loop. llvm-svn: 92403	2010-01-01 23:09:08 +00:00
Chris Lattner	30c0a2833d	generalize the pointer difference optimization to handle a constantexpr gep on the 'base' side of the expression. This completes comment #4 in PR3351, which comes from 483.xalancbmk. llvm-svn: 92402	2010-01-01 22:42:29 +00:00
Chris Lattner	4394f71752	teach instcombine to optimize pointer difference idioms involving constant expressions. This is a step towards comment #4 in PR3351. llvm-svn: 92401	2010-01-01 22:29:12 +00:00
Chris Lattner	25c87e9cf9	implement the transform requested in PR5284 llvm-svn: 92398	2010-01-01 18:34:40 +00:00
Chris Lattner	8330daf733	add a few trivial instcombines for llvm.powi. llvm-svn: 92383	2010-01-01 01:52:15 +00:00
Chris Lattner	7bc85a931e	add check lines for min/max tests. llvm-svn: 91816	2009-12-21 06:08:50 +00:00
Chris Lattner	33269813df	really convert this to filecheck. llvm-svn: 91815	2009-12-21 06:06:10 +00:00
Chris Lattner	d4fb4296df	give instcombine some helper functions for matching MIN and MAX, and implement some optimizations for MIN(MIN()) and MAX(MAX()) and MIN(MAX()) etc. This substantially improves the code in PR5822 but doesn't kick in much elsewhere. 2 max's were optimized in pairlocalalign and one in smg2000. llvm-svn: 91814	2009-12-21 06:03:05 +00:00
Chris Lattner	7a72d50de7	filecheckize llvm-svn: 91813	2009-12-21 05:53:13 +00:00
Chris Lattner	ffbd02829c	enhance x-(-A) -> x+A to preserve NUW/NSW. Use the presence of NSW/NUW to fold "icmp (x+cst), x" to a constant in cases where it would otherwise be undefined behavior. Surprisingly (to me at least), this triggers hundreds of the times in a few benchmarks: lencode, ldecode, and 466.h264ref seem to really like this. llvm-svn: 91812	2009-12-21 04:04:05 +00:00
Chris Lattner	900ce231f9	Optimize all cases of "icmp (X+Cst), X" to something simpler. This triggers a bunch in lencode, ldecod, spass, 176.gcc, 252.eon, among others. It is also the first part of PR5822 llvm-svn: 91811	2009-12-21 03:19:28 +00:00
Chris Lattner	d18b455086	convert to filecheck llvm-svn: 91810	2009-12-21 03:11:05 +00:00
Chris Lattner	4ad5eba568	fix PR5827 by disabling the phi slicing transformation in a case where instcombine would have to split a critical edge due to a phi node of an invoke. Since instcombine can't change the CFG, it has to bail out from doing the transformation. llvm-svn: 91763	2009-12-19 07:01:15 +00:00
Eli Friedman	86b9d75dc8	Optimize icmp of null and select of two constants even if the select has multiple uses. (The construct in question was found in gcc.) llvm-svn: 91675	2009-12-18 08:22:35 +00:00
Eli Friedman	250b119d98	Allow instcombine to combine "sext(a) >u const" to "a >u trunc(const)". llvm-svn: 91631	2009-12-17 22:42:29 +00:00
Eli Friedman	7cc86b4cc6	Make the ptrtoint comparison simplification work if one side is a global. llvm-svn: 91624	2009-12-17 21:27:47 +00:00
Eli Friedman	5842c9968a	Slightly generalize transformation of memmove(a,a,n) so that it also applies to memcpy. (Such a memcpy is technically illegal, but in practice is safe and is generated by struct self-assignment in C code.) llvm-svn: 91621	2009-12-17 21:07:31 +00:00
Eli Friedman	e67cae33e1	Aggressively flip compare constant expressions where appropriate; constant folding in particular expects null to be on the RHS. llvm-svn: 91587	2009-12-17 06:07:04 +00:00
Benjamin Kramer	401e6093c9	Fix some CHECK lines which were ignored by accident. llvm-svn: 91214	2009-12-12 09:25:50 +00:00
Nick Lewycky	a0e9d700dc	Generalize this optimization to work on equality comparisons between any two integers that are constant except for a single bit (the same n-th bit in each). llvm-svn: 90646	2009-12-05 05:00:00 +00:00
Chris Lattner	77c36d68f3	fix PR5673 by being more careful about pointers to functions. llvm-svn: 90369	2009-12-03 01:05:45 +00:00
Chris Lattner	4ca1981e82	merge sext-2 into sext.ll llvm-svn: 90293	2009-12-02 05:34:35 +00:00
Chris Lattner	0a12a8f9fe	rename test llvm-svn: 90292	2009-12-02 05:32:33 +00:00
Chris Lattner	fe206d2a13	filecheckize llvm-svn: 90291	2009-12-02 05:32:16 +00:00
Mon P Wang	bb3eac9e7a	Fixed an assertion failure for tracking sext of a vector of integers llvm-svn: 90290	2009-12-02 04:59:58 +00:00
Nick Lewycky	e35e6f097d	Teach ConstantFolding to do a better job when folding gep(bitcast). This permits the devirtualization of llvm.org/PR3100#c9 when compiled by clang. llvm-svn: 90099	2009-11-29 21:40:55 +00:00
Chris Lattner	1cc4cca193	add testcases for the foo_with_overflow op xforms added recently and fix bugs exposed by the tests. Testcases from Alastair Lynn! llvm-svn: 90056	2009-11-29 02:57:29 +00:00
Chris Lattner	cd261c9c26	Implement PR5634. llvm-svn: 90046	2009-11-29 00:51:17 +00:00
Chris Lattner	a73ecf0b00	Fix PR5471 by removing an instcombine xform. Some pieces of the code generates store to undef and some generates store to null as the idiom for undefined behavior. Since simplifycfg zaps both, don't remove the undefined behavior in instcombine. llvm-svn: 89971	2009-11-26 22:04:42 +00:00
Dan Gohman	580b80d6d9	Make ConstantFoldConstantExpression recursively visit the entire ConstantExpr, not just the top-level operator. This allows it to fold many more constants. Also, make GlobalOpt call ConstantFoldConstantExpression on GlobalVariable initializers. llvm-svn: 89659	2009-11-23 16:22:21 +00:00
Nick Lewycky	922d4ab574	Reapply r88830 with a bugfix: this transform only applies to icmp eq/ne. This fixes part of PR5438. llvm-svn: 89639	2009-11-23 03:17:33 +00:00
Nick Lewycky	95148689c9	Revert r88830 and r88831 which appear to have caused a selfhost buildbot some grief. I suspect this patch merely exposed a bug else. llvm-svn: 88841	2009-11-15 07:47:32 +00:00
Nick Lewycky	6a6ac7e105	Correct typo. llvm-svn: 88831	2009-11-15 06:16:57 +00:00
Nick Lewycky	e29fa4c7a1	Teach instcombine to look for booleans in wider integers when it encounters a zext(icmp). It may be able to optimize that away. This fixes one of the cases in PR5438. llvm-svn: 88830	2009-11-15 05:55:17 +00:00
Duncan Sands	ba61fed5d3	Don't trivially delete unused calls to llvm.invariant.start. This allows llvm.invariant.start to be used without necessarily being paired with a call to llvm.invariant.end. If you run the entire optimization pipeline then such calls are in fact deleted (adce does it), but that's actually a good thing since we probably do want them to be zapped late in the game. There should really be an integration test that checks that the llvm.invariant.start call lasts long enough that all passes that do interesting things with it get to do their stuff before it is deleted. But since no passes do anything interesting with it yet this will have to wait for later. llvm-svn: 86840	2009-11-11 15:34:13 +00:00
Chris Lattner	1559bedcc7	unify the code that determines whether it is a good idea to change the type of a computation. This fixes some infinite loops when dealing with TD that has no native types. llvm-svn: 86670	2009-11-10 07:23:37 +00:00
Chris Lattner	39c07b2eef	if a 'with overflow' intrinsic just has the normal result used, simplify it to a normal binop. Patch by Alastair Lynn, testcase by me. llvm-svn: 86524	2009-11-09 07:07:56 +00:00
Chris Lattner	0685be3441	enhance PHI slicing to handle the case when a slicable PHI is begin used by a chain of other PHIs. llvm-svn: 86503	2009-11-09 01:38:00 +00:00
Chris Lattner	2299d4b6d8	Teach an instcombine to not pull trunc instructions through PHI nodes when both the source and dest are illegal types, since it would cause the phi to grow (for example, we shouldn't transform test14b's phi to a phi on i320). This fixes an infinite loop on i686 bootstrap with phi slicing turned on, so turn it back on. llvm-svn: 86483	2009-11-08 21:20:06 +00:00
Chris Lattner	a837e4db6b	reapply r8644[3-5] with only the scary part (SliceUpIllegalIntegerPHI) disabled. llvm-svn: 86480	2009-11-08 19:23:30 +00:00
Daniel Dunbar	4c41373c56	Speculatively revert r8644[3-5], they seem to be leading to infinite loops in llvm-gcc bootstrap. llvm-svn: 86478	2009-11-08 17:52:47 +00:00
Chris Lattner	99db7963b4	another more interesting test. llvm-svn: 86445	2009-11-08 08:36:40 +00:00
Chris Lattner	7c8b29ef61	feature test for the new transformation in r86443 llvm-svn: 86444	2009-11-08 08:30:58 +00:00
Chris Lattner	c7a450b5b2	teach a couple of instcombine transformations involving PHIs to not turn a PHI in a legal type into a PHI of an illegal type, and add a new optimization that breaks up insane integer PHI nodes into small pieces (PR3451). llvm-svn: 86443	2009-11-08 08:21:13 +00:00
Chris Lattner	c77d24b792	make instcombine only rewrite a chain of computation (eliminating some extends) if the new type of the computation is legal or if both the source and dest are illegal. This prevents instcombine from changing big chains of computation into i64 on 32-bit targets for example. llvm-svn: 86398	2009-11-07 19:11:46 +00:00
Chris Lattner	cb3c64ee3c	move two functions up higher in the file. Delete a useless argument to EmitGEPOffset. Implement some new transforms for optimizing subtracts of two pointer to ints into the same vector. This happens for C++ iterator idioms for example, stringmap takes a const char* that points to the start and end of a string. Once inlined, we want the pointer difference to turn back into a length. This is rdar://7362831. llvm-svn: 86021	2009-11-04 08:05:20 +00:00
Chris Lattner	e3cdf2ed3b	filecheckize this test. llvm-svn: 86020	2009-11-04 07:57:05 +00:00
Kenneth Uildriks	90fedc6ef9	Make opt default to not adding a target data string and update tests that depend on target data to supply it within the test llvm-svn: 85900	2009-11-03 15:29:06 +00:00
Chris Lattner	3cd6a61b27	fix instcombine to only do store sinking when the alignments of the two loads agree. Propagate that onto the new store. llvm-svn: 85772	2009-11-02 02:06:37 +00:00
Chris Lattner	db3311edc7	merge a test into store.ll llvm-svn: 85771	2009-11-02 02:00:18 +00:00
Chris Lattner	d263dbec7a	convert to filecheck llvm-svn: 85770	2009-11-02 01:58:03 +00:00
Chris Lattner	3e6398baa5	merge phi-merge.ll into phi.ll I don't know what Dan wants to do with phi-merge-gep.ll, I'll let him deal with it because instcombine may end up sinking these. llvm-svn: 85739	2009-11-01 20:10:11 +00:00
Chris Lattner	328ef89bd1	when merging two loads, make sure to take the min of their alignment, not the max. This didn't matter until the previous patch because instcombine would refuse to sink loads with differenting alignments. llvm-svn: 85738	2009-11-01 20:07:07 +00:00
Chris Lattner	0b40a8bc0e	fix a bug noticed by inspection: when instcombine sinks loads through phis, it didn't preserve the alignment of the load. This is a missed optimization of the alignment is high and a miscompilation when the alignment is low. llvm-svn: 85736	2009-11-01 19:50:13 +00:00
Chris Lattner	d162b5c955	convert to filecheck. llvm-svn: 85734	2009-11-01 19:22:20 +00:00
Edward O'Callaghan	e45ac76ee4	Convert a few tests to FileCheck for PR5307. llvm-svn: 85171	2009-10-26 22:52:03 +00:00
Dan Gohman	672927f393	Code that checks WillNotOverflowSignedAdd before creating an Add can safely use the NSW bit on the Add. llvm-svn: 85164	2009-10-26 22:14:22 +00:00
Chris Lattner	683eed3286	reapply r85085 with a bugfix to avoid infinite looping. All of the 'demorgan' related xforms need to use dyn_castNotVal, not m_Not. llvm-svn: 85119	2009-10-26 15:40:07 +00:00
Evan Cheng	8014a728b9	Revert 85085. It causes infinite looping during llvm-gcc build. llvm-svn: 85090	2009-10-26 03:51:32 +00:00

1 2 3 4 5 ...

717 Commits