llvm-project

Commit Graph

Author	SHA1	Message	Date
Devang Patel	d146e2e3df	If a scope has only one instruction then first instruction is also the last instruction. llvm-svn: 92736	2010-01-05 16:59:17 +00:00
Chris Lattner	9da1cb243b	optimize cttz and ctlz when we can prove something about the leading/trailing bits. Patch by Alastair Lynn! llvm-svn: 92706	2010-01-05 07:23:56 +00:00
Chris Lattner	f741d72b84	fix an infinite loop in reassociate building emacs. llvm-svn: 92679	2010-01-05 04:55:35 +00:00
Devang Patel	be94f23992	Remove dead debug info intrinsics. Intrinsic::dbg_stoppoint Intrinsic::dbg_region_start Intrinsic::dbg_region_end Intrinsic::dbg_func_start AutoUpgrade simply ignores these intrinsics now. llvm-svn: 92557	2010-01-05 01:10:40 +00:00
Devang Patel	e6433faba6	Fix debug_inlined section entries for routines whose names are changed through __asm() extension. llvm-svn: 92533	2010-01-04 23:04:36 +00:00
Dan Gohman	8c63ee7e28	Make this test more portable. llvm-svn: 92514	2010-01-04 21:23:34 +00:00
Devang Patel	63cdd6fcf3	Remove oversimplified test case. llvm-svn: 92510	2010-01-04 20:54:06 +00:00
Dan Gohman	52183c3cc9	Add some tests and update an existing test to reflect recent x86 isel peeps. llvm-svn: 92509	2010-01-04 20:53:54 +00:00
Devang Patel	a7c8e58d95	The test, derived from optimized IR, does not mention "bar" in debug info anywhere so the dwarf writer is not expected to emit any debug info for function "bar". llvm-svn: 92499	2010-01-04 19:41:13 +00:00
Chris Lattner	a751d09c08	Truncate GEP indexes larger than the pointer size down to pointer size when doing this transform if the GEP is not inbounds. No testcase because it is very difficult to trigger this: instcombine already canonicalizes GEP indices to pointer size, so it relies on specific permutations of the instcombine worklist. Thanks to Duncan for pointing this possible problem out. llvm-svn: 92495	2010-01-04 18:57:15 +00:00
Anton Korobeynikov	d91a14dba5	Fix invalid chain folding for memory variant of sdiv / udiv llvm-svn: 92472	2010-01-04 10:31:54 +00:00
Chris Lattner	2d91231d82	implement an instcombine xform needed by clang's codegen on the example in PR4216. This doesn't trigger in the testsuite, so I'd really appreciate someone scrutinizing the logic for correctness. llvm-svn: 92458	2010-01-04 06:03:59 +00:00
Chris Lattner	1dae8766b1	fix PR5930, allowing the asmprinter to emit difference between two labels as a truncate. llvm-svn: 92455	2010-01-03 18:33:18 +00:00
Chris Lattner	f6a585fc2f	add PR# llvm-svn: 92451	2010-01-03 18:10:58 +00:00
Chris Lattner	a7cfc43af8	differences between two blockaddress's don't cause a global variable initializer to require relocations. llvm-svn: 92450	2010-01-03 18:09:40 +00:00
Chris Lattner	fca0c8f93a	generalize the previous transformation to handle indexing into arrays of structs and other arrays, so long as all the subsequent indexes are constants. This triggers frequently for stuff like: @divisions = internal constant [29 x [2 x i32]] [[2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 2], [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2]], align 32 ; <[29 x [2 x i32]]> [#uses=50] %623 = getelementptr inbounds [29 x [2 x i32]] @divisions, i64 0, i64 %619, i64 0 ; <i32*> [#uses=1] %684 = icmp eq i32 %683, 999 also for the "my_defs" table in 'gs', etc. llvm-svn: 92444	2010-01-03 03:03:27 +00:00
Chris Lattner	98ad2b56cc	teach instcombine to optimize idioms like A[i]&42 == 0. This occurs in 403.gcc in mode_mask_array, in safe-ctype.c (which is copied in multiple apps) in _sch_istable, etc. llvm-svn: 92427	2010-01-02 22:08:28 +00:00
Chris Lattner	b56bef45f8	Teach the table lookup optimization to generate range compares when a consecutive sequence of elements all satisfies the predicate. Like the double compare case, this generates better code than the magic constant case and generalizes to more than 32/64 element array lookups. Here are some examples where it triggers. From 403.gcc, most accesses to the rtx_class array are handled, e.g.: @rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]> [#uses=547] %142 = icmp eq i8 %141, 105 @rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]> [#uses=543] %165 = icmp eq i8 %164, 60 Also, most of the 59-element arrays (mode_class/rid_to_yy, etc) optimized before are actually range compares. This lets 32-bit machines optimize them. 
400.perlbmk has stuff like this: 400.perlbmk: PL_regkind, even for 32-bit: @PL_regkind = constant [62 x i8] c"\00\00\02\02\02\06\06\06\06\09\09\0B\0B\0D\0E\0E\0E\11\12\12\14\14\16\16\18\18\1A\1A\1C\1C\1E\1F !!!$$&'((((,-.///88886789:;8$", align 32 ; <[62 x i8]> [#uses=4] %811 = icmp ne i8 %810, 33 @PL_utf8skip = constant [256 x i8] c"\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\04\04\04\04\04\04\04\04\05\05\05\05\06\06\07\0D", align 32 ; <[256 x i8]> [#uses=94] %12 = icmp ult i8 %10, 2 etc. llvm-svn: 92426	2010-01-02 21:50:18 +00:00
Nick Lewycky	a67519be12	Fix logic error in previous commit. The != case needs to become an or, not an and. llvm-svn: 92419	2010-01-02 16:14:56 +00:00
Nick Lewycky	357d41b3c1	Optimize pointer comparison into the typesafe form, now that the backends will handle them efficiently. This is the opposite direction of the transformation we used to have here. llvm-svn: 92418	2010-01-02 15:25:44 +00:00
Chris Lattner	cfda435c73	Generalize the previous xform to handle cases where exactly two elements match or don't match with two comparisons. For example, the testcase compiles into: define i1 @test5(i32 %X) { %1 = icmp eq i32 %X, 2 ; <i1> [#uses=1] %2 = icmp eq i32 %X, 7 ; <i1> [#uses=1] %R = or i1 %1, %2 ; <i1> [#uses=1] ret i1 %R } This generalizes the previous xforms when the array is larger than 64 elements (and this case matches) and generates better code for cases where it overlaps with the magic bitshift case. This generalizes more cases than you might expect. For example, 400.perlbmk has: @PL_utf8skip = constant [256 x i8] c"\01\01\01\... %15 = icmp ult i8 %7, 7 403.gcc has: @rid_to_yy = internal constant [114 x i16] [i16 259, i16 260, ... %18 = icmp eq i16 %16, 295 and xalancbmk has a bunch of examples, such as _ZN11xercesc_2_5L15gCombiningCharsE and _ZN11xercesc_2_5L10gBaseCharsE. llvm-svn: 92417	2010-01-02 09:35:17 +00:00
Chris Lattner	935a4a606a	enhance the compare/load/index optimization to work on any load from a global with 32/64 elements or less (depending on whether i64 is native on the target), generating a bitshift idiom to determine the result. For example, on test4 we produce: define i1 @test4(i32 %X) { %1 = lshr i32 933, %X ; <i32> [#uses=1] %2 = and i32 %1, 1 ; <i32> [#uses=1] %R = icmp ne i32 %2, 0 ; <i1> [#uses=1] ret i1 %R } This triggers in a number of interesting cases, for example, here's an fp case: @A.3255 = internal constant [4 x double] [double 4.100000e+00, double -3.900000e+00, double -1.000000e+00, double 1.000000e+00], align 32 ; <[4 x double]> [#uses=7] ... %7 = fcmp olt double %3, 0.000000e+00 In this case we make the slen2_tab global dead, which is nice: @slen2_tab = internal constant [16 x i32] [i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3, i32 1, i32 2, i32 3, i32 1, i32 2, i32 3, i32 2, i32 3], align 32 ; <[16 x i32]> [#uses=1] ... %204 = icmp eq i32 %46, 0 Perl has a bunch of these, also on the 'Perl_regkind' array: @Perl_yygindex = internal constant [51 x i16] [i16 0, i16 0, i16 0, i16 0, i16 374, i16 351, i16 0, i16 -12, i16 0, i16 946, i16 413, i16 -83, i16 0, i16 0, i16 0, i16 -311, i16 -13, i16 4007, i16 2893, i16 0, i16 0, i16 0, i16 0, i16 0, i16 372, i16 -8, i16 0, i16 0, i16 246, i16 -131, i16 43, i16 86, i16 208, i16 -45, i16 -169, i16 987, i16 0, i16 0, i16 0, i16 0, i16 308, i16 0, i16 -271, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0], align 32 ; <[51 x i16]> [#uses=1] ... 
%1364 = icmp eq i16 %1361, 0 186.crafty really likes this on 64-bit machines, because it triggers on a bunch of globals like this: @white_outpost = internal constant [64 x i8] c"\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\02\02\00\00\00\00\00\04\05\05\04\00\00\00\00\03\06\06\03\00\00\00\00\00\01\01\00\00\00\00\00\00\00\00\00\00\00", align 32 ; <[64 x i8]> [#uses=2] However the big winner is 403.gcc, which triggers hundreds of times, eliminating all the accesses to the 57-element arrays 'mode_class', mode_unit_size, mode_bitsize, regclass_map, etc. go 64-bit machines :) llvm-svn: 92415	2010-01-02 08:56:52 +00:00
Chris Lattner	b1567bd584	enhance the previous optimization to work with fcmp in addition to icmp. llvm-svn: 92412	2010-01-02 08:20:51 +00:00
Chris Lattner	a061859ccc	Teach instcombine to fold compares of loads from constant arrays with variable indices into a comparison of the index with a constant. The most common occurrence of this that I see by far is stuff like: if ("foobar"[i] == '\0') ... which we compile into: if (i == 6), saving a load and materialization of the global address. This also exposes loop trip count information to later passes in many cases. This triggers hundreds of times in xalancbmk, which is where I first noticed it, but it also triggers in many other apps. Here are a few interesting ones from various apps: @must_be_connected_without = internal constant [8 x i8] [i8 getelementptr inbounds ([3 x i8]* @.str64320, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str27283, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str71327, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str72328, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str18274, i64 0, i64 0), i8* getelementptr inbounds ([6 x i8]* @.str11267, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str32288, i64 0, i64 0), i8* null], align 32 ; <[8 x i8]> [#uses=2] %scevgep.i = getelementptr [8 x i8] @must_be_connected_without, i64 0, i64 %indvar.i ; <i8*> [#uses=1] %17 = load ... %18 = icmp eq i8 %17, null ; <i1> [#uses=1] -> icmp eq i64 %indvar.i, 7 @yytable1095 = internal constant [84 x i8] c"\12\01(\05\06\07\08\09\0A\0B\0C\0D\0E1\0F\10\11266\1D: \10\11,-,0\03'\10\11B6\04\17&\18\1945\05\06\07\08\09\0A\0B\0C\0D\0E\1E\0F\10\11\1A\1B\1C$3+>#%;<IJ=ADFEGH9KL\00\00\00C", align 32 ; <[84 x i8]> [#uses=2] %57 = getelementptr inbounds [84 x i8]* @yytable1095, i64 0, i64 %56 ; <i8> [#uses=1] %mode.0.in = getelementptr inbounds [9 x i32] @mb_mode_table, i64 0, i64 %.pn ; <i32> [#uses=1] load ... 
%64 = icmp eq i8 %58, 4 ; <i1> [#uses=1] -> icmp eq i64 %.pn, 35 ; <i1> [#uses=0] @gsm_DLB = internal constant [4 x i16] [i16 6554, i16 16384, i16 26214, i16 32767] %scevgep.i = getelementptr [4 x i16] @gsm_DLB, i64 0, i64 %indvar.i ; <i16*> [#uses=1] %425 = load %scevgep.i %426 = icmp eq i16 %425, -32768 ; <i1> [#uses=0] -> false llvm-svn: 92411	2010-01-02 08:12:04 +00:00
Chris Lattner	2e4be2c340	remove the instcombine transformations that are inserting nasty pointer to int casts that confuse later optimizations. See PR3351 for details. This improves but doesn't completely fix 483.xalancbmk because llvm-gcc does this xform in GCC's "fold" routine as well. Clang++ will do better I guess. llvm-svn: 92408	2010-01-02 00:31:05 +00:00
Chris Lattner	909c71c96a	allow this to work on linux hosts. llvm-svn: 92407	2010-01-02 00:22:15 +00:00
Chris Lattner	1eea3b0ada	Teach codegen to handle: (X != null) | (Y != null) --> (X|Y) != 0 (X == null) & (Y == null) --> (X|Y) == 0 so that instcombine can stop doing this for pointers. This is part of PR3351, which is a case where instcombine doing this for pointers (inserting ptrtoint) is pessimizing code. llvm-svn: 92406	2010-01-02 00:00:03 +00:00
Chris Lattner	6eef072eb6	rename file. llvm-svn: 92405	2010-01-01 23:55:04 +00:00
Chris Lattner	faf1337acb	add a simple instcombine xform, simplify another one to use hasAllZeroIndices() instead of hand rolling a loop. llvm-svn: 92403	2010-01-01 23:09:08 +00:00
Chris Lattner	30c0a2833d	generalize the pointer difference optimization to handle a constantexpr gep on the 'base' side of the expression. This completes comment #4 in PR3351, which comes from 483.xalancbmk. llvm-svn: 92402	2010-01-01 22:42:29 +00:00
Chris Lattner	4394f71752	teach instcombine to optimize pointer difference idioms involving constant expressions. This is a step towards comment #4 in PR3351. llvm-svn: 92401	2010-01-01 22:29:12 +00:00
Chris Lattner	25c87e9cf9	implement the transform requested in PR5284 llvm-svn: 92398	2010-01-01 18:34:40 +00:00
Chris Lattner	39f18e545e	Teach codegen to lower llvm.powi to an efficient (but not optimal) multiply sequence when the power is a constant integer. Before, our codegen for std::pow(.., int) always turned into a libcall, which was really inefficient. This should also make many gfortran programs happier I'd imagine. llvm-svn: 92388	2010-01-01 03:32:16 +00:00
Chris Lattner	5967840a5f	Make this more likely to generate a libcall. llvm-svn: 92387	2010-01-01 03:26:51 +00:00
Chris Lattner	8330daf733	add a few trivial instcombines for llvm.powi. llvm-svn: 92383	2010-01-01 01:52:15 +00:00
Chris Lattner	0c59ac3f41	When factoring multiply expressions across adds, factor both positive and negative forms of constants together. This allows us to compile: int foo(int x, int y) { return (x-y) + (x-y) + (x-y); } into: _foo: ## @foo subl %esi, %edi leal (%rdi,%rdi,2), %eax ret instead of (where the 3 and -3 were not factored): _foo: imull $-3, 8(%esp), %ecx imull $3, 4(%esp), %eax addl %ecx, %eax ret this started out as: movl 12(%ebp), %ecx imull $3, 8(%ebp), %eax subl %ecx, %eax subl %ecx, %eax subl %ecx, %eax ret This comes from PR5359. llvm-svn: 92381	2010-01-01 01:13:15 +00:00
Chris Lattner	2f03e64094	test case we already get right. llvm-svn: 92380	2010-01-01 00:50:00 +00:00
Chris Lattner	fed3397654	reuse negates where possible instead of always creating them from scratch. This allows us to optimize test12 into: define i32 @test12(i32 %X) { %factor = mul i32 %X, -3 ; <i32> [#uses=1] %Z = add i32 %factor, 6 ; <i32> [#uses=1] ret i32 %Z } instead of: define i32 @test12(i32 %X) { %Y = sub i32 6, %X ; <i32> [#uses=1] %C = sub i32 %Y, %X ; <i32> [#uses=1] %Z = sub i32 %C, %X ; <i32> [#uses=1] ret i32 %Z } llvm-svn: 92373	2009-12-31 20:34:32 +00:00
Chris Lattner	60b71b5c4d	teach reassociate to factor x+x+x -> x*3. While I'm at it, fix RemoveDeadBinaryOp to actually do something. llvm-svn: 92368	2009-12-31 19:24:52 +00:00
Chris Lattner	4e3a5678af	simple fix for an incorrect factoring which causes a miscompilation, PR5458. llvm-svn: 92354	2009-12-31 08:33:49 +00:00
Chris Lattner	2d3b53a68c	merge some more tests in. llvm-svn: 92353	2009-12-31 08:32:22 +00:00
Chris Lattner	19a4baa201	filecheckize llvm-svn: 92352	2009-12-31 08:29:56 +00:00
Chris Lattner	cac432c846	add some basic named MD tests. llvm-svn: 92336	2009-12-31 03:00:49 +00:00
Chris Lattner	c5c08899e4	fix two bogus tests that the asmparser now rejects. llvm-svn: 92303	2009-12-30 05:54:51 +00:00
Chris Lattner	28f1eebe3e	reimplement insertvalue/extractvalue metadata handling to not blindly accept invalid input. Actually add a testcase. llvm-svn: 92297	2009-12-30 05:14:00 +00:00
Chris Lattner	0f3bb7b25e	fix parsing of mdstring values. llvm-svn: 92290	2009-12-30 04:13:37 +00:00
Chris Lattner	596760d9bb	Each instruction is allowed to have multiple different metadata objects on them. Though the entire compiler supports this, the asmparser didn't. llvm-svn: 92270	2009-12-29 21:25:40 +00:00
Chris Lattner	93163c401e	Do not crash when .ll printing metadata that smells like debug info, but isn't. llvm-svn: 92268	2009-12-29 21:17:33 +00:00
Sanjiv Gupta	015215ca86	Extern declaration for unordered.f32 libcall was not being emitted. Fixed that. llvm-svn: 92242	2009-12-29 03:24:34 +00:00
Sanjiv Gupta	1ecffe13b2	Fixed llc crash for zext (i1 -> i8) loads. llvm-svn: 92201	2009-12-28 04:53:24 +00:00
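
The bitshift idiom described in commit 935a4a606a (the `lshr i32 933, %X` / `and` / `icmp ne` sequence for test4) can be sketched outside the compiler as follows. This is a rough Python illustration, not LLVM's implementation: the helper names `build_mask` and `folded_compare` and the 10-element `TABLE` are hypothetical, and the table was chosen only so that the predicate "element == 0" yields the constant 933 seen in the commit message. The actual array behind test4 is not shown in the log.

```python
def build_mask(table, predicate):
    """Pack predicate(table[i]) into bit i of one integer constant."""
    mask = 0
    for i, value in enumerate(table):
        if predicate(value):
            mask |= 1 << i
    return mask


def folded_compare(mask, index):
    """Replace 'load table[index]; compare' with shift, and, test.

    Assumes index is within the table bounds, as the in-IR version
    relies on the shift amount being smaller than the integer width.
    """
    return ((mask >> index) & 1) != 0


# Hypothetical constant table; zeros sit at indices 0, 2, 5, 7, 8, 9.
TABLE = [0, 1, 0, 1, 1, 0, 1, 0, 0, 0]

# Bits 0, 2, 5, 7, 8, 9 set: 1 + 4 + 32 + 128 + 256 + 512 = 933.
MASK = build_mask(TABLE, lambda v: v == 0)
```

Because the mask must fit in a native integer, this form only applies to tables of at most 32 or 64 elements, which is the limit the commit message states; the later range-compare and two-compare commits cover larger arrays.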


8893 Commits