291 Commits

Author SHA1 Message Date
Chandler Carruth ef28abefd0 Simplify a README.txt entry significantly to expose the core issue.
llvm-svn: 123556
2011-01-16 01:40:23 +00:00
Chris Lattner b6c3aff1cb typo
llvm-svn: 123406
2011-01-13 22:11:56 +00:00
Chris Lattner b9cdf393a4 memcpy + metadata = bliss :)
llvm-svn: 123405
2011-01-13 22:08:15 +00:00
Chandler Carruth b1e7f557b7 Teach constant folding to perform conversions from constant floating
point values to their integer representation through the SSE intrinsic
calls. This is the last part of a README.txt entry for which I have real
world examples.

llvm-svn: 123206
2011-01-11 01:07:24 +00:00
Owen Anderson d490c2d2ae Fix a random missed optimization by making InstCombine more aggressive when determining which bits are demanded by
a comparison against a constant.

llvm-svn: 123203
2011-01-11 00:36:45 +00:00
Chris Lattner 78cdd2a6c6 +0.0 vs -0.0 differences can be handled by looking at the user of the
operation in some cases.

llvm-svn: 123190
2011-01-10 21:01:17 +00:00
Chris Lattner eef1455020 expand on a note
llvm-svn: 123145
2011-01-10 00:33:01 +00:00
Chris Lattner 5b358c6825 typo
llvm-svn: 123142
2011-01-09 23:48:41 +00:00
Chris Lattner 320370e3ca xref a PR #
llvm-svn: 123141
2011-01-09 23:42:22 +00:00
Chandler Carruth d011d5317c Add a note about the inability to model FP -> int conversions which
perform rounding other than truncation in the IR. Common C code for this
really turns into an LLVM intrinsic call that blocks a lot of further
optimizations.

llvm-svn: 123135
2011-01-09 22:36:18 +00:00
Chandler Carruth 0c68a668fa Add a note about a missed FP optimization.
llvm-svn: 123126
2011-01-09 21:00:19 +00:00
Chandler Carruth 82e6f6a325 Another missed memset in std::vector initialization.
llvm-svn: 123116
2011-01-09 11:29:57 +00:00
Chandler Carruth 43f6d1b67e Fix a cut-paste-o so that the sample code is correct for my last note.
Also, switch to a more clear 'sink' function with its declaration to
avoid any confusion about 'g'. Thanks for the suggestion Frits.

llvm-svn: 123113
2011-01-09 10:10:59 +00:00
Chandler Carruth ad6e1f0501 Another missed optimization of trivial vector code.
llvm-svn: 123112
2011-01-09 09:58:36 +00:00
Chandler Carruth f32619300a Add a note about vector's size-constructor producing dead stores.
llvm-svn: 123111
2011-01-09 09:58:33 +00:00
Chandler Carruth 5d684c17a7 Add a note about a missed memset optimization from std::fill.
llvm-svn: 123103
2011-01-09 01:32:55 +00:00
Benjamin Kramer 134cde912a Revert 122959, it needs more thought. Add it back to README.txt with additional notes.
llvm-svn: 123030
2011-01-07 20:42:20 +00:00
Chris Lattner 84184b7207 With Benjamin's recent amazing patches, we should be able to do even better things :)
llvm-svn: 122978
2011-01-06 22:25:00 +00:00
Benjamin Kramer 1e01ade2e8 Add a note from llvmdev, this time with more info.
llvm-svn: 122966
2011-01-06 17:35:50 +00:00
Benjamin Kramer 605f21a6c8 EarlyCSE does this now (and GVN always did it).
llvm-svn: 122960
2011-01-06 13:19:46 +00:00
Benjamin Kramer 799b011276 InstCombine: If we call llvm.objectsize on a malloc call we can replace it with the size passed to malloc.
llvm-svn: 122959
2011-01-06 13:11:05 +00:00
Chris Lattner 245de78e06 add a note about object size from Dhrystone, add a poorly optimized loop from 179.art.
llvm-svn: 122954
2011-01-06 07:41:22 +00:00
Chris Lattner 73552c2cce add a trivial instcombine missed in Dhrystone
llvm-svn: 122953
2011-01-06 07:09:23 +00:00
Chris Lattner 51415d26f1 update a bunch of entries.
llvm-svn: 122700
2011-01-02 18:31:38 +00:00
Chris Lattner ddf58010bd Allow loop-idiom to run on multiple BB loops, but still only scan the loop
header for now for memset/memcpy opportunities.  It turns out that loop-rotate
is successfully rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for 
loops" into 2 basic block loops that loop-idiom was ignoring.

With this fix, we form many *many* more memcpy and memsets than before, including
on the "history" loops in the viterbi benchmark, which look like this:

        for (j=0; j<MAX_history; ++j) {
          history_new[i][j+1] = history[2*i][j];
        }

Transforming these loops into memcpy's speeds up the viterbi benchmark from
11.98s to 3.55s on my machine.  Woo.

llvm-svn: 122685
2011-01-02 07:58:36 +00:00
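
A minimal C sketch of the equivalence this commit relies on: a counted copy loop with non-overlapping source and destination does the same work as one memcpy call. The array size MAX_HISTORY and both function names are hypothetical, chosen only for illustration; the benchmark's real declarations are not shown in the log.

```c
#include <string.h>

enum { MAX_HISTORY = 8 }; /* hypothetical size, for illustration only */

/* Element-by-element copy, shaped like the loop quoted in the commit. */
void copy_with_loop(int *dst, const int *src) {
    for (int j = 0; j < MAX_HISTORY; ++j) {
        dst[j + 1] = src[j];
    }
}

/* The same copy as a single memcpy; valid because dst and src do not overlap. */
void copy_with_memcpy(int *dst, const int *src) {
    memcpy(dst + 1, src, MAX_HISTORY * sizeof *src);
}
```

Both functions expect dst to have room for MAX_HISTORY + 1 elements.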
Chris Lattner 6c3fc0a52d a missed __builtin_object_size case.
llvm-svn: 122676
2011-01-01 22:57:31 +00:00
Chris Lattner e5d5a41a58 various updates.
llvm-svn: 122675
2011-01-01 22:52:11 +00:00
Duncan Sands 772749aea1 Revert commit 122654 at the request of Chris, who reckons that instsimplify
is the wrong hammer for this nail, and is probably right.

llvm-svn: 122661
2011-01-01 20:08:02 +00:00
Duncan Sands e3c539581c Fix a README item by having InstructionSimplify do a mild form of value
numbering, in which it considers (for example) "%a = add i32 %x, %y" and
"%b = add i32 %x, %y" to be equal because the operands are equal and the
result of the instructions only depends on the values of the operands.
This has almost no effect (it removes 4 instructions from gcc-as-one-file),
and perhaps slows down compilation: I measured a 0.4% slowdown on the large
gcc-as-one-file testcase, but it wasn't statistically significant.

llvm-svn: 122654
2011-01-01 16:12:09 +00:00
Chris Lattner 102bc01900 add a note from llvmdev
llvm-svn: 122603
2010-12-28 18:45:02 +00:00
Benjamin Kramer dfa40f8f19 Remove/fix invalid README entries. The well thought out strcpy function doesn't return a pointer to the end of the string.
llvm-svn: 122496
2010-12-23 15:32:07 +00:00
Chris Lattner 5e0c0c72e9 recognize an unsigned add with overflow idiom into uadd.
This resolves a README entry and technically resolves PR4916,
but we still get poor code for the testcase in that PR because
GVN isn't CSE'ing uadd with add, filed as PR8817.

Previously we got:

_test7:                                 ## @test7
	addq	%rsi, %rdi
	cmpq	%rdi, %rsi
	movl	$42, %eax
	cmovaq	%rsi, %rax
	ret

Now we get:

_test7:                                 ## @test7
	addq	%rsi, %rdi
	movl	$42, %eax
	cmovbq	%rsi, %rax
	ret

llvm-svn: 122182
2010-12-19 19:37:52 +00:00
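
For readers unfamiliar with the idiom, here is one common way to write an unsigned add with an overflow check in C. It is only an illustration: the exact source pattern matched by this commit is not shown in the log, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned addition wraps modulo 2^64, so the sum is smaller than an
   operand exactly when the addition overflowed. */
bool uadd_overflows(uint64_t a, uint64_t b) {
    uint64_t sum = a + b;
    return sum < a;
}
```

The comparison against the sum is redundant with the carry flag the add already produces, which is consistent with the disappearance of the cmpq in the "Now we get" listing.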
Chris Lattner 5174921b5b add another overflow idiom
llvm-svn: 121854
2010-12-15 07:28:58 +00:00
Chris Lattner 2e33985300 add a note about overflow idiom recognition.
llvm-svn: 121853
2010-12-15 07:25:55 +00:00
Chris Lattner 27ecda1efd add a shift/imul missed optimization
llvm-svn: 121850
2010-12-15 07:10:43 +00:00
Chris Lattner aded09f27f add a note about a SPEC hack that gcc mainline does.
llvm-svn: 121849
2010-12-15 06:38:24 +00:00
Chris Lattner 14cb11ddb2 add a note
llvm-svn: 121656
2010-12-13 00:15:25 +00:00
Benjamin Kramer c4169cebe3 Generalize the and-icmp-select instcombine further by allowing selects of the form
(x & 2^n) ? 2^m+C : C

we can offset both arms by C to get the "(x & 2^n) ? 2^m : 0" form, optimize the
select to a shift and apply the offset afterwards.

llvm-svn: 121609
2010-12-11 10:49:22 +00:00
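
A small C sketch of the rewrite described in this commit, using assumed constants n = 3, m = 5 and C = 7 (none of which come from the log). The function names are hypothetical.

```c
#include <stdint.h>

/* Select form: (x & 2^3) ? 2^5 + 7 : 7 */
uint32_t select_form(uint32_t x) {
    return (x & 8u) ? 32u + 7u : 7u;
}

/* Shift form: move bit 3 up to bit 5, then apply the offset C afterwards. */
uint32_t shift_form(uint32_t x) {
    return ((x & 8u) << 2) + 7u;
}
```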
Benjamin Kramer 94a622af4c The srem -> urem transform is not safe for any divisor that's not a power of two.
E.g. -5 % 5 is 0 with srem and 1 with urem.

Also addresses Frits van Bommel's comments.

llvm-svn: 120049
2010-11-23 20:33:57 +00:00
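
The counterexample in this commit can be checked directly in C, assuming 32-bit operands. The helper names are hypothetical; they model srem and urem applied to the same bit pattern.

```c
#include <stdint.h>

/* Signed remainder: C's % truncates toward zero, like srem. */
int32_t srem32(int32_t a, int32_t b) {
    return a % b;
}

/* Unsigned remainder of the same 32-bit patterns, like urem. */
uint32_t urem32(int32_t a, int32_t b) {
    return (uint32_t)a % (uint32_t)b;
}
```

With these helpers, srem32(-5, 5) is 0 while urem32(-5, 5) is 1, because -5 reinterpreted as unsigned is 2^32 - 5 and 2^32 leaves remainder 1 when divided by 5.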
Benjamin Kramer b5afa65b0a InstCombine: Reduce "X shift (A srem B)" to "X shift (A urem B)" iff B is positive.
This allows transforming the rem in "1 << ((int)x % 8);" into an and.

llvm-svn: 120028
2010-11-23 18:52:42 +00:00
Benjamin Kramer f1ebb63161 InstCombine: Implement X - A*-B -> X + A*B.
llvm-svn: 119984
2010-11-22 20:31:27 +00:00
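
A quick C check of this identity under wrapping arithmetic, assuming 32-bit unsigned values so that negation and multiplication are well defined modulo 2^32. The function names are hypothetical.

```c
#include <stdint.h>

/* X - A * (-B), with negation written as 0 - b so it wraps modulo 2^32. */
uint32_t sub_form(uint32_t x, uint32_t a, uint32_t b) {
    return x - a * (0u - b);
}

/* X + A * B: one negation fewer, same result modulo 2^32. */
uint32_t add_form(uint32_t x, uint32_t a, uint32_t b) {
    return x + a * b;
}
```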
Benjamin Kramer 24656c9583 Implement the "if (X == 6 || X == 4)" -> "if ((X|2) == 6)" optimization.
This currently only catches the most basic case, a two-case switch, but can be
extended later.

llvm-svn: 119964
2010-11-22 09:45:38 +00:00
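
The transformation works because 4 (binary 100) and 6 (binary 110) differ only in bit 1, so forcing that bit on maps both values to 6 and no other value. A minimal C sketch, with hypothetical function names:

```c
#include <stdbool.h>

/* Two comparisons joined by ||. */
bool is_4_or_6_compare(unsigned x) {
    return x == 6 || x == 4;
}

/* One OR and one comparison: set bit 1, then compare against 6. */
bool is_4_or_6_masked(unsigned x) {
    return (x | 2u) == 6;
}
```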
Chris Lattner 9165d9d2ac add some random notes.
llvm-svn: 119925
2010-11-21 07:05:31 +00:00
Chris Lattner f7e896138e optimize:
void a(int x) { if (((1<<x)&8)==0) b(); }

into "x != 3", which occurs over 100 times in 403.gcc but in no
other program in llvm-test.

llvm-svn: 119922
2010-11-21 06:44:42 +00:00
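
The rewrite holds because (1 << x) has only bit x set, so masking with 8 (bit 3) is nonzero exactly when x is 3. A C sketch with hypothetical function names, using an unsigned shift and assuming x is below 32 so the shift is defined:

```c
#include <stdbool.h>

/* Original condition: bit 3 of (1 << x) is clear. Requires x < 32. */
bool cond_shift(unsigned x) {
    return ((1u << x) & 8u) == 0;
}

/* Simplified condition. */
bool cond_compare(unsigned x) {
    return x != 3;
}
```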
Chris Lattner 9de0176ef8 tail calls on x86 are implemented.
llvm-svn: 119920
2010-11-21 06:10:27 +00:00
Chris Lattner 932aab3cbf add a note
llvm-svn: 118806
2010-11-11 18:23:57 +00:00
Chris Lattner 1d6aa32b87 add pr#
llvm-svn: 118797
2010-11-11 17:17:56 +00:00
Chris Lattner 4d94e47368 add a case we fail to devirt.
llvm-svn: 118608
2010-11-09 19:37:28 +00:00
Duncan Sands f532d31198 Fix a README item: when doing a comparison with the result
of a select instruction, see if doing the compare with the
true and false values of the select gives the same result.
If so, that can be used as the value of the comparison.

llvm-svn: 118378
2010-11-07 16:12:23 +00:00
Benjamin Kramer 8628e2a19c Add a note.
llvm-svn: 118337
2010-11-06 10:37:16 +00:00
Benjamin Kramer 2b76c66fd6 Add constant folding for strspn and strcspn to SimplifyLibCalls.
llvm-svn: 115116
2010-09-30 00:58:35 +00:00
Chris Lattner 011f146419 idiom recognition should catch this.
llvm-svn: 114304
2010-09-19 00:37:34 +00:00
Nick Lewycky bb10e90487 Add optimization to Target/README.txt.
llvm-svn: 110543
2010-08-08 07:04:25 +00:00
Benjamin Kramer 2321e6a4d4 Teach instcombine to transform
(X >s -1) ? C1 : C2 and (X <s  0) ? C2 : C1
into ((X >>s 31) & (C2 - C1)) + C1, avoiding the conditional.

This optimization could be extended to take non-const C1 and C2 but we better
stay conservative to avoid code size bloat for now.

for
int sel(int n) {
     return n >= 0 ? 60 : 100;
}

we now generate
  sarl  $31, %edi
  andl  $40, %edi
  leal  60(%rdi), %eax

instead of
  testl %edi, %edi
  movl  $60, %ecx
  movl  $100, %eax
  cmovnsl %ecx, %eax

llvm-svn: 107866
2010-07-08 11:39:10 +00:00
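
The same rewrite expressed in C for the sel example, as a sketch. Right-shifting a negative signed value is implementation-defined in C, so the mask is built from an unsigned shift instead; it has the value that an arithmetic shift by 31 would produce. The function names are hypothetical.

```c
#include <stdint.h>

/* Conditional form from the commit message. */
int32_t sel_branchy(int32_t n) {
    return n >= 0 ? 60 : 100;
}

/* Branchless form: ((n >>s 31) & (C2 - C1)) + C1 with C1 = 60, C2 = 100. */
int32_t sel_branchless(int32_t n) {
    /* 0 when n is non-negative, -1 (all ones) when n is negative. */
    int32_t mask = -(int32_t)((uint32_t)n >> 31);
    return (mask & (100 - 60)) + 60;
}
```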
Eli Friedman c8f595212f Minor amendment to switch-lowering improvement.
llvm-svn: 107569
2010-07-03 08:43:32 +00:00
Eli Friedman 836fdbc85b Note switch-lowering inefficiency.
llvm-svn: 107565
2010-07-03 07:38:12 +00:00
Eric Christopher e34471bb31 Add another bswap idiom that isn't matched.
llvm-svn: 107213
2010-06-29 22:22:22 +00:00
Benjamin Kramer 41476410c9 TODO--
llvm-svn: 106102
2010-06-16 15:47:00 +00:00
Eli Friedman e17e4aea2a Add README entry; based on testcase from Bill Hart.
llvm-svn: 105878
2010-06-12 05:54:27 +00:00
Chris Lattner 4dc833c607 add a note
llvm-svn: 104404
2010-05-21 23:16:21 +00:00
Dan Gohman 73c8145505 Add a README entry.
llvm-svn: 102906
2010-05-03 14:31:00 +00:00
Chris Lattner cfc921cd2a add a note
llvm-svn: 101581
2010-04-16 23:52:30 +00:00
Chris Lattner 4041ab6e00 Implement rdar://7860110 (also in target/readme.txt) narrowing
a load/or/and/store sequence into a narrower store when it is
safe.  Daniel tells me that clang will start producing this sort
of thing with bitfields, and this does trigger a few dozen times
on 176.gcc produced by llvm-gcc even now.

This compiles code like CodeGen/X86/2009-05-28-DAGCombineCrash.ll 
into:

        movl    %eax, 36(%rdi)

instead of:

        movl    $4294967295, %eax       ## imm = 0xFFFFFFFF
        andq    32(%rdi), %rax
        shlq    $32, %rcx
        addq    %rax, %rcx
        movq    %rcx, 32(%rdi)

and each of the testcases into a single store.  Each of them used
to compile into craziness like this:

_test4:
	movl	$65535, %eax            ## imm = 0xFFFF
	andl	(%rdi), %eax
	shll	$16, %esi
	addl	%eax, %esi
	movl	%esi, (%rdi)
	ret

llvm-svn: 101343
2010-04-15 04:48:01 +00:00
Chris Lattner 1f6689a8ba move PR6576 here.
llvm-svn: 98194
2010-03-10 21:42:42 +00:00
Chris Lattner 187242b3ab move PR6212 to this file.
llvm-svn: 95624
2010-02-09 00:11:10 +00:00
Eli Friedman 0de0b3677a Remove a completed item, add a couple new ones.
llvm-svn: 94945
2010-01-31 04:55:32 +00:00
Bob Wilson 7c42b9d51e Improve isSafeToLoadUnconditionally to recognize that GEPs with constant
indices are safe if the result is known to be within the bounds of the
underlying object.

llvm-svn: 94829
2010-01-29 19:19:08 +00:00
Chris Lattner e3a68d1063 reassociate should do this.
llvm-svn: 94374
2010-01-24 20:17:09 +00:00
Chris Lattner 7e3f8b60d6 add a note.
llvm-svn: 94373
2010-01-24 20:01:41 +00:00
Chris Lattner 249da5cb73 implement a simple instcombine xform that has been in the
readme forever.

llvm-svn: 94318
2010-01-23 18:49:30 +00:00
Chris Lattner 082da53f9a add some notes, making posix-memalign be nocapture would be an easy improvement.
llvm-svn: 94312
2010-01-23 17:59:23 +00:00
Eli Friedman 9ed49c5c8f Add some potentially interesting transformations to README.
llvm-svn: 93797
2010-01-18 22:36:59 +00:00
Duncan Sands c8493da5b1 Fix a README item: have functionattrs look through selects and
phi nodes when deciding which pointers point to local memory.
I actually checked long ago how useful this is, and it isn't
very: it hardly ever fires in the testsuite, but since Chris
wants it here it is!

llvm-svn: 92836
2010-01-06 15:37:47 +00:00
Duncan Sands 78376ad7e1 Partially address a README by having functionattrs consider calls to
memcpy, memset and other intrinsics that only access their arguments
to be readnone if the intrinsic's arguments all point to local memory.
This improves the testcase in the README to readonly, but it could in
theory be made readnone, however this would involve more sophisticated
analysis that looks through the memcpy.

llvm-svn: 92829
2010-01-06 08:45:52 +00:00
Chris Lattner 2d91231d82 implement an instcombine xform needed by clang's codegen
on the example in PR4216.  This doesn't trigger in the testsuite,
so I'd really appreciate someone scrutinizing the logic for
correctness.

llvm-svn: 92458
2010-01-04 06:03:59 +00:00
Chris Lattner 39f18e545e Teach codegen to lower llvm.powi to an efficient (but not optimal)
multiply sequence when the power is a constant integer.  Before, our
codegen for std::pow(.., int) always turned into a libcall, which was
really inefficient.

This should also make many gfortran programs happier I'd imagine.

llvm-svn: 92388
2010-01-01 03:32:16 +00:00
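
As an illustration of what a multiply sequence for a constant integer power can look like, here is exponentiation by squaring in C. This is a generic sketch, not the sequence LLVM's codegen emits, and it handles only non-negative exponents; the function name is hypothetical.

```c
/* Computes base^exp with O(log exp) multiplications instead of a libcall. */
double powi_by_squaring(double base, unsigned exp) {
    double result = 1.0;
    while (exp != 0) {
        if (exp & 1u) {
            result *= base; /* this bit of the exponent contributes */
        }
        exp >>= 1;
        if (exp != 0) {
            base *= base;   /* square for the next bit */
        }
    }
    return result;
}
```

Like the lowering described in the commit, this is efficient but not optimal: for some exponents a shorter multiplication chain exists.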
Chris Lattner 71cf7c256f update this. To take the next step, llvm.powi should be generalized to work
on integers as well and codegen should lower them to branch trees.

llvm-svn: 92382
2010-01-01 01:29:26 +00:00
Eli Friedman 96cf7f42b0 More info on this transformation.
llvm-svn: 91230
2009-12-12 23:23:43 +00:00
Eli Friedman 8eada9f580 Remove some stuff that's already implemented. Also, remove the note about
merging x >u 5 and x <s 20 because it's impossible to implement.

llvm-svn: 91228
2009-12-12 21:41:48 +00:00
Chris Lattner f05330a5c8 expand note.
llvm-svn: 90429
2009-12-03 07:43:46 +00:00
Chris Lattner d1e4ee3c2b add a note
llvm-svn: 90428
2009-12-03 07:41:54 +00:00
Chris Lattner 58ccf88c36 update and consolidate the load pre notes.
llvm-svn: 90050
2009-11-29 02:19:52 +00:00
Chris Lattner 83a4a9868f add a deadargelim note.
llvm-svn: 90009
2009-11-27 17:12:30 +00:00
Chris Lattner ca9e0e83b3 This testcase is actually only partially redundant, and requires
the FIXME I added yesterday to be implemented.

llvm-svn: 90008
2009-11-27 16:53:57 +00:00
Chris Lattner cc6d29286c this (and probably several others) are now done.
llvm-svn: 89982
2009-11-27 00:35:04 +00:00
Chris Lattner 9bd2136ca3 Teach memdep to phi translate bitcasts. This allows us to compile
the example in GCC PR16799 to:

LBB1_2:                                                     ## %bb1
	movl	%eax, %eax
	subq	%rax, %rdi
	movq	%rdi, (%rcx)
	movl	(%rdi), %eax
	testl	%eax, %eax
	je	LBB1_2

instead of:

LBB1_2:                                                     ## %bb1
	movl	(%rdi), %ecx
	subq	%rcx, %rdi
	movq	%rdi, (%rax)
	cmpl	$0, (%rdi)
	je	LBB1_2

llvm-svn: 89978
2009-11-26 23:41:07 +00:00
Chris Lattner 29bc8a91d3 Teach basicaa that x|c == x+c when the c bits of x are clear. This
allows us to compile the example in readme.txt into:

LBB1_1:                                                     ## %bb
	movl	4(%rdx,%rax), %ecx
	movl	%ecx, %esi
	imull	(%rdx,%rax), %esi
	imull	%esi, %ecx
	movl	%esi, 8(%rdx,%rax)
	imull	%ecx, %esi
	movl	%ecx, 12(%rdx,%rax)
	movl	%esi, 16(%rdx,%rax)
	imull	%ecx, %esi
	movl	%esi, 20(%rdx,%rax)
	addq	$16, %rax
	cmpq	$4000, %rax
	jne	LBB1_1

instead of:

LBB1_1: 
	movl	(%rdx,%rax), %ecx
	imull	4(%rdx,%rax), %ecx
	movl	%ecx, 8(%rdx,%rax)
	imull	4(%rdx,%rax), %ecx
	movl	%ecx, 12(%rdx,%rax)
	imull	8(%rdx,%rax), %ecx
	movl	%ecx, 16(%rdx,%rax)
	imull	12(%rdx,%rax), %ecx
	movl	%ecx, 20(%rdx,%rax)
	addq	$16, %rax
	cmpq	$4000, %rax
	jne	LBB1_1

GCC (4.2) doesn't seem to be able to eliminate the loads in this
testcase either; it generates:

L2:
	movl	(%rdx), %eax
	imull	4(%rdx), %eax
	movl	%eax, 8(%rdx)
	imull	4(%rdx), %eax
	movl	%eax, 12(%rdx)
	imull	8(%rdx), %eax
	movl	%eax, 16(%rdx)
	imull	12(%rdx), %eax
	movl	%eax, 20(%rdx)
	addl	$4, %ecx
	addq	$16, %rdx
	cmpl	$1002, %ecx
	jne	L2

llvm-svn: 89952
2009-11-26 16:26:43 +00:00
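
The identity behind this commit is that OR and ADD agree when no bit position is set in both operands, since no carries can occur. A small C sketch, with hypothetical function names:

```c
#include <stdint.h>

/* Offset computed with OR. */
uint32_t offset_with_or(uint32_t x, uint32_t c) {
    return x | c;
}

/* Offset computed with ADD; equal to the OR form whenever (x & c) == 0. */
uint32_t offset_with_add(uint32_t x, uint32_t c) {
    return x + c;
}
```

For example, when x is a multiple of 16 its low four bits are clear, so any c from 0 to 15 gives the same result either way. When the bits overlap, as with x = 1 and c = 1, the two forms differ.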
Chris Lattner 12dacdd359 teach basicaa that A[i] != A[i+1].
llvm-svn: 89951
2009-11-26 16:18:10 +00:00
Chris Lattner 8e09ad6f3c update some notes slightly
llvm-svn: 89913
2009-11-26 01:51:18 +00:00
Nick Lewycky ef4ea9a2a9 Add a complex missed optimization opportunity I came across while investigating
bug 5438.

llvm-svn: 88855
2009-11-15 17:51:23 +00:00
Chris Lattner 7a09964e81 another const prop failure.
llvm-svn: 86848
2009-11-11 17:54:02 +00:00
Chris Lattner 539bdf0487 add a note
llvm-svn: 86847
2009-11-11 17:51:27 +00:00
Chris Lattner 0169fd7c62 add a note
llvm-svn: 86756
2009-11-10 23:47:45 +00:00
Chris Lattner 8ff26038ef I did this a week or two ago
llvm-svn: 86754
2009-11-10 23:40:49 +00:00
Nick Lewycky b9397262b7 Improve tail call elimination to handle the switch statement.
llvm-svn: 86403
2009-11-07 21:10:15 +00:00
Chris Lattner 06c26d982e add a note from PR5313
llvm-svn: 86146
2009-11-05 18:19:19 +00:00
Bill Wendling 2e5198ff09 Add new note.
llvm-svn: 85341
2009-10-27 23:30:07 +00:00
Bill Wendling fd2730ee8c Move and clarify note.
llvm-svn: 85334
2009-10-27 22:48:31 +00:00
Chris Lattner 13b8b56dd4 this is done.
llvm-svn: 85041
2009-10-25 06:17:51 +00:00
Chris Lattner 851193b873 some stuff is done, we still have constantexpr simplification to do.
llvm-svn: 84943
2009-10-23 07:00:55 +00:00