Commit Graph

662 Commits

Author SHA1 Message Date
Nuno Lopes 9ac4661afa make instcombine produce calls to llvm.donothing instead of a random intrinsic
llvm-svn: 159384
2012-06-28 22:31:24 +00:00
Evan Cheng 319be53a1f Remove a instcombine transform that (no longer?) makes sense:
// C - zext(bool) -> bool ? C - 1 : C
    if (ZExtInst *ZI = dyn_cast<ZExtInst>(Op1))
      if (ZI->getSrcTy()->isIntegerTy(1))
        return SelectInst::Create(ZI->getOperand(0), SubOne(C), C);

This ends up forming sext i1 instructions that codegen to terrible code. e.g.
int blah(_Bool x, _Bool y) {
  return (x - y) + 1;
}
=>
        movzbl  %dil, %eax
        movzbl  %sil, %ecx
        shll    $31, %ecx
        sarl    $31, %ecx
        leal    1(%rax,%rcx), %eax
        ret


Without the rule, llvm now generates:
        movzbl  %sil, %ecx
        movzbl  %dil, %eax
        incl    %eax
        subl    %ecx, %eax
        ret

It also helps with ARM (and pretty much any target that doesn't have a sext i1 :-).

The transformation was done as part of Eli's r75531. He has given the ok to
remove it.

rdar://11748024

llvm-svn: 159230
2012-06-26 22:03:13 +00:00
Duncan Sands 8bc764aeca Replacing zero-sized alloca's with a null pointer is too aggressive, instead
merge all zero-sized alloca's into one, fixing c43204g from the Ada ACATS
conformance testsuite.  What happened there was that a variable sized object
was being allocated on the stack, "alloca i8, i32 %size".  It was then being
passed to another function, which tested that the address was not null (raising
an exception if it was) then manipulated %size bytes in it (load and/or store).
The optimizers cleverly managed to deduce that %size was zero (congratulations
to them, as it isn't at all obvious), which made the alloca zero size, causing
the optimizers to replace it with null, which then caused the check mentioned
above to fail, and the exception to be raised, wrongly.  Note that no loads
and stores were actually being done to the alloca (the loop that does them is
executed %size times, i.e. is not executed), only the not-null address check.

llvm-svn: 159202
2012-06-26 13:39:21 +00:00
Nuno Lopes 07594cba7c improve optimization of invoke instructions:
- simplifycfg:  invoke undef/null -> unreachable
 - instcombine:  invoke new  -> invoke expect(0, 0)  (an arbitrary NOOP intrinsic;  only done if the allocated memory is unused, of course)
 - verifier:  allow invoke of intrinsics  (to make the previous step work)

llvm-svn: 159146
2012-06-25 17:11:47 +00:00
NAKAMURA Takumi 704de074b8 llvm/lib: [CMake] Add explicit dependency to intrinsics_gen.
llvm-svn: 159112
2012-06-24 13:32:01 +00:00
Jakob Stoklund Olesen c5c4e96f3e Revert remaining part of r93200: "Disable folding sext(trunc(x)) -> x"
This fixes PR5997.

These transforms were disabled because codegen couldn't deal with other
uses of trunc(x). This is now handled by the peephole pass.

This causes no regressions on x86-64.

llvm-svn: 159003
2012-06-22 16:36:43 +00:00
Nuno Lopes 771e7bd4ba instcombine: disable optimization of 'invoke null/undef'. I'll move this functionality to SimplifyCFG (since we cannot make changes to the CFG here).
Fixes the crashes with the attached test case

llvm-svn: 158951
2012-06-21 23:52:14 +00:00
Evan Cheng 32c7cc8ec9 Look pass zext to strength reduce an udiv. Patch by David Majnemer. rdar://11721329
llvm-svn: 158946
2012-06-21 22:52:49 +00:00
Nuno Lopes dc6085e52d Add support for invoke to the MemoryBuiltin analysid.
Update comments accordingly.

Make instcombine remove useless invokes to C++'s 'new' allocation function (test attached).

llvm-svn: 158937
2012-06-21 21:25:05 +00:00
Nuno Lopes 55fff83422 refactor the MemoryBuiltin analysis:
- provide more extensive set of functions to detect library allocation functions (e.g., malloc, calloc, strdup, etc)
 - provide an API to compute the size and offset of an object pointed by

Move a few clients (GVN, AA, instcombine, ...) to the new API.
This implementation is a lot more aggressive than each of the custom implementations being replaced.

Patch reviewed by Nick Lewycky and Chandler Carruth, thanks.

llvm-svn: 158919
2012-06-21 15:45:28 +00:00
Nuno Lopes 3fa32f2452 replace usage of EmitGEPOffset() with TargetData::getIndexedOffset() when the GEP offset is known to be constant.
With this change, we avoid relying on the IR Builder to constant fold the operations.

No functionality change intended.

llvm-svn: 158829
2012-06-20 17:30:51 +00:00
Manman Ren c2bc2d106b InstCombine: fix a bug when combining (fcmp cc0 x, y) && (fcmp cc1 x, y).
uno && ueq was converted to ueq, it should be converted to uno.

llvm-svn: 158441
2012-06-14 05:57:42 +00:00
Benjamin Kramer 2150145ae4 InstCombine: factor code better.
No functionality change.

llvm-svn: 158301
2012-06-11 08:01:25 +00:00
Benjamin Kramer 8b8a76974f InstCombine: Turn (zext A) == (B & (1<<X)-1) into A == (trunc B), narrowing the compare.
This saves a cast, and zext is more expensive on platforms with subreg support
than trunc is. This occurs in the BSD implementation of memchr(3), see PR12750.
On the synthetic benchmark from that bug stupid_memchr and bsd_memchr have the
same performance now when not inlining either function.

stupid_memchr: 323.0us
bsd_memchr: 321.0us
memchr: 479.0us

where memchr is the llvm-gcc compiled bsd_memchr from osx lion's libc. When
inlining is enabled bsd_memchr still regresses down to llvm-gcc memchr time,
I haven't fully understood the issue yet, something is grossly mangling the
loop after inlining.

llvm-svn: 158297
2012-06-10 20:35:00 +00:00
Nuno Lopes 2710f1b049 canonicalize:
-%a + 42
into
42 - %a

previously we were emitting:
-(%a + 42)

This fixes the infinite loop in PR12338. The generated code is still not perfect, though.
Will work on that next

llvm-svn: 158237
2012-06-08 22:30:05 +00:00
Nadav Rotem 4e50efead6 Fix a bug in FoldSelectOpOp. Bitcast ops may change the number of vector elements, which may disagree with the select condition type.
llvm-svn: 158166
2012-06-07 20:28:57 +00:00
Chad Rosier faa3894628 Fix combine of uno && ord -> false so that the ordering of the fcmps doesn't
matter.
rdar://11579835

llvm-svn: 158084
2012-06-06 17:22:40 +00:00
Benjamin Kramer 9d5849f51d Fix suspicous hasOneUse() check, found by PVS Studio (PR12357).
llvm-svn: 157592
2012-05-28 20:52:48 +00:00
Benjamin Kramer b8743a9150 InstCombine: Fix infinite loop when encountering switch on trivial icmp.
The test case feeds the following into InstCombine's visitSelect:
%tobool8 = icmp ne i32 0, 0
%phitmp = select i1 %tobool8, i32 3, i32 0
Then instcombine replaces the right side of the switch with 0, doesn't notice
that nothing changes and tries again indefinitely.

This fixes PR12897.

llvm-svn: 157587
2012-05-28 19:18:16 +00:00
Chris Lattner 3cb6f83ebb switch AttrListPtr::get to take an ArrayRef, simplifying a lot of clients.
llvm-svn: 157556
2012-05-28 01:47:44 +00:00
Benjamin Kramer 152f106e5f PR12967: Don't crash when trying to fold a shift that's larger than the type's size.
llvm-svn: 157548
2012-05-27 22:03:32 +00:00
Nuno Lopes a2f6cecb6d add a new pass to instrument loads and stores for run-time bounds checking
move EmitGEPOffset from InstCombine to Transforms/Utils/Local.h

(a draft of this) patch reviewed by Andrew, thanks.

llvm-svn: 157261
2012-05-22 17:19:09 +00:00
Nuno Lopes ad40c0a425 revert my previous patches that introduced an additional parameter to the objectsize intrinsic.
After a lot of discussion, we realized it's not the best option for run-time bounds checking

llvm-svn: 157255
2012-05-22 15:25:31 +00:00
Nuno Lopes e2cfd3ce95 objectsize: add a few more tests and fix a bug
llvm-svn: 156625
2012-05-11 18:25:29 +00:00
Eli Friedman e0a64d83fc Fix a minor logic mistake transforming compares in instcombine. PR12514.
llvm-svn: 156600
2012-05-11 01:32:59 +00:00
Nuno Lopes f573030391 objectsize: add support for GEPs with non-constant indexes
add an additional parameter to InstCombiner::EmitGEPOffset() to force it to *not* emit operations with NUW flag

llvm-svn: 156585
2012-05-10 23:17:35 +00:00
Nuno Lopes 7100f463b0 objectsize:
refactor code a bit to enable future changes to support run-time information
add support to compute allocation sizes at run-time if penalty > 1 (e.g., malloc(x), calloc(x, y), and VLAs)

llvm-svn: 156515
2012-05-09 21:30:57 +00:00
Jakub Staszak cfc46f82ff Remove trailing spaces.
llvm-svn: 156257
2012-05-06 13:52:31 +00:00
Stepan Dyatkovskiy cb2a1a34e2 Small fix in InstCombineCasts.cpp. Restored "alloca + bitcast" reducing for case when alloca's size is calculated within the "add/sub/... nsw".
Also added fix to 2011-06-13-nsw-alloca.ll test.

llvm-svn: 156231
2012-05-05 07:09:40 +00:00
Nuno Lopes d4cf35d775 remove calls to calloc if the allocated memory is not used (it was already being done for malloc)
fix a few typos found by Chad in my previous commit

llvm-svn: 156110
2012-05-03 22:08:19 +00:00
Nuno Lopes d2b71e7fa9 add support for calloc to objectsize lowering
llvm-svn: 156102
2012-05-03 21:19:58 +00:00
Nuno Lopes 22f6f3b055 replace 'break's with 'return 0' in visitCallInst code for objectsize, since there is no need to fallback to visitCallSite.
This gives a 0.9% in a test case

llvm-svn: 156069
2012-05-03 16:06:07 +00:00
Lang Hames 3a90fabd85 Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. Fixes
<rdar://problem/11291436>.

This is a second attempt at a fix for this, the first was r155468. Thanks
to Chandler, Bob and others for the feedback that helped me improve this.

llvm-svn: 155866
2012-05-01 00:20:38 +00:00
Chad Rosier 7813dcee30 Add instcombine patterns for the following transformations:
(x & y) | (x ^ y) -> x | y 
 (x & y) + (x ^ y) -> x | y 

Patch by Manman Ren.
rdar://10770603

llvm-svn: 155674
2012-04-26 23:29:14 +00:00
Lang Hames 2fd0c69125 Reverting r155468. Chris and Chandler have convinced me that it's dangerous and
in poor taste.

Talking through some alternate solutions with Chandler.

llvm-svn: 155530
2012-04-25 02:16:54 +00:00
Lang Hames 84531c2b5f Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. This fixes
<rdar://problem/11291436>.

llvm-svn: 155468
2012-04-24 18:58:36 +00:00
Jakob Stoklund Olesen 43bcb970e5 Reapply r155136 after fixing PR12599.
Original commit message:

Defer some shl transforms to DAGCombine.

The shl instruction is used to represent multiplication by a constant
power of two as well as bitwise left shifts. Some InstCombine
transformations would turn an shl instruction into a bit mask operation,
making it difficult for later analysis passes to recognize the
constsnt multiplication.

Disable those shl transformations, deferring them to DAGCombine time.
An 'shl X, C' instruction is now treated mostly the same was as 'mul X, C'.

These transformations are deferred:

  (X >>? C) << C   --> X & (-1 << C)  (When X >> C has multiple uses)
  (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2)   (When C2 > C1)
  (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2)  (When C1 > C2)

The corresponding exact transformations are preserved, just like
div-exact + mul:

  (X >>?,exact C) << C   --> X
  (X >>?,exact C1) << C2 --> X << (C2-C1)
  (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)

The disabled transformations could also prevent the instruction selector
from recognizing rotate patterns in hash functions and cryptographic
primitives. I have a test case for that, but it is too fragile.

llvm-svn: 155362
2012-04-23 17:39:52 +00:00
Jakob Stoklund Olesen 205ee3b389 Revert r155136 "Defer some shl transforms to DAGCombine."
While the patch was perfect and defect free, it exposed a really nasty
bug in X86 SelectionDAG that caused an llc crash when compiling lencod.

I'll put the patch back in after fixing the SelectionDAG problem.

llvm-svn: 155181
2012-04-20 00:38:45 +00:00
Jakob Stoklund Olesen 6b6c81e6b2 Defer some shl transforms to DAGCombine.
The shl instruction is used to represent multiplication by a constant
power of two as well as bitwise left shifts. Some InstCombine
transformations would turn an shl instruction into a bit mask operation,
making it difficult for later analysis passes to recognize the
constsnt multiplication.

Disable those shl transformations, deferring them to DAGCombine time.
An 'shl X, C' instruction is now treated mostly the same was as 'mul X, C'.

These transformations are deferred:

  (X >>? C) << C   --> X & (-1 << C)  (When X >> C has multiple uses)
  (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2)   (When C2 > C1)
  (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2)  (When C1 > C2)

The corresponding exact transformations are preserved, just like
div-exact + mul:

  (X >>?,exact C) << C   --> X
  (X >>?,exact C1) << C2 --> X << (C2-C1)
  (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)

The disabled transformations could also prevent the instruction selector
from recognizing rotate patterns in hash functions and cryptographic
primitives. I have a test case for that, but it is too fragile.

llvm-svn: 155136
2012-04-19 16:46:26 +00:00
Chandler Carruth f82b0e2d29 Teach InstCombine to nuke a common alloca pattern -- an alloca which has
GEPs, bit casts, and stores reaching it but no other instructions. These
often show up during the iterative processing of the inliner, SROA, and
DCE. Once we hit this point, we can completely remove the alloca. These
were actually showing up in the final, fully optimized code in a bunch
of inliner tests I've been working on, and notably they show up after
LLVM finishes optimizing away all function calls involved in
hash_combine(a, b).

llvm-svn: 154285
2012-04-08 14:36:56 +00:00
Rafael Espindola ba0a6cabb8 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011
2012-04-04 12:51:34 +00:00
Nadav Rotem a8f3562e8f 153465 was incorrect. In this code we wanted to check that the pointer operand is of pointer type (and not vector type).
llvm-svn: 153468
2012-03-26 21:00:53 +00:00
Nadav Rotem e63e59cc44 PR12357: The pointer was used before it was checked.
llvm-svn: 153465
2012-03-26 20:39:18 +00:00
Chris Lattner b1e2e1e091 eliminate an unneeded branch, part of PR12357
llvm-svn: 153458
2012-03-26 19:13:57 +00:00
Bill Wendling 55b6b2b6a9 Revert r152907.
llvm-svn: 152935
2012-03-16 18:20:54 +00:00
Bill Wendling a2a26b546c The alignment of the pointer part of the store instruction may have an
alignment. If that's the case, then we want to make sure that we don't increase
the alignment of the store instruction. Because if we increase it to be "more
aligned" than the pointer, code-gen may use instructions which require a greater
alignment than the pointer guarantees.
<rdar://problem/11043589>

llvm-svn: 152907
2012-03-16 07:40:08 +00:00
Eli Friedman e06535b2f6 In InstCombiner::visitOr, make sure we reverse the operand swap used for checking for or-of-xor operations after those checks; a later check expects that any constant will be in Op1. PR12234.
llvm-svn: 152884
2012-03-16 00:52:42 +00:00
Bill Wendling 7fa1be77cc Use an iterator instead of calling .size() on the worklist every time, which is wasteful.
llvm-svn: 152794
2012-03-15 11:19:41 +00:00
Stepan Dyatkovskiy 97b02fc1b3 llvm::SwitchInst
Renamed methods caseBegin, caseEnd and caseDefault with case_begin, case_end, and case_default.
Added some notes relative to case iterators.

llvm-svn: 152532
2012-03-11 06:09:17 +00:00
Stepan Dyatkovskiy 5b648afb4d Taken into account Duncan's comments for r149481 dated by 2nd Feb 2012:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120130/136146.html

Implemented CaseIterator and it solves almost all described issues: we don't need to mix operand/case/successor indexing anymore. Base iterator class is implemented as a template since it may be initialized either from "const SwitchInst*" or from "SwitchInst*".

ConstCaseIt is just a read-only iterator.
CaseIt is read-write iterator; it allows to change case successor and case value.

Usage of iterator allows totally remove resolveXXXX methods. All indexing convertions done automatically inside the iterator's getters.

Main way of iterator usage looks like this:
SwitchInst *SI = ... // intialize it somehow

for (SwitchInst::CaseIt i = SI->caseBegin(), e = SI->caseEnd(); i != e; ++i) {
  BasicBlock *BB = i.getCaseSuccessor();
  ConstantInt *V = i.getCaseValue();
  // Do something.
}

If you want to convert case number to TerminatorInst successor index, just use getSuccessorIndex iterator's method.
If you want initialize iterator from TerminatorInst successor index, use CaseIt::fromSuccessorIndex(...) method.

There are also related changes in llvm-clients: klee and clang.

llvm-svn: 152297
2012-03-08 07:06:20 +00:00