Commit Graph

1328 Commits

Author SHA1 Message Date
Nick Lewycky 802df52424 Remove redundant code in InstCombineShift, no functionality change because instsimplify already does this and instcombine calls instsimplify a few lines above. Patch by Suyog Sarda!
llvm-svn: 211250
2014-06-19 03:28:28 +00:00
Matt Arsenault a0050b0961 R600/SI: Add intrinsics for various math instructions.
These will be used for custom lowering and for library
implementations of various math functions, so it's useful
to expose these as builtins.

llvm-svn: 211247
2014-06-19 01:19:19 +00:00
Jingyue Wu 33bd53df7f [InstCombine] mark ADD with nuw if no unsigned overflow
Summary:
As a starting step, we only use one simple heuristic: if the sign bits
of both a and b are zero, we can prove "add a, b" do not unsigned
overflow, and thus convert it to "add nuw a, b".

Updated all affected tests and added two new tests (@zero_sign_bit and
@zero_sign_bit2) in AddOverflow.ll

Test Plan: make check-all

Reviewers: eliben, rafael, meheff, chandlerc

Reviewed By: chandlerc

Subscribers: chandlerc, llvm-commits

Differential Revision: http://reviews.llvm.org/D4144

llvm-svn: 211084
2014-06-17 00:42:07 +00:00
Jingyue Wu baabe5091c Canonicalize addrspacecast ConstExpr between different pointer types
As a follow-up to r210375 which canonicalizes addrspacecast
instructions, this patch canonicalizes addrspacecast constant
expressions.

Given clang uses ConstantExpr::getAddrSpaceCast to emit addrspacecast
cosntant expressions, this patch is also a step towards having the
frontend emit canonicalized addrspacecasts.

Piggyback a minor refactor in InstCombineCasts.cpp

Update three affected tests in addrspacecast-alias.ll,
access-non-generic.ll and constant-fold-gep.ll and added one new test in
constant-fold-address-space-pointer.ll

llvm-svn: 211004
2014-06-15 21:40:57 +00:00
Dinesh Dwivedi 95f0d51bd3 This removes TODO added in http://reviews.llvm.org/D3658
The patch transforms

ABS(NABS(X)) -> ABS(X)
NABS(ABS(X)) -> NABS(X)

Differential Revision: http://reviews.llvm.org/D4040

llvm-svn: 210782
2014-06-12 14:06:00 +00:00
Matt Arsenault 44f60d0a60 Look through addrspacecasts when turning ptr comparisons into
index comparisons.

llvm-svn: 210488
2014-06-09 19:20:29 +00:00
Rafael Espindola 4ba22f0813 Revert 209903 and 210040.
The messages were

 "PR19753: Optimize comparisons with "ashr exact" of a constanst."
 "Added support to optimize comparisons with "lshr exact" of a constant."

They were not correctly handling signed/unsigned operation differences,
causing pr19958.

llvm-svn: 210393
2014-06-07 04:12:35 +00:00
Jingyue Wu 77145d9410 InstCombine: Canonicalize addrspacecast between different element types
addrspacecast X addrspace(M)* to Y addrspace(N)*

-->

bitcast X addrspace(M)* to Y addrspace(M)*
addrspacecast Y addrspace(M)* to Y addrspace(N)*

Updat all affected tests and add several new tests in addrspacecast.ll.

This patch is based on http://reviews.llvm.org/D2186 (authored by Matt
Arsenault) with fixes and more tests.

llvm-svn: 210375
2014-06-06 21:52:55 +00:00
Dinesh Dwivedi 3217b6c661 Added select flavour for ABS and NEG(ABS)
This patch can identify 
  ABS(X) ==> (X >s 0) ? X : -X and (X >s -1) ? X : -X
  ABS(X) ==> (X <s 0) ? -X : X and (X <s 1) ? -X : X
  NABS(X) ==> (X >s 0) ? -X : X and (X >s -1) ? -X : X
  NABS(X) ==> (X <s 0) ? X : -X and (X <s 1) ? X : -X
  
and can transform
  ABS(ABS(X)) -> ABS(X)
  NABS(NABS(X)) -> NABS(X)
  
Differential Revision: http://reviews.llvm.org/D3658

llvm-svn: 210312
2014-06-06 06:54:45 +00:00
Bill Schmidt a1184635ce [PPC64LE] Correct vperm -> shuffle transform for little endian
As discussed in cfe commit r210279, the correct little-endian
semantics for the vec_perm Altivec interfaces are implemented by
reversing the order of the input vectors and complementing the permute
control vector.  This converts the desired permute from little endian
element order into the big endian element order that the underlying
PowerPC vperm instruction uses.  This is represented with a
ppc_altivec_vperm intrinsic function.

The instruction combining pass contains code to convert a
ppc_altivec_vperm intrinsic into a vector shuffle operation when the
intrinsic has a permute control vector (mask) that is a constant.
However, the vector shuffle operation assumes that vector elements are
in natural order for their endianness, so for little endian code we
will get the wrong result with the existing transformation.

This patch reverses the semantic change to vec_perm that was performed
in altivec.h by once again swapping the input operands and
complementing the permute control vector, returning the element
ordering to little endian.

The correctness of this code is tested by the new perm.c test added in
a previous patch, and by other tests in the test suite that fail
without this patch.

llvm-svn: 210282
2014-06-05 19:46:04 +00:00
Rafael Espindola 78598d9ab5 Add a Constant version of stripPointerCasts.
Thanks to rnk for the suggestion.

llvm-svn: 210205
2014-06-04 19:01:48 +00:00
Rafael Espindola 4dc5dfc56b Clauses in a landingpad are always Constant. Use a stricter type.
llvm-svn: 210203
2014-06-04 18:51:31 +00:00
Rafael Espindola 04c2258624 InstCombine: Improvement to check if signed addition overflows.
This patch implements two things:

1. If we know one number is positive and another is negative, we return true as
    signed addition of two opposite signed numbers will never overflow.

2. Implemented TODO : If one of the operands only has one non-zero bit, and if
    the other operand has a known-zero bit in a more significant place than it
    (not including the sign bit) the ripple may go up to and fill the zero, but
    won't change the sign. e.x -  (x & ~4) + 1

We make sure that we are ignoring 0 at MSB.

Patch by Suyog Sarda.

llvm-svn: 210186
2014-06-04 15:39:14 +00:00
Rafael Espindola d1a2c2d905 Add back commit r210029.
The code was actually correct. Sorry for the confusion. I have expanded the
comment saying why the analysis is valid to avoid me misunderstaning it
again in the future.

llvm-svn: 210052
2014-06-02 22:01:04 +00:00
Rafael Espindola 582c890fbe Revert "Add the nsw flag when we detect that an add will not signed overflow."
This reverts commit r210029.

It was not correctly handling cases where LHS and RHS had multiple but different
sign bits.

llvm-svn: 210048
2014-06-02 21:12:19 +00:00
Rafael Espindola 6b04ef785e Added support to optimize comparisons with "lshr exact" of a constant.
Patch by Rahul Jain.

llvm-svn: 210040
2014-06-02 19:19:04 +00:00
Rafael Espindola 82899febf0 Add the nsw flag when we detect that an add will not signed overflow.
We already had a function for checking this, we were just using it only in
specialized cases.

llvm-svn: 210029
2014-06-02 14:32:58 +00:00
Dinesh Dwivedi ce5d35a9d0 Added inst combine tarnsform for (1 << X) & C pattrens where C is (some PowerOf2 - 1)
This patch can handles following cases from http://nondot.org/sabre/LLVMNotes/InstCombine.txt
  "((1 << X) & 7) == 0" ==> "X > 2"
  "((1 << X) & 7) != 0" ==> "X < 3".

Differential Revision: http://reviews.llvm.org/D3678

llvm-svn: 210007
2014-06-02 07:57:24 +00:00
Dinesh Dwivedi 43e127bded Added inst combine transforms for single bit tests from Chris's note
if ((x & C) == 0) x |= C becomes x |= C
if ((x & C) != 0) x ^= C becomes x &= ~C
if ((x & C) == 0) x ^= C becomes x |= C
if ((x & C) != 0) x &= ~C becomes x &= ~C
if ((x & C) == 0) x &= ~C becomes nothing

Differential Revision: http://reviews.llvm.org/D3777

llvm-svn: 210006
2014-06-02 07:24:36 +00:00
Rafael Espindola c323952cb4 PR19753: Optimize comparisons with "ashr exact" of a constanst.
Patch by suyog sarda.

llvm-svn: 209903
2014-05-30 15:54:32 +00:00
Chandler Carruth fdc0e0b478 And fix my fix to sink down through the type at the right time. My
original fix would actually trigger the *exact* same crasher as the
original bug for a different reason. Awesomesauce.

Working on test cases now, but wanted to get bots healthier.

llvm-svn: 209860
2014-05-29 23:21:12 +00:00
Chandler Carruth 3012a1b4cd Fix one bug in the latest incarnation of r209843 -- combining GEPs
across PHI nodes. The code was computing the Idxs from the 'GEP'
variable's indices when what it wanted was Op1's indices. This caused an
ASan heap-overflow for me that pin pointed the issue when Op1 had more
indices than GEP did. =] I'll let Louis add a specific test case for
this if he wants.

llvm-svn: 209857
2014-05-29 23:05:52 +00:00
Louis Gerbarg c6b506a0ae Add support for combining GEPs across PHI nodes
Currently LLVM will generally merge GEPs. This allows backends to use more
complex addressing modes. In some cases this is not happening because there
is PHI inbetween the two GEPs:

  GEP1--\
        |-->PHI1-->GEP3
  GEP2--/

This patch checks to see if GEP1 and GEP2 are similiar enough that they can be
cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):

  GEP1--\                     --\                           --\
        |-->PHI1-->GEP3  ==>    |-->PHI2->GEP12->GEP3 == >    |-->PHI2->GEP123
  GEP2--/                     --/                           --/

This also breaks certain use chains that are preventing GEP->GEP merges that the
the existing instcombine would merge otherwise.

Tests included.

llvm-svn: 209843
2014-05-29 20:29:47 +00:00
Rafael Espindola a248f536b3 Revert "Revert "Revert "InstCombine: Improvement to check if signed addition overflows."""
This reverts commit r209776.

It was miscompiling llvm::SelectionDAGISel::MorphNode.

llvm-svn: 209817
2014-05-29 14:39:16 +00:00
Rafael Espindola 6196b7430e Revert "Revert "InstCombine: Improvement to check if signed addition overflows.""
This reverts commit r209762, bringing back r209746. It was not responsible for the libc++ build failure

llvm-svn: 209776
2014-05-28 21:43:52 +00:00
Rafael Espindola 910528a3eb Revert "Add support for combining GEPs across PHI nodes"
This reverts commit r209755.

it was the real cause of the libc++ build failure.

llvm-svn: 209775
2014-05-28 21:41:21 +00:00
Rafael Espindola fb59b05ca4 Revert "InstCombine: Improvement to check if signed addition overflows."
This reverts commit r209746.

It looks it is causing a crash while building libcxx. I am trying to get a
reduced testcase.

llvm-svn: 209762
2014-05-28 18:48:10 +00:00
Louis Gerbarg 727f1cbb17 Add support for combining GEPs across PHI nodes
Currently LLVM will generally merge GEPs. This allows backends to use more
complex addressing modes. In some cases this is not happening because there
is PHI inbetween the two GEPs:

  GEP1--\
        |-->PHI1-->GEP3
  GEP2--/

This patch checks to see if GEP1 and GEP2 are similiar enough that they can be
cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):

  GEP1--\                     --\                           --\
        |-->PHI1-->GEP3  ==>    |-->PHI2->GEP12->GEP3 == >    |-->PHI2->GEP123
  GEP2--/                     --/                           --/

This also breaks certain use chains that are preventing GEP->GEP merges that the
the existing instcombine would merge otherwise.

Tests included.

llvm-svn: 209755
2014-05-28 17:38:31 +00:00
Rafael Espindola 085b57941f InstCombine: Improvement to check if signed addition overflows.
This patch implements two things:

1. If we know one number is positive and another is negative, we return true as
   signed addition of two opposite signed numbers will never overflow.

2. Implemented TODO : If one of the operands only has one non-zero bit, and if
   the other operand has a known-zero bit in a more significant place than it
   (not including the sign bit) the ripple may go up to and fill the zero, but
   won't change the sign. e.x -  (x & ~4) + 1

We make sure that we are ignoring 0 at MSB.

Patch by Suyog Sarda.

llvm-svn: 209746
2014-05-28 15:30:40 +00:00
Filipe Cabecinhas e8d6a1e82f Post-commit fixes for r209643
Detected by Daniel Jasper, Ilia Filippov, and Andrea Di Biagio
Fixed the argument order to select (the mask semantics to blendv* are the
inverse of select) and fixed the tests
Added parenthesis to the assert condition
Ran clang-format

llvm-svn: 209667
2014-05-27 16:54:33 +00:00
Daniel Jasper 73458c95ac Fix bad assert.
llvm-svn: 209648
2014-05-27 09:55:37 +00:00
Filipe Cabecinhas 82ac07c283 Convert some X86 blendv* intrinsics into IR.
Summary:
Implemented an InstCombine transformation that takes a blendv* intrinsic
call and translates it into an IR select, if the mask is constant.

This will eventually get lowered into blends with immediates if possible,
or pblendvb (with an option to further optimize if we can transform the
pblendvb into a blend+immediate instruction, depending on the selector).
It will also enable optimizations by the IR passes, which give up on
sight of the intrinsic.

Both the transformation and the lowering of its result to asm got shiny
new tests.

The transformation is a bit convoluted because of blendvp[sd]'s
definition:

Its mask is a floating point value! This forces us to convert it and get
the highest bit. I suppose this happened because the mask has type
__m128 in Intel's intrinsic and v4sf (for blendps) in gcc's builtin.

I will send an email to llvm-dev to discuss if we want to change this or
not.

Reviewers: grosbach, delena, nadav

Differential Revision: http://reviews.llvm.org/D3859

llvm-svn: 209643
2014-05-27 03:42:20 +00:00
Tim Northover 3b0846e8f7 AArch64/ARM64: move ARM64 into AArch64's place
This commit starts with a "git mv ARM64 AArch64" and continues out
from there, renaming the C++ classes, intrinsics, and other
target-local objects for consistency.

"ARM64" test directories are also moved, and tests that began their
life in ARM64 use an arm64 triple, those from AArch64 use an aarch64
triple. Both should be equivalent though.

This finishes the AArch64 merge, and everyone should feel free to
continue committing as normal now.

llvm-svn: 209577
2014-05-24 12:50:23 +00:00
Dinesh Dwivedi f82f16e3e6 Added inst-combine for 'MIN(MIN(A, 97), 23)' and 'MAX(MAX(A, 23), 97)'
This removes TODO added in r208849 [http://reviews.llvm.org/D3629]

MIN(MIN(A, 97), 23) -> MIN(A, 23)
MAX(MAX(A, 23), 97) -> MAX(A, 97)

Differential Revision: http://reviews.llvm.org/D3785

llvm-svn: 209110
2014-05-19 07:08:32 +00:00
NAKAMURA Takumi 7ef81a4f98 Revert r209049 and r209065, "Add support for combining GEPs across PHI nodes"
It broke clang selfhosting even after r209065.

llvm-svn: 209067
2014-05-17 14:39:21 +00:00
Louis Gerbarg 455805694e Fix for sanitizer crash introduced in r209049
This patch fixes 3 issues introduced by r209049 that only showed up in on
the sanitizer buildbots. One was a typo in a compare. The other is a check to
confirm that the single differing value in the two incoming GEPs is the same
type. The final issue was the the IRBuilder under some circumstances would
build PHIs in the middle of the block.

llvm-svn: 209065
2014-05-17 06:51:36 +00:00
Louis Gerbarg 8d2a43e9be Add support for combining GEPs across PHI nodes
Currently LLVM will generally merge GEPs. This allows backends to use more
complex addressing modes. In some cases this is not happening because there
is PHI inbetween the two GEPs:

  GEP1--\
        |-->PHI1-->GEP3
  GEP2--/

This patch checks to see if GEP1 and GEP2 are similiar enough that they can be
cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):

  GEP1--\                     --\                           --\
        |-->PHI1-->GEP3  ==>    |-->PHI2->GEP12->GEP3 == >    |-->PHI2->GEP123
  GEP2--/                     --/                           --/

This also breaks certain use chains that are preventing GEP->GEP merges that the
the existing instcombine would merge otherwise.

Tests included.

rdar://15547484

llvm-svn: 209049
2014-05-16 23:47:24 +00:00
Dinesh Dwivedi 83c11da849 Reverting r208848, reason: build failure: sanitizer-x86_64-linux-bootstrap/builds/3399
llvm-svn: 208852
2014-05-15 08:22:55 +00:00
Dinesh Dwivedi f675f4201b Added instcombine for 'MIN(MIN(A, 27), 93)' and 'MAX(MAX(A, 93), 27)'
MIN(MIN(A, 23), 97) -> MIN(A, 23)
MAX(MAX(A, 97), 23) -> MAX(A, 97)

Differential Revision: http://reviews.llvm.org/D3629

llvm-svn: 208849
2014-05-15 06:13:40 +00:00
Dinesh Dwivedi 837c16097e Added inst combine transforms for single bit tests from Chris's note
if ((x & C) == 0) x |= C becomes x |= C
if ((x & C) != 0) x ^= C becomes x &= ~C
if ((x & C) == 0) x ^= C becomes x |= C
if ((x & C) != 0) x &= ~C becomes x &= ~C
if ((x & C) == 0) x &= ~C becomes nothing

Z3 Verifications code for above transform
http://rise4fun.com/Z3/Pmsh

Differential Revision: http://reviews.llvm.org/D3717

llvm-svn: 208848
2014-05-15 06:01:33 +00:00
David Majnemer 186c94244c InstCombine: Optimize -x s< cst
Summary:
This gets rid of a sub instruction by moving the negation to the
constant when valid.

Reviewers: nicholas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3773

llvm-svn: 208827
2014-05-15 00:02:20 +00:00
Jay Foad a0653a3e6c Rename ComputeMaskedBits to computeKnownBits. "Masked" has been
inappropriate since it lost its Mask parameter in r154011.

llvm-svn: 208811
2014-05-14 21:14:37 +00:00
Serge Pavlov e6de9e39a8 Fix the case when reordering shuffle and binop produces a constant.
This resolves PR19737.

llvm-svn: 208762
2014-05-14 09:05:09 +00:00
Nick Lewycky f0cf8fa941 Optimize integral reciprocal (udiv 1, x and sdiv 1, x) to not use division. This fires exactly once in a clang bootstrap, but covers a few different results from http://www.cs.utah.edu/~regehr/souper/
llvm-svn: 208750
2014-05-14 03:03:05 +00:00
Serge Pavlov b575ee8294 Fix type of shuffle resulted from shuffle merge.
This fix resolves PR19730.

llvm-svn: 208666
2014-05-13 06:07:21 +00:00
Serge Pavlov 02ff620c7b Fix type of shuffle obtained from reordering with binary operation
In transformation:
    BinOp(shuffle(v1,undef), shuffle(v2,undef)) -> shuffle(BinOp(v1, v2),undef)
type of the undef argument must be same as type of BinOp.

llvm-svn: 208531
2014-05-12 10:11:27 +00:00
Serge Pavlov 0581109708 Fix reordering of shuffles and binary operations
Do not apply transformation:

    BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))

if operands v1 and v2 are of different size.
This change fixes PR19717, which was caused by r208488.
    

llvm-svn: 208518
2014-05-12 05:44:53 +00:00
Serge Pavlov 9ef66a8266 Reorder shuffle and binary operation.
This patch enables transformations:

    BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))
    BinOp(shuffle(v1), const1) -> shuffle(BinOp, const2)

They allow to eliminate extra shuffles in some cases.

Differential Revision: http://reviews.llvm.org/D3525

llvm-svn: 208488
2014-05-11 08:46:12 +00:00
Michael Zolotukhin 292d3caa15 [InstCombine] Some cleanup in optimization of redundant insertvalue instructions.
And one more test added.

llvm-svn: 208355
2014-05-08 19:50:24 +00:00
Chandler Carruth d70cc604af Tidy up whitespace with clang-format prior to making significant
changes.

llvm-svn: 208229
2014-05-07 17:36:59 +00:00
Michael Zolotukhin 7d6293a0d3 [InstCombine] Add optimization of redundant insertvalue instructions.
rdar://problem/11861387

llvm-svn: 208214
2014-05-07 14:30:18 +00:00
Rafael Espindola 85f3610222 Also handle ConstantAggregateZero when optimizing vpermilvar*.
llvm-svn: 207582
2014-04-29 22:20:40 +00:00
Rafael Espindola 152ee213a4 Remove tabs.
Sorry, new machine and I forgot to change the editor setting.

llvm-svn: 207578
2014-04-29 21:02:37 +00:00
Rafael Espindola eb7bdbd0ce Two fixes to the vpermilvar optimization.
The instcomine logic to handle vpermilvar's pd and 256 variants was incorrect.
The _256 variants have indexes into the individual 128 bit lanes and in all
cases it also has to mask out unused bits.

llvm-svn: 207577
2014-04-29 20:41:54 +00:00
Hans Wennborg e36e116826 InstCombine: don't drop 'inalloca' in PromoteCastOfAllocation (PR19569)
llvm-svn: 207426
2014-04-28 17:40:03 +00:00
Craig Topper e73658ddbb [C++] Use 'nullptr'.
llvm-svn: 207394
2014-04-28 04:05:08 +00:00
Andrea Di Biagio 8cc9059ce8 [InstCombine][X86] Teach how to fold calls to SSE2/AVX2 packed logical shift
right intrinsics.

A packed logical shift right with a shift count bigger than or equal to the
element size always produces a zero vector. In all other cases, it can be
safely replaced by a 'lshr' instruction.

llvm-svn: 207299
2014-04-26 01:03:22 +00:00
Craig Topper f40110f4d8 [C++] Use 'nullptr'. Transforms edition.
llvm-svn: 207196
2014-04-25 05:29:35 +00:00
Michael J. Spencer dee4b2c379 [InstCombine][x86] Constant fold psll intrinsics.
This excludes avx512 as I don't have hardware to verify. It excludes _dq
variants because they are represented in the IR as <{2,4} x i64> when it's
actually a byte shift of the entire i{128,265}.

This also excludes _dq_bs as they aren't at all supported by the backend.
There are also no corresponding instructions in the ISA. I have no idea why
they exist...

llvm-svn: 207058
2014-04-24 00:58:18 +00:00
Filipe Cabecinhas 1a80595a2b Optimize some special cases for SSE4a insertqi
Summary:
Since the upper 64 bits of the destination register are undefined when
performing this operation, we can substitute it and let the optimizer
figure out that only a copy is needed.

Also added range merging, if an instruction copies a range that can be
merged with a previous copied range.

Added test cases for both optimizations.

Reviewers: grosbach, nadav

CC: llvm-commits

Differential Revision: http://reviews.llvm.org/D3357

llvm-svn: 207055
2014-04-24 00:38:14 +00:00
Matt Arsenault 60728177fb Handle addrspacecast when looking at memcpys from globals
llvm-svn: 207054
2014-04-24 00:01:09 +00:00
Matt Arsenault 6b4bed4b83 Remove dead code in instcombine.
Don't replace shifts greater than the type with the maximum shift.

This isn't hit anywhere in the tests, and somewhere else is replacing
these with undef.

llvm-svn: 207000
2014-04-23 16:48:40 +00:00
Chandler Carruth 964daaaf19 [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE
definition below all of the header #include lines, lib/Transforms/...
edition.

This one is tricky for two reasons. We again have a couple of passes
that define something else before the includes as well. I've sunk their
name macros with the DEBUG_TYPE.

Also, InstCombine contains headers that need DEBUG_TYPE, so now those
headers #define and #undef DEBUG_TYPE around their code, leaving them
well formed modular headers. Fixing these headers was a large motivation
for all of these changes, as "leaky" macros of this form are hard on the
modules implementation.

llvm-svn: 206844
2014-04-22 02:55:47 +00:00
Rafael Espindola bad3f77703 Simplify a vpermil* with constant mask.
With a constant mask a vpermil* is just a shufflevector. This patch implements
that simplification. This allows us to produce denser code. It should also
allow more folding down the line.

llvm-svn: 206801
2014-04-21 22:06:04 +00:00
Chandler Carruth 5f1f26e891 [Modules] Sink all the DEBUG_TYPE defines for InstCombine out of the
header files and into the cpp files.

These files will require more touches as the header files actually use
DEBUG(). Eventually, I'll have to introduce a matched #define and #undef
of DEBUG_TYPE for the header files, but that comes as step N of many to
clean all of this up.

llvm-svn: 206777
2014-04-21 19:51:41 +00:00
Matt Arsenault fed3dc8dc6 Revert "Revert r206045, "Fix shift by constants for vector.""
Fix cases where the Value itself is used, and not the constant value.

llvm-svn: 206214
2014-04-14 21:50:37 +00:00
NAKAMURA Takumi 58ad0c87f8 Whitespace.
llvm-svn: 206154
2014-04-14 07:03:13 +00:00
NAKAMURA Takumi 26afa982ec Revert r206045, "Fix shift by constants for vector."
It broke some builders, at least, i686.

llvm-svn: 206153
2014-04-14 07:02:57 +00:00
Serge Pavlov b5f3ddc7a1 Use APInt arithmetic, fixed typo. Thanks to Benjamin Kramer for noticing that.
llvm-svn: 206144
2014-04-14 02:20:19 +00:00
Serge Pavlov 4bb54d51c8 Recognize test for overflow in integer multiplication.
If multiplication involves zero-extended arguments and the result is
compared as in the patterns:

    %mul32 = trunc i64 %mul64 to i32
    %zext = zext i32 %mul32 to i64
    %overflow = icmp ne i64 %mul64, %zext
or
    %overflow = icmp ugt i64 %mul64 , 0xffffffff

then the multiplication may be replaced by call to umul.with.overflow.
This change fixes PR4917 and PR4918.

Differential Revision: http://llvm-reviews.chandlerc.com/D2814

llvm-svn: 206137
2014-04-13 18:23:41 +00:00
Matt Arsenault 173a1e577c Fix shift by constants for vector.
ashr <N x iM>, <N x iM> M -> undef

llvm-svn: 206045
2014-04-11 17:57:53 +00:00
Eli Bendersky 9966b26dac Fix PR19270 - type mismatch caused by invalid optimization.
Patch by Jingyue Wu.

llvm-svn: 205547
2014-04-03 17:51:58 +00:00
Tim Northover 00ed9964c6 ARM64: initial backend import
This adds a second implementation of the AArch64 architecture to LLVM,
accessible in parallel via the "arm64" triple. The plan over the
coming weeks & months is to merge the two into a single backend,
during which time thorough code review should naturally occur.

Everything will be easier with the target in-tree though, hence this
commit.

llvm-svn: 205090
2014-03-29 10:18:08 +00:00
Erik Verbruggen 5e1bac3a38 Revert "InstCombine: merge constants in both operands of icmp."
This reverts commit r204912, and follow-up commit r204948.

This introduced a performance regression, and the fix is not completely
clear yet.

llvm-svn: 205010
2014-03-28 14:50:57 +00:00
Reid Kleckner 3bdf9bc48b InstCombine: Don't combine constants on unsigned icmps
Fixes a miscompile introduced in r204912.  It would miscompile code like
(unsigned)(a + -49) <= 5U.  The transform would turn this into
(unsigned)a < 55U, which would return true for values in [0, 49], when
it should not.

llvm-svn: 204948
2014-03-27 17:49:27 +00:00
Erik Verbruggen 59a1219846 InstCombine: merge constants in both operands of icmp.
Transform:
    icmp X+Cst2, Cst
into:
    icmp X, Cst-Cst2
when Cst-Cst2 does not overflow, and the add has nsw.

llvm-svn: 204912
2014-03-27 11:16:05 +00:00
Richard Osborne 0af4aa9a19 [InstCombine] Don't fold bitcast into store if it would need addrspacecast
Summary:
Previously the code didn't check if the before and after types for the
store were pointers to different address spaces. This resulted in
instcombine using a bitcast to convert between pointers to different
address spaces, causing an assertion due to the invalid cast.

It is not be appropriate to use addrspacecast this case because it is
not guaranteed to be a no-op cast. Instead bail out and do not do the
transformation.

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D3117

llvm-svn: 204733
2014-03-25 17:21:41 +00:00
Richard Osborne 9805ec457d Reuse earlier variables to make it clear the types involved in the cast.
No functionality change.

llvm-svn: 204732
2014-03-25 17:21:35 +00:00
Owen Anderson 9b8f9c3d95 Fix a bug in InstCombine where we would incorrectly attempt to construct a
bitcast between pointers of two different address spaces if they happened to have
the same pointer size.

llvm-svn: 203862
2014-03-13 22:51:43 +00:00
Chandler Carruth cdf4788401 [C++11] Add range based accessors for the Use-Def chain of a Value.
This requires a number of steps.
1) Move value_use_iterator into the Value class as an implementation
   detail
2) Change it to actually be a *Use* iterator rather than a *User*
   iterator.
3) Add an adaptor which is a User iterator that always looks through the
   Use to the User.
4) Wrap these in Value::use_iterator and Value::user_iterator typedefs.
5) Add the range adaptors as Value::uses() and Value::users().
6) Update *all* of the callers to correctly distinguish between whether
   they wanted a use_iterator (and to explicitly dig out the User when
   needed), or a user_iterator which makes the Use itself totally
   opaque.

Because #6 requires churning essentially everything that walked the
Use-Def chains, I went ahead and added all of the range adaptors and
switched them to range-based loops where appropriate. Also because the
renaming requires at least churning every line of code, it didn't make
any sense to split these up into multiple commits -- all of which would
touch all of the same lies of code.

The result is still not quite optimal. The Value::use_iterator is a nice
regular iterator, but Value::user_iterator is an iterator over User*s
rather than over the User objects themselves. As a consequence, it fits
a bit awkwardly into the range-based world and it has the weird
extra-dereferencing 'operator->' that so many of our iterators have.
I think this could be fixed by providing something which transforms
a range of T&s into a range of T*s, but that *can* be separated into
another patch, and it isn't yet 100% clear whether this is the right
move.

However, this change gets us most of the benefit and cleans up
a substantial amount of code around Use and User. =]

llvm-svn: 203364
2014-03-09 03:16:01 +00:00
Tim Northover fad2761ca0 InstCombine: form shuffles from wider range of insert/extractelements
Sequences of insertelement/extractelements are sometimes used to build
vectorsr; this code tries to put them back together into shuffles, but
could only produce a completely uniform shuffle types (<N x T> from two
<N x T> sources).

This should allow shuffles with different numbers of elements on the
input and output sides as well.

llvm-svn: 203229
2014-03-07 10:24:44 +00:00
Chandler Carruth 7da14f1ab9 [Layering] Move InstVisitor.h into the IR library as it is pretty
obviously coupled to the IR.

llvm-svn: 203064
2014-03-06 03:23:41 +00:00
Craig Topper 3e4c697ca1 [C++11] Add 'override' keyword to virtual methods that override their base class.
llvm-svn: 202953
2014-03-05 09:10:37 +00:00
Chandler Carruth 8cd041ef19 [Modules] Move the ConstantRange class into the IR library. This is
a bit surprising, as the class is almost entirely abstracted away from
any particular IR, however it encodes the comparsion predicates which
mutate ranges as ICmp predicate codes. This is reasonable as they're
used for both instructions and constants. Thus, it belongs in the IR
library with instructions and constants.

llvm-svn: 202838
2014-03-04 12:24:34 +00:00
Chandler Carruth 452a00747e [Modules] Move the TargetFolder into the Analysis library. Historically,
this would have been required because of the use of DataLayout, but that
has moved into the IR proper. It is still required because this folder
uses the constant folding in the analysis library (which uses the
datalayout) as the more aggressive basis of its folder.

llvm-svn: 202832
2014-03-04 11:59:06 +00:00
Chandler Carruth 1305dc3351 [Modules] Move CFG.h to the IR library as it defines graph traits over
IR types.

llvm-svn: 202827
2014-03-04 11:45:46 +00:00
Chandler Carruth 4220e9c154 [Modules] Move ValueHandle into the IR library where Value itself lives.
Move the test for this class into the IR unittests as well.

This uncovers that ValueMap too is in the IR library. Ironically, the
unittest for ValueMap is useless in the Support library (honestly, so
was the ValueHandle test) and so it already lives in the IR unittests.
Mmmm, tasty layering.

llvm-svn: 202821
2014-03-04 11:17:44 +00:00
Chandler Carruth 820a908df7 [Modules] Move the LLVM IR pattern match header into the IR library, it
obviously is coupled to the IR.

llvm-svn: 202818
2014-03-04 11:08:18 +00:00
Chandler Carruth 219b89b987 [Modules] Move CallSite into the IR library where it belogs. It is
abstracting between a CallInst and an InvokeInst, both of which are IR
concepts.

llvm-svn: 202816
2014-03-04 11:01:28 +00:00
Chandler Carruth 03eb0de93d [Modules] Move GetElementPtrTypeIterator into the IR library. As its
name might indicate, it is an iterator over the types in an instruction
in the IR.... You see where this is going.

Another step of modularizing the support library.

llvm-svn: 202815
2014-03-04 10:40:04 +00:00
Rafael Espindola 935125126c Make DataLayout a plain object, not a pass.
Instead, have a DataLayoutPass that holds one. This will allow parts of LLVM
don't don't handle passes to also use DataLayout.

llvm-svn: 202168
2014-02-25 17:30:31 +00:00
Rafael Espindola aeff8a9c05 Make some DataLayout pointers const.
No functionality change. Just reduces the noise of an upcoming patch.

llvm-svn: 202087
2014-02-24 23:12:18 +00:00
Rafael Espindola 37dc9e19f5 Rename many DataLayout variables from TD to DL.
I am really sorry for the noise, but the current state where some parts of the
code use TD (from the old name: TargetData) and other parts use DL makes it
hard to write a patch that changes where those variables come from and how
they are passed along.

llvm-svn: 201827
2014-02-21 00:06:31 +00:00
Nick Lewycky 75080ff25d Make sure that value handle users see the transformation of an indirect call to a direct call. This is important for the CallGraph iteration. Patch by Björn Steinbrink!
llvm-svn: 201822
2014-02-20 23:00:15 +00:00
Matt Arsenault aa689f5079 Do more addrspacecast transforms that happen for bitcast.
Makes addrspacecast (gep) do addrspacecast (gep) instead.

llvm-svn: 201376
2014-02-14 00:49:12 +00:00
Benjamin Kramer 920409585f InstCombine: Replace custom constant folding code with ConstantExpr.
llvm-svn: 201352
2014-02-13 18:23:24 +00:00
Owen Anderson 883b5add8e Remove a very old instcombine where we would turn sequences of selects into
logical operations on the i1's driving them.  This is a bad idea for every
target I can think of (confirmed with micro tests on all of: x86-64, ARM,
AArch64, Mips, and PowerPC) because it forces the i1 to be materialized into
a general purpose register, whereas consuming it directly into a select generally
allows it to exist only transiently in a predicate or flags register.

Chandler ran a set of performance tests with this change, and reported no
measurable change on x86-64.

llvm-svn: 201275
2014-02-12 23:54:07 +00:00
Benjamin Kramer 94fc18d040 InstCombine: Teach icmp merging about the equivalence of bit tests and UGE/ULT with a power of 2.
This happens in bitfield code. While there reorganize the existing code
a bit.

llvm-svn: 201176
2014-02-11 21:09:03 +00:00
Paul Robinson af4e64d095 Disable most IR-level transform passes on functions marked 'optnone'.
Ideally only those transform passes that run at -O0 remain enabled,
in reality we get as close as we reasonably can.
Passes are responsible for disabling themselves, it's not the job of
the pass manager to do it for them.

llvm-svn: 200892
2014-02-06 00:07:05 +00:00
Reid Kleckner 26af2cae05 Update optimization passes to handle inalloca arguments
Summary:
I searched Transforms/ and Analysis/ for 'ByVal' and updated those call
sites to check for inalloca if appropriate.

I added tests for any change that would allow an optimization to fire on
inalloca.

Reviewers: nlewycky

Differential Revision: http://llvm-reviews.chandlerc.com/D2449

llvm-svn: 200281
2014-01-28 02:38:36 +00:00
Benjamin Kramer 09b0f88a7f InstCombine: Don't try to use aggregate elements of ConstantExprs.
PR18600.

llvm-svn: 200028
2014-01-24 19:02:37 +00:00
Alp Toker cb40291100 Fix known typos
Sweep the codebase for common typos. Includes some changes to visible function
names that were misspelt.

llvm-svn: 200018
2014-01-24 17:20:08 +00:00
Owen Anderson 1664dc8973 Fix all the remaining lost-fast-math-flags bugs I've been able to find. The most important of these are cases in the generic logic for combining BinaryOperators.
This logic hadn't been updated to handle FastMathFlags, and it took me a while to detect it because it doesn't show up in a simple search for CreateFAdd.

llvm-svn: 199629
2014-01-20 07:44:53 +00:00
Benjamin Kramer b80e1699b3 InstCombine: Modernize a bunch of cast combines.
Also make them vector-aware.

llvm-svn: 199608
2014-01-19 20:05:13 +00:00
Benjamin Kramer 970f4959d4 InstCombine: Hoist 3 copies of AddOne/SubOne into a header.
llvm-svn: 199605
2014-01-19 16:56:10 +00:00
Benjamin Kramer 7a74bd4703 InstCombine: Replace a hand-rolled version of isKnownToBeAPowerOfTwo with the real thing.
llvm-svn: 199604
2014-01-19 16:48:41 +00:00
Benjamin Kramer 72196f3ae5 InstCombine: Teach most integer add/sub/mul/div combines how to deal with vectors.
llvm-svn: 199602
2014-01-19 15:24:22 +00:00
Benjamin Kramer 76b15d04ff InstCombine: Refactor fmul/fdiv combines to handle vectors.
llvm-svn: 199598
2014-01-19 13:36:27 +00:00
Nick Lewycky a6a17d77d2 Don't refuse to transform constexpr(call(arg, ...)) to call(constexpr(arg), ...)) just because the function has multiple return values even if their return types are the same. Patch by Eduard Burtescu!
llvm-svn: 199564
2014-01-18 22:47:12 +00:00
Benjamin Kramer fea9ac99b0 InstCombine: Make the (fmul X, -1.0) -> (fsub -0.0, X) transform handle vectors too.
PR18532.

llvm-svn: 199553
2014-01-18 16:43:14 +00:00
Owen Anderson 48b842ef7c Fix more instances of dropped fast math flags when optimizing FADD instructions. All found by inspection (aka grep).
llvm-svn: 199528
2014-01-18 00:48:14 +00:00
Owen Anderson e7321660c1 Fix two cases where we could lose fast math flags when optimizing FADD expressions.
llvm-svn: 199427
2014-01-16 21:26:02 +00:00
Owen Anderson 4557a156e3 Fix an instance where we would drop fast math flags when performing an fdiv to reciprocal multiply transformation.
llvm-svn: 199425
2014-01-16 21:07:52 +00:00
Owen Anderson e8537fc7e0 Fix a bug in InstCombine where we failed to preserve fast math flags when optimizing an FMUL expression.
llvm-svn: 199424
2014-01-16 20:59:41 +00:00
Owen Anderson f74cfe031f Teach InstCombine that (fmul X, -1.0) can be simplified to (fneg X), which LLVM expresses as (fsub -0.0, X).
llvm-svn: 199420
2014-01-16 20:36:42 +00:00
Matt Arsenault 2d353d1a10 Do pointer cast simplifications on addrspacecast
llvm-svn: 199254
2014-01-14 20:00:45 +00:00
Matt Arsenault f08a44f903 Remove a check for an illegal condition.
Bitcasts can't be between address spaces anymore.

llvm-svn: 199253
2014-01-14 19:56:57 +00:00
Hao Liu 26abebbb2c Fix a bug about generating undef operand when optimising shuffle vector and insert element in instruction combine.
llvm-svn: 198730
2014-01-08 03:06:15 +00:00
Kay Tiong Khoo e37d52095e Stay classy (and legal) LLVM. Remove links to 3rd party SMT solver whose links may not be permanent.
llvm-svn: 197713
2013-12-19 18:35:54 +00:00
Kay Tiong Khoo a570b5adb5 Improved fix for PR17827 (instcombine of shift/and/compare).
This change fixes the case of arithmetic shift right - do not attempt to fold that case.
This change also relaxes the conditions when attempting to fold the logical shift right and shift left cases.

No additional IR-level test cases included at this time. See http://llvm.org/bugs/show_bug.cgi?id=17827 for proofs that these are correct transformations.

llvm-svn: 197705
2013-12-19 18:07:17 +00:00
Matt Arsenault bbf18c6958 Fix assert with copy from global through addrspacecast
llvm-svn: 196638
2013-12-07 02:58:45 +00:00
Duncan P. N. Exon Smith ce5f93efd5 Don't use isNullValue to evaluate ConstantExpr
ConstantExpr can evaluate to false even when isNullValue gives false.

Fixes PR18143.

llvm-svn: 196611
2013-12-06 21:48:36 +00:00
Kay Tiong Khoo d7b00cac10 Use local variable for repeated use rather than 'get' method. No functional change intended.
llvm-svn: 196164
2013-12-02 22:23:32 +00:00
Kay Tiong Khoo 64b732005f Move variables to where they are used and give them better names. No functional change intended.
llvm-svn: 196163
2013-12-02 22:20:40 +00:00
Kay Tiong Khoo 564560f911 Rename variables to be consistent (CST -> Cst). No functional change intended.
llvm-svn: 196161
2013-12-02 22:11:56 +00:00
Kay Tiong Khoo 5389f74655 Conservative fix for PR17827 - don't optimize a shift + and + compare sequence where the shift is logical unless the comparison is unsigned
llvm-svn: 196129
2013-12-02 18:43:59 +00:00
Stephen Canon c454964c47 Rein in overzealous InstCombine of fptrunc(OP(fpextend, fpextend)).
llvm-svn: 195934
2013-11-28 21:38:05 +00:00
Hal Finkel 12100bf7e8 Apply the InstCombine fptrunc sqrt optimization to llvm.sqrt
InstCombine, in visitFPTrunc, applies the following optimization to sqrt calls:

  (fptrunc (sqrt (fpext x))) -> (sqrtf x)

but does not apply the same optimization to llvm.sqrt. This is a problem
because, to enable vectorization, Clang generates llvm.sqrt instead of sqrt in
fast-math mode, and because this optimization is being applied to sqrt and not
applied to llvm.sqrt, sometimes the fast-math code is slower.

This change makes InstCombine apply this optimization to llvm.sqrt as well.

This fixes the specific problem in PR17758, although the same underlying issue
(optimizations applied to libcalls are not applied to intrinsics) exists for
other optimizations in SimplifyLibCalls.

llvm-svn: 194935
2013-11-16 21:29:08 +00:00
Benjamin Kramer 03f3e248eb InstCombine: fold (A >> C) == (B >> C) --> (A^B) < (1 << C) for constant Cs.
This is common in bitfield code.

llvm-svn: 194925
2013-11-16 16:00:48 +00:00
Matt Arsenault a9e95abcbf Add instcombine visitor for addrspacecast
llvm-svn: 194786
2013-11-15 05:45:08 +00:00
Nadav Rotem ea186b9515 Update the docs to match the function name.
llvm-svn: 194537
2013-11-13 01:12:01 +00:00
Nadav Rotem 0ed2fdb5af Fold (iszero(A&K1) | iszero(A&K2)) -> (A&(K1|K2)) != (K1|K2) if we know that K1 and K2 are 'one-hot' (only one bit is on).
llvm-svn: 194525
2013-11-12 22:38:59 +00:00
Matt Arsenault 243140f2fd Scalarize select vector arguments when extracted.
When the elements are extracted from a select on vectors
or a vector select, do the select on the extracted scalars
from the input if there is only one use.

llvm-svn: 194013
2013-11-04 20:36:06 +00:00
Craig Topper ef9e993eaa Remove x86_sse42_crc32_64_8 intrinsic. It has no functional difference from x86_sse42_crc32_32_8 and was not mapped to a clang builtin. I'm not even sure why this form of the instruction is even called out explicitly in the docs. Also add AutoUpgrade support to convert it into the other intrinsic with appropriate trunc and zext.
llvm-svn: 192672
2013-10-15 05:20:47 +00:00
Owen Anderson 5797bfd4a3 Pull fptrunc's upwards through selects when one of the select's selectands was a constant. This has a number of benefits, including producing small immediates (easier to materialize, smaller constant pools) as well as being more likely to allow the fptrunc to fuse with a preceding instruction (truncating selects are unusual).
llvm-svn: 191929
2013-10-03 21:08:05 +00:00
Matt Arsenault bfa37e546d Make gep i8* X, -(ptrtoint Y) transform work with address spaces
llvm-svn: 191920
2013-10-03 18:15:57 +00:00
Matt Arsenault 8468062c6e Use right address space size in InstCombineCompares
The test's output doesn't change, but this ensures
this is actually hit with a different address space.

llvm-svn: 191701
2013-09-30 21:11:01 +00:00
Matt Arsenault 06adecabe7 Constant fold ptrtoint + compare with address spaces
llvm-svn: 191699
2013-09-30 21:06:18 +00:00
Benjamin Kramer 6748576a0d InstCombine: Replace manual fast math flag copying with the new IRBuilder RAII helper.
Defines away the issue where cast<Instruction> would fail because constant
folding happened. Also slightly cleaner.

llvm-svn: 191674
2013-09-30 15:39:59 +00:00
Joey Gouly d51a35c6a0 Fix a bug in InstCombine where it attempted to cast a Value* to an Instruction*
when it was actually a Constant*.

There are quite a few other casts to Instruction that might have the same problem,
but this is the only one I have a test case for.

llvm-svn: 191668
2013-09-30 14:18:35 +00:00
Matt Arsenault fa25272db9 Use type helper functions
llvm-svn: 191574
2013-09-27 22:18:51 +00:00
Justin Bogner 4a9ac8cd75 InstCombine: Only foldSelectICmpAndOr for integer types
Currently foldSelectICmpAndOr asserts if the "or" involves a vector
containing several of the same power of two. We can easily avoid this by
only performing the fold on integer types, like foldSelectICmpAnd does.

Fixes <rdar://problem/15012516>

llvm-svn: 191552
2013-09-27 20:35:39 +00:00
Benjamin Kramer 30d249a1b3 Push analysis passes to InstSimplify when they're around anyways.
llvm-svn: 191309
2013-09-24 16:37:40 +00:00
Benjamin Kramer 0e2d162d1e InstCombine: Remove unused argument. No functionality change.
llvm-svn: 191112
2013-09-20 22:12:42 +00:00
Benjamin Kramer e6461e3053 InstCombine: Canonicalize (gep i8* X, -(ptrtoint Y)) to (sub (ptrtoint X), (ptrtoint Y))
The GEP pattern is what SCEV expander emits for "ugly geps". The latter is what
you get for pointer subtraction in C code. The rest of instcombine already
knows how to deal with that so just canonicalize on that.

llvm-svn: 191090
2013-09-20 14:38:44 +00:00
Shuxin Yang 3a7ca6ec87 [Fast-math] Disable "(C1/X)*C2 => (C1*C2)/X" if C1/X has multiple uses.
If "C1/X" were having multiple uses, the only benefit of this
transformation is to potentially shorten critical path. But it is at the
cost of instroducing additional div.

  The additional div may or may not incur cost depending on how div is
implemented. If it is implemented using Newton–Raphson iteration, it dosen't
seem to incur any cost (FIXME). However, if the div blocks the entire
pipeline, that sounds to be pretty expensive. Let CodeGen to take care 
this transformation.

  This patch sees 6% on a benchmark.

rdar://15032743

llvm-svn: 191037
2013-09-19 21:13:46 +00:00
Benjamin Kramer 0b37cdf9af InstCombine: Don't allow turning vector-of-pointer loads into vector-of-integer.
The code below can't handle any pointers. PR17293.

llvm-svn: 191036
2013-09-19 20:59:04 +00:00
Quentin Colombet 870b662779 Revert the load slicing done in r190870.
To avoid regressions with bitfield optimizations, this slicing should take place
later, like ISel time.

llvm-svn: 190891
2013-09-17 22:01:26 +00:00
Matt Arsenault e6952f28ca Cleanup handling of constant function casts.
Some of this code is no longer necessary since int<->ptr casts are no
longer occur as of r187444.

This also fixes handling vectors of pointers, and adds a bunch of new
testcases for vectors and address spaces.

llvm-svn: 190885
2013-09-17 21:10:14 +00:00
Quentin Colombet b8d672ef5b [InstCombiner] Slice a big load in two loads when the elements are next to each
other in memory.

The motivation was to get rid of truncate and shift right instructions that get
in the way of paired load or floating point load.
E.g.,
Consider the following example:
struct Complex {
  float real;
  float imm;
};

When accessing a complex, llvm was generating a 64-bits load and the imm field
was obtained by a trunc(lshr) sequence, resulting in poor code generation, at
least for x86.

The idea is to declare that two load instructions is the canonical form for
loading two arithmetic type, which are next to each other in memory.

Two scalar loads at a constant offset from each other are pretty
easy to detect for the sorts of passes that like to mess with loads. 

<rdar://problem/14477220>

llvm-svn: 190870
2013-09-17 16:57:34 +00:00
Eli Friedman 77d7fbb924 Get rid of unused isPodLike definitions.
llvm-svn: 190461
2013-09-11 00:36:54 +00:00
Quentin Colombet 5ab555532b [InstCombiner] Expose opportunities to merge subtract and comparison.
Several architectures use the same instruction to perform both a comparison and
a subtract. The instruction selection framework does not allow to consider
different basic blocks to expose such fusion opportunities.

Therefore, these instructions are “merged” by CSE at MI IR level.

To increase the likelihood of CSE to apply in such situation, we reorder the
operands of the comparison, when they have the same complexity, so that they
matches the order of the most frequent subtract.
E.g.,

icmp A, B
...
sub B, A

<rdar://problem/14514580>

llvm-svn: 190352
2013-09-09 20:56:48 +00:00
Matt Arsenault 8227b9f69c Use type helper functions.
llvm-svn: 190113
2013-09-06 00:37:24 +00:00
Matt Arsenault e6db76071c Consistently use dbgs() in debug printing
llvm-svn: 190093
2013-09-05 19:48:28 +00:00
Tim Northover dc647a2603 InstCombine: allow unmasked icmps to be combined with logical ops
"(icmp op i8 A, B)" is equivalent to "(icmp op i8 (A & 0xff), B)" as a
degenerate case. Allowing this as a "masked" comparison when analysing "(icmp)
&/| (icmp)" allows us to combine them in more cases.

rdar://problem/7625728

llvm-svn: 189931
2013-09-04 11:57:17 +00:00
Tim Northover c0756c454c InstCombine: look for masked compares with subset relation
Even in cases which aren't universally optimisable like "(A & B) != 0 && (A &
C) != 0", the masks can make one of the comparisons completely redundant. In
this case, since we've gone to the effort of spotting masked comparisons we
should combine them.

rdar://problem/7625728

llvm-svn: 189930
2013-09-04 11:57:13 +00:00
Matt Arsenault 3dfe54e954 Teach InstCombineLoadCast about address spaces.
This is another one that doesn't matter much,
but uses the right GEP index types in the first
place.

llvm-svn: 189854
2013-09-03 21:05:48 +00:00
Matt Arsenault e38e4cdc46 Use type form of getIntPtrType in alloca visitor.
This doesn't actually matter, since alloca is always
0 address space, but this is more consistent.

llvm-svn: 189853
2013-09-03 21:05:15 +00:00
Benjamin Kramer 010f108382 InstCombine: Check for zero shift amounts before subtracting one causing integer overflow.
PR17026. Also avoid undefined shifts and shift amounts larger than 64 bits
(those are always undef because we can't represent integer types that large).

llvm-svn: 189672
2013-08-30 14:35:35 +00:00
Matt Arsenault 38874731f6 Fix typo.
llvm-svn: 189524
2013-08-28 22:17:26 +00:00
Matt Arsenault 745101d666 Teach InstCombine about address spaces
llvm-svn: 188926
2013-08-21 19:53:10 +00:00
Jakub Staszak b4eb6adebb Use pop_back_val() instead of both back() and pop_back().
llvm-svn: 188723
2013-08-19 22:47:55 +00:00
Matt Arsenault d79f7d9ea1 Teach InstCombine visitGetElementPtr about address spaces
llvm-svn: 188721
2013-08-19 22:17:40 +00:00
Matt Arsenault 98f34e3abe Cleanup visitGetElementPtr to make address space change easier
llvm-svn: 188720
2013-08-19 22:17:34 +00:00
Matt Arsenault 94a028aa43 commonPointerCast cleanups to make address space change easier
llvm-svn: 188719
2013-08-19 22:17:18 +00:00
Matt Arsenault 5aeae18e9d Revert non-test parts of r188507
Re-add the inboundsless tests I didn't add originally

llvm-svn: 188710
2013-08-19 21:40:31 +00:00
Jim Grosbach d0de8ace8a InstCombine: Use isAllOnesValue() instead of explicit -1.
llvm-svn: 188563
2013-08-16 17:03:36 +00:00
Jim Grosbach 20e3b9ac30 InstCombine: Simplify if(x!=0 && x!=-1).
When both constants are positive or both constants are negative,
InstCombine already simplifies comparisons like this, but when
it's exactly zero and -1, the operand sorting ends up reversed
and the pattern fails to match. Handle that special case.

Follow up for rdar://14689217

llvm-svn: 188512
2013-08-16 00:15:20 +00:00
Matt Arsenault 1de76773bc Don't do FoldCmpLoadFromIndexedGlobal for non inbounds GEPs
This path wasn't tested before without a datalayout,
so add some more tests and re-run with and without one.

llvm-svn: 188507
2013-08-15 23:11:07 +00:00
Matt Arsenault 9e3a6ca698 Fix always creating GEP with i32 indices
Use the pointer size if datalayout is available.
Use i64 if it's not, which is consistent with what other
places do when the pointer size is unknown.

The test doesn't really test this in a useful way
since it will be transformed to that later anyway,
but this now tests it for non-zero arrays and when
datalayout isn't available. The cases in
visitGetElementPtrInst should save an extra re-visit to
the newly created GEP since it won't need to cleanup after
itself.

llvm-svn: 188339
2013-08-14 00:24:38 +00:00
Matt Arsenault fc00f7eabd Use type helper functions instead of cast
llvm-svn: 188338
2013-08-14 00:24:34 +00:00
Matt Arsenault 640ff9dbcf Use array initializer, space around operator
llvm-svn: 188337
2013-08-14 00:24:05 +00:00
Richard Sandiford feb34713d5 Fix big-endian handling of integer-to-vector bitcasts in InstCombine
These functions used to assume that the lsb of an integer corresponds
to vector element 0, whereas for big-endian it's the other way around:
the msb is in the first element and the lsb is in the last element.

Fixes MultiSource/Benchmarks/mediabench/gsm/toast for z.

llvm-svn: 188155
2013-08-12 07:26:09 +00:00
Matt Arsenault ff7dc7248e Fix missing -*- C++ -*-s
llvm-svn: 187758
2013-08-06 00:16:21 +00:00
Owen Anderson c7be519dc0 Preserve fast-math flags when folding (fsub x, (fneg y)) to (fadd x, y).
llvm-svn: 187462
2013-07-30 23:53:17 +00:00
Matt Arsenault cacbb2377a Change behavior of calling bitcasted alias functions.
It will now only convert the arguments / return value and call
the underlying function if the types are able to be bitcasted.
This avoids using fp<->int conversions that would occur before.

llvm-svn: 187444
2013-07-30 20:45:05 +00:00
Owen Anderson d6d4da09f7 Fix variable name.
llvm-svn: 187253
2013-07-26 22:06:21 +00:00
Owen Anderson e37c2e4d11 When InstCombine tries to fold away (fsub x, (fneg y)) into (fadd x, y), it is
also worthwhile for it to look through FP extensions and truncations, whose
application commutes with fneg.

llvm-svn: 187249
2013-07-26 21:40:29 +00:00
Stephen Lin 4ef1387221 Correct case of m_UIToFp to m_UIToFP to match instruction name, add m_SIToFP for consistency.
llvm-svn: 187225
2013-07-26 17:55:00 +00:00
Stephen Lin a9b57f6bea InstCombine: call FoldOpIntoSelect for all floating binops, not just fmul
llvm-svn: 186759
2013-07-20 07:13:13 +00:00
Stephen Lin 03f9fbbcd7 Restore r181216, which was partially reverted in r182499.
llvm-svn: 186533
2013-07-17 20:06:03 +00:00
Craig Topper 5871321e49 Use llvm::array_lengthof to replace sizeof(array)/sizeof(array[0]).
llvm-svn: 186301
2013-07-15 04:27:47 +00:00
Craig Topper b94011fd28 Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size.
llvm-svn: 186274
2013-07-14 04:42:23 +00:00
Nick Lewycky 7459be6dc7 Add a microoptimization for urem.
llvm-svn: 186235
2013-07-13 01:16:47 +00:00
Joey Gouly a3250f22c2 Fix a crash in EvaluateInDifferentElementOrder where it would generate an
undef vector of the wrong type.

LGTM'd by Nick Lewycky on IRC.

llvm-svn: 186224
2013-07-12 23:08:06 +00:00
Benjamin Kramer fc3ea6f4bc Don't use a potentially expensive shift if all we want is one set bit.
No functionality change.

llvm-svn: 186095
2013-07-11 16:05:50 +00:00
David Majnemer eeed73b981 InstCombine: Fix typo in comment for visitICmpInstWithInstAndIntCst
llvm-svn: 185916
2013-07-09 09:24:35 +00:00
David Majnemer 72d76275ac InstCombine: variations on 0xffffffff - x >= 4
The following transforms are valid if -C is a power of 2:
(icmp ugt (xor X, C), ~C) -> (icmp ult X, C)
(icmp ult (xor X, C), -C) -> (icmp uge X, C)

These are nice, they get rid of the xor.

llvm-svn: 185915
2013-07-09 09:20:58 +00:00
David Majnemer 414d4e58aa InstCombine: X & -C != -C -> X <= u ~C
Tests were added in r185910 somehow.

llvm-svn: 185912
2013-07-09 08:09:32 +00:00
David Majnemer bafa537eb7 Commit r185909 was a misapplied patch, fix it
llvm-svn: 185910
2013-07-09 07:58:32 +00:00
David Majnemer f2a9a513c7 InstCombine: add more transforms
C1-X <u C2 -> (X|(C2-1)) == C1
C1-X >u C2 -> (X|C2) == C1
X-C1 <u C2 -> (X & -C2) == C1
X-C1 >u C2 -> (X & ~C2) == C1

llvm-svn: 185909
2013-07-09 07:50:59 +00:00
David Majnemer fa90a0b325 InstCombine: Fold X-C1 <u 2 -> (X & -2) == C1
Back in r179493 we determined that two transforms collided with each
other.  The fix back then was to reorder the transforms so that the
preferred transform would give it a try and then we would try the
secondary transform.  However, it was noted that the best approach would
canonicalize one transform into the other, removing the collision and
allowing us to optimize IR given to us in that form.

llvm-svn: 185808
2013-07-08 11:53:08 +00:00
David Majnemer c2a990bc00 InstCombine: (icmp eq B, 0) | (icmp ult A, B) -> (icmp ule A, B-1)
This transform allows us to turn IR that looks like:
  %1 = icmp eq i64 %b, 0
  %2 = icmp ult i64 %a, %b
  %3 = or i1 %1, %2
  ret i1 %3

into:
  %0 = add i64 %b, -1
  %1 = icmp uge i64 %0, %a
  ret i1 %1

which means we go from lowering:
        cmpq    %rsi, %rdi
        setb    %cl
        testq   %rsi, %rsi
        sete    %al
        orb     %cl, %al
        ret

to lowering:
        decq    %rsi
        cmpq    %rdi, %rsi
        setae   %al
        ret

llvm-svn: 185677
2013-07-05 00:31:17 +00:00
David Majnemer 37f8f445de InstCombine: Reimplementation of visitUDivOperand
This transform was originally added in r185257 but later removed in
r185415.  The original transform would create instructions speculatively
and then discard them if the speculation was proved incorrect.  This has
been replaced with a scheme that splits the transform into two parts:
preflight and fold.  While we preflight, we build up fold actions that
inform the folding stage on how to act.

llvm-svn: 185667
2013-07-04 21:17:49 +00:00
Craig Topper af0dea1347 Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size.
llvm-svn: 185606
2013-07-04 01:31:24 +00:00
Hal Finkel fdbe161b1a Revert r185257 (InstCombine: Be more agressive optimizing 'udiv' instrs with 'select' denoms)
I'm reverting this commit because:

 1. As discussed during review, it needs to be rewritten (to avoid creating and
then deleting instructions).

 2. This is causing optimizer crashes. Specifically, I'm seeing things like
this:

    While deleting: i1 %
    Use still stuck around after Def is destroyed:  <badref> = select i1 <badref>, i32 0, i32 1
    opt: /src/llvm-trunk/lib/IR/Value.cpp:79: virtual llvm::Value::~Value(): Assertion `use_empty() && "Uses remain when a value is destroyed!"' failed.

   I'd guess that these will go away once we're no longer creating/deleting
instructions here, but just in case, I'm adding a regression test.

Because the code is bring rewritten, I've just XFAIL'd the original regression test. Original commit message:

	InstCombine: Be more agressive optimizing 'udiv' instrs with 'select' denoms

	Real world code sometimes has the denominator of a 'udiv' be a
	'select'.  LLVM can handle such cases but only when the 'select'
	operands are symmetric in structure (both select operands are a constant
	power of two or a left shift, etc.).  This falls apart if we are dealt a
	'udiv' where the code is not symetric or if the select operands lead us
	to more select instructions.

	Instead, we should treat the LHS and each select operand as a distinct
	divide operation and try to optimize them independently.  If we can
	to simplify each operation, then we can replace the 'udiv' with, say, a
	'lshr' that has a new select with a bunch of new operands for the
	select.

llvm-svn: 185415
2013-07-02 05:21:11 +00:00
Benjamin Kramer 4093f29366 InstCombine: Also turn selects fed by an and into arithmetic when the types don't match.
Inserting a zext or trunc is sufficient. This pattern is somewhat common in
LLVM's pointer mangling code.

llvm-svn: 185270
2013-06-29 21:17:04 +00:00
David Majnemer 5953d3712a InstCombine: FoldGEPICmp shouldn't change sign of base pointer comparison
Changing the sign when comparing the base pointer would introduce all
sorts of unexpected things like:
  %gep.i = getelementptr inbounds [1 x i8]* %a, i32 0, i32 0
  %gep2.i = getelementptr inbounds [1 x i8]* %b, i32 0, i32 0
  %cmp.i = icmp ult i8* %gep.i, %gep2.i
  %cmp.i1 = icmp ult [1 x i8]* %a, %b
  %cmp = icmp ne i1 %cmp.i, %cmp.i1
  ret i1 %cmp

into:
  %cmp.i = icmp slt [1 x i8]* %a, %b
  %cmp.i1 = icmp ult [1 x i8]* %a, %b
  %cmp = xor i1 %cmp.i, %cmp.i1
  ret i1 %cmp

By preserving the original sign, we now get:
  ret i1 false

This fixes PR16483.

llvm-svn: 185259
2013-06-29 10:28:04 +00:00
David Majnemer 92a8a7d45a InstCombine: Small whitespace cleanup in FoldGEPICmp
llvm-svn: 185258
2013-06-29 09:45:35 +00:00
David Majnemer 797227eea6 InstCombine: Be more agressive optimizing 'udiv' instrs with 'select' denoms
Real world code sometimes has the denominator of a 'udiv' be a
'select'.  LLVM can handle such cases but only when the 'select'
operands are symmetric in structure (both select operands are a constant
power of two or a left shift, etc.).  This falls apart if we are dealt a
'udiv' where the code is not symetric or if the select operands lead us
to more select instructions.

Instead, we should treat the LHS and each select operand as a distinct
divide operation and try to optimize them independently.  If we can
to simplify each operation, then we can replace the 'udiv' with, say, a
'lshr' that has a new select with a bunch of new operands for the
select.

llvm-svn: 185257
2013-06-29 08:40:07 +00:00
David Majnemer b889e405eb InstCombine: Optimize (1 << X) Pred CstP2 to X Pred Log2(CstP2)
We may, after other optimizations, find ourselves with IR that looks
like:

  %shl = shl i32 1, %y
  %cmp = icmp ult i32 %shl, 32

Instead, we should just compare the shift count:

  %cmp = icmp ult i32 %y, 5

llvm-svn: 185242
2013-06-28 23:42:03 +00:00
Matt Arsenault 5d2e85f6d7 Fix using arg_end() - arg_begin() instead of arg_size()
llvm-svn: 185121
2013-06-28 00:25:40 +00:00
Michael Gottesman 79b0967548 Revert "Revert "[APFloat] Removed APFloat constructor which initialized to either zero/NaN but allowed you to arbitrarily set the category of the float.""
This reverts commit r185099.

Looks like both the ppc-64 and mips bots are still failing after I reverted this
change.

Since:

1. The mips bot always performs a clean build,
2. The ppc64-bot failed again after a clean build (I asked the ppc-64
maintainers to clean the bot which they did... Thanks Will!),

I think it is safe to assume that this change was not the cause of the failures
that said builders were seeing. Thus I am recomitting.

llvm-svn: 185111
2013-06-27 21:58:19 +00:00
Michael Gottesman ccaf3321f1 Revert "[APFloat] Removed APFloat constructor which initialized to either zero/NaN but allowed you to arbitrarily set the category of the float."
This reverts commit r185095. This is causing a FileCheck failure on
the 3dnow intrinsics on at least the mips/ppc bots but not on the x86
bots.

Reverting while I figure out what is going on.

llvm-svn: 185099
2013-06-27 20:40:11 +00:00
Michael Gottesman 03255a1675 [APFloat] Removed APFloat constructor which initialized to either zero/NaN but allowed you to arbitrarily set the category of the float.
The category which an APFloat belongs to should be dependent on the
actual value that the APFloat has, not be arbitrarily passed in by the
user. This will prevent inconsistency bugs where the category and the
actual value in APFloat differ.

I also fixed up all of the references to this constructor (which were
only in LLVM).

llvm-svn: 185095
2013-06-27 19:50:52 +00:00
Michael Gottesman c2af8d6273 In InstCombine{AddSub,MulDivRem} convert APFloat.isFiniteNonZero() && !APFloat.isDenormal => APFloat.isNormal.
llvm-svn: 185037
2013-06-26 23:17:31 +00:00
Michael Gottesman 3cb77ab98a [APFloat] Converted all references to APFloat::isNormal => APFloat::isFiniteNonZero.
Turns out all the references were in llvm and not in clang.

llvm-svn: 184356
2013-06-19 21:23:18 +00:00
Jakub Staszak 96ff4d6d3b Simplify code. No functionality change.
llvm-svn: 183461
2013-06-06 23:34:59 +00:00
Jakub Staszak bddea11bc5 Re-apply "Use IRBuilder instead of ConstantInt methods." with the fixed issues.
llvm-svn: 183439
2013-06-06 20:18:46 +00:00
Rafael Espindola a7bbc0b740 Revert "Use IRBuilder instead of ConstantInt methods. It simplifies code a little bit."
This reverts commit 183328. It caused pr16244 and broke the bots.

llvm-svn: 183422
2013-06-06 17:03:05 +00:00
Jakub Staszak 9de494e0ee Remove unneeded cast<>.
llvm-svn: 183363
2013-06-06 00:49:57 +00:00
Jakub Staszak 461d1fe6fc Use IRBuilder instead of ConstantInt methods.
llvm-svn: 183360
2013-06-06 00:37:23 +00:00
Jakub Staszak 2f390b755a Use IRBuilder instead of ConstantInt methods. It simplifies code a little bit.
llvm-svn: 183328
2013-06-05 18:27:02 +00:00
Nick Lewycky 688d668e5c Delete dead safety check.
llvm-svn: 183167
2013-06-03 23:15:20 +00:00
Nick Lewycky 3f715e260a When determining the new index for an insertelement, we may not assume that an
index greater than the size of the vector is invalid. The shuffle may be
shrinking the size of the vector. Fixes a crash!

Also drop the maximum recursion depth of the safety check for this
optimization to five.

llvm-svn: 183080
2013-06-01 20:51:31 +00:00
Rafael Espindola 65281bf36e Simplify multiplications by vectors whose elements are powers of 2.
Patch by Andrea Di Biagio.

llvm-svn: 183005
2013-05-31 14:27:15 +00:00
Nick Lewycky a2b7720618 Reapply with r182909 with a fix to the calculation of the new indices for
insertelement instructions.

llvm-svn: 182976
2013-05-31 00:59:42 +00:00
Evgeniy Stepanov 2c14269883 Revert r182909.
PR/16177

llvm-svn: 182919
2013-05-30 09:40:17 +00:00
Nick Lewycky d7f27094c0 Swizzle vector inputs if it helps us eliminate shuffles.
llvm-svn: 182909
2013-05-30 04:33:38 +00:00
Michael J. Spencer df1ecbd734 Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros.
llvm-svn: 182680
2013-05-24 22:23:49 +00:00
Joey Gouly b34294d0e4 Run clang-format over the scalarizePHI function.
llvm-svn: 182640
2013-05-24 12:33:28 +00:00
Joey Gouly 83699284be scalarizePHI needs to insert the next ExtractElement in the same block
as the BinaryOperator, *not* in the block where the IRBuilder is currently
inserting into. Fixes a bug where scalarizePHI would create instructions
that would not dominate all uses.

llvm-svn: 182639
2013-05-24 12:29:54 +00:00
Jean-Luc Duprat 0dda6f168c This is an update to a previous commit (r181216).
The earlier change list introduced the following inst combines:
B * (uitofp i1 C) —> select C, B, 0
A * (1 - uitofp i1 C) —> select C, 0, A
select C, 0, B + select C, A, 0 —> select C, A, B

Together these 3 changes would simplify :
A * (1 - uitofp i1 C) + B * uitofp i1 C 
down to :
select C, B, A

In practice we found that the first two substitutions can have a
negative effect on performance, because they reduce opportunities to
use FMA contractions; between the two options FMAs are often the
better choice.  This change list amends the previous one to enable
just these inst combines:

select C, B, 0 + select C, 0, A —> select C, B, A
A * (1 - uitofp i1 C) + B * uitofp i1 C —> select C, B, A

llvm-svn: 182499
2013-05-22 18:29:31 +00:00
Matt Arsenault 52ddb7bcdd Add missing -*- C++ -*- to headers
llvm-svn: 182164
2013-05-17 21:43:39 +00:00
Sylvestre Ledru 149e281aa8 Fix two typo
llvm-svn: 181848
2013-05-14 23:36:24 +00:00
David Majnemer 6c30f49af3 InstCombine: Flip the order of two urem transforms
There are two transforms in visitUrem that conflict with each other.

*) One, if a divisor is a power of two, subtracts one from the divisor
   and turns it into a bitwise-and.
*) The other unwraps both operands if they are surrounded by zext
   instructions.

Flipping the order allows the subtraction to go beneath the sign
extension.

llvm-svn: 181668
2013-05-12 00:07:05 +00:00
David Majnemer 470b077bca InstCombine: Turn urem to bitwise-and more often
Use isKnownToBeAPowerOfTwo in visitUrem so that we may more aggressively
fold away urem instructions.

llvm-svn: 181661
2013-05-11 09:01:28 +00:00
Benjamin Kramer 14e915f7b4 InstCombine: Don't claim to be able to evaluate any shl in a zexted type.
The shift amount may be larger than the type leading to undefined behavior.
Limit the transform to constant shift amounts. While there update the bits to
clear in the result which may enable additional optimizations.

PR15959.

llvm-svn: 181604
2013-05-10 16:26:37 +00:00
Benjamin Kramer a6645e8b8f InstCombine: Verify the type before transforming uitofp into select.
PR15952.

llvm-svn: 181586
2013-05-10 09:16:52 +00:00
Benjamin Kramer 21b972ae94 InstCombine: Don't just copy known bits from the first operand of an srem.
That's obviously wrong. Conservatively restrict it to the sign bit, which
matches the original intention of this analysis. Fixes PR15940.

llvm-svn: 181518
2013-05-09 16:32:32 +00:00
David Majnemer 70f286d95f InstCombine: (X ^ signbit) + C -> X + (signbit ^ C)
llvm-svn: 181249
2013-05-06 21:21:31 +00:00
Jean-Luc Duprat 3e4fc3ef24 Provide InstCombines for the following 3 cases:
A * (1 - (uitofp i1 C)) -> select C, 0, A
B * (uitofp i1 C) -> select C, B, 0
select C, 0, A + select C, B, 0 -> select C, B, A

These come up in code that has been hand-optimized from a select to a linear blend, 
on platforms where that may have mattered. We want to undo such changes 
with the following transform:
A*(1 - uitofp i1 C) + B*(uitofp i1 C) -> select C, A, B

llvm-svn: 181216
2013-05-06 16:55:50 +00:00
Nadav Rotem c70ef4e93c Revert r164763 because it introduces new shuffles.
Thanks Nick Lewycky for pointing this out.

llvm-svn: 181177
2013-05-06 02:39:09 +00:00
Dmitri Gribenko 3238fb7595 Add ArrayRef constructor from None, and do the cleanups that this constructor enables
Patch by Robert Wilhelm.

llvm-svn: 181138
2013-05-05 00:40:33 +00:00
Nick Lewycky 881e9d62e2 Tabs to spaces. No functionality change.
llvm-svn: 181082
2013-05-04 01:08:15 +00:00
Filip Pizlo dec20e43c0 This patch breaks up Wrap.h so that it does not have to include all of
the things, and renames it to CBindingWrapping.h.  I also moved 
CBindingWrapping.h into Support/.

This new file just contains the macros for defining different wrap/unwrap 
methods.

The calls to those macros, as well as any custom wrap/unwrap definitions 
(like for array of Values for example), are put into corresponding C++ 
headers.

Doing this required some #include surgery, since some .cpp files relied 
on the fact that including Wrap.h implicitly caused the inclusion of a 
bunch of other things.

This also now means that the C++ headers will include their corresponding 
C API headers; for example Value.h must include llvm-c/Core.h.  I think 
this is harmless, since the C API headers contain just external function 
declarations and some C types, so I don't believe there should be any 
nasty dependency issues here.

llvm-svn: 180881
2013-05-01 20:59:00 +00:00
Jim Grosbach d11584a7f7 Revert "InstCombine: Fold more shuffles of shuffles."
This reverts commit r180802

There's ongoing discussion about whether this is the right place to make
this transformation. Reverting for now while we figure it out.

llvm-svn: 180834
2013-05-01 00:25:27 +00:00
Jim Grosbach 0b914fe839 InstCombine: Fold more shuffles of shuffles.
Always fold a shuffle-of-shuffle into a single shuffle when there's only one
input vector in the first place. Continue to be more conservative when there's
multiple inputs.

rdar://13402653
PR15866

llvm-svn: 180802
2013-04-30 20:43:52 +00:00
David Majnemer d73f37bb83 Fix a bug in foldSelectICmpAndOr.
Differences in bitwidth between X and Y could exist even if C1 and C2 have
the same Log2 representation.

llvm-svn: 180779
2013-04-30 10:36:33 +00:00
David Majnemer 8d048d0482 Fix "Combine bit test + conditional or into simple math"
This fixes the optimization introduced in r179748 and reverted in r179750.

While the optimization was sound, it did not properly respect differences in
bit-width.

llvm-svn: 180777
2013-04-30 08:57:58 +00:00
Eric Christopher 04d4e9312c Move C++ code out of the C headers and into either C++ headers
or the C++ files themselves. This enables people to use
just a C compiler to interoperate with LLVM.

llvm-svn: 180063
2013-04-22 22:47:22 +00:00
Anat Shemer 10260a75e3 Changed back (relative to commit 179786) the operations executed when extract(cast) is transformed to cast(extract). It uses the Builder class as before. In addition the result node is added to the Worklist, so all the previous extract users will become the new scalar cast users.
llvm-svn: 180045
2013-04-22 20:51:10 +00:00
Jakub Staszak 99317268e2 Keep coding stanard. Don't use "else if" after "return".
llvm-svn: 179826
2013-04-19 01:18:04 +00:00
Anat Shemer 5570318f43 In the function InstCombiner::visitExtractElementInst() removed the limitation that extract is promoted over a cast only if the cast has only one use.
llvm-svn: 179786
2013-04-18 19:56:44 +00:00
Anat Shemer 0c95efad7e Added a function scalarizePHI() that sclarizes a vector phi instruction if it has only 2 uses: one to promote the vector phi in a loop and the other use is an extract operation of one element at a constant location.
llvm-svn: 179783
2013-04-18 19:35:39 +00:00
David Majnemer 81af06e003 Revert "Combine bit test + conditional or into simple math"
It is causing stage2 builds to fail, let's get them running again.

llvm-svn: 179750
2013-04-18 08:42:33 +00:00
David Majnemer bdf0caf6b1 Combine bit test + conditional or into simple math
Simplify:
(select (icmp eq (and X, C1), 0), Y, (or Y, C2))

Into:
(or (shl (and X, C1), C3), y)

Where:
C3 = Log(C2) - Log(C1)

If:
C1 and C2 are both powers of two

llvm-svn: 179748
2013-04-18 07:30:07 +00:00
David Majnemer 1fae195557 Reorders two transforms that collide with each other
One performs: (X == 13 | X == 14) -> X-13 <u 2
The other: (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1

The problem is that there are certain values of C1 and C2 that
trigger both transforms but the first one blocks out the second,
this generates suboptimal code.

Reordering the transforms should be better in every case and
allows us to do interesting stuff like turn:
  %shr = lshr i32 %X, 4
  %and = and i32 %shr, 15
  %add = add i32 %and, -14
  %tobool = icmp ne i32 %add, 0

into:
  %and = and i32 %X, 240
  %tobool = icmp ne i32 %and, 224

llvm-svn: 179493
2013-04-14 21:15:43 +00:00
Benjamin Kramer e89c705030 InstCombine: Check the operand types before merging fcmp ord & fcmp ord.
Fixes PR15737.

llvm-svn: 179417
2013-04-12 21:56:23 +00:00
David Majnemer 1a08accbb7 Simplify (A & ~B) in icmp if A is a power of 2
The transform will execute like so:
(A & ~B) == 0 --> (A & B) != 0
(A & ~B) != 0 --> (A & B) == 0

llvm-svn: 179386
2013-04-12 17:25:07 +00:00
David Majnemer b81cd63c4b Optimize icmp involving addition better
Allows LLVM to optimize sequences like the following:

%add = add nsw i32 %x, 1
%cmp = icmp sgt i32 %add, %y

into:

%cmp = icmp sge i32 %x, %y

as well as:

%add1 = add nsw i32 %x, 20
%add2 = add nsw i32 %y, 57
%cmp = icmp sge i32 %add1, %add2

into:

%add = add nsw i32 %y, 37
%cmp = icmp sle i32 %cmp, %x

llvm-svn: 179316
2013-04-11 20:05:46 +00:00
Benjamin Kramer a95f87494a Fix for wrong instcombine on vector insert/extract
When trying to collapse sequences of insertelement/extractelement
instructions into single shuffle instructions, there is one specific
case where the Instruction Combiner wrongly updates the resulting
Mask of shuffle indexes.

The problem is in function CollectShuffleElments.

If we have a sequence of insert/extract element instructions
like the one below:

  %tmp1 = extractelement <4 x float> %LHS, i32 0
  %tmp2 = insertelement <4 x float> %RHS, float %tmp1, i32 1
  %tmp3 = extractelement <4 x float> %RHS, i32 2
  %tmp4 = insertelement <4 x float> %tmp2, float %tmp3, i32 3

Where:
  . %RHS will have a mask of [4,5,6,7]
  . %LHS will have a mask of [0,1,2,3]

The Mask of shuffle indexes is wrongly computed to [4,1,6,7]
instead of [4,0,6,7].
When analyzing %tmp2 in order to compute the Mask for the
resulting shuffle instruction, the algorithm forgets to update
the mask index at position 1 with the index associated to the
element extracted from %LHS by instruction %tmp1.

Patch by Andrea DiBiagio!

llvm-svn: 179291
2013-04-11 15:10:09 +00:00
Jim Grosbach bdbd73460c Tidy up a bit. No functional change.
llvm-svn: 178915
2013-04-05 21:20:12 +00:00
Akira Hatanaka 99866dd535 Check if Type is a vector before calling function Type::getVectorNumElements.
llvm-svn: 178208
2013-03-28 01:28:02 +00:00
Ulrich Weigand 8a51d8ea95 Make InstCombineCasts.cpp:OptimizeIntToFloatBitCast endian safe.
The OptimizeIntToFloatBitCast converts shift-truncate sequences
into extractelement operations.  The computation of the element
index to be used in the resulting operation is currently only
correct for little-endian targets.

This commit fixes the element index computation to be correct
for big-endian targets as well.  If the target byte order is
unknown, the optimization cannot be performed at all.

llvm-svn: 178031
2013-03-26 15:36:14 +00:00
Shuxin Yang 389ed4b8f7 Fix a bug in fast-math fadd/fsub simplification.
The problem is that the code mistakenly took for granted that following constructor 
is able to create an APFloat from a *SIGNED* integer:
   
  APFloat::APFloat(const fltSemantics &ourSemantics, integerPart value)

rdar://13486998

llvm-svn: 177906
2013-03-25 20:43:41 +00:00
Arnaud A. de Grandmaison 3ee88e8a77 Address issues found by Duncan during post-commit review of r177856.
llvm-svn: 177863
2013-03-25 11:47:38 +00:00
Arnaud A. de Grandmaison 9c383d68cf InstCombine: simplify comparisons to zero of (shl %x, Cst) or (mul %x, Cst)
This simplification happens at 2 places :
 - using the nsw attribute when the shl / mul is used by a sign test
 - when the shl / mul is compared for (in)equality to zero

llvm-svn: 177856
2013-03-25 09:48:49 +00:00
Arnaud A. de Grandmaison f364bc63e7 InstCombine: Improve the result bitvect type when folding (cmp pred (load (gep GV, i)) C) to a bit test.
The original code used i32, and i64 if legal. This introduced unneeded
casts when they aren't legal, or when the index variable i has another
type. In order of preference: try to use i's type; use the smallest
fitting legal type (using an added DataLayout method); default to i32.
A testcase checks that this works when the index gep operand is i16.

Patch by : Ahmed Bougacha <ahmed.bougacha@gmail.com>
Reviewed by : Duncan

llvm-svn: 177712
2013-03-22 08:25:01 +00:00
Shuxin Yang 2eca602f8b Perform factorization as a last resort of unsafe fadd/fsub simplification.
Rules include:
  1)1 x*y +/- x*z => x*(y +/- z) 
    (the order of operands dosen't matter)

  2) y/x +/- z/x => (y +/- z)/x 

 The transformation is disabled if the new add/sub expr "y +/- z" is a 
denormal/naz/inifinity.

rdar://12911472

llvm-svn: 177088
2013-03-14 18:08:26 +00:00
Arnaud A. de Grandmaison 7153305b92 Fix a performance regression when combining to smaller types in icmp (shl %v, C1), C2 :
Only combine when the shl is only used by the icmp

llvm-svn: 176950
2013-03-13 14:40:37 +00:00
Jakub Staszak 2ef36b633b Simplify code. No functionality change.
llvm-svn: 176765
2013-03-09 11:18:59 +00:00
Jim Grosbach 95d2eb95c3 InstCombine: Don't shrink allocas when combining with a bitcast.
When considering folding a bitcast of an alloca into the alloca itself,
make sure we don't shrink the amount of memory being allocated, or
things rapidly go sideways.

rdar://13324424

llvm-svn: 176547
2013-03-06 05:44:53 +00:00
Quentin Colombet e684a6d4aa Fix a bug in instcombine for fmul in fast math mode.
The instcombine recognized pattern looks like:
a = b * c
d = a +/- Cst
or
a = b * c
d = Cst +/- a

When creating the new operands for fadd or fsub instruction following the related fmul, the first operand was created with the second original operand (M0 was created with C1) and the second with the first (M1 with Opnd0).

The fix consists in creating the new operands with the appropriate original operand, i.e., M0 with Opnd0 and M1 with C1.

llvm-svn: 176300
2013-02-28 21:12:40 +00:00
Bill Wendling 23242098e7 The transform is:
(or (bool?A:B),(bool?C:D)) --> (bool?(or A,C):(or B,D))

By the time the OR is visited, both the SELECTs have been visited and not
optimized and the OR itself hasn't been transformed so we do this transform in
the hopes that the new ORs will be optimized.

The transform is explicitly disabled for vector-selects until "codegen matures
to handle them better".

Patch by Muhammad Tauqir!

llvm-svn: 175380
2013-02-16 23:41:36 +00:00
Arnaud A. de Grandmaison 1fd843eee7 Fix refactoring mistake in "Teach InstCombine to work with smaller legal types..."
llvm-svn: 175273
2013-02-15 15:18:17 +00:00
Arnaud A. de Grandmaison 61c167c62b Teach InstCombine to work with smaller legal types in icmp (shl %v, C1), C2
It enables to work with a smaller constant, which is target friendly for those which can compare to immediates.
It also avoids inserting a shift in favor of a trunc, which can be free on some targets.

This used to work until LLVM-3.1, but regressed with the 3.2 release.

llvm-svn: 175270
2013-02-15 14:35:47 +00:00
Arnaud A. de Grandmaison 2e4df4f7c2 Fix comment
visitSExt is an adapted copy of the related visitZExt method, so adapt the comment accordingly.

llvm-svn: 175019
2013-02-13 00:19:19 +00:00
Michael Ilseman 74a6da963b Optimization: bitcast (<1 x ...> insertelement ..., X, ...) to ... ==> bitcast X to ...
llvm-svn: 174905
2013-02-11 21:41:44 +00:00
Andrew Trick 1bd53c3675 Revert "Have InstCombine call SipmlifyCall when handling calls. Test case included."
This reverts commit 3854a5d90fee52af1065edbed34521fff6cdc18d.

This causes a clang unit test to hang: vtable-available-externally.cpp.

llvm-svn: 174692
2013-02-08 01:55:39 +00:00
Michael Ilseman 6092dc5455 Have InstCombine call SipmlifyCall when handling calls. Test case included.
llvm-svn: 174675
2013-02-07 23:01:35 +00:00
Michael Ilseman 1dd6f2a5ba Preserve fast-math flags after reassociation and commutation. Update test cases
llvm-svn: 174571
2013-02-07 01:40:15 +00:00
Benjamin Kramer 944e0abf04 InstCombine: Fix and simplify the inttoptr side too.
llvm-svn: 174438
2013-02-05 20:22:40 +00:00
Benjamin Kramer e477875873 InstCombine: Harden code to work with vectors of pointers and simplify it a bit.
Found by running instcombine on a fabricated test case for the constant folder.

llvm-svn: 174430
2013-02-05 19:21:56 +00:00
Nadav Rotem 4349f6963e Revert r174152. The shift amount may overflow and in that case this transformation is illegal.
llvm-svn: 174156
2013-02-01 07:59:33 +00:00
Nadav Rotem 1d584029ae Optimize shift lefts of a constant by a value plus constant into a single shift.
llvm-svn: 174152
2013-02-01 06:45:40 +00:00
Bill Wendling d219675c2a Convert typeIncompatible to return an AttributeSet.
There are still places which treat the Attribute object as a collection of
attributes. I'm systematically removing them.

llvm-svn: 173990
2013-01-30 23:07:40 +00:00
Nadav Rotem 513bd8a73c InstCombine: canonicalize sext-and --> select
sext-not-and --> select.

Patch by Muhammad Tauqir Ahmad.

llvm-svn: 173901
2013-01-30 06:35:22 +00:00
Bill Wendling 3575c8c6d6 Use the AttributeSet instead of AttributeWithIndex.
In the future, AttributeWithIndex won't be used anymore. Besides, it exposes the
internals of the AttributeSet to outside users, which isn't goodness.

llvm-svn: 173602
2013-01-27 02:08:22 +00:00
Bill Wendling 57625a4966 Remove some introspection functions.
The 'getSlot' function and its ilk allow introspection into the AttributeSet
class. However, that class should be opaque. Allow access through accessor
methods instead.

llvm-svn: 173522
2013-01-25 23:09:36 +00:00
Bill Wendling 8649283e75 Use the new 'getSlotIndex' method to retrieve the attribute's slot index.
llvm-svn: 173499
2013-01-25 21:46:52 +00:00
Craig Topper 3529aa5fc2 Remove trailing whitespace.
llvm-svn: 173322
2013-01-24 05:22:40 +00:00
Benjamin Kramer e4c46fec73 Revert "InstCombine: Clean up weird code that talks about a modulus that's long gone."
This causes crashes during the build of compiler-rt during selfhost. Add a
testcase for coverage.

llvm-svn: 173279
2013-01-23 17:52:29 +00:00
Benjamin Kramer cd86115d8a InstCombine: Clean up weird code that talks about a modulus that's long gone.
This does the right thing unless the multiplication overflows, but the old code
didn't handle that case either.

llvm-svn: 173276
2013-01-23 17:16:22 +00:00
Bill Wendling 49bc76cbb3 Remove the last of uses that use the Attribute object as a collection of attributes.
Collections of attributes are handled via the AttributeSet class now. This
finally frees us up to make significant changes to how attributes are structured.

llvm-svn: 173228
2013-01-23 06:14:59 +00:00
Bill Wendling c90d9f89b7 Have AttributeSet::getRetAttributes() return an AttributeSet instead of Attribute.
This further restricts the use of the Attribute class to the Attribute family of
classes.

llvm-svn: 173098
2013-01-21 22:44:49 +00:00
Bill Wendling bd4ea16bf3 Make AttributeSet::getFnAttributes() return an AttributeSet instead of an Attribute.
This is more code to isolate the use of the Attribute class to that of just
holding one attribute instead of a collection of attributes.

llvm-svn: 173094
2013-01-21 21:57:28 +00:00
Paul Redmond 9d86a4a3b6 Transform (sub 0, (zext bool to A)) to (sext bool to A) and
(sub 0, (sext bool to A)) to (zext bool to A).

Patch by Muhammad Ahmad
Reviewed by Duncan Sands

llvm-svn: 173093
2013-01-21 21:57:20 +00:00
Bill Wendling 658d24d211 Use AttributeSet accessor methods instead of Attribute accessor methods.
Further encapsulation of the Attribute object. Don't allow direct access to the
Attribute object as an aggregate.

llvm-svn: 172853
2013-01-18 21:53:16 +00:00
Bill Wendling 7754389526 Push some more methods down to hide the use of the Attribute class.
Because the Attribute class is going to stop representing a collection of
attributes, limit the use of it as an aggregate in favor of using AttributeSet.
This replaces some of the uses for querying the function attributes.

llvm-svn: 172844
2013-01-18 21:11:39 +00:00
Craig Topper 45d9f4b569 Check for less than 0 in shuffle mask instead of -1. It's more consistent with other code related to shuffles and easier to implement in compiled code.
llvm-svn: 172788
2013-01-18 05:30:07 +00:00
Craig Topper 2ea22b0b84 Remove trailing whitespace. Remove new lines between closing brace and 'else'
llvm-svn: 172784
2013-01-18 05:09:16 +00:00
Nadav Rotem 7df850924d Teach InstCombine to optimize extract of a value from a vector add operation with a constant zero.
llvm-svn: 172576
2013-01-15 23:43:14 +00:00
Shuxin Yang e822745202 1. Hoist minus sign as high as possible in an attempt to reveal
some optimization opportunities (in the enclosing supper-expressions).

   rule 1. (-0.0 - X ) * Y => -0.0 - (X * Y)
     if expression "-0.0 - X" has only one reference.

   rule 2. (0.0 - X ) * Y => -0.0 - (X * Y)
     if expression "0.0 - X" has only one reference, and
        the instruction is marked "noSignedZero".

2. Eliminate negation (The compiler was already able to handle these
    opt if the 0.0s are replaced with -0.0.)

   rule 3: (0.0 - X) * (0.0 - Y) => X * Y
   rule 4: (0.0 - X) * C => X * -C
   if the expr is flagged "noSignedZero".

3. 
  Rule 5: (X*Y) * X => (X*X) * Y
   if X!=Y and the expression is flagged with "UnsafeAlgebra".

   The purpose of this transformation is two-fold:
    a) to form a power expression (of X).
    b) potentially shorten the critical path: After transformation, the
       latency of the instruction Y is amortized by the expression of X*X,
       and therefore Y is in a "less critical" position compared to what it
      was before the transformation. 

4. Remove the InstCombine code about simplifiying "X * select".
   
   The reasons are following:
    a) The "select" is somewhat architecture-dependent, therefore the
       higher level optimizers are not able to precisely predict if
       the simplification really yields any performance improvement
       or not.

    b) The "select" operator is bit complicate, and tends to obscure
       optimization opportunities. It is btter to keep it as low as
       possible in expr tree, and let CodeGen to tackle the optimization.

llvm-svn: 172551
2013-01-15 21:09:32 +00:00
Jakub Staszak 190db2f25c Remove trailing spaces.
llvm-svn: 172489
2013-01-14 23:16:36 +00:00
Shuxin Yang 320f52a4b0 This change is to implement following rules under the condition C_A and/or C_R
---------------------------------------------------------------------------
 C_A: reassociation is allowed
 C_R: reciprocal of a constant C is appropriate, which means 
    - 1/C is exact, or 
    - reciprocal is allowed and 1/C is neither a special value nor a denormal.
 -----------------------------------------------------------------------------

 rule1:  (X/C1) / C2 => X / (C2*C1)  (if C_A)
                     => X * (1/(C2*C1))  (if C_A && C_R)
 rule 2:  X*C1 / C2 => X * (C1/C2)  if C_A
 rule 3: (X/Y)/Z = > X/(Y*Z)  (if C_A && at least one of Y and Z is symbolic value)
 rule 4: Z/(X/Y) = > (Z*Y)/X  (similar to rule3)

 rule 5: C1/(X*C2) => (C1/C2) / X (if C_A)
 rule 6: C1/(X/C2) => (C1*C2) / X (if C_A)
 rule 7: C1/(C2/X) => (C1/C2) * X (if C_A)

llvm-svn: 172488
2013-01-14 22:48:41 +00:00
David Greene 530430be98 Fix Casting Bug
Add a const version of getFpValPtr to avoid a cast-away-const warning.

llvm-svn: 172467
2013-01-14 21:04:40 +00:00
Nick Lewycky 80ea003c6c Fix typo in comment.
llvm-svn: 172460
2013-01-14 20:56:10 +00:00
Owen Anderson dbf0ca523d Teach InstCombine to hoist FABS and FNEG through FPTRUNC instructions. The application of these operations commutes with the truncation, so we should prefer to do them in the smallest size we can, to save register space, use smaller constant pool entries, etc.
llvm-svn: 172117
2013-01-10 22:06:52 +00:00
Shuxin Yang f0537ab681 Consider expression "0.0 - X" as the negation of X if
- this expression is explicitly marked no-signed-zero, or
  - no-signed-zero of this expression can be derived from some context.

llvm-svn: 171922
2013-01-09 00:13:41 +00:00