Commit Graph

1133 Commits

Author SHA1 Message Date
Benjamin Kramer 6de786666a InstCombine: Don't try to reorder shuffles where the mask is a ConstantExpr.
We can't analyze the individual values of a vector expression. PR20114.

llvm-svn: 211581
2014-06-24 10:38:10 +00:00
Dinesh Dwivedi 562fd7534c Added instruction combine to transform few more negative values addition to subtraction (Part 1)
This patch enables transforms for following patterns.
  (x + (~(y & c) + 1)   -->   x - (y & c)
  (x + (~((y >> z) & c) + 1)   -->   x - ((y>>z) & c)

Differential Revision: http://reviews.llvm.org/D3733

llvm-svn: 211266
2014-06-19 10:36:52 +00:00
Dinesh Dwivedi b62e52e1b5 Refactored and updated SimplifyUsingDistributiveLaws() to
* Find factorization opportunities using identity values.
 * Find factorization opportunities by treating shl(X, C) as mul (X, shl(C))
 * Keep NSW flag while simplifying instruction using factorization.

This fixes PR19263.

Differential Revision: http://reviews.llvm.org/D3799

llvm-svn: 211261
2014-06-19 08:29:18 +00:00
David Majnemer 6cf6c05322 InstCombine: Stop two transforms dueling
InstCombineMulDivRem has:
// Canonicalize (X+C1)*CI -> X*CI+C1*CI.

InstCombineAddSub has:
// W*X + Y*Z --> W * (X+Z)  iff W == Y

These two transforms could fight with each other if C1*CI would not fold
away to something simpler than a ConstantExpr mul.

The InstCombineMulDivRem transform only acted on ConstantInts until
r199602 when it was changed to operate on all Constants in order to
let it fire on ConstantVectors.

To fix this, make this transform more careful by checking to see if we
actually folded away C1*CI.

This fixes PR20079.

llvm-svn: 211258
2014-06-19 07:14:33 +00:00
Nick Lewycky 8561a49c27 Move optimization of some cases of (A & C1)|(B & C2) from instcombine to instsimplify. Patch by Rahul Jain, plus some last minute changes by me -- you can blame me for any bugs.
llvm-svn: 211252
2014-06-19 03:51:46 +00:00
Nick Lewycky 802df52424 Remove redundant code in InstCombineShift, no functionality change because instsimplify already does this and instcombine calls instsimplify a few lines above. Patch by Suyog Sarda!
llvm-svn: 211250
2014-06-19 03:28:28 +00:00
Matt Arsenault a0050b0961 R600/SI: Add intrinsics for various math instructions.
These will be used for custom lowering and for library
implementations of various math functions, so it's useful
to expose these as builtins.

llvm-svn: 211247
2014-06-19 01:19:19 +00:00
Jingyue Wu 33bd53df7f [InstCombine] mark ADD with nuw if no unsigned overflow
Summary:
As a starting step, we only use one simple heuristic: if the sign bits
of both a and b are zero, we can prove "add a, b" do not unsigned
overflow, and thus convert it to "add nuw a, b".

Updated all affected tests and added two new tests (@zero_sign_bit and
@zero_sign_bit2) in AddOverflow.ll

Test Plan: make check-all

Reviewers: eliben, rafael, meheff, chandlerc

Reviewed By: chandlerc

Subscribers: chandlerc, llvm-commits

Differential Revision: http://reviews.llvm.org/D4144

llvm-svn: 211084
2014-06-17 00:42:07 +00:00
Jingyue Wu baabe5091c Canonicalize addrspacecast ConstExpr between different pointer types
As a follow-up to r210375 which canonicalizes addrspacecast
instructions, this patch canonicalizes addrspacecast constant
expressions.

Given clang uses ConstantExpr::getAddrSpaceCast to emit addrspacecast
cosntant expressions, this patch is also a step towards having the
frontend emit canonicalized addrspacecasts.

Piggyback a minor refactor in InstCombineCasts.cpp

Update three affected tests in addrspacecast-alias.ll,
access-non-generic.ll and constant-fold-gep.ll and added one new test in
constant-fold-address-space-pointer.ll

llvm-svn: 211004
2014-06-15 21:40:57 +00:00
Dinesh Dwivedi 95f0d51bd3 This removes TODO added in http://reviews.llvm.org/D3658
The patch transforms

ABS(NABS(X)) -> ABS(X)
NABS(ABS(X)) -> NABS(X)

Differential Revision: http://reviews.llvm.org/D4040

llvm-svn: 210782
2014-06-12 14:06:00 +00:00
Matt Arsenault 44f60d0a60 Look through addrspacecasts when turning ptr comparisons into
index comparisons.

llvm-svn: 210488
2014-06-09 19:20:29 +00:00
Rafael Espindola 4ba22f0813 Revert 209903 and 210040.
The messages were

 "PR19753: Optimize comparisons with "ashr exact" of a constanst."
 "Added support to optimize comparisons with "lshr exact" of a constant."

They were not correctly handling signed/unsigned operation differences,
causing pr19958.

llvm-svn: 210393
2014-06-07 04:12:35 +00:00
Jingyue Wu 77145d9410 InstCombine: Canonicalize addrspacecast between different element types
addrspacecast X addrspace(M)* to Y addrspace(N)*

-->

bitcast X addrspace(M)* to Y addrspace(M)*
addrspacecast Y addrspace(M)* to Y addrspace(N)*

Updat all affected tests and add several new tests in addrspacecast.ll.

This patch is based on http://reviews.llvm.org/D2186 (authored by Matt
Arsenault) with fixes and more tests.

llvm-svn: 210375
2014-06-06 21:52:55 +00:00
Dinesh Dwivedi 3217b6c661 Added select flavour for ABS and NEG(ABS)
This patch can identify 
  ABS(X) ==> (X >s 0) ? X : -X and (X >s -1) ? X : -X
  ABS(X) ==> (X <s 0) ? -X : X and (X <s 1) ? -X : X
  NABS(X) ==> (X >s 0) ? -X : X and (X >s -1) ? -X : X
  NABS(X) ==> (X <s 0) ? X : -X and (X <s 1) ? X : -X
  
and can transform
  ABS(ABS(X)) -> ABS(X)
  NABS(NABS(X)) -> NABS(X)
  
Differential Revision: http://reviews.llvm.org/D3658

llvm-svn: 210312
2014-06-06 06:54:45 +00:00
Bill Schmidt a1184635ce [PPC64LE] Correct vperm -> shuffle transform for little endian
As discussed in cfe commit r210279, the correct little-endian
semantics for the vec_perm Altivec interfaces are implemented by
reversing the order of the input vectors and complementing the permute
control vector.  This converts the desired permute from little endian
element order into the big endian element order that the underlying
PowerPC vperm instruction uses.  This is represented with a
ppc_altivec_vperm intrinsic function.

The instruction combining pass contains code to convert a
ppc_altivec_vperm intrinsic into a vector shuffle operation when the
intrinsic has a permute control vector (mask) that is a constant.
However, the vector shuffle operation assumes that vector elements are
in natural order for their endianness, so for little endian code we
will get the wrong result with the existing transformation.

This patch reverses the semantic change to vec_perm that was performed
in altivec.h by once again swapping the input operands and
complementing the permute control vector, returning the element
ordering to little endian.

The correctness of this code is tested by the new perm.c test added in
a previous patch, and by other tests in the test suite that fail
without this patch.

llvm-svn: 210282
2014-06-05 19:46:04 +00:00
Rafael Espindola 78598d9ab5 Add a Constant version of stripPointerCasts.
Thanks to rnk for the suggestion.

llvm-svn: 210205
2014-06-04 19:01:48 +00:00
Rafael Espindola 4dc5dfc56b Clauses in a landingpad are always Constant. Use a stricter type.
llvm-svn: 210203
2014-06-04 18:51:31 +00:00
Rafael Espindola 04c2258624 InstCombine: Improvement to check if signed addition overflows.
This patch implements two things:

1. If we know one number is positive and another is negative, we return true as
    signed addition of two opposite signed numbers will never overflow.

2. Implemented TODO : If one of the operands only has one non-zero bit, and if
    the other operand has a known-zero bit in a more significant place than it
    (not including the sign bit) the ripple may go up to and fill the zero, but
    won't change the sign. e.x -  (x & ~4) + 1

We make sure that we are ignoring 0 at MSB.

Patch by Suyog Sarda.

llvm-svn: 210186
2014-06-04 15:39:14 +00:00
Rafael Espindola d1a2c2d905 Add back commit r210029.
The code was actually correct. Sorry for the confusion. I have expanded the
comment saying why the analysis is valid to avoid me misunderstaning it
again in the future.

llvm-svn: 210052
2014-06-02 22:01:04 +00:00
Rafael Espindola 582c890fbe Revert "Add the nsw flag when we detect that an add will not signed overflow."
This reverts commit r210029.

It was not correctly handling cases where LHS and RHS had multiple but different
sign bits.

llvm-svn: 210048
2014-06-02 21:12:19 +00:00
Rafael Espindola 6b04ef785e Added support to optimize comparisons with "lshr exact" of a constant.
Patch by Rahul Jain.

llvm-svn: 210040
2014-06-02 19:19:04 +00:00
Rafael Espindola 82899febf0 Add the nsw flag when we detect that an add will not signed overflow.
We already had a function for checking this, we were just using it only in
specialized cases.

llvm-svn: 210029
2014-06-02 14:32:58 +00:00
Dinesh Dwivedi ce5d35a9d0 Added inst combine tarnsform for (1 << X) & C pattrens where C is (some PowerOf2 - 1)
This patch can handles following cases from http://nondot.org/sabre/LLVMNotes/InstCombine.txt
  "((1 << X) & 7) == 0" ==> "X > 2"
  "((1 << X) & 7) != 0" ==> "X < 3".

Differential Revision: http://reviews.llvm.org/D3678

llvm-svn: 210007
2014-06-02 07:57:24 +00:00
Dinesh Dwivedi 43e127bded Added inst combine transforms for single bit tests from Chris's note
if ((x & C) == 0) x |= C becomes x |= C
if ((x & C) != 0) x ^= C becomes x &= ~C
if ((x & C) == 0) x ^= C becomes x |= C
if ((x & C) != 0) x &= ~C becomes x &= ~C
if ((x & C) == 0) x &= ~C becomes nothing

Differential Revision: http://reviews.llvm.org/D3777

llvm-svn: 210006
2014-06-02 07:24:36 +00:00
Rafael Espindola c323952cb4 PR19753: Optimize comparisons with "ashr exact" of a constanst.
Patch by suyog sarda.

llvm-svn: 209903
2014-05-30 15:54:32 +00:00
Chandler Carruth fdc0e0b478 And fix my fix to sink down through the type at the right time. My
original fix would actually trigger the *exact* same crasher as the
original bug for a different reason. Awesomesauce.

Working on test cases now, but wanted to get bots healthier.

llvm-svn: 209860
2014-05-29 23:21:12 +00:00
Chandler Carruth 3012a1b4cd Fix one bug in the latest incarnation of r209843 -- combining GEPs
across PHI nodes. The code was computing the Idxs from the 'GEP'
variable's indices when what it wanted was Op1's indices. This caused an
ASan heap-overflow for me that pin pointed the issue when Op1 had more
indices than GEP did. =] I'll let Louis add a specific test case for
this if he wants.

llvm-svn: 209857
2014-05-29 23:05:52 +00:00
Louis Gerbarg c6b506a0ae Add support for combining GEPs across PHI nodes
Currently LLVM will generally merge GEPs. This allows backends to use more
complex addressing modes. In some cases this is not happening because there
is PHI inbetween the two GEPs:

  GEP1--\
        |-->PHI1-->GEP3
  GEP2--/

This patch checks to see if GEP1 and GEP2 are similiar enough that they can be
cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):

  GEP1--\                     --\                           --\
        |-->PHI1-->GEP3  ==>    |-->PHI2->GEP12->GEP3 == >    |-->PHI2->GEP123
  GEP2--/                     --/                           --/

This also breaks certain use chains that are preventing GEP->GEP merges that the
the existing instcombine would merge otherwise.

Tests included.

llvm-svn: 209843
2014-05-29 20:29:47 +00:00
Rafael Espindola a248f536b3 Revert "Revert "Revert "InstCombine: Improvement to check if signed addition overflows."""
This reverts commit r209776.

It was miscompiling llvm::SelectionDAGISel::MorphNode.

llvm-svn: 209817
2014-05-29 14:39:16 +00:00
Rafael Espindola 6196b7430e Revert "Revert "InstCombine: Improvement to check if signed addition overflows.""
This reverts commit r209762, bringing back r209746. It was not responsible for the libc++ build failure

llvm-svn: 209776
2014-05-28 21:43:52 +00:00
Rafael Espindola 910528a3eb Revert "Add support for combining GEPs across PHI nodes"
This reverts commit r209755.

it was the real cause of the libc++ build failure.

llvm-svn: 209775
2014-05-28 21:41:21 +00:00
Rafael Espindola fb59b05ca4 Revert "InstCombine: Improvement to check if signed addition overflows."
This reverts commit r209746.

It looks it is causing a crash while building libcxx. I am trying to get a
reduced testcase.

llvm-svn: 209762
2014-05-28 18:48:10 +00:00
Louis Gerbarg 727f1cbb17 Add support for combining GEPs across PHI nodes
Currently LLVM will generally merge GEPs. This allows backends to use more
complex addressing modes. In some cases this is not happening because there
is PHI inbetween the two GEPs:

  GEP1--\
        |-->PHI1-->GEP3
  GEP2--/

This patch checks to see if GEP1 and GEP2 are similiar enough that they can be
cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):

  GEP1--\                     --\                           --\
        |-->PHI1-->GEP3  ==>    |-->PHI2->GEP12->GEP3 == >    |-->PHI2->GEP123
  GEP2--/                     --/                           --/

This also breaks certain use chains that are preventing GEP->GEP merges that the
the existing instcombine would merge otherwise.

Tests included.

llvm-svn: 209755
2014-05-28 17:38:31 +00:00
Rafael Espindola 085b57941f InstCombine: Improvement to check if signed addition overflows.
This patch implements two things:

1. If we know one number is positive and another is negative, we return true as
   signed addition of two opposite signed numbers will never overflow.

2. Implemented TODO : If one of the operands only has one non-zero bit, and if
   the other operand has a known-zero bit in a more significant place than it
   (not including the sign bit) the ripple may go up to and fill the zero, but
   won't change the sign. e.x -  (x & ~4) + 1

We make sure that we are ignoring 0 at MSB.

Patch by Suyog Sarda.

llvm-svn: 209746
2014-05-28 15:30:40 +00:00
Filipe Cabecinhas e8d6a1e82f Post-commit fixes for r209643
Detected by Daniel Jasper, Ilia Filippov, and Andrea Di Biagio
Fixed the argument order to select (the mask semantics to blendv* are the
inverse of select) and fixed the tests
Added parenthesis to the assert condition
Ran clang-format

llvm-svn: 209667
2014-05-27 16:54:33 +00:00
Daniel Jasper 73458c95ac Fix bad assert.
llvm-svn: 209648
2014-05-27 09:55:37 +00:00
Filipe Cabecinhas 82ac07c283 Convert some X86 blendv* intrinsics into IR.
Summary:
Implemented an InstCombine transformation that takes a blendv* intrinsic
call and translates it into an IR select, if the mask is constant.

This will eventually get lowered into blends with immediates if possible,
or pblendvb (with an option to further optimize if we can transform the
pblendvb into a blend+immediate instruction, depending on the selector).
It will also enable optimizations by the IR passes, which give up on
sight of the intrinsic.

Both the transformation and the lowering of its result to asm got shiny
new tests.

The transformation is a bit convoluted because of blendvp[sd]'s
definition:

Its mask is a floating point value! This forces us to convert it and get
the highest bit. I suppose this happened because the mask has type
__m128 in Intel's intrinsic and v4sf (for blendps) in gcc's builtin.

I will send an email to llvm-dev to discuss if we want to change this or
not.

Reviewers: grosbach, delena, nadav

Differential Revision: http://reviews.llvm.org/D3859

llvm-svn: 209643
2014-05-27 03:42:20 +00:00
Tim Northover 3b0846e8f7 AArch64/ARM64: move ARM64 into AArch64's place
This commit starts with a "git mv ARM64 AArch64" and continues out
from there, renaming the C++ classes, intrinsics, and other
target-local objects for consistency.

"ARM64" test directories are also moved, and tests that began their
life in ARM64 use an arm64 triple, those from AArch64 use an aarch64
triple. Both should be equivalent though.

This finishes the AArch64 merge, and everyone should feel free to
continue committing as normal now.

llvm-svn: 209577
2014-05-24 12:50:23 +00:00
Dinesh Dwivedi f82f16e3e6 Added inst-combine for 'MIN(MIN(A, 97), 23)' and 'MAX(MAX(A, 23), 97)'
This removes TODO added in r208849 [http://reviews.llvm.org/D3629]

MIN(MIN(A, 97), 23) -> MIN(A, 23)
MAX(MAX(A, 23), 97) -> MAX(A, 97)

Differential Revision: http://reviews.llvm.org/D3785

llvm-svn: 209110
2014-05-19 07:08:32 +00:00
NAKAMURA Takumi 7ef81a4f98 Revert r209049 and r209065, "Add support for combining GEPs across PHI nodes"
It broke clang selfhosting even after r209065.

llvm-svn: 209067
2014-05-17 14:39:21 +00:00
Louis Gerbarg 455805694e Fix for sanitizer crash introduced in r209049
This patch fixes 3 issues introduced by r209049 that only showed up in on
the sanitizer buildbots. One was a typo in a compare. The other is a check to
confirm that the single differing value in the two incoming GEPs is the same
type. The final issue was the the IRBuilder under some circumstances would
build PHIs in the middle of the block.

llvm-svn: 209065
2014-05-17 06:51:36 +00:00
Louis Gerbarg 8d2a43e9be Add support for combining GEPs across PHI nodes
Currently LLVM will generally merge GEPs. This allows backends to use more
complex addressing modes. In some cases this is not happening because there
is PHI inbetween the two GEPs:

  GEP1--\
        |-->PHI1-->GEP3
  GEP2--/

This patch checks to see if GEP1 and GEP2 are similiar enough that they can be
cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):

  GEP1--\                     --\                           --\
        |-->PHI1-->GEP3  ==>    |-->PHI2->GEP12->GEP3 == >    |-->PHI2->GEP123
  GEP2--/                     --/                           --/

This also breaks certain use chains that are preventing GEP->GEP merges that the
the existing instcombine would merge otherwise.

Tests included.

rdar://15547484

llvm-svn: 209049
2014-05-16 23:47:24 +00:00
Dinesh Dwivedi 83c11da849 Reverting r208848, reason: build failure: sanitizer-x86_64-linux-bootstrap/builds/3399
llvm-svn: 208852
2014-05-15 08:22:55 +00:00
Dinesh Dwivedi f675f4201b Added instcombine for 'MIN(MIN(A, 27), 93)' and 'MAX(MAX(A, 93), 27)'
MIN(MIN(A, 23), 97) -> MIN(A, 23)
MAX(MAX(A, 97), 23) -> MAX(A, 97)

Differential Revision: http://reviews.llvm.org/D3629

llvm-svn: 208849
2014-05-15 06:13:40 +00:00
Dinesh Dwivedi 837c16097e Added inst combine transforms for single bit tests from Chris's note
if ((x & C) == 0) x |= C becomes x |= C
if ((x & C) != 0) x ^= C becomes x &= ~C
if ((x & C) == 0) x ^= C becomes x |= C
if ((x & C) != 0) x &= ~C becomes x &= ~C
if ((x & C) == 0) x &= ~C becomes nothing

Z3 Verifications code for above transform
http://rise4fun.com/Z3/Pmsh

Differential Revision: http://reviews.llvm.org/D3717

llvm-svn: 208848
2014-05-15 06:01:33 +00:00
David Majnemer 186c94244c InstCombine: Optimize -x s< cst
Summary:
This gets rid of a sub instruction by moving the negation to the
constant when valid.

Reviewers: nicholas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3773

llvm-svn: 208827
2014-05-15 00:02:20 +00:00
Jay Foad a0653a3e6c Rename ComputeMaskedBits to computeKnownBits. "Masked" has been
inappropriate since it lost its Mask parameter in r154011.

llvm-svn: 208811
2014-05-14 21:14:37 +00:00
Serge Pavlov e6de9e39a8 Fix the case when reordering shuffle and binop produces a constant.
This resolves PR19737.

llvm-svn: 208762
2014-05-14 09:05:09 +00:00
Nick Lewycky f0cf8fa941 Optimize integral reciprocal (udiv 1, x and sdiv 1, x) to not use division. This fires exactly once in a clang bootstrap, but covers a few different results from http://www.cs.utah.edu/~regehr/souper/
llvm-svn: 208750
2014-05-14 03:03:05 +00:00
Serge Pavlov b575ee8294 Fix type of shuffle resulted from shuffle merge.
This fix resolves PR19730.

llvm-svn: 208666
2014-05-13 06:07:21 +00:00
Serge Pavlov 02ff620c7b Fix type of shuffle obtained from reordering with binary operation
In transformation:
    BinOp(shuffle(v1,undef), shuffle(v2,undef)) -> shuffle(BinOp(v1, v2),undef)
type of the undef argument must be same as type of BinOp.

llvm-svn: 208531
2014-05-12 10:11:27 +00:00
Serge Pavlov 0581109708 Fix reordering of shuffles and binary operations
Do not apply transformation:

    BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))

if operands v1 and v2 are of different size.
This change fixes PR19717, which was caused by r208488.
    

llvm-svn: 208518
2014-05-12 05:44:53 +00:00
Serge Pavlov 9ef66a8266 Reorder shuffle and binary operation.
This patch enables transformations:

    BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))
    BinOp(shuffle(v1), const1) -> shuffle(BinOp, const2)

They allow to eliminate extra shuffles in some cases.

Differential Revision: http://reviews.llvm.org/D3525

llvm-svn: 208488
2014-05-11 08:46:12 +00:00
Michael Zolotukhin 292d3caa15 [InstCombine] Some cleanup in optimization of redundant insertvalue instructions.
And one more test added.

llvm-svn: 208355
2014-05-08 19:50:24 +00:00
Chandler Carruth d70cc604af Tidy up whitespace with clang-format prior to making significant
changes.

llvm-svn: 208229
2014-05-07 17:36:59 +00:00
Michael Zolotukhin 7d6293a0d3 [InstCombine] Add optimization of redundant insertvalue instructions.
rdar://problem/11861387

llvm-svn: 208214
2014-05-07 14:30:18 +00:00
Rafael Espindola 85f3610222 Also handle ConstantAggregateZero when optimizing vpermilvar*.
llvm-svn: 207582
2014-04-29 22:20:40 +00:00
Rafael Espindola 152ee213a4 Remove tabs.
Sorry, new machine and I forgot to change the editor setting.

llvm-svn: 207578
2014-04-29 21:02:37 +00:00
Rafael Espindola eb7bdbd0ce Two fixes to the vpermilvar optimization.
The instcomine logic to handle vpermilvar's pd and 256 variants was incorrect.
The _256 variants have indexes into the individual 128 bit lanes and in all
cases it also has to mask out unused bits.

llvm-svn: 207577
2014-04-29 20:41:54 +00:00
Hans Wennborg e36e116826 InstCombine: don't drop 'inalloca' in PromoteCastOfAllocation (PR19569)
llvm-svn: 207426
2014-04-28 17:40:03 +00:00
Craig Topper e73658ddbb [C++] Use 'nullptr'.
llvm-svn: 207394
2014-04-28 04:05:08 +00:00
Andrea Di Biagio 8cc9059ce8 [InstCombine][X86] Teach how to fold calls to SSE2/AVX2 packed logical shift
right intrinsics.

A packed logical shift right with a shift count bigger than or equal to the
element size always produces a zero vector. In all other cases, it can be
safely replaced by a 'lshr' instruction.

llvm-svn: 207299
2014-04-26 01:03:22 +00:00
Craig Topper f40110f4d8 [C++] Use 'nullptr'. Transforms edition.
llvm-svn: 207196
2014-04-25 05:29:35 +00:00
Michael J. Spencer dee4b2c379 [InstCombine][x86] Constant fold psll intrinsics.
This excludes avx512 as I don't have hardware to verify. It excludes _dq
variants because they are represented in the IR as <{2,4} x i64> when it's
actually a byte shift of the entire i{128,265}.

This also excludes _dq_bs as they aren't at all supported by the backend.
There are also no corresponding instructions in the ISA. I have no idea why
they exist...

llvm-svn: 207058
2014-04-24 00:58:18 +00:00
Filipe Cabecinhas 1a80595a2b Optimize some special cases for SSE4a insertqi
Summary:
Since the upper 64 bits of the destination register are undefined when
performing this operation, we can substitute it and let the optimizer
figure out that only a copy is needed.

Also added range merging, if an instruction copies a range that can be
merged with a previous copied range.

Added test cases for both optimizations.

Reviewers: grosbach, nadav

CC: llvm-commits

Differential Revision: http://reviews.llvm.org/D3357

llvm-svn: 207055
2014-04-24 00:38:14 +00:00
Matt Arsenault 60728177fb Handle addrspacecast when looking at memcpys from globals
llvm-svn: 207054
2014-04-24 00:01:09 +00:00
Matt Arsenault 6b4bed4b83 Remove dead code in instcombine.
Don't replace shifts greater than the type with the maximum shift.

This isn't hit anywhere in the tests, and somewhere else is replacing
these with undef.

llvm-svn: 207000
2014-04-23 16:48:40 +00:00
Chandler Carruth 964daaaf19 [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE
definition below all of the header #include lines, lib/Transforms/...
edition.

This one is tricky for two reasons. We again have a couple of passes
that define something else before the includes as well. I've sunk their
name macros with the DEBUG_TYPE.

Also, InstCombine contains headers that need DEBUG_TYPE, so now those
headers #define and #undef DEBUG_TYPE around their code, leaving them
well formed modular headers. Fixing these headers was a large motivation
for all of these changes, as "leaky" macros of this form are hard on the
modules implementation.

llvm-svn: 206844
2014-04-22 02:55:47 +00:00
Rafael Espindola bad3f77703 Simplify a vpermil* with constant mask.
With a constant mask a vpermil* is just a shufflevector. This patch implements
that simplification. This allows us to produce denser code. It should also
allow more folding down the line.

llvm-svn: 206801
2014-04-21 22:06:04 +00:00
Chandler Carruth 5f1f26e891 [Modules] Sink all the DEBUG_TYPE defines for InstCombine out of the
header files and into the cpp files.

These files will require more touches as the header files actually use
DEBUG(). Eventually, I'll have to introduce a matched #define and #undef
of DEBUG_TYPE for the header files, but that comes as step N of many to
clean all of this up.

llvm-svn: 206777
2014-04-21 19:51:41 +00:00
Matt Arsenault fed3dc8dc6 Revert "Revert r206045, "Fix shift by constants for vector.""
Fix cases where the Value itself is used, and not the constant value.

llvm-svn: 206214
2014-04-14 21:50:37 +00:00
NAKAMURA Takumi 58ad0c87f8 Whitespace.
llvm-svn: 206154
2014-04-14 07:03:13 +00:00
NAKAMURA Takumi 26afa982ec Revert r206045, "Fix shift by constants for vector."
It broke some builders, at least, i686.

llvm-svn: 206153
2014-04-14 07:02:57 +00:00
Serge Pavlov b5f3ddc7a1 Use APInt arithmetic, fixed typo. Thanks to Benjamin Kramer for noticing that.
llvm-svn: 206144
2014-04-14 02:20:19 +00:00
Serge Pavlov 4bb54d51c8 Recognize test for overflow in integer multiplication.
If multiplication involves zero-extended arguments and the result is
compared as in the patterns:

    %mul32 = trunc i64 %mul64 to i32
    %zext = zext i32 %mul32 to i64
    %overflow = icmp ne i64 %mul64, %zext
or
    %overflow = icmp ugt i64 %mul64 , 0xffffffff

then the multiplication may be replaced by call to umul.with.overflow.
This change fixes PR4917 and PR4918.

Differential Revision: http://llvm-reviews.chandlerc.com/D2814

llvm-svn: 206137
2014-04-13 18:23:41 +00:00
Matt Arsenault 173a1e577c Fix shift by constants for vector.
ashr <N x iM>, <N x iM> M -> undef

llvm-svn: 206045
2014-04-11 17:57:53 +00:00
Eli Bendersky 9966b26dac Fix PR19270 - type mismatch caused by invalid optimization.
Patch by Jingyue Wu.

llvm-svn: 205547
2014-04-03 17:51:58 +00:00
Tim Northover 00ed9964c6 ARM64: initial backend import
This adds a second implementation of the AArch64 architecture to LLVM,
accessible in parallel via the "arm64" triple. The plan over the
coming weeks & months is to merge the two into a single backend,
during which time thorough code review should naturally occur.

Everything will be easier with the target in-tree though, hence this
commit.

llvm-svn: 205090
2014-03-29 10:18:08 +00:00
Erik Verbruggen 5e1bac3a38 Revert "InstCombine: merge constants in both operands of icmp."
This reverts commit r204912, and follow-up commit r204948.

This introduced a performance regression, and the fix is not completely
clear yet.

llvm-svn: 205010
2014-03-28 14:50:57 +00:00
Reid Kleckner 3bdf9bc48b InstCombine: Don't combine constants on unsigned icmps
Fixes a miscompile introduced in r204912.  It would miscompile code like
(unsigned)(a + -49) <= 5U.  The transform would turn this into
(unsigned)a < 55U, which would return true for values in [0, 49], when
it should not.

llvm-svn: 204948
2014-03-27 17:49:27 +00:00
Erik Verbruggen 59a1219846 InstCombine: merge constants in both operands of icmp.
Transform:
    icmp X+Cst2, Cst
into:
    icmp X, Cst-Cst2
when Cst-Cst2 does not overflow, and the add has nsw.

llvm-svn: 204912
2014-03-27 11:16:05 +00:00
Richard Osborne 0af4aa9a19 [InstCombine] Don't fold bitcast into store if it would need addrspacecast
Summary:
Previously the code didn't check if the before and after types for the
store were pointers to different address spaces. This resulted in
instcombine using a bitcast to convert between pointers to different
address spaces, causing an assertion due to the invalid cast.

It is not be appropriate to use addrspacecast this case because it is
not guaranteed to be a no-op cast. Instead bail out and do not do the
transformation.

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D3117

llvm-svn: 204733
2014-03-25 17:21:41 +00:00
Richard Osborne 9805ec457d Reuse earlier variables to make it clear the types involved in the cast.
No functionality change.

llvm-svn: 204732
2014-03-25 17:21:35 +00:00
Owen Anderson 9b8f9c3d95 Fix a bug in InstCombine where we would incorrectly attempt to construct a
bitcast between pointers of two different address spaces if they happened to have
the same pointer size.

llvm-svn: 203862
2014-03-13 22:51:43 +00:00
Chandler Carruth cdf4788401 [C++11] Add range based accessors for the Use-Def chain of a Value.
This requires a number of steps.
1) Move value_use_iterator into the Value class as an implementation
   detail
2) Change it to actually be a *Use* iterator rather than a *User*
   iterator.
3) Add an adaptor which is a User iterator that always looks through the
   Use to the User.
4) Wrap these in Value::use_iterator and Value::user_iterator typedefs.
5) Add the range adaptors as Value::uses() and Value::users().
6) Update *all* of the callers to correctly distinguish between whether
   they wanted a use_iterator (and to explicitly dig out the User when
   needed), or a user_iterator which makes the Use itself totally
   opaque.

Because #6 requires churning essentially everything that walked the
Use-Def chains, I went ahead and added all of the range adaptors and
switched them to range-based loops where appropriate. Also because the
renaming requires at least churning every line of code, it didn't make
any sense to split these up into multiple commits -- all of which would
touch all of the same lies of code.

The result is still not quite optimal. The Value::use_iterator is a nice
regular iterator, but Value::user_iterator is an iterator over User*s
rather than over the User objects themselves. As a consequence, it fits
a bit awkwardly into the range-based world and it has the weird
extra-dereferencing 'operator->' that so many of our iterators have.
I think this could be fixed by providing something which transforms
a range of T&s into a range of T*s, but that *can* be separated into
another patch, and it isn't yet 100% clear whether this is the right
move.

However, this change gets us most of the benefit and cleans up
a substantial amount of code around Use and User. =]

llvm-svn: 203364
2014-03-09 03:16:01 +00:00
Tim Northover fad2761ca0 InstCombine: form shuffles from wider range of insert/extractelements
Sequences of insertelement/extractelements are sometimes used to build
vectorsr; this code tries to put them back together into shuffles, but
could only produce a completely uniform shuffle types (<N x T> from two
<N x T> sources).

This should allow shuffles with different numbers of elements on the
input and output sides as well.

llvm-svn: 203229
2014-03-07 10:24:44 +00:00
Chandler Carruth 7da14f1ab9 [Layering] Move InstVisitor.h into the IR library as it is pretty
obviously coupled to the IR.

llvm-svn: 203064
2014-03-06 03:23:41 +00:00
Craig Topper 3e4c697ca1 [C++11] Add 'override' keyword to virtual methods that override their base class.
llvm-svn: 202953
2014-03-05 09:10:37 +00:00
Chandler Carruth 8cd041ef19 [Modules] Move the ConstantRange class into the IR library. This is
a bit surprising, as the class is almost entirely abstracted away from
any particular IR, however it encodes the comparsion predicates which
mutate ranges as ICmp predicate codes. This is reasonable as they're
used for both instructions and constants. Thus, it belongs in the IR
library with instructions and constants.

llvm-svn: 202838
2014-03-04 12:24:34 +00:00
Chandler Carruth 452a00747e [Modules] Move the TargetFolder into the Analysis library. Historically,
this would have been required because of the use of DataLayout, but that
has moved into the IR proper. It is still required because this folder
uses the constant folding in the analysis library (which uses the
datalayout) as the more aggressive basis of its folder.

llvm-svn: 202832
2014-03-04 11:59:06 +00:00
Chandler Carruth 1305dc3351 [Modules] Move CFG.h to the IR library as it defines graph traits over
IR types.

llvm-svn: 202827
2014-03-04 11:45:46 +00:00
Chandler Carruth 4220e9c154 [Modules] Move ValueHandle into the IR library where Value itself lives.
Move the test for this class into the IR unittests as well.

This uncovers that ValueMap too is in the IR library. Ironically, the
unittest for ValueMap is useless in the Support library (honestly, so
was the ValueHandle test) and so it already lives in the IR unittests.
Mmmm, tasty layering.

llvm-svn: 202821
2014-03-04 11:17:44 +00:00
Chandler Carruth 820a908df7 [Modules] Move the LLVM IR pattern match header into the IR library, it
obviously is coupled to the IR.

llvm-svn: 202818
2014-03-04 11:08:18 +00:00
Chandler Carruth 219b89b987 [Modules] Move CallSite into the IR library where it belogs. It is
abstracting between a CallInst and an InvokeInst, both of which are IR
concepts.

llvm-svn: 202816
2014-03-04 11:01:28 +00:00
Chandler Carruth 03eb0de93d [Modules] Move GetElementPtrTypeIterator into the IR library. As its
name might indicate, it is an iterator over the types in an instruction
in the IR.... You see where this is going.

Another step of modularizing the support library.

llvm-svn: 202815
2014-03-04 10:40:04 +00:00
Rafael Espindola 935125126c Make DataLayout a plain object, not a pass.
Instead, have a DataLayoutPass that holds one. This will allow parts of LLVM
don't don't handle passes to also use DataLayout.

llvm-svn: 202168
2014-02-25 17:30:31 +00:00
Rafael Espindola aeff8a9c05 Make some DataLayout pointers const.
No functionality change. Just reduces the noise of an upcoming patch.

llvm-svn: 202087
2014-02-24 23:12:18 +00:00
Rafael Espindola 37dc9e19f5 Rename many DataLayout variables from TD to DL.
I am really sorry for the noise, but the current state where some parts of the
code use TD (from the old name: TargetData) and other parts use DL makes it
hard to write a patch that changes where those variables come from and how
they are passed along.

llvm-svn: 201827
2014-02-21 00:06:31 +00:00
Nick Lewycky 75080ff25d Make sure that value handle users see the transformation of an indirect call to a direct call. This is important for the CallGraph iteration. Patch by Björn Steinbrink!
llvm-svn: 201822
2014-02-20 23:00:15 +00:00
Matt Arsenault aa689f5079 Do more addrspacecast transforms that happen for bitcast.
Makes addrspacecast (gep) do addrspacecast (gep) instead.

llvm-svn: 201376
2014-02-14 00:49:12 +00:00