Commit Graph

1843 Commits

Author SHA1 Message Date
Sanjay Patel 5f0217f42e fix documentation comments and other clean-ups; NFC
llvm-svn: 271839
2016-06-05 16:46:18 +00:00
Sanjay Patel 6f8f47b358 [InstCombine] less 'CI' confusion; NFC
Change the name of the ICmpInst to 'ICmp' and the Constant (was a ConstantInt) to 'C',
so that it's hopefully clearer that 'CI' refers to CastInst in this context.

While we're scrubbing, fix the documentation comment and use 'auto' with 'dyn_cast'.

llvm-svn: 271817
2016-06-05 00:12:32 +00:00
Sanjay Patel ea8a211169 [InstCombine] allow vector constants for cast+icmp fold
This is step 1 of unknown towards fixing PR28001:
https://llvm.org/bugs/show_bug.cgi?id=28001

llvm-svn: 271810
2016-06-04 22:04:05 +00:00
Sanjay Patel c774f8c265 clean-up; NFC
llvm-svn: 271807
2016-06-04 21:20:44 +00:00
Sanjay Patel 4c204230fc fix formatting, punctuation; NFC
llvm-svn: 271804
2016-06-04 20:39:22 +00:00
Simon Pilgrim fda22d66fc [InstCombine][MMX] Extend SimplifyDemandedUseBits MOVMSK support to MMX
Add the MMX implementation to the SimplifyDemandedUseBits SSE/AVX MOVMSK support added in D19614

Requires a minor tweak as llvm.x86.mmx.pmovmskb takes a x86_mmx argument - so we have to be explicit about the implied v8i8 vector type.

llvm-svn: 271789
2016-06-04 13:42:46 +00:00
Sanjay Patel 6cf18af1c5 [InstCombine] look through bitcasts to find selects
There was concern that creating bitcasts for the simpler potential select pattern:

define <2 x i64> @vecBitcastOp1(<4 x i1> %cmp, <2 x i64> %a) {
  %a2 = add <2 x i64> %a, %a
  %sext = sext <4 x i1> %cmp to <4 x i32>
  %bc = bitcast <4 x i32> %sext to <2 x i64>
  %and = and <2 x i64> %a2, %bc
  ret <2 x i64> %and
}

might lead to worse code for some targets, so this patch is matching the larger
patterns seen in the test cases.

The motivating example for this patch is this IR produced via SSE intrinsics in C:

define <2 x i64> @gibson(<2 x i64> %a, <2 x i64> %b) {
  %t0 = bitcast <2 x i64> %a to <4 x i32>
  %t1 = bitcast <2 x i64> %b to <4 x i32>
  %cmp = icmp sgt <4 x i32> %t0, %t1
  %sext = sext <4 x i1> %cmp to <4 x i32>
  %t2 = bitcast <4 x i32> %sext to <2 x i64>
  %and = and <2 x i64> %t2, %a
  %neg = xor <4 x i32> %sext, <i32 -1, i32 -1, i32 -1, i32 -1>
  %neg2 = bitcast <4 x i32> %neg to <2 x i64>
  %and2 = and <2 x i64> %neg2, %b
  %or = or <2 x i64> %and, %and2
  ret <2 x i64> %or
}

For an AVX target, this is currently:

vpcmpgtd  %xmm1, %xmm0, %xmm2
vpand     %xmm0, %xmm2, %xmm0
vpandn    %xmm1, %xmm2, %xmm1
vpor      %xmm1, %xmm0, %xmm0
retq

With this patch, it becomes:

vpmaxsd   %xmm1, %xmm0, %xmm0

Differential Revision: http://reviews.llvm.org/D20774

llvm-svn: 271676
2016-06-03 14:42:07 +00:00
Sanjay Patel dba8b4c04d transform obscured FP sign bit ops into a fabs/fneg using TLI hook
This is effectively a revert of:
http://reviews.llvm.org/rL249702 - [InstCombine] transform masking off of an FP sign bit into a fabs() intrinsic call (PR24886)
and:
http://reviews.llvm.org/rL249701 - [ValueTracking] teach computeKnownBits that a fabs() clears sign bits
and a reimplementation as a DAG combine for targets that have IEEE754-compliant fabs/fneg instructions.

This is intended to resolve the objections raised on the dev list:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098154.html
and:
https://llvm.org/bugs/show_bug.cgi?id=24886#c4

In the interest of patch minimalism, I've only partly enabled AArch64. PowerPC, MIPS, x86 and others can enable later.

Differential Revision: http://reviews.llvm.org/D19391

llvm-svn: 271573
2016-06-02 20:01:37 +00:00
Sanjay Patel 5c0bc02878 [InstCombine] remove guard for generating a vector select
This is effectively NFC because we already do this transform after r175380:
http://reviews.llvm.org/rL175380

and also via foldBoolSextMaskToSelect().

This change should just make it a bit more efficient to match the pattern. 
The original guard was added in r95058:
http://reviews.llvm.org/rL95058

A sampling of codegen for current in-tree targets shows no problems. This
makes sense given that we're already producing the vector selects via the
other transforms.

llvm-svn: 271554
2016-06-02 18:03:05 +00:00
Saleem Abdulrasool d2f705ddf9 X86: permit using SjLj EH on x86 targets as an option
This adds support to the backed to actually support SjLj EH as an exception
model.  This is *NOT* the default model, and requires explicitly opting into it
from the frontend.  GCC supports this model and for MinGW can still be enabled
via the `--using-sjlj-exceptions` options.

Addresses PR27749!

llvm-svn: 271244
2016-05-31 01:48:07 +00:00
Craig Topper 8287fd8abd [X86] Remove SSE/AVX unaligned store intrinsics as clang no longer uses them. Auto upgrade to native unaligned store instructions.
llvm-svn: 271236
2016-05-30 23:15:56 +00:00
Simon Pilgrim 9602d678cb [X86][SSE] (Reapplied) Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm)
This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused.

Reapplied now that the the companion patch (D20684) removes/auto-upgrade the clang intrinsics has been committed.

Differential Revision: http://reviews.llvm.org/D20686

llvm-svn: 271131
2016-05-28 18:03:41 +00:00
Sanjay Patel 74d23ad498 [InstCombine] move and/sext fold to helper function; NFCI
We need to enhance the pattern matching on these to look through bitcasts.

llvm-svn: 271051
2016-05-27 21:41:29 +00:00
Simon Pilgrim 4642a57fbf Revert: r270973 - [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm)
llvm-svn: 270976
2016-05-27 09:02:25 +00:00
Simon Pilgrim c013e5737b [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm)
This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused.

A companion patch (D20684) removes/auto-upgrade the clang intrinsics.

Differential Revision: http://reviews.llvm.org/D20686

llvm-svn: 270973
2016-05-27 08:49:15 +00:00
Chad Rosier e5819e2732 [InstCombine] Catch more bswap cases missed due to zext and truncs.
Fixes PR27824.
Differential Revision: http://reviews.llvm.org/D20591.

llvm-svn: 270853
2016-05-26 14:58:51 +00:00
Craig Topper a423aa4642 [X86] Add the AVX storeu intrinsics to InstCombine and LoopStrengthReduce in the same places that the SSE/SSE2 storeu intrinsics appear.
I don't really know how to test this. Just seemed like we should be consistent.

llvm-svn: 270819
2016-05-26 04:28:45 +00:00
Chad Rosier a00df49dc5 Clarify that we match BSwap in InstCombine and BitReverse in CGP. NFC.
Also, rename recognizeBitReverseOrBSwapIdiom to recognizeBSwapOrBitReverseIdiom,
so the ordering of the MatchBSwaps and MatchBitReversals arguments are
consistent with the function name.

llvm-svn: 270715
2016-05-25 16:22:14 +00:00
Gerolf Hoflehner 00e7092f68 [InstCombine] Fix assertion when bitcast is converted to gep
When an aggregate contains an opaque type its size cannot be
determined. This triggers an "Invalid GetElementPtrInst indices for type" assert
in function checkGEPType. The fix suppresses the conversion in this case.

http://reviews.llvm.org/D20319

llvm-svn: 270479
2016-05-23 19:23:17 +00:00
Sanjay Patel a8ef4a5737 reduce indent; NFC
llvm-svn: 270372
2016-05-22 17:08:52 +00:00
Guozhi Wei b1d37199cc [InstCombine] Avoid combining the bitcast of a var that is used as both address and result of load instructions
This patch fixes https://llvm.org/bugs/show_bug.cgi?id=27703.

If there is a sequence of one or more load instructions, each loaded value is used as address of later load instruction, bitcast is necessary to change the value type, don't optimize it.

llvm-svn: 270135
2016-05-19 21:07:01 +00:00
Sanjay Patel 22b01febd4 [InstCombine] add another test for wrong icmp constant (PR27792)
It doesn't matter if the comparison is unsigned; the inc/dec is always signed.

llvm-svn: 269831
2016-05-17 20:20:40 +00:00
Sanjay Patel 86564cad06 [InstCombine] fix constant to be signed for signed comparisons
This bug was introduced in r269728 and is the likely cause of many stage 2 ubsan bot failures.
I'll add a test in a follow-up commit assuming this fixes things properly.

llvm-svn: 269797
2016-05-17 18:38:55 +00:00
Benjamin Kramer ca9a0fe2b9 [InstCombine] Don't crash when trying to take an element of a ConstantExpr.
Fixes PR27786.

llvm-svn: 269757
2016-05-17 12:08:55 +00:00
Sanjay Patel 18254935c9 try to avoid unused variable warning in release build; NFCI
llvm-svn: 269729
2016-05-17 01:12:31 +00:00
Sanjay Patel e9b2c32e7f [InstCombine] check vector elements before trying to transform LE/GE vector icmp (PR27756)
Fix a bug introduced with rL269426 :
[InstCombine] canonicalize* LE/GE vector integer comparisons to LT/GT (PR26701, PR26819)

We were assuming that a ConstantDataVector / ConstantVector / ConstantAggregateZero operand of
an ICMP was composed of ConstantInt elements, but it might have ConstantExpr or UndefValue 
elements. Handle those appropriately.

Also, refactor this function to join the scalar and vector paths and eliminate the switches.

Differential Revision: http://reviews.llvm.org/D20289

llvm-svn: 269728
2016-05-17 00:57:57 +00:00
Sanjay Patel abbc2ac231 use 'match' for less indenting; NFCI
llvm-svn: 269494
2016-05-13 21:51:17 +00:00
Jun Bum Lim be11bdc4b0 Rename getLargestLegalIntTypeSize to getLargestLegalIntTypeSizeInBits(). NFC.
Summary: Rename DataLayout::getLargestLegalIntTypeSize to DataLayout::getLargestLegalIntTypeSizeInBits() to prevent similar mistakes  fixed in r269433.

Reviewers: joker.eph, mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20248

llvm-svn: 269456
2016-05-13 18:38:35 +00:00
Sanjay Patel 0c8f3f9332 [InstCombine] handle zero constant vectors for LE/GE comparisons too
Enhancement to: http://reviews.llvm.org/rL269426
With discussion in: http://reviews.llvm.org/D17859

This should complete the fixes for: PR26701, PR26819:
https://llvm.org/bugs/show_bug.cgi?id=26701
https://llvm.org/bugs/show_bug.cgi?id=26819
 

llvm-svn: 269439
2016-05-13 17:28:12 +00:00
Sanjay Patel b79ab27853 [InstCombine] canonicalize* LE/GE vector integer comparisons to LT/GT (PR26701, PR26819)
*We don't currently handle the  edge case constants (min/max values), so it's not a complete
canonicalization.

To fully solve the motivating bugs, we need to enhance this to recognize a zero vector
too because that's a ConstantAggregateZero which is a ConstantData, not a ConstantVector
or a ConstantDataVector.

Differential Revision: http://reviews.llvm.org/D17859 

llvm-svn: 269426
2016-05-13 15:10:46 +00:00
Chad Rosier 4e6cda2db5 [InstCombine] Fold icmp ugt/ult (udiv i32 C2, X), C1.
This patch adds support for two optimizations:
icmp ugt (udiv C2, X), C1 -> icmp ule X, C2/(C1+1)
icmp ult (udiv C2, X), C1 -> icmp ugt X, C2/C1

Differential Revision: http://reviews.llvm.org/D20123

llvm-svn: 269109
2016-05-10 20:22:09 +00:00
Arnaud A. de Grandmaison 333ef381b8 [InstCombine] Remove trivially empty va_start/va_end and va_copy/va_end ranges.
When a va_start or va_copy is immediately followed by a va_end (ignoring
debug information or other start/end in between), then it is safe to
remove the pair. As this code shares some commonalities with the lifetime
markers, this has been factored to helper functions.

This InstCombine pattern kicks-in 3 times when running the LLVM test
suite.

llvm-svn: 269033
2016-05-10 09:24:49 +00:00
Chad Rosier 58919cc6f8 Typo. NFC.
llvm-svn: 268975
2016-05-09 21:37:43 +00:00
Chad Rosier 131a42ccdf [InstCombine] Fold icmp eq/ne (udiv i32 A, B), 0 -> icmp ugt/ule B, A.
Differential Revision: http://reviews.llvm.org/D20036

llvm-svn: 268960
2016-05-09 19:30:20 +00:00
Philip Reames 6f4d0088c6 Reapply 267210 with fix for PR27490
Original Commit Message
Extend load/store type canonicalization to handle unordered operations

Extend the type canonicalization logic to work for unordered atomic loads and stores.  Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before.  Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered.  If you see problems, feel free to revert this change, but please make sure you collect a test case. 

Note that the concern about lowering is now much less likely.  PR27490 proved that we already *were* mucking with the types of ordered atomics and volatiles.  As a result, this change doesn't introduce as much new behavior as originally thought.

llvm-svn: 268809
2016-05-06 22:17:01 +00:00
Balaram Makam 569eaec5f3 "Reapply r268521 "[InstCombine] Canonicalize icmp instructions based on dominating conditions.""
This reapplies commit r268521, that was reverted in r268530 due to a test failure in select-implied.ll
Modified the test case to reflect the new change.

llvm-svn: 268557
2016-05-04 21:32:14 +00:00
Balaram Makam 31e7e13789 Revert "[InstCombine] Canonicalize icmp instructions based on dominating conditions."
This reverts commit 573a40f79b35cf3e71db331bb00f6a84f03b835d.

llvm-svn: 268530
2016-05-04 18:37:35 +00:00
Balaram Makam cf3bcb2625 [InstCombine] Canonicalize icmp instructions based on dominating conditions.
Summary:
    This patch canonicalizes conditions based on the constant range information
    of the dominating branch condition.
    For example:

      %cmp = icmp slt i64 %a, 0
      br i1 %cmp, label %land.lhs.true, label %lor.rhs
      lor.rhs:
        %cmp2 = icmp sgt i64 %a, 0

    Would now be canonicalized into:

      %cmp = icmp slt i64 %a, 0
      br i1 %cmp, label %land.lhs.true, label %lor.rhs
      lor.rhs:
        %cmp2 = icmp ne i64 %a, 0

Reviewers: mcrosier, gberry, t.p.northover, llvm-commits, reames, hfinkel, sanjoy, majnemer

Subscribers: MatzeB, majnemer, mcrosier

Differential Revision: http://reviews.llvm.org/D18841

llvm-svn: 268521
2016-05-04 17:34:20 +00:00
Simon Pilgrim ca140b17cb [InstCombine][SSE] Added support to VPERMD/VPERMPS to shuffle combine to accept UNDEF elements.
llvm-svn: 268206
2016-05-01 20:43:02 +00:00
Simon Pilgrim eeacc40e27 [InstCombine][SSE] Added support to VPERMILVAR to shuffle combine to accept UNDEF elements.
llvm-svn: 268204
2016-05-01 20:22:42 +00:00
Simon Pilgrim e5e8c2fde0 [InstCombine][SSE] Added support to PSHUFB to shuffle combine to accept UNDEF elements.
llvm-svn: 268202
2016-05-01 19:26:21 +00:00
Simon Pilgrim 8cddf8b3c6 [InstCombine][AVX2] Combine VPERMD/VPERMPS intrinsics with constant masks to shufflevector.
llvm-svn: 268199
2016-05-01 16:41:22 +00:00
Simon Pilgrim 640f9964c7 [InstCombine][AVX] VPERMILVAR to shuffle combine to use general aggregate elements. NFCI.
Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements.

llvm-svn: 268158
2016-04-30 07:23:30 +00:00
Simon Pilgrim bf60cc492c [InstCombine][SSE] PSHUFB to shuffle combine to use general aggregate elements. NFCI.
Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements.

llvm-svn: 268115
2016-04-29 21:34:54 +00:00
Chad Rosier cd62bf5821 [InstCombine] Determine the result of a select based on a dominating condition.
Differential Revision: http://reviews.llvm.org/D19550

llvm-svn: 268104
2016-04-29 21:12:31 +00:00
Sanjay Patel 9190b4add8 [InstCombine] clean up; NFC
llvm-svn: 268099
2016-04-29 20:54:56 +00:00
Sanjay Patel d5b0e54b49 [InstCombine] add helper function for ICmp with constant canonicalization; NFCI
As suggested in http://reviews.llvm.org/D17859 , we should enhance this
to support vectors.

llvm-svn: 268059
2016-04-29 16:22:25 +00:00
David Majnemer 231a68cc22 [InstCombine] Propagate operand bundles
We neglected to transfer operand bundles for some transforms.  These
were found via inspection, I'll try to come up with some test cases.

llvm-svn: 268010
2016-04-29 08:07:20 +00:00
Ahmed Bougacha 17482a5696 [InstCombine] Remove trailing whitespace. NFC.
r267873.

llvm-svn: 267887
2016-04-28 14:36:07 +00:00
Simon Pilgrim bd4a3be7d2 [InstCombine][SSE] Add MOVMSK support to SimplifyDemandedUseBits
The MOVMSK instructions copies a vector elements' sign bits to the low bits of a scalar register and zeros the high bits.

This patch adds MOVMSK support to SimplifyDemandedUseBits so that its aware that the upper bits are known to be zero. It also removes the call to MOVMSK if none of the lower bits are actually required and just returns zero.

Differential Revision: http://reviews.llvm.org/D19614

llvm-svn: 267873
2016-04-28 12:22:53 +00:00
Artur Pilipenko 9bb6beabf4 isSafeToLoadUnconditionally support queries without a context
This is required to use this function from isSafeToSpeculativelyExecute

Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D16231

llvm-svn: 267692
2016-04-27 11:00:48 +00:00
Arch D. Robison be0490a6e8 Optimize store of "bitcast" from vector to aggregate.
This patch is what was the "instcombine" portion of D14185, with an additional 
test added (see julia_pseudovec in test/Transforms/InstCombine/insert-val-extract-elem.ll). 
The patch causes instcombine to replace sequences of extractelement-insertvalue-store 
that act essentially like a bitcast followed by a store.

Differential review: http://reviews.llvm.org/D14260

llvm-svn: 267482
2016-04-25 22:22:39 +00:00
Etienne Bergeron 50f02aa3fa Cleanup redundant expression in InstCombineAndOrXor.
Summary:
The expression is redundant on both side of operator |.

detected by : http://reviews.llvm.org/D19451

Reviewers: rnk, majnemer

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D19459

llvm-svn: 267458
2016-04-25 20:15:33 +00:00
Anna Thomas 95f68aa7eb Test commit: modified comment. NFC
llvm-svn: 267406
2016-04-25 13:58:05 +00:00
Simon Pilgrim 4c564ad4dd Tweak comments to make it clear that these combines are for SSE scalar instructions.
llvm-svn: 267360
2016-04-24 19:31:56 +00:00
Simon Pilgrim 4b5462f119 [InstCombine][SSE] Reduce DIVSS/DIVSD to FDIV if only first element is required
As discussed on D19318, if we only demand the first element of a DIVSS/DIVSD intrinsic, then reduce to a FDIV call. This matches the existing FADD/FSUB/FMUL patterns.

llvm-svn: 267359
2016-04-24 18:35:59 +00:00
Simon Pilgrim 83020942d3 [InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 2 of 2)
Split from D17490. This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:

1 - demanded vector element support for unary and some extra binary scalar intrinsics (RCP/RSQRT/SQRT/FRCZ and ADD/CMP/DIV/ROUND).

2 - addss/addsd get simplified to a fadd call if we aren't interested in the pass through elements

3 - if we don't need the lowest element of a scalar operation then just use the first argument (the pass through elements) directly

We can add support for propagating demanded elements through any equivalent packed SSE intrinsics in a future patch (these wouldn't use the pass through patterns).

Differential Revision: http://reviews.llvm.org/D19318

llvm-svn: 267357
2016-04-24 18:23:14 +00:00
Simon Pilgrim 424da1637a [InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 1 of 2)
This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:

1 - recognise that we only need the lowest element of the second input for binary scalar operations (and all the elements of the first input)

2 - recognise that the roundss/roundsd intrinsics use the lowest element of the second input and the remaining elements from the first input

Differential Revision: http://reviews.llvm.org/D17490

llvm-svn: 267356
2016-04-24 18:12:42 +00:00
Simon Pilgrim 1c9a9f255c [InstCombine] Avoid updating argument demanded elements in separate passes.
As discussed on D17490, we should attempt to update an intrinsic's arguments demanded elements in one pass if we can.

llvm-svn: 267355
2016-04-24 17:57:27 +00:00
Simon Pilgrim 2f6097d113 [X86][InstCombine] Tidyup VPERMILVAR -> shufflevector conversion to helper function. NFCI.
llvm-svn: 267352
2016-04-24 17:23:46 +00:00
Simon Pilgrim c0c56e747a [X86][InstCombine] Tidyup PSHUFB -> shufflevector conversion to helper function. NFCI.
llvm-svn: 267351
2016-04-24 17:00:34 +00:00
Nico Weber 0aa9845d15 Revert r267210, it makes clang assert (PR27490).
llvm-svn: 267232
2016-04-22 22:08:42 +00:00
Andrew Kaylor aa641a5171 Re-commit optimization bisect support (r267022) without new pass manager support.
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267231
2016-04-22 22:06:11 +00:00
Philip Reames 5f0e36947b [unordered] sink unordered stores at end of blocks
The existing code turned out to be completely correct when auditted.  Thus, only minor code changes and adding a couple of tests.

llvm-svn: 267215
2016-04-22 20:53:32 +00:00
Sanjoy Das f97229d6ba Fold compares for distinct allocations
Summary:
We can fold compares to false when two distinct allocations within a
function are compared for equality.

Patch by Anna Thomas!

Reviewers: majnemer, reames, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19390

llvm-svn: 267214
2016-04-22 20:52:25 +00:00
Philip Reames eedef73b63 [unordered] Extend load/store type canonicalization to handle unordered operations
Extend the type canonicalization logic to work for unordered atomic loads and stores.  Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before.  Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered.  If you see problems, feel free to revert this change, but please make sure you collect a test case.  

llvm-svn: 267210
2016-04-22 20:33:48 +00:00
Silviu Baranga e985c76b90 [InstCombine] Preserve fast math flags when combining PHIs
Summary:
When optimizing PHIs which have inputs floating point binary
operators, we preserve all IR flags except the fast math
flags.

This change removes the logic which tracked some of the IR flags
(no wrap, exact) and replaces it by doing an and on the IR flags of
all inputs to the PHI - which will also handle the fast math
flags.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19370

llvm-svn: 267139
2016-04-22 11:21:36 +00:00
Vedant Kumar 6013f45f92 Revert "Initial implementation of optimization bisect support."
This reverts commit r267022, due to an ASan failure:

  http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549

llvm-svn: 267115
2016-04-22 06:51:37 +00:00
JF Bastien c22d29982b NFC: fix copy / paste comment
llvm-svn: 267039
2016-04-21 19:53:39 +00:00
JF Bastien 3e2e69f607 NFC: fix nonsensical comment
llvm-svn: 267036
2016-04-21 19:41:48 +00:00
Sanjoy Das a085cfc150 Folding compares with unescaped allocations
Summary:
If we know that the pointer allocated within a function does not escape,
we can fold away comparisons that are done with global pointers

Patch by Anna Thomas!

Reviewers: reames, majnemer, sanjoy

Subscribers: mgrang, mcrosier, majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D19276

llvm-svn: 267035
2016-04-21 19:26:45 +00:00
Philip Reames a98c7ead30 [instcombine][unordered] Extend load(select) transform to handle unordered loads
llvm-svn: 267023
2016-04-21 17:59:40 +00:00
Andrew Kaylor f0f279291c Initial implementation of optimization bisect support.
This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.

The bisection is enabled using a new command line option (-opt-bisect-limit).  Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit.  A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used.

The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check.  Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute.  A new function call has been added for module and SCC passes that behaves in a similar way.

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267022
2016-04-21 17:58:54 +00:00
Philip Reames 3ac0718423 [unordered] unordered loads from null are still unreachable
llvm-svn: 267019
2016-04-21 17:45:05 +00:00
Philip Reames ac55090e96 [instcombine][unordered] Implement *-load forwarding for unordered atomics
This builds on 266999 which made FindAvailableValue do the right thing.  Tests included show the newly enabled transforms and those which disabled either due to conservatism or correctness requirements.

llvm-svn: 267006
2016-04-21 17:03:33 +00:00
David Majnemer b4b27230bf [ValueTracking, VectorUtils] Refactor getIntrinsicIDForCall
The functionality contained within getIntrinsicIDForCall is two-fold: it
checks if a CallInst's callee is a vectorizable intrinsic.  If it isn't
an intrinsic, it attempts to map the call's target to a suitable
intrinsic.

Move the mapping functionality into getIntrinsicForCallSite and rename
getIntrinsicIDForCall to getVectorIntrinsicIDForCall while
reimplementing it in terms of getIntrinsicForCallSite.

llvm-svn: 266801
2016-04-19 19:10:21 +00:00
Mehdi Amini b550cb1750 [NFC] Header cleanup
Removed some unused headers, replaced some headers with forward class declarations.

Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'

Patch by Eugene Kosov <claprix@yandex.ru>

Differential Revision: http://reviews.llvm.org/D19219

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
2016-04-18 09:17:29 +00:00
Sanjoy Das 99042473d0 Fix a typo in rL265762
I accidentally replaced `mayBeOverridden` with `!isInterposable`.
Remove the negation and add a test case that would've caught this.

Many thanks to Håkan Hjort for spotting this!

llvm-svn: 266551
2016-04-17 04:30:43 +00:00
David Majnemer 2e02ba78d5 [InstCombine] Don't transform compares of calls to functions named fabs{f,l,}
InstCombine wants to optimize compares of calls to fabs with zero.
However, we didn't have the necessary legality checking to verify that
the function call had the same behavior as fabs.

llvm-svn: 266452
2016-04-15 17:21:03 +00:00
Sanjay Patel e998b91d86 [InstCombine] remove constant by inverting compare + logic (PR27105)
https://llvm.org/bugs/show_bug.cgi?id=27105

We can check if all bits outside of a constant mask are set with a 
single constant.

As noted in the bug report, although this form should be considered the
canonical IR, backends may want to transform this into an 'andn' / 'andc' 
comparison against zero because that could be a single machine instruction.

Differential Revision: http://reviews.llvm.org/D18842

llvm-svn: 266362
2016-04-14 20:17:40 +00:00
David Majnemer 3ee5f34469 [InstCombine] We folded an fcmp to an i1 instead of a vector of i1
Remove an ad-hoc transform in InstCombine and replace it with more
general machinery (ValueTracking, InstructionSimplify and VectorUtils).

This fixes PR27332.

llvm-svn: 266175
2016-04-13 06:55:52 +00:00
Sanjay Patel 5e5056d939 [x86, InstCombine] fix masked load pass-through operand to be a zero vector
This bug was introduced with:
http://reviews.llvm.org/rL262269

AVX masked loads are specified to set vector lanes to zero when the high bit of the mask
element for that lane is zero:
"If the mask is 0, the corresponding data element is set to zero in the load form of these
instructions, and unmodified in the store form." --Intel manual

Differential Revision: http://reviews.llvm.org/D19017

llvm-svn: 266148
2016-04-12 23:16:23 +00:00
George Burgess IV 278199f615 Add the allocsize attribute to LLVM.
`allocsize` is a function attribute that allows users to request that
LLVM treat arbitrary functions as allocation functions.

This patch makes LLVM accept the `allocsize` attribute, and makes
`@llvm.objectsize` recognize said attribute.

The review for this was split into two patches for ease of reviewing:
D18974 and D14933. As promised on the revisions, I'm landing both
patches as a single commit.

Differential Revision: http://reviews.llvm.org/D14933

llvm-svn: 266032
2016-04-12 01:05:35 +00:00
Sanjay Patel b91bcd704a add FIXME comment; NFC
llvm-svn: 265970
2016-04-11 17:35:57 +00:00
Sanjay Patel 3a48e9823e add an assert for safety; NFC
llvm-svn: 265969
2016-04-11 17:27:44 +00:00
Sanjay Patel 4b9c682acf variable names start with a capital letter; NFC
llvm-svn: 265968
2016-04-11 17:25:23 +00:00
Sanjay Patel 371290790f [InstCombine] use canEvaluateShiftedShift() to handle the lshr case (NFCI)
We need just a couple of logic tweaks to consolidate the shl and lshr cases.

This is step 5 of refactoring to solve PR26760:
https://llvm.org/bugs/show_bug.cgi?id=26760

llvm-svn: 265965
2016-04-11 17:11:55 +00:00
Sanjay Patel 816ec8882a [InstCombine] don't try to shift an illegal amount (PR26760)
This is the straightforward fix for PR26760:
https://llvm.org/bugs/show_bug.cgi?id=26760

But we still need to make some changes to generalize this helper function
and then send the lshr case into here.

llvm-svn: 265960
2016-04-11 16:50:32 +00:00
Sanjay Patel bd8b779d16 [InstCombine] rename variables in shifted-shift helper function (NFCI)
This is step 3 of refactoring to solve PR26760:
https://llvm.org/bugs/show_bug.cgi?id=26760

llvm-svn: 265954
2016-04-11 16:11:07 +00:00
Sanjay Patel 6eaff5cec6 [InstCombine] add helper function for shift-shift optimization (NFCI)
This is step 2 of refactoring to solve PR26760:
https://llvm.org/bugs/show_bug.cgi?id=26760

llvm-svn: 265951
2016-04-11 15:43:41 +00:00
David Majnemer 56737722e4 [InstCombine] Fix miscompile in FoldSPFofSPF
We had a select of a cast of a select but attempted to replace the outer
select with the inner select dispite their incompatible types.

Patch by Anton Korobeynikov!

This fixes PR27236.

llvm-svn: 265805
2016-04-08 16:51:49 +00:00
David Majnemer fcc5811797 [InstCombine] Add a peephole for redundant assumes
Two or more identical assumes are occasionally next to each other in a
basic block.
While our generic machinery will turn a redundant assume into a no-op,
it is not super cheap.
We can perform a simpler check to achieve the same result for this case.

llvm-svn: 265801
2016-04-08 16:37:12 +00:00
Sanjoy Das 5ce3272833 Don't IPO over functions that can be de-refined
Summary:
Fixes PR26774.

If you're aware of the issue, feel free to skip the "Motivation"
section and jump directly to "This patch".

Motivation:

I define "refinement" as discarding behaviors from a program that the
optimizer has license to discard.  So transforming:

```
void f(unsigned x) {
  unsigned t = 5 / x;
  (void)t;
}
```

to

```
void f(unsigned x) { }
```

is refinement, since the behavior went from "if x == 0 then undefined
else nothing" to "nothing" (the optimizer has license to discard
undefined behavior).

Refinement is a fundamental aspect of many mid-level optimizations done
by LLVM.  For instance, transforming `x == (x + 1)` to `false` also
involves refinement since the expression's value went from "if x is
`undef` then { `true` or `false` } else { `false` }" to "`false`" (by
definition, the optimizer has license to fold `undef` to any non-`undef`
value).

Unfortunately, refinement implies that the optimizer cannot assume
that the implementation of a function it can see has all of the
behavior an unoptimized or a differently optimized version of the same
function can have.  This is a problem for functions with comdat
linkage, where a function can be replaced by an unoptimized or a
differently optimized version of the same source level function.

For instance, FunctionAttrs cannot assume a comdat function is
actually `readnone` even if it does not have any loads or stores in
it; since there may have been loads and stores in the "original
function" that were refined out in the currently visible variant, and
at the link step the linker may in fact choose an implementation with
a load or a store.  As an example, consider a function that does two
atomic loads from the same memory location, and writes to memory only
if the two values are not equal.  The optimizer is allowed to refine
this function by first CSE'ing the two loads, and the folding the
comparision to always report that the two values are equal.  Such a
refined variant will look like it is `readonly`.  However, the
unoptimized version of the function can still write to memory (since
the two loads //can// result in different values), and selecting the
unoptimized version at link time will retroactively invalidate
transforms we may have done under the assumption that the function
does not write to memory.

Note: this is not just a problem with atomics or with linking
differently optimized object files.  See PR26774 for more realistic
examples that involved neither.

This patch:

This change introduces a new set of linkage types, predicated as
`GlobalValue::mayBeDerefined` that returns true if the linkage type
allows a function to be replaced by a differently optimized variant at
link time.  It then changes a set of IPO passes to bail out if they see
such a function.

Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18634

llvm-svn: 265762
2016-04-08 00:48:30 +00:00
David Majnemer fe3f9d1721 [InstCombine] Don't sink an instr after a catchswitch
A catchswitch is a terminator, instructions cannot be inserted after it.

llvm-svn: 265158
2016-04-01 17:28:17 +00:00
Matt Arsenault 2fe4fbc184 AMDGPU: Add frexp_exp intrinsic
llvm-svn: 264944
2016-03-30 22:28:52 +00:00
Matt Arsenault 5cd4f8f89f AMDGPU: Constant folding for frexp_mant
llvm-svn: 264943
2016-03-30 22:28:26 +00:00
Junmo Park 820964e9c6 Minor code cleanup. NFC.
llvm-svn: 264124
2016-03-23 01:38:35 +00:00
David Majnemer cdf2873e36 [InstCombine] Don't insert instructions before a catch switch
CatchSwitches are not splittable, we cannot insert casts, etc. before
them.

This fixes PR26992.

llvm-svn: 263874
2016-03-19 04:39:52 +00:00
Guozhi Wei 7b390ec4cd [InstCombine] Combine A->B->A BitCast
This patch enhances InstCombine to handle following case:

        A  ->  B    bitcast
        PHI
        B  ->  A    bitcast

llvm-svn: 263734
2016-03-17 18:47:20 +00:00
Bjorn Steinbrink 37ca462508 Also handle the new Rust pers fn to isCatchAll()
llvm-svn: 263585
2016-03-15 20:57:07 +00:00
Justin Lebar 9d94397859 [attrs] Handle convergent CallSites.
Summary:
Previously we had a notion of convergent functions but not of convergent
calls.  This is insufficient to correctly analyze calls where the target
is unknown, e.g. indirect calls.

Now a call is convergent if it targets a known-convergent function, or
if it's explicitly marked as convergent.  As usual, we can remove
convergent where we can prove that no convergent operations are
performed in the call.

Originally landed as r261544, then reverted in r261544 for (incidental)
build breakage.  Re-landed here with no changes.

Reviewers: chandlerc, jingyue

Subscribers: llvm-commits, tra, jhen, hfinkel

Differential Revision: http://reviews.llvm.org/D17739

llvm-svn: 263481
2016-03-14 20:18:54 +00:00
Mehdi Amini ba9fba81d6 Remove PreserveNames template parameter from IRBuilder
This reapplies r263258, which was reverted in r263321 because
of issues on Clang side.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263393
2016-03-13 21:05:13 +00:00
Sanjay Patel c4acbae63f [x86, InstCombine] delete x86 SSE2 masked store with zero mask
This follows up on the related AVX instruction transforms, but this
one is too strange to do anything more with. Intel's behavioral
description of this instruction in its Software Developer's Manual
is tragi-comic.

llvm-svn: 263340
2016-03-12 15:16:59 +00:00
Eric Christopher 35abd051c0 Temporarily revert:
commit ae14bf6488e8441f0f6d74f00455555f6f3943ac
Author: Mehdi Amini <mehdi.amini@apple.com>
Date:   Fri Mar 11 17:15:50 2016 +0000

    Remove PreserveNames template parameter from IRBuilder

    Summary:
    Following r263086, we are now relying on a flag on the Context to
    discard Value names in release builds.

    Reviewers: chandlerc

    Subscribers: mzolotukhin, llvm-commits

    Differential Revision: http://reviews.llvm.org/D18023

    From: Mehdi Amini <mehdi.amini@apple.com>

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263258
    91177308-0d34-0410-b5e6-96231b3b80d8

until we can figure out what to do about clang and Release build testing.

This reverts commit 263258.

llvm-svn: 263321
2016-03-12 01:47:22 +00:00
Mehdi Amini 99eab3dd06 Remove PreserveNames template parameter from IRBuilder
Summary:
Following r263086, we are now relying on a flag on the Context to
discard Value names in release builds.

Reviewers: chandlerc

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18023

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263258
2016-03-11 17:15:50 +00:00
Chandler Carruth b47f8010a9 [PM] Make the AnalysisManager parameter to run methods a reference.
This was originally a pointer to support pass managers which didn't use
AnalysisManagers. However, that doesn't realistically come up much and
the complexity of supporting it doesn't really make sense.

In fact, *many* parts of the pass manager were just assuming the pointer
was never null already. This at least makes it much more explicit and
clear.

llvm-svn: 263219
2016-03-11 11:05:24 +00:00
Benjamin Kramer c126353473 [InstCombine] Use Twines to generate names.
Since the names are used in a loop this does more work in debug builds. In
release builds value names are generally discarded so we don't have to do
the concatenation at all. It's also simpler code, no functional change
intended.

llvm-svn: 263215
2016-03-11 10:20:56 +00:00
Philip Reames 8f12eba78d [ValueTracking] Extract isKnownPositive [NFCI]
Extract out a generic interface from a recently landed patch and document a TODO in case compile time becomes a problem.

llvm-svn: 263062
2016-03-09 21:31:47 +00:00
Philip Reames ec8a8b5437 [InstCombine] (icmp sgt smin(PosA, B) 0) -> (icmp sgt B 0)
When checking whether an smin is positive, we can move the comparison to one of the inputs if the other is known positive. If the known positive one is the min, then the other can't be negative. If the other is the min, then we compute the min.

Differential Revision: http://reviews.llvm.org/D17873

llvm-svn: 263059
2016-03-09 21:05:07 +00:00
Matthias Braun c31032d607 InstCombine: Restrict computeKnownBits() on all Values to OptLevel > 2
As part of r251146 InstCombine was extended to call computeKnownBits on
every value in the function to determine whether it happens to be
constant. This increases typical compiletime by 1-3% (5% in irgen+opt
time) in my measurements. On the other hand this case did not trigger
once in the whole llvm-testsuite.

This patch introduces the notion of ExpensiveCombines which are only
enabled for OptLevel > 2. I removed the check in InstructionSimplify as
that is called from various places where the OptLevel is not known but
given the rarity of the situation I think a check in InstCombine is
enough.

Differential Revision: http://reviews.llvm.org/D16835

llvm-svn: 263047
2016-03-09 18:47:11 +00:00
Petar Jovanovic 921c2b4eb3 Reland r262337 "calculate builtin_object_size if arg is a removable pointer"
Original commit message:
 calculate builtin_object_size if argument is a removable pointer

 This patch fixes calculating correct value for builtin_object_size function
 when pointer is used only in builtin_object_size function call and never
 after that.

 Patch by Strahinja Petrovic.

 Differential Revision: http://reviews.llvm.org/D17337

Reland the original change with a small modification (first do a null check
and then do the cast) to satisfy ubsan.

llvm-svn: 263011
2016-03-09 14:12:47 +00:00
Junmo Park 974eb0a96d Revert "[InstCombine] Combine A->B->A BitCast"
This reverts commit r262670 due to compile failure.

llvm-svn: 262916
2016-03-08 07:09:46 +00:00
Guozhi Wei 92e9d0e80e [InstCombine] Combine A->B->A BitCast
This patch enhances InstCombine to handle following case:

        A  ->  B    bitcast
        PHI
        B  ->  A    bitcast

llvm-svn: 262670
2016-03-03 23:21:38 +00:00
Sanjay Patel 9bba75084b [InstCombine] transform bitcasted bitwise logic ops with constants (PR26702)
Given that we're not actually reducing the instruction count in the included
regression tests, I think we would call this a canonicalization step.

The motivation comes from the example in PR26702:
https://llvm.org/bugs/show_bug.cgi?id=26702

If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable
example of:

define <4 x i32> @is_negative(<4 x i32> %x) {
  %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
  %not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1>
  %bc = bitcast <4 x i32> %not to <2 x i64>
  %notnot = xor <2 x i64> %bc, <i64 -1, i64 -1>
  %bc2 = bitcast <2 x i64> %notnot to <4 x i32>
  ret <4 x i32> %bc2
}

Simplifies to the expected:

define <4 x i32> @is_negative(<4 x i32> %x) {
  %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
  ret <4 x i32> %lobit
}

Differential Revision: http://reviews.llvm.org/D17583

llvm-svn: 262645
2016-03-03 19:19:04 +00:00
Amaury Sechet 3b8b2ea2e1 Explode store of arrays in instcombine
Summary: This is the last step toward supporting aggregate memory access in instcombine. This explodes stores of arrays into a serie of stores for each element, allowing them to be optimized.

Reviewers: joker.eph, reames, hfinkel, majnemer, mgrang

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17828

llvm-svn: 262530
2016-03-02 22:36:45 +00:00
Amaury Sechet 7cd3fe7db6 Unpack array of all sizes in InstCombine
Summary: This is another step toward improving fca support. This unpack load of array in a series of load to array's elements.

Reviewers: chandlerc, joker.eph, majnemer, reames, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15890

llvm-svn: 262521
2016-03-02 21:28:30 +00:00
Sanjay Patel 5e4c46de6d revert r262424 because there's a *clang test* for AArch64 that checks -O3 asm output
that is broken by this change

llvm-svn: 262440
2016-03-02 01:04:09 +00:00
Sanjay Patel 147e927957 [InstCombine] convert 'isPositive' and 'isNegative' vector comparisons to shifts (PR26701)
As noted in the code comment, I don't think we can do the same transform that we do for
*scalar* integers comparisons to *vector* integers comparisons because it might pessimize
the general case. 

Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX: it only has EQ and GT
for integer vectors.

But we should now recognize all the variants of this construct and produce the optimal code
for the cases shown in:
https://llvm.org/bugs/show_bug.cgi?id=26701
 

llvm-svn: 262424
2016-03-01 23:55:18 +00:00
Dehao Chen 1012be120a Perform InstructioinCombiningPass before SampleProfile pass.
Summary: SampleProfile pass needs to be performed after InstructionCombiningPass, which helps eliminate un-inlinable function calls.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17742

llvm-svn: 262419
2016-03-01 22:53:02 +00:00
Owen Anderson 7ea02fc787 Fix an issue where fast math flags were dropped during scalarization.
Most portions of InstCombine properly propagate fast math flags, but
apparently the vector scalarization section was overlooked.

llvm-svn: 262376
2016-03-01 19:35:52 +00:00
Petar Jovanovic 6315f3f9b7 Revert "calculate builtin_object_size if argument is a removable pointer"
Revert r262337 as "check-llvm ubsan" step failed on
sanitizer-x86_64-linux-fast buildbot.

llvm-svn: 262349
2016-03-01 16:50:08 +00:00
Petar Jovanovic 8aef99aa86 calculate builtin_object_size if argument is a removable pointer
This patch fixes calculating correct value for builtin_object_size function
when pointer is used only in builtin_object_size function call and never
after that.

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D17337

llvm-svn: 262337
2016-03-01 14:39:55 +00:00
Sanjay Patel 6f2c01f712 [x86, InstCombine] transform more x86 masked loads to LLVM intrinsics
Continuation of:
http://reviews.llvm.org/rL262269

llvm-svn: 262273
2016-02-29 23:59:00 +00:00
Sanjay Patel 98a71505f5 [x86, InstCombine] transform x86 AVX masked loads to LLVM intrinsics
The intended effect of this patch in conjunction with:
http://reviews.llvm.org/rL259392
http://reviews.llvm.org/rL260145

is that customers using the AVX intrinsics in C will benefit from combines when
the load mask is constant:

__m128 mload_zeros(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(0));
}

__m128 mload_fakeones(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(1));
}

__m128 mload_ones(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(0x80000000));
}

__m128 mload_oneset(float *f) {
  return _mm_maskload_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0));
}

...so none of the above will actually generate a masked load for optimized code.

This is the masked load counterpart to:
http://reviews.llvm.org/rL262064

llvm-svn: 262269
2016-02-29 23:16:48 +00:00
Reid Kleckner 892ae2e2b6 [InstCombine] Be more conservative about removing stackrestore
We ended up removing a save/restore pair around an inalloca call,
leading to a miscompile in Chromium.

llvm-svn: 262095
2016-02-27 00:53:54 +00:00
Sanjay Patel fc7e7ebf36 [x86, InstCombine] transform x86 AVX2 masked stores to LLVM intrinsics
Replicate everything for integers...because x86.

Continuation of:
http://reviews.llvm.org/rL262064

llvm-svn: 262077
2016-02-26 21:51:44 +00:00
Sanjay Patel 1ace99351f [x86, InstCombine] transform x86 AVX masked stores to LLVM intrinsics
The intended effect of this patch in conjunction with:
http://reviews.llvm.org/rL259392
http://reviews.llvm.org/rL260145

is that customers using the AVX intrinsics in C will benefit from combines when
the store mask is constant:

void mstore_zero_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set1_epi32(0), v);
}

void mstore_fake_ones_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set1_epi32(1), v);
}

void mstore_ones_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set1_epi32(0x80000000), v);
}

void mstore_one_set_elt_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0), v);
}

...so none of the above will actually generate a masked store for optimized code.

Differential Revision: http://reviews.llvm.org/D17485

llvm-svn: 262064
2016-02-26 21:04:14 +00:00
Sanjay Patel dbbaca0e1b [InstCombine] enable optimization of casted vector xor instructions
This is part of the payoff for the refactoring in:
http://reviews.llvm.org/rL261649
http://reviews.llvm.org/rL261707

In addition to removing a pile of duplicated code, the xor case was
missing the optimization for vector types because it checked
"SrcTy->isIntegerTy()" rather than "SrcTy->isIntOrIntVectorTy()"
like 'and' and 'or' were already doing.

This solves part of:
https://llvm.org/bugs/show_bug.cgi?id=26702

llvm-svn: 261750
2016-02-24 17:00:34 +00:00
Artur Pilipenko 31bcca47d3 NFC. Move isDereferenceable to Loads.h/cpp
This is a part of the refactoring to unify isSafeToLoadUnconditionally and isDereferenceablePointer functions. In subsequent change I'm going to eliminate isDerferenceableAndAlignedPointer from Loads API, leaving isSafeToLoadSpecualtively the only function to check is load instruction can be speculated.   

Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D16180

llvm-svn: 261736
2016-02-24 12:49:04 +00:00
Sanjay Patel 75b4ae25cb [InstCombine] refactor visitOr() to use foldCastedBitwiseLogic()
Note: The 'and' case in foldCastedBitwiseLogic() is inheriting one extra
check from the nearly identical 'or' case:
  if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src))

But I'm not sure how to expose that difference in a regression test. 
Without that check, the 'or' path will infinite loop on:
test/Transforms/InstCombine/zext-or-icmp.ll
because the zext-or-icmp fold is attempting a reverse transform.

The refactoring should extend to the 'xor' case next to solve part of
PR26702.

llvm-svn: 261707
2016-02-23 23:56:23 +00:00
Sanjay Patel 713f25e0f8 [InstCombine] improve readability ; NFCI
Less indenting, named local variables, more descriptive names.

llvm-svn: 261659
2016-02-23 17:41:34 +00:00
Sanjay Patel 7d0d810ce5 [InstCombine] less indenting; NFC
llvm-svn: 261652
2016-02-23 16:59:21 +00:00
Sanjay Patel 40e7ba0046 [InstCombine] add helper function to foldCastedBitwiseLogic() ; NFCI
This is a straight cut and paste of the existing code and is intended to
be the first step in solving part of PR26702:
https://llvm.org/bugs/show_bug.cgi?id=26702

We should be able to reuse most of this and delete the nearly identical 
existing code in visitOr(). Then, we can enhance visitXor() to use the
same code too.

llvm-svn: 261649
2016-02-23 16:36:07 +00:00
Justin Lebar ccbd8f5a02 Revert "[attrs] Handle convergent CallSites."
This reverts r261544, which was causing a test failure in
Transforms/FunctionAttrs/readattrs.ll.

llvm-svn: 261549
2016-02-22 18:24:43 +00:00
Justin Lebar 7bf9187abb [attrs] Handle convergent CallSites.
Summary:
Previously we had a notion of convergent functions but not of convergent
calls.  This is insufficient to correctly analyze calls where the target
is unknown, e.g. indirect calls.

Now a call is convergent if it targets a known-convergent function, or
if it's explicitly marked as convergent.  As usual, we can remove
convergent where we can prove that no convergent operations are
performed in the call.

Reviewers: chandlerc, jingyue

Subscribers: hfinkel, jhen, tra, llvm-commits

Differential Revision: http://reviews.llvm.org/D17317

llvm-svn: 261544
2016-02-22 17:51:35 +00:00
Sanjay Patel 2440130437 fix inaccurate comment; NFC
llvm-svn: 261484
2016-02-21 17:33:31 +00:00
Sanjay Patel 368ac5dbf7 [InstCombine] add getNegativeIsTrueBoolVec() helper function; NFC
Originally part of:
http://reviews.llvm.org/D17485

We need this when simplifying masked memory ops too.

llvm-svn: 261483
2016-02-21 17:29:33 +00:00
Simon Pilgrim 471efd244a [InstCombine] SSE/SSE2 (u)comiss/(u)comisd comparison intrinsics only use the lowest vector element
llvm-svn: 261460
2016-02-20 23:17:35 +00:00
Chandler Carruth ac07270828 [AA] Preserve the AA results wrapper pass as well as BasicAA in a few
more places to prevent gratuitous re-"runs" of these passes.

The passes themselves don't do any work when run, but we keep spending
time scheduling and running these needlessly when we really don't need
to do so.

This is the first patch towards fixing the really horrible loop pass
pipeline fragmentation pointed out by Sanjoy in PR24804.

llvm-svn: 261302
2016-02-19 03:12:14 +00:00
Richard Trieu 7a08381403 Remove uses of builtin comma operator.
Cleanup for upcoming Clang warning -Wcomma.  No functionality change intended.

llvm-svn: 261270
2016-02-18 22:09:30 +00:00
Amaury Sechet da71cb7b92 NFC: Fix formating
llvm-svn: 261156
2016-02-17 21:21:29 +00:00
Amaury Sechet 61a7d629ec Fix load alignement when unpacking aggregates structs
Summary: Store and loads unpacked by instcombine do not always have the right alignement. This explicitely compute the alignement and set it.

Reviewers: dblaikie, majnemer, reames, hfinkel, joker.eph

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17326

llvm-svn: 261139
2016-02-17 19:21:28 +00:00
David Majnemer 0f0abc7bc2 [InstCombine] Don't aggressively replace xor with icmp
For some cases, InstCombine replaces the sequence of xor/sub instruction
followed by cmp instruction into a single cmp instruction.

However, this replacement may result suboptimal result especially when
the xor/sub has more than one use, as discussed in
bug 26465 (https://llvm.org/bugs/show_bug.cgi?id=26465).

This patch make the replacement happen only when xor/sub has only one
use.

Differential Revision: http://reviews.llvm.org/D16915

Patch by Taewook Oh!

llvm-svn: 260695
2016-02-12 18:12:38 +00:00
Quentin Colombet 490cfbe2a2 Re-apply r238452, the bug was in clang and was fixed in r260567.
Original commit message:
[InstCombine] Fold IntToPtr and PtrToInt into preceding loads.

Currently we only fold a BitCast into a Load when the BitCast is its
only user.

Do the same for any no-op cast.

Patch by Philip Pfaffe!

Differential Revision: http://reviews.llvm.org/D9152

llvm-svn: 260612
2016-02-11 22:30:41 +00:00
Pete Cooper 5562c333b8 Set load alignment on aggregate loads.
When optimizing a extractvalue(load), we generate a load from the
aggregate type.  This load didn't have alignment set and so would
get the alignment of the type.  This breaks when the type is packed
and so the alignment should be lower.

For example, loading { int, int } would give us alignment of 4, but
the original load from this type may have an alignment of 1 if packed.

Reviewed by David Majnemer

Differential revision: http://reviews.llvm.org/D17158

llvm-svn: 260587
2016-02-11 21:10:40 +00:00
Jun Bum Lim 10e58e867d Fixed typo in r260530
llvm-svn: 260541
2016-02-11 16:46:13 +00:00
Jun Bum Lim 339e9723c1 [InstCombine] Simplify a known nonzero incoming value of PHI
Summary:
When a PHI is used only to be compared with zero, it is possible to replace an
incoming value with any non-zero constant if the incoming value can be proved as
a known nonzero value. For example, in below code, we can replace the incoming value %v with
any non-zero constant based on the fact that the PHI is only used to be compared with zero
and %v is a known non-zero value:
  %v = select %cond, 1, 2
  %p = phi [%v, BB] ...
  %c = icmp eq, %p, 0

Reviewers: mcrosier, jmolloy, sanjoy

Subscribers: hfinkel, mcrosier, majnemer, llvm-commits, haicheng, bmakam, mssimpso, gberry

Differential Revision: http://reviews.llvm.org/D16240

llvm-svn: 260530
2016-02-11 15:50:07 +00:00
Artur Pilipenko 44e7c51b05 Don't propagate dereferenceable attribute through gc.relocate in InstCombine
Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D16143

llvm-svn: 260509
2016-02-11 11:22:46 +00:00
Philip Reames ea4d8e8ce9 [InstCombine][GC] Handle gc.relocations of vector type
We introduced gc.relocates of vector-of-pointer types a couple of weeks back.  Somehow, I missed updating the InstCombine rule to account for this.  If we hit this code path with a vector-of-pointers gc.relocate, we'd crash on a cast<PointerType>.

I also took the chance to do a bit of code style cleanup.

llvm-svn: 260279
2016-02-09 21:09:22 +00:00
Quentin Colombet 7ec03dc7f8 [InstCombine] Revert r238452: Fold IntToPtr and PtrToInt into preceding loads.
According to git bisect, this is the root cause of a miscompile for Regex in
libLLVMSupport. I am still working on reducing a test case.
The actual bug may be elsewhere and this commit just exposed it.

Anyway, at the moment, to reproduce, follow these steps:
1. Build clang and libLTO in release mode.
2. Create a new build directory <stage2> and cd into it.
3. Use clang and libLTO from #1 to build llvm-extract in Release mode + asserts
   using -O2 -flto
4. Run llvm-extract  -ralias '.*bar' -S test/Other/extract-alias.ll

Result:
program doesn't contain global named '.*bar'!

Expected result:
@a0a0bar = alias void ()* @bar
@a0bar = alias void ()* @bar

declare void @bar()

Note: In step #3, if you don't use lto or asserts, the miscompile disappears.
llvm-svn: 259674
2016-02-03 18:04:13 +00:00