Commit Graph

8820 Commits

Author SHA1 Message Date
Teresa Johnson 720d9b4111 Fix code section prefix for proper layout
Summary:
r284533 added hot and cold section prefixes based on profile
information, to enable grouping of hot/cold functions at link time.
However, it used "cold" as the prefix for cold sections, but gold only
recognizes "unlikely" (which is used by gcc for cold sections).
Therefore, cold sections were not properly being grouped. Switch to
using "unlikely"

Reviewers: danielcdh, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32983

llvm-svn: 302502
2017-05-09 01:43:24 +00:00
Sanjoy Das 9c52c7656c Add basic test case for -instnamer
llvm-svn: 302482
2017-05-08 23:18:46 +00:00
Sanjay Patel 2a895ce134 [InstCombine] add tests from D32285 to show current problems; NFC
llvm-svn: 302475
2017-05-08 22:33:20 +00:00
Sanjay Patel a1c8814891 [InstCombine] add folds for not-of-shift-right
This is another step towards getting rid of dyn_castNotVal, 
so we can recommit:
https://reviews.llvm.org/rL300977

As the tests show, we were missing the lshr case for constants
and both ashr/lshr vector splat folds. The ashr case with constant
was being performed inefficiently in 2 steps. It's also possible
there was a latent bug in that case because we can't do that fold
if the constant is positive:
http://rise4fun.com/Alive/Bge

llvm-svn: 302465
2017-05-08 20:49:59 +00:00
Sanjay Patel 322db476f3 [InstCombine] move/add tests for not(shr (not X), Y); NFC
llvm-svn: 302451
2017-05-08 18:16:04 +00:00
Daniel Berlin 0f2af7f93b ConstantFold: Handle gep nonnull, undef as well
llvm-svn: 302447
2017-05-08 17:37:33 +00:00
Daniel Berlin 74ffa5c62f ConstantFold: Fold getelementptr (i32, i32* null, i64 undef) to null.
Transforms/IndVarSimplify/2011-10-27-lftrnull will fail if this regresses.
Transforms/GVN/PRE/2011-06-01-NonLocalMemdepMiscompile.ll has been changed to still test what it was
trying to test.

llvm-svn: 302446
2017-05-08 17:37:29 +00:00
Craig Topper 868813ffbb [ValueTracking] Use KnownOnes to provide a better bound on known zeros for ctlz/cttz intrinics
This patch uses KnownOnes of the input of ctlz/cttz to bound the value that can be returned from these intrinsics. This makes these intrinsics more similar to the handling for ctpop which already uses known bits to produce a similar bound.

Differential Revision: https://reviews.llvm.org/D32521

llvm-svn: 302444
2017-05-08 17:22:34 +00:00
Sanjay Patel 0fbdaa1f0c [InstCombine] add another test for PR32949; NFC
A patch for the InstSimplify variant of this bug is up for review here:
https://reviews.llvm.org/D32954

llvm-svn: 302434
2017-05-08 15:58:57 +00:00
Sanjay Patel 390f1dc6ba [InstSimplify] add tests for PR32949 miscompile; NFC
llvm-svn: 302374
2017-05-07 18:19:13 +00:00
Zvi Rackover 973ff7c74c InstructionSimplify: Relanding r301766
Summary:
Re-applying r301766 with a fix to a typo and a regression test.

The log message for r301766 was:
==================================================================================
    InstructionSimplify: Canonicalize shuffle operands. NFC-ish.

    Summary:
     Apply canonicalization rules:
        1. Input vectors with no elements selected from can be replaced with undef.
        2. If only one input vector is constant it shall be the second one.

    This allows constant-folding to cover more ad-hoc simplifications that
    were in place and avoid duplication for RHS and LHS checks.

    There are more rules we may want to add in the future when we see a
    justification. e.g. mask elements that select undef elements can be
    replaced with undef.
==================================================================================

Reviewers: spatel, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32863

llvm-svn: 302373
2017-05-07 18:16:37 +00:00
Sanjay Patel 599e65b1ff [InstSimplify] use ConstantRange to simplify or-of-icmps
We can simplify (or (icmp X, C1), (icmp X, C2)) to 'true' or one of the icmps in many cases.
I had to check some of these with Alive to prove to myself it's right, but everything seems 
to check out. Eg, the deleted code in instcombine was completely ignoring predicates with
mismatched signedness.

This is a follow-up to:
https://reviews.llvm.org/rL301260
https://reviews.llvm.org/D32143

llvm-svn: 302370
2017-05-07 15:11:40 +00:00
Sanjay Patel 3bf1d6b763 [InstSimplify] fix copy-paste mistake in test comments; NFC
llvm-svn: 302251
2017-05-05 16:24:58 +00:00
Sanjay Patel 34cd5e4307 [InstSimplify] add tests for (icmp X, C1 | icmp X, C2); NFC
These are the 'or' counterparts for the tests added with r300493.

llvm-svn: 302248
2017-05-05 16:12:05 +00:00
Aditya Kumar 1c42d135e1 [LoopIdiom] check for safety while expanding
Loop Idiom recognition was generating memset in a case that
would result generating a division operation to an unsafe location.

Differential Revision: https://reviews.llvm.org/D32674

llvm-svn: 302238
2017-05-05 14:49:45 +00:00
Martin Storsjo 2b0fae877e [ArgPromotion] Add a testcase for PR32917
Differential Revision: https://reviews.llvm.org/D32882

llvm-svn: 302216
2017-05-05 08:40:24 +00:00
Dehao Chen a75d0da91b Update VP prof metadata during inlining.
Summary: r298270 added profile update logic for branch_weights. This patch implements profile update logic for VP prof metadata too.

Reviewers: eraman, tejohnson, davidxl

Reviewed By: eraman

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32773

llvm-svn: 302209
2017-05-05 00:47:34 +00:00
Sanjay Patel e42b4d566e [InstSimplify] add folds for or-of-casted-icmps
The sibling folds for 'and' with casts were added with https://reviews.llvm.org/rL273200.
This is a preliminary step for adding the 'or' variants for the folds added with https://reviews.llvm.org/rL301260.

The reason for the strange form with constant LHS in the 1st test is because there's another missing fold in that
case for the inverted predicate. That should be fixed when we add the ConstantRange functionality for 'or-of-icmps' 
that already exists for 'and-of-icmps'.

I'm hoping to share more code for the and/or cases, so we won't have these differences. This will allow us to remove
code from InstCombine. It's also possible that we can remove some code here in InstSimplify. I think we have some 
duplicated folds because patterns are not matched in a general way.

Differential Revision: https://reviews.llvm.org/D32876

llvm-svn: 302189
2017-05-04 19:51:34 +00:00
Sanjay Patel 500e5122d3 [InstSimplify] add tests for or-of-casted-icmps; NFC
llvm-svn: 302174
2017-05-04 17:36:53 +00:00
Easwaran Raman 5e6f9bd4f8 [PM] Add ProfileSummaryAnalysis as a required pass in the new pipeline.
Differential revision: https://reviews.llvm.org/D32768

llvm-svn: 302170
2017-05-04 16:58:45 +00:00
Adrian Prantl defc99a94e Cleanup tests to not share a DISubprogram between multiple Functions.
rdar://problem/31926379

llvm-svn: 302166
2017-05-04 16:24:31 +00:00
Matthias Braun 739a7b2f9c strlen-1.ll: Fix test
Change test for `strlen(x) == 0 --> *x == 0` to actually test the
pattern.

llvm-svn: 302094
2017-05-03 23:32:51 +00:00
Craig Topper cff357c322 [InstCombine][KnownBits] Use KnownBits better to detect nsw adds
Change checkRippleForAdd from a heuristic to a full check -
if it is provable that the add does not overflow return true, otherwise false.

Patch by Yoav Ben-Shalom

Differential Revision: https://reviews.llvm.org/D32686

llvm-svn: 302093
2017-05-03 23:22:46 +00:00
Elad Cohen ef5798acf5 Support arbitrary address space pointers in masked gather/scatter intrinsics.
Fixes PR31789 - When loop-vectorize tries to use these intrinsics for a
non-default address space pointer we fail with a "Calling a function with a
bad singature!" assertion. This patch solves this by adding the 'vector of
pointers' argument as an overloaded type which will determine the address
space.

Differential revision: https://reviews.llvm.org/D31490

llvm-svn: 302018
2017-05-03 12:28:54 +00:00
Anna Thomas 53c8d95c85 [Loop Deletion] Delete loops that are never executed
Summary:
Currently, loop deletion deletes loop where the only values
that are used outside the loop are loop-invariant.
This patch adds logic to delete loops where the loop is proven to be
never executed (i.e. the only predecessor of the loop preheader has a
constant conditional branch as terminator, and the preheader is not the
taken target). This will remove loops that become dead after
loop-unswitching generates constant conditional branches.

The next steps are:
1. moving the loop deletion implementation to LoopUtils.
2. Add logic in loop-simplifyCFG which will support changing conditional
constant branches to unconditional branches. If loops become unreachable in this
process, they can be removed using `deleteDeadLoop` function.

Reviewers: chandlerc, efriedma, sanjoy, reames

Reviewed by: sanjoy

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D32494

llvm-svn: 302015
2017-05-03 11:47:11 +00:00
Matt Arsenault 6a288c1e32 Replace hardcoded intrinsic list with speculatable attribute.
No change in which intrinsics should be speculated.

llvm-svn: 301995
2017-05-03 02:26:10 +00:00
Peter Collingbourne e95901caa4 Revert r295861, "[ModuleSummaryAnalysis] Don't crash when referencing unnamed globals."
We should always expect values to be named before running the module summary
analysis (see NameAnonGlobals pass), so it's fine if we crash in that case.

llvm-svn: 301991
2017-05-03 00:18:48 +00:00
Xinliang David Li ab8722f80a [PartialInlining] Add more early filtering
This is a follow up to the previous
inline cost patch for quicker filtering.

llvm-svn: 301959
2017-05-02 18:43:21 +00:00
Matt Arsenault 9ac7d6be3c SpeculativeExecution: Stop using whitelist for costs
Just let TTI's cost do this instead of arbitrarily restricting
this.

llvm-svn: 301950
2017-05-02 18:02:18 +00:00
Matt Arsenault 7b82b4bddb AMDGPU: Make intrinsics speculatable
llvm-svn: 301937
2017-05-02 16:57:44 +00:00
Sanjay Patel 6381db18fe [InstCombine] don't use DeMorgan's Law on integer constants (2nd try)
This was originally checked in here:
https://reviews.llvm.org/rL301923

And reverted here:
https://reviews.llvm.org/rL301924

Because there's a clang test that would fail after this. I fixed/removed the
offending CHECK lines in:
https://reviews.llvm.org/rL301928

So let's try this again. Original commit message:

This is the fold that causes the infinite loop in BoringSSL
(https://github.com/google/boringssl/blob/master/crypto/cipher/e_rc2.c)
when we fix instcombine demanded bits to prefer 'not' ops as in https://reviews.llvm.org/D32255.

There are 2 or 3 problems with dyn_castNotVal, and I don't think we can
reinstate https://reviews.llvm.org/D32255 until dyn_castNotVal is completely eliminated.

1. As shown here, it transforms 'not' into random xor. This transform is harmful to SCEV and codegen because 'not' can often be folded while random xor cannot.
2. It does not transform vector constants. This is actually a good thing, but if you don't believe the above argument, then we shouldn't have excluded vectors.
3. It tries to avoid transforming not(not(X)). That's nice, but it doesn't match the greedy nature of instcombine. If we DeMorganize a pattern that has an extra 'not' in it: ~(~(~X) & Y) --> (~X | ~Y)

  That's just another case of DeMorgan, so we should trust that we'll fold that pattern too: (~X | ~ Y) --> ~(X & Y)

Differential Revision: https://reviews.llvm.org/D32665

llvm-svn: 301929
2017-05-02 15:31:40 +00:00
Sanjay Patel da0b4deafa revert r301923 : [InstCombine] don't use DeMorgan's Law on integer constants
There's a clang test that is wrongly using -O1 and failing after this commit.

llvm-svn: 301924
2017-05-02 14:48:23 +00:00
Sanjay Patel 096a981982 [InstCombine] don't use DeMorgan's Law on integer constants
This is the fold that causes the infinite loop in BoringSSL 
(https://github.com/google/boringssl/blob/master/crypto/cipher/e_rc2.c) 
when we fix instcombine demanded bits to prefer 'not' ops as in D32255.

There are 2 or 3 problems with dyn_castNotVal, and I don't think we can 
reinstate D32255 until dyn_castNotVal is completely eliminated.
1. As shown here, it transforms 'not' into random xor. This transform is 
   harmful to SCEV and codegen because 'not' can often be folded while 
   random xor cannot.
2. It does not transform vector constants. This is actually a good thing, 
   but if you don't believe the above argument, then we shouldn't have 
   excluded vectors.
3. It tries to avoid transforming not(not(X)). That's nice, but it doesn't
   match the greedy nature of instcombine. If we DeMorganize a pattern 
   that has an extra 'not' in it:
   ~(~(~X) & Y) --> (~X | ~Y)

   That's just another case of DeMorgan, so we should trust that we'll fold
   that pattern too:
   (~X | ~ Y) --> ~(X & Y)

Differential Revision: https://reviews.llvm.org/D32665

llvm-svn: 301923
2017-05-02 14:31:30 +00:00
Xinliang David Li 6133846be1 [PartialInlining] Hook up inline cost analysis
Differential Revision: http://reviews.llvm.org/D32666

llvm-svn: 301894
2017-05-02 02:44:14 +00:00
George Burgess IV 7bc507a2e8 Revert r301880
This change caused buildbot failures, apparently because we're not
passing around types that InstSimplify is used to seeing. I'm not overly
familiar with InstSimplify, so I'm reverting this until I can figure out
what exactly is wrong.

llvm-svn: 301885
2017-05-01 23:54:41 +00:00
George Burgess IV 6935aefdf0 [InstSimplify] Handle selects of GEPs with 0 offset
In particular (since it wouldn't fit nicely in the summary):
(select (icmp eq V 0) P (getelementptr P V)) -> (getelementptr P V)

Differential Revision: https://reviews.llvm.org/D31435

llvm-svn: 301880
2017-05-01 23:12:08 +00:00
Davide Italiano 2dfd46bf08 [NewGVN] Don't derive incorrect implications.
In the testcase attached,  we believe %tmp1 implies %tmp4.
where:
  br i1 %tmp1, label %bb2, label %bb7
  br i1 %tmp4, label %bb5, label %bb7

because Wwhile looking at PredicateInfo stuffs we end up calling
isImpliedTrueByMatchingCmp() with the arguments backwards.

Differential Revision:  https://reviews.llvm.org/D32718

llvm-svn: 301849
2017-05-01 22:26:28 +00:00
Sanjay Patel 59d0aeaafe [InstCombine] check one-use before applying DeMorgan nor/nand folds
If we have ~(~X & Y), it only makes sense to transform it to (X | ~Y) when we do not need 
the intermediate (~X & Y) value. In that case, we would need an extra instruction to 
generate ~Y + 'or' (as shown in the test changes).

It's ok if we have multiple uses of ~X or Y, however. In those cases, we may not reduce the
instruction count or critical path, but we might improve throughput because we can generate 
~X and ~Y in parallel. Whether that actually makes perf sense or not for a target is something 
we can't answer in IR.

Differential Revision: https://reviews.llvm.org/D32703

llvm-svn: 301848
2017-05-01 22:25:42 +00:00
Xin Tong a4b9b9f42a Take indirect branch into account as well when folding.
We may not be able to rewrite indirect branch target, but we also want to take it into
account when folding, i.e. if it and all its successor's predecessors go to the same
destination, we can fold, i.e. no need to thread.

llvm-svn: 301816
2017-05-01 17:15:37 +00:00
Xin Tong 21f8ac235e [JumpThread] Do RAUW in case Cond folds to a constant in the CFG
Summary: [JumpThread] Do RAUW in case Cond folds to a constant in the CFG

Reviewers: sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32407

llvm-svn: 301804
2017-05-01 15:34:17 +00:00
Sanjay Patel d2f13b62d9 [InstCombine] add multi-use variants for DeMorgan folds; NFC
llvm-svn: 301802
2017-05-01 14:52:17 +00:00
Sanjay Patel c526fbcfa9 [InstCombine] use FileCheck and auto-generate checks; NFC
llvm-svn: 301801
2017-05-01 14:20:30 +00:00
Sanjay Patel 4e312203af [InstCombine] consolidate more DeMorgan tests; NFC
llvm-svn: 301800
2017-05-01 14:10:59 +00:00
Sanjay Patel 0c6086f493 [InstCombine] consolidate tests for DeMorgan folds; NFC
I'm proposing to add tests and change behavior in D32665.

llvm-svn: 301774
2017-04-30 18:57:12 +00:00
Zvi Rackover 4086e13e0d InstructionSimplify: Simplify a shuffle with a undef mask to undef
Summary:
Following the discussion in pr32486, adding the simplification:
 shuffle %x, %y, undef -> undef

Reviewers: spatel, RKSimon, andreadb, davide

Reviewed By: spatel

Subscribers: jroelofs, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D32293

llvm-svn: 301764
2017-04-30 06:06:26 +00:00
Akira Hatanaka 6fdcb3c2ce [ObjCARC] Do not move a release between a call and a
retainAutoreleasedReturnValue that retains the returned value.

This commit fixes a bug in ARC optimizer where it moves a release
between a call and a retainAutoreleasedReturnValue, causing the returned
object to be released before the retainAutoreleasedReturnValue can
retain it.

This commit accomplishes that by doing a lookahead and checking whether
the call prevents the release from moving upwards. In the long term, we
should treat the region between the retainAutoreleasedReturnValue and
the call as a critical section and disallow moving anything there
(possibly using operand bundles).

rdar://problem/20449878

llvm-svn: 301724
2017-04-29 00:23:11 +00:00
Davide Italiano 534e314356 [LoopUnswitch] Don't remove instructions with side effects.
This fixes PR32818.

Differential Revision:  https://reviews.llvm.org/D32664

llvm-svn: 301722
2017-04-29 00:12:18 +00:00
Sanjay Patel c8ab6bb27d [InstCombine] add tests to show potentially bogus application of DeMorgan (NFC)
llvm-svn: 301714
2017-04-28 23:14:33 +00:00
Matt Arsenault e0f9e984fd InferAddressSpaces: Search constant expressions for addrspacecasts
These are pretty common when using local memory, and the 64-bit generic
addressing is much more expensive to compute.

llvm-svn: 301711
2017-04-28 22:52:41 +00:00
Matt Arsenault a1e734050c InferAddressSpaces: Infer from just addrspacecasts
Eliminates some more cases where some subset of the addressing
computation remains flat. Some cases with addrspacecasts
in nested constant expressions are still left behind however.

llvm-svn: 301704
2017-04-28 22:18:08 +00:00