Commit Graph

12941 Commits

Author SHA1 Message Date
Sam Elliott 96c8bc7956 [RISCV] Add RISCV-specific TargetTransformInfo
Summary:
LLVM Allows Targets to provide information that guides optimisations
made to LLVM IR. This is done with callbacks on a TargetTransformInfo object.

This patch adds a TargetTransformInfo class for RISC-V. This will allow us to
implement RISC-V specific callbacks as they become necessary.

This commit also adds the getIntImmCost callbacks, and tests them with a simple
constant hoisting test. Our immediate costs are on the conservative side, for
the moment, but we prevent hoisting in most circumstances anyway.

Previous review was on D63007

Reviewers: asb, luismarques

Reviewed By: asb

Subscribers: ributzka, MaskRay, llvm-commits, Jim, benna, psnobl, jocewei, PkmX, rkruppe, the_o, brucehoult, MartinMosbeck, rogfer01, edward-jones, zzheng, jrtc27, shiva0217, kito-cheng, niosHD, sabuasal, apazos, simoncook, johnrusso, rbar, hiraditya, mgorny

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63433

llvm-svn: 364046
2019-06-21 13:36:09 +00:00
Cameron McInally 1c0bd6dd2c [Reassociate] Remove bogus assert reported in PR42349.
Also, add a FIXME for the unsafe transform on a unary FNeg. A unary FNeg can only be transformed to a FMul by -1.0 when the nnan flag is present. The unary FNeg project is a WIP, so the unsafe transformation is acceptable until that work is complete.

The bogus assert with introduced in D63445.

llvm-svn: 363998
2019-06-20 23:03:55 +00:00
Sanjay Patel b342f026a4 [InstSimplify] simplify power-of-2 (single bit set) sequences
As discussed in PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314

Improving the canonicalization for these patterns:
rL363956
...means we should adjust/enhance the related simplification.

https://rise4fun.com/Alive/w1cp

  Name: isPow2 or zero
  %x = and i32 %xx, 2048
  %a = add i32 %x, -1
  %r = and i32 %a, %x
  =>
  %r = i32 0

llvm-svn: 363997
2019-06-20 22:55:28 +00:00
Alina Sbirlea d0b11698cd [LICM & MSSA] Limit unsafe sinking and hoisting.
Summary:
The getClobberingMemoryAccess API checks for clobbering accesses in a loop by walking the backedge. This may check if a memory access is being
clobbered by the loop in a previous iteration, depending how smart AA got over the course of the updates in MemorySSA (it does not occur when built from scratch).
If no clobbering access is found inside the loop, it will optimize to an access outside the loop. This however does not mean that access is safe to sink.
Given:
```
for i
  load a[i]
  store a[i]
```
The access corresponding to the load can be optimized to outside the loop, and the load can be hoisted. But it is incorrect to sink it.
In order to sink the load, we'd need to check no Def clobbers the Use in the same iteration. With this patch we currently restrict sinking to either
Defs not existing in the loop, or Defs preceding the load in the same block. An easy extension is to ensure the load (Use) post-dominates all Defs.

Caught by PR42294.

This issue also shed light on the converse problem: hoisting stores in this same scenario would be illegal. With this patch we restrict
hoisting of stores to the case when their corresponding Defs are dominating all Uses in the loop.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63582

llvm-svn: 363982
2019-06-20 21:09:09 +00:00
Sanjay Patel 3207566dd6 [InstSimplify] add tests for known-not-a-power-of-2; NFC
I added a canonicalization to create this general pattern in:
rL363956

But as noted in PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314#c11

...we have a (potentially expensive) simplification for the version
of the code that we just canonicalized away from, so we should
add/adjust that code to match.

llvm-svn: 363981
2019-06-20 21:04:14 +00:00
Cameron McInally 9589db7a98 [NFC][SLP] Pre-commit unary FNeg test to X86/propagate_ir_flags.ll
llvm-svn: 363978
2019-06-20 20:53:51 +00:00
David Bolvansky e0c1c3baf9 [NFC] Updated tests for D63546
llvm-svn: 363967
2019-06-20 19:30:56 +00:00
Sanjay Patel 63311bfb83 [InstCombine] canonicalize check for power-of-2
The form that compares against 0 is better because:
1. It removes a use of the input value.
2. It's the more standard form for this pattern: https://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2
3. It results in equal or better codegen (tested with x86, AArch64, ARM, PowerPC, MIPS).

This is a root cause for PR42314, but probably doesn't completely answer the codegen request:
https://bugs.llvm.org/show_bug.cgi?id=42314

Alive proof:
https://rise4fun.com/Alive/9kG

  Name: is power-of-2
  %neg = sub i32 0, %x
  %a = and i32 %neg, %x
  %r = icmp eq i32 %a, %x
  =>
  %dec = add i32 %x, -1
  %a2 = and i32 %dec, %x
  %r = icmp eq i32 %a2, 0

  Name: is not power-of-2
  %neg = sub i32 0, %x
  %a = and i32 %neg, %x
  %r = icmp ne i32 %a, %x
  =>
  %dec = add i32 %x, -1
  %a2 = and i32 %dec, %x
  %r = icmp ne i32 %a2, 0

llvm-svn: 363956
2019-06-20 17:41:15 +00:00
Philip Reames 8c80d08052 [Tests] Add a tricky LFTR case for documentation purposes
Thought of this case while working on something else.  We appear to get it right in all of the variations I tried, but that's by accident.  So, add a test which would catch the potential bug.

llvm-svn: 363953
2019-06-20 17:16:53 +00:00
David Bolvansky 01511192b2 [InstCombine] cttz(-x) -> cttz(x)
Summary: Signedness does not change number of trailing zeros.

Reviewers: spatel, lebedev.ri, nikic

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63534

llvm-svn: 363951
2019-06-20 17:04:14 +00:00
Sanjay Patel d729ed8d44 [InstCombine] add commuted variants for power-of-2 checks; NFC
llvm-svn: 363945
2019-06-20 16:27:23 +00:00
Sanjay Patel 345473c791 [InstCombine] add tests for checking power-of-2; NFC
llvm-svn: 363938
2019-06-20 15:25:18 +00:00
Cameron McInally 4452c3b490 [NFC][SLP] Pre-commit unary FNeg test to X86/phi3.ll
llvm-svn: 363937
2019-06-20 15:17:17 +00:00
Simon Pilgrim 72186a2494 [SLP][X86] Add lookahead reordering tests from D60897
llvm-svn: 363925
2019-06-20 12:52:58 +00:00
Philip Reames eda1ba65ca LFTR for multiple exit loops
Teach IndVarSimply's LinearFunctionTestReplace transform to handle multiple exit loops. LFTR does two key things 1) it rewrites (all) exit tests in terms of a common IV potentially eliminating one in the process and 2) it moves any offset/indexing/f(i) style logic out of the loop.

This turns out to actually be pretty easy to implement. SCEV already has all the information we need to know what the backedge taken count is for each individual exit. (We use that when computing the BE taken count for the loop as a whole.) We basically just need to iterate through the exiting blocks and apply the existing logic with the exit specific BE taken count. (The previously landed NFC makes this super obvious.)

I chose to go ahead and apply this to all loop exits instead of only latch exits as originally proposed. After reviewing other passes, the only case I could find where LFTR form was harmful was LoopPredication. I've fixed the latch case, and guards aren't LFTRed anyways. We'll have some more work to do on the way towards widenable_conditions, but that's easily deferred.

I do want to note that I added one bit after the review.  When running tests, I saw a new failure (no idea why didn't see previously) which pointed out LFTR can rewrite a constant condition back to a loop varying one.  This was theoretically possible with a single exit, but the zero case covered it in practice.  With multiple exits, we saw this happening in practice for the eliminate-comparison.ll test case because we'd compute a ExitCount for one of the exits which was guaranteed to never actually be reached.  Since LFTR ran after simplifyAndExtend, we'd immediately turn around and undo the simplication work we'd just done.  The solution seemed obvious, so I didn't bother with another round of review.

Differential Revision: https://reviews.llvm.org/D62625

llvm-svn: 363883
2019-06-19 21:58:25 +00:00
Philip Reames 80eb1ce7a0 [Tests] Autogen a test so that future changes are understandable
llvm-svn: 363882
2019-06-19 21:39:07 +00:00
Huihui Zhang 670778c762 [InstCombine] Fold icmp eq/ne (and %x, signbit), 0 -> %x s>=/s< 0 earlier
Summary:
To generate simplified IR, make sure fold
```
  (X & signbit) ==/!= 0) -> X s>=/s< 0;
```
is scheduled before fold
```
  ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0.
```

https://rise4fun.com/Alive/fbdh

Reviewers: lebedev.ri, efriedma, spatel, craig.topper

Reviewed By: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63026

llvm-svn: 363845
2019-06-19 17:31:39 +00:00
Sanjay Patel 3e03bf6921 [InstSimplify] add a phi test with 1 incoming value; NFC
D63489 proposes to change this behavior, but there's no
direct -instsimplify test to verify that the transform exists.

llvm-svn: 363842
2019-06-19 17:23:29 +00:00
Hubert Tong e9983eed5a [NFC][LSR] Avoid undefined grep in pr2570.ll
greater-than-sign is not a BRE special character.

POSIX.1-2017 XBD Section 9.3.2 indicates that the interpretation of `\>`
is undefined. This patch replaces the pattern.

llvm-svn: 363828
2019-06-19 16:02:54 +00:00
Cameron McInally a027cf4764 [Reassociate] Handle unary FNeg in the Reassociate pass
Differential Revision: https://reviews.llvm.org/D63445

llvm-svn: 363813
2019-06-19 14:59:14 +00:00
David Bolvansky e3cd19d330 [NFC] Added tests for D63534
llvm-svn: 363796
2019-06-19 12:59:37 +00:00
David Bolvansky 21fd232385 [NFC] Added tests for cttz(abs(x)) -> cttz(x) fold
llvm-svn: 363795
2019-06-19 12:55:39 +00:00
Orlando Cazalet-Hyams 1251cac62a [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024

The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:

A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.

In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.

I have set up a separate review D61933 for a fix which is required for this patch.

Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse

Reviewed By: hfinkel, jmorse

Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D60831

> llvm-svn: 363046

llvm-svn: 363786
2019-06-19 10:50:47 +00:00
Jay Foad 45d19fb470 [ConstantFolding] Fix assertion failure on non-power-of-two vector load.
Summary:
The test case does an (out of bounds) load from a global constant with
type <3 x float>. InstSimplify tried to turn this into an integer load
of the whole alloc size of the vector, which is 128 bits due to
alignment padding, and then bitcast this to <3 x vector> which failed
an assertion due to the type size mismatch.

The fix is to do an integer load of the normal size of the vector, with
no alignment padding.

Reviewers: tpr, arsenm, majnemer, dstuttard

Reviewed By: arsenm

Subscribers: hfinkel, wdng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63375

llvm-svn: 363784
2019-06-19 10:28:48 +00:00
Michael Liao 4f7f70e262 Recommit [SROA] Enhance SROA to handle `addrspacecast`ed allocas
[SROA] Enhance SROA to handle `addrspacecast`ed allocas

- Fix typo in original change
- Add additional handling to ensure all return pointers are properly
  casted.

Summary:
- After `addrspacecast` is allowed to be eliminated in SROA, the
  adjusting of storage pointer (from `alloca) needs to handle the
  potential different address spaces between the storage pointer (from
  alloca) and the pointer being used.

Reviewers: arsenm

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63501

llvm-svn: 363743
2019-06-18 21:41:13 +00:00
Matt Arsenault e8d8bb5170 InstCombine: Pre-commit test for reassociating nuw
D39417

llvm-svn: 363741
2019-06-18 21:32:51 +00:00
Adrian Prantl 1db8d4a866 Fix broken debug info in in an !llvm.loop attachment in this testcase.
llvm-svn: 363730
2019-06-18 20:07:53 +00:00
Jordan Rupprecht 33e85ad956 Revert [SROA] Enhance SROA to handle `addrspacecast`ed allocas
This reverts r363711 (git commit 76a149ef81)

This causes stage2 build failures, e.g.:
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/132/steps/stage%202%20build/logs/stdio
http://lab.llvm.org:8011/builders/ppc64le-lld-multistage-test/builds/87/steps/build-stage2-unified-tree/logs/stdio

llvm-svn: 363718
2019-06-18 18:40:04 +00:00
Michael Liao 76a149ef81 [SROA] Enhance SROA to handle `addrspacecast`ed allocas
Summary:
- After `addrspacecast` is allowed to be eliminated in SROA, the
  adjusting of storage pointer (from `alloca) needs to handle the
  potential different address spaces between the storage pointer (from
  alloca) and the pointer being used.

Reviewers: arsenm

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63501

llvm-svn: 363711
2019-06-18 17:58:49 +00:00
Philip Reames 44475363e8 Teach getSCEVAtScope how to handle loop phis w/invariant operands in loops w/taken backedges
This patch really contains two pieces:
    Teach SCEV how to fold a phi in the header of a loop to the value on the backedge when a) the backedge is known to execute at least once, and b) the value is safe to use globally within the scope dominated by the original phi.
    Teach IndVarSimplify's rewriteLoopExitValues to allow loop invariant expressions which already exist (and thus don't need new computation inserted) even in loops where we can't optimize away other uses.

Differential Revision: https://reviews.llvm.org/D63224

llvm-svn: 363619
2019-06-17 21:06:17 +00:00
Philip Reames fe8bd96ebd Fix a bug w/inbounds invalidation in LFTR (recommit)
Recommit r363289 with a bug fix for crash identified in pr42279.  Issue was that a loop exit test does not have to be an icmp, leading to a null dereference crash when new logic was exercised for that case.  Test case previously committed in r363601.

Original commit comment follows:

This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV.

The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program.

As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well.

(Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.)

Differential Revision: https://reviews.llvm.org/D62939

llvm-svn: 363613
2019-06-17 20:32:22 +00:00
Philip Reames 58c75565f3 Reduced test case for pr42279 in advance of the relevant re-commit + fix
llvm-svn: 363601
2019-06-17 19:27:45 +00:00
Joseph Tremoulet daa1ae6142 [EarlyCSE] Fix hashing of self-compares
Summary:
Update compare normalization in SimpleValue hashing to break ties (when
the same value is being compared to itself) by switching to the swapped
predicate if it has a lower numerical value.  This brings the hashing in
line with isEqual, which already recognizes the self-compares with
swapped predicates as equal.

Fixes PR 42280.

Reviewers: spatel, efriedma, nikic, fhahn, uabelho

Reviewed By: nikic

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63349

llvm-svn: 363598
2019-06-17 19:11:28 +00:00
Matt Arsenault 6d741f29ec AMDGPU: Fold readlane/readfirstlane calls
llvm-svn: 363587
2019-06-17 17:52:35 +00:00
Warren Ristow 6452bdd29b [LV] Suppress vectorization in some nontemporal cases
When considering a loop containing nontemporal stores or loads for
vectorization, suppress the vectorization if the corresponding
vectorized store or load with the aligment of the original scaler
memory op is not supported with the nontemporal hint on the target.

This adds two new functions:
  bool isLegalNTStore(Type *DataType, unsigned Alignment) const;
  bool isLegalNTLoad(Type *DataType, unsigned Alignment) const;

to TTI, leaving the target independent default implementation as
returning true, but with overriding implementations for X86 that
check the legality based on available Subtarget features.

This fixes https://llvm.org/PR40759

Differential Revision: https://reviews.llvm.org/D61764

llvm-svn: 363581
2019-06-17 17:20:08 +00:00
Matt Arsenault 1df203d78e InferAddressSpaces: Fix cloning original addrspacecast
If an addrspacecast needed to be inserted again, this was creating a
clone of the original cast for each user. Just use the original, which
also saves losing the value name.

llvm-svn: 363562
2019-06-17 14:13:29 +00:00
Matt Arsenault b10f097833 AMDGPU: Ignore subtarget for InferAddressSpaces
Even if the target doesn't have flat instructions, addrspace(0) is
still flat. It just happens to not work.

llvm-svn: 363561
2019-06-17 14:13:24 +00:00
Matt Arsenault f3b64d80bc AMDGPU: Mark exp/exp.compr as inaccessiblememonly
Should also be marked writeonly, but I think that would require
splitting the version with done set to a separate intrinsic

Test change is only from renumbering the attribute group numbers,
which for some reason the generated check lines consider.

llvm-svn: 363560
2019-06-17 13:52:24 +00:00
Sam Parker 1bd3d00e7e [CodeGen] Check for HardwareLoop Latch ExitBlock
The HardwareLoops pass finds exit blocks with a scevable exit count.
If the target specifies to update the loop counter in a register,
through a phi, we need to ensure that the exit block is a latch so
that we can insert the phi with the correct value for the incoming
edge.

Differential Revision: https://reviews.llvm.org/D63336

llvm-svn: 363556
2019-06-17 13:39:28 +00:00
Bjorn Pettersson 83773b77a5 [LV] Deny irregular types in interleavedAccessCanBeWidened
Summary:
Avoid that loop vectorizer creates loads/stores of vectors
with "irregular" types when interleaving. An example of
an irregular type is x86_fp80 that is 80 bits, but that
may have an allocation size that is 96 bits. So an array
of x86_fp80 is not bitcast compatible with a vector
of the same type.

Not sure if interleavedAccessCanBeWidened is the best
place for this check, but it solves the problem seen
in the added test case. And it is the same kind of check
that already exists in memoryInstructionCanBeWidened.

Reviewers: fhahn, Ayal, craig.topper

Reviewed By: fhahn

Subscribers: hiraditya, rkruppe, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63386

llvm-svn: 363547
2019-06-17 12:02:24 +00:00
Sam Parker 60d6fb2a63 [SCEV] Use NoWrapFlags when expanding a simple mul
Second functional change following on from rL362687. Pass the
NoWrapFlags from the MulExpr to InsertBinop when we're generating a
shl or mul.

Differential Revision: https://reviews.llvm.org/D61934

llvm-svn: 363540
2019-06-17 10:05:18 +00:00
Fangrui Song ac14f7b10c [lit] Delete empty lines at the end of lit.local.cfg NFC
llvm-svn: 363538
2019-06-17 09:51:07 +00:00
Hans Wennborg a9e5d2f35d Re-commit r357452 (take 3): "SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)"
Third time's the charm.

This was reverted in r363220 due to being suspected of an internal benchmark
regression and a test failure, none of which turned out to be caused by this.

llvm-svn: 363529
2019-06-17 07:47:28 +00:00
Yevgeny Rouban ee62c40eae [SimplifyCFG] Fix prof branch_weights MD while removing unreachable switch cases
SimplifyCFG has a bug that results in inconsistent prof branch_weights metadata
if unreachable switch cases are removed. This patch fixes this bug by making use
of the newly introduced SwitchInstProfUpdateWrapper class (see patch D62122).
A new test is created.

Differential Revision: https://reviews.llvm.org/D62186

llvm-svn: 363527
2019-06-17 05:55:12 +00:00
Roman Lebedev 5a663bd77a [InstSimplify] Fix addo/subo undef folds (PR42209)
Fix folds of addo and subo with an undef operand to be:

`@llvm.{u,s}{add,sub}.with.overflow` all fold to `{ undef, false }`,
 as per LLVM undef rules.
Same for commuted variants.

Based on the original version of the patch by @nikic.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42209 | PR42209 ]]

Differential Revision: https://reviews.llvm.org/D63065

llvm-svn: 363522
2019-06-16 20:39:45 +00:00
Sanjay Patel c8d88ad1a9 [CodeGenPrepare][x86] shift both sides of a vector select when profitable
This is based on the example/discussion in PR37428:
https://bugs.llvm.org/show_bug.cgi?id=37428

Proper vector shift instructions don't appear until AVX2, so we may generate several
extra instructions within a loop trying to compensate for that. It's difficult to
recover from that shift expansion later than this, so use the existing TLI hook and
splat analysis to enable better codegen.

This extends CGP functionality introduced with:
rL201655

Differential Revision: https://reviews.llvm.org/D63233

llvm-svn: 363511
2019-06-16 15:29:03 +00:00
Nikita Popov 9145562b48 [SimplifyIndVar] Simplify non-overflowing saturating add/sub
If we can detect that saturating math that depends on an IV cannot
overflow, replace it with simple math. This is similar to the CVP
optimization from D62703, just based on a different underlying
analysis (SCEV vs LVI) that catches different cases.

Differential Revision: https://reviews.llvm.org/D62792

llvm-svn: 363489
2019-06-15 08:48:52 +00:00
Huihui Zhang dc2fd6a14e [InstCombine] Add tests to show missing fold opportunity for "icmp and shift" (nfc).
Summary:
For icmp pred (and (sh X, Y), C), 0

  When C is signbit, expect to fold (X << Y) & signbit ==/!= 0 into (X << Y) >=/< 0,
  rather than (X & (signbit >> Y)) != 0.

  When C+1 is power of 2, expect to fold (X << Y) & ~C ==/!= 0 into (X << Y) </>= C+1,
  rather than (X & (~C >> Y)) == 0.

For icmp pred (and X, (sh signbit, Y)), 0

  Expect to fold (X & (signbit l>> Y)) ==/!= 0 into (X << Y) >=/< 0
  Expect to fold (X & (signbit << Y)) ==/!= 0 into (X l>> Y) >=/< 0

  Reviewers: lebedev.ri, efriedma, spatel, craig.topper

  Reviewed By: lebedev.ri

  Subscribers: llvm-commits

  Tags: #llvm

  Differential Revision: https://reviews.llvm.org/D63025

llvm-svn: 363479
2019-06-15 00:33:41 +00:00
Akira Hatanaka a704a8f28c [ObjC][ARC] Delete ObjC runtime calls on global variables annotated
with 'objc_arc_inert'

Those calls are no-ops, so they can be safely deleted.

rdar://problem/49839633

Differential Revision: https://reviews.llvm.org/D62433

llvm-svn: 363468
2019-06-14 22:06:32 +00:00
Matt Arsenault 282dac717e SROA: Allow eliminating addrspacecasted allocas
There is a circular dependency between SROA and InferAddressSpaces
today that requires running both multiple times in order to be able to
eliminate all simple allocas and addrspacecasts. InferAddressSpaces
can't remove addrspacecasts when written to memory, and SROA helps
move pointers out of memory.

This should avoid inserting new commuting addrspacecasts with GEPs,
since there are unresolved questions about pointer wrapping between
different address spaces.

For now, don't replace volatile operations that don't match the alloca
addrspace, as it would change the address space of the access. It may
be still OK to insert an addrspacecast from the new alloca, but be
more conservative for now.

llvm-svn: 363462
2019-06-14 21:38:31 +00:00
Matt Arsenault e6efb6433f SROA: Add baseline test for addrspacecast changes
llvm-svn: 363460
2019-06-14 21:22:26 +00:00
Florian Hahn dcdd12b68c Revert Fix a bug w/inbounds invalidation in LFTR
Reverting because it breaks a green dragon build:
    http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208

This reverts r363289 (git commit eb88badff9)

llvm-svn: 363427
2019-06-14 17:23:09 +00:00
Sanjay Patel 7ea378b940 [CodeGenPrepare] propagate debuginfo when copying a shuffle
llvm-svn: 363409
2019-06-14 15:05:35 +00:00
Matt Arsenault 492d71cc99 AMDGPU: Fold readlane intrinsics of constants
I'm not 100% sure about this, since I'm worried about IR transforms
that might end up introducing divergence downstream once replaced with
a constant, but I haven't come up with an example yet.

llvm-svn: 363406
2019-06-14 14:51:26 +00:00
Sam Parker 0cf9639a9c [SCEV] Pass NoWrapFlags when expanding an AddExpr
InsertBinop now accepts NoWrapFlags, so pass them through when
expanding a simple add expression.

This is the first re-commit of the functional changes from rL362687,
which was previously reverted.

Differential Revision: https://reviews.llvm.org/D61934

llvm-svn: 363364
2019-06-14 09:19:41 +00:00
Stanislav Mekhanoshin 68a2fef9ae [AMDGPU] gfx1010 wave32 icmp/fcmp intrinsic changes for wave32
Differential Revision: https://reviews.llvm.org/D63301

llvm-svn: 363339
2019-06-13 23:47:36 +00:00
Shawn Landden 24f4085811 [SimplifyCFG] NFC, update Switch tests as a baseline.
Also add baseline tests to show effect of later patches.

There were a couple of regressions here that were never caught,
but my patch set that this is a preparation to will fix them.

This is the third attempt to land this patch.

Differential Revision: https://reviews.llvm.org/D61150

llvm-svn: 363319
2019-06-13 19:36:38 +00:00
Sanjay Patel 5bf7f81aa8 [InstCombine] add test for failed libfunction prototype matching; NFC
llvm-svn: 363291
2019-06-13 18:26:10 +00:00
Philip Reames eb88badff9 Fix a bug w/inbounds invalidation in LFTR
This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV.

The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying.  As such, our potential UB triggering use does not change the semantics of the original program.

As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well.

(Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.)

Differential Revision: https://reviews.llvm.org/D62939

llvm-svn: 363289
2019-06-13 18:23:13 +00:00
Sanjay Patel 4d93fb528e [InstCombine] auto-generate complete test checks; NFC
llvm-svn: 363286
2019-06-13 18:14:49 +00:00
David Bolvansky a9d8388e80 [NFC] Updated testcase for D54411/rL363284
llvm-svn: 363285
2019-06-13 18:13:03 +00:00
Joseph Tremoulet 3bc6e2a7aa [EarlyCSE] Ensure equal keys have the same hash value
Summary:
The logic in EarlyCSE that looks through 'not' operations in the
predicate recognizes e.g. that `select (not (cmp sgt X, Y)), X, Y` is
equivalent to `select (cmp sgt X, Y), Y, X`.  Without this change,
however, only the latter is recognized as a form of `smin X, Y`, so the
two expressions receive different hash codes.  This leads to missed
optimization opportunities when the quadratic probing for the two hashes
doesn't happen to collide, and assertion failures when probing doesn't
collide on insertion but does collide on a subsequent table grow
operation.

This change inverts the order of some of the pattern matching, checking
first for the optional `not` and then for the min/max/abs patterns, so
that e.g. both expressions above are recognized as a form of `smin X, Y`.

It also adds an assertion to isEqual verifying that it implies equal
hash codes; this fires when there's a collision during insertion, not
just grow, and so will make it easier to notice if these functions fall
out of sync again.  A new flag --earlycse-debug-hash is added which can
be used when changing the hash function; it forces hash collisions so
that any pair of values inserted which compare as equal but hash
differently will be caught by the isEqual assertion.

Reviewers: spatel, nikic

Reviewed By: spatel, nikic

Subscribers: lebedev.ri, arsenm, craig.topper, efriedma, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62644

llvm-svn: 363274
2019-06-13 15:24:11 +00:00
Sander de Smalen 51c2fa0e2a Improve reduction intrinsics by overloading result value.
This patch uses the mechanism from D62995 to strengthen the
definitions of the reduction intrinsics by letting the scalar
result/accumulator type be overloaded from the vector element type.

For example:

  ; The LLVM LangRef specifies that the scalar result must equal the
  ; vector element type, but this is not checked/enforced by LLVM.
  declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a)

This patch changes that into:

  declare i32 @llvm.experimental.vector.reduce.or.v4i32(<4 x i32> %a)

Which has the type-constraint more explicit and causes LLVM to check
the result type with the vector element type.

Reviewers: RKSimon, arsenm, rnk, greened, aemerson

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D62996

llvm-svn: 363240
2019-06-13 09:37:38 +00:00
Sam Parker 9d28473a35 [ARM][TTI] Scan for existing loop intrinsics
TTI should report that it's not profitable to generate a hardware loop
if it, or one of its child loops, has already been converted.

Differential Revision: https://reviews.llvm.org/D63212

llvm-svn: 363234
2019-06-13 08:28:46 +00:00
Shawn Landden 8b142bcc3f [SimplifyCFG] reverting preliminary Switch patches again
This reverts 363226 and 363227, both NFC intended

I swear I fixed the test case that is failing, and ran
the tests, but I will look into it again.

llvm-svn: 363229
2019-06-13 05:26:17 +00:00
Shawn Landden c54b2011bd [SimplifyCFG] NFC, update Switch tests to better examine successive patches
Also add baseline tests to show effect of later patches.

There were a couple of regressions here that were never caught,
but my patch set that this is a preparation to will fix them.

Differential Revision: https://reviews.llvm.org/D61150

llvm-svn: 363226
2019-06-13 04:51:35 +00:00
Shawn Landden c6cba2957d [SimplifyCFG] revert the last commit.
I ran ALL the test suite locally, so I will look into this...

llvm-svn: 363223
2019-06-13 02:47:47 +00:00
Shawn Landden f93b99b2b6 [SimplifyCFG] NFC, update Switch tests to HEAD so I can
see if my changes change anything

Also add baseline tests to show effect of later patches.

Differential Revision: https://reviews.llvm.org/D61150

llvm-svn: 363222
2019-06-13 02:24:24 +00:00
David L. Jones c73fadaa84 Revert r361811: 'Re-commit r357452 (take 2): "SimplifyCFG SinkCommonCodeFromPredecessors ...'
We have observed some failures with internal builds with this revision.

- Performance regressions:
  - llvm's SingleSource/Misc evalloop shows performance regressions (although these may be red herrings).
  - Benchmarks for Abseil's SwissTable.
- Correctness:
  - Failures for particular libicu tests when building the Google AppEngine SDK (for PHP).

hwennborg has already been notified, and is aware of reproducer failures.

llvm-svn: 363220
2019-06-13 02:04:45 +00:00
Dinar Temirbulatov b2f45ba1e8 [SLP] Update propagate_ir_flags.ll test to check that we do retain the common subset, NFC.
llvm-svn: 363218
2019-06-13 00:19:50 +00:00
Philip Reames 0bded8442f [Tests] Highlight impact of multiple exit LFTR (D62625) as requested by reviewer
llvm-svn: 363217
2019-06-12 23:39:49 +00:00
Sanjay Patel a1421e8347 [x86] add tests for vector shifts; NFC
llvm-svn: 363203
2019-06-12 21:30:06 +00:00
Philip Reames 00e481b75d [Tests] Autogen RLEV test and add tests for a future enhancement
llvm-svn: 363193
2019-06-12 19:23:10 +00:00
Philip Reames 851adc000c [Tests] Add tests to highlight sibling loop optimization order issue for exit rewriting
The issue addressed in r363180 is more broadly relevant.  For the moment, we don't actually get any of these cases because we a) restrict SCEV formation due to SCEExpander needing to preserve LCSSA, and b) don't iterate between loops.

llvm-svn: 363192
2019-06-12 19:04:51 +00:00
Philip Reames e51c3d8b82 [SCEV] Teach computeSCEVAtScope benefit from one-input Phi. PR39673
SCEV does not propagate arguments through one-input Phis so as to make it easy for the SCEV expander (and related code) to preserve LCSSA.  It's not entirely clear this restriction is neccessary, but for the moment it exists.   For this reason, we don't analyze single-entry phi inputs.  However it is possible that when an this input leaves the loop through LCSSA Phi, it is a provable constant.  Missing that results in an order of optimization issue in loop exit value rewriting where we miss some oppurtunities based on order in which we visit sibling loops.

This patch teaches computeSCEVAtScope about this case. We can generalize it later, but so far we can only replace LCSSA Phis with their constant loop-exiting values.  We should probably also add similiar logic directly in the SCEV construction path itself.

Patch by: mkazantsev (with revised commit message by me)
Differential Revision: https://reviews.llvm.org/D58113

llvm-svn: 363180
2019-06-12 17:21:47 +00:00
Sanjay Patel 64006896ac [InstCombine] add tests for fmin/fmax libcalls; NFC
llvm-svn: 363175
2019-06-12 15:29:40 +00:00
Sam Parker 3d42959dd8 Revert rL363156.
The patch was to fix buildbots, but rL363157 should now be fixing it
in a cleaner way.

llvm-svn: 363174
2019-06-12 15:28:00 +00:00
Matt Arsenault f29366b1f5 StackProtector: Use PointerMayBeCaptured
This was using its own, outdated list of possible captures. This was
at minimum not catching cmpxchg and addrspacecast captures.

One change is now any volatile access is treated as capturing. The
test coverage for this pass is quite inadequate, but this required
removing volatile in the lifetime capture test.

Also fixes some infrastructure issues to allow running just the IR
pass.

Fixes bug 42238.

llvm-svn: 363169
2019-06-12 14:23:33 +00:00
Matt Arsenault aa6bdf9dcd LoopVersioning: Respect convergent
This changes the standalone pass only. Arguably the utility class
itself should assert there are no convergent calls. However, a target
pass with additional context may still be able to version a loop if
all of the dynamic conditions are sufficiently uniform.

llvm-svn: 363165
2019-06-12 14:05:58 +00:00
Sanjay Patel 082a41994a [InstCombine] add tests for fcmp+select with FMF (minnum/maxnum); NFC
llvm-svn: 363163
2019-06-12 13:51:33 +00:00
Matt Arsenault 86325be3d7 LoopLoadElim: Respect convergent
llvm-svn: 363162
2019-06-12 13:50:47 +00:00
Matt Arsenault 2466ba97bc LoopDistribute/LAA: Respect convergent
This case is slightly tricky, because loop distribution should be
allowed in some cases, and not others. As long as runtime dependency
checks don't need to be introduced, this should be OK. This is further
complicated by the fact that LoopDistribute partially ignores if LAA
says that vectorization is safe, and then does its own runtime pointer
legality checks.

Note this pass still does not handle noduplicate correctly, as this
should always be forbidden with it. I'm not going to bother trying to
fix it, as it would require more effort and I think noduplicate should
be removed.

https://reviews.llvm.org/D62607

llvm-svn: 363160
2019-06-12 13:34:19 +00:00
Matt Arsenault 1e21181aee LoopDistribute/LAA: Add tests to catch regressions
I broke 2 of these with a patch, but were not covered by existing
tests.

https://reviews.llvm.org/D63035

llvm-svn: 363158
2019-06-12 13:15:59 +00:00
Sam Parker 52d7326f32 [NFC] Add HardwareLoops lit.local.cfg file
Set Transforms/HardwareLoops/ARM/ tests as unsupported if there isn't
an arm target.

llvm-svn: 363157
2019-06-12 12:54:19 +00:00
Sam Parker ece316b56a Attempt to fix non-Arm buildbots
Adding REQUIRES: arm to failing tests

llvm-svn: 363156
2019-06-12 12:47:35 +00:00
Sam Parker 757ac02dc8 [ARM] Implement TTI::isHardwareLoopProfitable
Implement the backend target hook to drive the HardwareLoops pass.
The low-overhead branch extension for Arm M-class cores is flexible
enough that we don't have to ensure correctness at this point, except
checking that the loop counter variable can be stored in LR - a
32-bit register. For it to be profitable, we want to avoid loops that
contain function calls, or any other instruction that alters the PC.
    
This implementation uses TargetLoweringInfo, to query type and
operation actions, looks at intrinsic calls and also performs some
manual checks for remainder/division and FP operations.
    
I think this should be a good base to start and extra details can be
filled out later.

Differential Revision: https://reviews.llvm.org/D62907

llvm-svn: 363149
2019-06-12 12:00:42 +00:00
Orlando Cazalet-Hyams a947156396 Revert "[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion"
This reverts commit 1a0f7a2077.
See phabricator thread for D60831.

llvm-svn: 363132
2019-06-12 08:34:51 +00:00
Philip Reames 082cd30327 Generalize icmp matching in IndVars' eliminateTrunc
We were only matching RHS being a loop invariant value, not the inverse. Since there's nothing which appears to canonicalize loop invariant values to RHS, this means we missed cases.

Differential Revision: https://reviews.llvm.org/D63112

llvm-svn: 363108
2019-06-11 22:43:25 +00:00
Cameron McInally 08200d6d26 [InstCombine] Handle -(X-Y) --> (Y-X) for unary fneg when NSZ
Differential Revision: https://reviews.llvm.org/D62612

llvm-svn: 363082
2019-06-11 16:21:21 +00:00
Cameron McInally 796de11331 [InstCombine] Update fptrunc (fneg x)) -> (fneg (fptrunc x) for unary FNeg
Differential Revision: https://reviews.llvm.org/D62629

llvm-svn: 363080
2019-06-11 15:45:41 +00:00
Orlando Cazalet-Hyams 1a0f7a2077 [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024

The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:

A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.

In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.

I have set up a separate review D61933 for a fix which is required for this patch.

Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse

Reviewed By: hfinkel, jmorse

Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D60831

llvm-svn: 363046
2019-06-11 10:37:20 +00:00
Matt Arsenault c5830f5f05 AtomicExpand: Don't crash on non-0 alloca
This now produces garbage on AMDGPU with a call to an nonexistent,
anonymous libcall but won't assert.

llvm-svn: 363022
2019-06-11 01:35:07 +00:00
Matt Arsenault 383e72fcfe AMDGPU: Expand < 32-bit atomics
Also fix AtomicExpand asserting on atomicrmw fadd/fsub.

llvm-svn: 363021
2019-06-11 01:35:00 +00:00
Philip Reames efb14f9005 [Tests] Adjust LFTR dead-iv tests to bypass undef cases
As pointed out by Nikita in review, undef and poison need to be handled separately.  Since we're no longer expecting any test improvements - just fixes for miscompiles - update the tests to bypass the existing undef check.

llvm-svn: 363002
2019-06-10 23:17:10 +00:00
Rong Xu e44fa83c37 [PGO] Handle cases of non-instrument BBs
As shown in PR41279, some basic blocks (such as catchswitch) cannot be
instrumented. This patch filters out these BBs in PGO instrumentation.
It also sets the profile count to the fail-to-instrument edge, so that we
can propagate the counts in the CFG.

Differential Revision: https://reviews.llvm.org/D62700

llvm-svn: 362995
2019-06-10 22:36:27 +00:00
Philip Reames 1d322ccaac [Tests] Split an LFTR dead-iv case
There are two interesting sub-cases here.  1) Switching IVs is legal, but only in pre-increment form.  and 2) Switching IVs is legal, and so is post-increment form.

llvm-svn: 362993
2019-06-10 22:33:20 +00:00
Philip Reames 78c0d75697 [Tests] Add tests for D62939 (miscompiles around dead pointer IVs)
Flesh out a collection of tests for switching to a dead IV within LFTR, both for the current miscompile, and for some cases which we should be able to handle via simple reasoning.

llvm-svn: 362976
2019-06-10 19:45:59 +00:00
Philip Reames a9633d5f0b [LFTR] Use recomputed BE count
This was discussed as part of D62880.  The basic thought is that computing BE taken count after widening should produce (on average) an equally good backedge taken count as the one before widening.  Since there's only one test in the suite which is impacted by this change, and it's essentially equivelent codegen, that seems to be a reasonable assertion.  This change was separated from r362971 so that if this turns out to be problematic, the triggering piece is obvious and easily revertable.

For the nestedIV example from elim-extend.ll, we end up with the following BE counts:
BEFORE: (-2 + (-1 * %innercount) + %limit)
AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>)

Note that before is an i32 type, and the after is an i64.  Truncating the i64 produces the i32. 

llvm-svn: 362975
2019-06-10 19:18:53 +00:00
Sanjay Patel 9650c95b7e [InstCombine] allow unordered preds when canonicalizing to fabs()
We have a known-never-nan value via 'nnan', so an unordered predicate
is the same as its ordered sibling.

Similar to:
rL362937

llvm-svn: 362954
2019-06-10 15:39:00 +00:00
Sanjay Patel 07bba68889 [InstCombine] add tests for fabs() with unordered preds; NFC
llvm-svn: 362949
2019-06-10 15:08:22 +00:00
Sanjay Patel 85de9634e6 [InstCombine] fix bug in canonicalization to fabs()
Forgot to translate the predicate clauses in rL362943.

llvm-svn: 362945
2019-06-10 14:57:45 +00:00
Sanjay Patel 8b6d9f60ed [InstCombine] change canonicalization to fabs() to use FMF on fsub
Similar to rL362909:
This isn't the ideal fix (use FMF on the select), but it's still an
improvement until we have better FMF propagation to selects and other
FP math operators.

I don't think there's much risk of regression from this change by
not including the FMF on the fcmp any more. The nsz/nnan FMF
should be the same on the fcmp and the fsub because they have the
same operand.

llvm-svn: 362943
2019-06-10 14:46:36 +00:00
Sanjay Patel 8cd8c5784b [InstCombine] allow unordered preds when canonicalizing to fabs()
PR42179:
https://bugs.llvm.org/show_bug.cgi?id=42179

llvm-svn: 362937
2019-06-10 14:14:51 +00:00
Sanjay Patel 4cdd3ceb57 [InstCombine] add tests for fcmp unordered pred -> fabs (PR42179); NFC
llvm-svn: 362936
2019-06-10 14:04:10 +00:00
David Green d847aa573b [ARM] Enable Unroll UpperBound
This option allows loops with small max trip counts to be fully unrolled. This
can help with code like the remainder loops from manually unrolled loops like
those that appear in the cmsis dsp library. We would apparently previously
runtime unroll them with the default unroll count (4).

Differential Revision: https://reviews.llvm.org/D63064

llvm-svn: 362928
2019-06-10 10:22:14 +00:00
Vivek Pandya 11cb15f8ed Do not derive no-recurse attribute if function does not have exact definition.
This is fix for https://bugs.llvm.org/show_bug.cgi?id=41336

Reviewers: jdoerfert
Reviewed by: jdoerfert

Differential Revision: https://reviews.llvm.org/D63045

llvm-svn: 362918
2019-06-10 04:16:04 +00:00
Roman Lebedev d669758d84 [InstCombine] foldICmpWithLowBitMaskedVal(): 'icmp sgt/sle': avoid miscompiles
A precondition 'x != 0' was forgotten by me:
https://rise4fun.com/Alive/JFNP
https://rise4fun.com/Alive/jHvL

These 4 folds with non-constants could be re-enabled,
but for now let's go for the simplest solution.

https://bugs.llvm.org/show_bug.cgi?id=42198

llvm-svn: 362911
2019-06-09 16:30:42 +00:00
Roman Lebedev ff0c99b017 [NFC][InstCombine] Revisit canonicalize-constant-low-bit-mask-and-icmp-s* tests in preparatio for PR42198.
The `icmp sgt`/`icmp sle` variants are, too, miscompiles:
https://rise4fun.com/Alive/JFNP
https://rise4fun.com/Alive/jHvL
A precondition 'x != 0' was forgotten by me.

While ensuring test coverage for `-1`, also add test coverage
for `0` mask. Mask `0` is allowed for all the folds,
mask `-1` is allowed for all the folds with unsigned `icmp` pred.
Constant mask `0` is missed though.

https://bugs.llvm.org/show_bug.cgi?id=42198

llvm-svn: 362910
2019-06-09 16:30:14 +00:00
Sanjay Patel 87cd16a86e [InstCombine] change canonicalization to fabs() to use FMF on fneg
This isn't the ideal fix (use FMF on the select), but it's still an
improvement until we have better FMF propagation to selects and other
FP math operators.

I don't think there's much risk of regression from this change by
not including the FMF on the fcmp any more. The nsz/nnan FMF
should be the same on the fcmp and the fneg (fsub) because they
have the same operand.

This works around the most glaring FMF logical inconsistency cited
in PR38086:
https://bugs.llvm.org/show_bug.cgi?id=38086

llvm-svn: 362909
2019-06-09 16:22:01 +00:00
David Bolvansky 4e95b36b6d [NFC] Added test from PR42084 for D63058
llvm-svn: 362906
2019-06-09 14:56:46 +00:00
Nikita Popov 06beb48229 [InstCombine] Add tests for usub.sat(x,y)+y etc; NFC
For PR42178.

llvm-svn: 362905
2019-06-09 14:39:47 +00:00
Sanjay Patel 73f5a855b3 [InstSimplify] enhance fcmp fold with never-nan operand
This is another step towards correcting our usage of fast-math-flags when applied on an fcmp.
In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of
the fcmp. But I'm leaving that clause in until we're more confident that we can stop
relying on fcmp's FMF.

By using the more general "isKnownNeverNaN()", we gain a simplification shown on the
tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN).
On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction
in addition to the FMF on the fcmp.

This is a continuation of D62979 / rL362879.

llvm-svn: 362903
2019-06-09 13:48:59 +00:00
Sanjay Patel de4d4d5049 [InstSimplify] add tests for fcmp with known-never-nan operands; NFC
Opposite predicate for rL362742 / rL362879 / D62979

llvm-svn: 362902
2019-06-09 13:30:14 +00:00
Ayke van Laethem f18cf230e4 [CaptureTracking] Don't let comparisons against null escape inbounds pointers
Pointers that are in-bounds (either through dereferenceable_or_null or
thorough a getelementptr inbounds) cannot be captured with a comparison
against null. There is no way to construct a pointer that is still in
bounds but also NULL.

This helps safe languages that insert null checks before load/store
instructions. Without this patch, almost all pointers would be
considered captured even for simple loads. With this patch, an icmp with
null will not be seen as escaping as long as certain conditions are met.

There was a lot of discussion about this patch. See the Phabricator
thread for detals.

Differential Revision: https://reviews.llvm.org/D60047

llvm-svn: 362900
2019-06-09 10:20:33 +00:00
Sanjay Patel 4329c15f11 [InstSimplify] enhance fcmp fold with never-nan operand
This is 1 step towards correcting our usage of fast-math-flags when applied on an fcmp.
In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of
the fcmp. But I'm leaving that clause in until we're more confident that we can stop
relying on fcmp's FMF.

By using the more general "isKnownNeverNaN()", we gain a simplification shown on the
tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN).
On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction
in addition to the FMF on the fcmp.

I'll update the 'ult' case below here as a follow-up assuming no problems here.

Differential Revision: https://reviews.llvm.org/D62979

llvm-svn: 362879
2019-06-08 15:12:33 +00:00
David Bolvansky 54b1044983 [NFC] Added tests for D63038
llvm-svn: 362875
2019-06-08 12:07:59 +00:00
Matt Arsenault a59aeb3f29 LoopDistribute: Add testcase where SCEV wants to insert a runtime
check.

Only the memory based checks were being tested. Prepare for fix in
convergent handling.

llvm-svn: 362854
2019-06-07 23:17:38 +00:00
Sam Parker c5ef502ee8 [CodeGen] Generic Hardware Loop Support
Patch which introduces a target-independent framework for generating
hardware loops at the IR level. Most of the code has been taken from
PowerPC CTRLoops and PowerPC has been ported over to use this generic
pass. The target dependent parts have been moved into
TargetTransformInfo, via isHardwareLoopProfitable, with
HardwareLoopInfo introduced to transfer information from the backend.
    
Three generic intrinsics have been introduced:
- void @llvm.set_loop_iterations
  Takes as a single operand, the number of iterations to be executed.
- i1 @llvm.loop_decrement(anyint)
  Takes the maximum number of elements processed in an iteration of
  the loop body and subtracts this from the total count. Returns
  false when the loop should exit.
- anyint @llvm.loop_decrement_reg(anyint, anyint)
  Takes the number of elements remaining to be processed as well as
  the maximum numbe of elements processed in an iteration of the loop
  body. Returns the updated number of elements remaining.

llvm-svn: 362774
2019-06-07 07:35:30 +00:00
Sanjay Patel 38c5ee1802 [InstSimplify] add tests for fcmp with known-never-nan operands; NFC
llvm-svn: 362742
2019-06-06 20:14:06 +00:00
Craig Topper 6cda33ba36 [InlineCost] Add support for unary fneg.
This adds support for unary fneg based on the implementation of BinaryOperator without the soft float FP cost.

Previously we would just delegate to visitUnaryInstruction. I think the only real change is that we will pass the FastMath flags to SimplifyFNeg now.

Differential Revision: https://reviews.llvm.org/D62699

llvm-svn: 362732
2019-06-06 19:02:18 +00:00
Philip Reames 101915cfda [LoopPred] Fix a bug in unconditional latch bailout introduced in r362284
This is a really silly bug that even a simple test w/an unconditional latch would have caught.  I tried to guard against the case, but put it in the wrong if check.  Oops.

llvm-svn: 362727
2019-06-06 18:02:36 +00:00
Sanjay Patel dd2d1a168f [InstCombine] add tests for loads of bitcasted vector pointer; NFC
llvm-svn: 362703
2019-06-06 13:18:20 +00:00
Benjamin Kramer f1249442cf Revert "[SCEV] Use wrap flags in InsertBinop"
This reverts commit r362687. Miscompiles llvm-profdata during selfhost.

llvm-svn: 362699
2019-06-06 12:35:46 +00:00
Sam Parker 7cc580f5e9 [SCEV] Use wrap flags in InsertBinop
If the given SCEVExpr has no (un)signed flags attached to it, transfer
these to the resulting instruction or use them to find an existing
instruction.

Differential Revision: https://reviews.llvm.org/D61934

llvm-svn: 362687
2019-06-06 08:56:26 +00:00
Joseph Tremoulet acb5609063 [EarlyCSE] Add tests for negated min/max/abs [NFC]
Summary:
I'm planning to update the hashing logic to recognize their equivalence
in a subsequent change (D62644).

Reviewers: spatel

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62918

llvm-svn: 362657
2019-06-05 21:30:10 +00:00
Matt Arsenault 663d762c9a NewGVN: Handle addrspacecast
The AllConstant check needs to be moved out of the if/else if chain to
avoid a test regression. The "there is no SimplifyZExt" comment
puzzles me, since there is SimplifyCastInst. Additionally, the
Simplify* calls seem to not see the operand as constant, so this needs
to be tried if the simplify failed.

llvm-svn: 362653
2019-06-05 21:15:52 +00:00
Tim Northover 8d7f118ab2 InstCombine: correctly change byval type attribute alongside call args.
When the byval attribute has a type, it must match the pointee type of
any parameter; but InstCombine was not updating the attribute when
folding casts of various kinds away.

llvm-svn: 362643
2019-06-05 20:38:17 +00:00
Cameron McInally 8b83a9c6b1 [NFC][Reassociate] Fix mistake in 468b2ad
Missed 2 'fast fsub(0.0,X) -> fneg(X)' changes.

llvm-svn: 362631
2019-06-05 18:50:07 +00:00
Cameron McInally 5162266515 [NFC][Reassociate] Add unary fneg tests to fast-basictest.ll
llvm-svn: 362630
2019-06-05 18:35:54 +00:00
Philip Reames 13dd125043 [Tests] Add poison inference tests for indvars showing both existing transforms, and some room for improvement
llvm-svn: 362628
2019-06-05 18:00:59 +00:00
Cameron McInally 0a31726d20 [NFC][Reassociate] Regenerate CHECKs for fast-basictest.ll
llvm-svn: 362627
2019-06-05 18:00:27 +00:00
Dinar Temirbulatov 15c657d13d [SLP] Fix regression in broadcasts caused by operand reordering patch D59973.
This patch fixes a regression caused by the operand reordering refactoring patch https://reviews.llvm.org/D59973 .
The fix changes the strategy to Splat instead of Opcode, if broadcast opportunities are found.
Please see the lit test for some examples.

Committed on behalf of @vporpo (Vasileios Porpodas)
    
Differential Revision: https://reviews.llvm.org/D62427

llvm-svn: 362613
2019-06-05 15:26:28 +00:00
Yevgeny Rouban a3e16719c4 Resubmit "[CorrelatedValuePropagation] Fix prof branch_weights metadata handling for SwitchInst"
This reverts commit 5b32f60ec3.
The fix is in commit 4f9e68148b.

This patch fixes the CorrelatedValuePropagation pass to keep
prof branch_weights metadata of SwitchInst consistent.
It makes use of SwitchInstProfUpdateWrapper.
New tests are added.

Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D62126

llvm-svn: 362583
2019-06-05 05:46:40 +00:00
Johannes Doerfert aade782a98 [Attributor] Pass infrastructure and fixpoint framework
NOTE: Note that no attributes are derived yet. This patch will not go in
      alone but only with others that derive attributes. The framework is
      split for review purposes.

This commit introduces the Attributor pass infrastructure and fixpoint
iteration framework. Further patches will introduce abstract attributes
into this framework.

In a nutshell, the Attributor will update instances of abstract
arguments until a fixpoint, or a "timeout", is reached. Communication
between the Attributor and the abstract attributes that are derived is
restricted to the AbstractState and AbstractAttribute interfaces.

Please see the file comment in Attributor.h for detailed information
including design decisions and typical use case. Also consider the class
documentation for Attributor, AbstractState, and AbstractAttribute.

Reviewers: chandlerc, homerdin, hfinkel, fedor.sergeev, sanjoy, spatel, nlopes, nicholas, reames

Subscribers: mehdi_amini, mgorny, hiraditya, bollu, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59918

llvm-svn: 362578
2019-06-05 03:02:24 +00:00
Johannes Doerfert 76467c4d7f [NFC][FnAttrs] Stress tests for attribute deduction
This commit is a preparation of upcoming patches on attribute deduction.
It will shorten the diffs and make it clear what we inferred before.

Reviewers: chandlerc, homerdin, hfinkel, fedor.sergeev, sanjoy, spatel, nlopes

Subscribers: bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59903

llvm-svn: 362577
2019-06-05 03:00:06 +00:00
Nemanja Ivanovic fe97754acf Initial support for IBM MASS vector library
This is the LLVM portion of patch https://reviews.llvm.org/D59881.
The clang portion is to follow.

llvm-svn: 362568
2019-06-05 01:31:43 +00:00
Cameron McInally 5c7245b830 [Scalarizer] Add UnaryOperator visitor to scalarization pass
Differential Revision: https://reviews.llvm.org/D62858

llvm-svn: 362558
2019-06-04 23:01:36 +00:00
Philip Reames 0cdaf3a09f [Tests] Autogen a test so future changes are visible
Oddly, I had to change a value name from "tmp0" to "bc0" to get the autogened test to pass.  I'm putting this down to an oddity of update_test_checks or FileCheck, but don't understand it.

llvm-svn: 362532
2019-06-04 17:29:55 +00:00
Nikita Popov df621bdfc8 [LVI][CVP] Add support for urem, srem and sdiv
The underlying ConstantRange functionality has been added in D60952,
D61207 and D61238, this just exposes it for LVI.

I'm switching the code from using a whitelist to a blacklist, as
we're down to one unsupported operation here (xor) and writing it
this way seems more obvious :)

Differential Revision: https://reviews.llvm.org/D62822

llvm-svn: 362519
2019-06-04 16:24:09 +00:00
Philip Reames af11a4376c [Tests] Update a test to consistently use new pass manager and FileCheck the result
llvm-svn: 362518
2019-06-04 16:19:34 +00:00
Philip Reames 78e71c4d09 [Tests] Autogen tests so that diffs for a future change are understandable
llvm-svn: 362516
2019-06-04 16:15:19 +00:00
Shawn Landden 2ee9a827ad [SimplifyCFG] fix last commit
llvm-svn: 362501
2019-06-04 14:32:52 +00:00
Shawn Landden 7f22fecac2 [SimplifyCFG] NFC; remove bogus test case
Even if one bit is defined, the code is not clear what it is suppose to do.

The test wants to assert that some bits are undef, but that's not what the IR does and I don't think it's even possible to do that in any meaningful way. It was added in D12497, so @reames might want to double check.

Differential Revision: https://reviews.llvm.org/D60859

llvm-svn: 362499
2019-06-04 14:17:46 +00:00
Cameron McInally 89f9af5487 [SCCP] Add UnaryOperator visitor to SCCP for unary FNeg
Differential Revision: https://reviews.llvm.org/D62819

llvm-svn: 362449
2019-06-03 21:53:56 +00:00
Matt Arsenault 8dbeb9256c TTI: Improve default costs for addrspacecast
For some reason multiple places need to do this, and the variant the
loop unroller and inliner use was not handling it.

Also, introduce a new wrapper to be slightly more precise, since on
AMDGPU some addrspacecasts are free, but not no-ops.

llvm-svn: 362436
2019-06-03 18:41:34 +00:00
Andrew Kaylor 4172dbab5d Fix a crash when the default of a switch is removed
This patch fixes a problem that occurs in LowerSwitch when a switch statement has a PHI node as its condition, and the PHI node only has two incoming blocks, and one of those incoming blocks is through an unreachable default in the switch statement. When this condition occurs, LowerSwitch holds a pointer to the condition value, but removes the switch block as a predecessor of the PHI block, causing the PHI node to be replaced. LowerSwitch then tries to use its stale pointer to the original condition value, causing a crash.

Differential Revision: https://reviews.llvm.org/D62560

llvm-svn: 362427
2019-06-03 17:54:15 +00:00
Philip Reames 83645d214d [Tests] Add LFTR tests for multiple exit loops (try 2)
(Recommit after fixing a keymash in the run line.  Sorry for breakage.)

This is preparation for D62625 <https://reviews.llvm.org/D62625>

llvm-svn: 362426
2019-06-03 17:41:12 +00:00
Dmitri Gribenko b46934eeb8 Revert "[Tests] Add LFTR tests for multiple exit loops"
This reverts commit r362417.  There's a syntax error in the RUN line.

llvm-svn: 362418
2019-06-03 16:58:11 +00:00
Philip Reames 2fcd2bd0df [Tests] Add LFTR tests for multiple exit loops
This is preparation for D62625

llvm-svn: 362417
2019-06-03 16:46:03 +00:00
Simon Pilgrim 8a32ca381d [CostModel][X86] Improve masked load/store AVX1/AVX2 costs
A mixture of internal tests and review of the scheduler models indicates we're overestimating the cost of a masked load, which we're estimating at 4x regular memory ops - more realistic values indicates that its closer to 2x. Masked stores costs are a lot more diverse but 8x is roughly in the middle of the range.

e.g. SandyBridge
defm : X86WriteRes<WriteFMaskedLoad, [SBPort23,SBPort05], 8, [1,2], 3>;
defm : X86WriteRes<WriteFMaskedLoadY, [SBPort23,SBPort05], 9, [1,2], 3>;
defm : X86WriteRes<WriteFMaskedStore, [SBPort4,SBPort01,SBPort23], 5, [1,1,1], 3>;
defm : X86WriteRes<WriteFMaskedStoreY, [SBPort4,SBPort01,SBPort23], 5, [1,1,1], 3>;

e.g. Btver2
defm : X86WriteRes<WriteFMaskedLoad, [JLAGU, JFPU01, JFPX], 6, [1, 2, 2], 1>;
defm : X86WriteRes<WriteFMaskedLoadY, [JLAGU, JFPU01, JFPX], 6, [2, 4, 4], 2>;
defm : X86WriteRes<WriteFMaskedStore, [JSAGU, JFPU01, JFPX], 6, [1, 1, 4], 1>;
defm : X86WriteRes<WriteFMaskedStoreY, [JSAGU, JFPU01, JFPX], 6, [2, 2, 4], 2>;

Differential Revision: https://reviews.llvm.org/D61257

llvm-svn: 362338
2019-06-02 20:37:02 +00:00