Commit Graph

21364 Commits

Author SHA1 Message Date
Sanjay Patel 9907d3c8b4 [InstCombine] canonicalize add/sub with bool
add A, sext(B) --> sub A, zext(B)

We have to choose 1 of these forms, so I'm opting for the
zext because that's easier for value tracking.

The backend should be prepared for this change after:
D57401
rL353433

This is also a preliminary step towards reducing the amount
of bit hackery that we do in IR to optimize icmp/select.
That should be waiting to happen at a later optimization stage.

The seeming regression in the fuzzer test was discussed in:
D58359

We were only managing that fold in instcombine by luck, and
other passes should be able to deal with that better anyway.

llvm-svn: 354748
2019-02-24 16:57:45 +00:00
Matt Arsenault 65b4ab9921 BreakCriticalEdges: Update PostDominatorTree
llvm-svn: 354673
2019-02-22 15:01:41 +00:00
Roman Tereshin 99a6672bba [LowerSwitch][AMDGPU] Do not handle impossible values
This patch adds LazyValueInfo to LowerSwitch to compute the range of the
value being switched over and reduce the size of the tree LowerSwitch
builds to lower a switch.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D58096

llvm-svn: 354670
2019-02-22 14:33:46 +00:00
Chijun Sima 70e97163e0 [DTU] Refine the interface and logic of applyUpdates
Summary:
This patch separates two semantics of `applyUpdates`:
1. User provides an accurate CFG diff and the dominator tree is updated according to the difference of `the number of edge insertions` and `the number of edge deletions` to infer the status of an edge before and after the update.
2. User provides a sequence of hints. Updates mentioned in this sequence might never happened and even duplicated.

Logic changes:

Previously, removing invalid updates is considered a side-effect of deduplication and is not guaranteed to be reliable. To handle the second semantic, `applyUpdates` does validity checking before deduplication, which can cause updates that have already been applied to be submitted again. Then, different calls to `applyUpdates` might cause unintended consequences, for example,
```
DTU(Lazy) and Edge A->B exists.
1. DTU.applyUpdates({{Delete, A, B}, {Insert, A, B}}) // User expects these 2 updates result in a no-op, but {Insert, A, B} is queued
2. Remove A->B
3. DTU.applyUpdates({{Delete, A, B}}) // DTU cancels this update with {Insert, A, B} mentioned above together (Unintended)
```
But by restricting the precondition that updates of an edge need to be strictly ordered as how CFG changes were made, we can infer the initial status of this edge to resolve this issue.

Interface changes:
The second semantic of `applyUpdates`  is separated to `applyUpdatesPermissive`.
These changes enable DTU(Lazy) to use the first semantic if needed, which is quite useful in `transforms/utils`.

Reviewers: kuhar, brzycki, dmgreen, grosser

Reviewed By: brzycki

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58170

llvm-svn: 354669
2019-02-22 13:48:38 +00:00
Alina Sbirlea 90d2e3a16d [MemorySSA & LoopPassManager] Resolve PR40038.
The correct edge being deleted is not to the unswitched exit block, but to the
original block before it was split. That's the key in the map, not the
value.
The insert is correct. The new edge is to the .split block.

The splitting turns OriginalBB into:
OriginalBB -> OriginalBB.split.
Assuming the orignal CFG edge: ParentBB->OriginalBB, we must now delete
ParentBB->OriginalBB, not ParentBB->OriginalBB.split.

llvm-svn: 354656
2019-02-22 07:18:37 +00:00
Chijun Sima f131d6110e [DTU] Deprecate insertEdge*/deleteEdge*
Summary: This patch converts all existing `insertEdge*/deleteEdge*` to `applyUpdates` and marks `insertEdge*/deleteEdge*` as deprecated.

Reviewers: kuhar, brzycki

Reviewed By: kuhar, brzycki

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58443

llvm-svn: 354652
2019-02-22 05:41:43 +00:00
Alina Sbirlea 97468e9282 [MemorySSA & LoopPassManager] Update MemorySSA in formDedicatedExitBlocks.
MemorySSA is now updated when forming dedicated exit blocks.
Resolves PR40037.

llvm-svn: 354623
2019-02-21 21:13:34 +00:00
Alina Sbirlea d2d3244363 [LoopSimplifyCFG] Update MemorySSA after r353911.
Summary:
MemorySSA is not properly updated in LoopSimplifyCFG after recent changes. Use SplitBlock utility to resolve that and clear all updates once handleDeadExits is finished.
All updates that follow are removal of edges which are safe to handle via the removeEdge() API.
Also, deleting dead blocks is done correctly as is, i.e. delete from MemorySSA before updating the CFG and DT.

Reviewers: mkazantsev, rtereshin

Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58524

llvm-svn: 354613
2019-02-21 19:54:05 +00:00
Alina Sbirlea 73446cd567 [EarlyCSE] Cleanup deadcode. [NFCI]
Summary: Cleanup nop assignments.

Reviewers: george.burgess.iv, davide

Subscribers: sanjoy, jlebar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58308

llvm-svn: 354612
2019-02-21 19:49:57 +00:00
Joey Gouly fdf651ee8d [InferAddressSpaces] Fix fallthrough error
llvm-svn: 354580
2019-02-21 13:10:37 +00:00
Joey Gouly 92af1360f3 [InferAddressSpaces] Fix crash on select of non-ptr operands
Check the operands of a select are pointers, to determine if it is an address
expression or not.

https://reviews.llvm.org/D58226

llvm-svn: 354576
2019-02-21 12:31:36 +00:00
Max Kazantsev 10489d76f6 [LoopSimplifyCFG] Add missing MSSA edge deletion
When we create fictive switch in preheader, we should take
care about MSSA and delete edge between old preheader and
header.

llvm-svn: 354547
2019-02-21 05:51:29 +00:00
Wei Mi 500606f270 [Inliner] Pass nullptr for the ORE param of getInlineCost if RemarkEnabled
is false.

Right now for inliner and partial inliner, we always pass the address of a
valid ORE object to getInlineCost even if RemarkEnabled is false because of
no -Rpass is specified. Since ComputeFullInlineCost will be set to true if
ORE is non-null in getInlineCost, this introduces the problem that in
getInlineCost we cannot return early even if we already know the cost is
definitely higher than the threshold. It is a general problem for compile
time.

This patch fixes that by pass nullptr as the ORE argument if RemarkEnabled is
false.

Differential Revision: https://reviews.llvm.org/D58399

llvm-svn: 354542
2019-02-21 02:57:52 +00:00
Philip Reames 79d5e16f51 [GVN] Small tweaks to comments, style, and missed vector handling
Noticed these while doing a final sweep of the code to make sure I hadn't missed anything in my last couple of patches.  The (minor) missed optimization was noticed because of the stylistic fix to avoid an overly specific cast.

llvm-svn: 354412
2019-02-20 00:31:28 +00:00
Philip Reames a259dc3263 [GVN] Fix last crasher w/non-integral pointers
Same case as for memset and memcpy, but this time for clobbering stores and loads.  We still can't allow coercion to or from non-integrals, regardless of the transform.

Now that I'm done the whole little sequence, it seems apparent that we'd entirely missed reasoning about clobbers in the original GVN support for non-integral pointers.

My appologies, I thought we'd upstreamed all of this, but it turns out we were still carrying a downstream hack which hid all of these issues.  My chanks to Cherry Zhang for helping debug.

llvm-svn: 354407
2019-02-20 00:15:54 +00:00
Philip Reames 952d234d00 [GVN] Fix a crash bug w/non-integral pointers and memtransfers
Problem is very similiar to the one fixed for memsets in r354399, we try to coerce a value to non-integral type, and then crash while try to do so.  Since we shouldn't be doing such coercions to start with, easy fix.  From inspection, I see two other cases which look to be similiar and will follow up with most test cases and fixes if confirmed.

llvm-svn: 354403
2019-02-19 23:49:38 +00:00
Philip Reames 322eb7660e [GVN] Fix a non-integral pointer bug w/vector types
GVN generally doesn't forward structs or array types, but it *will* forward vector types to non-vectors and vice versa.  As demonstrated in tests, we need to inhibit the same set of transforms for vector of non-integral pointers as for non-integral pointers themselves.

llvm-svn: 354401
2019-02-19 23:19:51 +00:00
Philip Reames 92756a80e7 [GVN] Fix a crash bug around non-integral pointers
If we encountered a location where we tried to forward the value of a memset to a load of a non-integral pointer, we crashed.  Such a forward is not legal in general, but we can forward null pointers.  Test for both cases are included.

llvm-svn: 354399
2019-02-19 23:07:15 +00:00
Sanjay Patel c1e0184317 [InstCombine] reduce even more unsigned saturated add with 'not' op
We want to use the sum in the icmp to allow matching with
m_UAddWithOverflow and eliminate the 'not'. This is discussed
in D51929 and is another step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

  Name: uaddsat, -1 fval
  %notx = xor i32 %x, -1
  %a = add i32 %x, %y
  %c = icmp ugt i32 %notx, %y
  %r = select i1 %c, i32 %a, i32 -1
  =>
  %a = add i32 %x, %y
  %c2 = icmp ugt i32 %y, %a
  %r = select i1 %c2, i32 -1, i32 %a

  Name: uaddsat, -1 fval + ult
  %notx = xor i32 %x, -1
  %a = add i32 %x, %y
  %c = icmp ult i32 %y, %notx
  %r = select i1 %c, i32 %a, i32 -1
  =>
  %a = add i32 %x, %y
  %c2 = icmp ugt i32 %y, %a
  %r = select i1 %c2, i32 -1, i32 %a

https://rise4fun.com/Alive/nTp

llvm-svn: 354393
2019-02-19 22:14:21 +00:00
Sanjay Patel dcb93c0dda [InstCombine] rearrange saturated add folds; NFC
This is no-functional-change-intended, but that was also
true when it was part of rL354276, and I managed to lose
2 predicates for the fold with constant...causing much bot
distress. So this time I'm adding a couple of negative tests
to avoid that.

llvm-svn: 354384
2019-02-19 21:46:13 +00:00
Max Kazantsev ebd95ea86e [NFC] API for signaling that the current loop is being deleted
We are planning to be able to delete the current loop in LoopSimplifyCFG
in the future. Add API to notify the loop pass manager that it happened.

llvm-svn: 354314
2019-02-19 11:14:05 +00:00
Max Kazantsev 30095d9795 [NFC] Store loop header in a local to keep it available after the loop is deleted
llvm-svn: 354313
2019-02-19 11:13:58 +00:00
Sanjay Patel 8a35d339c9 Revert "[InstCombine] reduce even more unsigned saturated add with 'not' op"
This reverts commit 079b610c29.
Bots are failing after this change on a stage 2 compile of clang.

llvm-svn: 354277
2019-02-18 16:04:22 +00:00
Sanjay Patel 079b610c29 [InstCombine] reduce even more unsigned saturated add with 'not' op
We want to use the sum in the icmp to allow matching with
m_UAddWithOverflow and eliminate the 'not'. This is discussed
in D51929 and is another step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

  Name: uaddsat, -1 fval
  %notx = xor i32 %x, -1
  %a = add i32 %x, %y
  %c = icmp ugt i32 %notx, %y
  %r = select i1 %c, i32 %a, i32 -1
  =>
  %a = add i32 %x, %y
  %c2 = icmp ugt i32 %y, %a
  %r = select i1 %c2, i32 -1, i32 %a

  Name: uaddsat, -1 fval + ult
  %notx = xor i32 %x, -1
  %a = add i32 %x, %y
  %c = icmp ult i32 %y, %notx
  %r = select i1 %c, i32 %a, i32 -1
  =>
  %a = add i32 %x, %y
  %c2 = icmp ugt i32 %y, %a
  %r = select i1 %c2, i32 -1, i32 %a

https://rise4fun.com/Alive/nTp

llvm-svn: 354276
2019-02-18 15:21:39 +00:00
Max Kazantsev 4561475e09 [NFC] Teach getInnermostLoopFor walk up the loop trees
This should be NFC in current use case of this method, but it will
help to use it for solving more compex tasks in follow-up patches.

llvm-svn: 354227
2019-02-17 18:21:51 +00:00
Sanjay Patel b341ee7071 [InstCombine] reduce more unsigned saturated add with 'not' op
We want to use the sum in the icmp to allow matching with
m_UAddWithOverflow and eliminate the 'not'. This is discussed
in D51929 and is another step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

  Name: not op
  %notx = xor i32 %x, -1
  %a = add i32 %x, %y
  %c = icmp ult i32 %notx, %y
  %r = select i1 %c, i32 -1, i32 %a
  =>
  %a = add i32 %x, %y
  %c2 = icmp ult i32 %a, %y
  %r = select i1 %c2, i32 -1, i32 %a

  Name: not op ugt
  %notx = xor i32 %x, -1
  %a = add i32 %x, %y
  %c = icmp ugt i32 %y, %notx
  %r = select i1 %c, i32 -1, i32 %a
  =>
  %a = add i32 %x, %y
  %c2 = icmp ult i32 %a, %y
  %r = select i1 %c2, i32 -1, i32 %a

https://rise4fun.com/Alive/niom

(The matching here is still incomplete.)

llvm-svn: 354224
2019-02-17 16:48:50 +00:00
Sanjay Patel bee2073542 [InstCombine] reduce unsigned saturated add with 'not' op
We want to use the sum in the icmp to allow matching with
m_UAddWithOverflow and eliminate the 'not'. This is discussed
in D51929 and is another step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

(The matching here is incomplete. Trying to take minimal steps
to make sure we don't induce infinite looping from existing
canonicalizations of the 'select'.)

llvm-svn: 354221
2019-02-17 15:58:48 +00:00
Max Kazantsev d72c1a0c5c [NFC] Fix name and clarifying comment for factored-out function
llvm-svn: 354220
2019-02-17 15:22:48 +00:00
Max Kazantsev 0f943269a0 [NFC] Factor out a function for future reuse
llvm-svn: 354218
2019-02-17 15:04:09 +00:00
Alina Sbirlea 383ccfb360 [EarlyCSE & MSSA] Cap the clobbering calls in EarlyCSE.
Summary:
Unlimitted number of calls to getClobberingAccess can lead to high
compile times in pathological cases.
Limitting getClobberingAccess to a fairly high number. Can be adjusted
based on users/need.
Note: this is the only user of MemorySSA currently enabled by default.
The same handling exists in LICM (disabled atm). As MemorySSA gains more
users, this logic of capping will need to move inside MemorySSA.

Reviewers: george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D58248

llvm-svn: 354182
2019-02-15 22:47:54 +00:00
Philip Reames 8220ecbce1 [InstCombine] Address a couple stylistic issues pointed out by reviewer [NFC]
Better addressing comments from https://reviews.llvm.org/D58290.

llvm-svn: 354171
2019-02-15 21:31:39 +00:00
Philip Reames cae6c767e8 [InstCombine] Convert atomicrmws to xchg or store where legal
Implement two more transforms of atomicrmw:
1) We can convert an atomicrmw which produces a known value in memory into an xchg instead.
2) We can convert an atomicrmw xchg w/o users into a store for some orderings.

Differential Revision: https://reviews.llvm.org/D58290

llvm-svn: 354170
2019-02-15 21:23:51 +00:00
Vedant Kumar 5f5cac3ae2 [CodeExtractor] Do not lift lifetime.end markers for region inputs
If a lifetime.end marker occurs along one path through the extraction
region, but not another, then it's still incorrect to lift the marker,
because there is some path through the extracted function which would
ordinarily not reach the marker. If the call to the extracted function
is in a loop, unrolling can cause inputs to the function to become
optimized out as undef after the first iteration.

To prevent incorrect stack slot merging in the calling function, it
should be sufficient to lift lifetime.start markers for region inputs.
I've tested this theory out by doing a stage2 check-all with randomized
splitting enabled.

This is a follow-up to r353973, and there's additional context for this
change in https://reviews.llvm.org/D57834.

rdar://47896986

Differential Revision: https://reviews.llvm.org/D58253

llvm-svn: 354159
2019-02-15 18:46:58 +00:00
Vedant Kumar 47a0c9b69c [HotColdSplit] Schedule splitting late to fix perf regression
With or without PGO data applied, splitting early in the pipeline
(either before the inliner or shortly after it) regresses performance
across SPEC variants. The cause appears to be that splitting hides
context for subsequent optimizations.

Schedule splitting late again, in effect reversing r352080, which
scheduled the splitting pass early for code size benefits (documented in
https://reviews.llvm.org/D57082).

Differential Revision: https://reviews.llvm.org/D58258

llvm-svn: 354158
2019-02-15 18:46:44 +00:00
Sanjay Patel 8a2b543a13 [InstCombine] fix crash while trying to narrow a binop of shuffles (PR40734)
https://bugs.llvm.org/show_bug.cgi?id=40734

llvm-svn: 354144
2019-02-15 16:31:55 +00:00
Clement Courbet f7e84a2ccc [MergeICmps] Make base ordering really deterministic.
Summary:
The idea is that we now manipulate bases through a `unsigned BaseID` based on
order of appearance in the comparison chain rather than through the `Value*`.

Fixes 40714.

Reviewers: gchatelet

Subscribers: mgrang, jfb, jdoerfert, llvm-commits, hans

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58274

llvm-svn: 354131
2019-02-15 14:17:17 +00:00
Clement Courbet cc004df7eb [MergeICmps][NFC] Improve doc.
llvm-svn: 354128
2019-02-15 12:58:06 +00:00
Max Kazantsev c065b025a6 [NFCI] Factor out block removal from stack of nested loops
llvm-svn: 354124
2019-02-15 12:18:10 +00:00
Simon Pilgrim 623c38d6cd Fix "field 'DFS' will be initialized after field 'DTU'" warning. NFCI.
llvm-svn: 354123
2019-02-15 12:13:16 +00:00
Max Kazantsev 136f09bea1 [NFC] Promote DFS to field for further use
llvm-svn: 354118
2019-02-15 11:39:35 +00:00
Max Kazantsev 73db5c137a [NFC] Tweak SplitBlockAndInsertIfThen to use existing ThenBlock
llvm-svn: 354107
2019-02-15 08:18:00 +00:00
Teresa Johnson d0b1f30b32 [ThinLTO] Detect partially split modules during the thin link
Summary:
The changes to disable LTO unit splitting by default (r350949) and
detect inconsistently split LTO units (r350948) are causing some crashes
when the inconsistency is detected in multiple threads simultaneously.
Fix that by having the code always look for the inconsistently split
LTO units during the thin link, by checking for the presence of type
tests recorded in the summaries.

Modify test added in r350948 to remove single threading required to fix
a bot failure due to this issue (and some debugging options added in the
process of diagnosing it).

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57561

llvm-svn: 354062
2019-02-14 21:22:50 +00:00
Philip Reames db57ef6238 [InstCombine] Add todos for possible atomicrmw transforms
llvm-svn: 354059
2019-02-14 20:48:36 +00:00
Philip Reames 485474208e Canonicalize all integer "idempotent" atomicrmw ops
For "idempotent" atomicrmw instructions which we can't simply turn into load, canonicalize the operation and constant. This reduces the matching needed elsewhere in the optimizer, but doesn't directly impact codegen.

For any architecture where OR/Zero is not a good default choice, you can extend the AtomicExpand lowerIdempotentRMWIntoFencedLoad mechanism. I reviewed X86 to make sure this works well, haven't audited other backends.

Differential Revision: https://reviews.llvm.org/D58244

llvm-svn: 354058
2019-02-14 20:41:17 +00:00
Philip Reames 97067d3c73 Teach instcombine about remaining "idempotent" atomicrmw types
Expand on Quentin's r353471 patch which converts some atomicrmws into loads. Handle remaining operation types, and fix a slight bug. Atomic loads are required to have alignment. Since this was within the InstCombine fixed point, somewhere else in InstCombine was adding alignment before the verifier saw it, but still, we should fix.

Terminology wise, I'm using the "idempotent" naming that is used for the same operations in AtomicExpand and X86ISelLoweringInfo. Once this lands, I'll add similar tests for AtomicExpand, and move the pattern match function to a common location. In the review, there was seemingly consensus that "idempotent" was slightly incorrect for this context.  Once we setle on a better name, I'll update all uses at once.

Differential Revision: https://reviews.llvm.org/D58242

llvm-svn: 354046
2019-02-14 18:39:14 +00:00
Teresa Johnson c374a800e7 Refine ArgPromotion metadata handling
Summary:
In r353537 we now copy all metadata to the new function, with the old
being removed when the old function is eliminated. In some cases the old
function is dropped to a declaration (seems to only occur with the old
PM). Go ahead and clear all metadata from the old function to handle that
case, since verification will complain otherwise. This is consistent
with what was being done for debug metadata before r353537.

Reviewers: davidxl, uabelho

Subscribers: jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58215

llvm-svn: 354032
2019-02-14 14:14:24 +00:00
Florian Hahn 6ab83b7db6 [LoopUnrollPeel] Add case where we should forget the peeled loop from SCEV.
The test case requires the peeled loop to be forgotten after peeling,
even though it does not have a parent. When called via the unroller,
SE->forgetTopmostLoop is also called, so the test case would also pass
without any SCEV invalidation, but peelLoop is exposed as utility
function. Also, in the test case, simplifyLoop will make changes,
removing the loop from SCEV, but it is better to not rely on this
behavior.

Reviewers: sanjoy, mkazantsev

Reviewed By: mkazantsev

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58192

llvm-svn: 354031
2019-02-14 13:59:39 +00:00
Clement Courbet c6e768f0ee [Instrumentation][NFC] Fix warning.
lib/Transforms/Instrumentation/AddressSanitizer.cpp:1173:29: warning: extra ‘;’ [-Wpedantic]

llvm-svn: 354024
2019-02-14 12:10:49 +00:00
Max Kazantsev deaf2ba280 [NFC] Refactor LICM code for better readability
llvm-svn: 354013
2019-02-14 09:04:12 +00:00
Leonard Chan 436fb2bd82 [NewPM] Second attempt at porting ASan
This is the second attempt to port ASan to new PM after D52739. This takes the
initialization requried by ASan from the Module by moving it into a separate
class with it's own analysis that the new PM ASan can use.

Changes:
- Split AddressSanitizer into 2 passes: 1 for the instrumentation on the
  function, and 1 for the pass itself which creates an instance of the first
  during it's run. The same is done for AddressSanitizerModule.
- Add new PM AddressSanitizer and AddressSanitizerModule.
- Add legacy and new PM analyses for reading data needed to initialize ASan with.
- Removed DominatorTree dependency from ASan since it was unused.
- Move GlobalsMetadata and ShadowMapping out of anonymous namespace since the
  new PM analysis holds these 2 classes and will need to expose them.

Differential Revision: https://reviews.llvm.org/D56470

llvm-svn: 353985
2019-02-13 22:22:48 +00:00