Commit Graph

8914 Commits

Author SHA1 Message Date
Benjamin Kramer 31a47f9890 Revert "[CallGraph] Refine call graph for indirect calls with !callees metadata"
This reverts commit r369025. Crashes clang, test case is on the mailing
list.

llvm-svn: 369096
2019-08-16 10:59:18 +00:00
Tim Northover 22970d66be AssumptionCache: remove old affected values after RAUW.
If they're left in the cache then they can't be removed efficiently when the
cache is notified to unlink a @llvm.assume call, and that can lead to values
from different functions entirely remaining there.

llvm-svn: 369091
2019-08-16 09:34:27 +00:00
Florian Hahn 75be1a9e58 [ValueTracking] Fix recurrence detection to check both PHI operands.
Summary:
Currently we fail to compute known bits for recurrences where the
first incoming value is the start value of the recurrence.

Instead of exiting the loop when the first incoming value is not
the step of the recurrence, continue to check the second incoming
value.

The original code uses a loop to handle both cases, but incorrectly
exits instead of continuing.

Reviewers: lebedev.ri, spatel, nikic

Reviewed By: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66216

llvm-svn: 369088
2019-08-16 09:15:02 +00:00
Evgeniy Stepanov 75344955fc Move isPointerOffset function to ValueTracking (NFC).
Summary: To be reused in MemTag sanitizer.

Reviewers: pcc, vitalybuka, ostannard

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66165

llvm-svn: 369062
2019-08-15 22:58:28 +00:00
Alina Sbirlea 79ff20428e [MemorySSA] Remove restrictive asserts.
The verification I added has overly restrictive asserts.
Unreachable blocks can have any incoming value in practice, after an
update due to a "replaceAllUses" call when the repalced entry is
LiveOnEntry.

llvm-svn: 369050
2019-08-15 21:20:08 +00:00
Florian Hahn 3f2850bc60 [ValueTracking] Look through ptrmask intrinsics during getUnderlyingObject.
Reviewers: nlopes, efriedma, hfinkel, sanjoy, aqjune, jdoerfert

Reviewed By: jdoerfert

Subscribers: jdoerfert, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61669

llvm-svn: 369036
2019-08-15 18:39:56 +00:00
Mark Lacey 626ed22fbe [CallGraph] Refine call graph for indirect calls with !callees metadata
For indirect call sites having a small set of possible callees,
!callees metadata can be used to indicate what those callees are.
This patch updates the call graph and lazy call graph analyses so
that they consider this metadata when encountering call sites. For
the call graph, it adds a new external call graph node to the graph
for each unique !callees metadata node. A call graph edge connects
an indirect call site with the external node associated with the
!callees metadata that is attached to it. And there is an edge from
this external node to each of the callees indicated by the metadata.
Similarly, for the lazy call graph, the patch adds Ref edges from a
caller to the possible callees indicated by the metadata.

The primary purpose of the patch is to facilitate iterating over the
functions in a module such that all of the callees indicated by a
given !callees metadata node will be visited prior to the functions
containing call sites annotated by that node. This property is
required by optimizations performing a bottom-up traversal of the
SCC DAG. For example, the inliner can be made to inline through an
indirect call. If the call site is annotated with !callees metadata,
this patch ensures that the inliner will have visited all of the
callees prior to the caller, allowing it to reliably compute the
cost of inlining one or more of the potential callees.

Original patch by @mssimpso. I've made some small changes to get it
to apply, build, and pass tests on the top of tree, as well as
some minor tweaks to formatting and functionality.

Subscribers: mehdi_amini, hiraditya, llvm-commits, mssimpso

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D39339

llvm-svn: 369025
2019-08-15 17:47:53 +00:00
Jonas Devlieghere 0eaee545ee [llvm] Migrate llvm::make_unique to std::make_unique
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.

llvm-svn: 369013
2019-08-15 15:54:37 +00:00
Florian Hahn fd72bf21c9 [ValueTracking] Add MustPreserveNullness arg to functions analyzing calls. (NFC)
Some uses of getArgumentAliasingToReturnedPointer and
isIntrinsicReturningPointerAliasingArgumentWithoutCapturing require the
calls/intrinsics to preserve the nullness of the argument.

For alias analysis, the nullness property does not really come into
play.

This patch explicitly sets it to true. In D61669, the alias analysis
uses will be switched to not require preserving nullness.

Reviewers: nlopes, efriedma, hfinkel, sanjoy, aqjune, jdoerfert

Reviewed By: jdoerfert

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64150

llvm-svn: 368993
2019-08-15 12:13:02 +00:00
Philip Reames 7b0515176b [SCEV] Rename getMaxBackedgeTakenCount to getConstantMaxBackedgeTakenCount [NFC]
llvm-svn: 368930
2019-08-14 21:58:13 +00:00
Matt Arsenault dbc1f207fa InferAddressSpaces: Move target intrinsic handling to TTI
I'm planning on handling intrinsics that will benefit from checking
the address space enums. Don't bother moving the address collection
for now, since those won't need th enums.

llvm-svn: 368895
2019-08-14 18:13:00 +00:00
Nikita Popov 2a4f26b4c2 [ValueTracking] Improve reverse assumption inference
Use isGuaranteedToTransferExecutionToSuccessor() instead of
isSafeToSpeculativelyExecute() when seeing whether we can propagate
the information in an assume backwards in isValidAssumeForContext().
The latter is more general - it also allows arbitrary loads/stores -
and is also the condition we want: if our assume is guaranteed to
execute, its condition not holding would be UB.

Original patch by arielb1.

Differential Revision: https://reviews.llvm.org/D37215

llvm-svn: 368723
2019-08-13 17:15:42 +00:00
Stanislav Mekhanoshin 4c9c98f36b [AMDGPU] Printf runtime binding pass
This pass is a port of the according pass from the HSAIL compiler.
It parses printf calls and setup runtime printf buffer.
After that it copies printf arguments to the buffer and fills in
module metadata for runtime.

Differential Revision: https://reviews.llvm.org/D24035

llvm-svn: 368592
2019-08-12 17:12:29 +00:00
Fedor Sergeev 92e160abab [MemDep] allow to select block-scan-limit when constructing MemoryDependenceAnalysis
Introducing non-global control for default block-scan-limit in MemDep analysis.
Useful when there are many compilations per initialized LLVM instance (e.g. JIT).

Reviewed By: asbirlea
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65806

llvm-svn: 368502
2019-08-10 01:23:38 +00:00
Whitney Tsang dd3b6498b0 Title: Loop Cache Analysis
Summary: Implement a new analysis to estimate the number of cache lines
required by a loop nest.
The analysis is largely based on the following paper:

Compiler Optimizations for Improving Data Locality
By: Steve Carr, Katherine S. McKinley, Chau-Wen Tseng
http://www.cs.utexas.edu/users/mckinley/papers/asplos-1994.pdf
The analysis considers temporal reuse (accesses to the same memory
location) and spatial reuse (accesses to memory locations within a cache
line). For simplicity the analysis considers memory accesses in the
innermost loop in a loop nest, and thus determines the number of cache
lines used when the loop L in loop nest LN is placed in the innermost
position.

The result of the analysis can be used to drive several transformations.
As an example, loop interchange could use it determine which loops in a
perfect loop nest should be interchanged to maximize cache reuse.
Similarly, loop distribution could be enhanced to take into
consideration cache reuse between arrays when distributing a loop to
eliminate vectorization inhibiting dependencies.

The general approach taken to estimate the number of cache lines used by
the memory references in the inner loop of a loop nest is:

Partition memory references that exhibit temporal or spatial reuse into
reference groups.
For each loop L in the a loop nest LN: a. Compute the cost of the
reference group b. Compute the 'cache cost' of the loop nest by summing
up the reference groups costs
For further details of the algorithm please refer to the paper.
Authored By: etiotto
Reviewers: hfinkel, Meinersbur, jdoerfert, kbarton, bmahjour, anemet,
fhahn
Reviewed By: Meinersbur
Subscribers: reames, nemanjai, MaskRay, wuzish, Hahnfeld, xusx595,
venkataramanan.kumar.llvm, greened, dmgreen, steleman, fhahn, xblvaOO,
Whitney, mgorny, hiraditya, mgrang, jsji, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D63459

llvm-svn: 368439
2019-08-09 13:56:29 +00:00
Craig Topper 66c08430f6 [ValueTracking] When calculating known bits for integer abs, make sure we're looking at a negate and not just any instruction with the nsw flag set.
The matchSelectPattern code can match patterns like (x >= 0) ? x : -x
for absolute value. But it can also match ((x-y) >= 0) ? (x-y) : (y-x).
If the latter form was matched we can only use the nsw flag if its
set on both subtracts.

This match makes sure we're looking at the former case only.

Differential Revision: https://reviews.llvm.org/D65692

llvm-svn: 368195
2019-08-07 18:28:16 +00:00
Nikolai Bozhenov 03edcd68dd [SCEV] Return zero from computeConstantDifference(X, X)
Without this patch computeConstantDifference returns None for cases like
these:

  computeConstantDifference(%x, %x)
  computeConstantDifference({%x,+,16}, {%x,+,16})

Differential Revision: https://reviews.llvm.org/D65474

llvm-svn: 368193
2019-08-07 17:38:38 +00:00
Alex Lorenz 8d5c280316 TLI: darwin does not support _bcmp
Not all Darwin targets support _bcmp in all circumstances.

Differential Revision: https://reviews.llvm.org/D65834

llvm-svn: 368113
2019-08-07 00:03:37 +00:00
Fangrui Song cb4327d7db Change two unnecessary uses of llvm::size(C) to C.size()
llvm-svn: 368011
2019-08-06 10:24:36 +00:00
David Bolvansky ef72cded32 [TLI][NFC] Fixed typo
llvm-svn: 367827
2019-08-05 10:14:09 +00:00
Fangrui Song d9b948b6eb Rename F_{None,Text,Append} to OF_{None,Text,Append}. NFC
F_{None,Text,Append} are kept for compatibility since r334221.

llvm-svn: 367800
2019-08-05 05:43:48 +00:00
Sanjay Patel 9ce5f41851 [InstCombine] fold cmp+select using select operand equivalence
As discussed in PR42696:
https://bugs.llvm.org/show_bug.cgi?id=42696
...but won't help that case yet.

We have an odd situation where a select operand equivalence fold was
implemented in InstSimplify when it could have been done more generally
in InstCombine if we allow dropping of {nsw,nuw,exact} from a binop operand.

Here's an example:
https://rise4fun.com/Alive/Xplr

  %cmp = icmp eq i32 %x, 2147483647
  %add = add nsw i32 %x, 1
  %sel = select i1 %cmp, i32 -2147483648, i32 %add
  =>
  %sel = add i32 %x, 1

I've left the InstSimplify code in place for now, but my guess is that we'd
prefer to remove that as a follow-up to save on code duplication and
compile-time.

Differential Revision: https://reviews.llvm.org/D65576

llvm-svn: 367695
2019-08-02 17:39:32 +00:00
Hideki Saito 09fac2450b [LV] Avoid building interleaved group in presence of WAW dependency
Reviewers: hsaito, Ayal, fhahn, anna, mkazantsev

Reviewed By: hsaito

Patch by evrevnov, thanks!

Differential Revision: https://reviews.llvm.org/D63981

llvm-svn: 367654
2019-08-02 06:31:50 +00:00
Roman Lebedev 081e990d08 [IR] Value: add replaceUsesWithIf() utility
Summary:
While there is always a `Value::replaceAllUsesWith()`,
sometimes the replacement needs to be conditional.

I have only cleaned a few cases where `replaceUsesWithIf()`
could be used, to both add test coverage,
and show that it is actually useful.

Reviewers: jdoerfert, spatel, RKSimon, craig.topper

Reviewed By: jdoerfert

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, george.burgess.iv, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65528

llvm-svn: 367548
2019-08-01 12:32:08 +00:00
Alina Sbirlea 7153f2784c [SCCP] Update condition to avoid overflow.
Summary:
Update condition to remove addition that may cause an overflow.
Resolves PR42814.

Reviewers: sanjoy, RKSimon

Subscribers: jlebar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65417

llvm-svn: 367461
2019-07-31 18:22:22 +00:00
Alina Sbirlea 63e97fa0b3 [MemorySSA] Add additional verification for phis.
Summary:
Verify that the incoming defs into phis are the last defs from the
respective incoming blocks.
When moving blocks, insertDef must RenameUses. Adding this verification
makes GVNHoist tests fail that uncovered this issue.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63147

llvm-svn: 367451
2019-07-31 17:41:04 +00:00
Florian Hahn 189efe295b Recommit "[GVN] Preserve loop related analysis/canonical forms."
This fixes some pipeline tests.
This reverts commit d0b6f42936.

llvm-svn: 367401
2019-07-31 09:27:54 +00:00
Alina Sbirlea 4bc625cae0 [MemorySSA] Extend allowed behavior for simplified instructions.
Summary:
LoopRotate may simplify instructions, leading to the new instructions not having memory accesses created for them.
Allow this behavior, by allowing the new access to be null when the template is null, and looking upwards for the proper defined access when dealing with simplified instructions.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65338

llvm-svn: 367352
2019-07-30 20:10:33 +00:00
Hideto Ueno 6e2be4eab3 [FunctionAttrs] Annotate "willreturn" for AssumeLikeInst
Summary:
In D37215, AssumeLikeInstruction are regarded as `willreturn`. In this patch, annotation is added to those which don't have `willreturn` now(`sideeffect, object_size, experimental_widenable_condition`).

Reviewers: jdoerfert, nikic, sstefan1

Reviewed By: nikic

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65455

llvm-svn: 367342
2019-07-30 18:35:29 +00:00
Florian Hahn d0b6f42936 Revert [GVN] Preserve loop related analysis/canonical forms.
This reverts r367332 (git commit 2d7227ec3a)

llvm-svn: 367335
2019-07-30 17:04:58 +00:00
Florian Hahn 2d7227ec3a [GVN] Preserve loop related analysis/canonical forms.
LoopInfo can be easily preserved by passing it to the functions that
modify the CFG (SplitCriticalEdge and MergeBlockIntoPredecessor.
SplitCriticalEdge also preserves LoopSimplify and LCSSA form when when passing in
LoopInfo. The test case shows that we preserve LoopSimplify and
LoopInfo. Adding addPreservedID(LCSSAID) did not preserve LCSSA for some
reason.

Also I am not sure if it is possible to preserve those in the new pass
manager, as they aren't analysis passes.

Reviewers: reames, hfinkel, davide, jdoerfert

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D65137

llvm-svn: 367332
2019-07-30 16:43:39 +00:00
Hideto Ueno 98d281a99f [ValueTracking] Remove volatile check in isGuaranteedToTransferExecutionToSuccessor
Summary: As clarified in D53184, volatile load and store do not trap. Therefore, we should remove volatile checks for instructions in  `isGuaranteedToTransferExecutionToSuccessor`.

Reviewers: jdoerfert, efriedma, nikic

Reviewed By: nikic

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65375

llvm-svn: 367226
2019-07-29 13:35:34 +00:00
Jay Foad dcb7532479 [DivergenceAnalysis] Add methods for querying divergence at use
Summary:
The existing isDivergent(Value) methods query whether a value is
divergent at its definition. However even if a value is uniform at its
definition, a use of it in another basic block can be divergent because
of divergent control flow between the def and the use.

This patch adds new isDivergent(Use) methods to DivergenceAnalysis,
LegacyDivergenceAnalysis and GPUDivergenceAnalysis.

This might allow D63953 or other similar workarounds to be removed.

Reviewers: alex-t, nhaehnle, arsenm, rtaylor, rampitec, simoll, jingyue

Reviewed By: nhaehnle

Subscribers: jfb, jvesely, wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65141

llvm-svn: 367218
2019-07-29 10:22:09 +00:00
Whitney Tsang 8ee361ebe5 [LOOPINFO] Introduce the loop guard API.
Summary:
This is the first patch for the loop guard. We introduced
getLoopGuardBranch() and isGuarded().
This currently only works on simplified loop, as it requires a preheader
and a latch to identify the guard.
It will work on loops of the form:
/// GuardBB:
///   br cond1, Preheader, ExitSucc <== GuardBranch
/// Preheader:
///   br Header
/// Header:
///  ...
///   br Latch
/// Latch:
///   br cond2, Header, ExitBlock
/// ExitBlock:
///   br ExitSucc
/// ExitSucc:
Prior discussions leading upto the decision to introduce the loop guard
API: http://lists.llvm.org/pipermail/llvm-dev/2019-May/132607.html
Reviewer: reames, kbarton, hfinkel, jdoerfert, Meinersbur, dmgreen
Reviewed By: reames
Subscribers: wuzish, hiraditya, jsji, llvm-commits, bmahjour, etiotto
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D63885

llvm-svn: 367033
2019-07-25 16:13:18 +00:00
Jay Foad 565c54320e [InstSimplify] Rename SimplifyFPUnOp and SimplifyFPBinOp
Summary:
SimplifyFPBinOp is a variant of SimplifyBinOp that lets you specify
fast math flags, but the name is misleading because both functions
can simplify both FP and non-FP ops. Instead, overload SimplifyBinOp
so that you can optionally specify fast math flags.

Likewise for SimplifyFPUnOp.

Reviewers: spatel

Reviewed By: spatel

Subscribers: xbolva00, cameron.mcinally, eraman, hiraditya, haicheng, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64902

llvm-svn: 366902
2019-07-24 12:50:10 +00:00
Peter Collingbourne 710605c085 Analysis: Don't look through aliases when simplifying GEPs.
It is not safe in general to replace an alias in a GEP with its aliasee
if the alias can be replaced with another definition (i.e. via strong/weak
resolution (linkonce_odr) or via symbol interposition (default visibility
in ELF)) while the aliasee cannot. An example of how this can go wrong is
in the included test case.

I was concerned that this might be a load-bearing misoptimization (it's
possible for us to use aliases to share vtables between base and derived
classes, and on Windows, vtable symbols will always be aliases in RTTI
mode, so this change could theoretically inhibit trivial devirtualization
in some cases), so I built Chromium for Linux and Windows with and without
this change. The file sizes of the resulting binaries were identical, so it
doesn't look like this is going to be a problem.

Differential Revision: https://reviews.llvm.org/D65118

llvm-svn: 366754
2019-07-22 22:13:46 +00:00
Michael Liao 17a8a9277c [LAA] Re-check bit-width of pointers after stripping.
Summary:
- As the pointer stripping now tracks through `addrspacecast`, prepare
  to handle the bit-width difference from the result pointer.

Reviewers: jdoerfert

Subscribers: jvesely, nhaehnle, hiraditya, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64928

llvm-svn: 366470
2019-07-18 17:30:27 +00:00
Chen Zheng c38e3efe27 [SCEV] add no wrap flag for SCEVAddExpr.
Differential Revision: https://reviews.llvm.org/D64868

llvm-svn: 366419
2019-07-18 09:23:19 +00:00
Evgeniy Stepanov 6abd78cc7c Make DT a transitive dependency of LI.
Summary:
LoopInfoWrapperPass::verify uses DT, which means DT must be alive
even if it has no direct users.

Fixes a crash in expensive checks mode.

Reviewers: pcc, leonardchan

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64896

llvm-svn: 366388
2019-07-17 23:31:59 +00:00
Evgeniy Stepanov d752f5e953 Basic codegen for MTE stack tagging.
Implement IR intrinsics for stack tagging. Generated code is very
unoptimized for now.

Two special intrinsics, llvm.aarch64.irg.sp and llvm.aarch64.tagp are
used to implement a tagged stack frame pointer in a virtual register.

Differential Revision: https://reviews.llvm.org/D64172

llvm-svn: 366360
2019-07-17 19:24:02 +00:00
Daniil Fukalov d912a9ba9b [AMDGPU] Tune inlining parameters for AMDGPU target
Summary:
Since the target has no significant advantage of vectorization,
vector instructions bous threshold bonus should be optional.

amdgpu-inline-arg-alloca-cost parameter default value and the target
InliningThresholdMultiplier value tuned then respectively.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, eraman, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64642

llvm-svn: 366348
2019-07-17 16:51:29 +00:00
Michael Liao 543ba4e9e0 [InstructionSimplify] Apply sext/trunc after pointer stripping
Summary:
- As the pointer stripping could trace through `addrspacecast` now, need
  to sext/trunc the offset to ensure it has the same width as the
  pointer after stripping.

Reviewers: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64768

llvm-svn: 366162
2019-07-16 01:03:06 +00:00
Nick Desaulniers c4f245b40a [LoopUnroll+LoopUnswitch] do not transform loops containing callbr
Summary:
There is currently a correctness issue when unrolling loops containing
callbr's where their indirect targets are being updated correctly to the
newly created labels, but their operands are not.  This manifests in
unrolled loops where the second and subsequent copies of callbr
instructions have blockaddresses of the label from the first instance of
the unrolled loop, which would result in nonsensical runtime control
flow.

For now, conservatively do not unroll the loop.  In the future, I think
we can pursue unrolling such loops provided we transform the cloned
callbr's operands correctly.

Such a transform and its legalities are being discussed in:
https://reviews.llvm.org/D64101

Link: https://bugs.llvm.org/show_bug.cgi?id=42489
Link: https://groups.google.com/forum/#!topic/clang-built-linux/z-hRWP9KqPI

Reviewers: fhahn, hfinkel, efriedma

Reviewed By: fhahn, hfinkel, efriedma

Subscribers: efriedma, hiraditya, zzheng, dmgreen, llvm-commits, pirama, kees, nathanchance, E5ten, craig.topper, chandlerc, glider, void, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64368

llvm-svn: 366130
2019-07-15 21:16:29 +00:00
Johannes Doerfert 2d63fbb7b1 [ValueTracking] Look through constant Int2Ptr/Ptr2Int expressions
Summary:
This is analogous to the int2ptr/ptr2int instruction handling introduced
in D54956.

Reviewers: fhahn, efriedma, spatel, nlopes, sanjoy, lebedev.ri

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64708

llvm-svn: 366036
2019-07-15 03:24:35 +00:00
Vitaly Buka b1bff76e22 isBytewiseValue checks ConstantVector element by element
Summary: Vector of the same value with few undefs will sill be considered "Bytewise"

Reviewers: eugenis, pcc, jfb

Reviewed By: jfb

Subscribers: dexonsmith, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64031

llvm-svn: 365971
2019-07-12 22:37:55 +00:00
Alina Sbirlea db101864bd [MemorySSA] Use SetVector to avoid nondeterminism.
Summary:
Use a SetVector for DeadBlockSet.
Resolves PR42574.

Reviewers: george.burgess.iv, uabelho, dblaikie

Subscribers: jlebar, Prazek, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64601

llvm-svn: 365970
2019-07-12 22:30:30 +00:00
Fangrui Song b251cc0d91 Delete dead stores
llvm-svn: 365903
2019-07-12 14:58:15 +00:00
Vitaly Buka 52096ee9a9 Return Undef from isBytewiseValue for empty arrays or structs
Reviewers: pcc, eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64052

llvm-svn: 365864
2019-07-12 02:23:07 +00:00
Vitaly Buka c559e63798 Handle IntToPtr in isBytewiseValue
Summary:
This helps with more efficient use of memset for pattern initialization

From @pcc prototype for -ftrivial-auto-var-init=pattern optimizations

Binary size change on CTMark, (with -fuse-ld=lld -Wl,--icf=all, similar results with default linker options)
```
                   master           patch      diff
Os           8.238864e+05    8.238864e+05       0.0
O3           1.054797e+06    1.054797e+06       0.0
Os zero      8.292384e+05    8.292384e+05       0.0
O3 zero      1.062626e+06    1.062626e+06       0.0
Os pattern   8.579712e+05    8.338048e+05 -0.030299
O3 pattern   1.090502e+06    1.067574e+06 -0.020481
```

Zero vs Pattern on master
```
               zero       pattern      diff
Os     8.292384e+05  8.579712e+05  0.036578
O3     1.062626e+06  1.090502e+06  0.025124
```

Zero vs Pattern with the patch
```
               zero       pattern      diff
Os     8.292384e+05  8.338048e+05  0.003333
O3     1.062626e+06  1.067574e+06  0.003193
```

Reviewers: pcc, eugenis

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D63967

llvm-svn: 365858
2019-07-12 01:42:03 +00:00
Tim Northover 67828edbbd OpaquePtr: switch to GlobalValue::getValueType in a few places. NFC.
llvm-svn: 365770
2019-07-11 13:13:02 +00:00
Tim Northover 030bb3d363 InstructionSimplify: Simplify InstructionSimplify. NFC.
The interface predates CallBase, so both it and implementation were
significantly more complicated than they needed to be. There was even
some redundancy that could be eliminated.

Should also help with OpaquePointers by not trying to derive a
function's type from it's PointerType.

llvm-svn: 365767
2019-07-11 13:11:44 +00:00
Chen Zheng 627095ec5b [SCEV] teach SCEV symbolical execution about overflow intrinsics folding.
Differential Revision: https://reviews.llvm.org/D64422

llvm-svn: 365726
2019-07-11 02:18:22 +00:00
Johannes Doerfert 3ed286a388 Replace three "strip & accumulate" implementations with a single one
This patch replaces the three almost identical "strip & accumulate"
implementations for constant pointer offsets with a single one,
combining the respective functionalities. The old interfaces are kept
for now.

Differential Revision: https://reviews.llvm.org/D64468

llvm-svn: 365723
2019-07-11 01:14:48 +00:00
Vitaly Buka d03bd1db59 NFC: Pass DataLayout into isBytewiseValue
Summary:
We will need to handle IntToPtr which I will submit in a separate patch as it's
not going to be NFC.

Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D63940

llvm-svn: 365709
2019-07-10 22:53:52 +00:00
Jinsong Ji 06fef0b359 Revert "[HardwareLoops] NFC - move hardware loop checking code to isHardwareLoopProfitable()"
This reverts commit d955573065.

llvm-svn: 365520
2019-07-09 17:53:09 +00:00
Chen Zheng d955573065 [HardwareLoops] NFC - move hardware loop checking code to isHardwareLoopProfitable()
Differential Revision: https://reviews.llvm.org/D64197

llvm-svn: 365497
2019-07-09 14:56:17 +00:00
Tim Northover 60afa49abe OpaquePtr: add Type parameter to Loads analysis API.
This makes the functions in Loads.h require a type to be specified
independently of the pointer Value so that when pointers have no structure
other than address-space, it can still do its job.

Most callers had an obvious memory operation handy to provide this type, but a
SROA and ArgumentPromotion were doing more complicated analysis. They get
updated to merge the properties of the various instructions they were
considering.

llvm-svn: 365468
2019-07-09 11:35:35 +00:00
Denis Bakhvalov 74be349bcf [SCEV] Fix for PR42397. SCEVExpander wrongly adds nsw to shl instruction.
Change-Id: I76c9f628c092ae3e6e78ebdaf55cec726e25d692
llvm-svn: 365363
2019-07-08 18:03:43 +00:00
Brian Homerding b4b21d807e Add, and infer, a nofree function attribute
This patch adds a function attribute, nofree, to indicate that a function does
not, directly or indirectly, call a memory-deallocation function (e.g., free,
C++'s operator delete).

Reviewers: jdoerfert

Differential Revision: https://reviews.llvm.org/D49165

llvm-svn: 365336
2019-07-08 15:57:56 +00:00
Eugene Leviant 3aef35288b [ThinLTO] Attempt to recommit r365188 after alignment fix
llvm-svn: 365215
2019-07-05 15:25:05 +00:00
Eugene Leviant e91f86f0ac Reverted r365188 due to alignment problems on i686-android
llvm-svn: 365206
2019-07-05 13:26:05 +00:00
Eugene Leviant 820cc01d1e [ThinLTO] Attempt to recommit r365040 after caching fix
It's possible that some function can load and store the same
variable using the same constant expression:

store %Derived* @foo, %Derived** bitcast (%Base** @bar to %Derived**)
%42 = load %Derived*, %Derived** bitcast (%Base** @bar to %Derived**)

The bitcast expression was mistakenly cached while processing loads,
and never examined later when processing store. This caused @bar to
be mistakenly treated as read-only variable. See load-store-caching.ll.

llvm-svn: 365188
2019-07-05 12:00:10 +00:00
Reid Kleckner f7e52fbdb5 Revert [ThinLTO] Optimize writeonly globals out
This reverts r365040 (git commit 5cacb91475)

Speculatively reverting, since this appears to have broken check-lld on
Linux. Partial analysis in https://crbug.com/981168.

llvm-svn: 365097
2019-07-04 00:03:30 +00:00
Evgeniy Stepanov 50dc28b556 Teach ValueTracking that aarch64.irg result aliases its input.
Reviewers: javed.absar, olista01

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64103

llvm-svn: 365079
2019-07-03 20:19:14 +00:00
Philip Reames 39e7a97ad7 [SCEV] Preserve flags on add/muls in getSCEVATScope
We haven't changed the set of users, just specialized an operand for those users.  Given that, the previous wrap flags must still be correct.

Sorry for the lack of test case.  Noticed this while working on something else, and haven't figured out to exercise this standalone.

llvm-svn: 365053
2019-07-03 16:34:08 +00:00
Eugene Leviant 5cacb91475 [ThinLTO] Optimize writeonly globals out
Differential revision: https://reviews.llvm.org/D63444

llvm-svn: 365040
2019-07-03 14:14:52 +00:00
Eugene Leviant ac407a7b4a [SCEV][LSR] Prevent using undefined value in binops
On some occasions ReuseOrCreateCast may convert previously
expanded value to undefined. That value may be passed by
SCEVExpander as an argument to InsertBinop making IV chain
undefined.

Differential revision: https://reviews.llvm.org/D63928 

llvm-svn: 365009
2019-07-03 09:36:32 +00:00
Jordan Rupprecht 02647f73d4 Revert [InlineCost] cleanup calculations of Cost and Threshold
This reverts r364422 (git commit 1a3dc76186)

The inlining cost calculation is incorrect, leading to stack overflow due to large stack frames from heavy inlining.

llvm-svn: 365000
2019-07-03 04:01:51 +00:00
Chen Zheng dfdccbb26b [PowerPC] exclude ICmpZero in LSR if icmp can be replaced in later hardware loop.
Differential Revision: https://reviews.llvm.org/D63477

llvm-svn: 364993
2019-07-03 01:49:03 +00:00
Teresa Johnson a700436323 [ThinLTO] Add summary entries for index-based WPD
Summary:
If LTOUnit splitting is disabled, the module summary analysis computes
the summary information necessary to perform single implementation
devirtualization during the thin link with the index and no IR. The
information collected from the regular LTO IR in the current hybrid WPD
algorithm is summarized, including:
1) For vtable definitions, record the function pointers and their offset
within the vtable initializer (subsumes the information collected from
IR by tryFindVirtualCallTargets).
2) A record for each type metadata summarizing the vtable definitions
decorated with that metadata (subsumes the TypeIdentiferMap collected
from IR).

Also added are the necessary bitcode records, and the corresponding
assembly support.

The follow-on index-based WPD patch is D55153.

Depends on D53890.

Reviewers: pcc

Subscribers: mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D54815

llvm-svn: 364960
2019-07-02 19:38:02 +00:00
Kristof Umann 9353421ecd [IDF] Generalize IDFCalculator to be used with Clang's CFG
I'm currently working on a GSoC project that aims to improve the the bug reports
of the analyzer. The main heuristic I plan to use is to explain values that are
a control dependency of the bug location better.

01 bool b = messyComputation();
02 int i = 0;
03 if (b) // control dependency of the bug site, let's explain why we assume val
04        // to be true
05   10 / i; // warn: division by zero

Because of this, I'd like to generalize IDFCalculator so that I could use it for
Clang's CFG: D62883.

In detail:

* Rename IDFCalculator to IDFCalculatorBase, make it take a general CFG node
  type as a template argument rather then strictly BasicBlock (but preserve
  ForwardIDFCalculator and ReverseIDFCalculator)
* Move IDFCalculatorBase from llvm/include/llvm/Analysis to
  llvm/include/llvm/Support (but leave the BasicBlock variants in
  llvm/include/llvm/Analysis)
* clang-format the file since this patch messes up git blame anyways
* Change typedef to using
* Add the new type ChildrenGetterTy, and store an instance of it in
  IDFCalculatorBase. This is important because I'll have to specialize it for
  Clang's CFG to filter out nullpointer successors, similarly to D62507.

Differential Revision: https://reviews.llvm.org/D63389

llvm-svn: 364911
2019-07-02 11:30:12 +00:00
Fangrui Song 78ee2fbf98 Cleanup: llvm::bsearch -> llvm::partition_point after r364719
llvm-svn: 364720
2019-06-30 11:19:56 +00:00
Johannes Doerfert 6ed459fd41 Use "willreturn" in isGuaranteedToTransferExecutionToSuccessor
The `willreturn` function attribute guarantees that a function call will
come back to the call site if the call is also known not to throw.
Therefore, this attribute can be used in
`isGuaranteedToTransferExecutionToSuccessor`.

Patch by Hideto Ueno (@uenoku)

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63372

llvm-svn: 364580
2019-06-27 19:29:48 +00:00
Philip Reames 1cf9e72cbc Update -analyze -scalar-evolution output for multiple exit loops w/computable exit values
The previous output was next to useless if *any* exit was not computable.  If we have more than one exit, show the exit count for each so that it's easier to see what's going from with SCEV analysis when debugging.

llvm-svn: 364579
2019-06-27 19:22:43 +00:00
Fedor Sergeev 1a3dc76186 [InlineCost] cleanup calculations of Cost and Threshold
Summary:
Doing better separation of Cost and Threshold.
Cost counts the abstract complexity of live instructions, while Threshold is an upper bound of complexity that inlining is comfortable to pay.
There are two parts:
     - huge 15K last-call-to-static bonus is no longer subtracted from Cost
       but rather is now added to Threshold.

       That makes much more sense, as the cost of inlining (Cost) is not changed by the fact
       that internal function is called once. It only changes the likelyhood of this inlining
       being profitable (Threshold).

     - bonus for calls proved-to-be-inlinable into callee is no longer subtracted from Cost
       but added to Threshold instead.

While calculations are somewhat different,  overall InlineResult should stay the same since Cost >= Threshold compares the same.

Reviewers: eraman, greened, chandlerc, yrouban, apilipenko
Reviewed By: apilipenko
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60740

llvm-svn: 364422
2019-06-26 13:24:24 +00:00
Chen Zheng aa99952896 [HardwareLoops] NFC - move loop with irreducible control flow checking logic to HarewareLoopInfo.
llvm-svn: 364415
2019-06-26 12:02:43 +00:00
Chen Zheng 46ce9e4fff [HardwareLoops] NFC - move loop with irreducible control flow checking logic to isHardwareLoopProfitable()
llvm-svn: 364397
2019-06-26 09:12:52 +00:00
Clement Courbet 3bc5ad551a [ExpandMemCmp] Move all options to TargetTransformInfo.
Split off from D60318.

llvm-svn: 364281
2019-06-25 08:04:13 +00:00
Bjorn Pettersson 485a421876 [ConstantFolding] Use hasVectorInstrinsicScalarOpd. NFC
Summary:
Use the hasVectorInstrinsicScalarOpd helper function
in ConstantFoldVectorCall.

Reviewers: rengolin, RKSimon, dblaikie

Reviewed By: rengolin, RKSimon

Subscribers: tschuett, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63705

llvm-svn: 364178
2019-06-24 12:07:17 +00:00
Bjorn Pettersson 512b118779 [Scalarizer] Add scalarizer support for smul.fix.sat
Summary:
Handle smul.fix.sat in the scalarizer. This is done by
adding smul.fix.sat to the set of "isTriviallyVectorizable"
intrinsics.

The addition of smul.fix.sat in isTriviallyVectorizable and
hasVectorInstrinsicScalarOpd can also be seen as a preparation
to be able to use hasVectorInstrinsicScalarOpd in ConstantFolding.

Reviewers: rengolin, RKSimon, dblaikie

Reviewed By: rengolin

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63704

llvm-svn: 364177
2019-06-24 12:07:11 +00:00
Fangrui Song dc8de6037c Simplify std::lower_bound with llvm::{bsearch,lower_bound}. NFC
llvm-svn: 364006
2019-06-21 05:40:31 +00:00
Sanjay Patel b342f026a4 [InstSimplify] simplify power-of-2 (single bit set) sequences
As discussed in PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314

Improving the canonicalization for these patterns:
rL363956
...means we should adjust/enhance the related simplification.

https://rise4fun.com/Alive/w1cp

  Name: isPow2 or zero
  %x = and i32 %xx, 2048
  %a = add i32 %x, -1
  %r = and i32 %a, %x
  =>
  %r = i32 0

llvm-svn: 363997
2019-06-20 22:55:28 +00:00
Alina Sbirlea 109d2ea153 [MemorySSA] Cleanup trivial phis.
Summary:
This is unfortunately needed for correctness, if we are to extend the tolerance of the update API to the way simple loop unswitch is doing cloning.

In simple loop unswitch (as opposed to loop unswitch), not all blocks are cloned. This can create unreachable cloned blocks (no predecessor), which are later cleaned up.

In MemorySSA, the  APIs for supporting these kind of updates (clone + update exit blocks), make certain assumption on the integrity of the CFG. When cloning, if something was not cloned, it's values in MemorySSA default to LiveOnEntry. When updating exit blocks, it is safe to assume that we can first insert phis in the blocks merging two clones, then add additional phis in the IDF of the blocks that received phis. This no longer holds true if one of the clones being merged comes from an unreachable block. We'd conservatively need to add all phis before filling in their incoming definitions. In practice this restriction can be relaxed if we clean up trivial phis after the first round of insertion.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63354

llvm-svn: 363880
2019-06-19 21:33:09 +00:00
Alina Sbirlea 238b8e62b6 [MemorySSA] Use GraphDiff info when computing IDF.
Summary:
When computing IDF for insert updates, ensure we use the snapshot CFG offered by GraphDiff.
Caught by D63389.

Reviewers: kuhar, george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits, Szelethus

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63443

llvm-svn: 363879
2019-06-19 21:17:31 +00:00
Bjorn Pettersson 16ff5fea87 [ConstantFolding] Add constant folding for smul.fix and smul.fix.sat
Summary:
This patch teaches ConstantFolding to constant fold
both scalar and vector variants of llvm.smul.fix and
llvm.smul.fix.sat.

As described in the LangRef rounding is unspecified for
these instrinsics. If the result cannot be represented
exactly the default behavior in ConstantFolding is to
round down towards negative infinity. If a target has a
preferred rounding that is different some kind of target
hook would be needed (same strategy as used by the
SelectionDAG legalizer).

Reviewers: nikic, leonardchan, RKSimon

Reviewed By: leonardchan

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63385

llvm-svn: 363811
2019-06-19 14:28:03 +00:00
Bjorn Pettersson b81b9a4e7b [ConstantFolding] Refactor ConstantFoldScalarCall. NFC
This patch splits ConstantFoldScalarCall into several
functions.

Benefits:
- Reduces indentation levels and avoids long if-statements.
- Makes it easier to add support for > 3 operands.

llvm-svn: 363810
2019-06-19 14:27:51 +00:00
Jay Foad 45d19fb470 [ConstantFolding] Fix assertion failure on non-power-of-two vector load.
Summary:
The test case does an (out of bounds) load from a global constant with
type <3 x float>. InstSimplify tried to turn this into an integer load
of the whole alloc size of the vector, which is 128 bits due to
alignment padding, and then bitcast this to <3 x vector> which failed
an assertion due to the type size mismatch.

The fix is to do an integer load of the normal size of the vector, with
no alignment padding.

Reviewers: tpr, arsenm, majnemer, dstuttard

Reviewed By: arsenm

Subscribers: hfinkel, wdng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63375

llvm-svn: 363784
2019-06-19 10:28:48 +00:00
Chen Zheng c5b918de58 [NFC] move some hardware loop checking code to a common place for other using.
Differential Revision: https://reviews.llvm.org/D63478

llvm-svn: 363758
2019-06-19 01:26:31 +00:00
Amara Emerson 146882242f [GlobalISel][Localizer] Rewrite localizer to run in 2 phases, inter & intra block.
Inter-block localization is the same as what currently happens, except now it
only runs on the entry block because that's where the problematic constants with
long live ranges come from.

The second phase is a new intra-block localization phase which attempts to
re-sink the already localized instructions further right before one of the
multiple uses.

One additional change is to also localize G_GLOBAL_VALUE as they're constants
too. However, on some targets like arm64 it takes multiple instructions to
materialize the value, so some additional heuristics with a TTI hook have been
introduced attempt to prevent code size regressions when localizing these.

Overall, these changes improve CTMark code size on arm64 by 1.2%.

Full code size results:

Program                                         baseline       new       diff
------------------------------------------------------------------------------
 test-suite...-typeset/consumer-typeset.test    1249984      1217216     -2.6%
 test-suite...:: CTMark/ClamAV/clamscan.test    1264928      1232152     -2.6%
 test-suite :: CTMark/SPASS/SPASS.test          1394092      1361316     -2.4%
 test-suite...Mark/mafft/pairlocalalign.test    731320       714928      -2.2%
 test-suite :: CTMark/lencod/lencod.test        1340592      1324200     -1.2%
 test-suite :: CTMark/kimwitu++/kc.test         3853512      3820420     -0.9%
 test-suite :: CTMark/Bullet/bullet.test        3406036      3389652     -0.5%
 test-suite...ark/tramp3d-v4/tramp3d-v4.test    8017000      8016992     -0.0%
 test-suite...TMark/7zip/7zip-benchmark.test    2856588      2856588      0.0%
 test-suite...:: CTMark/sqlite3/sqlite3.test    765704       765704       0.0%
 Geomean difference                                                      -1.2%

Differential Revision: https://reviews.llvm.org/D63303

llvm-svn: 363632
2019-06-17 23:20:29 +00:00
Philip Reames 44475363e8 Teach getSCEVAtScope how to handle loop phis w/invariant operands in loops w/taken backedges
This patch really contains two pieces:
    Teach SCEV how to fold a phi in the header of a loop to the value on the backedge when a) the backedge is known to execute at least once, and b) the value is safe to use globally within the scope dominated by the original phi.
    Teach IndVarSimplify's rewriteLoopExitValues to allow loop invariant expressions which already exist (and thus don't need new computation inserted) even in loops where we can't optimize away other uses.

Differential Revision: https://reviews.llvm.org/D63224

llvm-svn: 363619
2019-06-17 21:06:17 +00:00
Alina Sbirlea 7a0098aa6e [MemorySSA] Don't use template when the clone is a simplified instruction.
Summary:
LoopRotate doesn't create a faithful clone of an instruction, it may
simplify it beforehand. Hence the clone of an instruction that has a
MemoryDef associated may not be a definition, but a use or not a memory
alternig instruction.
Don't rely on the template when the clone may be simplified.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63355

llvm-svn: 363597
2019-06-17 18:58:40 +00:00
Alina Sbirlea 05f77803f4 [MemorySSA] Add all MemoryPhis before filling their values.
Summary:
Add all MemoryPhis in IDF before filling in their incomign values.
Otherwise, a new Phi can be added that needs to become the incoming
value of another Phi.
Test fails the verification in verifyPrevDefInPhis.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63353

llvm-svn: 363590
2019-06-17 18:16:53 +00:00
Warren Ristow 6452bdd29b [LV] Suppress vectorization in some nontemporal cases
When considering a loop containing nontemporal stores or loads for
vectorization, suppress the vectorization if the corresponding
vectorized store or load with the aligment of the original scaler
memory op is not supported with the nontemporal hint on the target.

This adds two new functions:
  bool isLegalNTStore(Type *DataType, unsigned Alignment) const;
  bool isLegalNTLoad(Type *DataType, unsigned Alignment) const;

to TTI, leaving the target independent default implementation as
returning true, but with overriding implementations for X86 that
check the legality based on available Subtarget features.

This fixes https://llvm.org/PR40759

Differential Revision: https://reviews.llvm.org/D61764

llvm-svn: 363581
2019-06-17 17:20:08 +00:00
Sam Parker 60d6fb2a63 [SCEV] Use NoWrapFlags when expanding a simple mul
Second functional change following on from rL362687. Pass the
NoWrapFlags from the MulExpr to InsertBinop when we're generating a
shl or mul.

Differential Revision: https://reviews.llvm.org/D61934

llvm-svn: 363540
2019-06-17 10:05:18 +00:00
Roman Lebedev 5a663bd77a [InstSimplify] Fix addo/subo undef folds (PR42209)
Fix folds of addo and subo with an undef operand to be:

`@llvm.{u,s}{add,sub}.with.overflow` all fold to `{ undef, false }`,
 as per LLVM undef rules.
Same for commuted variants.

Based on the original version of the patch by @nikic.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42209 | PR42209 ]]

Differential Revision: https://reviews.llvm.org/D63065

llvm-svn: 363522
2019-06-16 20:39:45 +00:00
Nikita Popov 8550fb386a [SCEV] Use unsigned/signed intersection type in SCEV
Based on D59959, this switches SCEV to use unsigned/signed range
intersection based on the sign hint. This will prefer non-wrapping
ranges in the relevant domain. I've left the one intersection in
getRangeForAffineAR() to use the smallest intersection heuristic,
as there doesn't seem to be any obvious preference there.

Differential Revision: https://reviews.llvm.org/D60035

llvm-svn: 363490
2019-06-15 09:15:52 +00:00
Akira Hatanaka a704a8f28c [ObjC][ARC] Delete ObjC runtime calls on global variables annotated
with 'objc_arc_inert'

Those calls are no-ops, so they can be safely deleted.

rdar://problem/49839633

Differential Revision: https://reviews.llvm.org/D62433

llvm-svn: 363468
2019-06-14 22:06:32 +00:00
Matt Arsenault 282dac717e SROA: Allow eliminating addrspacecasted allocas
There is a circular dependency between SROA and InferAddressSpaces
today that requires running both multiple times in order to be able to
eliminate all simple allocas and addrspacecasts. InferAddressSpaces
can't remove addrspacecasts when written to memory, and SROA helps
move pointers out of memory.

This should avoid inserting new commuting addrspacecasts with GEPs,
since there are unresolved questions about pointer wrapping between
different address spaces.

For now, don't replace volatile operations that don't match the alloca
addrspace, as it would change the address space of the access. It may
be still OK to insert an addrspacecast from the new alloca, but be
more conservative for now.

llvm-svn: 363462
2019-06-14 21:38:31 +00:00
Sam Parker 0cf9639a9c [SCEV] Pass NoWrapFlags when expanding an AddExpr
InsertBinop now accepts NoWrapFlags, so pass them through when
expanding a simple add expression.

This is the first re-commit of the functional changes from rL362687,
which was previously reverted.

Differential Revision: https://reviews.llvm.org/D61934

llvm-svn: 363364
2019-06-14 09:19:41 +00:00
Nikita Popov ad81d427ca [LangRef] Clarify poison semantics
I find the current documentation of poison somewhat confusing,
mainly because its use of "undefined behavior" doesn't seem to
align with our usual interpretation (of immediate UB). Especially
the sentence "any instruction that has a dependence on a poison
value has undefined behavior" is very confusing.

Clarify poison semantics by:

 * Replacing the introductory paragraph with the standard rationale
   for having poison values.
 * Spelling out that instructions depending on poison return poison.
 * Spelling out how we go from a poison value to immediate undefined
   behavior and give the two examples we currently use in ValueTracking.
 * Spelling out that side effects depending on poison are UB.

Differential Revision: https://reviews.llvm.org/D63044

llvm-svn: 363320
2019-06-13 19:45:36 +00:00
Philip Reames 038e01dc9a Add a clarifying comment about branching on poison
I recently got this wrong (again), and I'm sure I'm not the only one.  Put a comment in the logical place someone would look to "fix" the obvious "missed optimization" which arrises based on the common misunderstanding.  Hopefully, this will save others time.  :)

llvm-svn: 363318
2019-06-13 19:27:56 +00:00
Joseph Tremoulet 3bc6e2a7aa [EarlyCSE] Ensure equal keys have the same hash value
Summary:
The logic in EarlyCSE that looks through 'not' operations in the
predicate recognizes e.g. that `select (not (cmp sgt X, Y)), X, Y` is
equivalent to `select (cmp sgt X, Y), Y, X`.  Without this change,
however, only the latter is recognized as a form of `smin X, Y`, so the
two expressions receive different hash codes.  This leads to missed
optimization opportunities when the quadratic probing for the two hashes
doesn't happen to collide, and assertion failures when probing doesn't
collide on insertion but does collide on a subsequent table grow
operation.

This change inverts the order of some of the pattern matching, checking
first for the optional `not` and then for the min/max/abs patterns, so
that e.g. both expressions above are recognized as a form of `smin X, Y`.

It also adds an assertion to isEqual verifying that it implies equal
hash codes; this fires when there's a collision during insertion, not
just grow, and so will make it easier to notice if these functions fall
out of sync again.  A new flag --earlycse-debug-hash is added which can
be used when changing the hash function; it forces hash collisions so
that any pair of values inserted which compare as equal but hash
differently will be caught by the isEqual assertion.

Reviewers: spatel, nikic

Reviewed By: spatel, nikic

Subscribers: lebedev.ri, arsenm, craig.topper, efriedma, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62644

llvm-svn: 363274
2019-06-13 15:24:11 +00:00
Philip Reames e51c3d8b82 [SCEV] Teach computeSCEVAtScope benefit from one-input Phi. PR39673
SCEV does not propagate arguments through one-input Phis so as to make it easy for the SCEV expander (and related code) to preserve LCSSA.  It's not entirely clear this restriction is neccessary, but for the moment it exists.   For this reason, we don't analyze single-entry phi inputs.  However it is possible that when an this input leaves the loop through LCSSA Phi, it is a provable constant.  Missing that results in an order of optimization issue in loop exit value rewriting where we miss some oppurtunities based on order in which we visit sibling loops.

This patch teaches computeSCEVAtScope about this case. We can generalize it later, but so far we can only replace LCSSA Phis with their constant loop-exiting values.  We should probably also add similiar logic directly in the SCEV construction path itself.

Patch by: mkazantsev (with revised commit message by me)
Differential Revision: https://reviews.llvm.org/D58113

llvm-svn: 363180
2019-06-12 17:21:47 +00:00
Matt Arsenault 2466ba97bc LoopDistribute/LAA: Respect convergent
This case is slightly tricky, because loop distribution should be
allowed in some cases, and not others. As long as runtime dependency
checks don't need to be introduced, this should be OK. This is further
complicated by the fact that LoopDistribute partially ignores if LAA
says that vectorization is safe, and then does its own runtime pointer
legality checks.

Note this pass still does not handle noduplicate correctly, as this
should always be forbidden with it. I'm not going to bother trying to
fix it, as it would require more effort and I think noduplicate should
be removed.

https://reviews.llvm.org/D62607

llvm-svn: 363160
2019-06-12 13:34:19 +00:00
Nico Weber 8bbdea447e Fix a Wunused-lambda-capture warning.
The capture was added in the first commit of https://reviews.llvm.org/D61934
when it was used. In the reland, the use was removed but the capture
wasn't removed.

llvm-svn: 363155
2019-06-12 12:46:46 +00:00
Sam Parker 61de6a4e9c [NFC][SCEV] Add NoWrapFlag argument to InsertBinOp
'Use wrap flags in InsertBinop' (rL362687) was reverted due to
miscompiles. This patch introduces the previous change to pass
no-wrap flags but now only FlagAnyWrap is passed.

Differential Revision: https://reviews.llvm.org/D61934

llvm-svn: 363147
2019-06-12 11:53:55 +00:00
Philip Reames 02f0b379f5 Fix a bug in getSCEVAtScope w.r.t. non-canonical loops
The issue is that if we have a loop with multiple predecessors outside the loop, the code was expecting to merge them and only return if equal, but instead returned the first one seen.

I have no idea if this actually tripped anywhere.  I noticed it by accident when reading the code and have no idea how to go about constructing a test case.

llvm-svn: 363112
2019-06-11 23:21:24 +00:00
Sanjay Patel 40e3bdf876 [Analysis] add isSplatValue() for vectors in IR
We have the related getSplatValue() already in IR (see code just above the proposed addition).
But sometimes we only need to know that the value is a splat rather than capture the splatted
scalar value. Also, we have an isSplatValue() function already in SDAG.

Motivation - recent bugs that would potentially benefit from improved splat analysis in IR:
https://bugs.llvm.org/show_bug.cgi?id=37428
https://bugs.llvm.org/show_bug.cgi?id=42174

Differential Revision: https://reviews.llvm.org/D63138

llvm-svn: 363106
2019-06-11 22:25:18 +00:00
Alina Sbirlea cb4ed8a7bc [MemorySSA] When applying updates, clean unnecessary Phis.
Summary: After applying a set of insert updates, there may be trivial Phis left over. Clean them up.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63033

llvm-svn: 363094
2019-06-11 19:09:34 +00:00
Alina Sbirlea 3cef1f7d64 Only passes that preserve MemorySSA must mark it as preserved.
Summary:
The method `getLoopPassPreservedAnalyses` should not mark MemorySSA as
preserved, because it's being called in a lot of passes that do not
preserve MemorySSA.
Instead, mark the MemorySSA analysis as preserved by each pass that does
preserve it.
These changes only affect the new pass mananger.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62536

llvm-svn: 363091
2019-06-11 18:27:49 +00:00
Philip Reames 4bf1c23990 Factor out a helper function for readability and reuse in a future patch [NFC]
llvm-svn: 362980
2019-06-10 20:41:27 +00:00
Sanjay Patel 866db10228 [InstSimplify] reduce code duplication for fcmp folds; NFC
llvm-svn: 362904
2019-06-09 13:58:46 +00:00
Sanjay Patel 73f5a855b3 [InstSimplify] enhance fcmp fold with never-nan operand
This is another step towards correcting our usage of fast-math-flags when applied on an fcmp.
In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of
the fcmp. But I'm leaving that clause in until we're more confident that we can stop
relying on fcmp's FMF.

By using the more general "isKnownNeverNaN()", we gain a simplification shown on the
tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN).
On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction
in addition to the FMF on the fcmp.

This is a continuation of D62979 / rL362879.

llvm-svn: 362903
2019-06-09 13:48:59 +00:00
Ayke van Laethem f18cf230e4 [CaptureTracking] Don't let comparisons against null escape inbounds pointers
Pointers that are in-bounds (either through dereferenceable_or_null or
thorough a getelementptr inbounds) cannot be captured with a comparison
against null. There is no way to construct a pointer that is still in
bounds but also NULL.

This helps safe languages that insert null checks before load/store
instructions. Without this patch, almost all pointers would be
considered captured even for simple loads. With this patch, an icmp with
null will not be seen as escaping as long as certain conditions are met.

There was a lot of discussion about this patch. See the Phabricator
thread for detals.

Differential Revision: https://reviews.llvm.org/D60047

llvm-svn: 362900
2019-06-09 10:20:33 +00:00
Sanjay Patel 4329c15f11 [InstSimplify] enhance fcmp fold with never-nan operand
This is 1 step towards correcting our usage of fast-math-flags when applied on an fcmp.
In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of
the fcmp. But I'm leaving that clause in until we're more confident that we can stop
relying on fcmp's FMF.

By using the more general "isKnownNeverNaN()", we gain a simplification shown on the
tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN).
On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction
in addition to the FMF on the fcmp.

I'll update the 'ult' case below here as a follow-up assuming no problems here.

Differential Revision: https://reviews.llvm.org/D62979

llvm-svn: 362879
2019-06-08 15:12:33 +00:00
Sanjay Patel e490e4a0e7 [Analysis] simplify code for getSplatValue(); NFC
AFAIK, this is only currently called by TTI, but it could be
used from instcombine or CGP to help solve problems like:
https://bugs.llvm.org/show_bug.cgi?id=37428
https://bugs.llvm.org/show_bug.cgi?id=42174

llvm-svn: 362810
2019-06-07 16:09:54 +00:00
Joerg Sonnenberger b2e96169b0 [NFC] Don't export helpers of ConstantFoldCall
llvm-svn: 362799
2019-06-07 13:28:52 +00:00
Sam Parker c5ef502ee8 [CodeGen] Generic Hardware Loop Support
Patch which introduces a target-independent framework for generating
hardware loops at the IR level. Most of the code has been taken from
PowerPC CTRLoops and PowerPC has been ported over to use this generic
pass. The target dependent parts have been moved into
TargetTransformInfo, via isHardwareLoopProfitable, with
HardwareLoopInfo introduced to transfer information from the backend.
    
Three generic intrinsics have been introduced:
- void @llvm.set_loop_iterations
  Takes as a single operand, the number of iterations to be executed.
- i1 @llvm.loop_decrement(anyint)
  Takes the maximum number of elements processed in an iteration of
  the loop body and subtracts this from the total count. Returns
  false when the loop should exit.
- anyint @llvm.loop_decrement_reg(anyint, anyint)
  Takes the number of elements remaining to be processed as well as
  the maximum numbe of elements processed in an iteration of the loop
  body. Returns the updated number of elements remaining.

llvm-svn: 362774
2019-06-07 07:35:30 +00:00
Craig Topper ca541b20d0 [CFLGraph] Add support for unary fneg instruction.
Differential Revision: https://reviews.llvm.org/D62791

llvm-svn: 362737
2019-06-06 19:21:23 +00:00
Craig Topper 6cda33ba36 [InlineCost] Add support for unary fneg.
This adds support for unary fneg based on the implementation of BinaryOperator without the soft float FP cost.

Previously we would just delegate to visitUnaryInstruction. I think the only real change is that we will pass the FastMath flags to SimplifyFNeg now.

Differential Revision: https://reviews.llvm.org/D62699

llvm-svn: 362732
2019-06-06 19:02:18 +00:00
Whitney Tsang 03e8369a72 [DA] Add an option to control delinearization validity checks
Summary: Dependence Analysis performs static checks to confirm validity
of delinearization. These checks often fail for 64-bit targets due to
type conversions and integer wrapping that prevent simplification of the
SCEV expressions. These checks would also fail at compile-time if the
lower bound of the loops are compile-time unknown.

For example:

void foo(int n, int m, int a[][m]) {
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < m; ++j) {
      a[i][j] = a[i+1][j-2];
    }
}

opt -mem2reg -instcombine -indvars -loop-simplify -loop-rotate -inline
-pass-remarks=.* -debug-pass=Arguments
-da-permissive-validity-checks=false k3.ll -analyze -da
will produce the following by default:

da analyze - anti [* *|<]!
but will produce the following expected dependence vector if the
validity checks are disabled:

da analyze - consistent anti [1 -2]!
This revision will introduce a debug option that will leave the validity
checks in place by default, but allow them to be turned off. New tests
are added for cases where it cannot be proven at compile-time that the
individual subscripts stay in-bound with respect to a particular
dimension of an array. These tests enable the option to provide user
guarantee that the subscripts do not over/under-flow into other
dimensions, thereby producing more accurate dependence vectors.

For prior discussion on this topic, leading to this change, please see
the following thread:
http://lists.llvm.org/pipermail/llvm-dev/2019-May/132372.html

Reviewers: Meinersbur, jdoerfert, kbarton, dmgreen, fhahn
Reviewed By: Meinersbur, jdoerfert, dmgreen
Subscribers: fhahn, hiraditya, javed.absar, llvm-commits, Whitney,
etiotto
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D62610

llvm-svn: 362711
2019-06-06 15:12:49 +00:00
Benjamin Kramer f1249442cf Revert "[SCEV] Use wrap flags in InsertBinop"
This reverts commit r362687. Miscompiles llvm-profdata during selfhost.

llvm-svn: 362699
2019-06-06 12:35:46 +00:00
Sam Parker 7cc580f5e9 [SCEV] Use wrap flags in InsertBinop
If the given SCEVExpr has no (un)signed flags attached to it, transfer
these to the resulting instruction or use them to find an existing
instruction.

Differential Revision: https://reviews.llvm.org/D61934

llvm-svn: 362687
2019-06-06 08:56:26 +00:00
Whitney Tsang 2d0896c1cb [LOOPINFO] Extend Loop object to add utilities to get the loop bounds,
step, and loop induction variable.

Summary: This PR extends the loop object with more utilities to get loop
bounds, step, and loop induction variable. There already exists passes
which try to obtain the loop induction variable in their own pass, e.g.
loop interchange. It would be useful to have a common area to get these
information.

/// Example:
/// for (int i = lb; i < ub; i+=step)
///   <loop body>
/// --- pseudo LLVMIR ---
/// beforeloop:
///   guardcmp = (lb < ub)
///   if (guardcmp) goto preheader; else goto afterloop
/// preheader:
/// loop:
///   i1 = phi[{lb, preheader}, {i2, latch}]
///   <loop body>
///   i2 = i1 + step
/// latch:
///   cmp = (i2 < ub)
///   if (cmp) goto loop
/// exit:
/// afterloop:
///
/// getBounds
///   getInitialIVValue      --> lb
///   getStepInst            --> i2 = i1 + step
///   getStepValue           --> step
///   getFinalIVValue        --> ub
///   getCanonicalPredicate  --> '<'
///   getDirection           --> Increasing
/// getInductionVariable          --> i1
/// getAuxiliaryInductionVariable --> {i1}
/// isCanonical                   --> false

Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara,
fhahn
Reviewed By: kbarton
Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya,
llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D60565

llvm-svn: 362644
2019-06-05 20:42:47 +00:00
Whitney Tsang 590b1aee60 Revert "Title: [LOOPINFO] Extend Loop object to add utilities to get the loop"
This reverts commit d34797dfc2.

llvm-svn: 362615
2019-06-05 15:32:56 +00:00
Benjamin Kramer b90b354798 [LoopInfo] Fix unused variable warning. NFC.
llvm-svn: 362610
2019-06-05 14:43:58 +00:00
Whitney Tsang d34797dfc2 Title: [LOOPINFO] Extend Loop object to add utilities to get the loop
bounds, step, and loop induction variable.

Summary: This PR extends the loop object with more utilities to get loop
bounds, step, and loop induction variable. There already exists passes
which try to obtain the loop induction variable in their own pass, e.g.
loop interchange. It would be useful to have a common area to get these
information.

/// Example:
/// for (int i = lb; i < ub; i+=step)
///   <loop body>
/// --- pseudo LLVMIR ---
/// beforeloop:
///   guardcmp = (lb < ub)
///   if (guardcmp) goto preheader; else goto afterloop
/// preheader:
/// loop:
///   i1 = phi[{lb, preheader}, {i2, latch}]
///   <loop body>
///   i2 = i1 + step
/// latch:
///   cmp = (i2 < ub)
///   if (cmp) goto loop
/// exit:
/// afterloop:
///
/// getBounds
///   getInitialIVValue      --> lb
///   getStepInst            --> i2 = i1 + step
///   getStepValue           --> step
///   getFinalIVValue        --> ub
///   getCanonicalPredicate  --> '<'
///   getDirection           --> Increasing
/// getInductionVariable          --> i1
/// getAuxiliaryInductionVariable --> {i1}
/// isCanonical                   --> false

Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara,
fhahn
Reviewed By: kbarton
Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya,
llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D60565

llvm-svn: 362609
2019-06-05 14:34:12 +00:00
Nemanja Ivanovic fe97754acf Initial support for IBM MASS vector library
This is the LLVM portion of patch https://reviews.llvm.org/D59881.
The clang portion is to follow.

llvm-svn: 362568
2019-06-05 01:31:43 +00:00
Johannes Doerfert 40107ce753 Introduce Value::stripPointerCastsSameRepresentation
This patch allows current users of Value::stripPointerCasts() to force
the result of the function to have the same representation as the value
it was called on. This is useful in various cases, e.g., (non-)null
checks.

In this patch only a single call site was adjusted to fix an existing
misuse that would cause nonnull where they may be wrong. Uses in
attribute deduction and other areas, e.g., D60047, are to be expected.

For a discussion on this topic, please see [0].

[0] http://lists.llvm.org/pipermail/llvm-dev/2018-December/128423.html

Reviewers: hfinkel, arsenm, reames

Subscribers: wdng, hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61607

llvm-svn: 362545
2019-06-04 20:21:46 +00:00
Nikita Popov df621bdfc8 [LVI][CVP] Add support for urem, srem and sdiv
The underlying ConstantRange functionality has been added in D60952,
D61207 and D61238, this just exposes it for LVI.

I'm switching the code from using a whitelist to a blacklist, as
we're down to one unsupported operation here (xor) and writing it
this way seems more obvious :)

Differential Revision: https://reviews.llvm.org/D62822

llvm-svn: 362519
2019-06-04 16:24:09 +00:00
George Burgess IV c24a2f4ad9 CFLAA: reflow comments; NFC
llvm-svn: 362442
2019-06-03 19:56:22 +00:00
Craig Topper 7a4eabef39 [CFLGraph] Add FAdd to visitConstantExpr.
This looks like an oversight as all the other binary operators are present.

Accidentally noticed while auditing places that need FNeg handling.

No test because as noted in the review it would be contrived and amount to "don't crash"

Differential Revision: https://reviews.llvm.org/D62790

llvm-svn: 362441
2019-06-03 19:35:52 +00:00
Craig Topper 7cebf0af40 [InlineCost] Don't add the soft float function call cost for the fneg idiom, fsub -0.0, %x
Summary: Fneg can be implemented with an xor rather than a function call so we don't need to add the function call overhead. This was pointed out in D62699

Reviewers: efriedma, cameron.mcinally

Reviewed By: efriedma

Subscribers: javed.absar, eraman, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62747

llvm-svn: 362304
2019-06-01 19:40:07 +00:00
Erik Pilkington abb2a93c53 [SimplifyLibCalls] Fold more fortified functions into non-fortified variants
When the object size argument is -1, no checking can be done, so calling the
_chk variant is unnecessary. We already did this for a bunch of these
functions.

rdar://50797197

Differential revision: https://reviews.llvm.org/D62358

llvm-svn: 362272
2019-05-31 22:41:36 +00:00
Russell Gallop 802c9b59d5 ftime-trace: Trace loop passes
These can take a significant amount of time in some builds.

Suggested by Andrea Di Biagio.

Differential Revision: https://reviews.llvm.org/D62666

llvm-svn: 362219
2019-05-31 10:14:04 +00:00
Craig Topper b457e430f3 [InstructionSimplify] Add missing implementation of llvm::SimplifyUnOp. NFC
There are no callers currently, but the function is declared so we should at
least implement it.

llvm-svn: 362205
2019-05-31 08:10:23 +00:00
Nikita Popov 332c100562 [ValueTracking][ConstantRange] Distinguish low/high always overflow
In order to fold an always overflowing signed saturating add/sub,
we need to know in which direction the always overflow occurs.
This patch splits up AlwaysOverflows into AlwaysOverflowsLow and
AlwaysOverflowsHigh to pass through this information (but it is
not used yet).

Differential Revision: https://reviews.llvm.org/D62463

llvm-svn: 361858
2019-05-28 18:08:31 +00:00
Craig Topper ab53c5e5ab [InlineCost] Fix a couple comments. NFC
Replace "unary operator" with "unary instruction" in visitUnaryInstruction since
we now have a UnaryOperator class which might needs its own visit function.

Fix a copy/paste in visitCastInst that appears to have been copied from
visitPtrToInt.

llvm-svn: 361794
2019-05-28 07:25:27 +00:00
Craig Topper 50d502826b [CostModel] Add really basic support for being able to query the cost of the FNeg instruction.
Summary:
This reuses the getArithmeticInstrCost, but passes dummy values of the second
operand flags.

The X86 costs are wrong and can be improved in a follow up. I just wanted to
stop it from reporting an unknown cost first.

Reviewers: RKSimon, spatel, andrew.w.kaylor, cameron.mcinally

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62444

llvm-svn: 361788
2019-05-28 04:09:18 +00:00
Xing Xue 3860aad6e7 [MustExecute] Improve MustExecute to correctly handle loop nest
Summary:
for.outer:
  br for.inner
for.inner:
  LI <loop invariant load instruction>
for.inner.latch:
  br for.inner, for.outer.latch
for.outer.latch:
  br for.outer, for.outer.exit

LI is a loop invariant load instruction that post dominate for.outer, so LI should be able to move out of the loop nest. However, there is a bug in allLoopPathsLeadToBlock().

Current algorithm of allLoopPathsLeadToBlock()

  1. get all the transitive predecessors of the basic block LI belongs to (for.inner) ==> for.outer, for.inner.latch
  2. if any successors of any of the predecessors are not for.inner or for.inner's predecessors, then return false
  3. return true

Although for.inner.latch is for.inner's predecessor, but for.inner dominates for.inner.latch, which means if for.inner.latch is ever executed, for.inner should be as well. It should not return false for cases like this.

Author: Whitney (committed by xingxue)

Reviewers: kbarton, jdoerfert, Meinersbur, hfinkel, fhahn

Reviewed By: jdoerfert

Subscribers: hiraditya, jsji, llvm-commits, etiotto, bmahjour

Tags: #LLVM

Differential Revision: https://reviews.llvm.org/D62418

llvm-svn: 361762
2019-05-27 13:57:28 +00:00
Nikita Popov d0f13e618f [ValueTracking] Base computeOverflowForUnsignedMul() on ConstantRange code; NFCI
The implementation in ValueTracking and ConstantRange are equally
powerful, reuse the one in ConstantRange, which will make this easier
to extend.

llvm-svn: 361723
2019-05-26 13:22:01 +00:00
Nikita Popov 6bb5041e94 [LVI][CVP] Add support for saturating add/sub
Adds support for the uadd.sat family of intrinsics in LVI, based on
ConstantRange methods from D60946.

Differential Revision: https://reviews.llvm.org/D62447

llvm-svn: 361703
2019-05-25 16:44:14 +00:00
Nikita Popov 024b18aca7 [LVI][CVP] Calculate with.overflow result range
In LVI, calculate the range of extractvalue(op.with.overflow(%x, %y), 0)
as the range of op(%x, %y). This is mainly useful in conjunction with
D60650: If the result of the operation is extracted in a branch guarded
against overflow, then the value of %x will be appropriately constrained
and the result range of the operation will be calculated taking that
into account.

Differential Revision: https://reviews.llvm.org/D60656

llvm-svn: 361693
2019-05-25 09:53:45 +00:00
Nikita Popov 17367b0d89 [LVI] Extract helper for binary range calculations; NFC
llvm-svn: 361692
2019-05-25 09:53:37 +00:00
Sanjay Patel 8869a98e82 [InstSimplify] fold insertelement-of-extractelement
This was partly handled in InstCombine (only the constant
index case), so delete that and zap it more generally in
InstSimplify.

llvm-svn: 361576
2019-05-24 00:13:58 +00:00
Sanjay Patel e60cb7d1be [InstSimplify] insertelement V, undef, ? --> V
This was part of InstCombine, but it's better placed in
InstSimplify. InstCombine also had an unreachable but weaker
fold for insertelement with undef index, so that is deleted.

llvm-svn: 361559
2019-05-23 21:49:47 +00:00
Kit Barton 987fdfd9a7 Revert [LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, induction variable, and guard branch.
This reverts r361517 (git commit 2049e4dd8f)

llvm-svn: 361553
2019-05-23 20:53:05 +00:00
Kit Barton 2049e4dd8f [LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, induction variable, and guard branch.
Summary:
    This PR extends the loop object with more utilities to get loop bounds, step, induction variable, and guard branch. There already exists passes which try to obtain the loop induction variable in their own pass, e.g. loop interchange. It would be useful to have a common area to get these information. Moreover, loop fusion (https://reviews.llvm.org/D55851) is planning to use getGuard() to extend the kind of loops it is able to fuse, e.g. rotated loop with non-constant upper bound, which would have a loop guard.

      /// Example:
      /// for (int i = lb; i < ub; i+=step)
      ///   <loop body>
      /// --- pseudo LLVMIR ---
      /// beforeloop:
      ///   guardcmp = (lb < ub)
      ///   if (guardcmp) goto preheader; else goto afterloop
      /// preheader:
      /// loop:
      ///   i1 = phi[{lb, preheader}, {i2, latch}]
      ///   <loop body>
      ///   i2 = i1 + step
      /// latch:
      ///   cmp = (i2 < ub)
      ///   if (cmp) goto loop
      /// exit:
      /// afterloop:
      ///
      /// getBounds
      ///   getInitialIVValue      --> lb
      ///   getStepInst            --> i2 = i1 + step
      ///   getStepValue           --> step
      ///   getFinalIVValue        --> ub
      ///   getCanonicalPredicate  --> '<'
      ///   getDirection           --> Increasing
      /// getGuard             --> if (guardcmp) goto loop; else goto afterloop
      /// getInductionVariable          --> i1
      /// getAuxiliaryInductionVariable --> {i1}
      /// isCanonical                   --> false

    Committed on behalf of @Whitney (Whitney Tsang).

    Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara, fhahn

    Reviewed By: kbarton

    Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D60565

llvm-svn: 361517
2019-05-23 17:56:35 +00:00
Sanjay Patel 63fa690617 [InstSimplify] update stale comment; NFC
Missed this diff with rL361118.

llvm-svn: 361180
2019-05-20 17:52:18 +00:00
Nick Desaulniers 639b29b1b5 [INLINER] allow inlining of blockaddresses if sole uses are callbrs
Summary:
It was supposed that Ref LazyCallGraph::Edge's were being inserted by
inlining, but that doesn't seem to be the case.  Instead, it seems that
there was no test for a blockaddress Constant in an instruction that
referenced the function that contained the instruction. Ex:

```
define void @f() {
  %1 = alloca i8*, align 8
2:
  store i8* blockaddress(@f, %2), i8** %1, align 8
  ret void
}
```

When iterating blockaddresses, do not add the function they refer to
back to the worklist if the blockaddress is referring to the contained
function (as opposed to an external function).

Because blockaddress has sligtly different semantics than GNU C's
address of labels, there are 3 cases that can occur with blockaddress,
where only 1 can happen in GNU C due to C's scoping rules:
* blockaddress is within the function it refers to (possible in GNU C).
* blockaddress is within a different function than the one it refers to
(not possible in GNU C).
* blockaddress is used in to declare a global (not possible in GNU C).

The second case is tested in:

```
$ ./llvm/build/unittests/Analysis/AnalysisTests \
  --gtest_filter=LazyCallGraphTest.HandleBlockAddress
```

This patch adjusts the iteration of blockaddresses in
LazyCallGraph::visitReferences to not revisit the blockaddresses
function in the first case.

The Linux kernel contains code that's not semantically valid at -O0;
specifically code passed to asm goto. It requires that asm goto be
inline-able. This patch conservatively does not attempt to handle the
more general case of inlining blockaddresses that have non-callbr users
(pr/39560).

https://bugs.llvm.org/show_bug.cgi?id=39560
https://bugs.llvm.org/show_bug.cgi?id=40722
https://github.com/ClangBuiltLinux/linux/issues/6
https://reviews.llvm.org/rL212077

Reviewers: jyknight, eli.friedman, chandlerc

Reviewed By: chandlerc

Subscribers: george.burgess.iv, nathanchance, mgorny, craig.topper, mengxu.gatech, void, mehdi_amini, E5ten, chandlerc, efriedma, eraman, hiraditya, haicheng, pirama, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58260

llvm-svn: 361173
2019-05-20 16:48:09 +00:00
Cameron McInally 2d2a46db8e [InstSimplify] Teach fsub -0.0, (fneg X) ==> X about unary fneg
Differential Revision: https://reviews.llvm.org/D62077

llvm-svn: 361151
2019-05-20 13:13:35 +00:00
Sanjay Patel 9ef99b4b11 [InstSimplify] fold fcmp (maxnum, X, C1), C2
This is the sibling transform for rL360899 (D61691):

  maxnum(X, GreaterC) == C --> false
  maxnum(X, GreaterC) <= C --> false
  maxnum(X, GreaterC) <  C --> false
  maxnum(X, GreaterC) >= C --> true
  maxnum(X, GreaterC) >  C --> true
  maxnum(X, GreaterC) != C --> true

llvm-svn: 361118
2019-05-19 14:26:39 +00:00
Cameron McInally 067e946859 [InstSimplify] Add unary fneg to `fsub 0.0, (fneg X) ==> X` transform
Differential Revision: https://reviews.llvm.org/D62013

llvm-svn: 361047
2019-05-17 16:47:00 +00:00
Sanjay Patel 152f81fae8 [InstSimplify] fold fcmp (minnum, X, C1), C2
minnum(X, LesserC) == C --> false
   minnum(X, LesserC) >= C --> false
   minnum(X, LesserC) >  C --> false
   minnum(X, LesserC) != C --> true
   minnum(X, LesserC) <= C --> true
   minnum(X, LesserC) <  C --> true

maxnum siblings will follow if there are no problems here.

We should be able to perform some other combines when the constants
are equal or greater-than too, but that would go in instcombine.

We might also generalize this by creating an FP ConstantRange
(similar to what we do for integers).

Differential Revision: https://reviews.llvm.org/D61691

llvm-svn: 360899
2019-05-16 14:03:10 +00:00
Fangrui Song 3e92df3e39 Add Triple::isPPC64()
llvm-svn: 360864
2019-05-16 08:31:22 +00:00
Cameron McInally 0c82d9b5a2 Teach InstSimplify -X + X --> 0.0 about unary FNeg
Differential Revision: https://reviews.llvm.org/D61916

llvm-svn: 360777
2019-05-15 14:31:33 +00:00
Nikita Popov 48c4e4fa80 [LVI][CVP] Add support for abs/nabs select pattern flavor
Based on ConstantRange support added in D61084, we can now handle
abs and nabs select pattern flavors in LVI.

Differential Revision: https://reviews.llvm.org/D61794

llvm-svn: 360700
2019-05-14 18:53:47 +00:00
Kit Barton 37b7922daa Save the induction binary operator in IVDescriptors for non FP induction variables.
Summary:
Currently InductionBinOps are only saved for FP induction variables, the PR extends it with non FP induction variable, so user of IVDescriptors can query the InductionBinOps for integer induction variables.

The changes in hasUnsafeAlgebra() and getUnsafeAlgebraInst() are required for the existing LIT test cases to pass. As described in the comment of the two functions, one of the requirement to return true is it is a FP induction variable. The checks was not needed because InductionBinOp was not set on non FP cases before.

https://reviews.llvm.org/D60565 depends on the patch.

Committed on behalf of @Whitney (Whitney Tsang).

Reviewers: jdoerfert, kbarton, fhahn, hfinkel, dmgreen, Meinersbur

Reviewed By: jdoerfert

Subscribers: mgorny, hiraditya, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61329

llvm-svn: 360671
2019-05-14 13:26:36 +00:00
Teresa Johnson 37b80122bd [ThinLTO] Auto-hide prevailing linkonce_odr only when all copies eligible
Summary:
We hit undefined references building with ThinLTO when one source file
contained explicit instantiations of a template method (weak_odr) but
there were also implicit instantiations in another file (linkonce_odr),
and the latter was the prevailing copy. In this case the symbol was
marked hidden when the prevailing linkonce_odr copy was promoted to
weak_odr. It led to unsats when the resulting shared library was linked
with other code that contained a reference (expecting to be resolved due
to the explicit instantiation).

Add a CanAutoHide flag to the GV summary to allow the thin link to
identify when all copies are eligible for auto-hiding (because they were
all originally linkonce_odr global unnamed addr), and only do the
auto-hide in that case.

Most of the changes here are due to plumbing the new flag through the
bitcode and llvm assembly, and resulting test changes. I augmented the
existing auto-hide test to check for this situation.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, dexonsmith, arphaman, dang, llvm-commits, steven_wu, wmi

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59709

llvm-svn: 360466
2019-05-10 20:08:24 +00:00
Philip Reames 76ea748d2d Compile time tweak for libcall lookup
If we have a large module which is mostly intrinsics, we hammer the lib call lookup path from CodeGenPrepare.  Adding a fastpath reduces compile by 15% for one such example.

The problem is really more general than intrinsics - a module with lots of non-intrinsics non-libcall calls has the same problem - but we might as well avoid an easy case quickly.

llvm-svn: 360391
2019-05-09 23:13:09 +00:00
Warren Ristow d27b0c6247 [SCEV] Suppress hoisting insertion point of binops when unsafe
InsertBinop tries to move insertion-points out of loops for expressions
that are loop-invariant. This patch adds a new parameter, IsSafeToHost,
to guard that hoisting. This allows callers to suppress that hoisting
for unsafe situations, such as divisions that may have a zero
denominator.

This fixes PR38697.

Differential Revision: https://reviews.llvm.org/D55232

llvm-svn: 360280
2019-05-08 18:50:07 +00:00
Alina Sbirlea f31eba6494 [MemorySSA] Teach LoopSimplify to preserve MemorySSA.
Summary:
Preserve MemorySSA in LoopSimplify, in the old pass manager, if the analysis is available.
Do not preserve it in the new pass manager.
Update tests.

Subscribers: nemanjai, jlebar, javed.absar, Prazek, kbarton, zzheng, jsji, llvm-commits, george.burgess.iv, chandlerc

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60833

llvm-svn: 360270
2019-05-08 17:05:36 +00:00
Nikita Popov 9fd02a71a3 Revert "[ValueTracking] Improve isKnowNonZero for Ints"
This reverts commit 3b137a4956.

As reported in https://reviews.llvm.org/D60846, this is causing
miscompiles.

llvm-svn: 360260
2019-05-08 14:50:01 +00:00
Dan Robertson 3b137a4956 [ValueTracking] Improve isKnowNonZero for Ints
Improve isKnownNonZero for integers in order to improve cttz
optimizations.

Differential Revision: https://reviews.llvm.org/D60846

llvm-svn: 360222
2019-05-08 02:25:08 +00:00
Sanjay Patel e088d03b9c [ValueTracking] add logic for known-never-nan with minnum/maxnum
From the LangRef: "Returns NaN only if both operands are NaN."

llvm-svn: 360206
2019-05-07 22:58:31 +00:00
Keno Fischer a1a4adf4b9 [SCEV] Add explicit representations of umin/smin
Summary:
Currently we express umin as `~umax(~x, ~y)`. However, this becomes
a problem for operands in non-integral pointer spaces, because `~x`
is not something we can compute for `x` non-integral. However, since
comparisons are generally still allowed, we are actually able to
express `umin(x, y)` directly as long as we don't try to express is
as a umax. Support this by adding an explicit umin/smin representation
to SCEV. We do this by factoring the existing getUMax/getSMax functions
into a new function that does all four. The previous two functions were
largely identical.

Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D50167

llvm-svn: 360159
2019-05-07 15:28:47 +00:00
Cameron McInally c3167696bc Add FNeg support to InstructionSimplify
Differential Revision: https://reviews.llvm.org/D61573

llvm-svn: 360053
2019-05-06 16:05:10 +00:00
Cameron McInally 1d0c845d9d Add FNeg IR constant folding support
llvm-svn: 359982
2019-05-05 16:07:09 +00:00
Alina Sbirlea 0363c3b8bb [MemorySSA] Check that block is reachable when adding phis.
Summary:
Originally the insertDef method was only used when building MemorySSA, and was limiting the number of Phi nodes that it created.
Now it's used for updates as well, and it can create additional Phis needed for correctness.
Make sure no Phis are created in unreachable blocks (condition met during MSSA build), otherwise the renamePass will find a null DTNode.

Resolves PR41640.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61410

llvm-svn: 359845
2019-05-02 23:41:58 +00:00
Alina Sbirlea 151ab4844a [MemorySSA] Refactor removing multiple trivial phis [NFC].
Summary: Create a method to clean up multiple potentially trivial phis, since we will need this often.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61471

llvm-svn: 359842
2019-05-02 23:12:49 +00:00
Keno Fischer a3e4b3bd33 [SCEV] Use isKnownViaNonRecursiveReasoning for smax simplification
Summary:
Commit
	rL331949: SCEV] Do not use induction in isKnownPredicate for simplification umax

changed the codepath for umax from isKnownPredicate to
isKnownViaNonRecursiveReasoning to avoid compile time blow up (and as
I found out also stack overflows). However, there is an exact copy of
the code for umax that was lacking this change. In D50167 I want to unify
these codepaths, but to avoid that being a behavior change for the smax
case, pull this independent bit out of it.

Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D61166

llvm-svn: 359693
2019-05-01 15:58:24 +00:00
Keno Fischer d8f856d265 [LoopInfo] Faster implementation of setLoopID. NFC.
Summary:
This change was part of D46460. However, in the meantime rL341926 fixed the
correctness issue here. What remained was the performance issue in setLoopID
where it would iterate through all blocks in the loop and their successors,
rather than just the predecessor of the header (the later presumably being
much faster). We already have the `getLoopLatches` to compute precisely these
basic blocks in an efficient manner, so just use it (as the original commit
did for `getLoopID`).

Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D61215

llvm-svn: 359684
2019-05-01 14:39:11 +00:00
Alina Sbirlea b468320313 [MemorySSA] Invalidate MemorySSA if AA or DT are invalidated.
Summary:
MemorySSA keeps internal pointers of AA and DT.
If these get invalidated, so should MemorySSA.

Reviewers: george.burgess.iv, chandlerc

Subscribers: jlebar, Prazek, llvm-commits

Tags: LLVM

Differential Revision: https://reviews.llvm.org/D61043

llvm-svn: 359627
2019-04-30 22:43:55 +00:00
Alina Sbirlea ba48a2c5e8 [AliasAnalysis/NewPassManager] Invalidate AAManager less often.
Summary:
This is a redo of D60914.

The objective is to not invalidate AAManager, which is stateless, unless
there is an explicit invalidate in one of the AAResults.

To achieve this, this patch adds an API to PAC, to check precisely this:
is this analysis not invalidated explicitly == is this analysis not abandoned == is this analysis stateless, so preserved without explicitly being marked as preserved by everyone

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61284

llvm-svn: 359622
2019-04-30 22:15:47 +00:00
Fedor Sergeev eeae45dc77 [NFC][InlineCost] cleanup - comments, overflow handling.
Reviewed By: apilipenko
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60751

llvm-svn: 359609
2019-04-30 20:44:53 +00:00
Simon Pilgrim f5e8f222d6 Revert rL359519 : [MemorySSA] Invalidate MemorySSA if AA or DT are invalidated.
Summary:
MemorySSA keeps internal pointers of AA and DT.
If these get invalidated, so should MemorySSA.

Reviewers: george.burgess.iv, chandlerc

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61043
........
This was causing windows build bot failures

llvm-svn: 359555
2019-04-30 12:34:21 +00:00
Sjoerd Meijer ea31ddb36f [ARM] Implement TTI::getMemcpyCost
This implements TargetTransformInfo method getMemcpyCost, which estimates the
number of instructions to which a memcpy instruction expands to.

Differential Revision: https://reviews.llvm.org/D59787

llvm-svn: 359547
2019-04-30 10:28:50 +00:00
Alina Sbirlea 9a1edd14a2 [MemorySSA] Invalidate MemorySSA if AA or DT are invalidated.
Summary:
MemorySSA keeps internal pointers of AA and DT.
If these get invalidated, so should MemorySSA.

Reviewers: george.burgess.iv, chandlerc

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61043

llvm-svn: 359519
2019-04-29 23:53:04 +00:00
Nikita Popov 7a94795b2b [ConstantRange] Add makeExactNoWrapRegion()
I got confused on the terminology, and the change in D60598 was not
correct. I was thinking of "exact" in terms of the result being
non-approximate. However, the relevant distinction here is whether
the result is

 * Largest range such that:
   Forall Y in Other: Forall X in Result: X BinOp Y does not wrap.
   (makeGuaranteedNoWrapRegion)
 * Smallest range such that:
   Forall Y in Other: Forall X not in Result: X BinOp Y wraps.
   (A hypothetical makeAllowedNoWrapRegion)
 * Both. (makeExactNoWrapRegion)

I'm adding a separate makeExactNoWrapRegion method accepting a
single APInt (same as makeExactICmpRegion) and using it in the
places where the guarantee is relevant.

Differential Revision: https://reviews.llvm.org/D60960

llvm-svn: 359402
2019-04-28 15:40:56 +00:00
Philip Reames 88cd69b56f Consolidate existing utilities for interpreting vector predicate maskes [NFC]
llvm-svn: 359163
2019-04-25 02:30:17 +00:00
Xinliang David Li 499c80b890 Add optional arg to profile count getters to filter
synthetic profile count.

Differential Revision: http://reviews.llvm.org/D61025

llvm-svn: 359131
2019-04-24 19:51:16 +00:00
Bjorn Pettersson 71e8c6f20f Add "const" in GetUnderlyingObjects. NFC
Summary:
Both the input Value pointer and the returned Value
pointers in GetUnderlyingObjects are now declared as
const.

It turned out that all current (in-tree) uses of
GetUnderlyingObjects were trivial to update, being
satisfied with have those Value pointers declared
as const. Actually, in the past several of the users
had to use const_cast, just because of ValueTracking
not providing a version of GetUnderlyingObjects with
"const" Value pointers. With this patch we get rid
of those const casts.

Reviewers: hfinkel, materi, jkorous

Reviewed By: jkorous

Subscribers: dexonsmith, jkorous, jholewinski, sdardis, eraman, hiraditya, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61038

llvm-svn: 359072
2019-04-24 06:55:50 +00:00
Alina Sbirlea b341efce31 Revert [AliasAnalysis] AAResults preserves AAManager.
Triggers use-after-free.

llvm-svn: 359055
2019-04-24 00:28:29 +00:00
Josh Stone 27924c3a3c [Lint] Permit aliasing noalias readonly arguments
Summary:
If two arguments are both readonly, then they have no memory dependency
that would violate noalias, even if they do actually overlap.

Reviewers: hfinkel, efriedma

Reviewed By: efriedma

Subscribers: efriedma, hiraditya, llvm-commits, tstellar

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60239

llvm-svn: 359047
2019-04-23 23:43:47 +00:00
Alina Sbirlea 4fd1f266b1 [MemorySSA] LCSSA preserves MemorySSA.
Summary:
Enabling MemorySSA in the old pass manager leads to MemorySSA being run
twice due to the fact that LCSSA and LoopSimplify do not preserve
MemorySSA. This is the first step to address that: target LCSSA.

LCSSA does not make any changes that invalidate MemorySSA, so it
preserves it by design. It must preserve AA as well, for this to hold.

After this patch, MemorySSA is still run twice in the old pass manager.
Step two follows: target LoopSimplify.

Subscribers: mehdi_amini, jlebar, Prazek, llvm-commits, george.burgess.iv, chandlerc

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60832

llvm-svn: 359032
2019-04-23 20:59:44 +00:00
Alina Sbirlea a809e8e5e7 [AliasAnalysis] AAResults preserves AAManager.
Summary:
AAResults should not invalidate AAManager.
Update tests.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60914

llvm-svn: 359014
2019-04-23 17:21:18 +00:00
Fangrui Song efd94c56ba Use llvm::stable_sort
While touching the code, simplify if feasible.

llvm-svn: 358996
2019-04-23 14:51:27 +00:00
Fedor Sergeev 652168a99b [CallSite removal] move InlineCost to CallBase usage
Converting InlineCost interface and its internals into CallBase usage.
Inliners themselves are still not converted.

Reviewed By: reames
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60636

llvm-svn: 358982
2019-04-23 12:43:27 +00:00
Philip Reames d8d9b7b20e [InstSimplify] Move masked.gather w/no active lanes handling to InstSimplify from InstCombine
In the process, use the existing masked.load combine which is slightly stronger, and handles a mix of zero and undef elements in the mask.  

llvm-svn: 358913
2019-04-22 19:30:01 +00:00
Nikita Popov 5aacc7a573 Revert "[ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFC"
This reverts commit 7bf4d7c07f2fac862ef34c82ad0fef6513452445.

After thinking about this more, this isn't right, the range is not exact
in the same sense as makeExactICmpRegion(). This needs a separate
function.

llvm-svn: 358876
2019-04-22 09:01:38 +00:00
Nikita Popov 5299e25f50 [ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFC
Following D60632 makeGuaranteedNoWrapRegion() always returns an
exact nowrap region. Rename the function accordingly. This is in
line with the naming of makeExactICmpRegion().

llvm-svn: 358875
2019-04-22 08:36:05 +00:00
Nikita Popov dbc3fbafe7 [ConstantRange] Add getNonEmpty() constructor
ConstantRanges have an annoying special case: If upper and lower are
the same, it can be either an empty or a full set. When constructing
constant ranges nearly always a full set is intended, but this still
requires an explicit check in many places.

This revision adds a getNonEmpty() constructor that disambiguates this
case: If upper and lower are the same, a full set is created.

Differential Revision: https://reviews.llvm.org/D60947

llvm-svn: 358854
2019-04-21 15:22:54 +00:00
Chandler Carruth ce3f75df1f [CallSite removal] Move the legacy PM, call graph, and some inliner
code to `CallBase`.

This patch focuses on the legacy PM, call graph, and some of inliner and legacy
passes interacting with those APIs from `CallSite` to the new `CallBase` class.
No interesting changes.

Differential Revision: https://reviews.llvm.org/D60412

llvm-svn: 358739
2019-04-19 05:59:42 +00:00
Nicolai Haehnle 523f90a2ba [SDA] Bug fix: Use IPD outside the loop as divergence bound
Summary:
The immediate post dominator of the loop header may be part of the divergent loop.
Since this /was/ the divergence propagation bound the SDA would not detect joins of divergent paths outside the loop.

Reviewers: nhaehnle

Reviewed By: nhaehnle

Subscribers: mmasten, arsenm, jvesely, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59042

llvm-svn: 358681
2019-04-18 16:17:35 +00:00
Philip Reames b2c9fc02d5 Fix a bug in SCEV's isSafeToExpand around speculation safety
isSafeToExpand was making a common, but dangerously wrong, mistake in assuming that if any instruction within a basic block executes, that all instructions within that block must execute.  This can be trivially shown to be false by considering the following small example:
bb:
  add x, y  <-- InsertionPoint
  call @throws()
  udiv x, y <-- SCEV* S
  br ...

It's clearly not legal to expand S above the throwing call, but the previous logic would do so since S dominates (but not properlyDominates) the block containing the InsertionPoint.

Since iterating instructions w/in a block is expensive, this change special cases two cases: 1) S is an operand of InsertionPoint, and 2) InsertionPoint is the terminator of it's block.  These two together are enough to keep all current optimizations triggering while fixing the latent correctness issue.

As best I can tell, this is a silent bug in current ToT.  Given that, there's no tests with this change.  It was noticed in an upcoming optimization change (D60093), and was reviewed as part of that.  That change will include the test which caused me to notice the issue.  I'm submitting this seperately so that anyone bisecting a problem gets a clear explanation.

llvm-svn: 358680
2019-04-18 16:10:21 +00:00
Nikita Popov 2039581002 [LVI][CVP] Constrain values in with.overflow branches
If a branch is conditional on extractvalue(op.with.overflow(%x, C), 1)
then we can constrain the value of %x inside the branch based on
makeGuaranteedNoWrapRegion(). We do this by extending the edge-value
handling in LVI. This allows CVP to then fold comparisons against %x,
as illustrated in the tests.

Differential Revision: https://reviews.llvm.org/D60650

llvm-svn: 358597
2019-04-17 16:57:42 +00:00
Nikita Popov 79dffc67b5 [IR] Add WithOverflowInst class
This adds a WithOverflowInst class with a few helper methods to get
the underlying binop, signedness and nowrap type and makes use of it
where sensible. There will be two more uses in D60650/D60656.

The refactorings are all NFC, though I left some TODOs where things
could be improved. In particular we have two places where add/sub are
handled but mul isn't.

Differential Revision: https://reviews.llvm.org/D60668

llvm-svn: 358512
2019-04-16 18:55:16 +00:00
Alina Sbirlea f9f073a861 [MemorySSA] Add previous def to cache when found, even if trivial.
Summary:
When inserting a new Def, MemorySSA may be have non-minimal number of Phis.
While inserting, the walk to find the previous definition may cleanup minimal Phis.
When the last definition is trivial to obtain, we do not cache it.

It is possible while getting the previous definition for a Def to get two different answers:
- one that was straight-forward to find when walking the first path (a trivial phi in this case), and
- another that follows a cleanup of the trivial phi, it determines it may need additional Phi nodes, it inserts them and returns a new phi in the same position as the former trivial one.
While the Phis added for the second path are all redundant, they are not complete (the walk is only done upwards), and they are not properly cleaned up afterwards.

A way to fix this problem is to cache the straight-forward answer we got on the first walk.
The caching is only kept for the duration of a getPreviousDef call, and for Phis we use TrackingVH, so removing the trivial phi will lead to replacing it with the next dominating phi in the cache.
Resolves PR40749.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60634

llvm-svn: 358313
2019-04-12 21:58:52 +00:00
Alina Sbirlea 2312a06c87 [SCEV] Add option to forget everything in SCEV.
Summary:
Create a method to forget everything in SCEV.
Add a cl::opt and PassManagerBuilder option to use this in LoopUnroll.

Motivation: Certain Halide applications spend a very long time compiling in forgetLoop, and prefer to forget everything and rebuild SCEV from scratch.
Sample difference in compile time reduction: 21.04 to 14.78 using current ToT release build.
Testcase showcasing this cannot be opensourced and is fairly large.

The option disabled by default, but it may be desirable to enable by
default. Evidence in favor (two difference runs on different days/ToT state):

File Before (s) After (s)
clang-9.bc 7267.91 6639.14
llvm-as.bc 194.12 194.12
llvm-dis.bc 62.50 62.50
opt.bc 1855.85 1857.53

File Before (s) After (s)
clang-9.bc 8588.70 7812.83
llvm-as.bc 196.20 194.78
llvm-dis.bc 61.55 61.97
opt.bc 1739.78 1886.26

Reviewers: sanjoy

Subscribers: mehdi_amini, jlebar, zzheng, javed.absar, dmgreen, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60144

llvm-svn: 358304
2019-04-12 19:16:07 +00:00
Alina Sbirlea 57769382b1 [MemorySSA] Small fix for the clobber limit.
Summary:
After introducing the limit for clobber walking, `walkToPhiOrClobber` would assert that the limit is at least 1 on entry.
The test included triggered that assert.

The callsite in `tryOptimizePhi` making the calls to `walkToPhiOrClobber` is structured like this:
```
while (true) {
   if (getBlockingAccess()) { // calls walkToPhiOrClobber
   }
   for (...) {
     walkToPhiOrClobber();
   }
}
```

The cleanest fix is to check if the limit was reached inside `walkToPhiOrClobber`, and give an allowence of 1.
This approach not make any alias() calls (no calls to instructionClobbersQuery), so the performance condition is enforced.
The limit is set back to 0 if not used, as this provides info on the fact that we stopped before reaching a true clobber.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60479

llvm-svn: 358303
2019-04-12 18:48:46 +00:00
Fangrui Song cecc435250 Use llvm::lower_bound. NFC
This reapplies rL358161. That commit inadvertently reverted an exegesis file to an old version.

llvm-svn: 358246
2019-04-12 02:02:06 +00:00
Ali Tamur 7822b46188 Revert "Use llvm::lower_bound. NFC"
This reverts commit rL358161.

This patch have broken the test:
llvm/test/tools/llvm-exegesis/X86/uops-CMOV16rm-noreg.s

llvm-svn: 358199
2019-04-11 17:35:20 +00:00
Sander de Smalen 4f5d2df48d [ValueTracking] Change if-else chain into switch in computeKnownBitsFromAssume
This is a follow-up patch to D60504 to further improve
performance issues in computeKnownBitsFromAssume.
    
The patch is NFC, but may improve compile-time performance
if the compiler isn't clever enough to do the optimization
itself.

llvm-svn: 358163
2019-04-11 13:02:19 +00:00
Fangrui Song 71cce580b9 Use llvm::lower_bound. NFC
llvm-svn: 358161
2019-04-11 10:25:41 +00:00
Erik Pilkington cb5c7bd9eb Fix a hang when lowering __builtin_dynamic_object_size
If the ObjectSizeOffsetEvaluator fails to fold the object size call, then it may
litter some unused instructions in the function. When done repeatably in
InstCombine, this results in an infinite loop. Fix this by tracking the set of
instructions that were inserted, then removing them on failure.

rdar://49172227

Differential revision: https://reviews.llvm.org/D60298

llvm-svn: 358146
2019-04-10 23:42:11 +00:00
Sander de Smalen 0e66db5d77 Improve compile-time performance in computeKnownBitsFromAssume.
This patch changes the order of pattern matching by first testing
a compare instruction's predicate, before doing the pattern
match for the whole expression tree.

Patch by Paul Walker.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D60504

llvm-svn: 358097
2019-04-10 16:24:48 +00:00
Nikita Popov 4b2323d1a3 [ValueTracking] Use computeConstantRange() for signed sub overflow determination
This is the same change as D60420 but for signed sub rather than
signed add: Range information is intersected into the known bits
result, allows to detect more no/always overflow conditions.

Differential Revision: https://reviews.llvm.org/D60469

llvm-svn: 358020
2019-04-09 17:01:49 +00:00
Nikita Popov 10edd2b79d [ValueTracking] Use computeConstantRange() in signed add overflow determination
This is D59386 for the signed add case. The computeConstantRange()
result is now intersected into the existing known bits information,
allowing to detect additional no-overflow/always-overflow conditions
(though the latter isn't used yet).

This (finally...) covers the motivating case from D59071.

Differential Revision: https://reviews.llvm.org/D60420

llvm-svn: 358014
2019-04-09 16:12:59 +00:00
Nemanja Ivanovic 820b90318f NFC: Refactor library-specific mappings of scalar maths functions to their vector counterparts
This patch factors out mappings of scalar maths functions to their vector
counterparts from TargetLibraryInfo.cpp to a separate VecFuncs.def file. Such
mappings are currently available for Accelerate framework, and SVML library.

This is in support of the follow-up: https://reviews.llvm.org/D59881

Patch by pjeeva01

Differential revision: https://reviews.llvm.org/D60211

llvm-svn: 358001
2019-04-09 13:21:11 +00:00
Nikita Popov 6e9157d588 [ValueTracking] Use ConstantRange methods; NFC
Switch part of the computeOverflowForSignedAdd() implementation to
use Range.isAllNegative() rather than KnownBits.isNegative() and
similar. They do the same thing, but using the ConstantRange methods
allows dropping the KnownBits variables more easily in D60420.

llvm-svn: 357969
2019-04-09 07:13:09 +00:00
Nikita Popov 7bd7878d22 [ValueTracking] Explicitly specify intersection type; NFC
Preparation for D60420.

llvm-svn: 357968
2019-04-09 07:13:03 +00:00
Nikita Popov 3db93ac5d6 Reapply [ValueTracking] Support min/max selects in computeConstantRange()
Add support for min/max flavor selects in computeConstantRange(),
which allows us to fold comparisons of a min/max against a constant
in InstSimplify. This fixes an infinite InstCombine loop, with the
test case taken from D59378.

Relative to the previous iteration, this contains some adjustments for
AMDGPU med3 tests: The AMDGPU target runs InstSimplify prior to codegen,
which ends up constant folding some existing med3 tests after this
change. To preserve these tests a hidden -amdgpu-scalar-ir-passes option
is added, which allows disabling scalar IR passes (that use InstSimplify)
for testing purposes.

Differential Revision: https://reviews.llvm.org/D59506

llvm-svn: 357870
2019-04-07 17:22:16 +00:00
Guozhi Wei 36fc9c3107 [LCG] Add aliased functions as LCG roots
Current LCG doesn't check aliased functions. So if an internal function has a public alias it will not be added to CG SCC, but it is still reachable from outside through the alias.
So this patch adds aliased functions to SCC.

Differential Revision: https://reviews.llvm.org/D59898

llvm-svn: 357795
2019-04-05 18:51:08 +00:00
Nick Lewycky a6ed16c98f An unreachable block may have a route to a reachable block, don't fast-path return that it can't.
A block reachable from the entry block can't have any route to a block that's not reachable from the entry block (if it did, that route would make it reachable from the entry block). That is the intended performance optimization for isPotentiallyReachable. For the case where we ask whether an unreachable from entry block has a route to a reachable from entry block, we can't conclude one way or the other. Fix a bug where we claimed there could be no such route.

The fix in rL357425 ironically reintroduced the very bug it was fixing but only when a DominatorTree is provided. This fixes the remaining bug.

llvm-svn: 357734
2019-04-04 23:09:40 +00:00
Evandro Menezes 85bd3978ae [IR] Refactor attribute methods in Function class (NFC)
Rename the functions that query the optimization kind attributes.

Differential revision: https://reviews.llvm.org/D60287

llvm-svn: 357731
2019-04-04 22:40:06 +00:00
Evandro Menezes 7c711ccf36 [IR] Create new method in `Function` class (NFC)
Create method `optForNone()` testing for the function level equivalent of
`-O0` and refactor appropriately.

Differential revision: https://reviews.llvm.org/D59852

llvm-svn: 357638
2019-04-03 21:27:03 +00:00
Alon Zakai b4f9991f38 [WebAssembly] Add Emscripten OS definition + small_printf
The Emscripten OS provides a definition of __EMSCRIPTEN__, and also that it
supports iprintf optimizations.

Also define small_printf optimizations, which is a printf with float support
but not long double (which in wasm can be useful since long doubles are 128
bit and force linking of float128 emulation code). This part is based on
sunfish's https://reviews.llvm.org/D57620 (which can't land yet since
the WASI integration isn't ready yet).

Differential Revision: https://reviews.llvm.org/D60167

llvm-svn: 357552
2019-04-03 01:08:35 +00:00
Matt Arsenault 03e7492876 InstSimplify: Fold round intrinsics from sitofp/uitofp
https://godbolt.org/z/gEMRZb

llvm-svn: 357549
2019-04-03 00:25:06 +00:00
Philip Reames d3d5d76a7b [WideableCond] Fix a nasty bug in detection of "explicit guards"
The code was failing to actually check for the presence of the call to widenable_condition.  The whole point of specifying the widenable_condition intrinsic was allowing widening transforms.  A normal branch is not widenable.  A normal branch leading to a deopt is not widenable (in general).

I added a test case via LoopPredication, but GuardWidening has an analogous bug.  Those are the only two passes actually using this utility just yet. Noticed while working on LoopPredication for non-widenable branches; POC in D60111.

llvm-svn: 357493
2019-04-02 16:51:43 +00:00
Nick Lewycky c0ebfbe3f3 Add an optional list of blocks to avoid when looking for a path in isPotentiallyReachable.
The leads to some ambiguous overloads, so update three callers.

Differential Revision: https://reviews.llvm.org/D60085

llvm-svn: 357447
2019-04-02 01:05:48 +00:00
Nick Lewycky 66d7eb9704 Not all blocks are reachable from entry. Don't assume they are.
Fixes a bug in isPotentiallyReachable, noticed by inspection.

llvm-svn: 357425
2019-04-01 20:03:16 +00:00
Alina Sbirlea c8d6e0496d [MemorySSA] Temporary fix assert when reaching 0 limit.
llvm-svn: 357327
2019-03-29 22:55:59 +00:00
Sanjoy Das 53a5952a93 Try to fix buildbot error
Error is:

llvm/lib/Analysis/ScalarEvolution.cpp:3534:10: error: chosen constructor is explicit in copy-initialization
  return {UniqueSCEVs.FindNodeOrInsertPos(ID, IP), std::move(ID), IP};
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/bin/../lib/gcc/aarch64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/tuple:479:19: note: explicit constructor declared here
        constexpr tuple(_UElements&&... __elements)
                  ^
1 error generated.

llvm-svn: 357324
2019-03-29 22:27:10 +00:00
Sanjoy Das 32fd32bc6f [SCEV] Check the cache in get{S|U}MaxExpr before doing any work
Summary:
This lets us avoid e.g. checking if A >=s B in getSMaxExpr(A, B) if we've
already established that (A smax B) is the best we can do.

Fixes PR41225.

Reviewers: asbirlea

Subscribers: mcrosier, jlebar, bixia, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60010

llvm-svn: 357320
2019-03-29 22:00:12 +00:00
Alina Sbirlea f085cc5aa7 [MemorySSA] Limit clobber walks.
Summary: This patch limits all getClobberingMemoryAccess() walks to MaxCheckLimit.

Reviewers: george.burgess.iv

Subscribers: sanjoy, jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59569

llvm-svn: 357319
2019-03-29 21:56:09 +00:00
Alina Sbirlea e589067e61 [MemorySSA] Don't optimize incomplete phis.
Summary:
MemoryPhis cannot be optimized out until they are complete.
Resolves PR41254.

Reviewers: george.burgess.iv

Subscribers: sanjoy, jlebar, Prazek, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59966

llvm-svn: 357315
2019-03-29 21:16:31 +00:00
Mircea Trofin b27d0fd0bf [llvm][NFC] Factor out logic for getting incoming & back Loop edges
Reviewers: davidxl

Reviewed By: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59967

llvm-svn: 357284
2019-03-29 17:39:17 +00:00
Florian Hahn 9b41a7320d Recommit "[DSE] Preserve basic block ordering using OrderedBasicBlock."
Updated to use DenseMap::insert instead of [] operator for insertion, to
avoid a crash caused by epoch checks.

This reverts commit 2b85de4383.

llvm-svn: 357257
2019-03-29 14:10:24 +00:00
Florian Hahn 2b85de4383 Revert Recommit "[DSE] Preserve basic block ordering using OrderedBasicBlock."
Another buildbot failure

http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/20402

clang-9: /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/llvm/include/llvm/ADT/DenseMap.h:1228: llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::value_type* llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::operator->() const [with KeyT = const llvm::Instruction*; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo<const llvm::Instruction*>; Bucket = llvm::detail::DenseMapPair<const llvm::Instruction*, unsigned int>; bool IsConst = false; llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::pointer = llvm::detail::DenseMapPair<const llvm::Instruction*, unsigned int>*; llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::value_type = llvm::detail::DenseMapPair<const llvm::Instruction*, unsigned int>]: Assertion `isHandleInSync() && "invalid iterator access!"' failed.

0.	Program arguments: /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/bin/clang-9 -cc1 -triple x86_64-unknown-linux-gnu -emit-obj -disable-free -main-file-name ArchiveCommandLine.cpp -mrelocation-model static -mthread-model posix -fmath-errno -masm-verbose -mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu skylake-avx512 -dwarf-column-info -debugger-tuning=gdb -momit-leaf-frame-pointer -coverage-notes-file /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip/Output/ArchiveCommandLine.llvm.gcno -resource-dir /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/lib/clang/9.0.0 -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/include -I ../../../include -D _GNU_SOURCE -D __STDC_LIMIT_MACROS -D NDEBUG -D BREAK_HANDLER -D UNICODE -D _UNICODE -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/C -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/myWindows -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/include_windows -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP -I . -D _FILE_OFFSET_BITS=64 -D _LARGEFILE_SOURCE -D NDEBUG -D _REENTRANT -D ENV_UNIX -D _7ZIP_LARGE_PAGES -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/x86_64-linux-gnu/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/x86_64-linux-gnu/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/backward -internal-isystem /usr/local/include -internal-isystem /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/lib/clang/9.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -O3 -std=gnu++98 -fdeprecated-macro -fdebug-compilation-dir /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip -ferror-limit 19 -fmessage-length 0 -pthread -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -vectorize-loops -vectorize-slp -o Output/ArchiveCommandLine.llvm.o -x c++ /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/7zip/UI/Common/ArchiveCommandLine.cpp -faddrsig

This reverts r357222 (git commit 64cccfcc72)

llvm-svn: 357227
2019-03-29 00:22:26 +00:00
Florian Hahn 64cccfcc72 Recommit "[DSE] Preserve basic block ordering using OrderedBasicBlock."
Recommitting after addressing a buildbot failure.

This reverts commit c87869ebea.

llvm-svn: 357222
2019-03-28 23:11:00 +00:00
Florian Hahn c87869ebea Revert [DSE] Preserve basic block ordering using OrderedBasicBlock.
This reverts r357208 (git commit c0bfd37d38)

This causes a buildbot failure:  http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/16124

FAILED: lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o
/home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/install/stage2/bin/clang++   -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/IR -I/home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/lib/IR -Iinclude -I/home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/include -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -std=c++11 -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fdiagnostics-color -ffunction-sections -fdata-sections -flto=thin -O3    -UNDEBUG  -fno-exceptions -fno-rtti -MD -MT lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o -MF lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o.d -o lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o -c /home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/lib/IR/IRBuilder.cpp
clang-9: /home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/lib/Analysis/OrderedBasicBlock.cpp:38: bool llvm::OrderedBasicBlock::comesBefore(const llvm::Instruction *, const llvm::Instruction *): Assertion `!(LastInstFound == BB->end() && NextInstPos != 0) && "Instruction supposed to be in NumberedInsts"' failed.

llvm-svn: 357211
2019-03-28 20:36:24 +00:00
Florian Hahn c0bfd37d38 [DSE] Preserve basic block ordering using OrderedBasicBlock.
By extending OrderedBB to allow removing and replacing cached
instructions, we can preserve OrderedBBs in DSE easily. This eliminates
one source of quadratic compile time in DSE.

Fixes PR38829.

Reviewers: rnk, efriedma, hfinkel

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D59789

llvm-svn: 357208
2019-03-28 20:02:33 +00:00
Florian Hahn 6c3024368c [MemDepAnalysis] Allow caller to pass in an OrderedBasicBlock.
If the caller can preserve the OBB, we can avoid recomputing the order
for each getDependency call.

Reviewers: efriedma, rnk, hfinkel

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D59788

llvm-svn: 357206
2019-03-28 19:17:31 +00:00
Chandler Carruth 923ff550b9 [NewPM] Fix a nasty bug with analysis invalidation in the new PM.
The issue here is that we actually allow CGSCC passes to mutate IR (and
therefore invalidate analyses) outside of the current SCC. At a minimum,
we need to support mutating parent and ancestor SCCs to support the
ArgumentPromotion pass which rewrites all calls to a function.

However, the analysis invalidation infrastructure is heavily based
around not needing to invalidate the same IR-unit at multiple levels.
With Loop passes for example, they don't invalidate other Loops. So we
need to customize how we handle CGSCC invalidation. Doing this without
gratuitously re-running analyses is even harder. I've avoided most of
these by using an out-of-band preserved set to accumulate the cross-SCC
invalidation, but it still isn't perfect in the case of re-visiting the
same SCC repeatedly *but* it coming off the worklist. Unclear how
important this use case really is, but I wanted to call it out.

Another wrinkle is that in order for this to successfully propagate to
function analyses, we have to make sure we have a proxy from the SCC to
the Function level. That requires pre-creating the necessary proxy.

The motivating test case now works cleanly and is added for
ArgumentPromotion.

Thanks for the review from Philip and Wei!

Differential Revision: https://reviews.llvm.org/D59869

llvm-svn: 357137
2019-03-28 00:51:36 +00:00
Hans Wennborg 5519cb2d94 Fix the build with GCC 4.8 after r356783
llvm-svn: 356875
2019-03-25 09:27:42 +00:00
Nikita Popov 977934f00f [ConstantRange] Add getFull() + getEmpty() named constructors; NFC
This adds ConstantRange::getFull(BitWidth) and
ConstantRange::getEmpty(BitWidth) named constructors as more readable
alternatives to the current ConstantRange(BitWidth, /* full */ false)
and similar. Additionally private getFull() and getEmpty() member
functions are added which return a full/empty range with the same bit
width -- these are commonly needed inside ConstantRange.cpp.

The IsFullSet argument in the ConstantRange(BitWidth, IsFullSet)
constructor is now mandatory for the few usages that still make use of it.

Differential Revision: https://reviews.llvm.org/D59716

llvm-svn: 356852
2019-03-24 09:34:40 +00:00
Nikita Popov 280a6b01c8 [ValueTracking] Avoid redundant known bits calculation in computeOverflowForSignedAdd()
We're already computing the known bits of the operands here. If the
known bits of the operands can determine the sign bit of the result,
we'll already catch this in signedAddMayOverflow(). The only other
way (and as the comment already indicates) we'll get new information
from computing known bits on the whole add, is if there's an assumption
on it.

As such, we change the code to only compute known bits from assumptions,
instead of computing full known bits on the add (which would unnecessarily
recompute the known bits of the operands as well).

Differential Revision: https://reviews.llvm.org/D59473

llvm-svn: 356785
2019-03-22 17:51:40 +00:00
Alina Sbirlea bfc779e491 [AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.

AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.

MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.

- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.

Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59315

llvm-svn: 356783
2019-03-22 17:22:19 +00:00
Bixia Zheng bdf0230cff [ConstantFolding] Fix GetConstantFoldFPValue to avoid cast overflow.
Summary:
In C++, the behavior of casting a double value that is beyond the range
of a single precision floating-point to a float value is undefined. This
change replaces such a cast with APFloat::convert to convert the value,
which is consistent with how we convert a double value to a half value.

Reviewers: sanjoy

Subscribers: lebedev.ri, sanjoy, jlebar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59500

llvm-svn: 356781
2019-03-22 16:37:37 +00:00
Craig Topper 9f0b17a248 [ScalarizeMaskedMemIntrin] Add support for scalarizing expandload and compressstore intrinsics.
This adds support for scalarizing these intrinsics as well the X86TargetTransformInfo support to avoid scalarizing them in the cases X86 can handle.

I've omitted handling special cases for constant masks for this first pass. Though CodeGenPrepare can constant fold the branch conditions and remove some of the control flow anyway.

Fixes PR40994 and is covers most of PR3666. Might want to implement constant masks to close that.

Differential Revision: https://reviews.llvm.org/D59180

llvm-svn: 356687
2019-03-21 17:38:52 +00:00
Nikita Popov 3af5b28f47 [ValueTracking] Use ConstantRange based overflow check for signed sub
This is D59450, but for signed sub. This case is not NFC, because
the overflow logic in ConstantRange is more powerful than the existing
check. This resolves the TODO in the function.

I've added two tests to show that this indeed catches more cases than
the previous logic, but the main correctness test coverage here is in
the existing ConstantRange unit tests.

Differential Revision: https://reviews.llvm.org/D59617

llvm-svn: 356685
2019-03-21 17:23:51 +00:00
Fangrui Song ebfb7852be [BasicAA] Use DenseMap::try_emplace after D59151. NFC
llvm-svn: 356651
2019-03-21 08:47:40 +00:00
Alina Sbirlea 4fdbd822fc [BasicAA] Reduce no of map seaches [NFCI].
Summary:
This is a refactoring patch.
- Reduce the number of map searches by reusing the iterator.
- Add asserts to check that the entry is in the cache, as this is something BasicAA relies on to avoid infinite recursion.

Reviewers: chandlerc, aschwaighofer

Subscribers: sanjoy, jlebar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59151

llvm-svn: 356644
2019-03-21 05:02:05 +00:00
George Burgess IV ae84e9ab49 [MSSA] Delete move ctor; remove dynamic never-moved verification
Code archaeology in D59315 revealed that MSSA should never be moved.
Rather than trying to check dynamically that this hasn't happened in the
verify() functions of Walkers, it's likely best to just delete its move
constructor.

Since all these verify() functions did is check that MSSA hasn't moved,
this allows us to remove these verify functions.

I can readd the verification checks if someone's super concerned about
us trying to `memcpy` MemorySSA or something somewhere, but I imagine we
have other problems if we're trying anything like that...

llvm-svn: 356641
2019-03-21 03:11:34 +00:00
Nikita Popov 00b5ecab5d [ValueTracking] Compute range for abs without nsw
This is a small followup to D59511. The code that was moved into
computeConstantRange() there is a bit overly conversative: If the
abs is not nsw, it does not compute any range. However, abs without
nsw still has a well-defined contiguous unsigned range from 0 to
SIGNED_MIN. This is a lot less useful than the usual 0 to SIGNED_MAX
range, but if we're already here we might as well specify it...

Differential Revision: https://reviews.llvm.org/D59563

llvm-svn: 356586
2019-03-20 18:16:02 +00:00
Nikita Popov 208381953b [ValueTracking] Use computeConstantRange() for unsigned add/sub overflow
Improve computeOverflowForUnsignedAdd/Sub in ValueTracking by
intersecting the computeConstantRange() result into the ConstantRange
created from computeKnownBits(). This allows us to detect some
additional never/always overflows conditions that can't be determined
from known bits.

This revision also adds basic handling for constants to
computeConstantRange(). Non-splat vectors will be handled in a followup.

The signed case will also be handled in a followup, as it needs some
more groundwork.

Differential Revision: https://reviews.llvm.org/D59386

llvm-svn: 356489
2019-03-19 17:53:56 +00:00
Simon Pilgrim a56f2822d0 [SelectionDAG] Handle unary SelectPatternFlavor for ABS case in SelectionDAGBuilder::visitSelect
These changes are related to PR37743 and include:

    SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node.

    Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner.

    Add promoting the integer ABS node in the LegalizeIntegerType.

    Expand-based legalization of integer result for the ABS nodes.

    Expand-based legalization of ABS vector operations.

    Add some integer abs testcases for different typesizes for Thumb arch

    Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to:
        tmp = (SRA, Hi, 31)
        Lo = (UADDO tmp, Lo)
        Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1))
        Lo = (XOR tmp, Lo)

    The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern:
        (ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))).

    Change integer abs testcases for codegen with the ABS node support for AArch64.
        Indicate that the ABS is legal for the i64 type when the NEON is supported.
        Change the integer abs testcases to show changing of codegen.

    Add combine and legalization of ABS nodes for Thumb arch.

    Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition.

For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743

Patch by: @ikulagin (Ivan Kulagin)

Differential Revision: https://reviews.llvm.org/D49837

llvm-svn: 356468
2019-03-19 16:24:55 +00:00
Simon Pilgrim 8ee477a2ab [InstSimplify] SimplifyICmpInst - icmp eq/ne %X, undef -> undef
As discussed on PR41125 and D59363, we have a mismatch between icmp eq/ne cases with an undef operand:

When the other operand is constant we fold to undef (handled in ConstantFoldCompareInstruction)
When the other operand is non-constant we fold to a bool constant based on isTrueWhenEqual (handled in SimplifyICmpInst).

Neither is really wrong, but this patch changes the logic in SimplifyICmpInst to consistently fold to undef.

The NewGVN test change is annoying (as with most heavily reduced tests) but AFAICT I have kept the purpose of the test based on rL291968.

Differential Revision: https://reviews.llvm.org/D59541

llvm-svn: 356456
2019-03-19 14:08:23 +00:00
Nikita Popov 3e9770d2dc Revert "[ValueTracking][InstSimplify] Support min/max selects in computeConstantRange()"
This reverts commit 106f0cdefb.

This change impacts the AMDGPU smed3.ll and umed3.ll codegen tests.

llvm-svn: 356424
2019-03-18 22:26:27 +00:00
Nikita Popov 106f0cdefb [ValueTracking][InstSimplify] Support min/max selects in computeConstantRange()
Add support for min/max flavor selects in computeConstantRange(),
which allows us to fold comparisons of a min/max against a constant
in InstSimplify. This was suggested by spatel as an alternative
approach to D59378. I've also added the infinite looping test from
that revision here.

Differential Revision: https://reviews.llvm.org/D59506

llvm-svn: 356415
2019-03-18 21:35:19 +00:00
Nikita Popov f89343bc47 [ValueTracking][InstSimplify] Move abs handling into computeConstantRange(); NFC
This is preparation for D59506. The InstructionSimplify abs handling
is moved into computeConstantRange(), which is the general place for
such calculations. This is NFC and doesn't affect the existing tests
in test/Transforms/InstSimplify/icmp-abs-nabs.ll.

Differential Revision: https://reviews.llvm.org/D59511

llvm-svn: 356409
2019-03-18 21:20:03 +00:00
Warren Ristow ad7d0ded2e [SCEV] Guard movement of insertion point for loop-invariants
This reinstates r347934, along with a tweak to address a problem with
PHI node ordering that that commit created (or exposed). (That commit
was reverted at r348426, due to the PHI node issue.)

Original commit message:

r320789 suppressed moving the insertion point of SCEV expressions with
dev/rem operations to the loop header in non-loop-invariant situations.
This, and similar, hoisting is also unsafe in the loop-invariant case,
since there may be a guard against a zero denominator. This is an
adjustment to the fix of r320789 to suppress the movement even in the
loop-invariant case.

This fixes PR30806.

Differential Revision: https://reviews.llvm.org/D57428

llvm-svn: 356392
2019-03-18 18:52:35 +00:00
Nikita Popov 322e2dbee1 [ValueTracking] Use ConstantRange overflow check for signed add; NFC
This is the same change as rL356290, but for signed add. It replaces
the existing ripple logic with the overflow logic in ConstantRange.

This is NFC in that it should return NeverOverflow in exactly the
same cases as the previous implementation. However, it does make
computeOverflowForSignedAdd() more powerful by now also determining
AlwaysOverflows conditions. As none of its consumers handle this yet,
this has no impact on optimization. Making use of AlwaysOverflows
in with.overflow folding will be handled as a followup.

Differential Revision: https://reviews.llvm.org/D59450

llvm-svn: 356345
2019-03-17 21:25:26 +00:00
Nikita Popov ef2d979943 [ConstantRange] Add fromKnownBits() method
Following the suggestion in D59450, I'm moving the code for constructing
a ConstantRange from KnownBits out of ValueTracking, which also allows us
to test this code independently.

I'm adding this method to ConstantRange rather than KnownBits (which
would have been a bit nicer API wise) to avoid creating a dependency
from Support to IR, where ConstantRange lives.

Differential Revision: https://reviews.llvm.org/D59475

llvm-svn: 356339
2019-03-17 20:24:02 +00:00
Nikita Popov 614b1bea97 [ValueTracking] Use ConstantRange overflow checks for unsigned add/sub; NFC
Use the methods introduced in rL356276 to implement the
computeOverflowForUnsigned(Add|Sub) functions in ValueTracking, by
converting the KnownBits into a ConstantRange.

This is NFC: The existing KnownBits based implementation uses the same
logic as the the ConstantRange based one. This is not the case for the
signed equivalents, so I'm only changing unsigned here.

This is in preparation for D59386, which will also intersect the
computeConstantRange() result into the range determined from KnownBits.

llvm-svn: 356290
2019-03-15 18:37:45 +00:00
Teresa Johnson 70ec64cb72 [ThinLTO] Restructure AliasSummary to contain ValueInfo of Aliasee
Summary:
The AliasSummary previously contained the AliaseeGUID, which was only
populated when reading the summary from bitcode. This patch changes it
to instead hold the ValueInfo of the aliasee, and always populates it.
This enables more efficient access to the ValueInfo (specifically in the
recent patch r352438 which needed to perform an index hash table lookup
using the aliasee GUID).

As noted in the comments in AliasSummary, we no longer technically need
to keep a pointer to the corresponding aliasee summary, since it could
be obtained by walking the list of summaries on the ValueInfo looking
for the summary in the same module. However, I am concerned that this
would be inefficient when walking through the index during the thin
link for various analyses. That can be reevaluated in the future.

By always populating this new field, we can remove the guard and special
handling for a 0 aliasee GUID when dumping the dot graph of the summary.

An additional improvement in this patch is when reading the summaries
from LLVM assembly we now set the AliaseeSummary field to the aliasee
summary in that same module, which makes it consistent with the behavior
when reading the summary from bitcode.

Reviewers: pcc, mehdi_amini

Subscribers: inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D57470

llvm-svn: 356268
2019-03-15 15:11:38 +00:00
Sanjay Patel de1d5d3675 [InstCombine] canonicalize funnel shift constant shift amount to be modulo bitwidth
The shift argument is defined to be modulo the bitwidth, so if that argument
is a constant, we can always reduce the constant to its minimal form to allow
better CSE and other follow-on transforms.

We need to be careful to ignore constant expressions here, or we will likely
infinite loop. I'm adding a general vector constant query for that case.

Differential Revision: https://reviews.llvm.org/D59374

llvm-svn: 356192
2019-03-14 19:22:08 +00:00
Alina Sbirlea 2d7458a351 [MemorySSA] Remove redundant walker assignment [NFC].
Subscribers: llvm-commits
llvm-svn: 356189
2019-03-14 18:45:17 +00:00
Teresa Johnson 4ab0a9f0a4 [SCEV] Use depth limit for trunc analysis
Summary:
This fixes an extremely long compile time caused by recursive analysis
of truncs, which were not previously subject to any depth limits unlike
some of the other ops. I decided to use the same control used for
sext/zext, since the routines analyzing these are sometimes mutually
recursive with the trunc analysis.

Reviewers: mkazantsev, sanjoy

Subscribers: sanjoy, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58994

llvm-svn: 355949
2019-03-12 18:28:05 +00:00
Sjoerd Meijer 31ff647c1d [TTI] Enable analysis of clib functions in getIntrinsicCosts. NFCI.
This is addressing the issue that we're not modeling the cost of clib functions
in TTI::getIntrinsicCosts and thus we're basically addressing this fixme:
    
// FIXME: This is wrong for libc intrinsics.

To enable analysis of clib functions, we not only need an intrinsic ID and
formal arguments, but also the actual user of that function so that we can e.g.
look at alignment and values of arguments. So, this is the initial plumbing to
pass the user of an intrinsinsic on to getCallCosts, which queries
getIntrinsicCosts.

Differential Revision: https://reviews.llvm.org/D59014

llvm-svn: 355901
2019-03-12 09:48:02 +00:00
Sanjoy Das 3f5ce18658 Reland "Relax constraints for reduction vectorization"
Change from original commit: move test (that uses an X86 triple) into the X86
subdirectory.

Original description:
Gating vectorizing reductions on *all* fastmath flags seems unnecessary;
`reassoc` should be sufficient.

Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal

Reviewed By: sdesmalen

Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57728

llvm-svn: 355889
2019-03-12 01:31:44 +00:00
Sanjoy Das 2136a5bc49 Revert "Relax constraints for reduction vectorization"
This reverts commit r355868.  Breaks hexagon.

llvm-svn: 355873
2019-03-11 22:37:31 +00:00
Sanjoy Das 93f8cc186a Relax constraints for reduction vectorization
Summary:
Gating vectorizing reductions on *all* fastmath flags seems unnecessary;
`reassoc` should be sufficient.

Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal

Reviewed By: sdesmalen

Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57728

llvm-svn: 355868
2019-03-11 21:36:41 +00:00
Nikita Popov 490975979b [ValueTracking] Move constant range computation into ValueTracking; NFC
InstructionSimplify currently has some code to determine the constant
range of integer instructions for some simple cases. It is used to
simplify icmps.

This change moves the relevant code into ValueTracking as
llvm::computeConstantRange(), so it can also be reused for other
purposes.

In particular this is with the optimization of overflow checks in
mind (ref D59071), where constant ranges cover some cases that
known bits don't.

llvm-svn: 355781
2019-03-09 21:17:42 +00:00
Michael Kruse 65c5821e3f [RegionPass] Fix forgotten "!".
Commit r355068 "Fix IR/Analysis layering issue with OptBisect" uses the
template

   return Gate.isEnabled() && !Gate.shouldRunPass(this, getDescription(...));

for all pass kinds. For the RegionPass, it left out the not operator,
causing region passes to be skipped as soon as a pass gate is used.

llvm-svn: 355733
2019-03-08 21:03:06 +00:00
George Burgess IV 4ea679f1f4 [CFLAnders] Fix typo in comment; NFC
Patch by Enna1!

Differential Revision: https://reviews.llvm.org/D58756

llvm-svn: 355715
2019-03-08 19:28:55 +00:00
Clement Courbet 8e16d73346 [SelectionDAG] Allow the user to specify a memeq function.
Summary:
Right now, when we encounter a string equality check,
e.g. `if (memcmp(a, b, s) == 0)`, we try to expand to a comparison if `s` is a
small compile-time constant, and fall back on calling `memcmp()` else.

This is sub-optimal because memcmp has to compute much more than
equality.

This patch replaces `memcmp(a, b, s) == 0` by `bcmp(a, b, s) == 0` on platforms
that support `bcmp`.

`bcmp` can be made much more efficient than `memcmp` because equality
compare is trivially parallel while lexicographic ordering has a chain
dependency.

Subscribers: fedor.sergeev, jyknight, ckennelly, gchatelet, llvm-commits

Differential Revision: https://reviews.llvm.org/D56593

llvm-svn: 355672
2019-03-08 09:07:45 +00:00
Fangrui Song 9ade843ccb [IDF] Delete a redundant J-edge test
In the DJ-graph based computation of iterated dominance frontiers,
SuccNode->getIDom() == Node is one of the tests to check if (Node,Succ)
is a J-edge. If it is true, since Node is dominated by Root,

  SuccLevel = level(Node)+1 > RootLevel

which means the next test SuccLevel > RootLevel will also be true. test
the check is redundant and can be deleted as it also involves one
indirection and provides no speed-up.

llvm-svn: 355589
2019-03-07 11:42:59 +00:00
David Green 4511f3fa86 [SCEV] Ensure that isHighCostExpansion takes into account what is being divided
A SCEV is not low-cost just because you can divide it by a power of 2. We need to also
check what we are dividing to make sure it too is not a high-code expansion. This helps
to not expand the exit value of certain loops, helping not to bloat the code.

The change in no-iv-rewrite.ll is reverting back to what it was testing before rL194116,
and looks a lot like the other tests in replace-loop-exit-folds.ll.

Differential Revision: https://reviews.llvm.org/D58435

llvm-svn: 355393
2019-03-05 12:12:18 +00:00
Sanjoy Das 719e78631d PHI nodes are not `FPMathOperator` s
Reviewers: chandlerc, arsenm

Reviewed By: arsenm

Subscribers: wdng, arsenm, mcrosier, jlebar, bixia, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58887

llvm-svn: 355362
2019-03-05 01:15:08 +00:00
Sanjay Patel 2a70703770 [ValueTracking] do not try to peek through bitcasts in computeKnownBitsFromAssume()
There are no tests for this case, and I'm not sure how it could ever work,
so I'm just removing this option from the matcher. This should fix PR40940:
https://bugs.llvm.org/show_bug.cgi?id=40940

llvm-svn: 355292
2019-03-03 18:59:33 +00:00
Fangrui Song 5fa53d1593 [DemandedBits] Remove some redundancy in the work list
InputIsKnownDead check is shared by all operands. Compute it once.

For non-integer instructions, use Visited.insert(I).second to replace a
find() and an insert().

llvm-svn: 355290
2019-03-03 14:50:01 +00:00
Fangrui Song 981f216d1d [DemandedBits] Optimize a find()+insert pattern with try_emplace and APInt::operator|=
llvm-svn: 355284
2019-03-03 11:12:57 +00:00
Florian Hahn 98f11a7d75 [SCEV] Handle case where MaxBECount is less precise than ExactBECount for OR.
In some cases, MaxBECount can be less precise than ExactBECount for AND
and OR (the AND case was PR26207). In the OR test case, both ExactBECounts are
undef, but MaxBECount are different, so we hit the assertion below. This
patch uses the same solution the AND case already uses.

Assertion failed:
   ((isa<SCEVCouldNotCompute>(ExactNotTaken) || !isa<SCEVCouldNotCompute>(MaxNotTaken))
     && "Exact is not allowed to be less precise than Max"), function ExitLimit

This patch also consolidates test cases for both AND and OR in a single
test case.

Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13245

Reviewers: sanjoy, efriedma, mkazantsev

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D58853

llvm-svn: 355259
2019-03-02 02:31:44 +00:00
Florian Hahn 3c7e92b5d6 [SCEV] Remove undef check for SCEVConstant (NFC)
The value stored in SCEVConstant is of type ConstantInt*, which can
never be UndefValue. So we should never hit that code.

Reviewers: mkazantsev, sanjoy

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D58851

llvm-svn: 355257
2019-03-02 01:57:28 +00:00
Nikita Popov ed3ca9272f [ValueTracking] Known bits support for unsigned saturating add/sub
We have two sources of known bits:

1. For adds leading ones of either operand are preserved. For sub
leading zeros of LHS and leading ones of RHS become leading zeros in
the result.

2. The saturating math is a select between add/sub and an all-ones/
zero value. As such we can carry out the add/sub known bits
calculation, and only preseve the known one/zero bits respectively.

Differential Revision: https://reviews.llvm.org/D58329

llvm-svn: 355223
2019-03-01 20:07:04 +00:00
Jonas Hahnfeld e071cd86df Hide two unused debugging methods, NFCI.
GCC correctly moans that PlainCFGBuilder::isExternalDef(llvm::Value*) and
StackSafetyDataFlowAnalysis::verifyFixedPoint() are defined but not used
in Release builds. Hide them behind 'ifndef NDEBUG'.

llvm-svn: 355205
2019-03-01 17:15:21 +00:00
Rong Xu a6ff69f6dd [PGO] Context sensitive PGO (part 2)
Part 2 of CSPGO changes (mostly related to ProfileSummary).
Note that I use a default parameter in setProfileSummary() and getSummary().
This is to break the dependency in clang. I will make the parameter explicit
after changing clang in a separated patch.

Differential Revision: https://reviews.llvm.org/D54175

llvm-svn: 355131
2019-02-28 19:55:07 +00:00
Nikita Popov af2b0bef43 [ValueTracking] More accurate unsigned sub overflow detection
Second part of D58593.

Compute precise overflow conditions based on all known bits, rather
than just the sign bits. Unsigned a - b overflows iff a < b, and we
can determine whether this always/never happens based on the minimal
and maximal values achievable for a and b subject to the known bits
constraint.

llvm-svn: 355109
2019-02-28 18:04:20 +00:00
Bjorn Pettersson d30f308a9f Add support for computing "zext of value" in KnownBits. NFCI
Summary:
The description of KnownBits::zext() and
KnownBits::zextOrTrunc() has confusingly been telling
that the operation is equivalent to zero extending the
value we're tracking. That has not been true, instead
the user has been forced to explicitly set the extended
bits as known zero afterwards.

This patch adds a second argument to KnownBits::zext()
and KnownBits::zextOrTrunc() to control if the extended
bits should be considered as known zero or as unknown.

Reviewers: craig.topper, RKSimon

Reviewed By: RKSimon

Subscribers: javed.absar, hiraditya, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58650

llvm-svn: 355099
2019-02-28 15:45:29 +00:00
Nikita Popov 6c57395fb4 [ValueTracking] More accurate unsigned add overflow detection
Part of D58593.

Compute precise overflow conditions based on all known bits, rather
than just the sign bits. Unsigned a + b overflows iff a > ~b, and we
can determine whether this always/never happens based on the minimal
and maximal values achievable for a and ~b subject to the known bits
constraint.

llvm-svn: 355072
2019-02-28 08:11:20 +00:00
Richard Trieu b37a70f40e Fix IR/Analysis layering issue with OptBisect
OptBisect is in IR due to LLVMContext using it.  However, it uses IR units from
Analysis as well.  This change moves getDescription functions from OptBisect
to their respective IR units.  Generating names for IR units will now be up
to the callers, keeping the Analysis IR units in Analysis.  To prevent
unnecessary string generation, isEnabled function is added so that callers know
when the description needs to be generated.

Differential Revision: https://reviews.llvm.org/D58406

llvm-svn: 355068
2019-02-28 04:00:55 +00:00
Alina Sbirlea fcfa7c5f92 [MemorySSA] Make insertDef insert corresponding phi nodes.
Summary:
The original assumption for the insertDef method was that it would not
materialize Defs out of no-where, hence it will not insert phis needed
after inserting a Def.

However, when cloning an instruction (use case used in LICM), we do
materialize Defs "out of no-where". If the block receiving a Def has at
least one other Def, then no processing is needed. If the block just
received its first Def, we must check where Phi placement is needed.
The only new usage of insertDef is in LICM, hence the trigger for the bug.

But the original goal of the method also fails to apply for the move()
method. If we move a Def from the entry point of a diamond to either the
left or right blocks, then the merge block must add a phi.
While this usecase does not currently occur, or may be viewed as an
incorrect transformation, MSSA must behave corectly given the scenario.

Resolves PR40749 and PR40754.

Reviewers: george.burgess.iv

Subscribers: sanjoy, jlebar, Prazek, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58652

llvm-svn: 355040
2019-02-27 22:20:22 +00:00
Sanjay Patel 9dada83d6c [InstSimplify] remove zero-shift-guard fold for general funnel shift
As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2019-February/130491.html

We can't remove the compare+select in the general case because
we are treating funnel shift like a standard instruction (as
opposed to a special instruction like select/phi).

That means that if one of the operands of the funnel shift is
poison, the result is poison regardless of whether we know that
the operand is actually unused based on the instruction's
particular semantics.

The motivating case for this transform is the more specific
rotate op (rather than funnel shift), and we are preserving the
fold for that case because there is no chance of introducing
extra poison when there is no anonymous extra operand to the
funnel shift.

llvm-svn: 354905
2019-02-26 18:26:56 +00:00
Simon Pilgrim a066f1f9e6 [Vectorizer] Add vectorization support for fixed smul/umul intrinsics
This requires a couple of tweaks to existing vectorization functions as they were assuming that only the second call argument (ctlz/cttz/powi) could ever be the 'always scalar' argument, but for smul.fix + umul.fix its the third argument.

Differential Revision: https://reviews.llvm.org/D58616

llvm-svn: 354790
2019-02-25 15:42:02 +00:00
Jordan Rupprecht 6387fa2715 [NFC] Fix typos: preceeding -> preceding
llvm-svn: 354715
2019-02-23 01:28:32 +00:00
Chijun Sima 70e97163e0 [DTU] Refine the interface and logic of applyUpdates
Summary:
This patch separates two semantics of `applyUpdates`:
1. User provides an accurate CFG diff and the dominator tree is updated according to the difference of `the number of edge insertions` and `the number of edge deletions` to infer the status of an edge before and after the update.
2. User provides a sequence of hints. Updates mentioned in this sequence might never happened and even duplicated.

Logic changes:

Previously, removing invalid updates is considered a side-effect of deduplication and is not guaranteed to be reliable. To handle the second semantic, `applyUpdates` does validity checking before deduplication, which can cause updates that have already been applied to be submitted again. Then, different calls to `applyUpdates` might cause unintended consequences, for example,
```
DTU(Lazy) and Edge A->B exists.
1. DTU.applyUpdates({{Delete, A, B}, {Insert, A, B}}) // User expects these 2 updates result in a no-op, but {Insert, A, B} is queued
2. Remove A->B
3. DTU.applyUpdates({{Delete, A, B}}) // DTU cancels this update with {Insert, A, B} mentioned above together (Unintended)
```
But by restricting the precondition that updates of an edge need to be strictly ordered as how CFG changes were made, we can infer the initial status of this edge to resolve this issue.

Interface changes:
The second semantic of `applyUpdates`  is separated to `applyUpdatesPermissive`.
These changes enable DTU(Lazy) to use the first semantic if needed, which is quite useful in `transforms/utils`.

Reviewers: kuhar, brzycki, dmgreen, grosser

Reviewed By: brzycki

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58170

llvm-svn: 354669
2019-02-22 13:48:38 +00:00
Sanjay Patel 68171e3cd6 [InstSimplify] use any-zero matcher for fcmp folds
The m_APFloat matcher does not work with anything but strict
splat vector constants, so we could miss these folds and then
trigger an assertion in instcombine:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13201

The previous attempt at this in rL354406 had a logic bug that
actually triggered a regression test failure, but I failed to
notice it the first time.

llvm-svn: 354467
2019-02-20 14:34:00 +00:00
Sanjay Patel 49f97395ab Revert "[InstSimplify] use any-zero matcher for fcmp folds"
This reverts commit 058bb83513.
Forgot to update another test affected by this change.

llvm-svn: 354408
2019-02-20 00:20:38 +00:00
Sanjay Patel 058bb83513 [InstSimplify] use any-zero matcher for fcmp folds
The m_APFloat matcher does not work with anything but strict
splat vector constants, so we could miss these folds and then
trigger an assertion in instcombine:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13201

llvm-svn: 354406
2019-02-20 00:09:50 +00:00
Sam Parker 0b53e8454b [BPI] Look through bitcasts in calcZeroHeuristic
Constant hoisting may have hidden a constant behind a bitcast so that
it isn't folded into its users. However, this prevents BPI from
calculating some of its heuristics that are based upon constant
values. So, I've added a simple helper function to look through these
casts.

Differential Revision: https://reviews.llvm.org/D58166

llvm-svn: 354119
2019-02-15 11:50:21 +00:00
Nick Desaulniers 6a84cd3b8e Revert "[INLINER] allow inlining of address taken blocks"
This reverts commit 19e95fe611.

llvm-svn: 354082
2019-02-14 23:42:21 +00:00
Nick Desaulniers 19e95fe611 [INLINER] allow inlining of address taken blocks
as long as their uses does not contain calls to functions that capture
the argument (potentially allowing the blockaddress to "escape" the
lifetime of the caller).

TODO:
- add more tests
- fix crash in llvm::updateCGAndAnalysisManagerForFunctionPass when
  invoking Transforms/Inline/blockaddress.ll

llvm-svn: 354079
2019-02-14 23:35:53 +00:00
Max Kazantsev 24383cd7bb Make widenable condition transparent for MemoryWriteTracking
Side effects of widenable condition intrinsic are modelled via
InaccessibleMemOnly, and there is no way to say that it isn't
really writing any memory. This patch teaches MemoryWriteTracking
ignore this intrinsic.

llvm-svn: 354021
2019-02-14 11:10:29 +00:00
Max Kazantsev b3168a400f Teach isGuaranteedToTransferExecutionToSuccessor about widenable conditions
Widenable condition intrinsic is guaranteed to return value, notify
the isGuaranteedToTransferExecutionToSuccessor function about it.

llvm-svn: 354020
2019-02-14 11:10:21 +00:00
Max Kazantsev 4a1c02987e [NFC] Simplify code & reduce nest slightly
llvm-svn: 353832
2019-02-12 11:31:46 +00:00
Evandro Menezes f4a369596f [TargetLibraryInfo] Update run time support for Windows
It seems that, since VC19, the `float` C99 math functions are supported for all
targets, unlike the C89 ones.

According to the discussion at https://reviews.llvm.org/D57625.

llvm-svn: 353758
2019-02-11 22:12:01 +00:00
Alina Sbirlea d77edc00a8 [MemorySSA] Remove verifyClobberSanity.
Summary:
This verification may fail after certain transformations due to
BasicAA's fragility. Added a small explanation and a testcase that
triggers the assert in checkClobberSanity (before its removal).
Addresses PR40509.

Reviewers: george.burgess.iv

Subscribers: sanjoy, jlebar, llvm-commits, Prazek

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57973

llvm-svn: 353739
2019-02-11 19:51:21 +00:00
Michael Kruse 77a614a6e1 Refactor setAlreadyUnrolled() and setAlreadyVectorized().
Loop::setAlreadyUnrolled() and
LoopVectorizeHints::setLoopAlreadyUnrolled() both add loop metadata that
stops the same loop from being transformed multiple times. This patch
merges both implementations.

In doing so we fix 3 potential issues:

 * setLoopAlreadyUnrolled() kept the llvm.loop.vectorize/interleave.*
   metadata even though it will not be used anymore. This already caused
   problems such as http://llvm.org/PR40546. Change the behavior to the
   one of setAlreadyUnrolled which deletes this loop metadata.

 * setAlreadyUnrolled() used to create a new LoopID by calling
   MDNode::get with nullptr as the first operand, then replacing it by
   the returned references using replaceOperandWith. It is possible
   that MDNode::get would instead return an existing node (due to
   de-duplication) that then gets modified. To avoid, use a fresh
   TempMDNode that does not get uniqued with anything else before
   replacing it with replaceOperandWith.

 * LoopVectorizeHints::matchesHintMetadataName() only compares the
   suffix of the attribute to set the new value for. That is, when
   called with "enable", would erase attributes such as
   "llvm.loop.unroll.enable", "llvm.loop.vectorize.enable" and
   "llvm.loop.distribute.enable" instead of the one to replace.
   Fortunately, function was only called with "isvectorized".

Differential Revision: https://reviews.llvm.org/D57566

llvm-svn: 353738
2019-02-11 19:45:44 +00:00
Evandro Menezes 4b86c474ff [TargetLibraryInfo] Update run time support for Windows
It seems that the run time for Windows has changed and supports more math
functions than it used to, especially on AArch64, ARM, and AMD64.

Fixes PR40541.

Differential revision: https://reviews.llvm.org/D57625

llvm-svn: 353733
2019-02-11 19:02:28 +00:00