Commit Graph

9281 Commits

Author SHA1 Message Date
Tyker 821a0f23d8 [AssumeBundles] Prevent generation of some redundant assumes
Summary: with this patch the assume salvageKnowledge will not generate assume if all knowledge is already available in an assume with valid context. assume bulider can also in some cases update an existing assume with better information.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78014
2020-05-10 19:23:59 +02:00
Florian Hahn 8528186b9b [LAA] Move runtime-check generation to Transforms/Utils/loopUtils (NFC)
Currently LAA's uses of ScalarEvolutionExpander blocks moving the
expander from Analysis to Transforms. Conceptually the expander does not
fit into Analysis (it is only used for code generation) and
runtime-check generation also seems to be better suited as a
transformation utility.

Reviewers: Ayal, anemet

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D78460
2020-05-10 17:39:26 +01:00
Simon Pilgrim d5a2870a6e CodeMetrics.cpp - remove unused includes. NFC. 2020-05-10 16:59:55 +01:00
Florian Hahn 96c63f544f Recommit "[LAA] Remove one addRuntimeChecks function (NFC)."
The failing assertion has been fixed and the problematic test case has
been added.

This reverts the revert commit fc44617f28.
2020-05-10 15:19:57 +01:00
Florian Hahn fc44617f28 Revert "[LAA] Remove one addRuntimeChecks function (NFC)."
This reverts commit c28114c8ff.

This causes some bots to fail:

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/30596/steps/build%20android%2Faarch64/logs/stdio
2020-05-10 13:28:00 +01:00
Florian Hahn c28114c8ff [LAA] Remove one addRuntimeChecks function (NFC).
In order to reduce the API surface area (preparation for D78460), remove
a addRuntimeChecks() function and do the additional check in the single
caller.

Reviewers: Ayal, anemet

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D79679
2020-05-10 12:48:55 +01:00
Florian Hahn 57fb56b30e [LAA] Remove unneeded PtrRtChecking argument (NFC).
The argument is not required and simplifies D78460 a bit.
2020-05-09 22:26:54 +01:00
Wei Mi aa2ddfc73d [SampleFDO] For functions without profiles, provide an option to put
them in a special text section.

For sampleFDO, because the optimized build uses profile generated from
previous release, previously we couldn't tell a function without profile
was truely cold or just newly created so we had to treat them conservatively
and put them in .text section instead of .text.unlikely. The result was when
we persuing the best performance by locking .text.hot and .text in memory,
we wasted a lot of memory to keep cold functions inside.

In https://reviews.llvm.org/D66374, we introduced profile symbol list to
discriminate functions being cold versus functions being newly added.
This mechanism works quite well for regular use cases in AutoFDO. However,
in some case, we can only have a partial profile when optimizing a target.
The partial profile may be an aggregated profile collected from many targets.
The profile symbol list method used for regular sampleFDO profile is not
applicable to partial profile use case because it may be too large and
introduce many false positives.

To solve the problem for partial profile use case, we provide an option called
--profile-unknown-in-special-section. For functions without profile, we will
still treat them conservatively in compiler optimizations -- for example,
treat them as warm instead of cold in inliner. When we use profile info to
add section prefix for functions, we will discriminate functions known to be
not cold versus functions without profile (being unknown), and we will put
functions being unknown in a special text section called .text.unknown.
Runtime system will have the flexibility to decide where to put the special
section in order to achieve a balance between performance and memory saving.

Differential Revision: https://reviews.llvm.org/D62540
2020-05-08 11:18:09 -07:00
Nikita Popov 5a2265647e Reapply [InstSimplify] Remove known bits constant folding
No changes relative to last time, but after a mitigation for
an AMDGPU regression landed.

---

If SimplifyInstruction() does not succeed in simplifying the
instruction, it will compute the known bits of the instruction
in the hope that all bits are known and the instruction can be
folded to a constant. I have removed a similar optimization
from InstCombine in D75801, and would like to drop this one as well.

On average, we spend ~1% of total compile-time performing this
known bits calculation. However, if we introduce some additional
statistics for known bits computations and how many of them succeed
in simplifying the instruction we get (on test-suite):

    instsimplify.NumKnownBits: 216
    instsimplify.NumKnownBitsComputed: 13828375
    valuetracking.NumKnownBitsComputed: 45860806

Out of ~14M known bits calculations (accounting for approximately
one third of all known bits calculations), only 0.0015% succeed in
producing a constant. Those cases where we do succeed to compute
all known bits will get folded by other passes like InstCombine
later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test
show a hash difference after this change. On lencod we see an
improvement (a loop phi is optimized away), on the GCC torture
test a regression (a function return value is determined only
after IPSCCP, preventing propagation from a noinline function.)

There are various regressions in InstSimplify tests. However, all
of these cases are already handled by InstCombine, and corresponding
tests have already been added there.

Differential Revision: https://reviews.llvm.org/D79294
2020-05-08 10:24:53 +02:00
Huihui Zhang 1ec0cc0f02 [InstCombine][SVE] Fix visitExtractElementInst for scalable type.
Summary:
This patch fix the following issues with visitExtractElementInst:

      1. Restrict VectorUtils::findScalarElement to fixed-length vector.
         For scalable type, the number of elements in shuffle mask is
         unknown at compile-time.
      2. Fix out-of-range calculation for fixed-length vector.
      3. Skip scalable type when analysis rely on fixed number of elements.
      4. Add unit tests to check functionality of extractelement for scalable type.

Reviewers: sdesmalen, efriedma, spatel, nikic

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78267
2020-05-07 13:03:52 -07:00
Hiroshi Yamauchi 1b4e3def03 [BFI][CGP] Add limited support for detecting missed BFI updates and fix one in CodeGenPrepare.
Summary:
This helps detect some missed BFI updates during CodeGenPrepare.

This is debug build only and disabled behind a flag.

Fix a missed update in CodeGenPrepare::dupRetToEnableTailCallOpts().

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77417
2020-05-07 11:58:00 -07:00
Christopher Tetreault 782231ac79 [SVE] Fix invalid uses of VectorType::getNumElements() in ValueTracking
Summary:
Any function in this module that make use of DemandedElts laregely does
not work with scalable vectors. DemandedElts is used to define which
elements of the vector to look at. At best, for scalable vectors, we can
express the first N elements of the vector. However, in practice, most
code that uses these functions expect to be able to talk about the
entire vector. In principle, this module should be able to be extended
to work with scalable vectors. However, before we can do that, we should
ensure that it does not cause code with scalable vectors to miscompile.
All functions that use a DemandedElts will bail out if the vector is
scalable. Usages of getNumElements() are updated to go through
FixedVectorType pointers.

Reviewers: rengolin, efriedma, sdesmalen, c-rhodes, spatel

Reviewed By: efriedma

Subscribers: david-arm, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79053
2020-05-06 10:06:06 -07:00
Sanjay Patel a954b8a363 [ValueTracking] fix CannotBeNegativeZero() to disregard 'nsz' FMF
The 'nsz' flag is different than 'nnan' or 'ninf' in that it does not create poison.
Make that explicit in the LangRef and fix ValueTracking analysis that misinterpreted
the definition.

This manifests as bugs in InstSimplify shown in the test diffs and as discussed in
PR45778:
https://bugs.llvm.org/show_bug.cgi?id=45778

Differential Revision: https://reviews.llvm.org/D79422
2020-05-05 16:04:59 -04:00
Simon Pilgrim 4e3c005554 [TTI] getScalarizationOverhead - use explicit VectorType operand
getScalarizationOverhead is only ever called with vectors (and we already had a load of cast<VectorType> calls immediately inside the functions).

Followup to D78357

Reviewed By: @samparker

Differential Revision: https://reviews.llvm.org/D79341
2020-05-05 16:59:23 +01:00
Sam Parker 40574fefe9 [NFC][CostModel] Add TargetCostKind to relevant APIs
Make the kind of cost explicit throughout the cost model which,
apart from making the cost clear, will allow the generic parts to
calculate better costs. It will also allow some backends to
approximate and correlate the different costs if they wish. Another
benefit is that it will also help simplify the cost model around
immediate and intrinsic costs, where we currently have multiple APIs.

RFC thread:
http://lists.llvm.org/pipermail/llvm-dev/2020-April/141263.html

Differential Revision: https://reviews.llvm.org/D79002
2020-05-05 10:35:54 +01:00
Nikita Popov 46ee652c70 Revert "[InstSimplify] Remove known bits constant folding"
This reverts commit 08556afc54.

This breaks some AMDGPU tests.
2020-05-03 20:45:10 +02:00
Nikita Popov 08556afc54 [InstSimplify] Remove known bits constant folding
If SimplifyInstruction() does not succeed in simplifying the
instruction, it will compute the known bits of the instruction
in the hope that all bits are known and the instruction can be
folded to a constant. I have removed a similar optimization
from InstCombine in D75801, and would like to drop this one as well.

On average, we spend ~1% of total compile-time performing this
known bits calculation. However, if we introduce some additional
statistics for known bits computations and how many of them succeed
in simplifying the instruction we get (on test-suite):

    instsimplify.NumKnownBits: 216
    instsimplify.NumKnownBitsComputed: 13828375
    valuetracking.NumKnownBitsComputed: 45860806

Out of ~14M known bits calculations (accounting for approximately
one third of all known bits calculations), only 0.0015% succeed in
producing a constant. Those cases where we do succeed to compute
all known bits will get folded by other passes like InstCombine
later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test
show a hash difference after this change. On lencod we see an
improvement (a loop phi is optimized away), on the GCC torture
test a regression (a function return value is determined only
after IPSCCP, preventing propagation from a noinline function.)

There are various regressions in InstSimplify tests. However, all
of these cases are already handled by InstCombine, and corresponding
tests have already been added there.

Differential Revision: https://reviews.llvm.org/D79294
2020-05-03 20:26:58 +02:00
Nikita Popov 8148b11647 [ValueTracking] Short-circuit GEP known bits calculation (NFC)
Don't compute known bits of all GEP operands, if we already know
that we don't know anything.
2020-05-02 12:29:26 +02:00
Sanjay Patel 57f0eed98d [InstSimplify] allow insertelement-with-undef fold if poison-safe
The more general fold was not poison-safe, so it was removed:
rG5486e00
...but it is ok to have this transform if analysis can determine
the vector contains no poison. The test shows a simple example
of that: constant integer elements are not poison.
2020-05-01 10:34:29 -04:00
Sanjay Patel 5486e00dc3 [InstSimplify] remove poison-unsafe insertelement of undef value
PR45481:
https://bugs.llvm.org/show_bug.cgi?id=45481

SDAG has an identical transform to this, so there's little
chance of any real-world impact. OTOH, that means we are
effectively sweeping the bug out of sight because poison
exists in codegen too.
2020-05-01 09:22:05 -04:00
Arthur Eubanks a90948fd6e [NFC] Rename *ByValOrInalloca* to *PassPointeeByValue*
Summary: In preparation for preallocated.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79152
2020-04-30 09:42:13 -07:00
Kirill Naumov 0383253cdf [InlineCost] Addressing a very strict assert check in CostAnnotationWriter::emitInstructionAnnot
The assert checks that every instruction must be annotated by this point while it is not
necessary. If the inlining process was interrupted because the threshold was reached, the rest
of the instructions would not be annotated which triggers the assert.
The added test shows the situation in which it can happen.
This is a recommit as the original commit fail due to the absence of REQUIRES: assert in the test.

Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D79107
2020-04-30 15:38:36 +00:00
Evgeniy Brevnov bb0842a3f1 [BPI] Incorrect probability reported in case of mulptiple edges.
Summary:
By design 'BranchProbabilityInfo:: getEdgeProbability(const BasicBlock *Src, const BasicBlock *Dst) const' should return sum of probabilities over all edges from Src to Dst. Current implementation is buggy and returns 1/num_of_successors if probabilities are not explicitly set.

Note current implementation of BPI printing has an issue as well and annotates each edge with sum of probabilities over all ages from one basic block to another. That's why 30% probability reported (instead of 10%) in the lit test. This is not urgent issue since only printing is affected.
Note also current implementation assumes that either all or none edges have probabilities set. This is not the only place which uses such assumption. At least we should assert that in verifier. In addition we can think on a more robust API of BPI which would prevent situations.

Reviewers: skatkov, yrouban, taewookoh

Reviewed By: skatkov

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79071
2020-04-30 11:41:03 +07:00
Evgeniy Brevnov 3e68a66704 [BPI][NFC] Reuse post dominantor tree from analysis manager when available
Summary: Currenlty BPI unconditionally creates post dominator tree each time. While this is not incorrect we can save compile time by reusing existing post dominator tree (when it's valid) provided by analysis manager.

Reviewers: skatkov, taewookoh, yrouban

Reviewed By: skatkov

Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78987
2020-04-30 11:31:03 +07:00
Kirill Naumov 0fa793e798 Revert "[InlineCost] Addressing a very strict assert check in CostAnnotationWriter::emitInstructionAnnot"
This reverts commit 66947d05fd.
2020-04-29 22:00:51 +00:00
Alina Sbirlea 161ccfe5ba [MemorySSA] Pass DT to the upward iterator for proper PhiTranslation.
Summary:
A valid DominatorTree is needed to do PhiTranslation.
Before this patch, a MemoryUse could be optimized to an access outside a loop, while the address it loads from is modified in the loop.
This can lead to a miscompile.

Reviewers: george.burgess.iv

Subscribers: Prazek, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79068
2020-04-29 14:28:31 -07:00
Kirill Naumov 055f58fcfc [CFG] Turning on Heat Colors for CFG by default
This option seems to be very useful, so let's turn it on by default

Reviewed-By: davidxl
Diff: https://reviews.llvm.org/D79110
2020-04-29 20:44:10 +00:00
Kirill Naumov 66947d05fd [InlineCost] Addressing a very strict assert check in CostAnnotationWriter::emitInstructionAnnot
The assert checks that every instruction must be annotated by this point while it is not
necessary. If the inlining process was interrupted because the threshold was reached, the rest
of the instructions would not be annotated which triggers the assert.
The added test shows the situation in which it can happen.

Reviewed-By: mtrofin
Diff: https://reviews.llvm.org/D79107
2020-04-29 20:44:10 +00:00
Simon Pilgrim 090cae8491 [TTI] Add DemandedElts to getScalarizationOverhead
The improvements to the x86 vector insert/extract element costs in D74976 resulted in the estimated costs for vector initialization and scalarization increasing higher than should be expected. This is particularly noticeable on pre-SSE4 targets where the available of legal INSERT_VECTOR_ELT ops is more limited.

This patch does 2 things:
1 - it implements X86TTIImpl::getScalarizationOverhead to more accurately represent the typical costs of a ISD::BUILD_VECTOR pattern.
2 - it adds a DemandedElts mask to getScalarizationOverhead to permit the SLP's BoUpSLP::getGatherCost to be rewritten to use it directly instead of accumulating raw vector insertion costs.

This fixes PR45418 where a v4i8 (zext'd to v4i32) was no longer vectorizing.

A future patch should extend X86TTIImpl::getScalarizationOverhead to tweak the EXTRACT_VECTOR_ELT scalarization costs as well.

Reviewed By: @craig.topper

Differential Revision: https://reviews.llvm.org/D78216
2020-04-29 12:00:38 +01:00
Florian Hahn 616657b39c [LAA] Move CheckingPtrGroup/PointerCheck outside class (NFC).
This allows forward declarations of PointerCheck, which in turn reduce
the number of times LoopAccessAnalysis needs to be included.

Ultimately this helps with moving runtime check generation to
Transforms/Utils/LoopUtils.h, without having to include it there.

Reviewers: anemet, Ayal

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D78458
2020-04-28 21:47:31 +01:00
David Blaikie 1b56980845 MustBeExecutedContextPrinter::runOnModule: Use unique_ptr to simplify/clarify ownership 2020-04-28 11:30:53 -07:00
Sam Parker e9c9329aa4 [TTI] Add TargetCostKind argument to getUserCost
There are several different types of cost that TTI tries to provide
explicit information for: throughput, latency, code size along with
a vague 'intersection of code-size cost and execution cost'.

The vectorizer is a keen user of RecipThroughput and there's at least
'getInstructionThroughput' and 'getArithmeticInstrCost' designed to
help with this cost. The latency cost has a single use and a single
implementation. The intersection cost appears to cover most of the
rest of the API.

getUserCost is explicitly called from within TTI when the user has
been explicit in wanting the code size (also only one use) as well
as a few passes which are concerned with a mixture of size and/or
a relative cost. In many cases these costs are closely related, such
as when multiple instructions are required, but one evident diverging
cost in this function is for div/rem.

This patch adds an argument so that the cost required is explicit,
so that we can make the important distinction when necessary.

Differential Revision: https://reviews.llvm.org/D78635
2020-04-28 08:57:45 +01:00
Craig Topper a58b62b4a2 [IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand().
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().

I also made a few cleanups in here. For example, to removes use
of getElementType on a pointer when we could just use getFunctionType
from the call.

Differential Revision: https://reviews.llvm.org/D78882
2020-04-27 22:17:03 -07:00
Mircea Trofin 011a07c075 Fix missing namespace in API implementation. 2020-04-27 21:05:33 -07:00
Mircea Trofin cb56e9b923 [llvm][NFC] Use CallBase instead of Instruction in ProfileSummaryInfo
Summary:
getProfileCount requires the parameter be a valid CallBase, and its uses
reflect that.

Reviewers: dblaikie, craig.topper, wmi

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78940
2020-04-27 20:47:52 -07:00
Mircea Trofin 8a4013ed38 [llvm][NFC] Add an explicit 'ComputeFullInlineCost' API
Summary:
Added getInliningCostEstimate, which is essentially what getInlineCost
computes if passed default inlining params, and  non-null ORE or
InlineParams::ComputeFullInlineCost.

Reviewers: davidxl, eraman, jdoerfert

Subscribers: hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78730
2020-04-27 09:11:45 -07:00
Hongtao Yu 93efe25ab3 [ViewCFG] Allow printing edge weights in debuggers
Summary:
Extending the Function::viewCFG prototypes to allow for printing block probability info in form of .dot files during debugging.

Also avoiding an AV when no BFI/BPI available.

Reviewers: wenlei, davidxl, knaumov

Reviewed By: wenlei, davidxl

Subscribers: MaskRay, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77978
2020-04-26 13:18:29 -07:00
Simon Pilgrim a3982491db [Pass] Ensure we don't include PassSupport.h or PassAnalysisSupport.h directly
Both PassSupport.h and PassAnalysisSupport.h are only supposed to be included via Pass.h.

Differential Revision: https://reviews.llvm.org/D78815
2020-04-26 12:58:20 +01:00
Nikita Popov 2b2827552a [CaptureTracking] Make MaxUsesToExplore cheaper (NFC)
The change in D78624 had a noticeable negative compile-time impact.
It seems that going through a function call for the MaxUsesToExplore
default is fairly expensive, at least if LLVM is not built with LTO.

This patch makes MaxUsesToExpore default to 0 and assigns the actual
default in the implementation instead. This recovers most of the
regression.

Differential Revision: https://reviews.llvm.org/D78734
2020-04-26 09:54:15 +02:00
Juneyoung Lee f5677fe700 [ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into more constants/instructions
Summary:
This patch helps isGuaranteedNotToBeUndefOrPoison look into more constants and instructions (bitcast/alloca/gep/fcmp).

To deal with bitcast, Depth is added to isGuaranteedNotToBeUndefOrPoison.

This patch is splitted from https://reviews.llvm.org/D75808.

Checked with Alive2

Reviewers: reames, jdoerfert

Reviewed By: jdoerfert

Subscribers: sanwou01, spatel, llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76010
2020-04-25 23:29:54 +09:00
Florian Hahn 82ce334727 [ValueLattice] Merging unknown with empty CR is unknown.
Currently an unknown/undef value is marked as overdefined when merged
with an empty range. An empty range can occur in unreachable/dead code.
When merging the new unknown state (= no value known yet) with an empty
range, there still isn't any information about the value yet and we can
stay in unknown.

This gives a few nice improvements on the number of instructions removed
by IPSCCP:
Same hash: 170 (filtered out)
Remaining: 67
Metric: sccp.IPNumInstRemoved

Program                                        base     patch    diff
 test-suite...rks/FreeBench/mason/mason.test     3.00   6.00 100.0%
 test-suite...nchmarks/McCat/18-imp/imp.test     3.00   5.00 66.7%
 test-suite...C/CFP2000/179.art/179.art.test     2.00   3.00 50.0%
 test-suite...ijndael/security-rijndael.test     2.00   3.00 50.0%
 test-suite...ks/Prolangs-C/agrep/agrep.test    40.00  58.00 45.0%
 test-suite...ce/Applications/Burg/burg.test    26.00  37.00 42.3%
 test-suite...cCat/03-testtrie/testtrie.test     3.00   4.00 33.3%
 test-suite...Source/Benchmarks/sim/sim.test    29.00  36.00 24.1%
 test-suite.../Applications/spiff/spiff.test     9.00  11.00 22.2%
 test-suite...s/FreeBench/neural/neural.test     5.00   6.00 20.0%
 test-suite...pplications/treecc/treecc.test    66.00  79.00 19.7%
 test-suite...langs-C/football/football.test    85.00 101.00 18.8%
 test-suite...ce/Benchmarks/PAQ8p/paq8p.test    90.00 105.00 16.7%
 test-suite...oxyApps-C++/miniFE/miniFE.test    37.00  43.00 16.2%
 test-suite...rks/FreeBench/pifft/pifft.test    26.00  30.00 15.4%
 test-suite...lications/sqlite3/sqlite3.test   481.00  548.00  13.9%
 test-suite...marks/7zip/7zip-benchmark.test   4875.00 5522.00 13.3%
 test-suite.../CINT2000/176.gcc/176.gcc.test   1117.00 1197.00  7.2%
 test-suite...0.perlbench/400.perlbench.test   1618.00 1732.00  7.0%

Reviewers: efriedma, nikic, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D78667
2020-04-25 13:43:34 +01:00
Tyker e5f8a77c19 [AssumeBundles] Refactor asssume builder
Summary:
refactor assume bulider for the next patch.
the assume builder now generate only one assume per attribute kind and per value they are on. to do this it takes the highest. this is desirable because currently, for all attributes the higest value is the most valuable.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78013
2020-04-25 13:43:52 +02:00
Tyker 42431da895 [AssumeBundles] Use assume bundles in isKnownNonZero
Summary: Use nonnull and dereferenceable from an assume bundle in isKnownNonZero

Reviewers: jdoerfert, nikic, lebedev.ri, reames, fhahn, sstefan1

Reviewed By: jdoerfert

Subscribers: fhahn, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76149
2020-04-24 20:41:51 +02:00
Mircea Trofin b8960b5d81 [llvm][NFC][CallSite] Remove remaining {Immutable}CallSite uses
Reviewers: dblaikie, craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78789
2020-04-23 22:19:39 -07:00
Mircea Trofin 2059a6e3ef [llvm][NFC][CallSite] Remove ImmutableCallSite from a few locations
Reviewers: craig.topper, dblaikie

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78783
2020-04-23 21:18:44 -07:00
Eli Friedman 3291efc2b3 [ValueTracking] Handle shufflevector constants in ComputeNumSignBits
Differential Revision: https://reviews.llvm.org/D78688
2020-04-23 17:47:37 -07:00
James Y Knight 248a5db3f2 Change callbr to only define its output SSA variable on the normal
path, not the indirect targets.

Fixes: PR45565.

Differential Revision: https://reviews.llvm.org/D78341
2020-04-23 19:36:44 -04:00
Craig Topper d6c5daf0bf [CallSite removal][ValueTracking] Replace CallSite with CallBase. NFC" 2020-04-23 15:25:19 -07:00
Christopher Tetreault 9174e0229f [SVE] Remove calls to VectorType::isScalable from analysis
Reviewers: efriedma, sdesmalen, chandlerc, sunfish

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77692
2020-04-23 12:44:22 -07:00
Mircea Trofin 201498c6f3 [llvm][NFC] Factor out cost-model independent inling decision
Summary:
llvm::getInlineCost starts off by determining whether inlining should
happen or not because of user directives or easily determinable
unviability. This CL refactors this functionality as a reusable API.

Reviewers: davidxl, eraman

Reviewed By: davidxl, eraman

Subscribers: hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73825
2020-04-23 10:58:43 -07:00
Mircea Trofin cea6f4d5f8 [llvm][NFC][CallSite] Remove CallSite from TypeMetadataUtils & related
Reviewers: craig.topper, dblaikie

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78666
2020-04-23 08:23:16 -07:00
Sanjay Patel e86eff0e82 [InstSimplify] fold and/or of compares with equality to min/max constant
I found 12 (6 if we compress the DeMorganized forms) patterns for logic-of-compares
with a min/max constant while looking at PR45510:
https://bugs.llvm.org/show_bug.cgi?id=45510

The variations on those forms multiply the test cases by 8 (unsigned/signed, swapped
compare operands, commuted logic operands).
We have partial logic to deal with these for the unsigned min (zero) case, but
missed everything else.

We are deferring the majority of these patterns to InstCombine to allow more general
handling (see D78582).

We could use ConstantRange instead of predicate+constant matching here. I don't
expect there's any noticeable compile-time impact for either form.

Here's an abuse of Alive2 to show the 12 basic signed variants of the patterns in
one function:
http://volta.cs.utah.edu:8080/z/5Vpiyg

declare void @use(i1, i1, i1, i1, i1, i1, i1, i1, i1, i1, i1, i1)
define void @src(i8 %x, i8 %y)  {
  %m1 = icmp eq i8 %x, 127
  %c1 = icmp slt i8 %x, %y
  %r1 = and i1 %m1, %c1   ; (X == MAX) && (X < Y) --> false

  %m2 = icmp ne i8 %x, 127
  %c2 = icmp sge i8 %x, %y
  %r2 = or i1 %m2, %c2    ; (X != MAX) || (X >= Y) --> true

  %m3 = icmp eq i8 %x, -128
  %c3 = icmp sgt i8 %x, %y
  %r3 = and i1 %m3, %c3   ; (X == MIN) && (X > Y) --> false

  %m4 = icmp ne i8 %x, -128
  %c4 = icmp sle i8 %x, %y
  %r4 = or i1 %m4, %c4    ; (X != MIN) || (X <= Y) --> true

  %m5 = icmp eq i8 %x, 127
  %c5 = icmp sge i8 %x, %y
  %r5 = and i1 %m5, %c5   ; (X == MAX) && (X >= Y) --> X == MAX

  %m6 = icmp ne i8 %x, 127
  %c6 = icmp slt i8 %x, %y
  %r6 = or i1 %m6, %c6   ; (X != MAX) || (X < Y) --> X != MAX

  %m7 = icmp eq i8 %x, -128
  %c7 = icmp sle i8 %x, %y
  %r7 = and i1 %m7, %c7   ; (X == MIN) && (X <= Y) --> X == MIN

  %m8 = icmp ne i8 %x, -128
  %c8 = icmp sgt i8 %x, %y
  %r8 = or i1 %m8, %c8   ; (X != MIN) || (X > Y) --> X != MIN

  %m9 = icmp ne i8 %x, 127
  %c9 = icmp slt i8 %x, %y
  %r9 = and i1 %m9, %c9    ; (X != MAX) && (X < Y) --> X < Y

  %m10 = icmp eq i8 %x, 127
  %c10 = icmp sge i8 %x, %y
  %r10 = or i1 %m10, %c10    ; (X == MAX) || (X >= Y) --> X >= Y

  %m11 = icmp ne i8 %x, -128
  %c11 = icmp sgt i8 %x, %y
  %r11 = and i1 %m11, %c11    ; (X != MIN) && (X > Y) --> X > Y

  %m12 = icmp eq i8 %x, -128
  %c12 = icmp sle i8 %x, %y
  %r12 = or i1 %m12, %c12    ; (X == MIN) || (X <= Y) --> X <= Y

  call void @use(i1 %r1, i1 %r2, i1 %r3, i1 %r4, i1 %r5, i1 %r6, i1 %r7, i1 %r8, i1 %r9, i1 %r10, i1 %r11, i1 %r12)
  ret void
}

define void @tgt(i8 %x, i8 %y)  {
  %m5 = icmp eq i8 %x, 127
  %m6 = icmp ne i8 %x, 127
  %m7 = icmp eq i8 %x, -128
  %m8 = icmp ne i8 %x, -128
  %c9 = icmp slt i8 %x, %y
  %c10 = icmp sge i8 %x, %y
  %c11 = icmp sgt i8 %x, %y
  %c12 = icmp sle i8 %x, %y
  call void @use(i1 0, i1 1, i1 0, i1 1, i1 %m5, i1 %m6, i1 %m7, i1 %m8, i1 %c9, i1 %c10, i1 %c11, i1 %c12)
  ret void
}

Differential Revision: https://reviews.llvm.org/D78430
2020-04-23 09:16:10 -04:00
Serguei Katkov c0d2bbb1d4 [CaptureTracking] Replace hardcoded constant to option. NFC.
The motivation is to be able to play with the option and change if it is required.

Reviewers: fedor.sergeev, apilipenko, rnk, jdoerfert
Reviewed By: fedor.sergeev
Subscribers: hiraditya, dantrushin, llvm-commits
Differential Revision: https://reviews.llvm.org/D78624
2020-04-23 18:23:35 +07:00
Juneyoung Lee aca335955c [ValueTracking] Let analyses assume a value cannot be partially poison
Summary:
This is RFC for fixes in poison-related functions of ValueTracking.
These functions assume that a value can be poison bitwisely, but the semantics
of bitwise poison is not clear at the moment.
Allowing a value to have bitwise poison adds complexity to reasoning about
correctness of optimizations.

This patch makes the analysis functions simply assume that a value is
either fully poison or not, which has been used to understand the correctness
of a few previous optimizations.
The bitwise poison semantics seems to be only used by these functions as well.

In terms of implementation, using value-wise poison concept makes existing
functions do more precise analysis, which is what this patch contains.

Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, nlopes, regehr

Reviewed By: nikic

Subscribers: fhahn, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78503
2020-04-23 08:08:53 +09:00
Juneyoung Lee 5ceef26350 Revert "RFC: [ValueTracking] Let analyses assume a value cannot be partially poison"
This reverts commit 80faa8c3af.
2020-04-23 08:07:09 +09:00
Juneyoung Lee 80faa8c3af RFC: [ValueTracking] Let analyses assume a value cannot be partially poison
Summary:
This is RFC for fixes in poison-related functions of ValueTracking.
These functions assume that a value can be poison bitwisely, but the semantics
of bitwise poison is not clear at the moment.
Allowing a value to have bitwise poison adds complexity to reasoning about
correctness of optimizations.

This patch makes the analysis functions simply assume that a value is
either fully poison or not, which has been used to understand the correctness
of a few previous optimizations.
The bitwise poison semantics seems to be only used by these functions as well.

In terms of implementation, using value-wise poison concept makes existing
functions do more precise analysis, which is what this patch contains.

Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, nlopes, regehr

Reviewed By: nikic

Subscribers: fhahn, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78503
2020-04-23 07:57:12 +09:00
Craig Topper be04aba6fc [CallSite removal][ValueTracking] Use CallBase instead of ImmutableCallSite for getIntrinsicForCallSite. NFC
Differential Revision: https://reviews.llvm.org/D78613
2020-04-22 12:06:58 -07:00
Craig Topper 05a11974ae [CallSite removal] Remove unneeded includes of CallSite.h. NFC 2020-04-22 00:07:13 -07:00
Sanjay Patel cf30aafa2d [Analysis] recognize the 'null' pointer constant as not poison
Differential Revision: https://reviews.llvm.org/D78575
2020-04-21 14:23:06 -04:00
Simon Pilgrim 3caa03ec51 AliasAnalysisSummary.h - cleanup includes and forward declarations. NFC.
Push InstrTypes.h include down to AliasAnalysisSummary.cpp
2020-04-21 11:32:58 +01:00
Sam Parker ee959ddc5e [TTI] Remove getOperationCost
This API call has been used recently with, a very valid, expectation
that it would do something useful but it doesn't actually query any
backend information. So, remove this method and merge its
functionality into getUserCost. As well as that, also use
getCastInstrCost to get a proper cost from the backend for the
concerned instructions though we only currently return the answer if
it's considered free. The default implementation now also checks
int/ptr conversions too, as well as truncs and bitcasts.

Differential Revision: https://reviews.llvm.org/D76124
2020-04-21 09:15:34 +01:00
Mircea Trofin 1809949239 [llvm][NFC][CallSite] Remove CallSite from Lint.cpp
Summary: The CallSite arg iterator is really User::op_iterator.

Reviewers: dblaikie, craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78507
2020-04-20 12:29:11 -07:00
Nikita Popov 54d01cbc15 [IPT] Don't use OrderedInstructions (NFC)
Use Instruction::comesBefore() instead of OrderedInstructions
inside InstructionPrecedenceTracking. This also removes the
dominator tree dependency.

Differential Revision: https://reviews.llvm.org/D78461
2020-04-20 18:25:31 +02:00
Florian Hahn 2737362e7a [VectorUtils] Use early_inc_range instead of DelSet (NFC).
DelSet was used to avoid invalidating the current iterator while
modifying the map we are iterating over.

By using an early_inc_range, (which increments to iterator 'early',
allowing us to remove the current element), we can get rid of DelSet.

Reviewers: gilr, rengolin, Ayal, hsaito

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D78420
2020-04-20 16:36:26 +01:00
Sam Parker e3056ae9a0 [NFC][TTI] Explicit use of VectorType
The API for shuffles and reductions uses generic Type parameters,
instead of VectorType, and so assertions and casts are used a lot.
This patch makes those types explicit, which means that the clients
can't be lazy, but results in less ambiguity, and that can only be a
good thing.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45562

Differential Revision: https://reviews.llvm.org/D78357
2020-04-20 09:16:52 +01:00
Craig Topper 252873879e [CallSite removal][Analysis] Replace CallSite with CallBase in MemoryBuiltins. NFC
Differential Revision: https://reviews.llvm.org/D78449
2020-04-19 18:32:48 -07:00
Craig Topper dff18c79f2 [CallSite removal][Lint] Replace visitCallSite with visitCallBase. NFC
Drop visitCallInst/visitInvokeInst since the generic implementation
will delegate to visitCallBase. They appear to be from a time
before visitCallSite existed in the InstVisitor system.

Differential Revision: https://reviews.llvm.org/D78448
2020-04-19 18:32:40 -07:00
Nikita Popov f52e050757 [LVI] Use Optional instead of out parameter (NFC)
As suggested on D76788, this switches the LVI implementation to
return Optional<ValueLatticeElement> from various methods, instead
of passing in a ValueLatticeElement reference and returning a boolean.

Differential Revision: https://reviews.llvm.org/D78383
2020-04-19 21:17:43 +02:00
Mircea Trofin ec73ae11a3 [llvm][NFC][CallSite] Remove CallSite from ProfileSummary
Summary: Depends on D78395.

Reviewers: craig.topper, dblaikie, wmi, davidxl

Subscribers: eraman, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78414
2020-04-18 12:03:14 -07:00
Simon Pilgrim 9b95186c30 HeatUtils.h - remove unnecessary includes. NFC.
Replace with BlockFrequencyInfo/Function forward declarations
Move BlockFrequencyInfo.h include to HeatUtils.cpp
2020-04-18 13:37:06 +01:00
Florian Hahn 4ee45ab60f [LV] Invalidate cost model decisions along with interleave groups.
Cost-modeling decisions are tied to the compute interleave groups
(widening decisions, scalar and uniform values). When invalidating the
interleave groups, those decisions also need to be invalidated.

Otherwise there is a mis-match during VPlan construction.
VPWidenMemoryRecipes created initially are left around w/o converting them
into VPInterleave recipes. Such a conversion indeed should not take place,
and these gather/scatter recipes may in fact be right. The crux is leaving around
obsolete CM_Interleave (and dependent) markings of instructions along with
their costs, instead of recalculating decisions, costs, and recipes.

Alternatively to forcing a complete recompute later on, we could try
to selectively invalidate the decisions connected to the interleave
groups. But we would likely need to run the uniform/scalar value
detection parts again anyways and the extra complexity is probably not
worth it.

Fixes PR45572.

Reviewers: gilr, rengolin, Ayal, hsaito

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D78298
2020-04-18 10:23:49 +01:00
Nikita Popov f715eda604 [LVI] Cleanup/unify cache access
This patch combines the "has" and "get" parts of the cache access.
getCachedValueInfo() now both sets the BBLV return argument, and
returns whether the value was found.

Additionally, the management of the work stack is now integrated
into getBlockValue(). If the value is not cached yet, we try to
push to the stack (and return false, indicating that we need to
solve first), or return overdefined in case of a cycle.

These changes a) avoid a duplicate cache lookup for has & get and
b) ensure that the logic is uniform everywhere. For this reason
this change is also not quite NFC, because previously overdefined
values from the cache, and overdefined values from a cycle received
different treatment when it came to assumption intersection.

Differential Revision: https://reviews.llvm.org/D76788
2020-04-17 18:44:38 +02:00
Benjamin Kramer 166467e822 [VectorUtils] Create shufflevector masks as int vectors instead of Constants
No functionality change intended.
2020-04-17 15:28:00 +02:00
Max Kazantsev 72c13446ce [NFC] Add missing 'const' notion to LCSSA-related functions
These functions don't really do any changes to loop info or
dominator tree. We should state this explicitly using 'const'.
2020-04-17 17:49:34 +07:00
Johannes Doerfert df675890b7 [CallGraphUpdater][NFC] Minor updates to D77855
I uploaded the old version accidentally instead of the one with these
minor adjustments requested by the reviewers.

Differential Revision: https://reviews.llvm.org/D77855
2020-04-15 21:26:35 -05:00
Johannes Doerfert 937025757c [CallGraphUpdater] Remove nodes from their SCC (old PM)
Summary:
We can and should remove deleted nodes from their respective SCCs. We
did not do this before and this was a potential problem even though I
couldn't locally trigger an issue. Since the `DeleteNode` would assert
if the node was not in the SCC, we know we only remove nodes from their
SCC and only once (when run on all the Attributor tests).

Reviewers: lebedev.ri, hfinkel, fhahn, probinson, wristow, loladiro, sstefan1, uenoku

Subscribers: hiraditya, bollu, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77855
2020-04-15 18:38:50 -05:00
Johannes Doerfert 1b34b84ddd [CallGraphUpdater] Update the ExternalCallingNode for node replacements
Summary:
While it is uncommon that the ExternalCallingNode needs to be updated,
it can happen. It is uncommon because most functions listed as callees
have external linkage, modifying them is usually not allowed. That said,
there are also internal functions that have, or better had, their
"address taken" at construction time. We conservatively assume various
uses cause the address "to be taken". Furthermore, the user might have
become dead at some point. As a consequence, transformations, e.g., the
Attributor, might be able to replace a function that is listed
as callee of the ExternalCallingNode.

Since there is no function corresponding to the ExternalCallingNode, we
did just remove the node from the callee list if we replaced it (so
far). Now it would be preferable to replace it if needed and remove it
otherwise. However, removing the node has implications on the CGSCC
iteration. Locally, that caused some other nodes to be never visited
but it is for sure possible other (bad) side effects can occur. As it
seems conservatively safe to keep the new node in the callee list we
will do that for now.

Reviewers: lebedev.ri, hfinkel, fhahn, probinson, wristow, loladiro, sstefan1, uenoku

Subscribers: hiraditya, bollu, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77854
2020-04-15 18:38:50 -05:00
Simon Moll 2a0a26bd98 [nfc] clang-format TargetTransformInfo.cpp 2020-04-15 14:43:26 +02:00
Mircea Trofin 447e2c3067 [llvm][NFC][CallSite] Remove Implementation uses of CallSite
Reviewers: dblaikie, davidxl, craig.topper

Subscribers: arsenm, dschuff, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78142
2020-04-14 14:49:47 -07:00
Juneyoung Lee 994543abc9 [ValueTracking] Implement canCreatePoison
Summary:
This PR adds `canCreatePoison(Instruction *I)` which returns true if `I` can generate poison from non-poison
operands.

Reviewers: spatel, nikic, lebedev.ri

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits, regehr, nlopes

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77890
2020-04-15 05:58:06 +09:00
Aaron Puchert e833e58300 [ValueLattice] Remove unused DataLayout parameter of mergeIn, NFC
Reviewed By: fhahn, echristo

Differential Revision: https://reviews.llvm.org/D78061
2020-04-14 13:32:53 +02:00
Georgii Rymar 1647ff6e27 [ADT/STLExtras.h] - Add llvm::is_sorted wrapper and update callers.
It can be used to avoid passing the begin and end of a range.
This makes the code shorter and it is consistent with another
wrappers we already have.

Differential revision: https://reviews.llvm.org/D78016
2020-04-14 14:11:02 +03:00
Mehdi Amini 384ca190ae Revert "Move ModuleSummaryAnalysis from libAnalysis to libObject to break the dependency from Analysis to Object"
This reverts commit 10df1563d6.

Some buildbots are broken.
2020-04-14 00:27:08 +00:00
Mehdi Amini 10df1563d6 Move ModuleSummaryAnalysis from libAnalysis to libObject to break the dependency from Analysis to Object
ModuleSummaryAnalysis is the only file in libAnalysis that brings a
dependency on the CodeGen layer from libAnalysis, moving it breaks this
dependency.

Differential Revision: https://reviews.llvm.org/D77994
2020-04-13 23:12:11 +00:00
Vedant Kumar 122a6bfb07 [Debugify] Strip added metadata in the -debugify-each pipeline
Summary:
Share logic to strip debugify metadata between the IR and MIR level
debugify passes. This makes it simpler to hunt for bugs by diffing IR
with vs. without -debugify-each turned on.

As a drive-by, fix an issue causing CallGraphNodes to become invalid
when a dead llvm.dbg.value prototype is deleted.

Reviewers: dsanders, aprantl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77915
2020-04-13 10:55:17 -07:00
Simon Pilgrim ad57286232 CodeMetrics.h - include and forward declaration cleanup. NFC.
Remove SmallPtrSet include, replace with forward declaration and include SmallPtrSet.h in CodeMetrics.cpp directly.
Remove unused llvm::DataLayout/Instruction forward declarations.
2020-04-13 13:09:39 +01:00
Tyker 813f438baa [AssumeBundles] adapt Assumption cache to assume bundles
Summary: change assumption cache to store an assume along with an index to the operand bundle containing the knowledge.

Reviewers: jdoerfert, hfinkel

Reviewed By: jdoerfert

Subscribers: hiraditya, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77402
2020-04-13 12:04:51 +02:00
Huihui Zhang 4bde7c5986 [NFC] Use VectorType::isScalable to align with ongoing VectorType refactor. 2020-04-12 15:39:13 -07:00
Sanjay Patel c23cbefd9d [VectorUtils] add IR-level analysis for widening of shuffle mask
This is similar to the recent move/addition of "scaleShuffleMask" (D76508),
but there are a couple of differences:

1. The existing x86 helper (canWidenShuffleElements) always tries to
   divide-by-2, so it gets called iteratively and wouldn't handle the
   general case of non-pow-2 length.
2. The existing x86 code handles "SM_SentinelZero" - we don't have
   that in IR, but this code should be safe to use with that or other
   special (negative) values.

The motivation is to enable shuffle folds in instcombine/vector-combine
that are similar to D76844 and D76727, but in the reverse-bitcast direction.
Those patterns are visible in the tests for D40633.

Differential Revision: https://reviews.llvm.org/D77881
2020-04-12 10:14:19 -04:00
Sanjay Patel 1318ddbc14 [VectorUtils] rename scaleShuffleMask to narrowShuffleMaskElts; NFC
As proposed in D77881, we'll have the related widening operation,
so this name becomes too vague.

While here, change the function signature to take an 'int' rather
than 'size_t' for the scaling factor, add an assert for overflow of
32-bits, and improve the documentation comments.
2020-04-11 10:05:49 -04:00
Mehdi Amini ed03d9485e Revert "[TLI] Per-function fveclib for math library used for vectorization"
This reverts commit 60c642e74b.

This patch is making the TLI "closed" for a predefined set of VecLib
while at the moment it is extensible for anyone to customize when using
LLVM as a library.
Reverting while we figure out a way to re-land it without losing the
generality of the current API.

Differential Revision: https://reviews.llvm.org/D77925
2020-04-11 01:05:01 +00:00
Huihui Zhang 6c989d0248 [BasicAA] Fix aliasGEP/DecomposeGEPExpression for scalable type.
Summary:
Don't attempt to analyze the decomposed GEP for scalable type.
GEP index scale is not compile-time constant for scalable type.
Be conservative, return MayAlias.

Explicitly call TypeSize::getFixedSize() to assert on places where
scalable type doesn't make sense.

Add unit tests to check functionality of -basicaa for scalable type.

This patch is needed for D76944.

Reviewers: sdesmalen, efriedma, spatel, bjope, ctetreau

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77828
2020-04-10 16:58:26 -07:00
Wenlei He 60c642e74b [TLI] Per-function fveclib for math library used for vectorization
Summary:
Encode `-fveclib` setting as per-function attribute so it can threaded through to LTO backends. Accordingly per-function TLI now reads
the attributes and select available vector function list based on that. Now we also populate function list for all supported vector
libraries for the shared per-module `TargetLibraryInfoImpl`, so each function can select its available vector list independently but without
duplicating the vector function lists. Inlining between incompatbile vectlib attributed is also prohibited now.

Subscribers: hiraditya, dexonsmith, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77632
2020-04-09 18:26:38 -07:00
Christopher Tetreault b96558f5e5 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: sunfish, sdesmalen, efriedma

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77273
2020-04-09 12:41:28 -07:00
Mircea Trofin b4924f01a4 [llvm][nfc] InstructionCostDetail encapsulation
Ensured initialized fields; encapsulad delta calulations and evaluation
of threshold having had changed; assertion for CostThresholdMap
dereference, to indicate design intent.

Differential Revision: https://reviews.llvm.org/D77762
2020-04-09 08:21:18 -07:00
Jay Foad c63aed890e [KnownBits] Move AND, OR and XOR logic into KnownBits
Summary:
There are at least three clients for KnownBits calculations:
ValueTracking, SelectionDAG and GlobalISel. To reduce duplication the
common logic should be moved out of these clients and into KnownBits
itself.

This patch does this for AND, OR and XOR calculations by implementing
and using appropriate operator overloads KnownBits::operator& etc.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74060
2020-04-09 10:10:37 +01:00
Jay Foad 94cc9eccf6 [ValueTracking] Simplify KnownBits construction
Use the simpler BitWidth constructor instead of the copy constructor to
make it clear when we don't actually need to copy an existing KnownBits
value. Split out from D74539. NFC.
2020-04-09 09:27:22 +01:00
Serge Pavlov c7ff5b38f2 [FPEnv] Use single enum to represent rounding mode
Now compiler defines 5 sets of constants to represent rounding mode.
These are:

1. `llvm::APFloatBase::roundingMode`. It specifies all 5 rounding modes
defined by IEEE-754 and is used in `APFloat` implementation.

2. `clang::LangOptions::FPRoundingModeKind`. It specifies 4 of 5 IEEE-754
rounding modes and a special value for dynamic rounding mode. It is used
in clang frontend.

3. `llvm::fp::RoundingMode`. Defines the same values as
`clang::LangOptions::FPRoundingModeKind` but in different order. It is
used to specify rounding mode in in IR and functions that operate IR.

4. Rounding mode representation used by `FLT_ROUNDS` (C11, 5.2.4.2.2p7).
Besides constants for rounding mode it also uses a special value to
indicate error. It is convenient to use in intrinsic functions, as it
represents platform-independent representation for rounding mode. In this
role it is used in some pending patches.

5. Values like `FE_DOWNWARD` and other, which specify rounding mode in
library calls `fesetround` and `fegetround`. Often they represent bits
of some control register, so they are target-dependent. The same names
(not values) and a special name `FE_DYNAMIC` are used in
`#pragma STDC FENV_ROUND`.

The first 4 sets of constants are target independent and could have the
same numerical representation. It would simplify conversion between the
representations. Also now `clang::LangOptions::FPRoundingModeKind` and
`llvm::fp::RoundingMode` do not contain the value for IEEE-754 rounding
direction `roundTiesToAway`, although it is supported natively on
some targets.

This change defines all the rounding mode type via one `llvm::RoundingMode`,
which also contains rounding mode for IEEE rounding direction `roundTiesToAway`.

Differential Revision: https://reviews.llvm.org/D77379
2020-04-09 13:26:47 +07:00
Kirill Naumov 8b67853a83 [CFGPrinter] Adding heat coloring to CFGPrinter
This patch introduces the heat coloring of the Control Flow Graph which is based
on the relative "hotness" of each BB. The patch is a part of sequence of three
patches, related to graphs Heat Coloring.

Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu

Differential Revision: https://reviews.llvm.org/D77161
2020-04-08 19:59:51 +00:00
Nikita Popov fe8abbf442 [BPI] Clear handles when releasing memory (NFC)
This reduces max-rss of sqlite compilation by 2.5%.
2020-04-07 22:51:01 +02:00
Eli Friedman 68b03aee1a Remove SequentialType from the type heirarchy.
Now that we have scalable vectors, there's a distinction that isn't
getting captured in the original SequentialType: some vectors don't have
a known element count, so counting the number of elements doesn't make
sense.

In some cases, there's a better way to express the commonality using
other methods. If we're dealing with GEPs, there's GEP methods; if we're
dealing with a ConstantDataSequential, we can query its element type
directly.

In the relatively few remaining cases, I just decided to write out
the type checks. We're talking about relatively few places, and I think
the abstraction doesn't really carry its weight. (See thread "[RFC]
Refactor class hierarchy of VectorType in the IR" on llvmdev.)

Differential Revision: https://reviews.llvm.org/D75661
2020-04-06 17:03:49 -07:00
Kirill Naumov 3f995ce8b5 [CFGPrinter][CallPrinter][polly] Adding distinct structure for CFGDOTInfo
The patch introduces the system to distinctively store the information
needed for the Control Flow Graph as well as the instrumentary needed for
the follow-up changes: BlockFrequencyInfo and BranchProbabilityInfo.
The patch is a part of sequence of three patches, related to graphs Heat Coloring.

Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu

Differential Revision: https://reviews.llvm.org/D76820
2020-04-06 17:42:54 +00:00
Sanjay Patel fbb1b43f13 [ValueTracking] enhance matching of umin/umax with 'not' operands
The cmyk test is based on the known regression that resulted from:
rGf2fbdf76d8d0

This improves on the equivalent signed min/max change:
rG867f0c3c4d8c

The underlying icmp equivalence is:
  ~X pred ~Y --> Y pred X

For an icmp with constant, canonicalization results in a swapped pred:
  ~X < C -->  X > ~C
2020-04-06 11:51:59 -04:00
Sanjay Patel 867f0c3c4d [ValueTracking] enhance matching of smin/smax with 'not' operands
The cmyk tests are based on the known regression that resulted from:
rGf2fbdf76d8d0

So this improvement in analysis might be enough to restore that commit.
2020-04-05 08:54:12 -04:00
Florian Hahn 47ee404075 [ValueTracking] Use Inst::comesBefore in isValidAssumeForCtx (NFC).
D51664 added Instruction::comesBefore which should provide better
performance than the manual check.

Reviewers: rnk, nikic, spatel

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D76228
2020-04-05 12:38:04 +01:00
Nikita Popov a5eb1236e3 [IVDescriptors] Remove unnecessary DemandedBits.h include; NFC
Forward declare DemandedBits in IVDescriptors, and move include
into the cpp file. Also drop the include from LoopUtils, which
does not need it at all.
2020-04-04 12:07:57 +02:00
Alina Sbirlea 688450c7f0 [GraphDiff] Extend GraphDiff to track a list of updates.
Summary:
This patch includes two extensions:
1. It extends the GraphDiff to also keep the original list of updates
after legalization, not just the deletes/insert vectors.
It also provides an API to pop the first update (the updates are store
in reverse, such that the first update is at the end of the list)
2. It adds a bool to mark whether the given updates should be applied as
given, or applied in reverse. This moves the task of reversing the
updates (when the caller needs this) to a functionality inside
GraphDiff, versus having the caller do this.

The two changes could be split into two patches, but they seemed
reasonably small to be reviewed together.

Reviewers: kuhar, dblaikie

Subscribers: hiraditya, george.burgess.iv, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77167
2020-04-03 12:10:36 -07:00
Tyker c00cb76274 [NFC] Split Knowledge retention and place it more appropriatly
Summary:
Splitting Knowledge retention into Queries in Analysis and Builder into Transform/Utils
allows Queries and Transform/Utils to use Analysis.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77171
2020-04-02 15:01:41 +02:00
Jonas Paulsson 36d4421f50 [LoopDataPrefetch + SystemZ] Let target decide on prefetching for each loop.
This patch adds

- New arguments to getMinPrefetchStride() to let the target decide on a
  per-loop basis if software prefetching should be done even with a stride
  within the limit of the hw prefetcher.

- New TTI hook enableWritePrefetching() to let a target do write prefetching
  by default (defaults to false).

- In LoopDataPrefetch:

  - A search through the whole loop to gather information before emitting any
    prefetches. This way the target can get information via new arguments to
    getMinPrefetchStride() and emit prefetches more selectively. Collected
    information includes: Does the loop have a call, how many memory
    accesses, how many of them are strided, how many prefetches will cover
    them. This is NFC to before as long as the target does not change its
    definition of getMinPrefetchStride().

  - If a previous access to the same exact address was 'read', and the
    current one is 'write', make it a 'write' prefetch.

  - If two accesses that are covered by the same prefetch do not dominate
    each other, put the prefetch in a block that dominates both of them.

  - If a ConstantMaxTripCount is less than ItersAhead, then skip the loop.

- A SystemZ implementation of getMinPrefetchStride().

Review: Ulrich Weigand, Michael Kruse

Differential Revision: https://reviews.llvm.org/D70228
2020-04-02 14:57:46 +02:00
shchenz e344f8b9db Revert "[LSR] re-add testcase for wrongly phi node elimination - NFC"
This reverts commit f25a1b4f58.
ARM and hexagon fail at the new added case.
2020-04-01 12:58:06 +00:00
shchenz f25a1b4f58 [LSR] re-add testcase for wrongly phi node elimination - NFC
Retest the case on X86/SystemZ/AArch64/PowerPC
2020-04-01 11:11:17 +00:00
shchenz 8b8cd150a4 Revert "[LSR] add testcase for wrongly phi node elimination - NFC"
This reverts commit dbf5e4f6c7.
The testcase has different behaviour on PowerPC and X86.
2020-04-01 10:28:43 +00:00
shchenz dbf5e4f6c7 [LSR] add testcase for wrongly phi node elimination - NFC 2020-04-01 09:58:58 +00:00
Sam Parker 2641a19981 [TTI] Remove getCallCost
getCallCost is only used within the different layers of TTI, with no
backend implementing it so fold the base implementation into
getUserCost. I think this is an NFC.

Differential Revision: https://reviews.llvm.org/D77050
2020-04-01 09:05:25 +01:00
Craig Topper f92563f907 [VectorUtils][X86] De-templatize scaleShuffleMask and 2 X86 shuffle mask helpers and move their implementation to cpp files
Summary: These were templated due to SelectionDAG using int masks for shuffles and IR using unsigned masks for shuffles. But now that D72467 has landed we have an int mask version of IRBuilder::CreateShuffleVector. So just use int instead of a template

Reviewers: spatel, efriedma, RKSimon

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D77183
2020-04-01 00:46:48 -07:00
Eli Friedman 1ee6ec2bf3 Remove "mask" operand from shufflevector.
Instead, represent the mask as out-of-line data in the instruction. This
should be more efficient in the places that currently use
getShuffleVector(), and paves the way for further changes to add new
shuffles for scalable vectors.

This doesn't change the syntax in textual IR. And I don't currently plan
to change the bitcode encoding in this patch, although we'll probably
need to do something once we extend shufflevector for scalable types.

I expect that once this is finished, we can then replace the raw "mask"
with something more appropriate for scalable vectors.  Not sure exactly
what this looks like at the moment, but there are a few different ways
we could handle it.  Maybe we could try to describe specific shuffles.
Or maybe we could define it in terms of a function to convert a fixed-length
array into an appropriate scalable vector, using a "step", or something
like that.

Differential Revision: https://reviews.llvm.org/D72467
2020-03-31 13:08:59 -07:00
Florian Hahn b37543750c [ValueLattice] Distinguish between constant ranges with/without undef.
This patch updates ValueLattice to distinguish between ranges that are
guaranteed to not include undef and ranges that may include undef.

A constant range guaranteed to not contain undef can be used to simplify
instructions to arbitrary values. A constant range that may contain
undef can only be used to simplify to a constant. If the value can be
undef, it might take a value outside the range. For example, consider
the snipped below

define i32 @f(i32 %a, i1 %c) {
  br i1 %c, label %true, label %false
true:
  %a.255 = and i32 %a, 255
  br label %exit
false:
  br label %exit
exit:
  %p = phi i32 [ %a.255, %true ], [ undef, %false ]
  %f.1 = icmp eq i32 %p, 300
  call void @use(i1 %f.1)
  %res = and i32 %p, 255
  ret i32 %res
}

In the exit block, %p would be a constant range [0, 256) including undef as
%p could be undef. We can use the range information to replace %f.1 with
false because we remove the compare, effectively forcing the use of the
constant to be != 300. We cannot replace %res with %p however, because
if %a would be undef %cond may be true but the  second use might not be
< 256.

Currently LazyValueInfo uses the new behavior just when simplifying AND
instructions and does not distinguish between constant ranges with and
without undef otherwise. I think we should address the remaining issues
in LVI incrementally.

Reviewers: efriedma, reames, aqjune, jdoerfert, sstefan1

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D76931
2020-03-31 12:50:20 +01:00
Denis Antrushin 06c58f11a9 [SCEV] Use backedge SCEV of PHI only if its input is loop invariant
For the PHI node

      %1 = phi [%A, %entry], [%X, %latch]

it is incorrect to use SCEV of backedge val %X as an exit value
of PHI unless %X is loop invariant.
This is because exit value of %1 is value of %X at one-before-last
iteration of the loop.

Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D73181
2020-03-31 18:39:24 +07:00
Thomas Raoux 3ea0774b13 [ConstantFold][NFC] Compile time optimization for large vectors
Optimize the common case of splat vector constant. For large vector
going through all elements is expensive. For splatr/broadcast cases we
can skip going through all elements.

Differential Revision: https://reviews.llvm.org/D76664
2020-03-30 11:27:09 -07:00
Max Kazantsev 4e0d9925d6 [NFC] Remove obsolete checks followed by fix of isGuaranteedToTransferExecutionToSuccessor
In past, isGuaranteedToTransferExecutionToSuccessor contained some weird logic
for volatile loads/stores that was ultimately removed by patch D65375. It's time to
remove a piece of dependent logic that used to be a workaround for the code which
is now deleted.

Reviewed By: uenoku
Differential Revision: https://reviews.llvm.org/D76918
2020-03-30 12:24:41 +07:00
Uday Bondhugula c0955edfd6 Introduce support for lib function aligned_alloc in TLI / memory builtins
Aligned_alloc is a standard lib function and has been in glibc since
2.16 and in the C11 standard. It has semantics similar to malloc/calloc
for several analyses/transforms. This patch introduces aligned_alloc
in target library info and memory builtins. Subsequent ones will
make other passes aware and fix https://bugs.llvm.org/show_bug.cgi?id=44062

This change will also be useful to LLVM generators that need to allocate
buffers of vector elements larger than 16 bytes (for eg. 256-bit ones),
element boundary alignment for which is not typically provided by glibc malloc.

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Differential Revision: https://reviews.llvm.org/D76970
2020-03-29 23:36:24 +05:30
Serge Pavlov f398739152 [FEnv] Constfold some unary constrained operations
This change implements constant folding to constrained versions of
intrinsics, implementing rounding: floor, ceil, trunc, round, rint and
nearbyint.

Differential Revision: https://reviews.llvm.org/D72930
2020-03-28 12:28:33 +07:00
Francesco Petrogalli c66d1f38f6 [llvm][Support] Add isZero method for TypeSize. [NFC]
Summary:
The method is used where TypeSize is implicitly cast to integer for
being checked against 0.

Reviewers: sdesmalen, efriedma

Reviewed By: sdesmalen, efriedma

Subscribers: efriedma, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76748
2020-03-27 21:03:44 +00:00
Alina Sbirlea 3abcbf9903 [CFG/BasicBlock] Rename succ_const to const_succ. [NFC]
Summary:
Rename `succ_const_iterator` to `const_succ_iterator` and
`succ_const_range` to `const_succ_range` for consistency with the
predecessor iterators, and the corresponding iterators in
MachineBasicBlock.

Reviewers: nicholas, dblaikie, nlewycky

Subscribers: hiraditya, bmahjour, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75952
2020-03-25 12:40:55 -07:00
Nikita Popov ec184dd548 [LVI] Convert some checks to assertions; NFC
solveBlockValue() should only be called if the value isn't cached
yet. Similarly, it does not make sense to "solve" a constant.
2020-03-24 23:11:13 +01:00
Sanjay Patel 88b493a838 [ValueTracking] improve undef/poison analysis for constant vectors
Differential Revision: https://reviews.llvm.org/D76702
2020-03-24 13:35:47 -04:00
Yaxun (Sam) Liu 314deab9af Add Triple::isAMDGPU
Differential Revision: https://reviews.llvm.org/D57707
2020-03-22 14:20:28 -04:00
Bjorn Pettersson d077d678d3 [ValueTracking] Avoid blind cast from Operator to Instruction
Summary:
Avoid blind cast from Operator to ExtractElementInst in
computeKnownBitsFromOperator. This resulted in some crashes
in downstream fuzzy testing. Instead we use getOperand directly
on the Operator when accessing the vector/index operands.

Haven't seen any problems with InsertElement and ShuffleVector,
but I believe those could be used in constant expressions as well.
So the same kind of fix as for ExtractElement was also applied for
InsertElement.

When it comes to ShuffleVector we now simply bail out if a dynamic
cast of the Operator to ShuffleVectorInst fails. I've got no
reproducer indicating problems for ShuffleVector, and a fix would be
slightly more complicated as getShuffleDemandedElts is involved.

Reviewers: RKSimon, nikic, spatel, efriedma

Reviewed By: RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76564
2020-03-22 14:45:31 +01:00
Nikita Popov dbf78ae128 [LVI] Use SmallDenseMap for getValueFromCondition(); NFC
For the common case, we will only insert one value into this map.
2020-03-22 10:38:44 +01:00
Nikita Popov 7a62ea3889 [ValueTracking] Short-circuit computeKnownBitsAddSub(); NFCI
If one operand is unknown (and we don't have nowrap), don't compute
the second operand.

Also don't create an unnecessary extra KnownBits variable, it's
okay to reuse KnownOut.

This reduces instructions on libclamav_md5.c by 40%.
2020-03-21 13:42:10 +01:00
Huihui Zhang 4f5af9d70d [ValueTracking] Fix usage of DataLayout::getTypeStoreSize()
Summary:
DataLayout::getTypeStoreSize() returns TypeSize.

For cases where it can not be scalable vector (e.g., GlobalVariable),
explicitly call TypeSize::getFixedSize().

For cases where scalable property doesn't matter, (e.g., check for
zero-sized type), use TypeSize::isNonZero().

Reviewers: sdesmalen, efriedma, apazos, reames

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76454
2020-03-20 16:52:15 -07:00
Huihui Zhang 1993f95f2b [ValueTracking][SVE] Fix getOffsetFromIndex for scalable vector.
Summary:
Return None if GEP index type is scalable vector. Size of scalable vectors
are multiplied by a runtime constant.

Avoid transforming:
  %a = bitcast i8* %p to <vscale x 16 x i8>*
  %tmp0 = getelementptr <vscale x 16 x i8>, <vscale x 16 x i8>* %a, i64 0
  store <vscale x 16 x i8> zeroinitializer, <vscale x 16 x i8>* %tmp0
  %tmp1 = getelementptr <vscale x 16 x i8>, <vscale x 16 x i8>* %a, i64 1
  store <vscale x 16 x i8> zeroinitializer, <vscale x 16 x i8>* %tmp1

into:
  %a = bitcast i8* %p to <vscale x 16 x i8>*
  %tmp0 = getelementptr <vscale x 16 x i8>, <vscale x 16 x i8>* %a, i64 0
  %1 = bitcast <vscale x 16 x i8>* %tmp0 to i8*
  call void @llvm.memset.p0i8.i64(i8* align 16 %1, i8 0, i64 32, i1 false)

Reviewers: sdesmalen, efriedma, apazos, reames

Reviewed By: sdesmalen

Subscribers: tschuett, hiraditya, rkruppe, arphaman, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76464
2020-03-20 14:48:29 -07:00
Nikita Popov 417d69595f [InstSimplify] Reorder checks to be more efficient; NFC
First check whether the RHS is a null pointer, and only then
perform a potentially expensive non-zero query.
2020-03-20 22:05:38 +01:00
Simon Pilgrim 34659de5fd [InstCombine][X86] simplifyX86immShift - convert variable in-range vector shift by scalar amounts to generic shifts (PR40391)
The sll/srl/sra scalar vector shifts can be replaced with generic shifts if the shift amount is known to be in range.

This also required public DemandedElts variants of llvm::computeKnownBits to be exposed (PR36319).
2020-03-20 15:48:06 +00:00
Simon Pilgrim 7f764fa18f [ValueTracking] Add some initial isKnownNonZero DemandedElts support (PR36319) 2020-03-20 13:29:00 +00:00
Simon Pilgrim c1efdbcbe0 [ValueTracking] Add computeKnownBits DemandedElts support to shift instructions (PR36319) 2020-03-20 11:08:08 +00:00
Simon Pilgrim 0b458d4dca [ValueTracking] Add computeKnownBits DemandedElts support to ADD/SUB/MUL instructions (PR36319) 2020-03-19 12:41:29 +00:00
Simon Pilgrim 99336bf95a [ValueTracking] Add computeKnownBits DemandedElts support to masked add instructions (PR36319) 2020-03-18 21:50:56 +00:00
Simon Pilgrim 9d40292a64 [ValueTracking] Add computeKnownBits DemandedElts support to XOR instructions (PR36319) 2020-03-18 20:24:14 +00:00
Eli Friedman ebec984e14 [AliasAnalysis] Misc fixes for checking aliasing with scalable types.
This is fixing up various places that use the implicit
TypeSize->uint64_t conversion.

The new overloads in MemoryLocation.h are already used in various places
that construct a MemoryLocation from a TypeSize, including MemorySSA.
(They were using the implicit conversion before.)

Differential Revision: https://reviews.llvm.org/D76249
2020-03-18 12:28:47 -07:00
Simon Pilgrim 1010c44b4c [ValueTracking] Add computeKnownBits DemandedElts support to EXTRACTELEMENT/OR/BSWAP/BITREVERSE instructions (PR36319)
These are all covered by the bswap/bitreverse vector tests.
2020-03-18 18:49:58 +00:00
Simon Pilgrim 06150e8356 [ValueTracking] Add computeKnownBits DemandedElts support to AND instructions (PR36319) 2020-03-18 15:38:15 +00:00
Roman Lebedev 85334b030a
[NFCI][SCEV] Avoid recursion in SCEVExpander::isHighCostExpansion*()
Summary:
As noted in [[ https://bugs.llvm.org/show_bug.cgi?id=45201 | PR45201 ]],
[[ https://bugs.llvm.org/show_bug.cgi?id=10090 | PR10090 ]] SCEV doesn't
always avoid recursive algorithms, and that causes issues with
large expression depths and/or smaller stack sizes.

In `SCEVExpander::isHighCostExpansion*()` case, the refactoring to avoid
recursion is rather idiomatic. We simply need to place the root expr
into a vector, and iterate over vector elements accounting for the cost
of each one, adding new exprs at the end of the vector,
thus achieving recursion-less traversal.

The order in which we will visit exprs doesn't matter here,
so we will be fine with the most basic approach of using SmallVector
and inserting/extracting from the back, which accidentally is the same
depth-first traversal that we were doing previously recursively.

Reviewers: mkazantsev, reames, wmi, ekatz

Reviewed By: mkazantsev

Subscribers: hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76273
2020-03-18 17:10:54 +03:00
Huihui Zhang 1bf0c99375 [ValueTracking][SVE] Fix isGEPKnownNonNull for scalable vector.
Summary:
DataLayout::getTypeAllocSize() return TypeSize. For cases where the
scalable property doesn't matter, we should explicitly call getKnownMinSize()
to avoid implicit type conversion to uint64_t, which is not valid for scalable
vector type.

Reviewers: sdesmalen, efriedma, apazos, reames

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76260
2020-03-17 11:31:30 -07:00
Evgenii Stepanov 2a3723ef11 [memtag] Plug in stack safety analysis.
Summary:
Run StackSafetyAnalysis at the end of the IR pipeline and annotate
proven safe allocas with !stack-safe metadata. Do not instrument such
allocas in the AArch64StackTagging pass.

Reviewers: pcc, vitalybuka, ostannard

Reviewed By: vitalybuka

Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, gilang, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73513
2020-03-16 16:35:25 -07:00
Nico Weber 623cb95eb3 Revert "[InstSimplify] Simplify calls with "returned" attribute"
This reverts commit 45555c3819.
Causes clang crashes in some causes, see comments on
https://reviews.llvm.org/D75815 for details (including
repro steps).
2020-03-16 15:21:30 -04:00
Huihui Zhang 0616e9964b [InstSimplify][SVE] Fix SimplifyGEPInst for scalable vector.
Summary:
Skip folds that rely on DataLayout::getTypeAllocSize(). For scalable
vector, only minimal type alloc size is known at compile-time.

Reviewers: sdesmalen, efriedma, spatel, apazos

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75892
2020-03-16 11:46:12 -07:00
Matt Arsenault 05e7d8d6ce TTI: Add addrspace parameters to memcpy lowering functions 2020-03-16 14:34:29 -04:00
Florian Hahn 650f363bd7 [ValueLattice] Add singlecrfromundef lattice value.
This patch adds a new singlecrfromundef lattice value, indicating a
single element constant range which was merge with undef at some point.
Merging it with another constant range results in overdefined, as we
won't be able to replace all users with a single value.

This patch uses a ConstantRange instead of a Constant*, because regular
integer constants are represented as single element constant ranges as
well and this allows the existing code working without additional
changes.

Reviewers: efriedma, nikic, reames, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D75845
2020-03-15 11:23:46 +00:00
Florian Hahn 4878aa36d4 [ValueLattice] Add new state for undef constants.
This patch adds a new undef lattice state, which is used to represent
UndefValue constants or instructions producing undef.

The main difference to the unknown state is that merging undef values
with constants (or single element constant ranges) produces  the
constant/constant range, assuming all uses of the merge result will be
replaced by the found constant.

Contrary, merging non-single element ranges with undef needs to go to
overdefined. Using unknown for UndefValues currently causes mis-compiles
in CVP/LVI (PR44949) and will become problematic once we use
ValueLatticeElement for SCCP.

Reviewers: efriedma, reames, davide, nikic

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D75120
2020-03-14 17:19:59 +00:00