Commit Graph

8986 Commits

Author SHA1 Message Date
Kirill Naumov e33c9bb245 Flags for displaying only hot nodes in CFGPrinter graph
Added two flags to omit uncommon or dead paths in the CFG graphs:
  -cfg-hide-unreachable-paths
  -cfg-hide-deoptimize-paths

The main purpose is performance analysis when such block are not
"interesting" from perspective of common path performance.

Reviewed By: apilipenko, davidxl

Differential Revision: https://reviews.llvm.org/D74346
2020-02-21 17:20:00 -08:00
Evgeniy Brevnov b0761bbc76 [DependenceAnalysis] Memory dependence analysis internal caching mechanism is broken in presence of TBAA (PR42733).
Summary:
There is a flaw in memory dependence analysis caching mechanism when memory accesses with TBAA are involved. Assume we first analysed and cached results for access with TBAA. Later we request dependence for the same memory but without TBAA (or different TBAA). By design these two queries should share one entry in the internal cache which corresponds to a general access (without TBAA).  Thus upon second request internal cached is cleared and we continue analysis for access as if there is no TBAA.

The problem is that even though internal cache is cleared the set of visited nodes is not. That means we won't traverse visited nodes again and populate internal cache with the corresponding dependence results. So we end up  with internal cache in an incomplete state. Current implementation tries to signal that situation by resetting CacheInfo->Pair at line 1104. But that doesn't actually help since later code ignores this invalidation and relies on 'Cache->empty()' property to decide on cache completeness.

Reviewers: reames, hfinkel, chandlerc, fedor.sergeev, asbirlea, fhahn, john.brawn, Prazek, sunfish

Reviewed By: john.brawn

Subscribers: DaniilSuchkov, kosarev, jfb, dantrushin, hiraditya, bmahjour, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73032
2020-02-21 20:20:36 +07:00
Hideto Ueno e253cdda35 [MustExecute] Add backward exploration for must-be-executed-context
Summary:
As mentioned in D71974, it is useful for must-be-executed-context to explore CFG backwardly.
This patch is ported from parts of D64975. We use a dominator tree to find the previous context if
a dominator tree is available.

Reviewers: jdoerfert, hfinkel, baziotis, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74817
2020-02-20 14:49:30 +09:00
Bardia Mahjour 0a2626d0cd [DDG] Data Dependence Graph - Graph Simplification
Summary:
This is the last functional patch affecting the representation of DDG.
Here we try to simplify the DDG to reduce the number of nodes and edges by
iteratively merging pairs of nodes that satisfy the following conditions,
until no such pair can be identified. A pair of nodes consisting of a and b
can be merged if:

    1. the only edge from a is a def-use edge to b and
    2. the only edge to b is a def-use edge from a and
    3. there is no cyclic edge from b to a and
    4. all instructions in a and b belong to the same basic block and
    5. both a and b are simple (single or multi instruction) nodes.

These criteria allow us to fold many uninteresting def-use edges that
commonly exist in the graph while avoiding the risk of introducing
dependencies that didn't exist before.

Authored By: bmahjour

Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert

Reviewed By: Meinersbur

Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto, ppc-slack

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72350
2020-02-19 13:41:51 -05:00
Jonas Paulsson 0eddeeab29 [ValueTracking] Improve isKnownNonNaN() to recognize zero splats.
isKnownNonNaN() could not recognize a zero splat because that is a
ConstantAggregateZero which is-a ConstantData but not a ConstantDataVector.

Patch makes a ConstantAggregateZero return true.

Review: Thomas Lively

Differential Revision: https://reviews.llvm.org/D74263
2020-02-19 09:35:36 -08:00
Jay Foad b329d1b06e [AMDGPU][ConstantFolding] Fold llvm.amdgcn.fmul.legacy intrinsic
Reviewers: arsenm, rampitec, nhaehnle

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74835
2020-02-19 16:01:30 +00:00
Brian Gesiak 26f356350b [LazyCallGraph] Fix ambiguous index value
After having committed https://reviews.llvm.org/D72226, 2 buildbots
running GCC 5.4.0 began failing. The cause was the order in which those
compilers evaluated the left- and right-hand sides of the expression
`RC.SCCIndices[C] = RC.SCCIndices.size();`. This commit splits the
expression into multiple statements to avoid ambiguity, and adds a test
case that exercises the code that caused the test failures on those
older compilers (which was originally included in the reviewed patch,
https://reviews.llvm.org/D72226).
2020-02-18 23:32:55 -05:00
Reid Kleckner 0c2b09a9b6 [IR] Lazily number instructions for local dominance queries
Essentially, fold OrderedBasicBlock into BasicBlock, and make it
auto-invalidate the instruction ordering when new instructions are
added. Notably, we don't need to invalidate it when removing
instructions, which is helpful when a pass mostly delete dead
instructions rather than transforming them.

The downside is that Instruction grows from 56 bytes to 64 bytes.  The
resulting LLVM code is substantially simpler and automatically handles
invalidation, which makes me think that this is the right speed and size
tradeoff.

The important change is in SymbolTableTraitsImpl.h, where the numbering
is invalidated. Everything else should be straightforward.

We probably want to implement a fancier re-numbering scheme so that
local updates don't invalidate the ordering, but I plan for that to be
future work, maybe for someone else.

Reviewed By: lattner, vsk, fhahn, dexonsmith

Differential Revision: https://reviews.llvm.org/D51664
2020-02-18 14:44:24 -08:00
Nikita Popov f37e899fd7 [VectorUtils] Accept IRBuilderBase; NFC 2020-02-18 18:02:04 +01:00
Jim Lin 466f8843f5 [NFC] Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h,td}
2020-02-18 10:49:13 +08:00
Brian Gesiak 0deef2e164 Re-land "Add LazyCallGraph API to add function to RefSCC"
This re-commits https://reviews.llvm.org/D70927, which I reverted in
https://reviews.llvm.org/rG28213680b2a7d1fdeea16aa3f3a368879472c72a due
to a buildbot error:
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/13251

I no longer include a test case that appears to crash when built with the
buildbot's compiler, GCC 5.4.0.
2020-02-17 16:59:25 -05:00
Brian Gesiak 28213680b2 Revert "Add LazyCallGraph API to add function to RefSCC"
This reverts commit https://reviews.llvm.org/rG449a13509190b1c57e5fcf5cd7e8f0f647f564b4,
due to buildbot failures such as
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/13251.
2020-02-17 14:25:10 -05:00
Nikita Popov 3eaa53e805 Reapply "[IRBuilder] Virtualize IRBuilder"
Relative to the original commit, this fixes some warnings,
and is based on the deletion of the IRBuilder copy constructor
in D74693. The automatic copy constructor would no longer be
safe.

-----

Related llvm-dev thread:
http://lists.llvm.org/pipermail/llvm-dev/2020-February/138951.html

This patch moves the IRBuilder from templating over the constant
folder and inserter towards making both of these virtual.
There are a couple of motivations for this:

1. It's not possible to share code between use-sites that use
different IRBuilder folders/inserters (short of templating the code
and moving it into headers).
2. Methods currently defined on IRBuilderBase (which is not templated)
do not use the custom inserter, resulting in subtle bugs (e.g.
incorrect InstCombine worklist management). It would be possible to
move those into the templated IRBuilder, but...
3. The vast majority of the IRBuilder implementation has to live
in the header, because it depends on the template arguments.
4. We have many unnecessary dependencies on IRBuilder.h,
because it is not easy to forward-declare. (Significant parts of
the backend depend on it via TargetLowering.h, for example.)

This patch addresses the issue by making the following changes:

* IRBuilderDefaultInserter::InsertHelper becomes virtual.
  IRBuilderBase accepts a reference to it.
* IRBuilderFolder is introduced as a virtual base class. It is
 implemented by ConstantFolder (default), NoFolder and TargetFolder.
  IRBuilderBase has a reference to this as well.
* All the logic is moved from IRBuilder to IRBuilderBase. This means
  that methods can in the future replace their IRBuilder<> & uses
  (or other specific IRBuilder types) with IRBuilderBase & and thus
  be usable with different IRBuilders.
* The IRBuilder class is now a thin wrapper around IRBuilderBase.
  Essentially it only stores the folder and inserter and takes care
  of constructing the base builder.

What this patch doesn't do, but should be simple followups after this change:

* Fixing use of the inserter for creation methods originally defined
  on IRBuilderBase.
* Replacing IRBuilder<> uses in arguments with IRBuilderBase, where useful.
* Moving code from the IRBuilder header to the source file.

From the user perspective, these changes should be mostly transparent:
The only thing that consumers using a custom inserted may need to do is
inherit from IRBuilderDefaultInserter publicly and mark their InsertHelper
as public.

Differential Revision: https://reviews.llvm.org/D73835
2020-02-17 19:04:11 +01:00
Brian Gesiak 449a135091 Add LazyCallGraph API to add function to RefSCC
Summary:
Depends on https://reviews.llvm.org/D70927.

`LazyCallGraph::addNewFunctionIntoSCC` allows users to insert a new
function node into a call graph, into a specific, existing SCC.

Extend this interface such that functions can be added even when they do
not belong in any existing SCC, but instead in a new SCC within an
existing RefSCC.

The ability to insert new functions as part of a RefSCC is necessary for
outlined functions that do not form a strongly connected cycle with the
function they are outlined from. An example of such a function would be the
coroutine funclets 'f.resume', etc., which are outlined from a coroutine 'f'.
Coroutine 'f' only references the funclets' addresses, it does not call
them directly.

Reviewers: jdoerfert, chandlerc, wenlei, hfinkel

Reviewed By: jdoerfert

Subscribers: hfinkel, JonChesterfield, mehdi_amini, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72226
2020-02-17 12:56:38 -05:00
Nikita Popov af480e8c63 Revert "[IRBuilder] Virtualize IRBuilder"
This reverts commit 0765d3824d.
This reverts commit 1b04866a3d.

Relevant looking crashes observed on:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win
2020-02-16 17:01:10 +01:00
Nikita Popov 0765d3824d [IRBuilder] Virtualize IRBuilder
Related llvm-dev thread:
http://lists.llvm.org/pipermail/llvm-dev/2020-February/138951.html

This patch moves the IRBuilder from templating over the constant
folder and inserter towards making both of these virtual.
There are a couple of motivations for this:

1. It's not possible to share code between use-sites that use
different IRBuilder folders/inserters (short of templating the code
and moving it into headers).
2. Methods currently defined on IRBuilderBase (which is not templated)
do not use the custom inserter, resulting in subtle bugs (e.g.
incorrect InstCombine worklist management). It would be possible to
move those into the templated IRBuilder, but...
3. The vast majority of the IRBuilder implementation has to live
in the header, because it depends on the template arguments.
4. We have many unnecessary dependencies on IRBuilder.h,
because it is not easy to forward-declare. (Significant parts of
the backend depend on it via TargetLowering.h, for example.)

This patch addresses the issue by making the following changes:

* IRBuilderDefaultInserter::InsertHelper becomes virtual.
  IRBuilderBase accepts a reference to it.
* IRBuilderFolder is introduced as a virtual base class. It is
 implemented by ConstantFolder (default), NoFolder and TargetFolder.
  IRBuilderBase has a reference to this as well.
* All the logic is moved from IRBuilder to IRBuilderBase. This means
  that methods can in the future replace their IRBuilder<> & uses
  (or other specific IRBuilder types) with IRBuilderBase & and thus
  be usable with different IRBuilders.
* The IRBuilder class is now a thin wrapper around IRBuilderBase.
  Essentially it only stores the folder and inserter and takes care
  of constructing the base builder.

What this patch doesn't do, but should be simple followups after this change:

* Fixing use of the inserter for creation methods originally defined
  on IRBuilderBase.
* Replacing IRBuilder<> uses in arguments with IRBuilderBase, where useful.
* Moving code from the IRBuilder header to the source file.

From the user perspective, these changes should be mostly transparent:
The only thing that consumers using a custom inserted may need to do is
inherit from IRBuilderDefaultInserter publicly and mark their InsertHelper
as public.

Differential Revision: https://reviews.llvm.org/D73835
2020-02-16 13:48:55 +01:00
Evgeniy Brevnov cae643d596 Reverting D73027 [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses(PR42151). 2020-02-14 22:57:23 +07:00
Evgeniy Brevnov 5573abceab [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses(PR42151).
Summary:
This is second attempt to fix the problem with incorrect dependencies reported in presence of invariant load. Initial fix (https://reviews.llvm.org/D64405) was reverted due to a regression reported in https://reviews.llvm.org/D70516.

The original fix changed caching behavior for invariant loads. Namely such loads are not put into the second level cache (NonLocalDepInfo). The problem with that fix is the first level cache  (CachedNonLocalPointerInfo) still works as if invariant loads were in the second level cache. The solution is in addition to not putting dependence results into the second level cache avoid putting info about invariant loads into the first level cache as well.

Reviewers: jdoerfert, reames, hfinkel, efriedma

Reviewed By: jdoerfert

Subscribers: DaniilSuchkov, hiraditya, bmahjour, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73027
2020-02-14 12:18:31 +07:00
Nikita Popov f0b57d8071 [MemorySSA] Don't verify MemorySSA unless VerifyMemorySSA enabled
MemorySSA is often taking up an unreasonable fraction of runtime in
assertion enabled builds. Turns out that there is one code-path that
runs verifyMemorySSA() even if VerifyMemorySSA is not enabled. This
patch makes it conditional as well.

Differential Revision: https://reviews.llvm.org/D74505
2020-02-13 18:46:58 +01:00
Jay Foad 32aac25637 [KnownBits] Introduce anyext instead of passing a flag into zext
Summary:
This was a very odd API, where you had to pass a flag into a zext
function to say whether the extended bits really were zero or not. All
callers passed in a literal true or false.

I think it's much clearer to make the function name reflect the
operation being performed on the value we're tracking (rather than on
the KnownBits Zero and One fields), so zext means the value is being
zero extended and new function anyext means the value is being extended
with unknown bits.

NFC.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74482
2020-02-12 19:06:53 +00:00
Huihui Zhang 5350a48931 [ConstantFold][SVE] Fix constant fold for FoldReinterpretLoadFromConstPtr.
Summary:
Bail out early for scalable vectors. As global variables are not expected
to be scalable.

Use explicit call of getFixedSize() to assert on places where scalable size
doesn't make sense.

Reviewers: sdesmalen, efriedma, apazos, huntergr, willlovett

Reviewed By: sdesmalen

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74424
2020-02-12 10:24:50 -08:00
Alina Sbirlea 4f33a68973 Compute ORE, BPI, BFI in Loop passes.
Summary:
Passes ORE, BPI, BFI are not being preserved by Loop passes, hence it
is incorrect to retrieve these passes as cached.
This patch makes the loop passes in question compute a new instance.

In some of these cases, however, it may be beneficial to change the Loop pass to
a Function pass instead, similar to the change for LoopUnrollAndJam.

Reviewers: chandlerc, dmgreen, jdoerfert, reames

Subscribers: mehdi_amini, hiraditya, zzheng, steven_wu, dexonsmith, Whitney, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72891
2020-02-12 09:15:18 -08:00
Ehud Katz 2470d2988a [ConstantFolding] Fold calls to FP remainder function
With the fixed implementation of the "remainder" operation in
rG9d0956ebd471, we can now add support to folding calls to it.

Differential Revision: https://reviews.llvm.org/D69777
2020-02-12 13:21:18 +02:00
Huihui Zhang 88de9338f2 [ConstantFold][SVE] Fix constand fold for vector call.
Summary:
Do not iterate on scalable vectors.

Reviewers: sdesmalen, efriedma, apazos, huntergr, willlovett

Reviewed By: sdesmalen

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74419
2020-02-11 14:06:15 -08:00
Alina Sbirlea 0cecafd647 [BasicAA] Make BasicAA a cfg pass.
Summary:
Part of the changes in D44564 made BasicAA not CFG only due to it using
PhiAnalysisValues which may have values invalidated.
Subsequent patches (rL340613) appear to have addressed this limitation.

BasicAA should not be invalidated by non-CFG-altering passes.
A concrete example is MemCpyOpt which preserves CFG, but we are testing
it invalidates BasicAA.

llvm-dev RFC: https://groups.google.com/forum/#!topic/llvm-dev/eSPXuWnNfzM

Reviewers: john.brawn, sebpop, hfinkel, brzycki

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74353
2020-02-11 11:30:08 -08:00
Rachel Craik 1f55420065 [LoopCacheAnalysis]: Add support for negative stride
LoopCacheAnalysis currently assumes the loop will be iterated over in
a forward direction. This patch addresses the issue by using the
absolute value of the stride when iterating backwards.

Note: this patch will treat negative and positive array access the
same, resulting in the same cost being calculated for single and
bi-directional access patterns. This should be improved in a
subsequent patch.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D73064
2020-02-10 13:22:35 -05:00
Johannes Doerfert 72277ecd62 Introduce a CallGraph updater helper class
The CallGraphUpdater is a helper that simplifies the process of updating
the call graph, both old and new style, while running an CGSCC pass.

The uses are contained in different commits, e.g. D70767.

More functionality is added as we need it.

Reviewed By: modocache, hfinkel

Differential Revision: https://reviews.llvm.org/D70927
2020-02-08 14:16:48 -06:00
George Burgess IV f8c9ceb1ce [SimplifyLibCalls] Add __strlen_chk.
Bionic has had `__strlen_chk` for a while. Optimizing that into a
constant is quite profitable, when possible.

Differential Revision: https://reviews.llvm.org/D74079
2020-02-08 11:51:00 -08:00
Florian Hahn 14ef87bda6 [ValueTracking] usub(a, b) cannot overflow if a >= b.
If we know that a >= b (unsigned), usub.with.overflow(a, b) cannot
overflow. Similarly, if b > a, the same expression overflows.

Reviewers: nikic, RKSimon, lebedev.ri, spatel

Reviewed By: nikic, Gerolf

Differential Revision: https://reviews.llvm.org/D74066
2020-02-07 10:41:18 +00:00
Florian Hahn 8d5e76ac30 [ValueTracking] Update implied reasoning to accept expanded cmp (NFC).
This patch adds versions of isImpliedCondition and
isImpliedByDomCondition that take a predicate, LHS and RHS operands as
instead of a Value representing the condition.

This allows using those functions to check conditions without having a
concrete ICmp instruction.

Reviewers: nikic, RKSimon, lebedev.ri, spatel

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D74065
2020-02-07 10:27:29 +00:00
Sanjay Patel 686a038ed8 [Analysis] add query to get splat value from array of ints
I was debug stepping through an x86 shuffle lowering and
noticed we were doing an N^2 search for splat index. I
didn't find the equivalent functionality anywhere else in
LLVM, so here's a helper that takes an array of int and
returns a splatted index while ignoring undefs (any
negative value).

This might also be used inside existing
ShuffleVectorInst/ShuffleVectorSDNode functions and/or
help with D72467.

Differential Revision: https://reviews.llvm.org/D74064
2020-02-05 14:55:02 -05:00
Christopher Tetreault b03f3fbd6a Reapply: [SVE] Fix bug in simplification of scalable vector instructions
This reverts commit a05441038a, reapplying
commit 31574d38ac
2020-02-05 10:00:09 -08:00
Teresa Johnson 7f37a8026f [InlineCost] Add flag to allow changing the default inline cost
Summary:
It can be useful to tune the default inline threshold without overriding other inlining thresholds (e.g. in code compiled for size).

The existing `-inline-threshold` flag overrides other thresholds, so it is insufficient in codebases where there is a mix of code compiled for size and speed.

Patch by Michael Holman <michael.holman@microsoft.com>

Reviewers: eraman, tejohnson

Reviewed By: tejohnson

Subscribers: tejohnson, mtrofin, davidxl, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73217
2020-02-04 12:06:20 -08:00
Hiroshi Yamauchi 803dd6fe6b [BFI] Add a debug check for unknown block queries.
Summary:
Add a debug check for frequency queries for unknown blocks (typically blocks
that are created after BFI is computed but their frequencies are not
communicated to BFI.)

This is useful for detecting and debugging missed BFI updates.

This is debug build only and disabled behind a flag.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73920
2020-02-04 10:05:28 -08:00
Juneyoung Lee dd7d610262 [ValueTracking] Let isGuaranteedToBeUndefOrPoison look into operands of icmp 2020-02-04 17:16:32 +09:00
Juneyoung Lee 36272d5f00 Let isGuaranteedNotToBeUndefOrPoison consider PHINode with constant values 2020-02-04 16:46:54 +09:00
Reid Kleckner 105642af5e Add PassManagerImpl.h to hide implementation details
ClangBuildAnalyzer results show that a lot of time is spent
instantiating AnalysisManager::getResultImpl across the code base:

**** Templates that took longest to instantiate:
 50445 ms: llvm::AnalysisManager<llvm::Function>::getResultImpl (412 times, avg 122 ms)
 47797 ms: llvm::AnalysisManager<llvm::Function>::getResult<llvm::TargetLibraryAnalysis> (389 times, avg 122 ms)
 46894 ms: std::tie<const unsigned long long, const bool> (2452 times, avg 19 ms)
 43851 ms: llvm::BumpPtrAllocatorImpl<llvm::MallocAllocator, 4096, 4096>::Allocate (3228 times, avg 13 ms)
 33911 ms: std::tie<const unsigned int, const unsigned int, const unsigned int, const unsigned int> (897 times, avg 37 ms)
 33854 ms: std::tie<const unsigned long long, const unsigned long long> (1897 times, avg 17 ms)
 27886 ms: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (11156 times, avg 2 ms)

I mentioned this result to @chandlerc, and he suggested this direction.

AnalysisManager is already explicitly instantiated, and getResultImpl
doesn't need to be inlined. Move the definition to an Impl header, and
include that header in files that explicitly instantiate
AnalysisManager. There are only four (real) IR units:
- function
- module
- loop
- cgscc

Looking at a specific transform (ArgumentPromotion.cpp), here are three
compilations before & after this change:

BEFORE:
$ for i in $(seq 3) ; do ./ccit.bat ; done
peak memory: 258.15MB
real: 0m6.297s
peak memory: 257.54MB
real: 0m5.906s
peak memory: 257.47MB
real: 0m6.219s

AFTER:
$ for i in $(seq 3) ; do ./ccit.bat ; done
peak memory: 235.35MB
real: 0m5.454s
peak memory: 234.72MB
real: 0m5.235s
peak memory: 234.39MB
real: 0m5.469s

The 20MB of memory saved seems real, and the time improvement seems like
it is there.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D73817
2020-02-03 11:15:55 -08:00
Reid Kleckner a05441038a Revert "[SVE] Fix bug in simplification of scalable vector instructions"
This reverts commit 31574d38ac.

The newly added shufflevector test does not pass locally on either of my
workstations.
2020-02-03 11:12:09 -08:00
Christopher Tetreault 31574d38ac [SVE] Fix bug in simplification of scalable vector instructions
Summary:
* Most of the simplifications in SimplifyShuffleVectorInst depend on the
concrete value of, or the length of the mask vector. For scalable
vectors, this cannot be known at compile time.
** for these tests, detect if the vector is scalable before attempting
the transformation
* The functions ShuffleVectorInst::getMaskValue and
ShuffleVectorInst::getShuffleMask access the value of the constant mask.
However, since the length of the mask is unknown at compile time, these
function do not work for scalable vectors. Add asserts to ensure that
the input mask is not scalable

Reviewers: efriedma, sdesmalen, apazos, chrisj, huihuiz

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73555
2020-02-03 10:15:56 -08:00
Martin Storsjö f867c8e81f [PM][CGSCC] Add parentheses to avoid a GCC warning. NFC.
This avoids a warning about "suggest parentheses around && within ||".
2020-02-03 09:55:02 +02:00
Johannes Doerfert 0137745308 [PM][CGSCC] Add a helper to update the call graph from SCC passes
With this patch new trivial edges can be added to an SCC in a CGSCC
pass via the updateCGAndAnalysisManagerForCGSCCPass method. It shares
almost all the code with the existing
updateCGAndAnalysisManagerForFunctionPass method but it implements the
first step towards the TODOs.

This was initially part of D70927.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D72025
2020-02-02 23:32:18 -06:00
Sanjay Patel 9b9e2da07d [Analysis] add optional index parameter to isSplatValue()
We want to allow splat value transforms to improve PR44588 and related bugs:
https://bugs.llvm.org/show_bug.cgi?id=44588
...but to do that, we need to know if values are splatted from the same,
specific index (lane) rather than splatted from an arbitrary index.

We can improve the undef handling with 1-liner follow-ups because the
Constant API optionally allow undefs now.

Differential Revision: https://reviews.llvm.org/D73549
2020-02-02 10:52:00 -05:00
Simon Pilgrim a3485301d4 Remove unused function. NFCI. 2020-02-01 13:01:58 +00:00
Simon Pilgrim 105e5c940c [ValueTracking] Add DemandedElts support to computeKnownBits/ComputeNumSignBits (PR36319)
This patch adds initial support for a DemandedElts mask to the internal computeKnownBits/ComputeNumSignBits methods, matching the SelectionDAG and GlobalISel equivalents.

So far only a couple of instructions have been setup to handle the DemandedElts, the remainder still using the existing 'all elements' default. The plan is to extend support as we have test coverage.

Differential Revision: https://reviews.llvm.org/D73435
2020-02-01 12:45:46 +00:00
Francesco Petrogalli 623cff81fe [llvm][VectorUtils] Tweak VFShape for scalable vector functions.
Summary:
This patch makes sure that the field VFShape.VF is greater than zero
when demangling the vector function name of scalable vector functions
encoded in the "vector-function-abi-variant" attribute.

This change is required to be able to provide instances of VFShape
that can be used to query the VFDatabase for the vectorization passes,
as such passes always require a positive value for the Vectorization Factor (VF)
needed by the vectorization process.

It is not possible to extract the value of VFShape.VF from the mangled
name of scalable vector functions, because it is encoded as
`x`. Therefore, the VFABI demangling function has been modified to
extract such information from the IR declaration of the vector
function, under the assumption that _all_ vectors in the signature of
the vector function have the same number of lanes. Such assumption is
valid because it is also assumed by the Vector Function ABI
specifications supported by the demangling function (x86, AArch64, and
LLVM internal one).

The unit tests that demangle scalable names have been modified by
adding the IR module that carries the declaration of the vector
function name being demangled.

In particular, the demangling function fails in the following cases:

1. When the declaration of the scalable vector function is not
    present in the module.

2. When the value of VFSHape.VF is not greater than 0.

Reviewers: jdoerfert, sdesmalen, andwar

Reviewed By: jdoerfert

Subscribers: mgorny, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73286
2020-01-30 05:53:56 +00:00
Mircea Trofin 14a16fae43 [llvm][NFC] Rename CallAnalyzer::onCommonInstructionSimplification
Summary:
It is called when instructions aren't simplified, and the implementation
is expected to account for a penalty. Renamed to
onCommonInstructionMissedSimplification.

Reviewers: davidxl, eraman

Reviewed By: davidxl

Subscribers: hiraditya, baloghadamsoftware, haicheng, a.sidorin, Szelethus, donat.nagy, dkrupp, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73662
2020-01-29 21:07:36 -08:00
Hiroshi Yamauchi 24962ced81 [Loads] Handle simple cases with same base pointer with constant offsets in FindAvailableLoadedValue when AA is null.
Summary:
This will help with devirtualization (store forwarding with vtable pointers in
the presence of other stores into members in the constructor.) During inlining,
we don't have AA.

Reviewers: davidxl

Subscribers: mgorny, Prazek, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71307
2020-01-29 13:05:46 -08:00
Matt Arsenault a9af1dc34d Analysis: Add max recursison to isDereferenceableAndAlignedPointer
Fixes stack overflow in test/CodeGen/X86/large-gep-chain.ll when store
lowering starts adding dereferenceable flags.
2020-01-29 06:48:24 -08:00
Eli Friedman 2f6b9edfa8 [AliasAnalysis] Add missing FMRB_* enums.
Previously, the enums didn't account for all the possible cases, which
could cause misleading results (particularly for a "switch" on
FunctionModRefBehavior).

Fixes regression in polly from recent patch to add writeonly to memset.

While I'm here, also fix a few dubious uses of the FMRB_* enum values.

Differential Revision: https://reviews.llvm.org/D73154
2020-01-28 15:47:08 -08:00
Benjamin Kramer adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00