Commit Graph

24241 Commits

Author SHA1 Message Date
Mircea Trofin d6695e1876 [llvm] Add interface to drive inlining decision using ML model
Summary:

This change introduces InliningAdvisor (and related APIs), the interface
that abstracts decision making away from the inlining pass. We will use
this interface to delegate decision making to a trained ML model,
subsequently (see referenced RFC).

RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140763.html

Reviewers: davidxl, eraman, dblaikie

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79042
2020-05-13 13:27:29 -07:00
Alina Sbirlea bd541b217f [NewPassManager] Add assertions when getting stateful cached analysis.
Summary:
Analyses that are stateful should not be retrieved through a proxy from
an outer IR unit, as these analyses are only invalidated at the end of
the inner IR unit manager.
This patch disallows getting the outer manager and provides an API to
get a cached analysis through the proxy. If the analysis is not
stateless, the call to getCachedResult will assert.

Reviewers: chandlerc

Subscribers: mehdi_amini, eraman, hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72893
2020-05-13 12:38:38 -07:00
Alina Sbirlea db04ff4b6b [SimpleLoopUnswitch] Add non-empty unreachable block check to exit cases removed.
Summary:
Update check to include the check for unreachable.

Basic blocks ending in unreachable are special cased, as these blocks may be already unswitched. Before this patch this check was only done for the default destination.
The condition for the exit cases and the default case must be the same, because we should never leave edges from the switch instruction to a basic block that we are unswitching. In PR45355 we still have a remaining edge (that we're attempting to remove from the DT) because it's the default edge to an unreachable-terminated block where we unswitch a case edge to that block.

Resolves PR45355.

Reviewers: chandlerc

Subscribers: hiraditya, uabelho, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78279
2020-05-13 12:38:37 -07:00
Eli Friedman fcfb3170a7 [SROA] Clean up some uses of MaybeAlign in SROA.
Use Align instead of using MaybeAlign; all the operations in question
have known alignment.

For getSliceAlign() in particular, in the cases where we used to return
None, it would be converted back to an Align by IRBuilder, so there's no
functional change there.

Split off from D77454.

Differential Revision: https://reviews.llvm.org/D79205
2020-05-13 11:23:29 -07:00
Huber, Joseph 4d4ea9ac59 OpenMPOpt Remarks Support
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D79359
2020-05-13 12:20:40 -05:00
Reid Kleckner 1370757dd0 Revert "[BrachProbablityInfo] Set edge probabilities at once. NFC."
This reverts commit eef95f2746.

The new assertion about branch probability sums does not hold.
2020-05-13 08:23:09 -07:00
Pierre-vh 2668775f66 [LSR][ARM] Add new TTI hook to mark some LSR chains as profitable
This patch adds a new TTI hook to allow targets to tell LSR that
a chain including some instruction is already profitable and
should not be optimized. This patch also adds an implementation
of this TTI hook for ARM so LSR doesn't optimize chains that include
the VCTP intrinsic.

Differential Revision: https://reviews.llvm.org/D79418
2020-05-13 14:18:28 +01:00
Sjoerd Meijer 9529597cf4 Recommit #2: "[LV] Induction Variable does not remain scalar under tail-folding."
This was reverted because of a miscompilation. On closer inspection, the
problem was actually visible in a changed llvm regression test too. This
one-line follow-up fix/recommit splats the IV if tail-folding is requested,
even if all users are scalar instructions after vectorisation (splatting is
what we generally try to avoid when unnecessary). With tail-folding, the
splat IV is used by the predicate of the masked load/store instructions.
The previous version omitted this, which caused the miscompilation. The
original commit message was:

If tail-folding of the scalar remainder loop is applied, the primary induction
variable is splat to a vector and used by the masked load/store vector
instructions, thus the IV does not remain scalar. Because we now mark
that the IV does not remain scalar for these cases, we don't emit the vector IV
if it is not used. Thus, the vectoriser produces less dead code.

Thanks to Ayal Zaks for the direction on how to fix this.
2020-05-13 13:50:09 +01:00
Ehud Katz 897d8ee5cd [StructurizeCFG] Fix region nodes ordering
This is a reimplementation of the `orderNodes` function, as the old
implementation didn't take into account all cases.

Fix PR41509

Differential Revision: https://reviews.llvm.org/D79037
2020-05-13 15:33:36 +03:00
Yevgeny Rouban eef95f2746 [BrachProbablityInfo] Set edge probabilities at once. NFC.
Hide the method that allows setting probability for particular
edge and introduce a public method that sets probabilities for
all outgoing edges at once.
Setting individual edge probability is error prone. Moreover,
it is difficult to check that the total probability is 1.0
because there is no easy way to know when the user finished
setting all the probabilities.
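
As a rough illustration of why a set-all-at-once interface is easier to validate, here is a minimal sketch in Python rather than LLVM's C++. The class and method names (`EdgeProbabilities`, `set_edge_probabilities`, `get_edge_probability`) are hypothetical and are not the BranchProbabilityInfo API; the point is only that the sum check has a natural home when all outgoing edges arrive in one call.

```python
from fractions import Fraction


class EdgeProbabilities:
    """Hypothetical stand-in for per-block successor probabilities."""

    def __init__(self):
        # Maps a block name to the probabilities of its outgoing edges.
        self._probs = {}

    def set_edge_probabilities(self, block, probs):
        """Set the probabilities of all outgoing edges of `block` at once."""
        probs = [Fraction(p) for p in probs]
        # Every edge is supplied together, so the total can be checked here.
        if sum(probs) != 1:
            raise ValueError(f"probabilities for {block} sum to {sum(probs)}, not 1")
        self._probs[block] = probs

    def get_edge_probability(self, block, successor_index):
        return self._probs[block][successor_index]


info = EdgeProbabilities()
info.set_edge_probabilities("entry", [Fraction(3, 4), Fraction(1, 4)])
```

With a per-edge setter there is no single point at which such a check could run, which is the difficulty the message describes.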

Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79396
2020-05-13 13:55:36 +07:00
KAWASHIMA Takahiro 272bc25bc1 [LoopReroll] Fix rerolling loop with use outside the loop
Fixes PR41696

The loop-reroll pass generates invalid IR (or its assertion
fails in a debug build) if values of the base instruction and
other root instructions (terms used in the loop-reroll pass)
are used outside the loop block. See the IR in PR41696
for examples.

The current implementation of the loop-reroll pass can reroll
only loops that don't have values that are used outside the
loop, except reduced values (the last values of reduction chains).
This is described in the comment of the `LoopReroll::reroll`
function.
https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1600

This is checked in the `LoopReroll::DAGRootTracker::validate`
function.
https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1393

However, the base instruction and other root instructions skip
this check in the validation loop.
https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1229

Moving the check in front of the skip is the logically simplest
fix. However, inserting the check in an earlier stage is better
in terms of compilation time of unrerollable loops. This fix
inserts the check for the base instruction into the function
to validate possible base/root instructions. Check for other
root instructions is unnecessary because they don't match any
base instructions if they have uses outside the loop.

Differential Revision: https://reviews.llvm.org/D79549
2020-05-13 13:03:03 +09:00
Johannes Doerfert af48351cc8 [Attributor][FIX] Stabilize the state of AAReturnedValues each update
For AAReturnedValues we treated new and existing information differently
in the updateImpl. Only the latter was properly analyzed and
categorized. The former was thought to be analyzed in the subsequent
update. Since the Attributor does not support "self-updates" we need to
make sure the state is "stable" after each updateImpl invocation. That
is, if the surrounding information does not change, the state is valid.
Now we make sure all return values have been handled and properly
categorized each iteration. We might not update again if we have not
requested a non-fix attribute so we cannot "wait" for the next update to
analyze a new return value.

Bug reported by @sdmitriev.
2020-05-12 21:00:30 -05:00
Zequan Wu cb22ab7403 Add nomerge function attribute to suppress tail merge optimization in simplifyCFG
We want to add a way to avoid merging identical calls so as to keep the
separate debug-information for those calls. There is also an asan
use case where having this attribute would be beneficial to avoid
alternative work-arounds.

Here is the link to the feature request:
https://bugs.llvm.org/show_bug.cgi?id=42783.

`nomerge` is different from `noinline`. `noinline` prevents a function from
being inlined at call sites, but `nomerge` prevents multiple identical calls
from being merged into one.

This patch adds `nomerge` to disable the optimization at the IR level. A
follow-up patch will be needed to let the backend understand `nomerge` and
avoid tail merging in the backend.

Reviewed By: asbirlea, rnk

Differential Revision: https://reviews.llvm.org/D78659
2020-05-12 16:49:20 -07:00
Sergey Dmitriev 32f5ee830b [Attributor] Fixup block addresses after rewriting function signature
Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: jdoerfert

Subscribers: hiraditya, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79801
2020-05-12 13:53:04 -07:00
Juneyoung Lee e5f602d82c [ValueTracking] Let propagatesPoison support binops/unaryops/cast/etc.
Summary:
This patch makes propagatesPoison be more accurate by returning true on
more bin ops/unary ops/casts/etc.

The changed test in ScalarEvolution/nsw.ll was introduced by
a19edc4d15 .
IIUC, the goal of the tests is to show that iv.inc's SCEV expression still has
no-overflow flags even if the loop isn't in the wanted form.
It becomes more accurate with this patch, so I think this is okay.

Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, sanjoy

Reviewed By: spatel, nikic

Subscribers: regehr, nlopes, efriedma, fhahn, javed.absar, llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78615
2020-05-13 02:51:42 +09:00
Fangrui Song b56b1e67e3 [gcov] Default coverage version to '408*' and delete CC1 option -coverage-exit-block-before-body
gcov 4.8 (r189778) moved the exit block from the last to the second.
The .gcda format is compatible with 4.7 but

* decoding libgcov 4.7 produced .gcda with gcov [4.7,8) can mistake the
  exit block, emit bogus `%s:'%s' has arcs from exit block\n` warnings,
  and print wrong `" returned %s` for branch statistics (-b).
* decoding libgcov 4.8 produced .gcda with gcov 4.7 has similar issues.

Also, rename "return block" to "exit block" because the latter is the
appropriate term.
2020-05-12 09:14:03 -07:00
Eric Christopher a42e53cccf Fix typos encountered while working on pass pipeline for O1.
2020-05-12 00:45:15 -07:00
Johannes Doerfert 8d94d3c3b4 [Attributor][FIX] Disallow function signature rewrite for casted calls
We will now ensure the return type of the called function is the type
of all call sites we are going to rewrite. This avoids a problem
partially fixed by D79680. The part that was not covered is a use of
this "weird" casted call site (see `@func3` in `misc_crash.ll`).

misc_crash.ll checks are auto-generated now.
2020-05-11 15:32:47 -05:00
Johannes Doerfert c115a78f0d [Attributor] Make AAIsDead dependences optional to prevent top state
We should never give up on AAIsDead as it guards other AAs from
unreachable code (in which SSA properties are meaningless). We did
however use required dependences on some queries in AAIsDead which
caused us to invalidate AAIsDead if the queried AA got invalidated.
We now use optional dependences instead. The bug that exposed this is
added to the liveness.ll test and other test changes show the impact.

Bug report by @sdmitriev.
2020-05-11 15:32:47 -05:00
Johannes Doerfert c86fd3333d [Attributor] Force update of "newly live" abstract attributes
During an update of AAIsDead, new instructions become live. If we query
information from them, the result is often just the initial state, e.g.,
for call site `noreturn` and `nounwind`. We will now trigger an update
for cached attributes during the AAIsDead update, though other AAs might
later use the same API.
2020-05-11 15:32:47 -05:00
Sanjay Patel 5f730b645d [VectorCombine] account for extra uses in scalarization cost
Follow-up to D79452.
Mimics the extra use cost formula for the inverse transform with extracts.
2020-05-11 15:20:57 -04:00
Mircea Trofin 48fa355ed4 [llvm][NFC] Move inlining decision-related APIs in InliningAdvisor.
Summary: Factoring out in preparation for https://reviews.llvm.org/D79042

Reviewers: dblaikie, davidxl

Subscribers: mgorny, eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79613
2020-05-11 09:00:59 -07:00
Sergey Dmitriev 3df40007e6 [Attributor] Fix for a crash on RAUW when rewriting function signature
Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: uenoku

Subscribers: hiraditya, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79680
2020-05-11 08:06:19 -07:00
Tyker 78d85c2091 [AssumeBundles] fix crashes
Summary:
This patch fixes crashes/asserts found in the test-suite.
The AssumptionCache cannot be assumed to have all assumes, contrary to what I thought.
Prevent generation of information for terminators, because this can create broken IR in transformations where we insert the new terminator before removing the old one.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79458
2020-05-11 11:52:21 +02:00
OCHyams da100de0a6 [NFC][DwarfDebug] Add test for variables with a single location which
don't span their entire scope.

The previous commit (6d1c40c171) is an older version of the test.

Reviewed By: aprantl, vsk

Differential Revision: https://reviews.llvm.org/D79573
2020-05-11 11:49:11 +02:00
Xun Li 44e5aaf911 Remove an unused Module param
Summary:
In D65848 the function getFuncNameInModule was refactored to no longer use the module.
This diff removes the parameter and renames the function to avoid confusion.

Reviewers: wenlei, wmi, davidxl

Reviewed By: wenlei

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79310
2020-05-10 22:09:55 -07:00
Johannes Doerfert 3a8740bdd5 [Attributor] Merge the query set into AbstractAttribute
The old QuerriedAAs contained two vectors, one for required and one for
optional dependences (=queries). We now use a single vector and encode
the kind directly in the pointer.

This reduces memory consumption and makes the connection between
abstract attributes and their dependences clearer.
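
The idea of encoding the dependence kind directly in the pointer can be sketched with plain integers. This is a Python illustration, not the Attributor's C++ code; it assumes addresses are at least 2-byte aligned so the lowest bit is free, and the names `pack`, `unpack`, `REQUIRED`, and `OPTIONAL` are made up for the example.

```python
REQUIRED = 0  # hypothetical tag values for the two dependence kinds
OPTIONAL = 1


def pack(address, kind):
    """Store `kind` in the low bit of an aligned address."""
    assert address % 2 == 0, "address must be at least 2-byte aligned"
    assert kind in (REQUIRED, OPTIONAL)
    return address | kind


def unpack(tagged):
    """Recover the (address, kind) pair from a tagged value."""
    return tagged & ~1, tagged & 1


# One list holds both kinds of dependences instead of two separate lists.
dependences = [pack(0x1000, REQUIRED), pack(0x2008, OPTIONAL)]
decoded = [unpack(t) for t in dependences]
```

A single tagged list needs one container allocation instead of two, which is in line with the lower allocation counts reported in this message.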

No functional change is intended, changes in the test are due to
different order in the query map. Neither the order before nor now is in
any way special.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 543734 (329735/s)
temporary memory allocations: 105895 (64217/s)
peak heap memory consumption: 19.19MB
peak RSS (including heaptrack overhead): 102.26MB
total memory leaked: 269.10KB
```

After:
```
calls to allocation functions: 513292 (341511/s)
temporary memory allocations: 106028 (70544/s)
peak heap memory consumption: 13.35MB
peak RSS (including heaptrack overhead): 95.64MB
total memory leaked: 269.10KB
```

Difference:
```
calls to allocation functions: -30442 (208506/s)
temporary memory allocations: 133 (-910/s)
peak heap memory consumption: -5.84MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

---

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D78729
2020-05-10 22:27:00 -05:00
Johannes Doerfert 5e06b2514a [Attributor][FIX] Carefully handle/ignore/forget `argmemonly`
When we have an existing `argmemonly` or `inaccessiblemem_or_argmemonly`
we used to "know" that information. However, interprocedural constant
propagation can invalidate these attributes. We now ignore and remove
these attributes for internal functions (which may be affected by IP
constant propagation), if we are deriving new attributes for the
function.
2020-05-10 19:06:11 -05:00
Johannes Doerfert 713ee3aa77 [Attributor] Use "simplify to constant" in genericValueTraversal
As we replace values with constants interprocedurally, we also need to
do this "look-through" step during the generic value traversal or we
would derive properties from replaced values. While this is often not
problematic, it is when we use the "kind" of a value for reasoning,
e.g., accesses to arguments allow `argmemonly`.
2020-05-10 19:06:11 -05:00
Johannes Doerfert 513ac6e9b0 [Attributor] Ignore illegal accesses to `null`
When we categorize a pointer value we bailed at `null` before. If we
know `null` is not a valid memory location we can ignore it as there
won't be an access at all.
2020-05-10 19:06:10 -05:00
Johannes Doerfert 31c03b9223 [Attributor] Use existing helpers to determine IR facts
We now use getPointerDereferenceableBytes to determine `nonnull` and
`dereferenceable` facts from the IR. We also use getPointerAlignment in
AAAlign for the same reason. The latter can interfere with callbacks so
we do restrict it to non-function-pointers for now.
2020-05-10 19:06:10 -05:00
Johannes Doerfert a9ee8b492c [Attributor][NFC] Clang format Attributor*.cpp
2020-05-10 19:06:10 -05:00
Fangrui Song 25544ce2df [gcov] Default coverage version to '407*' and delete CC1 option -coverage-cfg-checksum
Defaulting to -Xclang -coverage-version='407*' makes .gcno/.gcda
compatible with gcov [4.7,8)

In addition, delete clang::CodeGenOptionsBase::CoverageExtraChecksum and GCOVOptions::UseCfgChecksum.
We can infer the information from the version.

With this change, .gcda files produced by `clang --coverage a.o` linked executable can be read by gcov 4.7~7.
We don't need other -Xclang -coverage* options.
There may be a mismatching version warning, though.

(Note, GCC r173147 "split checksum into cfg checksum and line checksum"
 made gcov 4.7 incompatible with previous versions.)
2020-05-10 16:14:07 -07:00
Fangrui Song 13a633b438 [gcov] Delete CC1 option -coverage-no-function-names-in-data
rL144865 incorrectly wrote function names for GCOV_TAG_FUNCTION
(this might be part of the reasons the header says
"We emit files in a corrupt version of GCOV's "gcda" file format").

rL176173 and rL177475 realized the problem and introduced -coverage-no-function-names-in-data
to work around the issue. (However, the description is wrong.
libgcov never writes function names, even before GCC 4.2).

In reality, the linker command line has to look like:

clang --coverage -Xclang -coverage-version='407*' -Xclang -coverage-cfg-checksum -Xclang -coverage-no-function-names-in-data

Failing to pass -coverage-no-function-names-in-data can make gcov 4.7~7
either produce wrong results (for one gcov-4.9 program, I see "No executable lines")
or segfault (gcov-7).
(gcov-8 uses an incompatible format.)

This patch deletes -coverage-no-function-names-in-data and the related
function names support from libclang_rt.profile
2020-05-10 12:37:44 -07:00
Tyker 5957e058e4 [AssumeBundles] Remove non-determinism from assume builder
Summary:
The assume builder was non-deterministic when working on unnamed values.
This patch fixes this.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78616
2020-05-10 21:18:33 +02:00
Tyker 821a0f23d8 [AssumeBundles] Prevent generation of some redundant assumes
Summary: With this patch, salvageKnowledge will not generate an assume if all knowledge is already available in an assume with a valid context. The assume builder can also, in some cases, update an existing assume with better information.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78014
2020-05-10 19:23:59 +02:00
Florian Hahn 8528186b9b [LAA] Move runtime-check generation to Transforms/Utils/LoopUtils (NFC)
Currently LAA's uses of ScalarEvolutionExpander block moving the
expander from Analysis to Transforms. Conceptually the expander does not
fit into Analysis (it is only used for code generation) and
runtime-check generation also seems to be better suited as a
transformation utility.

Reviewers: Ayal, anemet

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D78460
2020-05-10 17:39:26 +01:00
Sanjay Patel 856cc60bc1 [InstCombine] canonicalize bitcast after insertelement into undef
We have a transform in the opposite direction only for the x86 MMX type.
Other types are not handled either way before this patch.

The motivating case from PR45748:
https://bugs.llvm.org/show_bug.cgi?id=45748
...is the last test diff. In that example, we are triggering an existing
bitcast transform, so we reduce the number of casts, and that should give
us the ideal x86 codegen.

Differential Revision: https://reviews.llvm.org/D79171
2020-05-10 11:37:47 -04:00
Simon Pilgrim bab44a698e [InstCombine] matchOrConcat - match BITREVERSE
Fold or(zext(bitreverse(x)),shl(zext(bitreverse(y)),bw/2)) -> bitreverse(or(zext(y),shl(zext(x),bw/2)))

Practically this is the same as the BSWAP pattern so we might as well handle it.
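
The identity behind this fold can be checked exhaustively for a small width. The following is a sketch in Python, not the InstCombine code; `bitreverse` and `concat` are helper names chosen for the example, and the 4-bit half width is an arbitrary assumption. Note that in this sketch the two inputs trade halves under the full-width reversal.

```python
def bitreverse(value, width):
    """Reverse the low `width` bits of `value`."""
    result = 0
    for i in range(width):
        if value & (1 << i):
            result |= 1 << (width - 1 - i)
    return result


def concat(lo, hi, half):
    """or(zext(lo), shl(zext(hi), half)) on Python integers."""
    return lo | (hi << half)


HALF = 4  # half width; the full width bw is 2 * HALF

# Exhaustively compare both sides for every pair of 4-bit inputs.
holds = all(
    concat(bitreverse(x, HALF), bitreverse(y, HALF), HALF)
    == bitreverse(concat(y, x, HALF), 2 * HALF)
    for x in range(1 << HALF)
    for y in range(1 << HALF)
)
```

The same check with a byte-swap helper in place of `bitreverse` would cover the BSWAP pattern the message compares this to.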
2020-05-10 16:00:29 +01:00
Florian Hahn 96c63f544f Recommit "[LAA] Remove one addRuntimeChecks function (NFC)."
The failing assertion has been fixed and the problematic test case has
been added.

This reverts the revert commit fc44617f28.
2020-05-10 15:19:57 +01:00
Florian Hahn fc44617f28 Revert "[LAA] Remove one addRuntimeChecks function (NFC)."
This reverts commit c28114c8ff.

This causes some bots to fail:

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/30596/steps/build%20android%2Faarch64/logs/stdio
2020-05-10 13:28:00 +01:00
Florian Hahn c28114c8ff [LAA] Remove one addRuntimeChecks function (NFC).
In order to reduce the API surface area (preparation for D78460), remove
an addRuntimeChecks() function and do the additional check in the single
caller.

Reviewers: Ayal, anemet

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D79679
2020-05-10 12:48:55 +01:00
Sanjay Patel a62533c29f [InstCombine] fold fpext into exact integer-to-FP cast
We can combine a floating-point extension cast with a conversion
from integer if we know the earlier cast is exact.

This is an optimization suggested in PR36617:
https://bugs.llvm.org/show_bug.cgi?id=36617#c19

However, this patch does not change the example suggested there.
This patch only uses the existing analysis to handle cases where
the integer source value magnitude is narrower than the
intermediate FP mantissa (guarantees that the conversion to FP is
exact). Follow-up patches to the analysis function can enable
more cases.
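
The exactness condition can be illustrated numerically. This is a sketch in Python, not the InstCombine analysis: `to_float32` emulates rounding to IEEE-754 single precision via the `struct` module, and `via_float` and `direct` are hypothetical names modelling the two cast sequences for a 16-bit integer source and a `float` intermediate.

```python
import struct


def to_float32(value):
    """Round a Python number to the nearest IEEE-754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]


def via_float(n):
    """Model fpext(sitofp(n) to float) to double."""
    return to_float32(n)  # widening float -> double is always exact


def direct(n):
    """Model sitofp(n) to double."""
    return float(n)


# Every 16-bit signed integer fits in float's 24-bit significand.
all_i16_exact = all(via_float(n) == direct(n) for n in range(-(1 << 15), 1 << 15))

# 2**24 + 1 needs 25 significant bits, so the float step rounds it.
wide = (1 << 24) + 1
wide_is_exact = via_float(wide) == direct(wide)
```

When the integer is narrow enough, the intermediate cast loses nothing and the two-step sequence can be replaced by the direct conversion; for wider values the results differ, so the fold would not be valid.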

Differential Revision: https://reviews.llvm.org/D79116
2020-05-10 07:04:54 -04:00
Arthur Eubanks 73a9b7dee0 Add missing pass initialization
Summary: This was preventing MemorySanitizerLegacyPass from appearing in --print-after-all.

Reviewers: vitalybuka

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79661
2020-05-09 21:31:52 -07:00
Jinsong Ji a72b9dfd45 [sanitizer] Enable whitelist/blacklist in new PM
https://reviews.llvm.org/D63616 added `-fsanitize-coverage-whitelist`
and `-fsanitize-coverage-blacklist` for clang.

However, it was done only for legacy pass manager.
This patch enables it for the new pass manager as well.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D79653
2020-05-10 02:34:29 +00:00
Matt Arsenault 16295d521e InstCombine: Broaden copy-constant-to-alloca optimization
Consider any constant memory type, not just global constants. AMDGPU
kernel parameters are effectively global constants, but appear as
either reads from an intrinsic derived pointer or function argument.
2020-05-09 16:00:27 -04:00
Evgenii Stepanov 68a9308a0b [hwasan] Allow -hwasan-globals flag to appear more than once.
2020-05-08 16:35:48 -07:00
Layton Kifer 23cbea9a04 [TRE][NFC] Refactor shared state into member variables.
Separate functions that require shared state into a class to avoid
needing to pass them through multiple functions just to be available
where needed.

The main motivation for this is that we would like to remove the
limitation that accumulator values be dynamic constants, which would
require additional shared state between call eliminations in the same
function, compounding this issue.

Differential Revision: https://reviews.llvm.org/D79299
2020-05-08 14:36:02 -07:00
Sanjay Patel 0d2a0b44c8 [VectorCombine] scalarize binop of inserted elements into vector constants
As with the extractelement patterns that are currently in vector-combine,
there are going to be several possible variations on this theme. This
should be the clearest, simplest example.

Scalarization is the right direction for target-independent canonicalization,
and InstCombine has some of those folds already, but it doesn't do this.
I proposed a similar transform in D50992. Here in vector-combine, we can
check the cost model to be sure it's profitable, so there should be less risk.

Differential Revision: https://reviews.llvm.org/D79452
2020-05-08 16:31:12 -04:00
Sanjay Patel 46d6f76be3 [InstCombine] fix typo in comment; NFC
2020-05-08 15:43:14 -04:00
zoecarver f65f566aeb Re-commit: Mark values as trivially dead when their only use is a start or end lifetime intrinsic.
Summary:
If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well.

Currently, this only works for allocas, globals, and arguments.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79355
2020-05-08 12:24:10 -07:00
Sanjay Patel 5cf17034e5 [InstCombine] add helper for known exact cast to FP; NFC
As suggested in D79116 - there's shared logic between the
existing code and potential new folds. This could go in
ValueTracking if it seems generally useful.
2020-05-08 15:22:36 -04:00
Ricky Zhou b38d77f185 [SimplifyCFG] Remap rewritten debug intrinsic operands.
FoldBranchToCommonDest clones instructions to a different basic block,
but handles debug intrinsics in a separate path. Previously, when
cloning debug intrinsics, their operands were not updated to reference
the correct cloned values. As a result, we would emit debug.value
intrinsics with broken operand references which are discarded in later
passes. This leads to incorrect debuginfo that reports incorrect values
for variables.

Fix this by remapping debug intrinsic operands when cloning them.

Fixes https://bugs.llvm.org/show_bug.cgi?id=45667.

Differential Revision: https://reviews.llvm.org/D79602
2020-05-08 11:10:25 -07:00
Sanjay Patel ff9045dc9c [InstCombine] clean up foldItoFPtoI; NFC
Mostly cosmetic improvements to variable names and logic to ease
refactoring suggested in D79116.
2020-05-08 12:13:42 -04:00
Sanjay Patel 09d70e0588 [InstCombine] simplify code for FP to integer casts; NFCI
FoldIToFPtoI() returns immediately if the operand is not
an opposite cast instruction, so the extra checks in the
callers are redundant.
2020-05-08 10:14:03 -04:00
Benjamin Kramer f936457f80 Revert "Recommit "[LV] Induction Variable does not remain scalar under tail-folding.""
This reverts commit ae45b4dbe7. It
causes miscompilations, test case on the mailing list.
2020-05-08 14:49:10 +02:00
Diego Caballero f5224d437e [LoopFusion] Remove unreachable blocks from DT and LI after fusion
This patch removes FC0.ExitBlock and FC1GuardBlock from DT and LI
after fusion of guarded loops. They become unreachable and LI
verification failed when they happened to be inside another loop.

Reviewed By: kbarton

Differential Revision: https://reviews.llvm.org/D78679
2020-05-07 16:44:40 -07:00
Johannes Doerfert edf0391491 [Attributor][FIX] Record dependences for assumed dead abstract attributes
In a recent patch we introduced a problem with abstract attributes that
were assumed dead at some point. Since `Attributor::updateAA` was
introduced in 95e0d28b71, we did not
remember the dependence on the liveness AA when an abstract attribute
was assumed dead and therefore not updated.

Explicit reproducer added in liveness.ll.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 509242 (345483/s)
temporary memory allocations: 98666 (66937/s)
peak heap memory consumption: 18.60MB
peak RSS (including heaptrack overhead): 103.29MB
total memory leaked: 269.10KB
```

After:
```
calls to allocation functions: 529332 (355494/s)
temporary memory allocations: 102107 (68574/s)
peak heap memory consumption: 19.40MB
peak RSS (including heaptrack overhead): 102.79MB
total memory leaked: 269.10KB
```

Difference:
```
calls to allocation functions: 20090 (1339333/s)
temporary memory allocations: 3441 (229400/s)
peak heap memory consumption: 801.45KB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```
2020-05-07 17:00:50 -05:00
Johannes Doerfert 675334daef [Attributor] Mark dependence as optional
2020-05-07 17:00:50 -05:00
Alina Sbirlea 6227f021ad [SimpleLoopUnswitch] Update DefaultExit condition to check unreachable is not empty.
Summary:
Update the check for the default exit block to not only check that the
terminator is not unreachable, but also check that unreachable block has
*only* the unreachable instruction.

Reviewers: chandlerc

Subscribers: hiraditya, uabelho, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78277
2020-05-07 13:48:30 -07:00
Huihui Zhang 1ec0cc0f02 [InstCombine][SVE] Fix visitExtractElementInst for scalable type.
Summary:
This patch fixes the following issues with visitExtractElementInst:

      1. Restrict VectorUtils::findScalarElement to fixed-length vector.
         For scalable type, the number of elements in shuffle mask is
         unknown at compile-time.
      2. Fix out-of-range calculation for fixed-length vector.
      3. Skip scalable type when analysis relies on fixed number of elements.
      4. Add unit tests to check functionality of extractelement for scalable type.

Reviewers: sdesmalen, efriedma, spatel, nikic

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78267
2020-05-07 13:03:52 -07:00
Huihui Zhang 08c9c13749 [InstCombine][SVE] Fix visitInsertElementInst for scalable type.
Summary:
This patch fixes the following issues in visitInsertElementInst:

      1. Bail out for scalable type when analysis requires fixed size number of vector elements.
      2. Use cast<FixedVectorType> to get vector number of elements. This ensures an assertion
          on scalable vector type.
      3. For scalable type, avoid folding a chain of insertelement into splat:
            insertelt(insertelt(insertelt(insertelt X, %k, 0), %k, 1), %k, 2) ...
              ->
            shufflevector(insertelt(X, %k, 0), undef, zero)
          The length of scalable vector is unknown at compile-time, therefore we don't know if
          given insertelement sequence is valid for splat.

Reviewers: sdesmalen, efriedma, spatel, nikic

Reviewed By: sdesmalen, efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78895
2020-05-07 12:44:52 -07:00
Sanjay Patel 02051c7f3a [SLP] add another bailout for load-combine patterns (2nd try)
The original patch (rG86dfbc676ebe) exposed an existing bug:
we could wrongly cast a constant expression to BinaryOperator
because the pattern matching allows that. This adds a check
for that case, and there's a reduced test case to verify no
crashing.

Original commit message:

This builds on the or-reduction bailout that was added with D67841.
We still do not have IR-level load combining, although that could
be a target-specific enhancement for -vector-combiner.

The heuristic is narrowly defined to catch the motivating case from
PR39538:
https://bugs.llvm.org/show_bug.cgi?id=39538
...while preserving existing functionality.

That is, there's an unmodified test of pure load/zext/store that is
not seen in this patch at llvm/test/Transforms/SLPVectorizer/X86/cast.ll.
That's the reason for the logic difference to require the 'or'
instructions. The chances that vectorization would actually help a
memory-bound sequence like that seem small, but it looks nicer with:

  vpmovzxwd     (%rsi), %xmm0
  vmovdqu       %xmm0, (%rdi)

rather than:

  movzwl        (%rsi), %eax
  movl  %eax, (%rdi)
  ...

In the motivating test, we avoid creating a vector mess that is
unrecoverable in the backend, and SDAG forms the expected bswap
instructions after load combining:

  movzbl (%rdi), %eax
  vmovd %eax, %xmm0
  movzbl 1(%rdi), %eax
  vmovd %eax, %xmm1
  movzbl 2(%rdi), %eax
  vpinsrb $4, 4(%rdi), %xmm0, %xmm0
  vpinsrb $8, 8(%rdi), %xmm0, %xmm0
  vpinsrb $12, 12(%rdi), %xmm0, %xmm0
  vmovd %eax, %xmm2
  movzbl 3(%rdi), %eax
  vpinsrb $1, 5(%rdi), %xmm1, %xmm1
  vpinsrb $2, 9(%rdi), %xmm1, %xmm1
  vpinsrb $3, 13(%rdi), %xmm1, %xmm1
  vpslld $24, %xmm0, %xmm0
  vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero
  vpslld $16, %xmm1, %xmm1
  vpor %xmm0, %xmm1, %xmm0
  vpinsrb $1, 6(%rdi), %xmm2, %xmm1
  vmovd %eax, %xmm2
  vpinsrb $2, 10(%rdi), %xmm1, %xmm1
  vpinsrb $3, 14(%rdi), %xmm1, %xmm1
  vpinsrb $1, 7(%rdi), %xmm2, %xmm2
  vpinsrb $2, 11(%rdi), %xmm2, %xmm2
  vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero
  vpinsrb $3, 15(%rdi), %xmm2, %xmm2
  vpslld $8, %xmm1, %xmm1
  vpmovzxbd %xmm2, %xmm2 # xmm2 = xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero,xmm2[2],zero,zero,zero,xmm2[3],zero,zero,zero
  vpor %xmm2, %xmm1, %xmm1
  vpor %xmm1, %xmm0, %xmm0
  vmovdqu %xmm0, (%rsi)

  movl  (%rdi), %eax
  movl  4(%rdi), %ecx
  movl  8(%rdi), %edx
  movbel        %eax, (%rsi)
  movbel        %ecx, 4(%rsi)
  movl  12(%rdi), %ecx
  movbel        %edx, 8(%rsi)
  movbel        %ecx, 12(%rsi)

Differential Revision: https://reviews.llvm.org/D78997
2020-05-07 15:04:37 -04:00
Christopher Tetreault b6c6bab9a5 [SVE] Fix incorrect usage of getNumElements() in InstCombineCalls
Summary:
Remove incorrect usage of getNumElements() from visitCallInst(). The
number of elements was being used to construct a DemandedElts bitfield.
This operation does not make sense for scalable vectors. Cast to
FixedVectorType instead.

Identified by test case Clang :: CodeGen/aarch64-sve-intrinsics/acle_sve_mla.c

Reviewers: rengolin, efriedma, sdesmalen, c-rhodes, david-arm

Reviewed By: david-arm

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79524
2020-05-07 08:46:51 -07:00
Hans Wennborg c54c6ee1a7 Revert "[SLP] add another bailout for load-combine patterns"
It caused asserts building Chromium, see discussion on
https://reviews.llvm.org/D78997

This reverts commit 86dfbc676e.
2020-05-07 16:31:52 +02:00
Sjoerd Meijer 3bbc71d6c9 [LV] Fix typo in variable name. NFC. 2020-05-07 13:53:44 +01:00
Calixte Denizet bec223a9bc [profile] Don't crash when forking in several threads
Summary:
When forking in several threads, the counters were written out using the same global static variables (see GCDAProfiling.c), which leads to crashes.
So when there is a fork, the counters are reset in the child process and they will be dumped at exit using the interprocess file locking.
When there is an exec, the counters are written out, and in case of failure they are reset.

Reviewers: jfb, vsk, marco-c, serge-sans-paille

Reviewed By: marco-c, serge-sans-paille

Subscribers: llvm-commits, serge-sans-paille, dmajor, cfe-commits, hiraditya, dexonsmith, #sanitizers, marco-c, sylvestre.ledru

Tags: #sanitizers, #clang, #llvm

Differential Revision: https://reviews.llvm.org/D78477
2020-05-07 14:13:11 +02:00
Sjoerd Meijer ae45b4dbe7 Recommit "[LV] Induction Variable does not remain scalar under tail-folding."
With 3 llvm regr tests fixed/updated that I had missed.
2020-05-07 11:52:20 +01:00
Yevgeny Rouban b921543c49 SplitIndirectBrCriticalEdges: Fix Branch Probability update
When splitting critical edges for indirect branches, the
SplitIndirectBrCriticalEdges() function may break branch
probabilities if the target basic block happens to have an unset
probability for any of its successors. That is because in
such cases the getEdgeProbability(Target) function returns
probability 1/NumOfSuccessors and it is called after Target
was split (thus Target has a single successor). As a result,
the corresponding successor of the split block gets
probability 100% but 1/NumOfSuccessors is expected (or, better,
it should be left unset).
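
The effect of querying the default probability after the split can be illustrated with a toy model in Python. This is only a sketch, not LLVM code: the `edge_probability` helper, the dictionary layout and the block names are hypothetical, and the real API is index-based.

```python
from fractions import Fraction

def edge_probability(explicit, successors, block, succ):
    """Return the recorded probability, or the uniform default when unset."""
    if (block, succ) in explicit:
        return explicit[(block, succ)]
    return Fraction(1, len(successors[block]))

explicit = {}  # no probabilities were ever set for Target's edges

# Before the split, Target branches to three successors.
before = {"Target": ["S1", "S2", "S3"]}
expected = edge_probability(explicit, before, "Target", "S1")

# After the split, Target only falls through to the new block, which now
# owns the three original edges.
after = {"Target": ["Body"], "Body": ["S1", "S2", "S3"]}
observed = edge_probability(explicit, after, "Target", "Body")

print(expected, observed)  # 1/3 1
```

Copying `observed` onto an edge of the split block is what yields the 100% described above, whereas `expected` is the value a query made before the split would have returned.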

Reviewers: yamauchi
Differential Revision: https://reviews.llvm.org/D78806
2020-05-07 15:31:44 +07:00
Sjoerd Meijer 20d67ffeae Revert "[LV] Induction Variable does not remain scalar under tail-folding."
This reverts commit 617aa64c84.

while I investigate buildbot failures.
2020-05-07 09:29:56 +01:00
Sjoerd Meijer 617aa64c84 [LV] Induction Variable does not remain scalar under tail-folding.
If tail-folding of the scalar remainder loop is applied, the primary induction
variable is splat to a vector and used by the masked load/store vector
instructions, thus the IV does not remain scalar. Because we now mark
that the IV does not remain scalar for these cases, we don't emit the vector IV
if it is not used. Thus, the vectoriser produces less dead code.

Thanks to Ayal Zaks for the direction how to fix this.

Differential Revision: https://reviews.llvm.org/D78911
2020-05-07 09:15:23 +01:00
Whitney Tsang 0a52401ad6 [LoopUnrollAndJam] Changed safety checks to consider more than 2-levels
loop nest.

Summary: As discussed in https://reviews.llvm.org/D73129.

Example
Before unroll and jam:

for
  A
  for
    B
    for
      C
    D
  E
After unroll and jam (currently):

for
  A
  A'
  for
    B
    for
      C
    D
    B'
    for
      C'
    D'
  E
  E'
After unroll and jam (Ideal):

for
  A
  A'
  for
    B
    B'
    for
      C
      C'
    D
    D'
  E
  E'
This is the first patch to change unroll and jam to work in the ideal
way.
This patch changes the safety checks needed to make sure it is safe to
unroll and jam in the ideal way.

Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto
Reviewed By: Meinersbur
Subscribers: fhahn, hiraditya, zzheng, llvm-commits, anhtuyen, prithayan
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D76132
2020-05-06 21:47:44 +00:00
zoecarver 1998e796e9 Revert "Mark values as trivially dead when their only use is a start or end lifetime intrinsic."
This reverts commit 95aa28cc8f.
2020-05-06 11:07:22 -07:00
zoecarver 95aa28cc8f Mark values as trivially dead when their only use is a start or end lifetime intrinsic.
Summary:
If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well.

Currently, this only works for allocas, globals, and arguments.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79355
2020-05-06 10:58:08 -07:00
Sanjay Patel 2058c98715 [InstCombine] limit bitcast+insertelement transform to x86 MMX type
This is unusual for the general case because we are replacing
1 instruction with 2.

Splitting from a potential conflicting transform in D79171
2020-05-06 13:12:36 -04:00
Matt Arsenault 59bc99a08a InstCombine: Fix return after else 2020-05-06 11:53:26 -04:00
Benjamin Kramer d5ea89f891 Quiet some -Wdocumentation warnings. 2020-05-06 11:23:13 +02:00
Vitaly Buka 04bd2c37ca [local-bounds] Ignore volatile operations
Summary:
-fsanitize=local-bounds is very similar to ``object-size`` and
should also ignore volatile pointers.
https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#volatile

Reviewers: chandlerc, rsmith

Reviewed By: rsmith

Subscribers: cfe-commits, hiraditya, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D78607
2020-05-05 23:08:08 -07:00
Johannes Doerfert f014972446 [Attributor][NFC] Cleanup some AAMemoryLocation code
This is the first step to resolve a TODO in AAMemoryLocation and to fix
a bug we have when handling `byval` arguments of `readnone` call sites.

No functional change intended.
2020-05-05 23:15:33 -05:00
Johannes Doerfert 0cc9c02255 [Attributor][NFC] Minor code cleanups to minimize follow up diffs 2020-05-05 23:14:23 -05:00
Johannes Doerfert 094137a6c6 [Attributor][NFC] Avoid dependences on known information 2020-05-05 23:14:23 -05:00
Christopher Tetreault 855e02e799 [SVE] Fix invalid usage of getNumElements() in InstCombineMulDivRem
Summary:
getLogBase2 tries to iterate over the number of vector elements. Since
the number of elements of a scalable vector is unknown at compile time,
we must return null if the input type is scalable.

Identified by test LLVM.Transforms/InstCombine::nsw.ll

Reviewers: efriedma, fpetrogalli, kmclaughlin, spatel

Reviewed By: efriedma, fpetrogalli

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79197
2020-05-05 15:19:01 -07:00
Kazu Hirata e8984fe65b [Inlining] Teach shouldBeDeferred to take the total cost into account
Summary:
This patch teaches shouldBeDeferred to take into account the total
cost of inlining.

Suppose we have a call hierarchy {A1,A2,A3,...}->B->C.  (Each of A1,
A2, A3, ... calls B, which in turn calls C.)

Without this patch, shouldBeDeferred essentially returns true if

  TotalSecondaryCost < IC.getCost()

where TotalSecondaryCost is the total cost of inlining B into As.
This means that if B is a small wrapper function, for example, it would
get inlined into all of As.  In turn, C gets inlined into all of As.
In other words, shouldBeDeferred ignores the cost of inlining C into
each of As.

This patch adds an option, inline-deferral-scale, to replace the
expression above with:

  TotalCost < Allowance

where

- TotalCost is TotalSecondaryCost + IC.getCost() * # of As, and
- Allowance is IC.getCost() * Scale

For now, the new option defaults to -1, disabling the new scheme.
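
To make the two conditions concrete, here is a small Python sketch (not LLVM code) that evaluates both. The function names, the example costs and the scale value of 2 are hypothetical and chosen only for illustration; treating a negative scale as "use the old condition" is an assumption based on the sentence above.

```python
def old_should_defer(total_secondary_cost, ic_cost):
    """Original condition: TotalSecondaryCost < IC.getCost()."""
    return total_secondary_cost < ic_cost


def new_should_defer(total_secondary_cost, ic_cost, num_callers, scale):
    """Condition with inline-deferral-scale: TotalCost < Allowance."""
    if scale < 0:
        # Assumption: a negative scale disables the new scheme.
        return old_should_defer(total_secondary_cost, ic_cost)
    total_cost = total_secondary_cost + ic_cost * num_callers
    allowance = ic_cost * scale
    return total_cost < allowance


# Hypothetical numbers: B is a small wrapper with three callers A1..A3.
secondary, ic_cost, callers = 30, 100, 3
print(old_should_defer(secondary, ic_cost))               # True
print(new_should_defer(secondary, ic_cost, callers, 2))   # False (330 >= 200)
print(new_should_defer(secondary, ic_cost, callers, -1))  # True
```

With these numbers the old condition defers because it looks only at the secondary cost, while the scaled condition also charges the cost of inlining C into each caller and therefore does not.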

Reviewers: davidxl

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79138
2020-05-05 11:02:06 -07:00
Sanjay Patel 86dfbc676e [SLP] add another bailout for load-combine patterns
This builds on the or-reduction bailout that was added with D67841.
We still do not have IR-level load combining, although that could
be a target-specific enhancement for -vector-combiner.

The heuristic is narrowly defined to catch the motivating case from
PR39538:
https://bugs.llvm.org/show_bug.cgi?id=39538
...while preserving existing functionality.

That is, there's an unmodified test of pure load/zext/store that is
not seen in this patch at llvm/test/Transforms/SLPVectorizer/X86/cast.ll.
That's the reason for the logic difference to require the 'or'
instructions. The chances that vectorization would actually help a
memory-bound sequence like that seem small, but it looks nicer with:

  vpmovzxwd	(%rsi), %xmm0
  vmovdqu	%xmm0, (%rdi)

rather than:

  movzwl	(%rsi), %eax
  movl	%eax, (%rdi)
  ...

In the motivating test, we avoid creating a vector mess that is
unrecoverable in the backend, and SDAG forms the expected bswap
instructions after load combining:

  movzbl (%rdi), %eax
  vmovd %eax, %xmm0
  movzbl 1(%rdi), %eax
  vmovd %eax, %xmm1
  movzbl 2(%rdi), %eax
  vpinsrb $4, 4(%rdi), %xmm0, %xmm0
  vpinsrb $8, 8(%rdi), %xmm0, %xmm0
  vpinsrb $12, 12(%rdi), %xmm0, %xmm0
  vmovd %eax, %xmm2
  movzbl 3(%rdi), %eax
  vpinsrb $1, 5(%rdi), %xmm1, %xmm1
  vpinsrb $2, 9(%rdi), %xmm1, %xmm1
  vpinsrb $3, 13(%rdi), %xmm1, %xmm1
  vpslld $24, %xmm0, %xmm0
  vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero
  vpslld $16, %xmm1, %xmm1
  vpor %xmm0, %xmm1, %xmm0
  vpinsrb $1, 6(%rdi), %xmm2, %xmm1
  vmovd %eax, %xmm2
  vpinsrb $2, 10(%rdi), %xmm1, %xmm1
  vpinsrb $3, 14(%rdi), %xmm1, %xmm1
  vpinsrb $1, 7(%rdi), %xmm2, %xmm2
  vpinsrb $2, 11(%rdi), %xmm2, %xmm2
  vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero
  vpinsrb $3, 15(%rdi), %xmm2, %xmm2
  vpslld $8, %xmm1, %xmm1
  vpmovzxbd %xmm2, %xmm2 # xmm2 = xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero,xmm2[2],zero,zero,zero,xmm2[3],zero,zero,zero
  vpor %xmm2, %xmm1, %xmm1
  vpor %xmm1, %xmm0, %xmm0
  vmovdqu %xmm0, (%rsi)

  movl	(%rdi), %eax
  movl	4(%rdi), %ecx
  movl	8(%rdi), %edx
  movbel	%eax, (%rsi)
  movbel	%ecx, 4(%rsi)
  movl	12(%rdi), %ecx
  movbel	%edx, 8(%rsi)
  movbel	%ecx, 12(%rsi)

Differential Revision: https://reviews.llvm.org/D78997
2020-05-05 12:44:38 -04:00
Simon Pilgrim 4e3c005554 [TTI] getScalarizationOverhead - use explicit VectorType operand
getScalarizationOverhead is only ever called with vectors (and we already had a load of cast<VectorType> calls immediately inside the functions).

Followup to D78357

Reviewed By: @samparker

Differential Revision: https://reviews.llvm.org/D79341
2020-05-05 16:59:23 +01:00
Arthur Eubanks d056c0c71f Remove unnecessary check for inalloca in IPConstantPropagation
Summary:
This was added in https://reviews.llvm.org/D2449, but I'm not sure it's
necessary since an inalloca value is never a Constant (should be an
AllocaInst).

Reviewers: hans, rnk

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79350
2020-05-05 08:26:11 -07:00
Jay Foad 22829ab5fa [InstCombine] Allow denormal C in pow(C,y) -> exp2(log2(C)*y)
We check that C is finite and strictly positive, but there's no need to
check that it's normal too. exp2 should be just as accurate on denormals
as pow is.
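
As a rough numerical sanity check (in Python doubles, not in LLVM), the rewritten form can be compared against `pow` for a denormal base. The helper name `pow_via_exp2` and the chosen exponents are assumptions made for this sketch.

```python
import math

def pow_via_exp2(c, y):
    # Rewritten form exp2(log2(C) * y); C must be finite and strictly positive.
    if not (math.isfinite(c) and c > 0.0):
        raise ValueError("C must be finite and strictly positive")
    return 2.0 ** (math.log2(c) * y)

c = 5e-324  # smallest positive denormal double
for y in (0.5, 0.25, 0.01):
    direct = math.pow(c, y)
    rewritten = pow_via_exp2(c, y)
    # Report whether the two forms agree to a tight relative tolerance.
    print(y, math.isclose(direct, rewritten, rel_tol=1e-9))
```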

Differential Revision: https://reviews.llvm.org/D79413
2020-05-05 16:25:48 +01:00
David Green 146d44c251 [LSR] Don't require register reuse under postinc
LSR has some logic that tries to aggressively reuse registers in
formulae. This can lead to sub-optimal decisions in complex loops where
the backend is trying to use shouldFavorPostInc. This disables the
re-use in those situations.

Differential Revision: https://reviews.llvm.org/D79301
2020-05-05 16:04:50 +01:00
Jay Foad fa2783d79a [InstCombine] Remove hasOneUse check for pow(C,x) -> exp2(log2(C)*x)
I don't think there's any good reason not to do this transformation when
the pow has multiple uses.

Differential Revision: https://reviews.llvm.org/D79407
2020-05-05 14:46:08 +01:00
Simon Pilgrim 5c91aa6603 [InstCombine] Fold or(zext(bswap(x)),shl(zext(bswap(y)),bw/2)) -> bswap(or(zext(x),shl(zext(y), bw/2)))
This adds a general combine that can be used to fold:

  or(zext(OP(x)), shl(zext(OP(y)),bw/2))
-->
  OP(or(zext(x), shl(zext(y),bw/2)))

Allowing us to widen 'concat-able' style or+zext patterns - I've just set this up for BSWAP but we could use this for other similar ops (BITREVERSE for instance).

We already do something similar for bitop(bswap(x),bswap(y)) --> bswap(bitop(x,y))
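
The underlying byte-swap identity can be checked on plain integers. The Python sketch below uses hypothetical helper names and 16-bit halves. One detail worth seeing in the numbers: byte-swapping the wide value also exchanges its two halves, so the operand that was in the low half of the narrow form sits in the high half of the wide form.

```python
def bswap(value, bits):
    """Reverse the byte order of a `bits`-wide unsigned integer."""
    return int.from_bytes(value.to_bytes(bits // 8, "little"), "big")

def concat(lo, hi, half_bits):
    """or(zext(lo), shl(zext(hi), half_bits)) on unsigned integers."""
    return lo | (hi << half_bits)

HALF, FULL = 16, 32
x, y = 0x1234, 0xABCD

# Narrow form: swap each half, then concatenate (x low, y high).
narrow = concat(bswap(x, HALF), bswap(y, HALF), HALF)
# Wide form: concatenate first (y low, x high), then one wide swap.
wide = bswap(concat(y, x, HALF), FULL)

print(hex(narrow), hex(wide))  # 0xcdab3412 0xcdab3412
```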

Fixes PR45715

Reviewed By: @lebedev.ri

Differential Revision: https://reviews.llvm.org/D79041
2020-05-05 12:30:10 +01:00
Sam Parker 40574fefe9 [NFC][CostModel] Add TargetCostKind to relevant APIs
Make the kind of cost explicit throughout the cost model which,
apart from making the cost clear, will allow the generic parts to
calculate better costs. It will also allow some backends to
approximate and correlate the different costs if they wish. Another
benefit is that it will also help simplify the cost model around
immediate and intrinsic costs, where we currently have multiple APIs.

RFC thread:
http://lists.llvm.org/pipermail/llvm-dev/2020-April/141263.html

Differential Revision: https://reviews.llvm.org/D79002
2020-05-05 10:35:54 +01:00
Pratyai Mazumder 08032e7192 [SanitizerCoverage] Replace the unconditional store with a load, then a conditional store.
Reviewers: vitalybuka, kcc

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79392
2020-05-05 02:25:05 -07:00
Sergey Dmitriev f637334df9 [CallGraphUpdater] Removed references to callees when deleting function
Summary: Otherwise we can get unaccounted references to call graph nodes.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79382
2020-05-04 18:59:47 -07:00
Zola Bridges 8d8fda49c9 [llvm][dfsan][NFC] Factor out fcn initialization
Summary:
Moving these function initializations into separate functions makes it easier
to read the runOnModule function. There is also precedent in the sanitizer code:
asan has a function ModuleAddressSanitizer::initializeCallbacks(Module &M). I
thought it made sense to break the initializations into two sets. One for the
compiler runtime functions and one for the event callbacks.

Tested with: check-all

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D79307
2020-05-04 10:01:40 -07:00
Simon Pilgrim 940061438e [InstCombine] Fold (mul(abs(x),abs(x))) -> (mul(x,x)) (PR39476)
This patch adds support for discarding integer absolutes (abs + nabs variants) from self-multiplications.

ABS Alive2: http://volta.cs.utah.edu:8080/z/rwcc8W
NABS Alive2: http://volta.cs.utah.edu:8080/z/jZXUwQ

This is an InstCombine version of D79304 - I'm not sure yet if we'll need that after this.
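
Independently of the Alive2 proofs linked above, the fold can be brute-forced for 8-bit integers with wrapping arithmetic. This Python sketch models i8 values as unsigned integers; all helper names are made up for the example.

```python
BITS = 8
MASK = (1 << BITS) - 1

def to_signed(value):
    """Interpret an unsigned BITS-wide value as two's complement."""
    value &= MASK
    return value - (1 << BITS) if value >> (BITS - 1) else value

def wrap_abs(value):
    # Wrapping abs: abs(-128) stays -128 in i8.
    signed = to_signed(value)
    return (-signed if signed < 0 else signed) & MASK

def wrap_nabs(value):
    # Negated abs: always non-positive.
    signed = to_signed(value)
    return (signed if signed < 0 else -signed) & MASK

def wrap_mul(a, b):
    return (a * b) & MASK

abs_ok = all(wrap_mul(wrap_abs(x), wrap_abs(x)) == wrap_mul(x, x)
             for x in range(1 << BITS))
nabs_ok = all(wrap_mul(wrap_nabs(x), wrap_nabs(x)) == wrap_mul(x, x)
              for x in range(1 << BITS))
print(abs_ok, nabs_ok)  # True True
```

The check passes because negating both factors leaves the product unchanged modulo 2^8, including for the wrapping case of -128.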

Reviewed By: @lebedev.ri and @xbolva00

Differential Revision: https://reviews.llvm.org/D79319
2020-05-04 15:21:52 +01:00
Jay Foad e737847b8f [SLC] Allow llvm.pow(x,2.0) -> x*x etc even if no pow() lib func
optimizePow does not create any new calls to pow, so it should work
regardless of whether the pow library function is available. This allows
it to optimize the llvm.pow intrinsic on targets with no math library.

Based on a patch by Tim Renouf.

Differential Revision: https://reviews.llvm.org/D68231
2020-05-04 10:54:07 +01:00
Florian Hahn 935685f420 [SCCP] Re-use pushToWorkList in pushToWorkListMsg (NFC).
There's no need to duplicate the logic to push to the different
work-lists.
2020-05-04 10:19:39 +01:00
Johannes Doerfert 14cb0bdf2b [Attributor][NFC] Replace the nested AAMap with a key pair
No functional change is intended.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 512375 (362871/s)
temporary memory allocations: 98746 (69933/s)
peak heap memory consumption: 22.54MB
peak RSS (including heaptrack overhead): 106.78MB
total memory leaked: 269.10KB
```

After:
```
calls to allocation functions: 509833 (338534/s)
temporary memory allocations: 98902 (65671/s)
peak heap memory consumption: 18.71MB
peak RSS (including heaptrack overhead): 103.00MB
total memory leaked: 269.10KB
```

Difference:
```
calls to allocation functions: -2542 (-27042/s)
temporary memory allocations: 156 (1659/s)
peak heap memory consumption: -3.83MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```
2020-05-03 22:10:47 -05:00
Johannes Doerfert 95e0d28b71 [Attributor] Remember only necessary dependences
Before we eagerly put dependences into the QueryMap as soon as we
encountered them (via `Attributor::getAAFor<>` or
`Attributor::recordDependence`). Now we will wait to see if the
dependence is useful, that is if the target is not already in a fixpoint
state at the end of the update. If so, there is no need to record the
dependence at all.

Due to the abstraction via `Attributor::updateAA` we will now also treat
the very first update (during attribute creation) as we do subsequent
updates.

Finally this resolves the problematic usage of QueriedNonFixAA.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 554675 (389245/s)
temporary memory allocations: 101574 (71280/s)
peak heap memory consumption: 28.46MB
peak RSS (including heaptrack overhead): 116.26MB
total memory leaked: 269.10KB
```

After:
```
calls to allocation functions: 512465 (345559/s)
temporary memory allocations: 98832 (66643/s)
peak heap memory consumption: 22.54MB
peak RSS (including heaptrack overhead): 106.58MB
total memory leaked: 269.10KB
```

Difference:
```
calls to allocation functions: -42210 (-727758/s)
temporary memory allocations: -2742 (-47275/s)
peak heap memory consumption: -5.92MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```
2020-05-03 22:01:51 -05:00
Johannes Doerfert 231026a508 [Attributor] Initialize "value attributes" w/ must-be-executed-context info
Attributes that only depend on the value (=bit pattern) can be
initialized from uses in the must-be-executed-context (MBEC). We did use
`AAComposeTwoGenericDeduction` and `AAFromMustBeExecutedContext` before
to do this for some positions of these attributes but not for all. This
was fairly complicated and also problematic as we did run it in every
`updateImpl` call even though we only use known information. The new
implementation removes `AAComposeTwoGenericDeduction`* and
`AAFromMustBeExecutedContext` in favor of a simple interface
`AddInformation::fromMBEContext(...)` which we call from the
`initialize` methods of the "value attribute" `Impl` classes, e.g.
`AANonNullImpl::initialize`.

There can be two types of test changes:
  1) Artifacts where we miss some information that was known before a
     global fixpoint was reached and therefore available in an update
     but not at the beginning.
  2) Deduction for values we did not derive via the MBEC before or which
     were not found as the `AAFromMustBeExecutedContext::updateImpl` was
     never invoked.

* An improved version of AAComposeTwoGenericDeduction can be found in
  D78718. Once we find a new use case that implementation will be able
  to handle "generic" AAs better.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 468428 (328952/s)
temporary memory allocations: 77480 (54410/s)
peak heap memory consumption: 32.71MB
peak RSS (including heaptrack overhead): 122.46MB
total memory leaked: 269.10KB
```

After:
```
calls to allocation functions: 554720 (351310/s)
temporary memory allocations: 101650 (64376/s)
peak heap memory consumption: 28.46MB
peak RSS (including heaptrack overhead): 116.75MB
total memory leaked: 269.10KB
```

Difference:
```
calls to allocation functions: 86292 (556722/s)
temporary memory allocations: 24170 (155935/s)
peak heap memory consumption: -4.25MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D78719
2020-05-03 21:41:22 -05:00
Johannes Doerfert 87f1e93945 [Attributor][NFC] Use reference instead of pointer 2020-05-03 21:38:06 -05:00
Johannes Doerfert 2f97b8b891 [Attributor][NFC] Proactively ask for `nocapture` on call site arguments
This minimizes test noise later on and is in line with other attributes
we derive proactively.
2020-05-03 21:38:06 -05:00
Sergey Dmitriev 0f70f73308 [Attributor] Bitcast constant to the returned value type if it has different type
Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: jdoerfert

Subscribers: hiraditya, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79277
2020-05-03 11:46:13 -07:00
Hongtao Yu 911e06f5eb [ICP] Handling must tail calls in indirect call promotion
Per the IR convention, a musttail call must precede a ret with an optional bitcast. This was violated by the indirect call promotion optimization, which could result in IR like:

    ; <label>:2192:
      br i1 %2198, label %2199, label %2201, !dbg !226012, !prof !229483

    ; <label>:2199:                                   ; preds = %2192
      musttail call fastcc void @foo(i8* %2195), !dbg !226012
      br label %2202, !dbg !226012

    ; <label>:2201:                                   ; preds = %2192
      musttail call fastcc void %2197(i8* %2195), !dbg !226012
      br label %2202, !dbg !226012

    ; <label>:2202:                                   ; preds = %605, %2201, %2199
      ret void, !dbg !229485

This is being fixed in this change where the return statement goes together with the promoted indirect call. The code generated is like:

    ; <label>:2192:
      br i1 %2198, label %2199, label %2201, !dbg !226012, !prof !229483

    ; <label>:2199:                                   ; preds = %2192
      musttail call fastcc void @foo(i8* %2195), !dbg !226012
      ret void, !dbg !229485

    ; <label>:2201:                                   ; preds = %2192
      musttail call fastcc void %2197(i8* %2195), !dbg !226012
      ret void, !dbg !229485

Differential Revision: https://reviews.llvm.org/D79258
2020-05-03 10:42:22 -07:00
Mircea Trofin bec4ab95a4 [llvm][NFC] Inliner: factor cost and reporting out of inlining process
Summary:
This factors cost and reporting out of the inlining workflow, thus
making it easier to reuse when driving inlining from the upcoming
InliningAdvisor.

Depends on: D79215

Reviewers: davidxl, echristo

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79275
2020-05-03 10:38:28 -07:00
Florian Hahn bbdfcf8f69 [VPlan] Remove unused & undefined print method (NFC). 2020-05-03 18:36:20 +01:00
Johannes Doerfert 8228153f87 [Attributor][NFC] Encode IRPositions in the bits of a single pointer
This reduces memory consumption for IRPositions by eliminating the
vtable pointer and the `KindOrArgNo` integer. Since each abstract
attribute has an associated IRPosition, the 12-16 bytes we save add up
quickly.

No functional change is intended.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 469545 (260135/s)
temporary memory allocations: 77137 (42735/s)
peak heap memory consumption: 30.50MB
peak RSS (including heaptrack overhead): 119.50MB
total memory leaked: 269.07KB
```

After:
```
calls to allocation functions: 468999 (274108/s)
temporary memory allocations: 77002 (45004/s)
peak heap memory consumption: 28.83MB
peak RSS (including heaptrack overhead): 118.05MB
total memory leaked: 269.07KB
```

Difference:
```
calls to allocation functions: -546 (5808/s)
temporary memory allocations: -135 (1436/s)
peak heap memory consumption: -1.67MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

---

CTMark 15 runs

Metric: compile_time

Program                                        lhs    rhs    diff
 test-suite...:: CTMark/sqlite3/sqlite3.test    25.07  24.09 -3.9%
 test-suite...Mark/mafft/pairlocalalign.test    14.58  14.14 -3.0%
 test-suite...-typeset/consumer-typeset.test    21.78  21.58 -0.9%
 test-suite :: CTMark/SPASS/SPASS.test          21.95  22.03  0.4%
 test-suite :: CTMark/lencod/lencod.test        25.43  25.50  0.3%
 test-suite...ark/tramp3d-v4/tramp3d-v4.test    23.88  23.83 -0.2%
 test-suite...TMark/7zip/7zip-benchmark.test    60.24  60.11 -0.2%
 test-suite :: CTMark/kimwitu++/kc.test         15.69  15.69 -0.0%
 test-suite...:: CTMark/ClamAV/clamscan.test    25.43  25.42 -0.0%
 test-suite :: CTMark/Bullet/bullet.test        37.63  37.62 -0.0%
 Geomean difference                                          -0.8%

---

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D78722
2020-05-03 12:15:19 -05:00
Johannes Doerfert 6bf16ee4c5 [Attributor][NFC] Let AbstractAttribute be an IRPosition
Since every AbstractAttribute so far, and for the foreseeable future,
corresponds to a single IRPosition we can simplify the class structure.
We already did this for IRAttribute but there is no reason to stop
there.
2020-05-03 12:13:40 -05:00
Mircea Trofin 667f558c3f [llvm][NFC] Inliner.cpp shouldInline post-commit feedback
Discussion is in https://reviews.llvm.org/D79215
2020-05-03 09:31:31 -07:00
Sanjay Patel 682f0b366b [InstCombine] use select-of-constants with set/clear bit mask patterns
Cond ? (X & ~C) : (X | C) --> (X & ~C) | (Cond ? 0 : C)
Cond ? (X | C) : (X & ~C) --> (X & ~C) | (Cond ? C : 0)

The select-of-constants form results in better codegen.
There's an existing test diff that shows a transform that
results in an extra IR instruction, but that's an existing
problem.

This is motivated by code seen in LLVM itself - see PR37581:
https://bugs.llvm.org/show_bug.cgi?id=37581

define i8 @src(i8 %x, i8 %C, i1 %b)  {
  %notC = xor i8 %C, -1
  %and = and i8 %x, %notC
  %or = or i8 %x, %C
  %cond = select i1 %b, i8 %or, i8 %and
  ret i8 %cond
}

define i8 @tgt(i8 %x, i8 %C, i1 %b)  {
  %notC = xor i8 %C, -1
  %and = and i8 %x, %notC
  %mul = select i1 %b, i8 %C, i8 0
  %or = or i8 %mul, %and
  ret i8 %or
}

http://volta.cs.utah.edu:8080/z/Vt2WVm
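
Both rewrites listed at the top of this message can also be checked exhaustively for 8-bit values. The Python sketch below models i8 as unsigned integers; the function names are hypothetical.

```python
MASK = 0xFF  # model i8 values as unsigned 8-bit integers

def src_clear_if(cond, x, c):
    # Cond ? (X & ~C) : (X | C)
    return (x & ~c & MASK) if cond else (x | c)

def tgt_clear_if(cond, x, c):
    # (X & ~C) | (Cond ? 0 : C)
    return (x & ~c & MASK) | (0 if cond else c)

def src_set_if(cond, x, c):
    # Cond ? (X | C) : (X & ~C)
    return (x | c) if cond else (x & ~c & MASK)

def tgt_set_if(cond, x, c):
    # (X & ~C) | (Cond ? C : 0)
    return (x & ~c & MASK) | (c if cond else 0)

def equivalent():
    """Compare source and target forms for every i8 X, C and both conditions."""
    for cond in (False, True):
        for x in range(256):
            for c in range(256):
                if src_clear_if(cond, x, c) != tgt_clear_if(cond, x, c):
                    return False
                if src_set_if(cond, x, c) != tgt_set_if(cond, x, c):
                    return False
    return True

print(equivalent())  # True
```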

Differential Revision: https://reviews.llvm.org/D78880
2020-05-03 09:44:43 -04:00
Nikita Popov b7e2358220 Remove getNumUses() comparisons (NFC)
getNumUses() scans the full use list. Don't use it if we only want
to check if there's zero or one uses.
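
The cost difference is easy to see in a toy model where the use list is a lazily walked sequence. This Python sketch is not LLVM code; the helper names only loosely mirror `getNumUses()`, `hasOneUse()` and `use_empty()`, and the counter is there purely to show how many elements each approach touches.

```python
from itertools import islice

def counted_uses(n, visited):
    """Yield n dummy uses, bumping visited[0] for each one actually walked."""
    for i in range(n):
        visited[0] += 1
        yield i

def num_uses(uses):
    """Count every use: always walks the whole list."""
    return sum(1 for _ in uses)

def has_one_use(uses):
    """True iff there is exactly one use; looks at no more than two."""
    return len(list(islice(uses, 2))) == 1

def use_empty(uses):
    """True iff there are no uses; looks at no more than one."""
    return not list(islice(uses, 1))

full, short = [0], [0]
print(num_uses(counted_uses(1000, full)) == 1, full[0])  # False 1000
print(has_one_use(counted_uses(1000, short)), short[0])  # False 2
```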
2020-05-02 11:05:19 +02:00
Nikita Popov 60e9ee16b4 [MergeFuncs] Don't merge shufflevectors with different masks
When the shufflevector mask operand was converted into special
instruction data, the FunctionComparator was not updated to
account for this. As such, MergeFuncs will happily merge
shufflevectors with different masks.

This fixes https://bugs.llvm.org/show_bug.cgi?id=45773.

Differential Revision: https://reviews.llvm.org/D79261
2020-05-02 10:21:14 +02:00
Mircea Trofin 3dbc612cf2 [llvm][NFC] Rename variable as per https://reviews.llvm.org/D79215
Operator error - performed the rename and didn't save.
2020-05-01 16:30:41 -07:00
Mircea Trofin e1c4a7cb16 [llvm][NFC] Inliner: simplify inlining decision logic
Summary:
shouldInline makes a decision based on the InlineCost of a call site, as
well as an evaluation on whether the site should be deferred. This means
it's possible for the decision to be not to inline, even for an
InlineCost that would otherwise allow it.

Both uses of shouldInline performed the exact same logic after calling
it. In addition, the decision on whether to inline or not was
communicated through two values of the Option<InlineCost> return value:
None, or an InlineCost evaluating to false.

Simplified by:
- encapsulating the decision in the return object. The bool it evaluates
to communicates unambiguously the decision. The InlineCost is also
available.
- encapsulating the common post-shouldInline code into shouldInline.

Reviewers: davidxl, echristo, eraman

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79215
2020-05-01 16:18:59 -07:00
Christopher Tetreault beeabe382d [SVE] Fix invalid usage of VectorType::getNumElements() in InstCombine
Summary:
Make foldVectorBinop return null if the instruction type is a scalable
vector. It is unclear what, if any, of this function works with scalable
vectors.

Identified by test LLVM.Transforms/InstCombine::nsw.ll

Reviewers: efriedma, david-arm, fpetrogalli, spatel

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79196
2020-05-01 10:56:29 -07:00
Sanjay Patel 7fa150203f [InstCombine] fix miscompile from multi-use cttz/ctlz transform
PR45762:
https://bugs.llvm.org/show_bug.cgi?id=45762
2020-05-01 13:52:24 -04:00
Florian Hahn d911c17596 [SCCP] Get a copy of the state of CopyOf once.
This fixes potential reference invalidations, when no lattice value is
assigned for CopyOf. As the state of CopyOf won't change while in
handleCallResult, we can get a copy once and use that.

Should fix PR45749.
2020-05-01 14:46:35 +01:00
Benjamin Kramer 7a5a1e9460 [IR] AttributeList::getContext has a single user, remove it. 2020-05-01 14:18:29 +02:00
Florian Hahn 19ab53f1e2 [LoopVersioning] Update setAliasChecks to take ArrayRef argument (NFC).
This cleanup was suggested as part of D78458.
2020-04-30 22:17:12 +01:00
Nikita Popov b74c6d2c9d [InlineFunction] Disable emission of alignment assumptions by default
In D74183 clang started emitting alignment for sret parameters
unconditionally. This caused a 1.5% compile-time regression on
tramp3d-v4. The reason is that we now generate many instances of IR like

    %ptrint = ptrtoint %class.GuardLayers* %guards_m to i64
    %maskedptr = and i64 %ptrint, 3
    %maskcond = icmp eq i64 %maskedptr, 0
    tail call void @llvm.assume(i1 %maskcond)

to preserve the alignment information during inlining. Based on IR
analysis, these assumptions also regress optimization. The attached
phase ordering test case illustrates two issues: One is instruction
count based optimization heuristics, which are affected by the four
additional instructions of the assumption. The other is blocking of
SROA due to ptrtoint casts (PR45763).
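
For readers less familiar with the IR above, the condition fed to
`llvm.assume` is an ordinary alignment test: a mask of 3 corresponds to
4-byte alignment. A small C++ sketch of the same computation, where the
helper name `isAlignedTo` is hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Returns true when P is a multiple of Align, where Align is a power of two.
// Mirrors the ptrtoint / and / icmp eq sequence: (ptr & (Align - 1)) == 0.
inline bool isAlignedTo(const void *P, std::uintptr_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "Align must be a power of two");
  return (reinterpret_cast<std::uintptr_t>(P) & (Align - 1)) == 0;
}
```

The pointer-to-integer cast in this helper corresponds to the `ptrtoint`
that the message identifies as blocking SROA.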

We already encountered the same problem in Rust, where we (unlike
Clang) generally prefer to emit alignment information absolutely
everywhere it is available. We were only able to do this after
hardcoding -preserve-alignment-assumptions-during-inlining=false,
because we were seeing significant optimization and compile-time
regressions otherwise.

This patch disables -preserve-alignment-assumptions-during-inlining
by default, because we should not be punishing people for adding
more alignment annotations.

Once the assume bundle work shakes out and we can represent (and use)
alignment assumptions using assume bundles, it should be possible to
re-enable this with reduced overhead.

Differential Revision: https://reviews.llvm.org/D76886
2020-04-30 23:12:54 +02:00
Arthur Eubanks a90948fd6e [NFC] Rename *ByValOrInalloca* to *PassPointeeByValue*
Summary: In preparation for preallocated.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79152
2020-04-30 09:42:13 -07:00
Jann Horn a22685885d [AddressSanitizer] Instrument byval call arguments
Summary:
In the LLVM IR, "call" instructions read memory for each byval operand.
For example:

```
$ cat blah.c
struct foo { void *a, *b, *c; };
struct bar { struct foo foo; };
void func1(const struct foo);
void func2(struct bar *bar) { func1(bar->foo); }
$ [...]/bin/clang -S -flto -c blah.c -O2 ; cat blah.s
[...]
define dso_local void @func2(%struct.bar* %bar) local_unnamed_addr #0 {
entry:
  %foo = getelementptr inbounds %struct.bar, %struct.bar* %bar, i64 0, i32 0
  tail call void @func1(%struct.foo* byval(%struct.foo) align 8 %foo) #2
  ret void
}
[...]
$ [...]/bin/clang -S -c blah.c -O2 ; cat blah.s
[...]
func2:                                  # @func2
[...]
        subq    $24, %rsp
[...]
        movq    16(%rdi), %rax
        movq    %rax, 16(%rsp)
        movups  (%rdi), %xmm0
        movups  %xmm0, (%rsp)
        callq   func1
        addq    $24, %rsp
[...]
        retq
```

Let ASAN instrument these hidden memory accesses.

This is patch 4/4 of a patch series:
https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling
https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling
https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction
https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments

Reviewers: kcc, glider

Reviewed By: glider

Subscribers: hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77619
2020-04-30 17:09:13 +02:00
Jann Horn cfe36e4c6a [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction
Summary:
Refactor getInterestingMemoryOperands() so that information about the
pointer operand is returned through an array of structures instead of
passing each piece of information separately by-value.

This is in preparation for returning information about multiple pointer
operands from a single instruction.

A side effect is that, instead of repeatedly generating the same
information through isInterestingMemoryAccess(), it is now simply collected
once and then passed around; that's probably more efficient.

HWAddressSanitizer has a bunch of copypasted code from AddressSanitizer,
so these changes have to be duplicated.

This is patch 3/4 of a patch series:
https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling
https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling
https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction
https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments

[glider: renamed llvm::InterestingMemoryOperand::Type to OpType to fix
GCC compilation]

Reviewers: kcc, glider

Reviewed By: glider

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77618
2020-04-30 17:09:13 +02:00
Jann Horn 223a95fdf0 [AddressSanitizer] Split out memory intrinsic handling
Summary:
In both AddressSanitizer and HWAddressSanitizer, we first collect
instructions whose operands should be instrumented and memory intrinsics,
then instrument them. Both during collection and when inserting
instrumentation, they are handled separately.

Collect them separately and instrument them separately. This is a bit
more straightforward, and prepares for collecting operands instead of
instructions in a future patch.

This is patch 2/4 of a patch series:
https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling
https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling
https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction
https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments

Reviewers: kcc, glider

Reviewed By: glider

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77617
2020-04-30 17:09:13 +02:00
Jann Horn e29996c9a2 [AddressSanitizer] Refactor ClDebug{Min,Max} handling
Summary:
A following commit will split the loop over ToInstrument into two.
To avoid having to duplicate the condition for suppressing instrumentation
sites based on ClDebug{Min,Max}, refactor it out into a new function.

While we're at it, we can also avoid the indirection through
NumInstrumented for setting FunctionModified.

This is patch 1/4 of a patch series:
https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling
https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling
https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction
https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments

Reviewers: kcc, glider

Reviewed By: glider

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77616
2020-04-30 17:09:13 +02:00
Alexander Potapenko 7e7754df32 Revert an accidental commit of four AddressSanitizer refactor CLs
I couldn't make arc land the changes properly, for some reason they all got
squashed. Reverting them now to land cleanly.

Summary: This reverts commit cfb5f89b62.

Reviewers: kcc, thejh

Subscribers:
2020-04-30 16:15:43 +02:00
Jann Horn cfb5f89b62 [AddressSanitizer] Refactor ClDebug{Min,Max} handling
Summary:
A following commit will split the loop over ToInstrument into two.
To avoid having to duplicate the condition for suppressing instrumentation
sites based on ClDebug{Min,Max}, refactor it out into a new function.

While we're at it, we can also avoid the indirection through
NumInstrumented for setting FunctionModified.

This is patch 1/4 of a patch series:
https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling
https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling
https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction
https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments

Reviewers: kcc, glider

Reviewed By: glider

Subscribers: jfb, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77616
2020-04-30 15:30:46 +02:00
David Spickett 3929429347 [globalopt] Don't emit DWARF fragments for members
of a struct that cover the whole struct

This can happen when the rest of the
members are zero length. Following
the same pattern applied to the SROA
pass in:
d7f6f1636d

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45335

Differential Revision: https://reviews.llvm.org/D78720
2020-04-30 11:36:55 +01:00
Evgeniy Brevnov 3acf62f3ad [BPI][NFC] IRCE should request BPI through analysis manager.
Summary: There is no need to create BPI explicitly. It should be requested through AM in a normal way.

Reviewers: skatkov

Reviewed By: skatkov

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79080
2020-04-30 16:04:06 +07:00
Evgeniy Brevnov 3e68a66704 [BPI][NFC] Reuse post dominator tree from analysis manager when available
Summary: Currently BPI unconditionally creates a post dominator tree each time. While this is not incorrect, we can save compile time by reusing an existing post dominator tree (when it's valid) provided by the analysis manager.

Reviewers: skatkov, taewookoh, yrouban

Reviewed By: skatkov

Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78987
2020-04-30 11:31:03 +07:00
Mircea Trofin 3ab319b295 [llvm][NFC] Use CallBase explicitly instead of Instruction in FunctionComparator
Reviewers: dblaikie, craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79098
2020-04-29 15:37:46 -07:00
Mircea Trofin 2c7ff270d2 [llvm][NFC] Inliner: rename call site variables.
Summary:
Renamed 'CS' to 'CB', and, in one case, to a more specific name to avoid
naming collision with outer scope (a maintainability/readability reason,
not correctness)

Also updated comments.

Reviewers: davidxl, dblaikie, jdoerfert

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79101
2020-04-29 15:36:29 -07:00
Anh Tuyen Tran c7878ad231 [VFDatabase] Scalar functions are vector functions with VF =1
Summary:
Return the scalar function when VF==1. Add the trivial mapping scalar --> scalar when VF==1 to prevent false positives for the "isVectorizable" query.

Author: masoud.ataei (Masoud Ataei)

Reviewers: Whitney (Whitney Tsang), fhahn (Florian Hahn), pjeeva01 (Jeeva P.), fpetrogalli (Francesco Petrogalli), rengolin (Renato Golin)

Reviewed By: fpetrogalli (Francesco Petrogalli)

Subscribers: hiraditya (Aditya Kumar), llvm-commits, LLVM

Tag: LLVM

Differential Revision: https://reviews.llvm.org/D78054
2020-04-29 17:20:37 +00:00
Mircea Trofin 4632b7292a [llvm][NFC] Removed addressed fixme; formatting.
Removed already-addressed fixme, and updated formatting of a few lines
that were triggering Harbormaster.
2020-04-29 09:06:01 -07:00
Hiroshi Yamauchi 1831986826 [PGO][PGSO] Prep for enabling non-cold code size opts under non-partial-profile sample PGO.
Summary:
- Distinguish between partial-profile and non-partial-profile sample PGO.
- Add a flag for partial-profile sample PGO.
- Tune the sample PGO cutoff.
- No default behavior change (yet).

Reviewers: davidxl

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78949
2020-04-29 08:57:47 -07:00
Mircea Trofin e61247c0a8 [llvm][NFC] Change parameter type to more specific CallBase in IndirectCallPromotion
Reviewers: dblaikie, craig.topper, wmi

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79047
2020-04-29 08:42:32 -07:00
Simon Pilgrim 090cae8491 [TTI] Add DemandedElts to getScalarizationOverhead
The improvements to the x86 vector insert/extract element costs in D74976 resulted in the estimated costs for vector initialization and scalarization increasing higher than should be expected. This is particularly noticeable on pre-SSE4 targets where the availability of legal INSERT_VECTOR_ELT ops is more limited.

This patch does 2 things:
1 - it implements X86TTIImpl::getScalarizationOverhead to more accurately represent the typical costs of a ISD::BUILD_VECTOR pattern.
2 - it adds a DemandedElts mask to getScalarizationOverhead to permit the SLP's BoUpSLP::getGatherCost to be rewritten to use it directly instead of accumulating raw vector insertion costs.

This fixes PR45418 where a v4i8 (zext'd to v4i32) was no longer vectorizing.

A future patch should extend X86TTIImpl::getScalarizationOverhead to tweak the EXTRACT_VECTOR_ELT scalarization costs as well.

Reviewed By: @craig.topper

Differential Revision: https://reviews.llvm.org/D78216
2020-04-29 12:00:38 +01:00
Florian Hahn e89379856a Recommit "[VPlan] Add & use VPValue operands for VPWidenRecipe (NFC)."
The crash that caused the original revert has been fixed in
a3c964a278. I also added a reduced version of the crash reproducer.

This reverts the revert commit 2107af9ccf.
2020-04-29 11:40:39 +01:00
Florian Hahn 616657b39c [LAA] Move CheckingPtrGroup/PointerCheck outside class (NFC).
This allows forward declarations of PointerCheck, which in turn reduce
the number of times LoopAccessAnalysis needs to be included.

Ultimately this helps with moving runtime check generation to
Transforms/Utils/LoopUtils.h, without having to include it there.

Reviewers: anemet, Ayal

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D78458
2020-04-28 21:47:31 +01:00
Mircea Trofin 8a7cf11f92 [llvm][NFC] Refactor APIs operating on CallBase
Summary:
Refactored the parameter and return type where they are too generally
typed as Instruction.

Reviewers: dblaikie, wmi, craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79027
2020-04-28 13:23:47 -07:00
David Blaikie 95e570725a OpenMPOpt::RuntimeFunctionInfo::UsesMap: Use unique_ptr for values to simplify memory management 2020-04-28 12:26:53 -07:00
David Blaikie 3c89256d71 Attributor::ArgumentReplacementMap: Use unique_ptr to simplify memory management 2020-04-28 12:26:52 -07:00
Roman Lebedev a0004358a8 [InstCombine] Negator: 'or' with no common bits set is just 'add'
In `InstCombiner::visitAdd()`, we have
```
  // A+B --> A|B iff A and B have no bits set in common.
  if (haveNoCommonBitsSet(LHS, RHS, DL, &AC, &I, &DT))
    return BinaryOperator::CreateOr(LHS, RHS);
```
so we should handle such `or`'s here, too.
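
The identity behind this can be checked exhaustively for 8-bit values. The
sketch below is standalone C++ rather than LLVM code; `haveNoCommonBits` is a
hypothetical helper that only mimics the spirit of `haveNoCommonBitsSet` on
plain integers.

```cpp
#include <cassert>

// True when the two values share no set bits.
inline bool haveNoCommonBits(unsigned A, unsigned B) { return (A & B) == 0; }

// Checks every pair of 8-bit values: whenever A and B have disjoint bits,
// A + B (truncated to 8 bits) equals A | B, because no carries can occur.
inline bool addEqualsOrWhenDisjoint() {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned B = 0; B < 256; ++B)
      if (haveNoCommonBits(A, B) && ((A + B) & 0xFFu) != (A | B))
        return false;
  return true;
}
```

A caller can confirm the property with `assert(addEqualsOrWhenDisjoint());`.
Since such an `or` computes the same value as an `add`, it can be negated the
same way an `add` is.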
2020-04-28 19:16:32 +03:00
Sam Parker e9c9329aa4 [TTI] Add TargetCostKind argument to getUserCost
There are several different types of cost that TTI tries to provide
explicit information for: throughput, latency, code size along with
a vague 'intersection of code-size cost and execution cost'.

The vectorizer is a keen user of RecipThroughput and there's at least
'getInstructionThroughput' and 'getArithmeticInstrCost' designed to
help with this cost. The latency cost has a single use and a single
implementation. The intersection cost appears to cover most of the
rest of the API.

getUserCost is explicitly called from within TTI when the user has
been explicit in wanting the code size (also only one use) as well
as a few passes which are concerned with a mixture of size and/or
a relative cost. In many cases these costs are closely related, such
as when multiple instructions are required, but one evident diverging
cost in this function is for div/rem.

This patch adds an argument so that the cost required is explicit,
so that we can make the important distinction when necessary.

Differential Revision: https://reviews.llvm.org/D78635
2020-04-28 08:57:45 +01:00
Craig Topper a58b62b4a2 [IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand().
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().

I also made a few cleanups in here. For example, this removes the use
of getElementType on a pointer when we could just use getFunctionType
from the call.

Differential Revision: https://reviews.llvm.org/D78882
2020-04-27 22:17:03 -07:00
Mircea Trofin cb56e9b923 [llvm][NFC] Use CallBase instead of Instruction in ProfileSummaryInfo
Summary:
getProfileCount requires the parameter be a valid CallBase, and its uses
reflect that.

Reviewers: dblaikie, craig.topper, wmi

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78940
2020-04-27 20:47:52 -07:00
Arthur Eubanks 3b0450acec Add IR constructs for preallocated (inalloca replacement)
Add llvm.call.preallocated.{setup,arg} intrinsics.
Add "preallocated" operand bundle which takes a token produced by llvm.call.preallocated.setup.
Add "preallocated" parameter attribute, which is like byval but without the copy.

Verifier changes for these IR constructs.

See https://github.com/rnk/llvm-project/blob/call-setup-docs/llvm/docs/CallSetup.md

Subscribers: hiraditya, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74651
2020-04-27 16:15:50 -07:00
Sanjay Patel 21acc0612a [SLP] refactor load-combine logic; NFC
We may want to identify sequences that are not
reductions, but still qualify as load-combines
in the back-end, so make most of the body a
helper function.
2020-04-27 16:02:37 -04:00
Sameer Sahasrabuddhe 8488763682 [NFC] UnifyLoopExits: correctly skip expensive checks 2020-04-27 15:10:35 +05:30
Ayal Zaks a3c964a278 [LV] Fix recording of BranchTakenCount for FoldTail
When folding tail, branch taken count is computed during initial VPlan execution
and recorded to be used by the compare computing the loop's mask. This recording
should directly set the State, instead of reusing Value2VPValue mapping which
serves original Values present prior to vectorization.
The branch taken count may be a constant Value, which may be used elsewhere in
the loop; trying to employ Value2VPValue for both leads to the issue reported in
https://reviews.llvm.org/D76992#inline-721028

Differential Revision: https://reviews.llvm.org/D78847
2020-04-26 20:13:10 +03:00
Florian Hahn 2f3e86b318 [DSE,MSSA] Continue checking more remaining candidates with dbgcnt.
After changing the candidate iteration strategy, we should continue with
the next candidate, rather than breaking out of the loop.
2020-04-26 16:59:32 +01:00
Florian Hahn 7d57d22baa [SCCP] Support ranges for loads and stores.
Integer ranges can be used for loaded/stored values. Note that widening
can be disabled for loads/stores, as we only rely on instructions that
cause continued increases to ranges to be widened (like binary
operators).

Reviewers: efriedma, mssimpso, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D78433
2020-04-26 13:16:47 +01:00
Simon Pilgrim a3982491db [Pass] Ensure we don't include PassSupport.h or PassAnalysisSupport.h directly
Both PassSupport.h and PassAnalysisSupport.h are only supposed to be included via Pass.h.

Differential Revision: https://reviews.llvm.org/D78815
2020-04-26 12:58:20 +01:00
Nikita Popov 164845cd92 [GVN] Reduce expression size (NFC)
Reduce size of GVN::Expression by reordering fields to reduce padding.
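
As a generic illustration of the technique (these are hypothetical layouts,
not the actual fields of GVN::Expression), ordering members from largest to
smallest alignment usually removes interior padding:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout with a small member ahead of a pointer.
struct Padded {
  bool Flag;            // 1 byte, followed by padding up to pointer alignment
  void *Ptr;
  std::uint32_t Opcode; // 4 bytes, followed by tail padding
};

// Same members, ordered by decreasing alignment requirement.
struct Reordered {
  void *Ptr;
  std::uint32_t Opcode;
  bool Flag;            // shares the pointer-sized slot with Opcode
};
```

On a typical 64-bit ABI `sizeof(Padded)` is 24 and `sizeof(Reordered)` is 16;
the exact numbers are ABI-dependent, so `assert(sizeof(Reordered) <=
sizeof(Padded));` is the portable way to state the expectation.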
2020-04-26 09:43:35 +02:00
Sergei Trofimovich 09684b08d3 llvm: IPO: handle IRMover error handling, bug #45636
Summary:
Missing error handling is noticed in
https://bugs.llvm.org/show_bug.cgi?id=45636
where inconsistent profiling input caused
llvm/lld to crash as:

```
Program aborted due to an unhandled Error:
linking module flags 'ProfileSummary':
  IDs have conflicting values in 'Mutex_posix.o' and 'nsBrowserApp.o'
```

The change does not change the fact that LLVM crashes
but changes error output to say what was incorrect:

```
LLVM ERROR: Function Import: link error:
  linking module flags 'ProfileSummary':
    IDs have conflicting values in 'Mutex_posix.o' and 'nsBrowserApp.o'
```

Actual crash has yet to be fixed.

Reviewers: lattner

Reviewed By: lattner

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78676
2020-04-25 19:16:01 +01:00
Sergey Dmitriev 67aed1469b [Attributor] Do not set 'returned' attribute for arguments that cannot be bitcasted to function result
Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: jdoerfert

Subscribers: hiraditya, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78828
2020-04-25 09:49:40 -07:00
Sanjay Patel 4abab5c5ca [InstCombine] generalize canonicalization of masked equality comparisons
(X | MaskC) == C --> (X & ~MaskC) == C ^ MaskC
(X | MaskC) != C --> (X & ~MaskC) != C ^ MaskC

We have more analysis for 'and' patterns and already lean this way
in the existing code, so this should be neutral or better in IR.

If this does not do as well in codegen, the problem already exists
and we should fix that based on target costs/heuristics.

http://volta.cs.utah.edu:8080/z/oP3ecL

define void @src(i8 %x, i8 %OrC, i8 %C, i1* %p0, i1* %p1) {
  %or = or i8 %x, %OrC
  %eq = icmp eq i8 %or, %C
  store i1 %eq, i1* %p0

  %ne = icmp ne i8 %or, %C
  store i1 %ne, i1* %p1
  ret void
}

define void @tgt(i8 %x, i8 %OrC, i8 %C, i1* %p0, i1* %p1) {
  %NotOrC = xor i8 %OrC, -1
  %a = and i8 %x, %NotOrC
  %NewC = xor i8 %C, %OrC
  %eq = icmp eq i8 %a, %NewC
  store i1 %eq, i1* %p0

  %ne = icmp ne i8 %a, %NewC
  store i1 %ne, i1* %p1
  ret void
}
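
The equivalence of @src and @tgt can also be checked by brute force over all
i8 inputs. This is a standalone C++ sketch with hypothetical helper names; it
is independent of the linked proof and only mirrors the two functions above.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors @src: (X | MaskC) == C
inline bool srcEq(std::uint8_t X, std::uint8_t MaskC, std::uint8_t C) {
  return static_cast<std::uint8_t>(X | MaskC) == C;
}

// Mirrors @tgt: (X & ~MaskC) == (C ^ MaskC)
inline bool tgtEq(std::uint8_t X, std::uint8_t MaskC, std::uint8_t C) {
  return static_cast<std::uint8_t>(X & static_cast<std::uint8_t>(~MaskC)) ==
         static_cast<std::uint8_t>(C ^ MaskC);
}

// Tries all 2^24 combinations of 8-bit X, MaskC and C.
inline bool maskedCompareFoldHolds() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned M = 0; M < 256; ++M)
      for (unsigned C = 0; C < 256; ++C)
        if (srcEq(X, M, C) != tgtEq(X, M, C))
          return false;
  return true;
}
```

`assert(maskedCompareFoldHolds());` covers the `eq` form; the `ne` form
follows because it is the negation of both sides.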
2020-04-25 11:31:57 -04:00
Florian Hahn 46a04940e8 [DSE] Add stat for remaining stores after DSE.
Using the existing NumFastStores statistic can be misleading when
comparing the impact of DSE patches.

For example, consider the case where a store gets removed from a
function before it is inlined into another function. A less
powerful DSE might only remove the store from functions it has
been inlined into, which will result in more stores being removed, but
no difference in the actual number of stores after DSE.

The new stat provides the absolute number of stores surviving after
DSE.

Reviewers: dmgreen, bryant, asbirlea, jfb

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D78830
2020-04-25 16:12:55 +01:00
Tyker e5f8a77c19 [AssumeBundles] Refactor assume builder
Summary:
Refactor the assume builder for the next patch.
The assume builder now generates only one assume per attribute kind and per value they are on. To do this it takes the highest value. This is desirable because currently, for all attributes, the highest value is the most valuable.
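
The deduplication rule can be modelled with a map keyed by value and
attribute kind. The class below is a hypothetical standalone sketch (string
keys, the name `KnowledgeSet`), not the assume builder's actual data
structure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical model: (value name, attribute kind) -> highest argument seen.
class KnowledgeSet {
  std::map<std::pair<std::string, std::string>, std::uint64_t> Best;

public:
  // Records an attribute; a repeated (value, kind) pair keeps the maximum.
  void add(const std::string &Value, const std::string &Kind,
           std::uint64_t Arg) {
    std::uint64_t &Slot = Best[std::make_pair(Value, Kind)];
    Slot = std::max(Slot, Arg);
  }

  // Number of distinct (value, kind) entries that would be emitted.
  std::size_t size() const { return Best.size(); }

  // Returns the recorded argument, or 0 when nothing was recorded.
  std::uint64_t get(const std::string &Value, const std::string &Kind) const {
    auto It = Best.find(std::make_pair(Value, Kind));
    return It == Best.end() ? 0 : It->second;
  }
};
```

Adding `align 4` and then `align 16` for the same value leaves a single
entry with 16, which matches the "takes the highest" behaviour described
above.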

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78013
2020-04-25 13:43:52 +02:00
Benjamin Kramer 1d42764df7 Give helpers internal linkage. NFC. 2020-04-25 11:50:52 +02:00
Ehud Katz 64249f177e [CodeExtractor] Fix extraction of a value used only by intrinsics outside of region
We should only skip `lifetime` and `dbg` intrinsics when searching for users.
Other intrinsics are legit users that can't be ignored.

Without this fix, the testcase would result in invalid IR. `memcpy`
will have a reference to the, now, external value (local to the
extracted loop function).

Fix PR42194

Differential Revision: https://reviews.llvm.org/D78749
2020-04-25 11:44:47 +03:00
Craig Topper 2c24051bac [CallSite removal] Rename CallSite.h to AbstractCallSite.h. NFC
The CallSite and ImmutableCallSite were removed in a previous
commit. So rename the file to match the remaining class and
the name of the cpp that implements it.
2020-04-24 22:12:25 -07:00
Tyker 97ecd91e20 [NFC] Refactor SimplifyCFG to make propagating information easier.
Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77742
2020-04-24 22:22:20 +02:00
Michael Liao 495bb8feb9 Fix `-Wparentheses` warnings. NFC. 2020-04-24 15:04:01 -04:00
Tyker 42431da895 [AssumeBundles] Use assume bundles in isKnownNonZero
Summary: Use nonnull and dereferenceable from an assume bundle in isKnownNonZero

Reviewers: jdoerfert, nikic, lebedev.ri, reames, fhahn, sstefan1

Reviewed By: jdoerfert

Subscribers: fhahn, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76149
2020-04-24 20:41:51 +02:00
Florian Hahn e1235831c4 [DSE,MSSA] Improve debug output (NFC).
This patch slightly improves the formatting of the debug output, adds a
few missing outputs and makes some existing outputs more consistent with
the rest.
2020-04-24 17:50:08 +01:00
Florian Hahn 44ce588670 [DSE,MSSA] Skip checking write clobber for DomAccess (NFC).
There is no need to check if the starting access is a write clobber
and all of its uses have already been checked.
2020-04-24 17:16:22 +01:00
Sanjay Patel e4175ff525 [InstCombine] intersect FMF when reassociating FP min/max intrinsics
As discussed in PR45478:
https://bugs.llvm.org/show_bug.cgi?id=45478
...propagating FMF from the outer (second) call is not correct,
so intersect them instead.
I suspect we could do better (see TODO comment), but mismatched
FMF is probably too rare to care about.

Differential Revision: https://reviews.llvm.org/D78631
2020-04-24 12:14:03 -04:00
Simon Pilgrim 27ad103a3a ARCRuntimeEntryPoints.h - remove unnecessary includes. NFC. 2020-04-24 14:32:45 +01:00
Max Kazantsev 9cd4debd5a [LoopVectorize] Preserve CFG analyses if CFG wasn't modified
One of the transforms the loop vectorizer makes is LCSSA formation. In some cases it
is the only transform it makes. We should not drop CFG analyses if only LCSSA was
formed and no actual CFG changes were made.

We should think of expanding this logic to other passes as well, and maybe make
it a part of PM framework.

Reviewed By: Florian Hahn
Differential Revision: https://reviews.llvm.org/D78360
2020-04-24 17:22:24 +07:00
Johannes Doerfert 1dfc473177 Revert "[Attributor][NFC] Encode IRPositions in the bits of a single pointer"
A dependent patch has been reverted [0]. Until it goes back in this one
has to stay out.

[0] ebdb893994

This reverts commit d254b50b2b.
2020-04-24 02:53:51 -05:00
Johannes Doerfert d254b50b2b [Attributor][NFC] Encode IRPositions in the bits of a single pointer
This reduces memory consumption for IRPositions by eliminating the
vtable pointer and the `KindOrArgNo` integer. Since each abstract
attribute has an associated IRPosition, the 12-16 bytes we save add up
quickly.
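
The general technique of packing a small discriminator into the unused low
bits of an aligned pointer can be sketched as follows. `TaggedPtr` and the
`Kind` values are hypothetical and chosen for illustration only; the actual
encoding used by the patch may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical kinds; two bits are enough to distinguish four of them.
enum class Kind : std::uintptr_t { Function = 0, Argument = 1, Returned = 2, CallSite = 3 };

// Stores a pointer and a Kind in one pointer-sized word.
class TaggedPtr {
  std::uintptr_t Bits;
  static constexpr std::uintptr_t KindMask = 3; // requires 4-byte alignment

public:
  TaggedPtr(void *P, Kind K)
      : Bits(reinterpret_cast<std::uintptr_t>(P) |
             static_cast<std::uintptr_t>(K)) {
    assert((reinterpret_cast<std::uintptr_t>(P) & KindMask) == 0 &&
           "pointer must leave the low two bits free");
  }
  void *pointer() const { return reinterpret_cast<void *>(Bits & ~KindMask); }
  Kind kind() const { return static_cast<Kind>(Bits & KindMask); }
};
```

The object is the size of one pointer and has no virtual functions, so there
is neither a vtable pointer nor a separate integer field to pay for.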

No functional change is intended.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 469545 (260135/s)
temporary memory allocations: 77137 (42735/s)
peak heap memory consumption: 30.50MB
peak RSS (including heaptrack overhead): 119.50MB
total memory leaked: 269.07KB
```

After:
```
calls to allocation functions: 468999 (274108/s)
temporary memory allocations: 77002 (45004/s)
peak heap memory consumption: 28.83MB
peak RSS (including heaptrack overhead): 118.05MB
total memory leaked: 269.07KB
```

Difference:
```
calls to allocation functions: -546 (5808/s)
temporary memory allocations: -135 (1436/s)
peak heap memory consumption: -1.67MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

---

CTMark 15 runs

Metric: compile_time

Program                                        lhs    rhs    diff
 test-suite...:: CTMark/sqlite3/sqlite3.test    25.07  24.09 -3.9%
 test-suite...Mark/mafft/pairlocalalign.test    14.58  14.14 -3.0%
 test-suite...-typeset/consumer-typeset.test    21.78  21.58 -0.9%
 test-suite :: CTMark/SPASS/SPASS.test          21.95  22.03  0.4%
 test-suite :: CTMark/lencod/lencod.test        25.43  25.50  0.3%
 test-suite...ark/tramp3d-v4/tramp3d-v4.test    23.88  23.83 -0.2%
 test-suite...TMark/7zip/7zip-benchmark.test    60.24  60.11 -0.2%
 test-suite :: CTMark/kimwitu++/kc.test         15.69  15.69 -0.0%
 test-suite...:: CTMark/ClamAV/clamscan.test    25.43  25.42 -0.0%
 test-suite :: CTMark/Bullet/bullet.test        37.63  37.62 -0.0%
 Geomean difference                                          -0.8%

---

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D78722
2020-04-24 01:58:47 -05:00
Mircea Trofin b8960b5d81 [llvm][NFC][CallSite] Remove remaining {Immutable}CallSite uses
Reviewers: dblaikie, craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78789
2020-04-23 22:19:39 -07:00
Mehdi Amini 2107af9ccf Revert "[VPlan] Add & use VPValue operands for VPWidenRecipe (NFC)."
This reverts commit 9245c7ac13.

This is triggering a segfault in XLA downstream, we'll follow-up with
a reproducer, it is likely influenced by TTI/TLI settings or other
options as a simple `opt -loop-vectorize` invocation on the IR
before the crash does not reproduce immediately.
2020-04-24 05:07:32 +00:00
Mircea Trofin 2059a6e3ef [llvm][NFC][CallSite] Remove ImmutableCallSite from a few locations
Reviewers: craig.topper, dblaikie

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78783
2020-04-23 21:18:44 -07:00
Craig Topper cbe77ca9bd [CallSite removal] Remove unneeded includes of CallSite.h. NFC 2020-04-23 21:01:48 -07:00
Craig Topper 81c5e83f7d [CallSite removal][Transform] Replace CallSite with CallBase in Utils. NFC
Differential Revision: https://reviews.llvm.org/D78780
2020-04-23 20:49:33 -07:00
Roman Lebedev 5a159ed2a8 [InstCombine] Negator: don't negate multi-use `sub`
While we can do that without increasing the instruction count,
if the old `sub` sticks around then the transform is not only
an unlikely win, but a likely regression, since we have likely
now extended the live range and use count of both of the `sub` operands,
as opposed to just the result of `sub`.

As Kostya Serebryany notes in post-commit review in
https://reviews.llvm.org/D68408#1998112
this indeed can degrade final assembly,
increase register pressure, and spilling.

This isn't what we want here,
so at least for now let's guard it with a use check.
2020-04-23 23:59:15 +03:00
Christopher Tetreault 7ca56c90bd [SVE] Remove calls to isScalable from Transforms
Reviewers: efriedma, chandlerc, reames, aprantl, sdesmalen

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77756
2020-04-23 13:50:07 -07:00
Mircea Trofin ceb7f308b8 [llvm][NFC][CallSite] Removed CallSite from few implementation details
Reviewers: dblaikie, craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78724
2020-04-23 10:36:36 -07:00
Mircea Trofin cea6f4d5f8 [llvm][NFC][CallSite] Remove CallSite from TypeMetadataUtils & related
Reviewers: craig.topper, dblaikie

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78666
2020-04-23 08:23:16 -07:00
Sanjay Patel 62da6ecea2 [InstCombine] substitute equivalent constant to reduce logic-of-icmps
(X == C) && (Y Pred1 X) --> (X == C) && (Y Pred1 C)
(X != C) || (Y Pred1 X) --> (X != C) || (Y Pred1 C)

This cooperates/overlaps with D78430, but it is a more general transform
that gets us most of the expected simplifications and several other
improvements.
http://volta.cs.utah.edu:8080/z/5gxjjc
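
A brute-force check of both forms over 8-bit values, as a standalone C++
sketch. Pred1 is fixed to unsigned less-than here purely for illustration,
and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// (X == C) && (Y Pred1 X)  versus  (X == C) && (Y Pred1 C)
inline bool andBefore(std::uint8_t X, std::uint8_t Y, std::uint8_t C) { return X == C && Y < X; }
inline bool andAfter(std::uint8_t X, std::uint8_t Y, std::uint8_t C) { return X == C && Y < C; }

// (X != C) || (Y Pred1 X)  versus  (X != C) || (Y Pred1 C)
inline bool orBefore(std::uint8_t X, std::uint8_t Y, std::uint8_t C) { return X != C || Y < X; }
inline bool orAfter(std::uint8_t X, std::uint8_t Y, std::uint8_t C) { return X != C || Y < C; }

// Tries all 2^24 combinations; X only matters to the second operand when
// X == C, which is exactly when substituting C for X changes nothing.
inline bool substitutionHolds() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned Y = 0; Y < 256; ++Y)
      for (unsigned C = 0; C < 256; ++C) {
        if (andBefore(X, Y, C) != andAfter(X, Y, C))
          return false;
        if (orBefore(X, Y, C) != orAfter(X, Y, C))
          return false;
      }
  return true;
}
```

`assert(substitutionHolds());` passes for this predicate; the same argument
applies to any other comparison substituted for Pred1.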

PR45618:
https://bugs.llvm.org/show_bug.cgi?id=45618

Differential Revision: https://reviews.llvm.org/D78582
2020-04-23 10:19:16 -04:00
Simon Pilgrim 7a8b1096be [ObjCARC] Remove unused forward declarations. NFC. 2020-04-23 13:52:49 +01:00
Simon Pilgrim b108a457e1 [VPlan] Remove unused forward declarations. NFC.
Move VPlan.h include from VPlanVerifier.h down to VPlanVerifier.cpp
2020-04-23 12:34:20 +01:00
Serguei Katkov c0d2bbb1d4 [CaptureTracking] Replace hardcoded constant to option. NFC.
The motivation is to be able to experiment with the option and change its value if required.

Reviewers: fedor.sergeev, apilipenko, rnk, jdoerfert
Reviewed By: fedor.sergeev
Subscribers: hiraditya, dantrushin, llvm-commits
Differential Revision: https://reviews.llvm.org/D78624
2020-04-23 18:23:35 +07:00
Florian Hahn 9245c7ac13 [VPlan] Add & use VPValue operands for VPWidenRecipe (NFC).
This patch adds VPValue version of the instruction operands to
VPWidenRecipe and uses them during code-generation.

Similar to D76373 this reduces ingredient def-use usage by ILV as
a step towards full VPlan-based def-use relations.

Reviewers: rengolin, Ayal, gilr

Reviewed By: gilr

Differential Revision: https://reviews.llvm.org/D76992
2020-04-23 12:16:46 +01:00
Craig Topper 25807452ac [ArgumentPromotion] Remove unnecessary getScalarType() before casting to PointerType. NFC
I don't believe this pass deals with vectors of pointers. I think
this getScalarType() was added during a mechanical opaque pointer
change of the interface to GetElementPtrInst::getIndexedType.
2020-04-22 22:51:41 -07:00
Vedant Kumar 2fa656cdfd [Debugify] Do not require named metadata to be present when stripping
This allows -mir-strip-debug to be run without -debugify having run
before.
2020-04-22 17:03:39 -07:00
Vedant Kumar 2a5675f11d [MachineDebugify] Insert synthetic DBG_VALUE instructions
Summary:
Teach MachineDebugify how to insert DBG_VALUE instructions.  This can
help find bugs causing CodeGen differences when debug info is present.
DBG_VALUE instructions are only emitted when -debugify-level is set to
locations+variables.

There is essentially no attempt made to match up DBG_VALUE register
operands with the local variables they ought to correspond to. I'm not
sure how to improve the situation. In some cases (MachineMemOperand?)
it's possible to find the IR instruction a MachineInstr corresponds to,
but in general this seems to call for "undoing" the work done by ISel.

Reviewers: dsanders, aprantl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78135
2020-04-22 17:03:39 -07:00
Juneyoung Lee aca335955c [ValueTracking] Let analyses assume a value cannot be partially poison
Summary:
This is an RFC for fixes to the poison-related functions of ValueTracking.
These functions assume that a value can be poison bit by bit, but the semantics
of bitwise poison are not clear at the moment.
Allowing a value to have bitwise poison adds complexity to reasoning about
correctness of optimizations.

This patch makes the analysis functions simply assume that a value is
either fully poison or not, which has been used to understand the correctness
of a few previous optimizations.
The bitwise poison semantics seems to be only used by these functions as well.

In terms of implementation, using the value-wise poison concept lets the existing
functions do more precise analysis, which is what this patch contains.

Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, nlopes, regehr

Reviewed By: nikic

Subscribers: fhahn, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78503
2020-04-23 08:08:53 +09:00
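As a rough illustration of the distinction drawn in aca335955c above, the sketch below contrasts a value-wise poison model with one possible bitwise model for an 8-bit `and`. Everything here is an assumption made for illustration: `ValuePoison`, `BitPoison`, `andValue`, and `andBit` are hypothetical names, and the bitwise rule shown is just one candidate semantics, not something the commit or LLVM defines.

```cpp
#include <cassert>
#include <cstdint>

// Value-wise model: the whole value is either poison or it is not.
struct ValuePoison {
  uint8_t Bits;
  bool IsPoison;
};

// Bitwise model (one hypothetical choice): each set bit in PoisonMask
// marks a poison bit. Poison positions keep a 0 in Bits by convention.
struct BitPoison {
  uint8_t Bits;
  uint8_t PoisonMask;
};

// Value-wise `and`: poison in either operand poisons the whole result.
ValuePoison andValue(ValuePoison A, ValuePoison B) {
  return {static_cast<uint8_t>(A.Bits & B.Bits), A.IsPoison || B.IsPoison};
}

// Bitwise `and`: a result bit is poison if either input bit is poison,
// unless the other operand has a known, non-poison 0 in that position.
BitPoison andBit(BitPoison A, BitPoison B) {
  assert((A.Bits & A.PoisonMask) == 0 && (B.Bits & B.PoisonMask) == 0 &&
         "poison positions must hold 0 in Bits");
  uint8_t KnownZeroA = static_cast<uint8_t>(~A.Bits & ~A.PoisonMask);
  uint8_t KnownZeroB = static_cast<uint8_t>(~B.Bits & ~B.PoisonMask);
  uint8_t Mask = static_cast<uint8_t>((A.PoisonMask | B.PoisonMask) &
                                      ~(KnownZeroA | KnownZeroB));
  uint8_t Bits = static_cast<uint8_t>((A.Bits & B.Bits) & ~Mask);
  return {Bits, Mask};
}
```

In this toy model, `and`-ing a fully poison value with a constant 0 yields poison value-wise but a clean 0 bitwise. The single flag is easier to reason about, which is the simplification the message argues for.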
Juneyoung Lee 5ceef26350 Revert "RFC: [ValueTracking] Let analyses assume a value cannot be partially poison"
This reverts commit 80faa8c3af.
2020-04-23 08:07:09 +09:00
Juneyoung Lee 80faa8c3af RFC: [ValueTracking] Let analyses assume a value cannot be partially poison
Summary:
This is an RFC for fixes to the poison-related functions of ValueTracking.
These functions assume that a value can be poison bit by bit, but the semantics
of bitwise poison are not clear at the moment.
Allowing a value to have bitwise poison adds complexity to reasoning about
correctness of optimizations.

This patch makes the analysis functions simply assume that a value is
either fully poison or not, which has been used to understand the correctness
of a few previous optimizations.
The bitwise poison semantics seems to be only used by these functions as well.

In terms of implementation, using the value-wise poison concept lets the existing
functions do more precise analysis, which is what this patch contains.

Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, nlopes, regehr

Reviewed By: nikic

Subscribers: fhahn, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78503
2020-04-23 07:57:12 +09:00
Florian Hahn 352b612a71 [SCCP] Drop unnecessary early exit for ExtractValueInst.
visitExtractValueInst uses mergeInValue, so it already can handle
constant ranges. Initially the early exit was using isOverdefined to
keep things as NFC during the initial move to ValueLatticeElement.
As the function already supports constant ranges, it can just use
ValueState[&I].isOverdefined.

Reviewers: efriedma, mssimpso, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D78393
2020-04-22 22:07:59 +01:00
Craig Topper be04aba6fc [CallSite removal][ValueTracking] Use CallBase instead of ImmutableCallSite for getIntrinsicForCallSite. NFC
Differential Revision: https://reviews.llvm.org/D78613
2020-04-22 12:06:58 -07:00
Christopher Tetreault 2dea3f1298 [SVE] Add new VectorType subclasses
Summary:
Introduce new types for fixed width and scalable vectors.

Does not remove getNumElements yet so as to not break code during the transition
period.

Reviewers: deadalnix, efriedma, sdesmalen, craig.topper, huntergr

Reviewed By: sdesmalen

Subscribers: jholewinski, arsenm, jvesely, nhaehnle, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, liufengdb, kerbowa, Joonsoo, grosul1, frgossen, lldb-commits, tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm, #lldb

Differential Revision: https://reviews.llvm.org/D77587
2020-04-22 08:59:01 -07:00
Mircea Trofin 1b6b05a250 [llvm][NFC][CallSite] Remove CallSite from a few trivial locations
Summary: Implementation details and internal (to module) APIs.

Reviewers: craig.topper, dblaikie

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78610
2020-04-22 08:39:21 -07:00
Dmitry Vyukov 5a2c31116f [TSAN] Add optional support for distinguishing volatiles
Add support to optionally emit different instrumentation for accesses to
volatile variables. While the default TSAN runtime likely will never
require this feature, other runtimes for different environments that
have subtly different memory models or assumptions may require
distinguishing volatiles.

One such environment is OS kernels, where volatile is still used in
various places for various reasons, and where volatile is often declared
to be "safe enough" even in multi-threaded contexts. One such example is the
Linux kernel, which implements various synchronization primitives using
volatile (READ_ONCE(), WRITE_ONCE()). Here the Kernel Concurrency
Sanitizer (KCSAN) [1] is a runtime that uses TSAN instrumentation but
otherwise implements a very different approach to race detection from
TSAN.

While in the Linux kernel it is generally discouraged to use volatiles
explicitly, the topic will likely come up again, and we will eventually
need to distinguish volatile accesses [2]. The other use-case is
ignoring data races on specially marked variables in the kernel, for
example bit-flags (here we may hide 'volatile' behind a different name
such as 'no_data_race').

[1] https://github.com/google/ktsan/wiki/KCSAN
[2] https://lkml.kernel.org/r/CANpmjNOfXNE-Zh3MNP=-gmnhvKbsfUfTtWkyg_=VqTxS4nnptQ@mail.gmail.com

Author: melver (Marco Elver)
Reviewed-in: https://reviews.llvm.org/D78554
2020-04-22 17:27:09 +02:00
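For context on 5a2c31116f above, volatile-based accessors in the spirit of READ_ONCE()/WRITE_ONCE() can be sketched in C++ as below. This is a simplified assumption for illustration only: `readOnce`, `writeOnce`, and `roundTrip` are hypothetical names, not the kernel's macros, and the snippet does not show the instrumentation the patch emits. It only shows the kind of volatile access an instrumentation pass could treat separately from plain loads and stores.

```cpp
#include <cassert>
#include <type_traits>

// Reads Var through a volatile-qualified pointer, so the compiler must
// perform exactly one load and cannot merge or elide it.
template <typename T> T readOnce(const T &Var) {
  static_assert(std::is_scalar<T>::value, "sketch handles scalars only");
  return *static_cast<const volatile T *>(&Var);
}

// Writes Val to Var through a volatile-qualified pointer.
template <typename T> void writeOnce(T &Var, T Val) {
  static_assert(std::is_scalar<T>::value, "sketch handles scalars only");
  *static_cast<volatile T *>(&Var) = Val;
}

// Usage sketch: store a value into a flag and read it back, with both
// accesses going through the volatile path.
int roundTrip(int Val) {
  int Flag = 0;
  assert(readOnce(Flag) == 0 && "flag starts cleared");
  writeOnce(Flag, Val);
  return readOnce(Flag);
}
```

Note that volatile alone gives no atomicity or ordering guarantees in standard C++; whether such accesses count as "safe enough" is a policy decision of the environment, as the message describes.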
Roman Lebedev 67266d879c [InstCombine] Negator: shufflevector is negatible
All these folds are correct as per alive-tv
2020-04-22 15:14:23 +03:00
Craig Topper 05a11974ae [CallSite removal] Remove unneeded includes of CallSite.h. NFC 2020-04-22 00:07:13 -07:00
Johannes Doerfert ca59ff5af9 [Attributor] Replace AccessKind2Accesses map with an "array map"
The number of different access location kinds we track is relatively
small (8 so far). With this patch we replace the DenseMap that mapped
from index (0-7) to the access set pointer with an array of access set
pointers. This reduces memory consumption.

No functional change is intended.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 472499 (215654/s)
temporary memory allocations: 77794 (35506/s)
peak heap memory consumption: 35.28MB
peak RSS (including heaptrack overhead): 125.46MB
total memory leaked: 269.04KB
```

After:
```
calls to allocation functions: 472270 (308673/s)
temporary memory allocations: 77578 (50704/s)
peak heap memory consumption: 32.70MB
peak RSS (including heaptrack overhead): 121.78MB
total memory leaked: 269.04KB
```

Difference:
```
calls to allocation functions: -229 (346/s)
temporary memory allocations: -216 (326/s)
peak heap memory consumption: -2.58MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

---
2020-04-22 01:35:27 -05:00
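The "array map" idea from ca59ff5af9 above can be sketched as follows. This is a minimal illustration assuming eight kinds indexed 0-7; `AccessTable`, `AccessSet`, and `NumKinds` are hypothetical names, and `std::set<int>` is a stand-in for whatever access set type the Attributor actually uses.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>

constexpr std::size_t NumKinds = 8; // assumed number of access kinds
using AccessSet = std::set<int>;    // stand-in for the real access set

// A fixed array of set pointers indexed directly by kind. Compared with a
// hash map keyed on the same small index, there are no buckets to allocate
// and no hashing; unused kinds simply stay null.
class AccessTable {
  std::array<std::unique_ptr<AccessSet>, NumKinds> Sets;

public:
  // Returns the set for KindIdx, creating it on first use.
  AccessSet &getOrCreate(std::size_t KindIdx) {
    assert(KindIdx < NumKinds && "kind index out of range");
    std::unique_ptr<AccessSet> &Slot = Sets[KindIdx];
    if (!Slot)
      Slot = std::make_unique<AccessSet>();
    return *Slot;
  }

  // Returns the set for KindIdx, or nullptr if none was created yet.
  const AccessSet *lookup(std::size_t KindIdx) const {
    assert(KindIdx < NumKinds && "kind index out of range");
    return Sets[KindIdx].get();
  }
};
```

With a key space this small and dense, direct indexing is a natural fit, which is consistent with the lower peak heap consumption reported in the message.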