Commit Graph

456 Commits

Author SHA1 Message Date
Nico Weber 085f078307 Revert "Revert D109159 "[amdgpu] Enable selection of `s_cselect_b64`.""
This reverts commit 859ebca744.
The change contained many unrelated changes and e.g. restored
unit test failes for the old lld port.
2022-01-05 13:10:25 -05:00
David Salinas 859ebca744 Revert D109159 "[amdgpu] Enable selection of `s_cselect_b64`."
This reverts commit 640beb38e7.

That commit caused performance degradtion in Quicksilver test QS:sGPU and a functional test failure in (rocPRIM rocprim.device_segmented_radix_sort).
Reverting until we have a better solution to s_cselect_b64 codegen cleanup

Change-Id: Ibf8e397df94001f248fba609f072088a46abae08

Reviewed By: kzhuravl

Differential Revision: https://reviews.llvm.org/D115960

Change-Id: Id169459ce4dfffa857d5645a0af50b0063ce1105
2022-01-05 17:57:32 +00:00
Djordje Todorovic 0b23b80985 [NFC][LICM] Update the comment in the scalar-promote.ll
The comment was stale after the https://reviews.llvm.org/D113289
was committed.
2021-12-06 03:00:39 -08:00
Philip Reames 1a25d0bfbb [LICM] Remove profile driven restriction on hoisting
This reverts change 2c391a5/D87551.  As noted in the llvm-dev thread "LICM as canonical form" sent earlier today, introducing this was a major design change made without sufficient cause.

A profile driven LICM is not an unreasonable design, it simply is not what we have.  Switching to such a model requires a lot more work than just this patch, and broad aggeement that is the right direction for the optimizer as a whole.

Worth noting is that all the tests included in the reverted changed are probably handled if we allow running unconstrained LICM, and later run LoopSink.  As such, we have no public examples which motivate a profit based hoisting approach.
2021-12-03 17:19:25 -08:00
Djordje Todorovic 2cdc6f2ca6 Reland "[LICM] Hoist LOAD without sinking the STORE"
When doing load/store promotion within LICM, if we
cannot prove that it is safe to sink the store we won't
hoist the load, even though we can prove the load could
be dereferenced and moved outside the loop. This patch
implements the load promotion by moving it in the loop
preheader by inserting proper PHI in the loop. The store
is kept as is in the loop. By doing this, we avoid doing
the load from a memory location in each iteration.

Please consider this small example:

loop {
  var = *ptr;
  if (var) break;
  *ptr= var + 1;
}
After this patch, it will be:

var0 = *ptr;
loop {
  var1 = phi (var0, var2);
  if (var1) break;
  var2 = var1 + 1;
  *ptr = var2;
}
This addresses some problems from [0].

[0] https://bugs.llvm.org/show_bug.cgi?id=51193

Differential revision: https://reviews.llvm.org/D113289
2021-12-02 03:53:50 -08:00
Djordje Todorovic 8e757b2383 [LICM] Adding the test as a precommit for the D113289 2021-12-02 03:38:14 -08:00
Djordje Todorovic 72f9f066df Revert "[LICM] Hoist LOAD without sinking the STORE"
This reverts commit ecb9d8e4e3.

I'll reland this as soon as the failing tests are fixed/updated.
2021-12-01 04:39:26 -08:00
Djordje Todorovic ecb9d8e4e3 [LICM] Hoist LOAD without sinking the STORE
When doing load/store promotion within LICM, if we
cannot prove that it is safe to sink the store we won't
hoist the load, even though we can prove the load could
be dereferenced and moved outside the loop. This patch
implements the load promotion by moving it in the loop
preheader by inserting proper PHI in the loop. The store
is kept as is in the loop. By doing this, we avoid doing
the load from a memory location in each iteration.

Please consider this small example:

loop {
  var = *ptr;
  if (var) break;
  *ptr= var + 1;
}
After this patch, it will be:

var0 = *ptr;
loop {
  var1 = phi (var0, var2);
  if (var1) break;
  var2 = var1 + 1;
  *ptr = var2;
}
This addresses some problems from [0].

[0] https://bugs.llvm.org/show_bug.cgi?id=51193

Differential revision: https://reviews.llvm.org/D113289
2021-12-01 04:27:50 -08:00
Nikita Popov 1e1a8be21f [LICM] Support opaque pointers in scalar promotion
Make sure that all pointers have the same load/store access type,
rather than comparing pointer element types.
2021-12-01 12:56:24 +01:00
Nikita Popov eee035235e [LICM] Regenerate test checks (NFC) 2021-11-29 21:28:33 +01:00
Bjorn Pettersson a413663d8f [NewPM][test] Avoid using -enable-new-pm=1 since -passes implies new PM 2021-10-20 15:16:17 +02:00
Arthur Eubanks 7f28b4d5b7 [LICM] Bail if checking a global/constant for invariant.start
When we check if a load is loop invariant by finding a dominating
invariant.start call, we strip bitcasts until we get to an i8* Value,
and look for an invariant.start use of the i8* Value.

We may accidentally end up at an i8 global and look at a global's uses,
which we shouldn't do in a loop pass. Although we could make this
logic work with globals, that's not currently intended.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D111098
2021-10-04 14:14:25 -07:00
Arthur Eubanks bb69f1dcf9 [test] Precommit test about hoisting invariant loads from globals
And run update_test_checks.py on hoisting.ll.
2021-10-04 13:46:47 -07:00
Simon Pilgrim 993f3c61b3 [TTI] getUserCost - Ensure a vector insert/extract index is in unsigned 32-bit range
Otherwise fallback to the generic 'unknown index' path

Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=29050
2021-09-25 10:50:54 +01:00
Arthur Eubanks 37e6a27da7 [test] Fixup tests with -analyze in llvm/test/Transforms 2021-09-04 16:45:51 -07:00
Nikita Popov 3dd8c9176b [LICM] Remove AST-based implementation
MSSA-based LICM has been enabled by default for a few years now.
This drops the old AST-based implementation. Using loop(licm) will
result in a fatal error, the use of loop-mssa(licm) is required
(or just licm, which defaults to loop-mssa).

Note that the core canSinkOrHoistInst() logic has to retain AST
support for now, because it is shared with LoopSink.

Differential Revision: https://reviews.llvm.org/D108244
2021-08-18 20:21:53 +02:00
Nikita Popov e918ba6958 [LICM] Drop -licm-n2-threshold option
This was a diagnostic option used to demonstrate a weakness in
the AST-based LICM implementation. This problem does not exist
in the MSSA-based LICM implementation, which has been enabled
for a long time now. As such, this option is no longer relevant.
2021-08-17 22:41:31 +02:00
Whitney Tsang a41c95c0e3 [LNICM] Fix infinite loop
There is a bug introduced by https://reviews.llvm.org/D107219 which causes an infinite loop, when there are more than 2 levels PHINode chain.

Reviewed By: uint256_t

Differential Revision: https://reviews.llvm.org/D108166
2021-08-17 12:55:22 +09:00
Nikita Popov 735a590471 [MemorySSA] Remove -enable-mssa-loop-dependency option
This option has been enabled by default for quite a while now.
The practical impact of removing the option is that MSSA use
cannot be disabled in default pipelines (both LPM and NPM) and
in manual LPM invocations. NPM can still choose to enable/disable
MSSA using loop vs loop-mssa.

The next step will be to require MSSA for LICM and drop the
AST-based implementation entirely.

Differential Revision: https://reviews.llvm.org/D108075
2021-08-16 20:59:37 +02:00
Nikita Popov e11354c0a4 [Tests] Remove explicit -enable-mssa-loop-dependency options (NFC)
This is enabled by default. Drop explicit uses in preparation for
removing the option.

Also drop RUN lines that are now the same (typically modulo a
-verify-memoryssa option).
2021-08-14 21:21:07 +02:00
maekawatoshiki dd3eea6566 [LICM] Support sinking in LNICM
Currently, LNICM pass does not support sinking instructions out of loop nest.
This patch enables LNICM to sink down as many instructions to the exit block of outermost loop as possible.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D107219
2021-08-13 00:56:26 +09:00
Anna Thomas 8ee5759fd5 Strip undef implying attributes when moving calls
When hoisting/moving calls to locations, we strip unknown metadata. Such calls are usually marked `speculatable`, i.e. they are guaranteed to not cause undefined behaviour when run anywhere. So, we should strip attributes that can cause immediate undefined behaviour if those attributes are not valid in the context where the call is moved to.

This patch introduces such an API and uses it in relevant passes. See
updated tests.

Fix for PR50744.

Reviewed By: nikic, jdoerfert, lebedev.ri

Differential Revision: https://reviews.llvm.org/D104641
2021-07-27 10:57:05 -04:00
Eli Friedman 5c486ce04d [LLVM IR] Allow volatile stores to trap.
Proposed alternative to D105338.

This is ugly, but short-term I think it's the best way forward: first,
let's formalize the hacks into a coherent model. Then we can consider
extensions of that model (we could have different flavors of volatile
with different rules).

Differential Revision: https://reviews.llvm.org/D106309
2021-07-26 10:51:00 -07:00
Nikita Popov 33146857e9 [IR] Consider non-willreturn as side effect (PR50511)
This adjusts mayHaveSideEffect() to return true for !willReturn()
instructions. Just like other side-effects, non-willreturn calls
(aka "divergence") cannot be removed and cannot be reordered relative
to other side effects. This fixes a number of bugs where
non-willreturn calls are either incorrectly dropped or moved. In
particular, it also fixes the last open problem in
https://bugs.llvm.org/show_bug.cgi?id=50511.

I performed a cursory review of all current mayHaveSideEffect()
uses, which convinced me that these are indeed the desired default
semantics. Places that do not want to consider non-willreturn as a
sideeffect generally do not want mayHaveSideEffect() semantics at
all. I identified two such cases, which are addressed by D106591
and D106742. Finally, there is a use in SCEV for which we don't
really have an appropriate API right now -- what it wants is
basically "would this be considered forward progress". I've just
spelled out the previous semantics there.

Differential Revision: https://reviews.llvm.org/D106749
2021-07-26 16:35:14 +02:00
Nikita Popov c7e69e46c8 [Tests] Add additional tests for incorrect willreturn handling (NFC)
Highlight a few of the places that don't handle non-willreturn
calls correctly right now.
2021-07-24 17:27:29 +02:00
Nikita Popov baa51a0cef [Tests] Add missing willreturn attributes (NFC)
To retain the spirit of these tests after an upcoming change
to mayHaveSideEffect(), add willreturn attributes to a number
of functions.
2021-07-24 17:17:48 +02:00
Nikita Popov 0339fcc728 [LICM] Extract debugify test (NFC)
Only one of the tests in the file wants to check debug info, so
move it into a separate file. This allows update_test_checks to
work.
2021-07-24 17:04:42 +02:00
Nikita Popov 4294657bd5 [LICM][SCCP] Regenerate test checks (NFC) 2021-07-22 21:37:21 +02:00
Eli Friedman 4f993463ca [NFC] Run -instnamer on test Transforms/LICM/sink-debuginfo-preserve.ll 2021-07-19 11:00:47 -07:00
maekawatoshiki 74f0f9a455 [LICM] Create LoopNest Invariant Code Motion (LNICM) pass
This patch adds a new pass called LNICM which is a LoopNest version of LICM and a test case to show how LNICM works.
Basically, LNICM only hoists invariants out of loop nest (not a loop) to keep/make perfect loop nest. This enables later optimizations that require perfect loop nest.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D104180
2021-07-20 00:31:18 +09:00
Philip Reames e75a2dfe20 [tests] Stablize tests for possible change in deref semantics
There's a potential change in dereferenceability attribute semantics in the nearish future.  See llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree" and D99100 for context.

This change simply adds appropriate attributes to tests to keep transform logic exercised under both old and new/proposed semantics.  Note that for many of these cases, O3 would infer exactly these attributes on the test IR.

This change handles the idiomatic pattern of a dereferenceable object being passed to a call which can not free that memory.  There's a couple other tests which need more one-off attention, they'll be handled in another change.
2021-07-14 13:05:43 -07:00
Nikita Popov ff8b1b1b9c Reapply [IR] Don't mark mustprogress as type attribute
Reapply with fixes for clang tests.

-----

This is a simple enum attribute. Test changes are because enum
attributes are sorted before type attributes, so mustprogress is
now in a different position.
2021-07-09 20:57:44 +02:00
Nikita Popov 23dd750279 Revert "[IR] Don't mark mustprogress as type attribute"
This reverts commit 84ed3a794b.

A number of clang tests are also affected by this change. Revert
until I can update them.
2021-07-09 18:46:00 +02:00
Nikita Popov 84ed3a794b [IR] Don't mark mustprogress as type attribute
This is a simple enum attribute.

Test changes are because enum attributes are sorted before type
attributes.
2021-07-09 18:24:16 +02:00
Anna Thomas e9a3637c0c Precommit tests for context senstive attribute dropping
Precommit tests from D104641.
The patch will fix the callsites by dropping the context-sensitive
attributes.

Reviewed-By: Self
2021-06-24 13:18:16 -04:00
Bjorn Pettersson 4c7f820b2b Update @llvm.powi to handle different int sizes for the exponent
This can be seen as a follow up to commit 0ee439b705,
that changed the second argument of __powidf2, __powisf2 and
__powitf2 in compiler-rt from si_int to int. That was to align with
how those runtimes are defined in libgcc.
One thing that seem to have been missing in that patch was to make
sure that the rest of LLVM also handle that the argument now depends
on the size of int (not using the si_int machine mode for 32-bit).
When using __builtin_powi for a target with 16-bit int clang crashed.
And when emitting libcalls to those rtlib functions, typically when
lowering @llvm.powi), the backend would always prepare the exponent
argument as an i32 which caused miscompiles when the rtlib was
compiled with 16-bit int.

The solution used here is to use an overloaded type for the second
argument in @llvm.powi. This way clang can use the "correct" type
when lowering __builtin_powi, and then later when emitting the libcall
it is assumed that the type used in @llvm.powi matches the rtlib
function.

One thing that needed some extra attention was that when vectorizing
calls several passes did not support that several arguments could
be overloaded in the intrinsics. This patch allows overload of a
scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with
an entry for powi.

Differential Revision: https://reviews.llvm.org/D99439
2021-06-17 09:38:28 +02:00
serge-sans-paille 4ab3041acb Revert "[NFC] remove explicit default value for strboolattr attribute in tests"
This reverts commit bda6e5bee0.

See https://lab.llvm.org/buildbot/#/builders/109/builds/15424 for instance
2021-05-24 19:43:40 +02:00
serge-sans-paille bda6e5bee0 [NFC] remove explicit default value for strboolattr attribute in tests
Since d6de1e1a71, no attributes is quivalent to
setting attribute to false.

This is a preliminary commit for https://reviews.llvm.org/D99080
2021-05-24 19:31:04 +02:00
Nikita Popov e81334a754 [LICM] Remove MaybePromotable set (PR50367)
The MaybePromotable set keeps track of loads/stores for which
promotion was not attempted yet. Normally, any load/stores that
are promoted in the current iteration will be removed from this
set, because they naturally MustAlias with the promoted value.
However, if the source program has UB with metadata claiming that
a store is NoAlias, while it is actually MustAlias, and multiple
different pointers are promoted in the same iteration, it can
happen that a store is removed that is still in the MaybePromotable
set, causing a use-after-free.

While this could be fixed by explicitly invalidating values in
MaybePromotable in the LoopPromoter, I'm going with the more
radical option of dropping the set entirely here and check all
load/stores on each promotion iteration. As promotion, and especially
repeated promotion, are quite rare, this doesn't seem to have any
impact on compile-time.

Fixes https://bugs.llvm.org/show_bug.cgi?id=50367.
2021-05-18 20:26:01 +02:00
Roman Lebedev 1acd9a1a29
Revert "[LICM] Hoist loads with invariant.group metadata"
This appears to miscompile google benchmark's GetCacheSizesFromKVFS()
when compiling with -fstrict-vtable-pointers.
Runnable reproducer: https://godbolt.org/z/f9ovKqTzb
The "f.fail()" crashes with BUS error, it is compiled into testb,
and the adress it is testing is non-sensical.

This reverts commit 4c89bcadf6.
2021-05-08 15:44:49 +03:00
Nikita Popov d440f9a326 [LICM] Make capture check more precise
During store promotion, we check whether the pointer was captured
to exclude potential reads from other threads. However, we're only
interested in captures before or inside the loop. Check this using
PointerMayBeCapturedBefore against the loop header.

Differential Revision: https://reviews.llvm.org/D100706
2021-04-19 20:34:23 +02:00
Nikita Popov ae2da68da6 [LICM] Add more tests for promotion and capture (NFC)
We could optimize the first case, as the pointer is captured only
after the loop.
2021-04-17 16:57:15 +02:00
Philip Reames ff55d01a8e [nofree] Restrict semantics to memory visible to caller
This patch clarifies the semantics of the nofree function attribute to make clear that it provides an "as if" semantic. That is, a nofree function is guaranteed not to free memory which existed before the call, but might allocate and then deallocate that same memory within the lifetime of the callee.

This is the result of the discussion on llvm-dev under the thread "Ambiguity in the nofree function attribute".

The most important part of this change is the LangRef wording. The rest is minor comment changes to emphasize the new semantics where code was accidentally consistent, and fix one place which wasn't consistent. That one place is currently narrowly used as it is primarily part of the ongoing (and not yet enabled) deref-at-point semantics work.

Differential Revision: https://reviews.llvm.org/D100141
2021-04-16 11:38:55 -07:00
Philip Reames dd985551c2 Reapply "[InferAttributes] Materialize all infered attributes for declaration"" and follow on patches.
This reverts commit ab98f2c712 and 98eea392cd.

It includes a fix for the clang test which triggered the revert.  I failed to notice this one because there was another AMDGPU llvm test with a similiar name and the exact same text in the error message.  Odd.  Since only one build bot reported the clang test, I didn't notice that one.
2021-04-14 16:38:07 -07:00
Nico Weber ab98f2c712 Revert "[InferAttributes] Materialize all infered attributes for declaration"
Breaks check-clang, see comments on D100400

Also revert follow-up "[NFC] Move a recently added utility into a location to enable reuse"

This reverts commit 3ce61fb6d6.
This reverts commit 61a85da882.
2021-04-14 18:41:20 -04:00
Philip Reames 61a85da882 [InferAttributes] Materialize all infered attributes for declaration
We have some cases today where attributes can be inferred from another on access, but the result is not explicitly materialized in IR. This change is a step towards changing that.

Why? Two main reasons:

* Human clarity. It's really confusing trying to figure out why a transform is triggering when the IR doesn't appear to have the required attributes.
* This avoids the need to special case declarations in e.g. functionattrs. Since we can assume the attribute is present, we can work directly from attributes (and only attributes) without also needing to query accessors on Function to avoid missing cases due to unannotated (but infered on use) declarations. (This piece will appear must easier to follow once D100226 also lands.)

Differential Revision: https://reviews.llvm.org/D100400
2021-04-14 14:45:24 -07:00
Arthur Eubanks 4c89bcadf6 [LICM] Hoist loads with invariant.group metadata
Previously loading the vtable used in calling a virtual method in a loop
was not hoisted out of the loop. This fixes that.

canSinkOrHoistInst() itself doesn't check that the load operands are
loop invariant, callers also check that separately.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D99784
2021-04-08 21:57:37 -07:00
Philip Reames db357891f0 Infer dereferenceability from malloc and friends
Hookup TLI when inferring object size from allocation calls. This allows the analysis to prove dereferenceability for known allocation functions (such as malloc/new/etc) in addition to those marked explicitly with the allocsize attribute.

This is a follow up to 0129cd5 now that the bug fixed by e2c6621e6 is resolved.

As noted in the test, this relies on being able to prove that there is no free between allocation and context (e.g. hoist location). At the moment, this is handled conservatively. I'm working strengthening out ability to reason about no-free regions separately.

Differential Revision: https://reviews.llvm.org/D99737
2021-04-01 11:33:35 -07:00
Philip Reames e2c6621e63 [deref-at-point] restrict inference of dereferenceability based on allocsize attribute
Support deriving dereferenceability facts from allocation sites with known object sizes while correctly accounting for any possibly frees between allocation and use site. (At the moment, we're conservative and only allowing it in functions where we know we can't free.)

This is part of the work on deref-at-point semantics. I'm making the change unconditional as the miscompile in this case is way too easy to trip by accident, and the optimization was only recently added (by me).

There will be a follow up patch wiring through TLI since that should now be doable without introducing widespread miscompiles.

Differential Revision: https://reviews.llvm.org/D95815
2021-04-01 08:34:40 -07:00
Stefan Gränitz c42c67ad60 Re-apply "[lli] Make -jit-kind=orc the default JIT engine"
MCJIT served well as the default JIT engine in lli for a long time, but the code is getting old and maintenance efforts don't seem to be in sight. In the meantime Orc became mature enough to fill that gap. The newly added greddy mode is very similar to the execution model of MCJIT. It should work as a drop-in replacement for common JIT tasks.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D98931
2021-03-30 12:08:26 +02:00
Philip Reames 88d0f47b4f [test] Add test for hoisting to custom allocation function using allocsize
The first is currently demonstrating a miscompile.
2021-03-25 14:31:51 -07:00
Stefan Gränitz 581adb4f1a Temporarily revert "[lli] Make -jit-kind=orc the default JIT engine"
This reverts commit eaee4f2696.
2021-03-23 12:01:30 +01:00
Stefan Gränitz eaee4f2696 [lli] Make -jit-kind=orc the default JIT engine
MCJIT served well as the default JIT engine in lli for a long time, but the code is getting old and maintenance efforts don't seem to be in sight. In the meantime Orc became mature enough to fill that gap. The newly added greddy mode is very similar to the execution model of MCJIT. It should work as a drop-in replacement for common JIT tasks.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D98931
2021-03-23 10:22:34 +01:00
Philip Reames f24175fcb9 Autogen some tests for ease of update 2021-03-22 11:06:29 -07:00
Philip Reames 7c7f4676cd [LICM] Fix a crash when sinking instructions w/token operands
It is not legal to form a phi node with token type. The generic LCSSA construction code handles this correctly - by not forming LCSSA for such cases - but the adhoc fixup implementation in LICM did not.

This was noticed in the context of PR49607, but can be demonstrated on ToT with the tweaked test case. This is not specific to gc.relocate btw, it also applies to usage of the preallocated family of intrinsics as well.

Differential Revision: https://reviews.llvm.org/D98728
2021-03-17 11:18:46 -07:00
Serguei Katkov cfe8f8e0f0 Revert "Mark gc.relocate and gc.result as readnone"
As readnone function they become movable and LICM can hoist them
out of a loop. As a result in LCSSA form phi node of type token
is created. No one is ready that GCRelocate first operand is phi node
but expects to be token.

GVN test were also updated, it seems it does not do what is expected.
Test for LICM is also added.

This reverts commit f352463ade.
2021-03-12 16:59:17 +07:00
Nikita Popov 403da6a69a Reapply [LICM] Make promotion faster
Relative to the previous implementation, this always uses
aliasesUnknownInst() instead of aliasesPointer() to correctly
handle atomics. The added test case was previously miscompiled.

-----

Even when MemorySSA-based LICM is used, an AST is still populated
for scalar promotion. As the AST has quadratic complexity, a lot
of time is spent in this step despite the existing access count
limit. This patch optimizes the identification of promotable stores.

The idea here is pretty simple: We're only interested in must-alias
mod sets of loop invariant pointers. As such, only populate the AST
with loop-invariant loads and stores (anything else is definitely
not promotable) and then discard any sets which alias with any of
the remaining, definitely non-promotable accesses.

If we promoted something, check whether this has made some other
accesses loop invariant and thus possible promotion candidates.

This is much faster in practice, because we need to perform AA
queries for O(NumPromotable^2 + NumPromotable*NumNonPromotable)
instead of O(NumTotal^2), and NumPromotable tends to be small.
Additionally, promotable accesses have loop invariant pointers,
for which AA is cheaper.

This has a signicant positive compile-time impact. We save ~1.8%
geomean on CTMark at O3, with 6% on lencod in particular and 25%
on individual files.

Conceptually, this change is NFC, but may not be so in practice,
because the AST is only an approximation, and can produce
different results depending on the order in which accesses are
added. However, there is at least no impact on the number of promotions
(licm.NumPromoted) in test-suite O3 configuration with this change.

Differential Revision: https://reviews.llvm.org/D89264
2021-03-11 10:50:28 +01:00
Xun Li 03f668613c [LICM][Coroutine] Don't sink stores from loops with coro.suspend instructions
See pr46990(https://bugs.llvm.org/show_bug.cgi?id=46990). LICM should not sink store instructions to loop exit blocks which cross coro.suspend intrinsics. This breaks semantic of coro.suspend intrinsic which return to caller directly. Also this leads to use-after-free if the coroutine is freed before control returns to the caller in multithread environment.

This patch disable promotion by check whether loop contains coro.suspend intrinsics.
This is a resubmit of D86190.
Disabling LICM for loops with coroutine suspension is a better option not only for correctness purpose but also for performance purpose.
In most cases LICM sinks memory operations. In the case of coroutine, sinking memory operation out of the loop does not improve performance since coroutien needs to get data from the frame anyway. In fact LICM would hurt coroutine performance since it adds more entries to the frame.

Differential Revision: https://reviews.llvm.org/D96928
2021-03-03 15:21:57 -08:00
Philip Reames 73ef96c49c [tests] highlight cornercase w/deref hoisting from D95815
The main point of committing this early is to have a negative test in tree.  Nothing fails in the current tests if we implement this (currently unsound) optimization.
2021-02-01 13:32:39 -08:00
Florian Hahn e6d758de82
[InferAttrs] Mark some library functions as willreturn.
This patch marks some library functions as willreturn. On the first pass, I
excluded most functions that interact with streams/the filesystem.

Along with willreturn, it also adds nounwind to a set of math functions.
There probably are a few additional attributes we can add for those, but
that should be done separately.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D94684
2021-01-18 13:40:21 +00:00
Roman Lebedev 49dac4aca0
[SimplifyCFG] MergeBlockIntoPredecessor() already knows how to preserve DomTree
... so just ensure that we pass DomTreeUpdater it into it.

Fixes DomTree preservation for a large number of tests,
all of which are marked as such so that they do not regress.
2020-12-17 01:03:49 +03:00
Mircea Trofin f9a27df16b [FileCheck] Enforce --allow-unused-prefixes=false for llvm/test/Transforms
Explicitly opt-out llvm/test/Transforms/Attributor.

Verified by flipping the default value of allow-unused-prefixes and
observing that none of the failures were under llvm/test/Transforms.

Differential Revision: https://reviews.llvm.org/D92404
2020-12-09 08:51:38 -08:00
Jamie Schmeiser 7f6360cdc6 Reland: Expand existing loopsink testing to also test loopsinking using new pass manager and fix LICM bug.
Summary:
Expand existing loopsink testing to also test loopsinking using new pass
manager.  Enable memoryssa for loopsink with new pass manager.  This
combination exposed a bug that was previously fixed for loopsink
without memoryssa.  When sinking an instruction into a loop, the source
block may not be part of the loop but still needs to be checked for
pointer invalidation.  This is the fix for bugzilla #39695 (PR 54659)
expanded to also work with memoryssa.

Respond to review comments.  Enable Memory SSA in legacy Loop Sink pass
under EnableMSSALoopDependency option control.  Update tests accordingly.

Respond to review comments.  Add options controlling whether memoryssa is
used for loop sink, defaulting to off.  Expand testing based on these
options.

Respond to review comments.  Properly indicated preserved analyses.

This relanding addresses a compile-time performance problem by moving
test for profile data earlier to avoid unnecessary computations.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: asbirlea (Alina Sbirlea)
Differential Revision: https://reviews.llvm.org/D90249
2020-11-20 10:26:33 -05:00
Jamie Schmeiser cff479b145 Revert "Revert "Revert "Expand existing loopsink testing to also test loopsinking using new pass manager and fix LICM bug."""
This reverts commit e29292969b.

This apparently causes a regression in compile time (ie, it slows down).
2020-11-18 16:07:16 -05:00
Jamie Schmeiser e29292969b Revert "Revert "Expand existing loopsink testing to also test loopsinking using new pass manager and fix LICM bug.""
This reverts commit 562addba65.

Reverted change too quickly, the failing test cases passed on the next build.
So reverting revert (to include the changes).
2020-11-18 15:33:02 -05:00
Jamie Schmeiser 562addba65 Revert "Expand existing loopsink testing to also test loopsinking using new pass manager and fix LICM bug."
This reverts commit d4ba28bddc.
2020-11-18 15:17:53 -05:00
Jamie Schmeiser d4ba28bddc Expand existing loopsink testing to also test loopsinking using new pass manager and fix LICM bug.
Summary:
Expand existing loopsink testing to also test loopsinking using new pass
manager.  Enable memoryssa for loopsink with new pass manager.  This
combination exposed a bug that was previously fixed for loopsink
without memoryssa.  When sinking an instruction into a loop, the source
block may not be part of the loop but still needs to be checked for
pointer invalidation.  This is the fix for bugzilla #39695 (PR 54659)
expanded to also work with memoryssa.

Respond to review comments.  Enable Memory SSA in legacy Loop Sink pass
under EnableMSSALoopDependency option control.  Update tests accordingly.

Respond to review comments.  Add options controlling whether memoryssa is
used for loop sink, defaulting to off.  Expand testing based on these
options.

Respond to review comments.  Properly indicated preserved analyses.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: asbirlea (Alina Sbirlea)
Differential Revision: https://reviews.llvm.org/D90249
2020-11-18 14:08:42 -05:00
Atmn Patel 04a0896487 Revert "[LoopDeletion] Allows deletion of possibly infinite side-effect free loops"
This reverts commit 0b17c6e447. This patch
causes a compile-time error in SCEV.
2020-11-07 00:32:12 -05:00
Atmn Patel 0b17c6e447 [LoopDeletion] Allows deletion of possibly infinite side-effect free loops
From C11 and C++11 onwards, a forward-progress requirement has been
introduced for both languages. In the case of C, loops with non-constant
conditionals that do not have any observable side-effects (as defined by
6.8.5p6) can be assumed by the implementation to terminate, and in the
case of C++, this assumption extends to all functions. The clang
frontend will emit the `mustprogress` function attribute for C++
functions (D86233, D85393, D86841) and emit the loop metadata
`llvm.loop.mustprogress` for every loop in C11 or later that has a
non-constant conditional.

This patch modifies LoopDeletion so that only loops with
the `llvm.loop.mustprogress` metadata or loops contained in functions
that are required to make progress (`mustprogress` or `willreturn`) are
checked for observable side-effects. If these loops do not have an
observable side-effect, then we delete them.

Loops without observable side-effects that do not satisfy the above
conditions will not be deleted.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86844
2020-11-06 22:06:58 -05:00
Quentin Colombet a585228027 Prevent LICM and machineLICM from hoisting convergent operations
Results of convergent operations are implicitly affected by the
enclosing control flows and should not be hoisted out of arbitrary
loops.

Patch by Xiaoqing Wu <xiaoqing_wu@apple.com>

Differential Revision: https://reviews.llvm.org/D90361
2020-11-06 10:26:39 -08:00
Eli Friedman 4600e21051 [AArch64][SVE] Drop "argmemonly" from gather/scatter with vector base.
The intrinsics don't have any pointer arguments, so "argmemonly" makes
optimizations think they don't write to memory at all.

Differential Revision: https://reviews.llvm.org/D88186
2020-09-25 16:01:05 -07:00
Vedant Kumar dfc5a9eb57 [Instruction] Add dropLocation and updateLocationAfterHoist helpers
Introduce a helper which can be used to update the debug location of an
Instruction after the instruction is hoisted. This can be used to safely
drop a source location as recommended by the docs.

For more context, see the discussion in https://reviews.llvm.org/D60913.

Differential Revision: https://reviews.llvm.org/D85670
2020-09-24 15:00:04 -07:00
Arthur Eubanks 6700b9de16 [NewPM][MSSA] Fix failures under NPM due to -enable-mssa-loop-dependency
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D88128
2020-09-23 15:17:43 -07:00
Arthur Eubanks d6746ecb73 [test][NewPM] Fix update-scev.ll under NPM 2020-09-22 19:26:30 -07:00
Arthur Eubanks 5249e6f248 [LoopSimplifyCFG][NewPM] Rename simplify-cfg -> loop-simplifycfg
This matches the legacy PM name and makes all tests in
Transforms/LoopSimplifyCFG pass under NPM.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D87948
2020-09-21 08:27:19 -07:00
Wenlei He 2c391a5a14 [LICM] Make Loop ICM profile aware again
D65060 was reverted because it introduced non-determinism by using BFI counts from already freed blocks. The parent of this revision fixes that by using a VH callback on blocks to prevent this from happening and makes sure BFI data is passed correctly in LoopStandardAnalysisResults.

This re-introduces the previous optimization of using BFI data to prevent LICM from hoisting/sinking if the instruction will end up moving to a colder block.

Internally at Facebook this change results in a ~7% win in a CPU related metric in one of our big services by preventing hoisting cold code into a hot pre-header like the added test case demonstrates.

Testing:
ninja check

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87551
2020-09-15 17:21:58 -07:00
David Sherwood 69cccb3189 [SVE] Fix isLoadInvariantInLoop for scalable vectors
I've amended the isLoadInvariantInLoop function to bail out for
scalable vectors for now since the invariant.start intrinsic is only
ever generated by the clang frontend for thread locals or struct
and class constructors, neither of which support sizeless types.
In addition, the intrinsic itself does not currently support the
concept of a scaled size, which makes it impossible to compare
the sizes of different scalable objects, e.g. <vscale x 32 x i8>
and <vscale x 16 x i8>.

Added new tests here:

  Transforms/LICM/AArch64/sve-load-hoist.ll
  Transforms/LICM/hoisting.ll

Differential Revision: https://reviews.llvm.org/D87227
2020-09-15 08:30:19 +01:00
Vedant Kumar 30c1633386 Revert "[Instruction] Add updateLocationAfterHoist helper"
This reverts commit 4a646ca9e2.

This is causing some bots to fail with "!dbg attachment points at wrong
subprogram for function", like:

http://lab.llvm.org:8011/builders/sanitizer-windows/builds/67958/steps/stage%201%20check/logs/stdio
2020-08-11 14:54:09 -07:00
Vedant Kumar 4a646ca9e2 [Instruction] Add updateLocationAfterHoist helper
Introduce a helper on Instruction which can be used to update the debug
location after hoisting.

Use this in GVN and LICM, where we were mistakenly introducing new line
0 locations after hoisting (the docs recommend dropping the location in
this case).

For more context, see the discussion in https://reviews.llvm.org/D60913.

Differential Revision: https://reviews.llvm.org/D85670
2020-08-11 14:05:20 -07:00
Arthur Eubanks d0acd97c68 [NewPM][LoopUnswitch] Pin loop-unswitch to legacy PM or use simple-loop-unswitch
As mentioned in
http://lists.llvm.org/pipermail/llvm-dev/2020-July/143395.html,
loop-unswitch has not been ported to the NPM. Instead people are using
simple-loop-unswitch.

Pin all tests in Transforms/LoopUnswitch to legacy PM and replace all
other uses of loop-unswitch with simple-loop-unswitch.

One test that didn't fit into the above was
2014-06-21-congruent-constant.ll which seems to only pass with
loop-unswitch. That is also pinned to legacy PM.

Now all tests containing "-loop-unswitch" anywhere in the test succeed with
NPM turned on by default.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D85360
2020-08-06 10:56:00 -07:00
Arthur Eubanks 47acbcf09a [tbaa] Rename type-based-aa -> tbaa
For consistency with legacy pass name.
Helps with 37 instances of "unknown pass name 'tbaa'" in check-llvm under NPM.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D84967
2020-07-30 19:51:35 -07:00
Yuanfang Chen 8224c5047e For some tests targeting SystemZ, -march=z13 ---> -mcpu=z13
z13 is not a target. It is a CPU.
2020-07-29 19:18:01 -07:00
Arthur Eubanks 9bb6ce78be Rename scoped-noalias -> scoped-noalias-aa
Summary: To match NewPM name. Also the new name is clearer and more consistent.

Subscribers: jvesely, nhaehnle, hiraditya, asbirlea, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D84542
2020-07-24 12:14:27 -07:00
Tim Northover 5165b2b5fd AArch64+ARM: make LLVM consider system registers volatile.
Some of the system registers readable on AArch64 and ARM platforms
return different values with each read (for example a timer counter),
these shouldn't be hoisted outside loops or otherwise interfered with,
but the normal @llvm.read_register intrinsic is only considered to read
memory.

This introduces a separate @llvm.read_volatile_register intrinsic and
maps all system-registers on ARM platforms to use it for the
__builtin_arm_rsr calls. Registers declared with asm("r9") or similar
are unaffected.
2020-07-15 09:47:36 +01:00
Fangrui Song 4cd19a6e15 [BasicAA] Rename -disable-basicaa to -disable-basic-aa to be consistent with the canonical name "basic-aa" 2020-06-26 20:55:44 -07:00
Fangrui Song f31811f2dc [BasicAA] Rename deprecated -basicaa to -basic-aa
Follow-up to D82607
Revert an accidental change (empty.ll) of D82683
2020-06-26 20:41:37 -07:00
Tyker b7338fb1a6 [AssumeBundles] add cannonicalisation to the assume builder
Summary:
this reduces significantly the number of assumes generated without aftecting too much
the information that is preserved. this improves the compile-time cost
of enable-knowledge-retention significantly.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79650
2020-06-19 10:32:26 +02:00
Tyker d7deef1206 Revert "[AssumeBundles] add cannonicalisation to the assume builder"
This reverts commit 90c50cad19.
2020-06-16 14:34:55 +02:00
Tyker 90c50cad19 [AssumeBundles] add cannonicalisation to the assume builder
Summary:
this reduces significantly the number of assumes generated without aftecting too much
the information that is preserved. this improves the compile-time cost
of enable-knowledge-retention significantly.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79650
2020-06-16 13:12:35 +02:00
Serge Pavlov 4d20e31f73 [FPEnv] Intrinsic llvm.roundeven
This intrinsic implements IEEE-754 operation roundToIntegralTiesToEven,
and performs rounding to the nearest integer value, rounding halfway
cases to even. The intrinsic represents the missed case of IEEE-754
rounding operations and now llvm provides full support of the rounding
operations defined by the standard.

Differential Revision: https://reviews.llvm.org/D75670
2020-05-26 19:24:58 +07:00
Eli Friedman f26bdb539e Make Value::getPointerAlignment() return an Align, not a MaybeAlign.
If we don't know anything about the alignment of a pointer, Align(1) is
still correct: all pointers are at least 1-byte aligned.

Included in this patch is a bugfix for an issue discovered during this
cleanup: pointers with "dereferenceable" attributes/metadata were
assumed to be aligned according to the type of the pointer.  This
wasn't intentional, as far as I can tell, so Loads.cpp was fixed to
stop making this assumption. Frontends may need to be updated.  I
updated clang's handling of C++ references, and added a release note for
this.

Differential Revision: https://reviews.llvm.org/D80072
2020-05-20 16:37:20 -07:00
Davide Italiano da52aa2c33 [LICM] When promoting loads to the preheader, drop the location.
It's really almost going to be misleading, see the example in
https://bugs.llvm.org/show_bug.cgi?id=45820

Maybe at some point we can do something fancier, but at least
this will fix a bug where we step on dead code while debugging.
2020-05-14 17:05:23 -07:00
Eli Friedman 4532a50899 Infer alignment of unmarked loads in IR/bitcode parsing.
For IR generated by a compiler, this is really simple: you just take the
datalayout from the beginning of the file, and apply it to all the IR
later in the file. For optimization testcases that don't care about the
datalayout, this is also really simple: we just use the default
datalayout.

The complexity here comes from the fact that some LLVM tools allow
overriding the datalayout: some tools have an explicit flag for this,
some tools will infer a datalayout based on the code generation target.
Supporting this properly required plumbing through a bunch of new
machinery: we want to allow overriding the datalayout after the
datalayout is parsed from the file, but before we use any information
from it. Therefore, IR/bitcode parsing now has a callback to allow tools
to compute the datalayout at the appropriate time.

Not sure if I covered all the LLVM tools that want to use the callback.
(clang? lli? Misc IR manipulation tools like llvm-link?). But this is at
least enough for all the LLVM regression tests, and IR without a
datalayout is not something frontends should generate.

This change had some sort of weird effects for certain CodeGen
regression tests: if the datalayout is overridden with a datalayout with
a different program or stack address space, we now parse IR based on the
overridden datalayout, instead of the one written in the file (or the
default one, if none is specified). This broke a few AVR tests, and one
AMDGPU test.

Outside the CodeGen tests I mentioned, the test changes are all just
fixing CHECK lines and moving around datalayout lines in weird places.

Differential Revision: https://reviews.llvm.org/D78403
2020-05-14 13:03:50 -07:00
Tyker 5957e058e4 [AssumeBundles] Remove non-determinisme from assume builder
Summary:
The assume builder was non-deterministic when working on unamed values.
this patch fixes this.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78616
2020-05-10 21:18:33 +02:00
Sam Parker f35ccfa2af [NFC] Update tests
Run the update script on a couple of tests.
2020-05-05 15:28:40 +01:00
Davide Italiano 5f87415efc [LICM] Try to merge debug locations when sinking.
The current strategy LICM uses when sinking for debuginfo is
that of picking the debug location of one of the uses.
This causes stepping to be wrong sometimes, see, e.g. PR45523.

This patch introduces a generalization of getMergedLocation(),
that operates on a vector of locations instead of two, and try
to merge all them together, and use the new API in LICM.

<rdar://problem/61750950>
2020-04-15 12:29:34 -07:00
Tyker c35194b800 [AssumeBundles] preserve information in LICM
Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77407
2020-04-14 12:48:14 +02:00
Juneyoung Lee 9f1f244d3c [LICM] Allow freeze to hoist/sink out of a loop
Summary: This patch allows LICM to hoist/sink freeze instructions out of a loop.

Reviewers: reames, fhahn, efriedma

Reviewed By: reames

Subscribers: jfb, lebedev.ri, hiraditya, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75400
2020-03-03 12:29:39 +09:00
Philip Reames 14845b2c45 Revert "[LICM] Support hosting of dynamic allocas out of loops"
This reverts commit 8d22100f66.

There was a functional regression reported (https://bugs.llvm.org/show_bug.cgi?id=44996).  I'm not actually sure the patch is wrong, but I don't have time to investigate currently, and this line of work isn't something I'm likely to get back to quickly.
2020-02-25 09:05:31 -08:00
Bill Wendling 2fe457690d Filter callbr insts from critical edge splitting
Similarly to how splitting predecessors with an indirectbr isn't handled
in the generic way, we also shouldn't split callbrs, for similar
reasons.
2020-02-20 16:24:42 -08:00