This adds a new option to control AllowSpeculation added in D119965 when
using `-passes=...`.
This allows reproducing #54023 using opt.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D121944
Skip phi nodes in the preheader. They may not be considered loop
invariant by the assertion below.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D121010
This extract a common isNotVisibleOnUnwind() helper into
AliasAnalysis, which handles allocas, byval arguments and noalias
calls. After D116998 this could also handle sret arguments. We
have similar logic in DSE and MemCpyOpt, which will be switched
to use this helper as well.
The noalias call case is a bit different from the others, because
it also requires that the object is not captured. The caller is
responsible for doing the appropriate check.
Differential Revision: https://reviews.llvm.org/D117000
In D115311, we're looking to modify clang to emit i constraints rather
than X constraints for callbr's indirect destinations. Prior to doing
so, update all of the existing tests in llvm/ to match.
Reviewed By: void, jyknight
Differential Revision: https://reviews.llvm.org/D115410
When determining whether the memory is local to the function (and
we can thus introduce spurious writes without thread-safety issues),
check for a noalias call rather than the hardcoded list of memory
allocation functions. Noalias calls are the more general way to
determine allocation functions, as long as we're only interested
in the property that the returned value is distinct from any other
accessible memory.
Differential Revision: https://reviews.llvm.org/D116728
This reverts commit 640beb38e7.
That commit caused performance degradtion in Quicksilver test QS:sGPU and a functional test failure in (rocPRIM rocprim.device_segmented_radix_sort).
Reverting until we have a better solution to s_cselect_b64 codegen cleanup
Change-Id: Ibf8e397df94001f248fba609f072088a46abae08
Reviewed By: kzhuravl
Differential Revision: https://reviews.llvm.org/D115960
Change-Id: Id169459ce4dfffa857d5645a0af50b0063ce1105
This reverts change 2c391a5/D87551. As noted in the llvm-dev thread "LICM as canonical form" sent earlier today, introducing this was a major design change made without sufficient cause.
A profile driven LICM is not an unreasonable design, it simply is not what we have. Switching to such a model requires a lot more work than just this patch, and broad aggeement that is the right direction for the optimizer as a whole.
Worth noting is that all the tests included in the reverted changed are probably handled if we allow running unconstrained LICM, and later run LoopSink. As such, we have no public examples which motivate a profit based hoisting approach.
When doing load/store promotion within LICM, if we
cannot prove that it is safe to sink the store we won't
hoist the load, even though we can prove the load could
be dereferenced and moved outside the loop. This patch
implements the load promotion by moving it in the loop
preheader by inserting proper PHI in the loop. The store
is kept as is in the loop. By doing this, we avoid doing
the load from a memory location in each iteration.
Please consider this small example:
loop {
var = *ptr;
if (var) break;
*ptr= var + 1;
}
After this patch, it will be:
var0 = *ptr;
loop {
var1 = phi (var0, var2);
if (var1) break;
var2 = var1 + 1;
*ptr = var2;
}
This addresses some problems from [0].
[0] https://bugs.llvm.org/show_bug.cgi?id=51193
Differential revision: https://reviews.llvm.org/D113289
When doing load/store promotion within LICM, if we
cannot prove that it is safe to sink the store we won't
hoist the load, even though we can prove the load could
be dereferenced and moved outside the loop. This patch
implements the load promotion by moving it in the loop
preheader by inserting proper PHI in the loop. The store
is kept as is in the loop. By doing this, we avoid doing
the load from a memory location in each iteration.
Please consider this small example:
loop {
var = *ptr;
if (var) break;
*ptr= var + 1;
}
After this patch, it will be:
var0 = *ptr;
loop {
var1 = phi (var0, var2);
if (var1) break;
var2 = var1 + 1;
*ptr = var2;
}
This addresses some problems from [0].
[0] https://bugs.llvm.org/show_bug.cgi?id=51193
Differential revision: https://reviews.llvm.org/D113289
When we check if a load is loop invariant by finding a dominating
invariant.start call, we strip bitcasts until we get to an i8* Value,
and look for an invariant.start use of the i8* Value.
We may accidentally end up at an i8 global and look at a global's uses,
which we shouldn't do in a loop pass. Although we could make this
logic work with globals, that's not currently intended.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D111098
MSSA-based LICM has been enabled by default for a few years now.
This drops the old AST-based implementation. Using loop(licm) will
result in a fatal error, the use of loop-mssa(licm) is required
(or just licm, which defaults to loop-mssa).
Note that the core canSinkOrHoistInst() logic has to retain AST
support for now, because it is shared with LoopSink.
Differential Revision: https://reviews.llvm.org/D108244
This was a diagnostic option used to demonstrate a weakness in
the AST-based LICM implementation. This problem does not exist
in the MSSA-based LICM implementation, which has been enabled
for a long time now. As such, this option is no longer relevant.
This option has been enabled by default for quite a while now.
The practical impact of removing the option is that MSSA use
cannot be disabled in default pipelines (both LPM and NPM) and
in manual LPM invocations. NPM can still choose to enable/disable
MSSA using loop vs loop-mssa.
The next step will be to require MSSA for LICM and drop the
AST-based implementation entirely.
Differential Revision: https://reviews.llvm.org/D108075
This is enabled by default. Drop explicit uses in preparation for
removing the option.
Also drop RUN lines that are now the same (typically modulo a
-verify-memoryssa option).
Currently, LNICM pass does not support sinking instructions out of loop nest.
This patch enables LNICM to sink down as many instructions to the exit block of outermost loop as possible.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D107219
When hoisting/moving calls to locations, we strip unknown metadata. Such calls are usually marked `speculatable`, i.e. they are guaranteed to not cause undefined behaviour when run anywhere. So, we should strip attributes that can cause immediate undefined behaviour if those attributes are not valid in the context where the call is moved to.
This patch introduces such an API and uses it in relevant passes. See
updated tests.
Fix for PR50744.
Reviewed By: nikic, jdoerfert, lebedev.ri
Differential Revision: https://reviews.llvm.org/D104641
Proposed alternative to D105338.
This is ugly, but short-term I think it's the best way forward: first,
let's formalize the hacks into a coherent model. Then we can consider
extensions of that model (we could have different flavors of volatile
with different rules).
Differential Revision: https://reviews.llvm.org/D106309
This adjusts mayHaveSideEffect() to return true for !willReturn()
instructions. Just like other side-effects, non-willreturn calls
(aka "divergence") cannot be removed and cannot be reordered relative
to other side effects. This fixes a number of bugs where
non-willreturn calls are either incorrectly dropped or moved. In
particular, it also fixes the last open problem in
https://bugs.llvm.org/show_bug.cgi?id=50511.
I performed a cursory review of all current mayHaveSideEffect()
uses, which convinced me that these are indeed the desired default
semantics. Places that do not want to consider non-willreturn as a
sideeffect generally do not want mayHaveSideEffect() semantics at
all. I identified two such cases, which are addressed by D106591
and D106742. Finally, there is a use in SCEV for which we don't
really have an appropriate API right now -- what it wants is
basically "would this be considered forward progress". I've just
spelled out the previous semantics there.
Differential Revision: https://reviews.llvm.org/D106749
This patch adds a new pass called LNICM which is a LoopNest version of LICM and a test case to show how LNICM works.
Basically, LNICM only hoists invariants out of loop nest (not a loop) to keep/make perfect loop nest. This enables later optimizations that require perfect loop nest.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D104180
There's a potential change in dereferenceability attribute semantics in the nearish future. See llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree" and D99100 for context.
This change simply adds appropriate attributes to tests to keep transform logic exercised under both old and new/proposed semantics. Note that for many of these cases, O3 would infer exactly these attributes on the test IR.
This change handles the idiomatic pattern of a dereferenceable object being passed to a call which can not free that memory. There's a couple other tests which need more one-off attention, they'll be handled in another change.
Reapply with fixes for clang tests.
-----
This is a simple enum attribute. Test changes are because enum
attributes are sorted before type attributes, so mustprogress is
now in a different position.
This can be seen as a follow up to commit 0ee439b705,
that changed the second argument of __powidf2, __powisf2 and
__powitf2 in compiler-rt from si_int to int. That was to align with
how those runtimes are defined in libgcc.
One thing that seem to have been missing in that patch was to make
sure that the rest of LLVM also handle that the argument now depends
on the size of int (not using the si_int machine mode for 32-bit).
When using __builtin_powi for a target with 16-bit int clang crashed.
And when emitting libcalls to those rtlib functions, typically when
lowering @llvm.powi), the backend would always prepare the exponent
argument as an i32 which caused miscompiles when the rtlib was
compiled with 16-bit int.
The solution used here is to use an overloaded type for the second
argument in @llvm.powi. This way clang can use the "correct" type
when lowering __builtin_powi, and then later when emitting the libcall
it is assumed that the type used in @llvm.powi matches the rtlib
function.
One thing that needed some extra attention was that when vectorizing
calls several passes did not support that several arguments could
be overloaded in the intrinsics. This patch allows overload of a
scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with
an entry for powi.
Differential Revision: https://reviews.llvm.org/D99439
The MaybePromotable set keeps track of loads/stores for which
promotion was not attempted yet. Normally, any load/stores that
are promoted in the current iteration will be removed from this
set, because they naturally MustAlias with the promoted value.
However, if the source program has UB with metadata claiming that
a store is NoAlias, while it is actually MustAlias, and multiple
different pointers are promoted in the same iteration, it can
happen that a store is removed that is still in the MaybePromotable
set, causing a use-after-free.
While this could be fixed by explicitly invalidating values in
MaybePromotable in the LoopPromoter, I'm going with the more
radical option of dropping the set entirely here and check all
load/stores on each promotion iteration. As promotion, and especially
repeated promotion, are quite rare, this doesn't seem to have any
impact on compile-time.
Fixes https://bugs.llvm.org/show_bug.cgi?id=50367.
This appears to miscompile google benchmark's GetCacheSizesFromKVFS()
when compiling with -fstrict-vtable-pointers.
Runnable reproducer: https://godbolt.org/z/f9ovKqTzb
The "f.fail()" crashes with BUS error, it is compiled into testb,
and the adress it is testing is non-sensical.
This reverts commit 4c89bcadf6.
During store promotion, we check whether the pointer was captured
to exclude potential reads from other threads. However, we're only
interested in captures before or inside the loop. Check this using
PointerMayBeCapturedBefore against the loop header.
Differential Revision: https://reviews.llvm.org/D100706