Somehow the test introduced in https://reviews.llvm.org/D118006 produces the expected result but running
through lli with Intel SDE activated sneaks in an error code 2 (before this commit) or an error code 10
(after this commit).
The test as is is still meaningful in that the LLVMIR generation would crash if the `elementtype` is set
improperly.
Still, this should run with lli turned on.
Inspired by a recent Discourse post on undef vs. poison usage, this
series of patches should reduce the number of undefs in LLVM tests by
around 10%.
Only undef vector operands to insertelement/shufflevector have been
handled, which are by far the most common we've got.
The switchover is split into 3 fairly arbitrary clusters to make it
slightly more manageable: vector predication, fixed-length vectors,
scalable vectors.
This revision adds enough support to allow InlineAsmOp to work properly with indirect memory constraints "*m".
These require an explicit "elementtype" TypeAttr on the operands to pass LLVM verification and need to be provided.
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D118006
Extract the division representation from equality constraints.
For example:
32*k == 16*i + j - 31 <-- k is the localVariable
expr = 16*i + j - 31, divisor = 32
k = (16*i + j - 32) floordiv 32
The dividend of the division is set to [16, 1, -32] and the divisor is set
to 32.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D117959
Currently, ARMBaseInstrInfo::getInstSizeInBytes() uses hard-coded
instruction size for some pseudo-instructions, while this
information should ideally be found in ARMInstrInfo.td,
ARMInstrThumb(2).td files (which can be accessed via MCInstrDesc). Hence,
the .td files should be updated and no hard-coded instruction sizes
should be used by getInstSizeInBytes() anymore.
Differential Revision: https://reviews.llvm.org/D118009
Currently, AArch64InstrInfo::getInstSizeInBytes() uses hard-coded
instruction size for some pseudo-instructions, while this
information should ideally be found in AArch64InstrInfo.td file (which
can be accessed via MCInstrDesc). Hence, the .td file should be updated
and no hard-coded instruction sizes should be used by
getInstSizeInBytes() anymore.
Differential Revision: https://reviews.llvm.org/D117970
This test shows a loop, whose preheader uses a SEW=64, LMUL=1 vector
operation. The loop body starts off with another SEW=64, LMUL=1 VADD
vector operation, before switching to a SEW=32, LMUL=1/2 vector store
instruction.
We can see that the VSETVLI insertion pass omits a VSETVLI before the
VADD (thinking it inherits its configuration from the preheader) but
does place a SEW=32, LMUL=1/2 VSETVLI before the store. This results in
a miscompilation as when the loop comes back around, the VADD is
incorrectly configured with SEW=32, LMUL=1/2.
It appears to be a bad load/store optimization, as replacing the vector
store with an SEW=32, LMUL=1/2 VADD does correctly insert a VSETVLI. The
issue is therefore possibly arising from canSkipVSETVLIForLoadStore.
Differential Revision: https://reviews.llvm.org/D118629
These tests appear to be causing timeouts on our silent
Thumbv7 bot: https://lab.llvm.org/staging/#/builders/162/builds/260
It is possible they would complete given enough time. value-profile-switch
seems to take a long time even on a powerful Armv8 machine.
Do "simplifyShift" and "FoldConstantArithmetic" folds for the SSHLSAT
and USHLSAT DAG nodes.
This includes folds such as:
(shlsat undef/poison, x) -> 0
(shlsat x, undef/poison) -> undef
(shlsat x, too_large_shamt) -> undef
(shlsat 0, x) -> 0
(shlsat x, 0) -> x
(shlsat c1, c2) -> c3
Differential Revision: https://reviews.llvm.org/D118603
This removes another instance of recipe execution still relying on
the cost model.
Depends on D116554.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D116656
I have updated TargetLowering::isConstTrueVal to also consider
SPLAT_VECTOR nodes with constant integer operands. This allows the
optimisation to also work for targets that support scalable vectors.
Differential Revision: https://reviews.llvm.org/D117210
In some cases StructurizeCFG inserts i1 xor instructions to invert
predicates. Add a quick loop to clean these up afterwards if we can get
away with modifying an existing compare instruction instead.
(StructurizeCFG is generally run late in the pipeline so instcombine
does not clean them up for us.)
Differential Revision: https://reviews.llvm.org/D118623
A call base can be a floating value if we talk about the instruction and
not the return value. This distinction was not made before but is
important for liveness, e.g., a call site return value might be unused
(=dead) but the call site is not.
When exporting textual IR during reduction the ShouldPreserveUseListOrder
parameter of the IR printer should be set to get predictable results.
Differential Revision: https://reviews.llvm.org/D118585
Fixes https://github.com/llvm/llvm-project/issues/52772.
This patch fixes the formatting of the code:
```
auto aaaaaaaaaaaaaaaaaaaaa = {};
auto b = g([] {
return;
});
```
which should be left as is, but before this patch was formatted to:
```
auto aaaaaaaaaaaaaaaaaaaaa = {};
auto b = g([] {
return;
});
```
Reviewed By: MyDeveloperDay, HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D115972
Fixes https://github.com/llvm/llvm-project/issues/34626.
Before, the include sorter would break the code:
```
#include <stdio.h>
#include <stdint.h> /* long
comment */
```
and change it into:
```
#include <stdint.h> /* long
#include <stdio.h>
comment */
```
This commit handles only the most basic case of a single block comment on an include line, but does not try to handle all the possible edge cases with multiple comments.
Reviewed By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D118627
To make usage easier (compared to the many reachability related AAs),
this patch introduces a helper API, `AA::isPotentiallyReachable`, which
performs all the necessary steps. It also does the "backwards"
reachability (see D106720) as that simplifies the AA a lot (backwards
queries were somewhat different from the other query resolvers), and
ensures we use cached values in every stage.
To test inter-procedural reachability in a reasonable way this patch
includes an extension to `AAPointerInfo::forallInterferingWrites`.
Basically, we can exclude writes if they cannot reach a load "during the
lifetime" of the allocation. That is, we need to go up the call graph to
determine reachability until we can determine the allocation would be
dead in the caller. This leads to new constant propagations (through
memory) in `value-simplify-pointer-info-gpu.ll`.
Note: The new code contains plenty debug output to determine how
reachability queries are resolved.
Parts extracted from D110078.
Differential Revision: https://reviews.llvm.org/D118673
D106720 introduced features that did not work properly as we could add
new queries after a fixpoint was reached and which could not be answered
by the information gathered up to the fixpoint alone.
As an alternative to D110078, which forced eager computation where we
want to continue to be lazy, this patch fixes the problem.
QueryAAs are AAs that allow lazy queries during their lifetime. They are
never fixed if they have no outstanding dependences and always run as
part of the updates in an iteration. To determine if we are done, all
query AAs are asked if they received new queries, if not, we only need
to consider updated AAs, as before. If new queries are present we go for
another iteration.
Differential Revision: https://reviews.llvm.org/D118669
This test shows how we can use alloca position and kernel+AS information
to improve reachability queries and consequently store-load forwarding.
The thirst argument passed to the @use function can be determined
statically (a constant). The others cannot and are there for
verification.
This patch implement instruction reachability for AAFunctionReachability
attribute. It is used to tell if a certain instruction can reach a function
transitively.
NOTE: I created a new commit based of D106720 and set the author back to
Kuter. Other metadata, etc. is wrong. I also addressed the
remaining review comments and fixed the unit test.
Differential Revision: https://reviews.llvm.org/D106720
We missed out on AANoRecurse in the module pass because we had no call
graph. With AAFunctionReachability we can simply ask if the function may
reach itself.
Differential Revision: https://reviews.llvm.org/D110099
genericValueTraversal can look through arguments and allow value
simplification across function boundaries. In fact, the latter already
happened unchecked. With this change we allow the user of
genericValueTraversal to opt-out of interprocedural traversal if
required. We explicitly look through arguments now which helps to do
various things, incl. the propagation of constants into OpenMP parallel
regions (on the host).
Following the discussion in D118318, mark `arith.addf/mulf` commutative.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D118600
Part 2 of 3 of unifying the assembly formats of attributes/types and operations.The last patch that introduced attribute/type formats (D111594) factored out the format lexer entirely. This patch factors out most of the format parsers such that the attribute/type and op parsers only need to implement handling for specific elements.
Certain things could be factored better (element verification, 'seen' variables) but the primary goal of factoring is so that features can be used across both assembly formats.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117971
This fixes a conceptual problem with our AAIsDead usage which conflated
call site liveness with call site return value liveness. Without the
fix tests would obviously miscompile as we make genericValueTraversal
more powerful (in a follow up). The effects on the tests are mixed but
mostly marginal. The most prominent one is the lack of `noreturn` for
functions. The reason is that we make entire blocks live at the same
time (for time reasons). Now that we actually look at the block
liveness, which we need to do, the return instructions are live and
will survive. As an example, `noreturn_async.ll` has been modified
to retain the `noreturn` even with block granularity. We could address
this easily but there is little need in practice.