Create a copy of vector-to-loops.mlir and adapt the test for
ProgressiveVectorToSCF. Fix a small bug in getExtractOp() triggered by
this test.
Differential Revision: https://reviews.llvm.org/D102388
Globals with “available_externally” linkage should never be emitted into the
object file corresponding to the LLVM module.
However, AIX system assembler default print error for undefined reference .
so AIX chose to emit the available externally symbols into .s,
so that users won't run into errors in situations like:
clang -target powerpc-ibm-aix -xc -<<<$'extern inline
__attribute__((__gnu_inline__)) void foo() {}\nvoid bar() { foo(); }' -O
-Xclang -disable-llvm-passes
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D102377
Do not rely on pass labels to detect if the pattern was already applied in the past (which allows for more some extra optimizations to avoid extra InsertOps and ExtractOps). Instead, check if these optimizations can be applied on-the-fly.
This also fixes a bug, where vector.insert and vector.extract ops sometimes disappeared in the middle of the pass because they get folded away, but the next application of the pattern expected them to be there.
Differential Revision: https://reviews.llvm.org/D102206
This commit fixes the cppcoreguidelines-pro-type-vararg test when it
runs on a Windows host, but the toolchain is targeted a non-Windows
platform.
Reviewed By: njames93
Differential Revision: https://reviews.llvm.org/D102337
On some platforms/compiler combinations, it appears the output is
slightly different. Update the test to use a regex, as is done at other
places in the new-pm-*default.ll tests to address buildbot failures.
This patch adjusts the LTO pipeline in the new PM to run GlobalsAA
before LICM to match the legacy PM.
This fixes a regression where the new PM failed to vectorize loops that
require hoisting/sinking by LICM depending on GlobalsAA info.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D102345
Split off from D102345 to commit this separately from other changes in
the patch. This aligns the behavior of the new PM with the legacy PM
for LTO, with respect to running LICM.
Together with the remaining changes in D102345, this fixes new PM
regressions where we fail to vectorize loops that are vectorized with
the legacy PM.
959eec1fdd changed the message
to show the local username with "$user" but this is not always set.
Some systems will have USER/USERNAME/LOGNAME, so just use "whoami"
instead.
There are two reasons this shouldn't be restricted to Power8 and up:
1. For XL compatibility
2. Because clang will expand comparison operators to these intrinsics*
*Without this patch, the following causes a selection error:
int test(vector signed long a, vector signed long b) {
return a < b;
}
This patch provides the handling for the intrinsics in the back
end and removes the Power8 guards from the predicate functions
(vec_{all|any}_{eq|ne|gt|ge|lt|le}).
We already apply loop-guards when computing the maximum with unitary
steps. This extends the code to also do so when dealing with non-unitary
steps.
This allows us to infer a tighter maximum in some cases.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D102267
Original commit message:
In http://lists.llvm.org/pipermail/llvm-dev/2020-July/143257.html we have
mentioned our plans to make some of the incremental compilation facilities
available in llvm mainline.
This patch proposes a minimal version of a repl, clang-repl, which enables
interpreter-like interaction for C++. For instance:
./bin/clang-repl
clang-repl> int i = 42;
clang-repl> extern "C" int printf(const char*,...);
clang-repl> auto r1 = printf("i=%d\n", i);
i=42
clang-repl> quit
The patch allows very limited functionality, for example, it crashes on invalid
C++. The design of the proposed patch follows closely the design of cling. The
idea is to gather feedback and gradually evolve both clang-repl and cling to
what the community agrees upon.
The IncrementalParser class is responsible for driving the clang parser and
codegen and allows the compiler infrastructure to process more than one input.
Every input adds to the “ever-growing” translation unit. That model is enabled
by an IncrementalAction which prevents teardown when HandleTranslationUnit.
The IncrementalExecutor class hides some of the underlying implementation
details of the concrete JIT infrastructure. It exposes the minimal set of
functionality required by our incremental compiler/interpreter.
The Transaction class keeps track of the AST and the LLVM IR for each
incremental input. That tracking information will be later used to implement
error recovery.
The Interpreter class orchestrates the IncrementalParser and the
IncrementalExecutor to model interpreter-like behavior. It provides the public
API which can be used (in future) when using the interpreter library.
Differential revision: https://reviews.llvm.org/D96033
We want it to be available in analyzes so that we could use the
CodeGen notion in middle-end passes (for example, to check if
a GC may free some particular pointer).
This is a preparatory patch that simply moves the files around.
Note: if this causes some build issues, this patch must just be reverted.
Differential Revision: https://reviews.llvm.org/D100557
Reviewed By: reames
The transferDefinedSymbol operation updates a Symbol's target block, offset,
and size. This can be convenient when you want to redefine the content of some
symbol(s) pointing at a block, while retaining the original block in the graph.
A debugserver launched x86_64 cannot control an arm64/arm64e
process on an Apple Silicon system. Warn when this situation
has happened and return an error for the most common case of
attach. I think there will be refinements to this in the
future, but start out by making it easy to spot the problem
when it happens.
rdar://76630595
Summary: The previous implementation of coro-split didn't collect values
used by dbg instructions into the spills which made a log debug info
unavailable with optimization on.
This patch tries to collect these uses which are used by dbg.values. In
this way, the debugbility of coroutine could be as powerful as normal
functions with optimization on.
To avoid enlarging the coroutine frame, this patch only collects
`dbg.value` whose value is already in the coroutine frame. This decision
may make some debug info getting unavailable. But if we are with
optimization on, the performance issue should be considered first. And
this patch would make the debugbility of coroutine to be better only
without changing the layout of the frame.
Test-plan: check-llvm
Reviewed By: aprantl, lxfind
Differential Revision: https://reviews.llvm.org/D97673
Rounding to integers requires rounding (for floating points) and clipping
to the min/max values of the destination range. Added this behavior and
updated tests appropriately.
Reviewed By: sjarus, silvas
Differential Revision: https://reviews.llvm.org/D102375
This reverts commit 44a4000181.
We are seeing build failures due to missing dependency to libSupport and
CMake Error at tools/clang/tools/clang-repl/cmake_install.cmake
file INSTALL cannot find
Summary: This patch tries to build debug info for coroutine frame in the
middle end. Although the coroutine frame is constructed and maintained by
the compiler and the programmer shouldn't care about the coroutine frame
by the design of C++20 coroutine,
a lot of programmers told me that they want to see the layout of the
coroutine frame strongly. Although C++ is designed as an abstract layer
so that the programmers shouldn't care about the actual memory in bits,
many experienced C++ programmers are familiar with assembler and
debugger to see the memory layout in fact, After I was been told they
want to see the coroutine frame about 3 times, I think it is an actual
and desired demand.
However, the debug information is constructed in the front end and
coroutine frame is constructed in the middle end. This is a natural and
clear gap. So I could only try to construct the debug information in the
middle end after coroutine frame constructed. It is unusual, but we are
in consensus that the approch is the best one.
One hard part is we need construct the name for variables since there
isn't a map from llvm variables to DIVar. Then here is the strategy this
patch uses:
- The name `__resume_fn `, `__destroy_fn` and `__coro_index ` are
constructed by the patch.
- Then the name `__promise` comes from the dbg.variable of corresponding
dbg.declare of PromiseAlloca, which shows highest priority to
construct the debug information for the member of coroutine frame.
- Then if the member is struct, we would try to get the name of the llvm
struct directly. Then replace ':' and '.' with '_' to make it
printable for debugger.
- If the member is a basic type like integer or double, we would try to
emit the corresponding name.
- Then if the member is a Pointer Type, we would add `Ptr` after
corresponding pointee type.
- Otherwise, we would name it with 'UnknownType'.
Reviewered by: lxfind, aprantl, rjmcall, dblaikie
Differential Revision: https://reviews.llvm.org/D99179
Add new type of tree node for `InsertElementInst` chain forming vector.
These instructions could be either removed, or replaced by shuffles during
vectorization and we can add this node to cost model, so naturally estimating
their cost, getting rid of `CompensateCost` tricks and reducing further work
for InstCombine. This fixes PR40522 and PR35732 in a natural way. Also this
patch is the first step towards revectorization of partially vectorization
(to fix PR42022 completely). After adding inserts to tree the next step is
to add vector instructions there (for instance, to merge `store <2 x float>`
and `store <2 x float>` to `store <4 x float>`).
Fixes PR40522 and PR35732.
Differential Revision: https://reviews.llvm.org/D98714
In http://lists.llvm.org/pipermail/llvm-dev/2020-July/143257.html we have
mentioned our plans to make some of the incremental compilation facilities
available in llvm mainline.
This patch proposes a minimal version of a repl, clang-repl, which enables
interpreter-like interaction for C++. For instance:
./bin/clang-repl
clang-repl> int i = 42;
clang-repl> extern "C" int printf(const char*,...);
clang-repl> auto r1 = printf("i=%d\n", i);
i=42
clang-repl> quit
The patch allows very limited functionality, for example, it crashes on invalid
C++. The design of the proposed patch follows closely the design of cling. The
idea is to gather feedback and gradually evolve both clang-repl and cling to
what the community agrees upon.
The IncrementalParser class is responsible for driving the clang parser and
codegen and allows the compiler infrastructure to process more than one input.
Every input adds to the “ever-growing” translation unit. That model is enabled
by an IncrementalAction which prevents teardown when HandleTranslationUnit.
The IncrementalExecutor class hides some of the underlying implementation
details of the concrete JIT infrastructure. It exposes the minimal set of
functionality required by our incremental compiler/interpreter.
The Transaction class keeps track of the AST and the LLVM IR for each
incremental input. That tracking information will be later used to implement
error recovery.
The Interpreter class orchestrates the IncrementalParser and the
IncrementalExecutor to model interpreter-like behavior. It provides the public
API which can be used (in future) when using the interpreter library.
Differential revision: https://reviews.llvm.org/D96033
Instead of an SCF for loop, these pattern generate fully unrolled loops with no temporary buffer allocations.
Differential Revision: https://reviews.llvm.org/D101981
Broadcast dimensions of a vector transfer op have no corresponding dimension in the mask vector. E.g., a 2-D TransferReadOp, where one dimension is a broadcast, can have a 1-D `mask` attribute.
This commit also adds a few additional transfer op integration tests for various combinations of broadcasts, masking, dim transposes, etc.
Differential Revision: https://reviews.llvm.org/D101745