On some platforms/compiler combinations, it appears the output is
slightly different. Update the test to use a regex, as is done at other
places in the new-pm-*default.ll tests to address buildbot failures.
This patch adjusts the LTO pipeline in the new PM to run GlobalsAA
before LICM to match the legacy PM.
This fixes a regression where the new PM failed to vectorize loops that
require hoisting/sinking by LICM depending on GlobalsAA info.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D102345
Split off from D102345 to commit this separately from other changes in
the patch. This aligns the behavior of the new PM with the legacy PM
for LTO, with respect to running LICM.
Together with the remaining changes in D102345, this fixes new PM
regressions where we fail to vectorize loops that are vectorized with
the legacy PM.
959eec1fdd changed the message
to show the local username with "$user" but this is not always set.
Some systems will have USER/USERNAME/LOGNAME, so just use "whoami"
instead.
There are two reasons this shouldn't be restricted to Power8 and up:
1. For XL compatibility
2. Because clang will expand comparison operators to these intrinsics*
*Without this patch, the following causes a selection error:
int test(vector signed long a, vector signed long b) {
return a < b;
}
This patch provides the handling for the intrinsics in the back
end and removes the Power8 guards from the predicate functions
(vec_{all|any}_{eq|ne|gt|ge|lt|le}).
We already apply loop-guards when computing the maximum with unitary
steps. This extends the code to also do so when dealing with non-unitary
steps.
This allows us to infer a tighter maximum in some cases.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D102267
Original commit message:
In http://lists.llvm.org/pipermail/llvm-dev/2020-July/143257.html we have
mentioned our plans to make some of the incremental compilation facilities
available in llvm mainline.
This patch proposes a minimal version of a repl, clang-repl, which enables
interpreter-like interaction for C++. For instance:
./bin/clang-repl
clang-repl> int i = 42;
clang-repl> extern "C" int printf(const char*,...);
clang-repl> auto r1 = printf("i=%d\n", i);
i=42
clang-repl> quit
The patch allows very limited functionality, for example, it crashes on invalid
C++. The design of the proposed patch follows closely the design of cling. The
idea is to gather feedback and gradually evolve both clang-repl and cling to
what the community agrees upon.
The IncrementalParser class is responsible for driving the clang parser and
codegen and allows the compiler infrastructure to process more than one input.
Every input adds to the “ever-growing” translation unit. That model is enabled
by an IncrementalAction which prevents teardown when HandleTranslationUnit.
The IncrementalExecutor class hides some of the underlying implementation
details of the concrete JIT infrastructure. It exposes the minimal set of
functionality required by our incremental compiler/interpreter.
The Transaction class keeps track of the AST and the LLVM IR for each
incremental input. That tracking information will be later used to implement
error recovery.
The Interpreter class orchestrates the IncrementalParser and the
IncrementalExecutor to model interpreter-like behavior. It provides the public
API which can be used (in future) when using the interpreter library.
Differential revision: https://reviews.llvm.org/D96033
We want it to be available in analyzes so that we could use the
CodeGen notion in middle-end passes (for example, to check if
a GC may free some particular pointer).
This is a preparatory patch that simply moves the files around.
Note: if this causes some build issues, this patch must just be reverted.
Differential Revision: https://reviews.llvm.org/D100557
Reviewed By: reames
The transferDefinedSymbol operation updates a Symbol's target block, offset,
and size. This can be convenient when you want to redefine the content of some
symbol(s) pointing at a block, while retaining the original block in the graph.
A debugserver launched x86_64 cannot control an arm64/arm64e
process on an Apple Silicon system. Warn when this situation
has happened and return an error for the most common case of
attach. I think there will be refinements to this in the
future, but start out by making it easy to spot the problem
when it happens.
rdar://76630595
Summary: The previous implementation of coro-split didn't collect values
used by dbg instructions into the spills which made a log debug info
unavailable with optimization on.
This patch tries to collect these uses which are used by dbg.values. In
this way, the debugbility of coroutine could be as powerful as normal
functions with optimization on.
To avoid enlarging the coroutine frame, this patch only collects
`dbg.value` whose value is already in the coroutine frame. This decision
may make some debug info getting unavailable. But if we are with
optimization on, the performance issue should be considered first. And
this patch would make the debugbility of coroutine to be better only
without changing the layout of the frame.
Test-plan: check-llvm
Reviewed By: aprantl, lxfind
Differential Revision: https://reviews.llvm.org/D97673
Rounding to integers requires rounding (for floating points) and clipping
to the min/max values of the destination range. Added this behavior and
updated tests appropriately.
Reviewed By: sjarus, silvas
Differential Revision: https://reviews.llvm.org/D102375
This reverts commit 44a4000181.
We are seeing build failures due to missing dependency to libSupport and
CMake Error at tools/clang/tools/clang-repl/cmake_install.cmake
file INSTALL cannot find
Summary: This patch tries to build debug info for coroutine frame in the
middle end. Although the coroutine frame is constructed and maintained by
the compiler and the programmer shouldn't care about the coroutine frame
by the design of C++20 coroutine,
a lot of programmers told me that they want to see the layout of the
coroutine frame strongly. Although C++ is designed as an abstract layer
so that the programmers shouldn't care about the actual memory in bits,
many experienced C++ programmers are familiar with assembler and
debugger to see the memory layout in fact, After I was been told they
want to see the coroutine frame about 3 times, I think it is an actual
and desired demand.
However, the debug information is constructed in the front end and
coroutine frame is constructed in the middle end. This is a natural and
clear gap. So I could only try to construct the debug information in the
middle end after coroutine frame constructed. It is unusual, but we are
in consensus that the approch is the best one.
One hard part is we need construct the name for variables since there
isn't a map from llvm variables to DIVar. Then here is the strategy this
patch uses:
- The name `__resume_fn `, `__destroy_fn` and `__coro_index ` are
constructed by the patch.
- Then the name `__promise` comes from the dbg.variable of corresponding
dbg.declare of PromiseAlloca, which shows highest priority to
construct the debug information for the member of coroutine frame.
- Then if the member is struct, we would try to get the name of the llvm
struct directly. Then replace ':' and '.' with '_' to make it
printable for debugger.
- If the member is a basic type like integer or double, we would try to
emit the corresponding name.
- Then if the member is a Pointer Type, we would add `Ptr` after
corresponding pointee type.
- Otherwise, we would name it with 'UnknownType'.
Reviewered by: lxfind, aprantl, rjmcall, dblaikie
Differential Revision: https://reviews.llvm.org/D99179
Add new type of tree node for `InsertElementInst` chain forming vector.
These instructions could be either removed, or replaced by shuffles during
vectorization and we can add this node to cost model, so naturally estimating
their cost, getting rid of `CompensateCost` tricks and reducing further work
for InstCombine. This fixes PR40522 and PR35732 in a natural way. Also this
patch is the first step towards revectorization of partially vectorization
(to fix PR42022 completely). After adding inserts to tree the next step is
to add vector instructions there (for instance, to merge `store <2 x float>`
and `store <2 x float>` to `store <4 x float>`).
Fixes PR40522 and PR35732.
Differential Revision: https://reviews.llvm.org/D98714
In http://lists.llvm.org/pipermail/llvm-dev/2020-July/143257.html we have
mentioned our plans to make some of the incremental compilation facilities
available in llvm mainline.
This patch proposes a minimal version of a repl, clang-repl, which enables
interpreter-like interaction for C++. For instance:
./bin/clang-repl
clang-repl> int i = 42;
clang-repl> extern "C" int printf(const char*,...);
clang-repl> auto r1 = printf("i=%d\n", i);
i=42
clang-repl> quit
The patch allows very limited functionality, for example, it crashes on invalid
C++. The design of the proposed patch follows closely the design of cling. The
idea is to gather feedback and gradually evolve both clang-repl and cling to
what the community agrees upon.
The IncrementalParser class is responsible for driving the clang parser and
codegen and allows the compiler infrastructure to process more than one input.
Every input adds to the “ever-growing” translation unit. That model is enabled
by an IncrementalAction which prevents teardown when HandleTranslationUnit.
The IncrementalExecutor class hides some of the underlying implementation
details of the concrete JIT infrastructure. It exposes the minimal set of
functionality required by our incremental compiler/interpreter.
The Transaction class keeps track of the AST and the LLVM IR for each
incremental input. That tracking information will be later used to implement
error recovery.
The Interpreter class orchestrates the IncrementalParser and the
IncrementalExecutor to model interpreter-like behavior. It provides the public
API which can be used (in future) when using the interpreter library.
Differential revision: https://reviews.llvm.org/D96033
Instead of an SCF for loop, these pattern generate fully unrolled loops with no temporary buffer allocations.
Differential Revision: https://reviews.llvm.org/D101981
Broadcast dimensions of a vector transfer op have no corresponding dimension in the mask vector. E.g., a 2-D TransferReadOp, where one dimension is a broadcast, can have a 1-D `mask` attribute.
This commit also adds a few additional transfer op integration tests for various combinations of broadcasts, masking, dim transposes, etc.
Differential Revision: https://reviews.llvm.org/D101745
Broadcast dimensions of a vector transfer op have no corresponding dimension in the mask vector. E.g., a 2-D TransferReadOp, where one dimension is a broadcast, can have a 1-D `mask` attribute.
This commit also adds a few additional transfer op integration tests for various combinations of broadcasts, masking, dim transposes, etc.
Differential Revision: https://reviews.llvm.org/D101745
The Mips in DW_LANG_Mips_Assembler is a vendor name not an
architecture name and in lack of a proper generic DW_LANG_assembler,
some assemblers emit DWARF using this tag. Due to a warning I recently
introduced users will now be greeted with
This version of LLDB has no plugin for the mipsassem language. Inspection of frame variables will be limited.
By renaming this to just "Assembler" this error message will make more sense.
Differential Revision: https://reviews.llvm.org/D101406
rdar://77214764
For a type-constraint in a lambda signature, this makes the lambda
contain an unexpanded pack; for requirements in a requires-expressions
it makes the requires-expression contain an unexpanded pack; otherwise
it's invalid.
properly track that it has constraints.
Previously an instantiation of a constrained generic lambda would behave
as if unconstrained because we incorrectly cached a "has no constraints"
value that we computed before the constraints from 'auto' parameters
were attached.
Previously clang would print a binary blob into the bundled file
for amdgcn. With this patch, it will instead print textual IR as
expected.
Reviewed By: JonChesterfield, ronlieb
Differential Revision: https://reviews.llvm.org/D102065
Change-Id: I10c0127ab7357787769fdf9a2edd4b3071e790a1
The bounds check that we previously had here was suitable for secondary
allocations but not for UAF on primary allocations, where it is likely
to result in false positives. Fix it by using a different bounds check
for UAF that requires the fault address to be in bounds.
Differential Revision: https://reviews.llvm.org/D102376
D85051's honeypot solution was a bit too aggressive swallowed up the
comparison types, which made comparing objects of different ordering
types ambiguous.
Depends on D101707.
Differential Revision: https://reviews.llvm.org/D101708
First set of "boilerplate" to get sparse tensor
passes available through CAPI and Python.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D102362