Besides SPMDization, other analysis and optimization for original, frontend-generated SPMD regions uses information from the AAKernelInfoFunction attribute. This fix makes sure disabling SPMDization through the corresponding option applies only to generic mode regions, which should not be SPMDized, while it leaves unaffected the attribute state of original SPMD regions.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D108001
We may use several COPY instructions to copy the needed sub-registers
during split. But the way we split the lanes during the COPYs may be
different from the subranges of the old register. This would fail when we
extend the subranges of the new register because the LaneMasks do not
match exactly between subranges of new register and old register.
Since we are bundling the COPYs, I think there is no need to further refine the
subranges of the new register based on the set of LaneMasks of the inserted COPYs.
I am not sure if there will be further breaking cases. But as the subranges of
new register are created based on the LaneMasks of the subranges of old register,
it will be highly possible we will always find an exact LaneMask match.
We can think about how to make the extendPHIKillRanges() work for
subrange mask mismatch case if we meet more such cases in the future.
The test case was from D105065 by @arsenm.
Differential Revision: https://reviews.llvm.org/D107829
A new attribute btf_tag is added. The syntax looks like
__attribute__((btf_tag(<string>)))
Users may tag a particular structure/member/function/func_parameter/variable
declaration with an arbitrary string and the intention is
that this string is passed to dwarf so it is available for
post-compilation analysis. The string will be also passed
to .BTF section if the target is BPF. For each permitted
declaration, multiple btf_tag's are allowed.
For detailed use cases, please see
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151009.html
In case that there exist redeclarations, the btf_tag attributes
will be accumulated along with different declarations, and the
last declaration will contain all attributes.
Differential Revision: https://reviews.llvm.org/D106614
For SjLj, we allocate a table to record setjmp buffer info in the entry
of each setjmp-calling function by inserting a `malloc` call, and insert
a `free` call to free the buffer before each `ret` instruction.
But this is not sufficient; we have to free the buffer before we throw.
In SjLj handling, normal functions that can possibly throw or longjmp
are wrapped with an invoke and caught within the function so they don't
end up escaping the function. But three functions throw and escape the
function:
- `__resumeException` (Emscripten library function used for Emscripten
EH)
- `emscripten_longjmp` (Emscripten library function used for Emscripten
SjLj)
- `__cxa_throw` (libc++abi function called when for C++ `throw` keyword)
The first two functions are used to rethrow the current
exception/longjmp when the caught exception/longjmp is not for the
current function. `__cxa_throw` is used for exception, and because we
consider that a function that cannot longjmp, it escapes the function
right away, before which we should free the buffer.
Currently `lsan.test_longjmp3` and `lsan.test_exceptions_longjmp3` fail
in Emscripten; this CL fixes these.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D107852
Currently, when Wasm EH is used with Emscripten SjLj, Emscripten SjLj
cannot handle `invoke` instructions - it assumes all `invoke`s have been
lowered away with Emscripten EH. But in Wasm EH they are lowered in
instruction selection, so they are still present in the IR stage. This
happens when
1. Wasm EH and Emscripten SjLj are used together
2. A function that calls `setjmp` uses exceptions, i.e., has `invoke`s
We were already erroring out with an assertion failure in this case, but
this CL makes it error out more properly with a valid error message.
Wasm EH + Wasm SjLj will not have this restrictions. (it will have
another restriction though, e.g., `setjmp` cannot be called within
`catch`. But why would anyone do that..)
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D107687
Add -cc1 flags `-fmodules-uses-lock` and `-fno-modules-uses-lock` to
allow the lock manager to be turned off when building implicit modules.
Add `-Rmodule-lock` so that we can see when it's being used.
Differential Revision: https://reviews.llvm.org/D95583
This happens in createInvocationWithCommandLine but only clangd currently passes
ShouldRecoverOnErorrs (sic).
One cause of this (with correct command) is several -arch arguments for mac
multi-arch support.
Fixes https://github.com/clangd/clangd/issues/827
Differential Revision: https://reviews.llvm.org/D107632
... to the one of signature hints.
In particular, completion items now also carry annotations, which client
code might be interested in.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D107365
This renames `compileModuleAndReadAST`, adding a `BehindLock` suffix,
and refactors it to significantly reduce nesting.
- Split out helpers `compileModuleAndReadASTImpl` and
`readASTAfterCompileModule` which have straight-line code that doesn't
worry about locks.
- Use `break` in the interesting cases of `switch` statements to reduce
nesting.
- Use early `return`s to reduce nesting.
Detangling the compile-and-read logic from the check-for-locks logic
should be a net win for readability, although I also have a side
motivation of making the locks optional in a follow-up.
No functionality change here.
Differential Revision: https://reviews.llvm.org/D95581
It's quite useful to be able to hover over an #include and see the full
path to the header file.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D107137
Only the bare name is completed, with no args.
For args to be useful we need arg names. These *are* in the tablegen but
not currently emitted in usable form, so left this as future work.
C++11, C2x, GNU, declspec, MS syntax is supported, with the appropriate
spellings of attributes suggested.
`#pragma clang attribute` is supported but not terribly useful as we
only reach completion if parens are balanced (i.e. the line is not truncated)
There's no filtering of which attributes might make sense in this
grammatical context (e.g. attached to a function). In code-completion context
this is hard to do, and will only work in few cases :-(
There's also no filtering by langopts: this is because currently the
only way of checking is to try to produce diagnostics, which requires a
valid ParsedAttr which is hard to get.
This should be fairly simple to fix but requires some tablegen changes
to expose the logic without the side-effect.
Differential Revision: https://reviews.llvm.org/D107696
On aarch64 a two instruction sequence is used to calculate a
pc-relative address, add some state to the DisassemblerLLVMC
symbolicator so it can track the necessary data across the
two instructions and compute the address being calculated.
Differential Revision: https://reviews.llvm.org/D107213
rdar://49119253
Add builtin and intrinsic for `__addex`.
This patch is part of a series of patches to provide builtins for
compatibility with the XL compiler.
Reviewed By: stefanp, nemanjai, NeHuang
Differential Revision: https://reviews.llvm.org/D107002
Add an implementation of numeric_limits for use in str_conv_utils.
It currently only supports the basic integer types, with more types
coming as needed.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D107987
Move StaticVerifierFunctionEmitter to CodeGenHelper.h so that it can be
used for both ODS and DRR.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D106636
The new ELF notes are added in clang-offload-wrapper, and llvm-readobj has to visualize them properly.
Differential Revision: https://reviews.llvm.org/D99552
Clang test CodeGen/thinlto-clang-diagnostic-handler-in-be.c fails on
some non x86 targets, e.g. hexagon. Since the test already requires x86
to be available as a target this commit forces the target to x86_64.
Reviewed By: steven_wu
Differential Revision: https://reviews.llvm.org/D107667
This is a re-try of 6de1dbbd09 which was reverted because
it missed a null check. Extra test for that failure added.
Original commit message:
This is an adaptation of D41603 and another step on the way
to canonicalizing to the intrinsic forms of min/max.
See D98152 for status.
-Add Z for the B extension subextensions.
-Don't mention I along with B or its sub extensions.
This is based on comments in D107817.
Differential Revision: https://reviews.llvm.org/D107992
https://reviews.llvm.org/D106137 added support for plugins in Flang. The
CMake configuration for plugins requires some LLVM variables that are
not available in out-of-tree builds (e.g. `LLVM_SHLIB_OUTPUT_INTDIR`).
This has caused the out-of-tree BuildBot worker to start failing:
* https://lab.llvm.org/buildbot/#/builders/175
This patch effectively disables plugins in out-of-tree builds and fixes
the configuration error. In order to support plugins in out-of-tree
builds, we would have to find a way to access the missing CMake
variables from LLVM. This could be implemented at a later time.
Differential Revision: https://reviews.llvm.org/D107973
In some binaries, built with clang/lld, libunwind crashes
with "unsupported x86_64 register" for regNum == 16:
Differential Revision: https://reviews.llvm.org/D107919
A DiffConsumer object may be reused, but we'd like to reset it before
the next use.
No functionality change intended.
Differential Revision: https://reviews.llvm.org/D107985
Using the python API to easily set up sparse kernels, this test
exhaustively builds, compilers, and runs SpMM for all annotations
on a sparse tensor, making sure every version generates the correct
result. This test also illustrates using the python API to set up
a sparse kernel and sparse compilation.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D107943
Flang uses positional arguments for `messages::say()`, such as "%1$s" which is only supported in MS Compilers with the `_*printf_p` form of the function. This uses a conditional macro to convert the existing `vsnprintf` used to the one needed in MS-World.
7 tests in D107575 rely on this change.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D107654
This reverts the revert 28c04794df.
The failing MLIR test that caused the revert should be fixed in this
version.
Also includes a PPC test fix previously in 1f87c7c478.
Since we officially don't support several older compilers now, we can
drop a lot of the markup in the test suite. This helps keep the test
suite simple and makes sure that UNSUPPORTED annotations don't rot.
This is the first patch of a series that will remove annotations for
compilers that are now unsupported.
Differential Revision: https://reviews.llvm.org/D107787
std::clock_t can be an unsigned value on some platforms like MacOS and
therefore needs a cast when initializing an std::clock_t value with -1.
Reviewed By: klausler
Differential Revision: https://reviews.llvm.org/D107972
DAGCombiner::visitStore can call GetDemandedBits which will remove
upper bits from immediates. The upper bits are important for good
materialization of negative constants on RISCV. GetDemandedBits is a
different mechanism than SimplifyDemandedBits so
TargetShrinkDemandedConstant can't block it.
As far as I know this behavior is unique to stores.
I think we can fix this in isel using a concept similar to D107658.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D107860
For unit-stride and strided load/stores we set the SEW operand of
the pseudo instruction equal the EEW in the opcode. The LMUL
of the pseudo instruction is the LMUL we want.
These instructions calculate EMUL=(EEW/SEW) * LMUL. We can use
this to avoid changing vtype if the SEW/LMUL of the previous
vtype matches the EEW/EMUL ratio we need for the instruction.
Due to how the global analysis works, we can only do this
optimization when the previous vsetvli was produced in the block
containing the store. We need to know in the first phase if the
vsetvli will be inserted so we can propagate information to
the successors in the second phase correctly. This means we can't
depend on predecessors.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D106601
Instead of using scalar size divided by 8 for segment loads, get
the alignment from clang's type system.
Make vleff match for consistency.
Also replace uses of getPointerElementType() which will be removed as part of the OpaquePtr changes.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D106738
We really shouldn't deal with a conditional branch that can be trivially
constant-folded into an unconditional branch.
Indeed, barring failure to trigger BB reprocessing, that should be true,
so let's assert as much, and hope the assertion never fires.
If it does, we have a bug to fix.
Mainly, i want to add an assertion that `SimplifyCFGOpt::simplifyCondBranch()`
doesn't get asked to deal with non-unconditional branches,
and if i do that, then said assertion fires on existing tests,
and this is what prevents it from firing.
This is a direct translation of the select folds added with
D53033 / D53036 and another step towards canonicalization
using the intrinsics (see D98152).