This patch removes the -f[no-]trapping-math flags from the -cc1 command line. These flags are ignored in the command line parser and their semantics is fully handled by -ffp-exception-mode.
This patch does not remove -f[no-]trapping-math from the driver command line. The driver flags are being used and do affect compilation.
Reviewed By: dexonsmith, SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D93395
In ST mode, flat scratch instructions have neither an sgpr nor a vgpr
for the address. This lead to an assertion when inserting hard clauses.
Differential Revision: https://reviews.llvm.org/D94406
The issue was introduced in commit rG84a1120943a651184bae507fed5d648fee381ae4
and would cause a VarLoc's StackOffset to be compared with its own, instead of
the StackOffset from the other VarLoc. This patch fixes that.
Updating `ScopeTops` is something we frequently do in CFGStackify, so
this factors it out as a function. This also makes a few utility
functions templated so that they are not dependent on input vector
types and simplifies function parameters.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D94046
Fixes PR48693: --emit-relocs keeps relocation sections. --gdb-index drops
.debug_gnu_pubnames and .debug_gnu_pubtypes but not their relocation sections.
This can cause a null pointer dereference in `getOutputSectionName`.
Also delete debug-gnu-pubnames.s which is covered by gdb-index.s
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D94354
Previously, llvm/runtimes/CMakeLists.txt played two different roles:
1. host side which could used to set up the build of runtimes for
different targets in the right order;
2. target side to build the runtimes for the specified target.
This change splits llvm/runtimes/CMakeLists.txt and moves the target
side to runtimes/CMakeLists laying down the foundation for the "A vision
for building the runtimes" proposal. From the user perspective, there
shouldn't be any visible difference at the moment.
Differential Revision: https://reviews.llvm.org/D93408
Debugging app launch/attach failures can be difficult because of
all of the messages logged to the console on a darwin system;
emitting specific messages around critical API calls can make it
easier to narrow the search for the console messages related to
the failure.
<rdar://problem/67220442>
Differential revision: https://reviews.llvm.org/D94357
Memory operands store a base alignment that does not factor in
the effect of the offset on the alignment.
Previously the printing code only printed the base alignment if
it was different than the size. If there is an offset, the reader
would need to figure out the effective alignment themselves. This
has confused me before and someone else was recently confused on
IRC.
This patch prints the possibly offset adjusted alignment if it is
different than the size. And prints the base alignment if it is
different than the alignment. The MIR parser has been updated to
read basealign in addition to align.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D94344
The lifetime of `libomptarget` and its opened plugins are not aligned
and it's hard for `libomptarget` to determine when the plugins are destroyed.
As a result, some issues (see D94256 for details) occur on some platforms.
Actually, if we take target memory as target resources, same as other resources,
such as CUDA streams, in each plugin, then the memory manager should also be in
the plugin. Also considering some platforms may want to opt out the feature, it
makes sense to move the memory manager to plugin, make it a common interface, and
let plguin developers determine whether they need it. This is what this patch does.
CUDA plugin is taken as example to show how to integrate it. In this way, we can
also get a bonus that different thresholds can be set for different platforms.
Reviewed By: jdoerfert, JonChesterfield
Differential Revision: https://reviews.llvm.org/D94379
Added a utility function in Value class to print block name and use
block labels for unnamed blocks.
Changed LICM to call this function in its debug output.
Patch by Xiaoqing Wu <xiaoqing_wu@apple.com>
Differential Revision: https://reviews.llvm.org/D93577
This ensures every single terminate pad is a single BB in the form of:
```
%exn = catch $__cpp_exception
call @__clang_call_terminate(%exn)
unreachable
```
This is a preparation for HandleEHTerminatePads pass, which will be
added in a later CL and will run after CFGStackify. That pass duplicates
terminate pads with a `catch_all` instruction, and duplicating it
becomes simpler if we can ensure every terminate pad is a single BB.
Reviewed By: dschuff, tlively
Differential Revision: https://reviews.llvm.org/D94045
SX Aurora VE is an experimental target. We upstreamed many part of
ported llvm and clang. In order to continue this move, we need to
support libraries next, then we need to show the ability of llvm for
VE through test cases. As a first step for that, we need to use
crt in compiler-rt. VE has it's own crt but they are a part of
proprietary compiler. So, we want to use crt in compiler-rt as an
alternative.
This patch enables VE as a candidate of crt in compiler-rt.
Reviewed By: phosek, compnerd
Differential Revision: https://reviews.llvm.org/D92748
Passes in the new PostAllocationPasses list will run immediately after memory
allocation and address assignment for defined symbols, and before
JITLinkContext::notifyResolved is called. These passes can set up state
associated with the addresses of defined symbols before any query for these
addresses completes.
getDynOperands behavior is commonly used in a number of passes. Refactored to
use a helper function and avoid code reuse.
Differential Revision: https://reviews.llvm.org/D94340
The `-Wpointer-sign` warning text is inappropriate for describing the
incompatible pointer conversion between plain `char` and explicitly
`signed`/`unsigned` `char` (whichever plain `char` has the same range
as) and vice versa.
Specifically, in part, it reads "converts between pointers to integer
types with different sign". This patch changes that portion to read
instead as "converts between pointers to integer types where one is of
the unique plain 'char' type and the other is not" when one of the types
is plain `char`.
C17 subclause 6.5.16.1 indicates that the conversions resulting in
`-Wpointer-sign` warnings in assignment-like contexts are constraint
violations. This means that strict conformance requires a diagnostic for
the case where the message text is wrong before this patch. The lack of
an even more specialized warning group is consistent with GCC.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D93999
Define the `vfclass` IR intrinsics for the respective V instructions.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Evandro Menezes <evandro.menezes@sifive.com>
Differential Revision: https://reviews.llvm.org/D94356
The standard requires comparisons of pointers to unrelated storage to
use `std::less`. Split out some helpers that do that and update all the
code that was comparing using `<` and friends (mostly assertions).
Differential Revision: https://reviews.llvm.org/D93777
This boils down to how we deal with early-increment iterator
over function's basic blocks: not only we need to early-increment,
after that we also need to skip all the blocks
that are scheduled for removal, as per DomTreeUpdater.
Thus supporting lazy DomTreeUpdater mode,
where the domtree updates (and thus block removals)
aren't applied immediately, but are delayed
until last possible moment.
When DomTreeUpdater is in lazy update mode, the blocks
that were scheduled to be removed, won't be removed
until the updates are flushed, e.g. by asking
DomTreeUpdater for a up-to-date DomTree.
From the function's current code, it is pretty evident
that the support for the lazy mode is an afterthought,
see e.g. how we roll-back NumRemoved statistic..
So instead of considering all the unreachable blocks
as the blocks-to-be-removed, simply additionally skip
all the blocks that are already scheduled to be removed
When we are adding edges to the terminator and potentially turning it
into a switch (if it wasn't already), it is possible that the
case we're adding will share it's destination with one of the
preexisting cases, in which case there is no domtree edge to add.
Indeed, this change does not have a test coverage change.
This failure has been exposed in an existing test coverage
by a follow-up patch that switches to lazy domtreeupdater mode,
and removes domtree verification from
SimplifyCFGOpt::simplifyOnce()/SimplifyCFGOpt::run(),
IOW it does not appear feasible to add dedicated test coverage here.
BB was already always branching to EdgeBB, there is no edge to add.
Indeed, this change does not have a test coverage change.
This failure has been exposed in an existing test coverage
by a follow-up patch that switches to lazy domtreeupdater mode,
and removes domtree verification from
SimplifyCFGOpt::simplifyOnce()/SimplifyCFGOpt::run(),
IOW it does not appear feasible to add dedicated test coverage here.
SI is the terminator of BB, so the edge we are adding obviously already existed.
Indeed, this change does not have a test coverage change.
This failure has been exposed in an existing test coverage
by a follow-up patch that switches to lazy domtreeupdater mode,
and removes domtree verification from
SimplifyCFGOpt::simplifyOnce()/SimplifyCFGOpt::run(),
IOW it does not appear feasible to add dedicated test coverage here.
Based on the comments in lib/Parser/TypeParser.cpp on the
parseMemRefType and parseTensorType functions.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D94262
When building a 64-bit big endian PowerPC Linux kernel with a 64-bit
little endian PowerPC target, the 32-bit vDSO errors:
```
$ make ARCH=powerpc CC=clang CROSS_COMPILE=powerpc64le-linux-gnu- \
pseries_defconfig arch/powerpc/kernel/vdso32/
ld.lld: error: arch/powerpc/kernel/vdso32/sigtramp.o is incompatible with elf32-powerpc
ld.lld: error: arch/powerpc/kernel/vdso32/gettimeofday.o is incompatible with elf32-powerpc
ld.lld: error: arch/powerpc/kernel/vdso32/datapage.o is incompatible with elf32-powerpc
ld.lld: error: arch/powerpc/kernel/vdso32/cacheflush.o is incompatible with elf32-powerpc
ld.lld: error: arch/powerpc/kernel/vdso32/note.o is incompatible with elf32-powerpc
ld.lld: error: arch/powerpc/kernel/vdso32/getcpu.o is incompatible with elf32-powerpc
ld.lld: error: arch/powerpc/kernel/vdso32/vgettimeofday.o is incompatible with elf32-powerpc
...
```
This happens because the endian information is missing from the call to
the assembler, even though it was explicitly passed to clang. See the
below example.
```
$ echo | clang --target=powerpc64le-linux-gnu \
--prefix=/usr/bin/powerpc64le-linux-gnu- \
-no-integrated-as -m32 -mbig-endian -### -x c -c -
".../clang-12" "-cc1" "-triple" "powerpc-unknown-linux-gnu" ...
...
"/usr/bin/powerpc64le-linux-gnu-as" "-a32" "-mppc" "-many" "-o" "-.o" "/tmp/--e69e28.s"
```
clang sets the right target with -m32 and -mbig-endian but -mbig-endian
does not make it to the assembler, resulting in a 32-bit little endian
binary. This differs from the little endian targets, which always pass
-mlittle-endian.
```
$ echo | clang --target=powerpc64-linux-gnu \
--prefix=/usr/bin/powerpc64-linux-gnu- \
-no-integrated-as -m32 -mlittle-endian -### -x c -c -
".../clang-12" "-cc1" "-triple" "powerpcle-unknown-linux-gnu" ...
...
"/usr/bin/powerpc64-linux-gnu-as" "-a32" "-mppc" "-mlittle-endian" "-many" "-o" "-.o" "/tmp/--405dbd.s"
```
Do the same thing for the big endian targets so that there is no more
error. This matches GCC's behavior, where -mbig and -mlittle are always
passed along to GNU as.
```
$ echo | powerpc64-linux-gcc -### -x c -c -
...
.../powerpc64-linux/bin/as -a64 -mpower4 -many -mbig -o -.o /tmp/ccVn7NAm.s
...
$ echo | powerpc64le-linux-gcc -### -x c -c -
...
.../powerpc64le-linux/bin/as -a64 -mpower8 -many -mlittle -o -.o /tmp/ccPN9ato.s
...
```
Reviewed By: nickdesaulniers, MaskRay
Differential Revision: https://reviews.llvm.org/D94442
For now `elf_common.c` is taken as a common part included into
different plugin implementations directly via
`#include "../../common/elf_common.c"`, which is not a best practice. Since it
is simple enough such that we don't need to create a real library for it, we just
take it as a interface library so that other targets can link it directly. Another
advantage of this method is, we don't need to add the folder into header search
path which can potentially pollute the search path.
VE and AMD platforms have not been tested because I don't have target machines.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D94443
Functions that are renamed under -funique-internal-linkage-names have their debug linkage name updated as well.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D93747
This ensures the memref base + indices expression is well-formed
Reviewed By: ThomasRaoux, ftynse
Differential Revision: https://reviews.llvm.org/D94441
Original patch by @rogfer01.
This patch adds ISel patterns for the above operations to the
corresponding vector/vector and vector/scalar RVV instructions, as well
as extra patterns to match operand-swapped scalar/vector vfrsub and
vfrdiv.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D94408
literals.
A literal interpretation of the standard wording allows this, but it was
never intended that string literal operator templates would be used for
anything other than user-defined string literals.
It was previously a generated header. It can easily converted to a
generated header if required in future.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D94445
Before this patch there was generic mapping from vector_extract
to G_EXTRACT_VECTOR_ELT added in SelectionDAGCompat.td. That
mapping is now replaced by a mapping from extractelt instead.
The reasoning is that vector_extract is marked as deprecated,
so it is assumed that a majority of targets will use extractelt
and not vector_extract (and that the long term solution for all
targets would be to use extractelt).
Targets like AArch64 that still use vector_extract can add an
additional mapping from the deprecated vector_extract as target
specific tablegen definitions. Such a mapping is added for AArch64
in this patch to avoid breaking tests.
When adding the extractelt => G_EXTRACT_VECTOR_ELT mapping we
triggered some new code paths in GlobalISelEmitter, ending up in
an assert when trying to import a pattern containing EXTRACT_SUBREG
for ARM. Therefore this patch also adds a "failedImport" warning
for that situation (instead of hitting the assert).
Differential Revision: https://reviews.llvm.org/D93416
This is a more basic pattern that we should handle before trying to solve:
https://llvm.org/PR48640
There might be a better way to think about this because the pre-condition
that I came up with (number of sign bits in the compare constant) misses a
potential transform for each of ugt and ult as commented on in the test file.
Tried to model this is in Alive:
https://rise4fun.com/Alive/juX1
...but I couldn't get the ComputeNumSignBits() pre-condition to work as
expected, so replaced with leading 0/1 preconditions instead.
Name: ugt
Pre: countLeadingZeros(C2) <= C1 && countLeadingOnes(C2) <= C1
%a = ashr %x, C1
%r = icmp ugt i8 %a, C2
=>
%r = icmp slt i8 %x, 0
Name: ult
Pre: countLeadingZeros(C2) <= C1 && countLeadingOnes(C2) <= C1
%a = ashr %x, C1
%r = icmp ult i4 %a, C2
=>
%r = icmp sgt i4 %x, -1
Also approximated in Alive2:
https://alive2.llvm.org/ce/z/u5hCczhttps://alive2.llvm.org/ce/z/__szVL
Differential Revision: https://reviews.llvm.org/D94014
This reverts commit df86f15f0c.
The gcc-5 build was broken by this change:
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp:1275:77: required from here
/usr/include/c++/5/ext/new_allocator.h:120:4: error: no matching function for call to 'std::pair<const std::__cxx11::basic_string<char>, {anonymous}::TCParser::RegisteredAttr>::pair(llvm::StringRef&, {anonymous}::TCParser::RegisteredAttr'
* Registers a small set of sample dialects.
* NFC with respect to existing C-API symbols but some headers have been moved down a level to the Dialect/ sub-directory.
* Adds an additional entry point per dialect that is needed for dynamic discovery/loading.
* See discussion: https://llvm.discourse.group/t/dialects-and-the-c-api/2306/16
Differential Revision: https://reviews.llvm.org/D94370