When building a 64-bit big endian PowerPC Linux kernel with a 64-bit
little endian PowerPC target, the 32-bit vDSO errors:
```
$ make ARCH=powerpc CC=clang CROSS_COMPILE=powerpc64le-linux-gnu- \
pseries_defconfig arch/powerpc/kernel/vdso32/
ld.lld: error: arch/powerpc/kernel/vdso32/sigtramp.o is incompatible with elf32-powerpc
ld.lld: error: arch/powerpc/kernel/vdso32/gettimeofday.o is incompatible with elf32-powerpc
ld.lld: error: arch/powerpc/kernel/vdso32/datapage.o is incompatible with elf32-powerpc
ld.lld: error: arch/powerpc/kernel/vdso32/cacheflush.o is incompatible with elf32-powerpc
ld.lld: error: arch/powerpc/kernel/vdso32/note.o is incompatible with elf32-powerpc
ld.lld: error: arch/powerpc/kernel/vdso32/getcpu.o is incompatible with elf32-powerpc
ld.lld: error: arch/powerpc/kernel/vdso32/vgettimeofday.o is incompatible with elf32-powerpc
...
```
This happens because the endian information is missing from the call to
the assembler, even though it was explicitly passed to clang. See the
below example.
```
$ echo | clang --target=powerpc64le-linux-gnu \
--prefix=/usr/bin/powerpc64le-linux-gnu- \
-no-integrated-as -m32 -mbig-endian -### -x c -c -
".../clang-12" "-cc1" "-triple" "powerpc-unknown-linux-gnu" ...
...
"/usr/bin/powerpc64le-linux-gnu-as" "-a32" "-mppc" "-many" "-o" "-.o" "/tmp/--e69e28.s"
```
clang sets the right target with -m32 and -mbig-endian but -mbig-endian
does not make it to the assembler, resulting in a 32-bit little endian
binary. This differs from the little endian targets, which always pass
-mlittle-endian.
```
$ echo | clang --target=powerpc64-linux-gnu \
--prefix=/usr/bin/powerpc64-linux-gnu- \
-no-integrated-as -m32 -mlittle-endian -### -x c -c -
".../clang-12" "-cc1" "-triple" "powerpcle-unknown-linux-gnu" ...
...
"/usr/bin/powerpc64-linux-gnu-as" "-a32" "-mppc" "-mlittle-endian" "-many" "-o" "-.o" "/tmp/--405dbd.s"
```
Do the same thing for the big endian targets so that there is no more
error. This matches GCC's behavior, where -mbig and -mlittle are always
passed along to GNU as.
```
$ echo | powerpc64-linux-gcc -### -x c -c -
...
.../powerpc64-linux/bin/as -a64 -mpower4 -many -mbig -o -.o /tmp/ccVn7NAm.s
...
$ echo | powerpc64le-linux-gcc -### -x c -c -
...
.../powerpc64le-linux/bin/as -a64 -mpower8 -many -mlittle -o -.o /tmp/ccPN9ato.s
...
```
Reviewed By: nickdesaulniers, MaskRay
Differential Revision: https://reviews.llvm.org/D94442
For now `elf_common.c` is taken as a common part included into
different plugin implementations directly via
`#include "../../common/elf_common.c"`, which is not a best practice. Since it
is simple enough such that we don't need to create a real library for it, we just
take it as a interface library so that other targets can link it directly. Another
advantage of this method is, we don't need to add the folder into header search
path which can potentially pollute the search path.
VE and AMD platforms have not been tested because I don't have target machines.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D94443
Functions that are renamed under -funique-internal-linkage-names have their debug linkage name updated as well.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D93747
This ensures the memref base + indices expression is well-formed
Reviewed By: ThomasRaoux, ftynse
Differential Revision: https://reviews.llvm.org/D94441
Original patch by @rogfer01.
This patch adds ISel patterns for the above operations to the
corresponding vector/vector and vector/scalar RVV instructions, as well
as extra patterns to match operand-swapped scalar/vector vfrsub and
vfrdiv.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D94408
literals.
A literal interpretation of the standard wording allows this, but it was
never intended that string literal operator templates would be used for
anything other than user-defined string literals.
It was previously a generated header. It can easily converted to a
generated header if required in future.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D94445
Before this patch there was generic mapping from vector_extract
to G_EXTRACT_VECTOR_ELT added in SelectionDAGCompat.td. That
mapping is now replaced by a mapping from extractelt instead.
The reasoning is that vector_extract is marked as deprecated,
so it is assumed that a majority of targets will use extractelt
and not vector_extract (and that the long term solution for all
targets would be to use extractelt).
Targets like AArch64 that still use vector_extract can add an
additional mapping from the deprecated vector_extract as target
specific tablegen definitions. Such a mapping is added for AArch64
in this patch to avoid breaking tests.
When adding the extractelt => G_EXTRACT_VECTOR_ELT mapping we
triggered some new code paths in GlobalISelEmitter, ending up in
an assert when trying to import a pattern containing EXTRACT_SUBREG
for ARM. Therefore this patch also adds a "failedImport" warning
for that situation (instead of hitting the assert).
Differential Revision: https://reviews.llvm.org/D93416
This is a more basic pattern that we should handle before trying to solve:
https://llvm.org/PR48640
There might be a better way to think about this because the pre-condition
that I came up with (number of sign bits in the compare constant) misses a
potential transform for each of ugt and ult as commented on in the test file.
Tried to model this is in Alive:
https://rise4fun.com/Alive/juX1
...but I couldn't get the ComputeNumSignBits() pre-condition to work as
expected, so replaced with leading 0/1 preconditions instead.
Name: ugt
Pre: countLeadingZeros(C2) <= C1 && countLeadingOnes(C2) <= C1
%a = ashr %x, C1
%r = icmp ugt i8 %a, C2
=>
%r = icmp slt i8 %x, 0
Name: ult
Pre: countLeadingZeros(C2) <= C1 && countLeadingOnes(C2) <= C1
%a = ashr %x, C1
%r = icmp ult i4 %a, C2
=>
%r = icmp sgt i4 %x, -1
Also approximated in Alive2:
https://alive2.llvm.org/ce/z/u5hCczhttps://alive2.llvm.org/ce/z/__szVL
Differential Revision: https://reviews.llvm.org/D94014
This reverts commit df86f15f0c.
The gcc-5 build was broken by this change:
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp:1275:77: required from here
/usr/include/c++/5/ext/new_allocator.h:120:4: error: no matching function for call to 'std::pair<const std::__cxx11::basic_string<char>, {anonymous}::TCParser::RegisteredAttr>::pair(llvm::StringRef&, {anonymous}::TCParser::RegisteredAttr'
* Registers a small set of sample dialects.
* NFC with respect to existing C-API symbols but some headers have been moved down a level to the Dialect/ sub-directory.
* Adds an additional entry point per dialect that is needed for dynamic discovery/loading.
* See discussion: https://llvm.discourse.group/t/dialects-and-the-c-api/2306/16
Differential Revision: https://reviews.llvm.org/D94370
* We've got significant missing features in order to use most of these effectively (i.e. custom builders, region-based builders).
* We presently also lack a mechanism for actually registering these dialects but they can be use with contexts that allow unregistered dialects for further prototyping.
Differential Revision: https://reviews.llvm.org/D94368
This patch finishes addressing unused prefixes under CodeGen: 2
remaining tests fixed, and then undo-ing the lit.local.cfg changes under
various subdirs and moving the policy under CodeGen.
Differential Revision: https://reviews.llvm.org/D94430
The type tablegen backend now has enough support to represent these types well enough, so we can now move them to be declaratively defined.
Differential Revision: https://reviews.llvm.org/D94275
This allows for specifying additional get/getChecked methods that should be generated on the type, and acts similarly to how OpBuilders work. TypeBuilders have two additional components though:
* InferredContextParam
- Bit indicating that the context parameter of a get method is inferred from one of the builder parameters
* checkedBody
- A code block representing the body of the equivalent getChecked method.
Differential Revision: https://reviews.llvm.org/D94274
This removes the need for OpDefinitionsGen to use raw tablegen API, and will also
simplify adding builders to TypeDefs as well.
Differential Revision: https://reviews.llvm.org/D94273
Reorder the AMDGPUUage description of the memory model code sequences
for volatile so clear that it applies independent of the nontemporal
setting.
Differential Revision: https://reviews.llvm.org/D94358
This adds `// clang-format off` in the auto-generated file to avoid lint warnings.
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D94410
Original patch by @rogfer01.
All ordered comparisons except ONE are supported natively, and all
unordered comparisons except UNE are expanded into sequences involving
explicit NaN checks and mask arithmetic.
Additionally, we expand GT,OGT,GE,OGE to their swapped-operand versions, and
pattern-match those back to the "original", swapping operands once more. This
way we catch both operations and both "vf" and "fv" forms with fewer patterns.
Also add support for floating-point splat_vector, with an optimization for
splatting fpimm0.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D94242
Lowering of async dialect uses a fixed type converter and therefore does not support lowering non-standard types.
This revision adds a structural conversion so that non-standard types in `!async.value`s can be lowered to LLVM before lowering the async dialect itself.
Reviewed By: ezhulenev
Differential Revision: https://reviews.llvm.org/D94404
On z/OS, the error message "EDC5111I Permission denied." is not matched correctly in lit tests. This patch updates the check expression to match successfully.
Reviewed By: fanbo-meng
Differential Revision: https://reviews.llvm.org/D94432
Change [x86] Fix tile register spill issue was causing problems for our build
using gcc-5.4.1
The problem was caused by this line:
for (const MachineInstr &MI : make_range(MIS.begin(), MI))
where MI was previously defined as a MachineBasicBlock iterator.
Differential Revision: https://reviews.llvm.org/D94415
Summary:
Introduce a new mode of operation for -print-changed that only reports
after a pass changes the IR with all of the other messages suppressed (ie,
no initial IR and no messages about ignored, filtered or non-modifying
passes).
The option processing for -print-changed is changed to take an optional
string indicating options for print-changed. Initially, the only option
supported is quiet (as described above). This new quiet mode is specified
with -print-changed=quiet while -print-changed will continue to function
in the same way. It is intended that there will be more options in the
future.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: aeubanks (Arthur Eubanks)
Differential Revision: https://reviews.llvm.org/D92589
Please see D93747 for more context which tries to make linkage names of internal
linkage functions to be the uniqueified names. This causes a problem with gdb
because breaking using the demangled function name will not work if the new
uniqueified name cannot be demangled. The problem is the generated suffix which
is a mix of integers and letters which do not demangle. The demangler accepts
either all numbers or all letters. This patch simply converts the hash to decimal.
There is no loss of uniqueness by doing this as the precision is maintained.
The symbol names get longer by a few characters though.
Differential Revision: https://reviews.llvm.org/D94154
Remove duplicated function to check for required clauses on a directive. This was
still there from the merging of OpenACC and OpenMP common semantic checks and it can now be
removed so we use only one function.
Reviewed By: sameeranjoshi
Differential Revision: https://reviews.llvm.org/D93575
This wasn't possible before because there was no support for affine expressions
as maps. Now that this support is available, provide the mechanism for
constructing maps with a layout and inspecting it.
Rework the `get` method on MemRefType in Python to avoid needing an explicit
memory space or layout map. Remove the `get_num_maps`, it is too low-level,
using the length of the now-avaiable pseudo-list of layout maps is more
pythonic.
Depends On D94297
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D94302
Now that the bindings for AffineExpr have been added, add more bindings for
constructing and inspecting AffineMap that consists of AffineExprs.
Depends On D94225
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D94297
This adds the Python bindings for AffineExpr and a couple of utility functions
to the C API. AffineExpr is a top-level context-owned object and is modeled
similarly to attributes and types. It is required, e.g., to build layout maps
of the built-in memref type.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D94225
VOP3 and VOP DPP subroutines to generate input
operands and asm strings were essentially copy
pasted several times. They are deduplicated to
reduce the maintenance burden and allow faster
development.
Reviewed By: dp
Differential Revision: https://reviews.llvm.org/D94102
Change-Id: I76225eed3c33239d9573351e0c8a0abfad0146ea
The result cannot be widened, unfortunately, because widening vNi1
would depend on the context in which it appears (i.e. the type alone
is not sufficient to tell if it needs to be widened).
Introduce a function attribute 'enforce_tcb' that prevents the function
from calling other functions without the same attribute. This allows
isolating code that's considered to be somehow privileged so that it could not
use its privileges to exhibit arbitrary behavior.
Introduce an on-by-default warning '-Wtcb-enforcement' that warns
about violations of the above rule.
Introduce a function attribute 'enforce_tcb_leaf' that suppresses
the new warning within the function it is attached to. Such leaf functions
may implement common functionality between the trusted and the untrusted code
but they require extra careful audit with respect to their capabilities.
Fixes after a revert in 419ef38a50293c58078f830517f5e305068dbee6:
Fix a test.
Add workaround for GCC bug (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67274).
Attribute the patch appropriately!
Differential Revision: https://reviews.llvm.org/D91898
Now that we flush the local value map for every instruction, we don't
need any extra flushes for specific cases. Also, LastFlushPoint is
not used for anything. Follow-ups to #c161665 (D91734).
This reapplies #3fd39d3.
Differential Revision: https://reviews.llvm.org/D92338
Fixes PR48681: after LTO, lto.tmp may reference a libcall symbol not in an IR
symbol table of any bitcode file. If such a symbol is defined in an archive
matched by a --exclude-libs, we don't correctly localize the symbol.
Add another `excludeLibs` after `compileBitcodeFiles` to localize such libcall
symbols. Unfortunately we have keep the existing one for D43126.
Using VER_NDX_LOCAL is an implementation detail of `--exclude-libs`, it does not
necessarily tie to the "localize" behavior. `local:` patterns in a version
script can be omitted.
The `symbol ... has undefined version ...` error should not be exempted.
Ideally we should error as GNU ld does. https://issuetracker.google.com/issues/73020933
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D94280
When fusing tensor_reshape ops with generic/indexed_Generic op, new
linalg.init_tensor operations were created for the `outs` of the fused
op. While correct (technically) it is better to just reshape the
original `outs` operands and rely on canonicalization of init_tensor
-> tensor_reshape to achieve the same effect.
Differential Revision: https://reviews.llvm.org/D93774
This allow more accurate modeling of the side effects and allow dead code
elimination to remove dead transfer ops.
Differential Revision: https://reviews.llvm.org/D94318
If vpermf128/vpermi128 is acting on 2 similar 'inlane' ops, then try to perform the vpermf128 first which will allow us to merge the ops.
This will help us fix one of the regressions in D56387