Copy-paste P9 insns were added back in 2016,
however, looks like the opcodes has changed in ISA3.1.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D97416
This patch adds a new metadata node, DIArgList, which contains a list of SSA
values. This node is in many ways similar in function to the existing
ValueAsMetadata node, with the difference being that it tracks a list instead of
a single value. Internally, it uses ValueAsMetadata to track the individual
values, but there is also a reasonable amount of DIArgList-specific
value-tracking logic on top of that. Similar to ValueAsMetadata, it is a special
case in parsing and printing due to the fact that it requires a function state
(as it may reference function-local values).
This patch should not result in any immediate functional change; it allows for
DIArgLists to be parsed and printed, but debug variable intrinsics do not yet
recognize them as a valid argument (outside of parsing).
Differential Revision: https://reviews.llvm.org/D88175
Reduction updates should be masked, just like the load and stores.
Note that alternatively, we could use the fact that masked values are
zero of += updates and mask invariants to get this working but that
would not work for *= updates. Masking the update itself is cleanest.
This change also replaces the constant mask with a broadcast of "true"
since this constant folds much better for various folding patterns.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D98000
and __kmpc_end_masked. The "master" construct is deprecated. Changed
proc-bind keyword from "master" to "primary". Use of both master
construct and master as proc-bind keyword is still allowed, but
deprecated.
Remove references to "master" in comments and strings, and replace
with "primary" or "primary thread". Function names and variables were
not touched, nor were references to deprecated master construct. These
can be updated over time. No new code should refer to master.
Add diagnostic tests with fir-opt for the diagnostics emitted by the ops verifier
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D97996
A few files in libc++abi make use of libc++ headers and a few of those use threading primitives provided by libc++. Since libc++ has multiple threading APIs it may be necessary to override auto-detection.
This patch adds the LIBCXXABI_HAS_WIN32_THREAD_API which does roughly the same as LIBCXXABI_HAS_PTHREAD_API and the similarly named LIBCXX_HAS_WIN32_THREAD_API from libc++. Instead of using autodetection it will force the use of win32 threads instead of pthreads in headers included from libc++.
Without this patch, libc++abi may depend on pthreads if present on the users build environment, even if win32 threading was selected for libc++.
Differential revision: https://reviews.llvm.org/D98021
Previously, lld/mac only ad-hoc codesigned executables on arm64.
Matches ld64 behavior. Part of PR49443. Fixes 14 of 17 failures when running
check-llvm with lld as host linker on an M1 MBP.
Differential Revision: https://reviews.llvm.org/D97994
Some BPF programs compiled on s390 fail to load, because s390
arch-specific linux headers contain float and double types. At the
moment there is no BTF_KIND for floats and doubles, so the release
version of LLVM ends up emitting type id 0 for them, which the
in-kernel verifier does not accept.
Introduce support for such types to libbpf by representing them using
the new BTF_KIND_FLOAT.
Reviewed By: yonghong-song
Differential Revision: https://reviews.llvm.org/D83289
We have no way to reason about the bool returned by try_emplace, so we
simply ignore any std::move()s that happen in a try_emplace argument.
A lot of the time in this situation, the code will be checking the
bool and doing something else if it turns out the value wasn't moved
into the map, and this has been causing false positives so far.
I don't currently have any intentions of handling "maybe move" functions
more generally.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D98034
A bug was introduced when adding -munsafe-fp-atomics.
By default it should be off.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D97967
Like nvptx and some other targets, -mconstructor-aliases does not work well with amdgpu,
therefore we disable it in the same approach.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D97959
`mix` is subtly different from `clamp`: in the overloads where the
last argument is a scalar, the second argument should be a gentype for
`mix`.
As scalars can be implicitly converted to vectors, this cannot be
caught in the Sema test. Hence adding a CodeGen test, where we can
verify the types using the mangled name.
Previously if we couldn't run the clang-format command
for some reason, you'd get an unhelpful error message:
```
OSError: [Errno 2] No such file or directory
```
Which doesn't tell you what was happening to cause this.
Catch the error and add the command we were attempting to run:
```
RuntimeError: Failed to run "<...>/clang-food <...>" - No such file or directory"
RuntimeError: Failed to run "<...>/clang-format <...>" - Permission denied"
```
Reviewed By: krasimir
Differential Revision: https://reviews.llvm.org/D98032
Rewrites test to use correct architecture triple; fixes incorrect
reference in SourceLevelDebugging doc; simplifies `spillReg` behaviour
so as to not be dependent on changes elsewhere in the patch stack.
This reverts commit d2000b45d0.
This patch uses the errno python library to print out the correct error messages instead of hardcoding the error message per platform.
Reviewed By: jhenderson, ASDenysPetrov
Differential Revision: https://reviews.llvm.org/D97472
Same as other memory instructions, ds instructions add latency even if
exec is zero. Jumping over them if exec=0 is cheaper than executing
them.
With this change, the branch instruction that skips over a basic block
if exec=0 is not removed when the block contains a ds instruction.
Differential Revision: https://reviews.llvm.org/D97922
This patch fixes LLD to allow element sections for tables whose number
is nonzero. We also add a test for linking multiple tables, showing
that nonzero table numbers for the indirect function table,
user-declared imported tables, and local user table definitions work.
Differential Revision: https://reviews.llvm.org/D92321
The new test llvm/test/Transforms/CodeGenPrepare/remove-assume-block.ll
breaks on non-X86 machines. Change it to look like the existing test
llvm/test/Transforms/CodeGenPrepare/X86/delete-assume-dead-code.ll
to fix it.
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D97952
With reference types, tables can have non-zero table numbers. This
commit adds support for element sections against these tables.
Differential Revision: https://reviews.llvm.org/D97923
Clang exposes an interface for extending the PCM/PCH file format: `ModuleFileExtension`.
Clang itself has only a single implementation of the interface: `TestModuleFileExtension` that can be instantiated via the `-ftest-module-file_extension=` command line argument (and is stored in `FrontendOptions::ModuleFileExtensions`).
Clients of the Clang library can extend the PCM/PCH file format by pushing an instance of their extension class to the `FrontendOptions::ModuleFileExtensions` vector.
When generating the `-ftest-module-file_extension=` command line argument from `FrontendOptions`, a downcast is used to distinguish between the Clang's testing extension and other (client) extensions.
This functionality is enabled by LLVM-style RTTI. However, this style of RTTI is hard to extend, as it requires patching Clang (adding new case to the `ModuleFileExtensionKind` enum).
This patch switches to the LLVM RTTI for open class hierarchies, which allows libClang users (e.g. Swift) to create implementations of `ModuleFileExtension` without patching Clang. (Documentation of the feature: https://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html#rtti-for-open-class-hierarchies)
Reviewed By: artemcm
Differential Revision: https://reviews.llvm.org/D97702
Recommit bf5a582650. Depends on
4c8fb7ddd6 which was reverted.
RegBankSelect creates zext and trunc when it selects banks for uniform i1.
Add zext_trunc_fold from generic combiner to post RegBankSelect combiner.
Differential Revision: https://reviews.llvm.org/D95432
There are certain loops like this below:
for (int i = 0; i < n; i++) {
a[i] = b[i] + 1;
*inv = a[i];
}
that can only be vectorised if we are able to extract the last lane of the
vectorised form of 'a[i]'. For fixed width vectors this already works since
we know at compile time what the final lane is, however for scalable vectors
this is a different story. This patch adds support for extracting the last
lane from a scalable vector using a runtime determined lane value. I have
added support to VPIteration for runtime-determined lanes that still permit
the caching of values. I did this by introducing a new class called VPLane,
which describes the lane we're dealing with and provides interfaces to get
both the compile-time known lane and the runtime determined value. Whilst
doing this work I couldn't find any explicit tests for extracting the last
lane values of fixed width vectors so I added tests for both scalable and
fixed width vectors.
Differential Revision: https://reviews.llvm.org/D95139
This patch fixes failure of the `CodeGen/aix-ignore-xcoff-visibility.cpp` test with command line round-trip.
The absence of '-fvisibility' implies '-mignore-xcoff-visibility'.
The problem is that when '-fvisibility default' is passed to -cc1, it isn't being generated. (This adheres to the principle that generation doesn't produce arguments with default values.)
However, that caused '-mignore-xcoff-visibility' to be implied in the generated command line (without '-fvisibility'), while it wasn't implied in the original command line (with '-fvisibility').
This patch fixes that by always generating '-fvisibility' and explains the situation in comment.
(The '-mginore-xcoff-visibility' option was added in D87451).
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D97552
Canonicalize the iter_args of an scf::ForOp that involve a tensor_load and
for which only the last loop iteration is actually visible outside of the
loop. The canonicalization looks for a pattern such as:
```
%t0 = ... : tensor_type
%0 = scf.for ... iter_args(%bb0 : %t0) -> (tensor_type) {
...
// %m is either tensor_to_memref(%bb00) or defined above the loop
%m... : memref_type
... // uses of %m with potential inplace updates
%new_tensor = tensor_load %m : memref_type
...
scf.yield %new_tensor : tensor_type
}
```
`%bb0` may have either 0 or 1 use. If it has 1 use it must be exactly a
`%m = tensor_to_memref %bb0` op that feeds into the yielded `tensor_load`
op.
If no aliasing write of `%new_tensor` occurs between tensor_load and yield
then the value %0 visible outside of the loop is the last `tensor_load`
produced in the loop.
For now, we approximate the absence of aliasing by only supporting the case
when the tensor_load is the operation immediately preceding the yield.
The canonicalization rewrites the pattern as:
```
// %m is either a tensor_to_memref or defined above
%m... : memref_type
scf.for ... { // no iter_args
... // uses of %m with potential inplace updates
}
%0 = tensor_load %m : memref_type
```
Differential revision: https://reviews.llvm.org/D97953
As pointed out in D96244, "Module" is already pretty overloaded to refer
to clang and llvm modules. (And clangd deals directly with the former).
FeatureModule is a bit of a mouthful but it's pretty self-descriptive.
I think it might be better than "Component" which doesn't really capture
the "common interface" aspect - it's IMO confusing to refer to
"components" but exclude CDB for example.
Differential Revision: https://reviews.llvm.org/D97950
The code was using the standard isalnum function which doesn't handle
values outside the non-ascii range. Switching to using llvm::isAlnum
instead ensures we don't provoke undefined behaviour, which can in some
cases result in crashes.
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D97663
The test was showing that when --strip-unneeded is specified for an
executable, all the symbols are stripped. However, the set of symbols
used in the test would be stripped by --strip-unneeded for an ET_REL
object too. Fix this by adding additional symbols that aren't normally
stripped by --strip-unneeded.
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D97664
Opening a path like \\server (without a trailing share name and
path) produces this error, while opening e.g. \\server\share
(for a nonexistent server/share) produces ERROR_BAD_NETPATH (which
already is mapped).
This happens in some testcases (in fs.op.proximate); as proximate()
calls weakly_canonical() on the inputs, weakly_canonical() checks
whether the path exists or not. When the error code wasn't recognized
(it mapped to errc::invalid_argument), the stat operation wasn't
conclusive and weakly_canonical() errored out. With the proper error
code mapping, this isn't considered an error, just a nonexistent
path, and weakly_canonical() can proceed.
This roughly matches what MS STL does - it doesn't have
ERROR_BAD_PATHNAME in its error code mapping table, but it
checks for this error code specifically in the return of their
correspondence of the stat function.
Differential Revision: https://reviews.llvm.org/D97619
One ASan test currently `XPASS`es on Solaris:
AddressSanitizer-i386-sunos :: TestCases/Posix/unpoison-alternate-stack.cpp
It was originally `XFAIL`ed in D88501 <https://reviews.llvm.org/D88501>
because `longjmp` from a signal handled is highly unportable, warned
against in XPG7, and was not supported by Solaris `libc` at the time.
However, since then support has been added for some cases including the
current one, so the `XFAIL` can go.
Tested on `amd64-pc-solaris2.11` and `x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D97933
One ASan test currently `XPASS`es on Solaris:
AddressSanitizer-i386-sunos :: TestCases/Posix/no_asan_gen_globals.c
It was originally `XFAIL`ed in D88218 <https://reviews.llvm.org/D88218>
because Solaris `ld`, unlike GNU `ld`, doesn't strip local labels. Since
then, the integrated assembler has stopped emitting those local labels, so
the difference becomes moot and the `XFAIL` can go.
Tested on `amd64-pc-solaris2.11` and `x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D97932
This pass runs in any situations but we skip it when it is not O0 and the
function doesn't have optnone attribute. With -O0, the def of shape to amx
intrinsics is near the amx intrinsics code. We are not able to find a
point which post-dominate all the shape and dominate all amx intrinsics.
To decouple the dependency of the shape, we transform amx intrinsics
to scalar operation, so that compiling doesn't fail. In long term, we
should improve fast register allocation to allocate amx register.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D93594