This patch fixes LLD to allow element sections for tables whose number
is nonzero. We also add a test for linking multiple tables, showing
that nonzero table numbers for the indirect function table,
user-declared imported tables, and local user table definitions work.
Differential Revision: https://reviews.llvm.org/D92321
The new test llvm/test/Transforms/CodeGenPrepare/remove-assume-block.ll
breaks on non-X86 machines. Change it to look like the existing test
llvm/test/Transforms/CodeGenPrepare/X86/delete-assume-dead-code.ll
to fix it.
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D97952
With reference types, tables can have non-zero table numbers. This
commit adds support for element sections against these tables.
Differential Revision: https://reviews.llvm.org/D97923
Clang exposes an interface for extending the PCM/PCH file format: `ModuleFileExtension`.
Clang itself has only a single implementation of the interface: `TestModuleFileExtension` that can be instantiated via the `-ftest-module-file_extension=` command line argument (and is stored in `FrontendOptions::ModuleFileExtensions`).
Clients of the Clang library can extend the PCM/PCH file format by pushing an instance of their extension class to the `FrontendOptions::ModuleFileExtensions` vector.
When generating the `-ftest-module-file_extension=` command line argument from `FrontendOptions`, a downcast is used to distinguish between the Clang's testing extension and other (client) extensions.
This functionality is enabled by LLVM-style RTTI. However, this style of RTTI is hard to extend, as it requires patching Clang (adding new case to the `ModuleFileExtensionKind` enum).
This patch switches to the LLVM RTTI for open class hierarchies, which allows libClang users (e.g. Swift) to create implementations of `ModuleFileExtension` without patching Clang. (Documentation of the feature: https://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html#rtti-for-open-class-hierarchies)
Reviewed By: artemcm
Differential Revision: https://reviews.llvm.org/D97702
Recommit bf5a582650. Depends on
4c8fb7ddd6 which was reverted.
RegBankSelect creates zext and trunc when it selects banks for uniform i1.
Add zext_trunc_fold from generic combiner to post RegBankSelect combiner.
Differential Revision: https://reviews.llvm.org/D95432
There are certain loops like this below:
for (int i = 0; i < n; i++) {
a[i] = b[i] + 1;
*inv = a[i];
}
that can only be vectorised if we are able to extract the last lane of the
vectorised form of 'a[i]'. For fixed width vectors this already works since
we know at compile time what the final lane is, however for scalable vectors
this is a different story. This patch adds support for extracting the last
lane from a scalable vector using a runtime determined lane value. I have
added support to VPIteration for runtime-determined lanes that still permit
the caching of values. I did this by introducing a new class called VPLane,
which describes the lane we're dealing with and provides interfaces to get
both the compile-time known lane and the runtime determined value. Whilst
doing this work I couldn't find any explicit tests for extracting the last
lane values of fixed width vectors so I added tests for both scalable and
fixed width vectors.
Differential Revision: https://reviews.llvm.org/D95139
This patch fixes failure of the `CodeGen/aix-ignore-xcoff-visibility.cpp` test with command line round-trip.
The absence of '-fvisibility' implies '-mignore-xcoff-visibility'.
The problem is that when '-fvisibility default' is passed to -cc1, it isn't being generated. (This adheres to the principle that generation doesn't produce arguments with default values.)
However, that caused '-mignore-xcoff-visibility' to be implied in the generated command line (without '-fvisibility'), while it wasn't implied in the original command line (with '-fvisibility').
This patch fixes that by always generating '-fvisibility' and explains the situation in comment.
(The '-mginore-xcoff-visibility' option was added in D87451).
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D97552
Canonicalize the iter_args of an scf::ForOp that involve a tensor_load and
for which only the last loop iteration is actually visible outside of the
loop. The canonicalization looks for a pattern such as:
```
%t0 = ... : tensor_type
%0 = scf.for ... iter_args(%bb0 : %t0) -> (tensor_type) {
...
// %m is either tensor_to_memref(%bb00) or defined above the loop
%m... : memref_type
... // uses of %m with potential inplace updates
%new_tensor = tensor_load %m : memref_type
...
scf.yield %new_tensor : tensor_type
}
```
`%bb0` may have either 0 or 1 use. If it has 1 use it must be exactly a
`%m = tensor_to_memref %bb0` op that feeds into the yielded `tensor_load`
op.
If no aliasing write of `%new_tensor` occurs between tensor_load and yield
then the value %0 visible outside of the loop is the last `tensor_load`
produced in the loop.
For now, we approximate the absence of aliasing by only supporting the case
when the tensor_load is the operation immediately preceding the yield.
The canonicalization rewrites the pattern as:
```
// %m is either a tensor_to_memref or defined above
%m... : memref_type
scf.for ... { // no iter_args
... // uses of %m with potential inplace updates
}
%0 = tensor_load %m : memref_type
```
Differential revision: https://reviews.llvm.org/D97953
As pointed out in D96244, "Module" is already pretty overloaded to refer
to clang and llvm modules. (And clangd deals directly with the former).
FeatureModule is a bit of a mouthful but it's pretty self-descriptive.
I think it might be better than "Component" which doesn't really capture
the "common interface" aspect - it's IMO confusing to refer to
"components" but exclude CDB for example.
Differential Revision: https://reviews.llvm.org/D97950
The code was using the standard isalnum function which doesn't handle
values outside the non-ascii range. Switching to using llvm::isAlnum
instead ensures we don't provoke undefined behaviour, which can in some
cases result in crashes.
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D97663
The test was showing that when --strip-unneeded is specified for an
executable, all the symbols are stripped. However, the set of symbols
used in the test would be stripped by --strip-unneeded for an ET_REL
object too. Fix this by adding additional symbols that aren't normally
stripped by --strip-unneeded.
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D97664
Opening a path like \\server (without a trailing share name and
path) produces this error, while opening e.g. \\server\share
(for a nonexistent server/share) produces ERROR_BAD_NETPATH (which
already is mapped).
This happens in some testcases (in fs.op.proximate); as proximate()
calls weakly_canonical() on the inputs, weakly_canonical() checks
whether the path exists or not. When the error code wasn't recognized
(it mapped to errc::invalid_argument), the stat operation wasn't
conclusive and weakly_canonical() errored out. With the proper error
code mapping, this isn't considered an error, just a nonexistent
path, and weakly_canonical() can proceed.
This roughly matches what MS STL does - it doesn't have
ERROR_BAD_PATHNAME in its error code mapping table, but it
checks for this error code specifically in the return of their
correspondence of the stat function.
Differential Revision: https://reviews.llvm.org/D97619
One ASan test currently `XPASS`es on Solaris:
AddressSanitizer-i386-sunos :: TestCases/Posix/unpoison-alternate-stack.cpp
It was originally `XFAIL`ed in D88501 <https://reviews.llvm.org/D88501>
because `longjmp` from a signal handled is highly unportable, warned
against in XPG7, and was not supported by Solaris `libc` at the time.
However, since then support has been added for some cases including the
current one, so the `XFAIL` can go.
Tested on `amd64-pc-solaris2.11` and `x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D97933
One ASan test currently `XPASS`es on Solaris:
AddressSanitizer-i386-sunos :: TestCases/Posix/no_asan_gen_globals.c
It was originally `XFAIL`ed in D88218 <https://reviews.llvm.org/D88218>
because Solaris `ld`, unlike GNU `ld`, doesn't strip local labels. Since
then, the integrated assembler has stopped emitting those local labels, so
the difference becomes moot and the `XFAIL` can go.
Tested on `amd64-pc-solaris2.11` and `x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D97932
This pass runs in any situations but we skip it when it is not O0 and the
function doesn't have optnone attribute. With -O0, the def of shape to amx
intrinsics is near the amx intrinsics code. We are not able to find a
point which post-dominate all the shape and dominate all amx intrinsics.
To decouple the dependency of the shape, we transform amx intrinsics
to scalar operation, so that compiling doesn't fail. In long term, we
should improve fast register allocation to allocate amx register.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D93594
GCC warning:
```
In file included from /usr/include/c++/9/cassert:44,
from /home/vsts/work/1/llvm-project/llvm/include/llvm/ADT/BitVector.h:21,
from /home/vsts/work/1/llvm-project/llvm/include/llvm/Support/Program.h:17,
from /home/vsts/work/1/llvm-project/llvm/include/llvm/Support/Process.h:32,
from /home/vsts/work/1/llvm-project/llvm/lib/ExecutionEngine/JITLink/JITLinkMemoryManager.cpp:11:
/home/vsts/work/1/llvm-project/llvm/lib/ExecutionEngine/JITLink/JITLinkMemoryManager.cpp: In member function ‘virtual llvm::Expected<std::unique_ptr<llvm::jitlink::JITLinkMemoryManager::Allocation> > llvm::jitlink::InProcessMemoryManager::allocate(const llvm::jitlink::JITLinkDylib*, const SegmentsRequestMap&)’:
/home/vsts/work/1/llvm-project/llvm/lib/ExecutionEngine/JITLink/JITLinkMemoryManager.cpp:129:40: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
129 | assert(SlabRemaining.allocatedSize() >= 0 && "Mapping exceeds allocation");
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~
```
The return type of `allocatedSize()` is `size_t`, thus the expression
`SlabRemaining.allocatedSize() >= 0` always evaluate to `true`.
I'm not sure this would catch all such issues, but it would catch some.
The problem for PR49393 was that we were holding a reference to a node that
wasn't connect edto the DAG across a function that could delete unused nodes. In
this particular case we managed to try to use the deleted node while it was in
the deleted state before its memory got recycled.
It could also happen that we delete the node, something allocates a new node
which recycles the memory. Then we try to use the reference we were holding and
it is now a completely different node with different valid opcode. This patch
would not catch that.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D97969
For binary or ternary ops we call getNegatedExpression multiple
times and then compare costs. While we're doing this we need to
hold a node from the first call across the second call, but its
not yet attached to the DAG. Its possible the second call creates
an identical node and then decides it didn't need it so will try
to delete it if it has no uses. This can cause a reference to the
node we're holding further up the call stack to become invalidated.
To prevent this, we can use a HandleSDNode to artifically give
the node a use without connecting it to the DAG.
I've used a std::list of HandleSDNodes so we can create handles
only when we have a node to hold. HandleSDNode does not have
default constructor and cannot be copied or moved.
Fixes PR49393.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D97914
Initial support for using the OpenMPIRBuilder by clang to generate loops using the OpenMPIRBuilder. This initial support is intentionally limited to:
* Only the worksharing-loop directive.
* Recognizes only the nowait clause.
* No loop nests with more than one loop.
* Untested with templates, exceptions.
* Semantic checking left to the existing infrastructure.
This patch introduces a new AST node, OMPCanonicalLoop, which becomes parent of any loop that has to adheres to the restrictions as specified by the OpenMP standard. These restrictions allow OMPCanonicalLoop to provide the following additional information that depends on base language semantics:
* The distance function: How many loop iterations there will be before entering the loop nest.
* The loop variable function: Conversion from a logical iteration number to the loop variable.
These allow the OpenMPIRBuilder to act solely using logical iteration numbers without needing to be concerned with iterator semantics between calling the distance function and determining what the value of the loop variable ought to be. Any OpenMP logical should be done by the OpenMPIRBuilder such that it can be reused MLIR OpenMP dialect and thus by flang.
The distance and loop variable function are implemented using lambdas (or more exactly: CapturedStmt because lambda implementation is more interviewed with the parser). It is up to the OpenMPIRBuilder how they are called which depends on what is done with the loop. By default, these are emitted as outlined functions but we might think about emitting them inline as the OpenMPRuntime does.
For compatibility with the current OpenMP implementation, even though not necessary for the OpenMPIRBuilder, OMPCanonicalLoop can still be nested within OMPLoopDirectives' CapturedStmt. Although OMPCanonicalLoop's are not currently generated when the OpenMPIRBuilder is not enabled, these can just be skipped when not using the OpenMPIRBuilder in case we don't want to make the AST dependent on the EnableOMPBuilder setting.
Loop nests with more than one loop require support by the OpenMPIRBuilder (D93268). A simple implementation of non-rectangular loop nests would add another lambda function that returns whether a loop iteration of the rectangular overapproximation is also within its non-rectangular subset.
Reviewed By: jdenny
Differential Revision: https://reviews.llvm.org/D94973
GCC warning:
```
/llvm-project/clang-tools-extra/clangd/SemanticHighlighting.cpp: In function ‘bool clang::clangd::{anonymous}::canHighlightName(clang::DeclarationName)’:
/llvm-project/clang-tools-extra/clangd/SemanticHighlighting.cpp:64:1: warning: control reaches end of non-void function [-Wreturn-type]
64 | }
| ^
```
By implementing the method "unsigned RISCVTTIImpl::getRegisterBitWidth(bool Vector)",
fixed-length vectorization is enabled when possible. Without this method, the
"#pragma clang loop" directive is needed to enable vectorization(or the cost model
may inform LLVM that "Vectorization is possible but not beneficial").
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D97549
sample loader pass.
In https://reviews.llvm.org/rG5fb65c02ca5e91e7e1a00e0efdb8edc899f3e4b9,
to prevent repeated indirect call promotion for the same indirect call
and the same target, we used zero-count value profile to indicate an
indirect call has been promoted for a certain target. We removed
PromotedInsns cache in the same patch. However, there was a problem in
that patch described below, and that problem led me to add PromotedInsns
back as a mitigation in
https://reviews.llvm.org/rG4ffad1fb489f691825d6c7d78e1626de142f26cf.
When we get value profile from metadata by calling getValueProfDataFromInst,
we need to specify the maximum possible number of values we expect to read.
We uses MaxNumPromotions in the last patch so the maximum number of value
information extracted from metadata is MaxNumPromotions. If we have many
values including zero-count values when we write the metadata, some of them
will be dropped when we read them because we only read MaxNumPromotions
values. It will allow repeated indirect call promotion again. We need to
make sure if there are values indicating promoted targets, those values need
to be saved in metadata with higher priority than other values.
The patch fixed that problem. We change to use -1 to represent the count
of a promoted target instead of 0 so it is easier to sort the values.
When we prepare to update the metadata in updateIDTMetaData, we will sort
the values in the descending count order and extract only MaxNumPromotions
values to write into metadata. Since -1 is the max uint64_t number, if we
have equal to or less than MaxNumPromotions of -1 count values, they will
all be kept in metadata. If we have more than MaxNumPromotions of -1 count
values, we will only save MaxNumPromotions such values maximally. In such
case, we have logic in place in doesHistoryAllowICP to guarantee no more
promotion in sample loader pass will happen for the indirect call, because
it has been promoted enough.
With this change, now we can remove PromotedInsns without problem.
Differential Revision: https://reviews.llvm.org/D97350
Implements parts of:
- P0898R3 Standard Library Concepts
- P1754 Rename concepts to standard_case for C++20, while we still can
Depends on D96660
Reviewed By: ldionne, #libc, Quuxplusone
Differential Revision: https://reviews.llvm.org/D97176
As a preparation step for fast8 support, we need to update the tests
to pass in both modes. That requires generalizing the shadow width
and remove any hard coded references that assume it's always 2 bytes.
Reviewed By: stephan.yichao.zhao
Differential Revision: https://reviews.llvm.org/D97988
Lorenz Bauer from Cloudflare tried to use "const struct <name>"
as the type for __builtin_btf_type_id(*(const struct <name>)0, 1)
relocation and hit a llvm BPF fatal error.
https://lore.kernel.org/bpf/a3782f71-3f6b-1e75-17a9-1827822c2030@fb.com/
...
fatal error: error in backend: Empty type name for BTF_TYPE_ID_REMOTE reloc
Currently, we require the debuginfo type itself must have a name.
In this case, the debuginfo type is "const" which points to "struct <name>".
The "const" type does not have a name, hence the above fatal error
will be triggered.
Let us permit "const" and "volatile" type modifiers. We skip modifiers
in some other cases as well like structure member type tracing.
This can aviod the above fatal error.
Differential Revision: https://reviews.llvm.org/D97986
This is included from IR files, and IR doesn't/can't depend on Analysis
(because Analysis depends on IR).
Also fix the implementation - don't use non-member static in headers, as
it leads to ODR violations, inaccurate "unused function" warnings, etc.
And fix the header protection macro name (we don't generally include
"LIB" in the names, so far as I can tell).
This is actually two changes. One is to avoid copies when fp values are fed into
a build_vector, without being able to tell from the opcode.
The other is that build_vectors are also marked as only defining FP, since they
produce vector results.
Differential Revision: https://reviews.llvm.org/D97968
This is a case D97677 missed. When taking out remaining BBs that are
reachable from already-taken-out exceptions (because they are not
subexcptions but unwind destinations), I assumed the remaining BBs are
not EH pads, but they can be. For example,
```
try {
try {
throw 0;
} catch (int) { // (a)
}
} catch (int) { // (b)
}
try {
foo();
} catch (int) { // (c)
}
```
In this code, (b) is the unwind destination of (a) so its exception is
taken out of (a)'s exception, But even though the next try-catch is not
inside the first two-level try-catches, because the first try always
throws, its continuation BB is unreachable and the whole rest of the
function is dominated by EH pad (a), including EH pad (c). So after we
take out of (b)'s exception out of (a)'s, we also need to take out (c)'s
exception out of (a)'s, because (c) is reachable from (b).
This adds one more step before what we did for remaining BBs in D97677;
it traverses EH pads first to take subexceptions out of their incorrect
parent exception. It's the same thing as D97677, but because we can do
this before we add BBs to exceptions' sets, we don't need to fix sets
and only need to fix parent exception pointers.
Other changes are variable name changes (I changed `WE` -> `SrcWE`,
`UnwindWE` -> `DstWE` for clarity), some comment changes, and a drive-by
fix in a bug in a `LLVM_DEBUG` print statement.
Fixes https://github.com/emscripten-core/emscripten/issues/13588.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D97929