To perform some operations, such as sin() or printf(), code compiled
for AMD GPUs must be linked to a series of device libraries. This
commit adds support for linking in these libraries.
However, since these device libraries are delivered as LLVM bitcode,
raising the possibility of version incompatibilities, this commit only
links in libraries when the functions from those libraries are called
by the code being compiled.
This code also sets the math flags to their most conservative values,
as MLIR doesn't have a `-ffast-math` equivalent.
Depends on D114114
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D114117
[NFC] As part of using inclusive language within the llvm project, this patch
replaces master with main in `Thumb2SizeReduction.cpp`.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D114196
`DeducedTemplateSpecializationTypes` is a `llvm::FoldingSet<DeducedTemplateSpecializationType>` [1],
where `FoldingSetNodeID` is based on the values: {`TemplateName`, `QualType`, `IsDeducedAsDependent`},
those values are also used as `DeducedTemplateSpecializationType` constructor arguments.
A `FoldingSetNodeID` created by the static `DeducedTemplateSpecializationType::Profile` may not be equal
to`FoldingSetNodeID` created by a member `DeducedTemplateSpecializationType::Profile` of an instance
created with the same {`TemplateName`, `QualType`, `IsDeducedAsDependent`}, which makes
`DeducedTemplateSpecializationTypes` lookups nondeterministic.
Specifically, while `IsDeducedAsDependent` value is passes to the constructor, `IsDependent()` method on
the created instance may return a different value, because `IsDependent` is not saved as is:
```name=clang/include/clang/AST/Type.h
DeducedTemplateSpecializationType(TemplateName Template, QualType DeducedAsType, bool IsDeducedAsDependent)
: DeducedType(DeducedTemplateSpecialization, DeducedAsType,
toTypeDependence(Template.getDependence()) | // <~ also considers `TemplateName` parameter
(IsDeducedAsDependent ? TypeDependence::DependentInstantiation : TypeDependence::None)),
```
For example, if an instance A with key `FoldingSetNodeID {A, B, false}` is inserted. Then a key
`FoldingSetNodeID {A, B, true}` is probed:
If it happens to correspond to the same bucket in `FoldingSet` as the first key, and `A.Profile()` returns
`FoldingSetNodeID {A, B, true}`, then it's a hit.
If the bucket for the second key is different from the first key, instance A is not considered at all, and it's
a no hit, even if `A.Profile()` returns `FoldingSetNodeID {A, B, true}`.
Since `TemplateName`, `QualType` parameter values involve memory pointers, the lookup result depend on allocator,
and may differ from run to run. When this is used as part of modules compilation, it may result in "module out of date"
errors, if imported modules are built on different machines.
This makes `ASTContext::getDeducedTemplateSpecializationType` consider `Template.isDependent()` similar
`DeducedTemplateSpecializationType` constructor.
Tested on a very big codebase, by running modules compilations from directories with varied path length
(seem to affect allocator seed).
1. https://llvm.org/docs/ProgrammersManual.html#llvm-adt-foldingset-h
Patch by Wei Wang and Igor Sugak!
Reviewed By: bruno
Differential Revision: https://reviews.llvm.org/D112481
This patch only adds tests for PowerPC. The purpose of these tests
is to track what code is generated for various vector reductions.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D113801
This change allows SwiftAttr to be used with #pragma clang attribute push
to add Swift attributes to large regions of header files.
We plan to use this to annotate headers with concurrency information.
Patch by: Becca Royal-Gordon
Differential Revision: https://reviews.llvm.org/D112773
Our current build assumes that the path to ROCm we find at build time
will be the path at which ROCm is located when the built code is
executed. This commit adds a --rocm-path option to SerializeToHsaco,
and removes the HIP dependency that the SerializeToHsaco previously had.
Depends on D114113
(though the dependency is to ensure the diffs apply cleanly and to capture the dependency on D114107)
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D114114
Removes a +x/-x pair on the only store/load of a variable
and deletes some nearby dead code. Also reduces the size of the implicit
struct to reflect the code currently emitted by clang.
Differential Revision: https://reviews.llvm.org/D114270
This is not mandated by the standard, so it goes in libcxx/test/libcxx/.
It's certainly arguable that the algorithms changed here
(`is_heap`, `is_sorted`, `min`, `max`) are harmless and we should
just let them copy their comparators once. But at the same time,
it's nice to have all our algorithms be 100% consistent and never
copy a comparator, not even once.
Differential Revision: https://reviews.llvm.org/D114136
We would have been defining it in <utility> instead of <charconv>. For
the time being, this doesn't change anything since we don't implement
the feature test macro anyways.
Also, as a fly-by, this removes obsolete feature test macro tests. There
was a brief time back in the days when we wrote feature test macro tests
manually. In particular, we had test files for __cpp_lib_to_chars and
__cpp_lib_memory_resource. Since we now have a principled way of generating
these tests with scripts, this commit removes the obsolete (and empty)
tests for these two feature test macros.
Differential Revision: https://reviews.llvm.org/D114243
We never noticed it because our CI doesn't actually build against a C
library that doesn't have threading functionality, however building
against a truly thread-free platform surfaces these issues.
Differential Revision: https://reviews.llvm.org/D114242
One some platforms, -Wimplicit-int-conversion is enabled by default,
which can lead to additional warnings being triggered in this test.
Since we're only trying to test errors related to calling abs(), the
assignment is superfluous.
As a fly-by fix, correct one instance of ::abs to std::abs and made
the test a .verify.cpp test instead.
Differential Revision: https://reviews.llvm.org/D114244
- Adds hooks that allow SerializeTo* passes to arbitrarily transform
the produced LLVM Module before it is passed to the code generation
passes.
- Uses these hooks within the SerializeToHsaco pass in order to run
LLVM optimizations and to set the optimization level on the
TargetMachine.
- Adds an optLevel parameter to SerializeToHsaco
Future work may include moving much of what's been added to
SerializeToHsaco to SerializeToBlob, but that would require
confirmation from the NVVM backend maintainers that it would be
appropriate to do so.
Depends on D114107
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D114113
NFCI. Previously the implicit use was added to V_MOV_B32_indirect_read
when building the instruction. V_MOV_B32_indirect_write didn't have an
implicit use of M0 at all, but apparently it did not cause any problems.
Differential Revision: https://reviews.llvm.org/D114239
Fix a null pointer dereference when .got.plt is discarded.
This also adds a test for discarding `.plt`.
Reviewed By: ikudrin
Differential Revision: https://reviews.llvm.org/D114180
Instead of using shape_cast op in the pattern removing leading unit
dimensions we use extract/broadcast ops. This is part of the effort to
restrict ShapeCastOp fuirther in the future and only allow them to
convert to or from 1D vector.
This also adds extra canonicalization to fill the gaps in simplifying
broadcast/extract ops.
Differential Revision: https://reviews.llvm.org/D114205
Add an IR in unit test directory, which demonstrate the scalarization for struct allocations.
This is added to pave the way for an SROA change to skip scalarization for some cases.
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D114128
`CallDescriptions` have a `RequiredArgs` and `RequiredParams` members,
but they are of different types, `unsigned` and `size_t` respectively.
In the patch I use only `unsigned` for both, that should be large enough
anyway.
I also introduce the `MaybeUInt` type alias for `Optional<unsigned>`.
Additionally, I also avoid the use of the //smart// less-than operator.
template <typename T>
constexpr bool operator<=(const Optional<T> &X, const T &Y);
Which would check if the optional **has** a value and compare the data
only after. I found it surprising, thus I think we are better off
without it.
Reviewed By: martong, xazax.hun
Differential Revision: https://reviews.llvm.org/D113594
Previously, CallDescription simply referred to the qualified name parts
by `const char*` pointers.
In the future we might want to dynamically load and populate
`CallDescriptionMaps`, hence we will need the `CallDescriptions` to
actually **own** their qualified name parts.
Reviewed By: martong, xazax.hun
Differential Revision: https://reviews.llvm.org/D113593
This patch replaces each use of the previous API with the new one.
In variadic cases, it will use the ADL `matchesAny(Call, CDs...)`
variadic function.
Also simplifies some code involving such operations.
Reviewed By: martong, xazax.hun
Differential Revision: https://reviews.llvm.org/D113591
This patch introduces `CallDescription::matches()` member function,
accepting a `CallEvent`.
Semantically, `Call.isCalled(CD)` is the same as `CD.matches(Call)`.
The patch also introduces the `matchesAny()` variadic free function template.
It accepts a `CallEvent` and at least one `CallDescription` to match
against.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D113590
Sometimes we only want to decide if some function is called, and we
don't care which of the set.
This `CallDescriptionSet` will have the same behavior, except
instead of `lookup()` returning a pointer to the mapped value,
the `contains()` returns `bool`.
Internally, it uses the `CallDescriptionMap<bool>` for implementing the
behavior. It is preferred, to reuse the generic
`CallDescriptionMap::lookup()` logic, instead of duplicating it.
The generic version might be improved by implementing a hash lookup or
something along those lines.
Reviewed By: martong, Szelethus
Differential Revision: https://reviews.llvm.org/D113589
- Move the #define s to the GPU Transform library from GPU Ops so that
SerializeToHsaco is non-trivially compiled
- Add required includes to SerializeToHsaco
- Move MCSubtargetInfo creation to the correct point in the
compilation process
- Change mlir in ROCM tests to account for renamed/moved ops
Differential Revision: https://reviews.llvm.org/D114184
We were using the client socket close as a way to terminate the handler
thread. But this kind of concurrent access to the same socket is not
safe. It also complicates running the handler without a dedicated thread
(next patch).
Instead, here I add an explicit way for a packet handler to request
termination. Waiting for lldb to terminate the connection would almost
be sufficient, but in the pty test we want to keep the pty open so we
can examine its state. Ability to disconnect at an arbitrary point may
be useful for testing other aspects of lldb functionality as well.
The way this works is that now each packet handler can optionally return
a list of responses (instead of just one). One of those responses (it
only makes sense for it to be the last one) can be a special
RESPONSE_DISCONNECT object, which triggers a disconnection (via a new
TerminateConnectionException).
As the mock server now cleans up the connection whenever it disconnects,
the pty test needs to explicitly dup(2) the descriptors in order to
inspect the post-disconnect state.
Differential Revision: https://reviews.llvm.org/D114156
Revert "[SCEV] Defer all work from ea12c2cb as late as possible"
Revert "[SCEV] Defer loop property checks from ea12c2cb as late as possible"
This reverts commit 734abbad79 and 1a5666acb2.
Both of these changes were speculative attempts to address a compile time regression. Neither worked, and both complicated the code in undesirable ways.
On RISC-V, icmp is not sunk (as the following snippet shows) which
generates the following suboptimal branch pattern:
```
core_list_find:
lh a2, 2(a1)
seqz a3, a0 <<
bltz a2, .LBB0_5
bnez a3, .LBB0_9 << should sink the seqz
[...]
j .LBB0_9
.LBB0_5:
bnez a3, .LBB0_9 << should sink the seqz
lh a1, 0(a1)
[...]
```
due to an icmp not being sunk.
The blocks after `codegenprepare` look as follows:
```
define dso_local %struct.list_head_s* @core_list_find(%struct.list_head_s* readonly %list, %struct.list_data_s* nocapture readonly %info) local_unnamed_addr #0 {
entry:
%idx = getelementptr inbounds %struct.list_data_s, %struct.list_data_s* %info, i64 0, i32 1
%0 = load i16, i16* %idx, align 2, !tbaa !4
%cmp = icmp sgt i16 %0, -1
%tobool.not37 = icmp eq %struct.list_head_s* %list, null
br i1 %cmp, label %while.cond.preheader, label %while.cond9.preheader
while.cond9.preheader: ; preds = %entry
br i1 %tobool.not37, label %return, label %land.rhs11.lr.ph
```
where the `%tobool.not37` is the result of the icmp that is not sunk.
Note that it is computed in the basic-block up until what becomes the
`bltz` instruction and the `bnez` is a basic-block of its own.
Compare this to what happens on AArch64 (where the icmp is correctly sunk):
```
define dso_local %struct.list_head_s* @core_list_find(%struct.list_head_s* readonly %list, %struct.list_data_s* nocapture readonly %info) local_unnamed_addr #0 {
entry:
%idx = getelementptr inbounds %struct.list_data_s, %struct.list_data_s* %info, i64 0, i32 1
%0 = load i16, i16* %idx, align 2, !tbaa !6
%cmp = icmp sgt i16 %0, -1
br i1 %cmp, label %while.cond.preheader, label %while.cond9.preheader
while.cond9.preheader: ; preds = %entry
%1 = icmp eq %struct.list_head_s* %list, null
br i1 %1, label %return, label %land.rhs11.lr.ph
```
This is caused by sinkCmpExpression() being skipped, if multiple
condition registers are supported.
Given that the check for multiple condition registers affect only
sinkCmpExpression() and shouldNormalizeToSelectSequence(), this change
adjusts the RISC-V target as follows:
* we no longer signal multiple condition registers (thus changing
the behaviour of sinkCmpExpression() back to sinking the icmp)
* we override shouldNormalizeToSelectSequence() to let always select
the preferred normalisation strategy for our backend
With both changes, the test results remain unchanged. Note that without
the target-specific override to shouldNormalizeToSelectSequence(), there
is worse code (more branches) generated for select-and.ll and select-or.ll.
The original test case changes as expected:
```
core_list_find:
lh a2, 2(a1)
bltz a2, .LBB0_5
beqz a0, .LBB0_9 <<
[...]
j .LBB0_9
.LBB0_5:
beqz a0, .LBB0_9 <<
lh a1, 0(a1)
[...]
```
Differential Revision: https://reviews.llvm.org/D98932
[NFC] As part of using inclusive language within the llvm project, this patch
replaces master with main in `IntrinsicsNVVM.td`.
Reviewed By: steffenlarsen
Differential Revision: https://reviews.llvm.org/D114193
We were adding all defined weak symbols to the materialization
responsibility, but local symbols will not be in the symbol table, so it
failed to materialize due to the "missing" symbol.
Local weak symbols come up in practice when using `ld -r` with a hidden
weak symbol.
rdar://85574696
For tagged-globals, we only need to disable relaxation for globals that
we actually tag. With this patch function pointer relocations, which
we do not instrument, can be relaxed.
This patch also makes tagged-globals work properly with LTO, as
-Wa,-mrelax-relocations=no doesn't work with LTO.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D113220