The `decode_relrs` helper is declared as:
`Expected<std::vector<Elf_Rel>> decode_relrs(Elf_Relr_Range relrs) const;`
it never returns an error though and hence can be simplified to return
a vector.
Differential revision: https://reviews.llvm.org/D85302
OpenMP TR8 sec. 2.15.6 "target update Construct", p. 183, L3-4 states:
> If the corresponding list item is not present in the device data
> environment and there is no present modifier in the clause, then no
> assignment occurs to or from the original list item.
L10-11 states:
> If a present modifier appears in the clause and the corresponding
> list item is not present in the device data environment then an
> error occurs and the program termintates.
(OpenMP 5.0 also has the first passage but without mention of the
present modifier of course.)
In both passages, I assume "is not present" includes the case of
partially but not entirely present. However, without this patch, the
target update directive misbehaves in this case both with and without
the present modifier. For example:
```
#pragma omp target enter data map(to:arr[0:3])
#pragma omp target update to(arr[0:5]) // might fail on data transfer
#pragma omp target update to(present:arr[0:5]) // might fail on data transfer
```
The problem is that `DeviceTy::getTgtPtrBegin` does not return a null
pointer in that case, so `target_data_update` sees the data as fully
present, and the data transfer then might fail depending on the target
device. However, without the present modifier, there should never be
a failure. Moreover, with the present modifier, there should always
be a failure, and the diagnostic should mention the present modifier.
This patch fixes `DeviceTy::getTgtPtrBegin` to return null when
`target_data_update` is the caller. I'm wondering if it should do the
same for more callers.
Reviewed By: grokos, jdoerfert
Differential Revision: https://reviews.llvm.org/D85246
Without this patch, the following example fails but shouldn't
according to OpenMP TR8:
```
#pragma omp target enter data map(alloc:i)
#pragma omp target data map(present, alloc: i)
{
#pragma omp target exit data map(delete:i)
} // fails presence check here
```
OpenMP TR8 sec. 2.22.7.1 "map Clause", p. 321, L23-26 states:
> If the map clause appears on a target, target data, target enter
> data or target exit data construct with a present map-type-modifier
> then on entry to the region if the corresponding list item does not
> appear in the device data environment an error occurs and the
> program terminates.
There is no corresponding statement about the exit from a region.
Thus, the `present` modifier should:
1. Check for presence upon entry into any region, including a `target
exit data` region. This behavior is already implemented correctly.
2. Should not check for presence upon exit from any region, including
a `target` or `target data` region. Without this patch, this
behavior is not implemented correctly, breaking the above example.
In the case of `target data`, this patch fixes the latter behavior by
removing the `present` modifier from the map types Clang generates for
the runtime call at the end of the region.
In the case of `target`, we have not found a valid OpenMP program for
which such a fix would matter. It appears that, if a program can
guarantee that data is present at the beginning of a `target` region
so that there's no error there, that data is also guaranteed to be
present at the end. This patch adds a comment to the runtime to
document this case.
Reviewed By: grokos, RaviNarayanaswamy, ABataev
Differential Revision: https://reviews.llvm.org/D84422
Implement proper folding of statepoint meta operands (deopt and GC)
when statepoint uses tied registers.
For deopt operands it is just about properly preserving tiedness
in new instruction.
For tied GC operands folding is a little bit more tricky.
We can fold tied GC operands only from InlineSpiller, because it knows
how to properly reload tied def after it was turned into memory operand.
Other users (e.g. peephole) cannot properly fold such operands as they
do not know how (or when) to reload them from memory.
We do this by un-tieing operand we want to fold in InlineSpiller
and allowing to fold only untied operands in foldPatchpoint.
Introduce an initial version of C API for MLIR core IR components: Value, Type,
Attribute, Operation, Region, Block, Location. These APIs allow for both
inspection and creation of the IR in the generic form and intended for wrapping
in high-level library- and language-specific constructs. At this point, there
is no stability guarantee provided for the API.
Reviewed By: stellaraccident, lattner
Differential Revision: https://reviews.llvm.org/D83310
This reverts commit ac70b37a00
which reverted commit 8aeb2fe13a
because codegen tests got broken and i needed time to investigate.
This shows some regressions in tests, but they are all around GEP's,
so i'm not really sure how important those are.
https://rise4fun.com/Alive/1Gn
NamedDecl::printName will print the pretty-printed name of the entity, which
is not what we want here (we should print "enum { e };" instead of "enum
(unnamed enum at input.cc:1:5) { e };").
For now only DecompositionDecl and MDGuidDecl have an overloaded printName so
this does not result in any functional change, but this change is needed since
I will be adding overloads to better handle unnamed entities in diagnostics.
sizeof(ConstantMatrixTypeBitfields) > 8 which increases the size of every type.
This was not detected because no corresponding static_assert for its size was
added.
To prevent this from occuring again replace the various static_asserts for
the size of each of the bit-field classes by a single static_assert for the
size of Type.
I have left ConstantMatrixType::MaxElementsPerDimension unchanged since
the limit is exercised by multiple tests.
`OS << ND->getDeclName();` is equivalent to `OS << ND->getNameAsString();`
without the extra temporary string.
This is not quite a NFC since two uses of `getNameAsString` in a
diagnostic are replaced, which results in the named entity being
quoted with additional "'"s (ie: 'var' instead of var).
This dialect was introduced during the bring-up of the new LLVM dialect type
system for testing purposes. The main LLVM dialect now uses the new type system
and the test dialect is no longer necessary, so remove it.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D85224
nvcc supports accessing file-scope static device variables in host code by host APIs
like cudaMemcpyToSymbol etc.
CUDA/HIP let users access device variables in host code by shadow variables. In host compilation,
clang emits a shadow variable for each device variable, and calls __*RegisterVariable to
register it in init function. The address of the shadow variable and the device side mangled
name of the device variable is passed to __*RegisterVariable. Runtime looks up the symbol
by name in the device binary to find the address of the device variable.
The problem with static device variables is that they have internal linkage, therefore their
name may be changed by the linker if there are multiple symbols with the same name. Also
they end up as local symbols in the elf file, whereas the runtime only looks up the global symbols.
Another reason for making the static device variables external linkage is that they may be
initialized externally by host code and their final value may be accessed by host code
after kernel execution, therefore they actually have external linkage. Giving them internal
linkage will cause incorrect optimizations on them.
To support accessing static device var in host code for -fno-gpu-rdc mode, change the intnernal
linkage to external linkage. The name does not need change since there is only one TU for
-fno-gpu-rdc mode. Also the externalization is done only if the device static var is referenced
by host code.
Differential Revision: https://reviews.llvm.org/D80858
As with other targets, set the throughput cost of control-flow
instructions to free so that we don't miss out of vectorization
opportunities.
Differential Revision: https://reviews.llvm.org/D85283
This patch adds support for dumping DWARF sections to obj2yaml. The
.debug_aranges section is used to illustrate the basic idea.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D85094
This quietly disabled use of zlib on Windows even when building with
-DLLVM_ENABLE_ZLIB=FORCE_ON.
> Rather than handling zlib handling manually, use find_package from CMake
> to find zlib properly. Use this to normalize the LLVM_ENABLE_ZLIB,
> HAVE_ZLIB, HAVE_ZLIB_H. Furthermore, require zlib if LLVM_ENABLE_ZLIB is
> set to YES, which requires the distributor to explicitly select whether
> zlib is enabled or not. This simplifies the CMake handling and usage in
> the rest of the tooling.
>
> This is a reland of abb0075 with all followup changes and fixes that
> should address issues that were reported in PR44780.
>
> Differential Revision: https://reviews.llvm.org/D79219
This reverts commit 10b1b4a231 and follow-ups
64d99cc6ab and
f9fec0447e.
Since there are no ill effects when performing these operations
with undefined elements, they are lowered to the already supported
unpredicated scalable vector equivalents.
Differential Revision: https://reviews.llvm.org/D85117
We currently don't do anything to fold any_extend vector loads as no target has such an instruction.
Instead I've added support for folding to a zextload, SimplifyDemandedBits does a good job of adjusting the zext(truncate(()) stages as required later on.
We still need the custom scalar extload handling instead of using the tryToFoldExtOfLoad helper as it has different legality tests - we can probably tweak that to reduce most of the code duplication.
Fixes the regression I mentioned in rG99a971cadff7
Differential Revision: https://reviews.llvm.org/D85129
Handle the case where the ViewOp takes in a memref that has
an memory space.
Reviewed By: ftynse, bondhugula, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D85048
Currently, we only test the `--stackmap` option here:
https://github.com/llvm/llvm-project/blob/master/llvm/test/Object/stackmap-dump.test
it uses a precompiled MachO binary currently and I've found no tests for this option for ELF.
The implementation also has issues. For example, it might assert on a wrong version
of the .llvm-stackmaps section. Or it might crash on an empty or truncated section.
This patch introduces a new tools/llvm-readobj/ELF test file as well as implements a few
basic checks to catch simple crashes/issues
It also eliminates `unwrapOrError` calls in `printStackMap()`.
Differential revision: https://reviews.llvm.org/D85208
This came up during the review for D67656. It's nice but also subtle, so documenting it as an idiom will make tests easier to understand.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D68061
The extent tensor type is a `tensor<?xindex>` that is used in the shape dialect.
To facilitate the use of this type when working with the shape dialect, we
expose the helper function for its construction.
Differential Revision: https://reviews.llvm.org/D85121
The code in SectionMemoryManager.cpp unnecessarily maps
read-only data sections with the READ+EXECUTE flags. This is
undesirable from a security stand-point.
Moreover, on the Fuchsia platform, which is now very strict
about mapping pages with the EXECUTE permission, this simply
fails, because the section's pages were initially allocated
with only the READ+WRITE flags.
A more detailed description of the issue can be found in this
public SwiftShader bug:
https://issuetracker.google.com/issues/154586551
This patch just restrict the mapping to the READ flag for ROData
sections. Code sections are still mapped with READ+EXECUTE as
expected.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D78574
This fixes an issue triggered by the following code, where emitEpilogue
got confused when trying to restore the SVE registers after the call,
whereas the call to bar() is implemented as a TCReturn:
int non_sve();
int sve(svint32_t x) { return non_sve(); }
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D84869
Updated the documentation with new MLIR LLVM types for
vectors, pointers, arrays and structs. Also, changed remaining
tabs to spaces.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D85277
Based on https://reviews.llvm.org/D84022 with additional changes to
maintain out-of-tree builds.
Original commit message:
Currently the binaries are output directly into the bin subdirectory of
the build directory. This doesn't work correctly with multi-config
generators which should output the binaries into <CONFIG_NAME>/bin
instead.
The original patch was implemented by David Truby and the additional
changes added here were also proposed by David Truby.
Differential Revision: https://reviews.llvm.org/D85078/
Co-authored-by: David Truby <david.truby@arm.com>