I am breaking apart D99484 so the cause of build failures is easier to
understand.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D117541
This is the original patch in my GNUInstallDirs series, now last to merge as the final piece!
It arose as a new draft of D28234. I initially did the unorthodox thing of pushing to that when I wasn't the original author, but since I ended up
- Using `GNUInstallDirs`, rather than mimicking it, as the original author was hesitant to do but others requested.
- Converting all the packages, not just LLVM, effecting many more projects than LLVM itself.
I figured it was time to make a new revision.
I have used this patch series (and many back-ports) as the basis of https://github.com/NixOS/nixpkgs/pull/111487 for my distro (NixOS), which was merged last spring (2021). It looked like people were generally on board in D28234, but I make note of this here in case extra motivation is useful.
---
As pointed out in the original issue, a central tension is that LLVM already has some partial support for these sorts of things. Variables like `COMPILER_RT_INSTALL_PATH` have already been dealt with. Variables like `LLVM_LIBDIR_SUFFIX` however, will require further work, so that we may use `CMAKE_INSTALL_LIBDIR`.
These remaining items will be addressed in further patches. What is here is now rote and so we should get it out of the way before dealing more intricately with the remainder.
Reviewed By: #libunwind, #libc, #libc_abi, compnerd
Differential Revision: https://reviews.llvm.org/D99484
This is the original patch in my GNUInstallDirs series, now last to merge as the final piece!
It arose as a new draft of D28234. I initially did the unorthodox thing of pushing to that when I wasn't the original author, but since I ended up
- Using `GNUInstallDirs`, rather than mimicking it, as the original author was hesitant to do but others requested.
- Converting all the packages, not just LLVM, effecting many more projects than LLVM itself.
I figured it was time to make a new revision.
I have used this patch series (and many back-ports) as the basis of https://github.com/NixOS/nixpkgs/pull/111487 for my distro (NixOS), which was merged last spring (2021). It looked like people were generally on board in D28234, but I make note of this here in case extra motivation is useful.
---
As pointed out in the original issue, a central tension is that LLVM already has some partial support for these sorts of things. Variables like `COMPILER_RT_INSTALL_PATH` have already been dealt with. Variables like `LLVM_LIBDIR_SUFFIX` however, will require further work, so that we may use `CMAKE_INSTALL_LIBDIR`.
These remaining items will be addressed in further patches. What is here is now rote and so we should get it out of the way before dealing more intricately with the remainder.
Reviewed By: #libunwind, #libc, #libc_abi, compnerd
Differential Revision: https://reviews.llvm.org/D99484
As discussed in https://github.com/llvm/llvm-project/issues/53020 / https://reviews.llvm.org/D116692,
SCEV is forbidden from reasoning about 'backedge taken count'
if the branch condition is a poison-safe logical operation,
which is conservatively correct, but is severely limiting.
Instead, we should have a way to express those
poison blocking properties in SCEV expressions.
The proposed semantics is:
```
Sequential/in-order min/max SCEV expressions are non-commutative variants
of commutative min/max SCEV expressions. If none of their operands
are poison, then they are functionally equivalent, otherwise,
if the operand that represents the saturation point* of given expression,
comes before the first poison operand, then the whole expression is not poison,
but is said saturation point.
```
* saturation point - the maximal/minimal possible integer value for the given type
The lowering is straight-forward:
```
compare each operand to the saturation point,
perform sequential in-order logical-or (poison-safe!) ordered reduction
over those checks, and if reduction returned true then return
saturation point else return the naive min/max reduction over the operands
```
https://alive2.llvm.org/ce/z/Q7jxvH (2 ops)
https://alive2.llvm.org/ce/z/QCRrhk (3 ops)
Note that we don't need to check the last operand: https://alive2.llvm.org/ce/z/abvHQS
Note that this is not commutative: https://alive2.llvm.org/ce/z/FK9e97
That allows us to handle the patterns in question.
Reviewed By: nikic, reames
Differential Revision: https://reviews.llvm.org/D116766
A prevectorized loop may contain multiple statements, in which case
isl_schedule_node_band_sink will sink the vector band to multiple
leaves. Instead of statically assuming a specific tree structure after
sinking, add a SIMD marker to all inner bands.
Fixes llvm.org/PR52637
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in lib/External/isl/include/isl/isl-noxceptions.h and the official isl C++ interface.
In the official interface the type `isl::size` cannot be casted to an unsigned without previously having checked if it contains a valid value with the function `isl::size::is_error()`.
For this reason two helping functions have been added:
- `IslAssert`: assert that no errors are present in debug builds and just disables the mandatory error check in non-debug builds
- `unisgnedFromIslSIze`: cast the `isl::size` object to `unsigned`
Changes made:
- Add the functions `IslAssert` and `unsignedFromIslSize`
- Add the utility function `rangeIslSize()`
- Retype `MaxDisjunctsInDomain` from `int` to `unsigned`
- Retype `RunTimeChecksMaxAccessDisjuncts` from `int` to `unsigned`
- Retype `MaxDimensionsInAccessRange` from `int` to `unsigned`
- Replaced some usages of `isl_size` to `unsigned` since we aim not to use `isl_size` anymore
- `isl-noexceptions.h` has been generated by e704f73c88
No functional change intended.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D113101
Polly is trying to move towards using isl::ast_expr / isl-noexceptions.h
(which implements RAII) where possible instead of manually managing memory.
checkIslAstExprInt manually frees Expr, so it has been removed to be
more idiomatic and consistent.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D111769
Instead of being inline and having a neverCalled() workaround to make it
work in the debugger, define it as a regular exported function.
Also add overloads for the C API types isl_* so it works with managed as
well as unmanaged ISL objects.
When the option -polly-loopfusion-greedy is set, the ScheduleOptimizer
tries to aggressively fuse any band it can and does not violate any
dependences.
As part if the implementation, the functionalty for copying a band
into an new schedule was extracted out of the ScheduleTreeRewriter.
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.
In some build environments, the C++ compiler is unable to infer the
correct type for the DenseMap::insert in isErrorBlock. Typing out
std::make_pair helps.
multiplication
The following code modifies elements of the array D.
for (i = 0; i < _PB_NI; i++)
for (j = 0; j < _PB_NJ; j++)
{
for (k = 0; k < _PB_NK; k++)
{
double Mul = A[i][k] * B[k][j];
D[i][j][k] += Mul;
C[i][j] += Mul;
}
}
Nevertheless, the code is recognised as a matrix-matrix multiplication, since
the second and third dimensions of D are accessed with non-zero strides.
This fixes the typo, which was made during the translation to C++ bindings
(https://reviews.llvm.org/D35845).
Reviewed By: Michael Kruse <llvm@meinersbur.de>
Differential Revision: https://reviews.llvm.org/D110491
SplitBlockPredecessors is unable to insert an additional BasicBlock
between an indirectbr/callbr terminator and the successor blocks.
This is needed by Polly to normalize the control flow before emitting
its optimzed code.
This patches rejects regions entered by an indirectbr/callbr to not fail
later at code generation.
This fixes llvm.org/PR51964
Recommit with "REQUIRES: asserts" in test that uses statistics.
SplitBlockPredecessors is unable to insert an additional BasicBlock
between an indirectbr/callbr terminator and the successor blocks.
This is needed by Polly to normalize the control flow before emitting
its optimzed code.
This patches rejects regions entered by an indirectbr/callbr to not fail
later at code generation.
This fixes llvm.org/PR51964
Inline assembly was not handled at all and treated like a llvm::Value.
In particular, it tried to create a pointer it which is not allowed.
Fix by handling like a llvm::Constant such that it is just reused when
required, instead of trying to marshall it in memory.
Fixes llvm.org/PR51960
VirtualUse ensures consistency over different source of values with
Polly. In particular, this enables its use of instructions moved between
Statement. Before the patch, the code wrongly assumed that the BB's
instructions are also the ScopStmt's instructions. Reference are
determined for OpenMP outlining and GPGPU kernel extraction.
GPGPU CodeGen had some problems. For one, it generated GPU kernel
parameters for constants. Second, it emitted GPU-side invariant loads
which have already been loaded by the host. This has been partially
fixed, it still generates a store for the invariant load result, but
using the value that the host has already written.
WARNING: I did not test the generated PollyACC code on an actual GPU.
The improved consistency will be made use of in the next patch.
The function was intended to catch OpenMP functions such as
get_thread_id(). If matched, the call would be considered synthesizable.
There were a few problems with this:
* get_thread_id() is not 'const' in the sense of have the gcc manual
defines it: "do not examine any values except their arguments".
get_thread_id() reads OpenCL runtime libreary global state.
What was inteded was probably 'speculable'.
* isConstCall was implemented using mayReadOrWriteMemory(). 'const' is
stricter than that, mayReadOrWriteMemory is e.g. true for malloc(),
since it may only read/write addresses that are considered
inaccessible fro the application. However, malloc is certainly not
speculable.
* Values that are isConstCall were not handled consistently throughout
Polly. In particular, it was not considered for referenced values
(OpenMP outlining and PollyACC).
Fix by removing special handling for isConstCall entirely.
This is a simple version without the possibility to define distribute
points or followup-transformations. However, it is the first
transformation that has to check whether the transformation is correct.
It interprets the same metadata as the LoopDistribute pass.
Re-apply after revert in c7bcd72a38 with
fix: Take isBand out of #ifndef NDEBUG since it now is used
unconditionally.
The name of the option is misleading and has been renamed by isl to
"serialize-sccs". Instead of also renaming the option, remove it.
The option is still accessible using
-polly-isl-arg=--no-schedule-serialize-sccs
This is a simple version without the possibility to define distribute
points or followup-transformations. However, it is the first
transformation that has to check whether the transformation is correct.
It interprets the same metadata as the LoopDistribute pass.
This metadata was intended to mark all accesses within an iteration to be pairwise non-aliasing, in this case because every memory of a base pointer is touched (read or write) at most once. This is typical for 'sweeps' over all data. The stated motivation from D30606 is to ensure that unrolled iterations are considered non-aliasing.
Rhe implemention had multiple issues:
* The structure of the noalias metadata was malformed. D110026 added check in the verifier for this metadata, and the tests were failing since then.
* This is not true for the outer loops of the BLIS matrix multiplication, where it was being inserted. Each element of A, B, C is accessed multiple times, as often as the loop not used as an index is iterating.
* Scopes were added to SecondLevelOtherAliasScopeList (used for the !noalias scop list) on-the-fly when another SCEV was seen. This meant that previously visited instructions would not be updated with alias scopes that are only seen later, missing out those SCEVs they should not be aliasing with.
* Since the !noalias scope list would ideally consists of all other SCEV for this base pointer, we might run quickly into scalability issues. Especially after unrolling there would probably at least once SCEV per instruction and unroll instance.
* The inter-iteration noalias base pointer was not removed after leaving the loop marked with it, effectively marking everything after it to noalias as well.
A solution I considered was to mark each instruction as non-aliasing with its own scope. The instruction itself would obviously alias itself, but such construction might also be considered invalid. Duplicating the instruction (e.g. due to speculation) would mark the instruction non-aliasing with its clone. I don't want to go into this territory, especially since the original motivation of determining unrolled instances as noalias based on SCEV is the what scev-aa does as well.
This effectively reverts D30606 and D35761.
getMetadata() currently uses a weird API where it populates a
structure passed to it, and optionally merges into it. Instead,
we can return the AAMDNodes and provide a separate merge() API.
This makes usages more compact.
Differential Revision: https://reviews.llvm.org/D109852
Building a source distribution using autotools adds GPL-licenced
files into the the sources. Although redistribution of theses files is
explicitly allowed with an exception, these are not used by Polly
which uses a CMake replacement. Use the direct source checkout
instead (replacing the output of 'make dist').
Some m4 scripts with the same licence are also included in isl/ppcg
repository. Removing them renders the autotools-based build scipts
inoperable, so remove the autotools build system altogether.
As of 741fabc222f226d34d806056b804244b012853b, polly builders are
failing from this error. The signiature is slightly different and
accepts a ScalarEvolution reference instead. This should fix the polly
builders.