Rather than checking the bitcast pointer element types, compare
the element type of the access and the GEP result type.
The entire code is dubious due to the inspection of GEP structure,
but this at least preserves the spirit of the existing code.
The `opt -analyze` option only works with the legacy pass manager and might be removed in the future, as explained in llvm.org/PR53733. This patch introduces -polly-print-* passes that print what the pass would print with the `-analyze` option and replaces all uses of `-analyze` in the regression tests.
There are two exceptions: `CodeGen\single_loop_param_less_equal.ll` and `CodeGen\loop_with_condition_nested.ll` use `-analyze` on the `-loops` pass, which is not part of Polly.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D120782
Ensure that function definitions match their declarations in header
files, even if they have no effect on linking. This includes
1. Both have the same __isl_* annotations
2. Both use the same type alias
3. Remove unused declarations that have no definition
4. Use explicit polly namespace qualifier for definitions; generally,
the .cpp file should use at most an anon namespace region since
only symbols declared in the header file can be accessed from other
translation units anyway. For definitions that have been declared in
the header file, the explicit namespace qualifier ensures that both
match.
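To illustrate item 4, here is a minimal sketch. The namespace `demo` and the functions `countBands` and `addOne` are hypothetical names chosen for this example and do not appear in Polly. With the qualified definition, a signature that does not match the header declaration is a compile error; had the definition been written inside a reopened `namespace demo { ... }` block, a mismatched signature would silently declare a new overload instead.

```cpp
#include <cassert>

// Header part (would normally live in a .h file): the only symbol
// that other translation units can see.
namespace demo {
int countBands(int Depth);
} // namespace demo

// Implementation part (.cpp file): helpers stay in an anonymous namespace.
namespace {
int addOne(int X) { return X + 1; }
} // namespace

// Explicitly qualified definition. The compiler rejects this if no
// matching declaration of demo::countBands(int) exists.
int demo::countBands(int Depth) { return addOne(Depth); }
```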
When compiling with Clang modules enabled, polly's use of using-directives
caused the global object `Target` in RegisterPasses.cpp to clash with
`llvm::Target`. By eliminating the using-directives, we're able to get
polly to play nicely with a modules build.
Differential Revision: https://reviews.llvm.org/D119809
I am breaking apart D99484 so the cause of build failures is easier to
understand.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D117541
This is the original patch in my GNUInstallDirs series, now last to merge as the final piece!
It arose as a new draft of D28234. I initially did the unorthodox thing of pushing to that when I wasn't the original author, but since I ended up
- Using `GNUInstallDirs`, rather than mimicking it, as the original author was hesitant to do but others requested.
- Converting all the packages, not just LLVM, affecting many more projects than LLVM itself.
I figured it was time to make a new revision.
I have used this patch series (and many back-ports) as the basis of https://github.com/NixOS/nixpkgs/pull/111487 for my distro (NixOS), which was merged last spring (2021). It looked like people were generally on board in D28234, but I make note of this here in case extra motivation is useful.
---
As pointed out in the original issue, a central tension is that LLVM already has some partial support for these sorts of things. Variables like `COMPILER_RT_INSTALL_PATH` have already been dealt with. Variables like `LLVM_LIBDIR_SUFFIX` however, will require further work, so that we may use `CMAKE_INSTALL_LIBDIR`.
These remaining items will be addressed in further patches. What is here is now rote and so we should get it out of the way before dealing more intricately with the remainder.
Reviewed By: #libunwind, #libc, #libc_abi, compnerd
Differential Revision: https://reviews.llvm.org/D99484
As discussed in https://github.com/llvm/llvm-project/issues/53020 / https://reviews.llvm.org/D116692,
SCEV is forbidden from reasoning about 'backedge taken count'
if the branch condition is a poison-safe logical operation,
which is conservatively correct, but is severely limiting.
Instead, we should have a way to express those
poison blocking properties in SCEV expressions.
The proposed semantics is:
```
Sequential/in-order min/max SCEV expressions are non-commutative variants
of commutative min/max SCEV expressions. If none of their operands
are poison, then they are functionally equivalent, otherwise,
if the operand that represents the saturation point* of given expression,
comes before the first poison operand, then the whole expression is not poison,
but is said saturation point.
```
* saturation point - the maximal/minimal possible integer value for the given type
The lowering is straightforward:
```
compare each operand to the saturation point,
perform sequential in-order logical-or (poison-safe!) ordered reduction
over those checks, and if reduction returned true then return
saturation point else return the naive min/max reduction over the operands
```
https://alive2.llvm.org/ce/z/Q7jxvH (2 ops)
https://alive2.llvm.org/ce/z/QCRrhk (3 ops)
Note that we don't need to check the last operand: https://alive2.llvm.org/ce/z/abvHQS
Note that this is not commutative: https://alive2.llvm.org/ce/z/FK9e97
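The proposed semantics can be modelled outside of LLVM with a small sketch. This is only an illustration under stated assumptions, not the SCEV implementation: `std::nullopt` stands in for poison, and the names `Val`, `uminCommutative` and `uminSequential` are made up for this example. It covers only the unsigned-min case, whose saturation point is 0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Assumption of this model: std::nullopt represents a poison value.
using Val = std::optional<std::uint32_t>;

// Commutative umin: any poison operand poisons the whole result.
Val uminCommutative(const std::vector<Val> &Ops) {
  std::uint32_t Result = std::numeric_limits<std::uint32_t>::max();
  for (const Val &Op : Ops) {
    if (!Op)
      return std::nullopt;
    Result = std::min(Result, *Op);
  }
  return Result;
}

// Sequential umin: operands are inspected in order. Once the saturation
// point (0 for unsigned min) is seen, later operands are never looked at,
// so poison that comes after it does not propagate.
Val uminSequential(const std::vector<Val> &Ops) {
  std::uint32_t Result = std::numeric_limits<std::uint32_t>::max();
  for (const Val &Op : Ops) {
    if (!Op)
      return std::nullopt;
    if (*Op == 0)
      return Val(std::uint32_t{0});
    Result = std::min(Result, *Op);
  }
  return Result;
}
```

In this model, swapping a zero operand and a poison operand changes the result of `uminSequential` but not of `uminCommutative`, which is the non-commutativity noted above.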
That allows us to handle the patterns in question.
Reviewed By: nikic, reames
Differential Revision: https://reviews.llvm.org/D116766
A prevectorized loop may contain multiple statements, in which case
isl_schedule_node_band_sink will sink the vector band to multiple
leaves. Instead of statically assuming a specific tree structure after
sinking, add a SIMD marker to all inner bands.
Fixes llvm.org/PR52637
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in lib/External/isl/include/isl/isl-noexceptions.h and the official isl C++ interface.
In the official interface the type `isl::size` cannot be cast to an unsigned without first checking that it contains a valid value with the function `isl::size::is_error()`.
For this reason two helper functions have been added:
- `IslAssert`: asserts that no errors are present in debug builds and just disables the mandatory error check in non-debug builds
- `unsignedFromIslSize`: casts the `isl::size` object to `unsigned`
Changes made:
- Add the functions `IslAssert` and `unsignedFromIslSize`
- Add the utility function `rangeIslSize()`
- Retype `MaxDisjunctsInDomain` from `int` to `unsigned`
- Retype `RunTimeChecksMaxAccessDisjuncts` from `int` to `unsigned`
- Retype `MaxDimensionsInAccessRange` from `int` to `unsigned`
- Replace some uses of `isl_size` with `unsigned` since we aim not to use `isl_size` anymore
- `isl-noexceptions.h` has been generated by e704f73c88
No functional change intended.
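The check-then-cast pattern can be sketched as follows. This does not use the real isl headers: `CheckedSize`, `sizeAssert` and `unsignedFromSize` are hypothetical stand-ins for `isl::size`, `IslAssert` and `unsignedFromIslSize`, and the actual signatures in Polly may differ.

```cpp
#include <cassert>

// Hypothetical stand-in for isl::size: a negative value encodes an error.
class CheckedSize {
  int Value;

public:
  explicit CheckedSize(int V) : Value(V) {}
  bool is_error() const { return Value < 0; }
  int raw() const { return Value; }
};

// Debug builds assert that no error is present; with NDEBUG the check
// compiles away and the argument is merely marked as used.
inline void sizeAssert(const CheckedSize &Size) {
  assert(!Size.is_error() && "size query returned an error");
  (void)Size;
}

// Convert to unsigned only after the error check has been performed.
inline unsigned unsignedFromSize(const CheckedSize &Size) {
  sizeAssert(Size);
  return static_cast<unsigned>(Size.raw());
}
```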
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D113101
Polly is trying to move towards using isl::ast_expr / isl-noexceptions.h
(which implements RAII) where possible instead of manually managing memory.
checkIslAstExprInt manually frees Expr, so it has been removed to be
more idiomatic and consistent.
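As a generic illustration of why RAII wrappers are preferred over manual frees, consider the sketch below. It does not use isl: `RawExpr`, `rawExprAlloc`, `rawExprFree` and `ManagedExpr` are hypothetical names, and `std::unique_ptr` with a custom deleter merely plays the role that the isl C++ wrapper types play in Polly.

```cpp
#include <cassert>
#include <memory>

// Hypothetical C-style handle with manual alloc/free functions.
struct RawExpr {
  int Value;
};
static int LiveExprs = 0;
RawExpr *rawExprAlloc(int V) {
  ++LiveExprs;
  return new RawExpr{V};
}
void rawExprFree(RawExpr *E) {
  --LiveExprs;
  delete E;
}
int liveExprCount() { return LiveExprs; }

// RAII wrapper: the handle is freed automatically at end of scope.
struct RawExprDeleter {
  void operator()(RawExpr *E) const { rawExprFree(E); }
};
using ManagedExpr = std::unique_ptr<RawExpr, RawExprDeleter>;

// A check that borrows the expression and therefore never frees it.
bool isIntExpr(const ManagedExpr &E, int Expected) {
  return E && E->Value == Expected;
}
```

Because the check takes the wrapper by reference, ownership stays with the caller and no code path has to remember to free the expression.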
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D111769
Instead of being inline and having a neverCalled() workaround to make it
work in the debugger, define it as a regular exported function.
Also add overloads for the C API types isl_* so it works with managed as
well as unmanaged ISL objects.
When the option -polly-loopfusion-greedy is set, the ScheduleOptimizer
tries to aggressively fuse any bands it can without violating any
dependences.
As part of the implementation, the functionality for copying a band
into a new schedule was extracted out of the ScheduleTreeRewriter.
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.
In some build environments, the C++ compiler is unable to infer the
correct type for the DenseMap::insert in isErrorBlock. Typing out
std::make_pair helps.
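The workaround amounts to spelling out the pair instead of relying on a braced initializer. A minimal sketch follows, using `std::map` as a stand-in for `llvm::DenseMap`; the function `recordResult` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Insert a cached result. Writing std::make_pair explicitly leaves no
// ambiguity about the argument type passed to insert().
bool recordResult(std::map<int, bool> &Cache, int Key, bool IsError) {
  auto Inserted = Cache.insert(std::make_pair(Key, IsError));
  // .second is true only if the key was not present before.
  return Inserted.second;
}
```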
multiplication
The following code modifies elements of the array D.
    for (i = 0; i < _PB_NI; i++)
      for (j = 0; j < _PB_NJ; j++)
      {
        for (k = 0; k < _PB_NK; k++)
        {
          double Mul = A[i][k] * B[k][j];
          D[i][j][k] += Mul;
          C[i][j] += Mul;
        }
      }
Nevertheless, the code is recognised as a matrix-matrix multiplication, since
the second and third dimensions of D are accessed with non-zero strides.
This fixes the typo, which was made during the translation to C++ bindings
(https://reviews.llvm.org/D35845).
Reviewed By: Michael Kruse <llvm@meinersbur.de>
Differential Revision: https://reviews.llvm.org/D110491
SplitBlockPredecessors is unable to insert an additional BasicBlock
between an indirectbr/callbr terminator and the successor blocks.
This is needed by Polly to normalize the control flow before emitting
its optimized code.
This patch rejects regions entered by an indirectbr/callbr so that code
generation does not fail later.
This fixes llvm.org/PR51964
Recommit with "REQUIRES: asserts" in test that uses statistics.
SplitBlockPredecessors is unable to insert an additional BasicBlock
between an indirectbr/callbr terminator and the successor blocks.
This is needed by Polly to normalize the control flow before emitting
its optimized code.
This patch rejects regions entered by an indirectbr/callbr so that code
generation does not fail later.
This fixes llvm.org/PR51964
Inline assembly was not handled at all and treated like a llvm::Value.
In particular, it tried to create a pointer to it, which is not allowed.
Fix by handling it like a llvm::Constant such that it is just reused when
required, instead of trying to marshal it in memory.
Fixes llvm.org/PR51960
VirtualUse ensures consistency across the different sources of values within
Polly. In particular, this enables its use of instructions moved between
statements. Before the patch, the code wrongly assumed that the BB's
instructions are also the ScopStmt's instructions. References are
determined for OpenMP outlining and GPGPU kernel extraction.
GPGPU CodeGen had some problems. For one, it generated GPU kernel
parameters for constants. Second, it emitted GPU-side invariant loads
which have already been loaded by the host. This has been partially
fixed: it still generates a store for the invariant load result, but
using the value that the host has already written.
WARNING: I did not test the generated PollyACC code on an actual GPU.
The improved consistency will be made use of in the next patch.
The function was intended to catch OpenMP functions such as
get_thread_id(). If matched, the call would be considered synthesizable.
There were a few problems with this:
* get_thread_id() is not 'const' in the sense of how the gcc manual
defines it: "do not examine any values except their arguments".
get_thread_id() reads OpenCL runtime library global state.
What was intended was probably 'speculable'.
* isConstCall was implemented using mayReadOrWriteMemory(). 'const' is
stricter than that, mayReadOrWriteMemory is e.g. true for malloc(),
since it may only read/write addresses that are considered
inaccessible for the application. However, malloc is certainly not
speculable.
* Values that are isConstCall were not handled consistently throughout
Polly. In particular, they were not considered for referenced values
(OpenMP outlining and PollyACC).
Fix by removing special handling for isConstCall entirely.
This is a simple version without the possibility to define distribute
points or followup-transformations. However, it is the first
transformation that has to check whether the transformation is correct.
It interprets the same metadata as the LoopDistribute pass.
Re-apply after revert in c7bcd72a38 with
fix: Take isBand out of #ifndef NDEBUG since it now is used
unconditionally.
The name of the option is misleading and has been renamed by isl to
"serialize-sccs". Instead of also renaming the option, remove it.
The option is still accessible using
-polly-isl-arg=--no-schedule-serialize-sccs
This is a simple version without the possibility to define distribute
points or followup-transformations. However, it is the first
transformation that has to check whether the transformation is correct.
It interprets the same metadata as the LoopDistribute pass.