In case LLVM pointers are annotated with !dereferencable attributes/metadata
or LLVM can look at the allocation from which a pointer is derived, we can know
that dereferencing pointers is safe and can be done unconditionally. We use this
information to proof certain pointers as save to hoist and then hoist them
unconditionally.
llvm-svn: 297375
Simplify ScopDetection::isInvariant(). Essentially deny everything that
is defined within the SCoP and is not load-hoisted.
The previous understanding of "invariant" has a few holes:
- Expressions without side-effects with only invariant arguments, but
are defined withing the SCoP's region with the exception of selects
and PHIs. These should be part of the index expression derived by
ScalarEvolution and not of the base pointer.
- Function calls with that are !mayHaveSideEffects() (typically
functions with "readnone nounwind" attributes). An example is given
below.
@C = external global i32
declare float* @getNextBasePtr(float*) readnone nounwind
...
%ptr = call float* @getNextBasePtr(float* %A, float %B)
The call might return:
* %A, so %ptr aliases with it in the SCoP
* %B, so %ptr aliases with it in the SCoP
* @C, so %ptr aliases with it in the SCoP
* a new pointer everytime it is called, such as malloc()
* a pointer into the allocated block of one of the aforementioned
* any of the above, at random at each call
Hence and contrast to a comment in the base_pointer.ll regression
test, %ptr is not necessarily the same all the time. It might also
alias with anything and no AliasAnalysis can tell otherwise if the
definition is external. It is hence not suitable in the role of a
base pointer.
The practical problem with base pointers defined in SCoP statements is
that it is not available globally in the SCoP. The statement instance
must be executed first before the base pointer can be used. This is no
problem if the base pointer is transferred as a scalar value between
statements. Uses of MemoryAccess::setNewAccessRelation may add a use of
the base pointer anywhere in the array. setNewAccessRelation is used by
JSONImporter, DeLICM and D28518. Indeed, BlockGenerator currently
assumes that base pointers are available globally and generates invalid
code for new access relation (referring to the base pointer of the
original code) if not, even if the base pointer would be available in
the statement.
This could be fixed with some added complexity and restrictions. The
ExprBuilder must lookup the local BBMap and code that call
setNewAccessRelation must check whether the base pointer is available
first.
The code would still be incorrect in the presence of aliasing. There
is the switch -polly-ignore-aliasing to explicitly allow this, but
it is hardly a justification for the additional complexity. It would
still be mostly useless because in most cases either getNextBasePtr()
has external linkage in which case the readnone nounwind attributes
cannot be derived in the translation unit itself, or is defined in the
same translation unit and gets inlined.
Reviewed By: grosser
Differential Revision: https://reviews.llvm.org/D30695
llvm-svn: 297281
Only when load-hoisted we can be sure the base pointer is invariant
during the SCoP's execution. Most of the time it would be added to
the required hoists for the alias checks anyway, except with
-polly-ignore-aliasing, -polly-use-runtime-alias-checks=0 or if
AliasAnalysis is already sure it doesn't alias with anything
(for instance if there is no other pointer to alias with).
Two more parts in Polly assume that this load-hoisting took place:
- setNewAccessRelation() which contains an assert which tests this.
- BlockGenerator which would use to the base ptr from the original
code if not load-hoisted (if the access expression is regenerated)
Differential Revision: https://reviews.llvm.org/D30694
llvm-svn: 297195
There is no point in optimizing unreachable code, hence our test cases should
always return.
This commit is part of a series that makes Polly more robust on the presence of
unreachables.
llvm-svn: 297158
There is no point in optimizing unreachable code, hence our test cases should
always return.
This commit is part of a series that makes Polly more robust on the presence of
unreachables.
llvm-svn: 297150
There is no point in optimizing unreachable code, hence our test cases should
always return.
This commit is part of a series that makes Polly more robust on the presence of
unreachables.
llvm-svn: 297147
Instead of iterating over statements and their memory accesses to extract the
set of available base pointers, just directly iterate over all ScopArray
objects. This reflects more the actual intend of the code: collect all arrays
(and their base pointers) to emit alias information that specifies that accesses
to different arrays cannot alias.
This change removes unnecessary uses of MemoryAddress::getBaseAddr() in
preparation for https://reviews.llvm.org/D28518.
llvm-svn: 294574
Before this change we created an additional reload in the copy of the incoming
block of a PHI node to reload the incoming value, even though the necessary
value has already been made available by the normally generated scalar loads.
In this change, we drop the code that generates this redundant reload and
instead just reuse the scalar value already available.
Besides making the generated code slightly cleaner, this change also makes sure
that scalar loads go through the normal logic, which means they can be remapped
(e.g. to array slots) and corresponding code is generated to load from the
remapped location. Without this change, the original scalar load at the
beginning of the non-affine region would have been remapped, but the redundant
scalar load would continue to load from the old PHI slot location.
It might be possible to further simplify the code in addOperandToPHI,
but this would not only mean to pull out getNewValue, but to also change the
insertion point update logic. As this did not work when trying it the first
time, this change is likely not trivial. To not introduce bugs last minute, we
postpone further simplications to a subsequent commit.
We also document the current behavior a little bit better.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D28892
llvm-svn: 292486
We rename the test case with -metarenamer to make the variable names easier to
read and add additional check lines that verify the code we currently generate
for PHI nodes. This code is interesting as it contains a PHI node in a
non-affine sub-region, where some incoming blocks are within the non-affine
sub-region and others are outside of the non-affine subregion.
As can be seen in the check lines we currently load the PHI-node value twice.
This commit documents this behavior. In a subsequent patch we will try to
improve this.
llvm-svn: 292470
Summary:
Instead of forbidding such access functions completely, we verify that their
base pointer has been hoisted and only assert in case the base pointer was
not hoisted.
I was trying for a little while to get a test case that ensures the assert is
correctly fired in case of invariant load hoisting being disabled, but I could
not find a good way to do so, as llvm-lit immediately aborts if a command
yields a non-zero return value. As we do not generally test our asserts,
not having a test case here seems OK.
This resolves http://llvm.org/PR31494
Suggested-by: Michael Kruse <llvm@meinersbur.de>
Reviewers: efriedma, jdoerfert, Meinersbur, gareevroman, sebpop, zinob, huihuiz, pollydev
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D28798
llvm-svn: 292213
This feature is currently not supported and an explicit assert to prevent the
introduction of such accesses has been added in r282893. This test case allows
to reproduce the assert (and without the assert the miscompile) added in
r282893. It will help when adding such support at some point.
llvm-svn: 292147
This allows us to delinearize code such as the one below, where the array
sizes are A[][2 * n] as there are n times two elements in the innermost
dimension. Alternatively, we could try to generate another dimension for the
struct in the innermost dimension, but as the struct has constant size,
recovering this dimension is easy.
struct com {
double Real;
double Img;
};
void foo(long n, struct com A[][n]) {
for (long i = 0; i < 100; i++)
for (long j = 0; j < 1000; j++)
A[i][j].Real += A[i][j].Img;
}
int main() {
struct com A[100][1000];
foo(1000, A);
llvm-svn: 288489
Introduce the new flag -polly-codegen-generate-expressions which forces Polly
to code generate AST expressions instead of using our SCEV based access
expression generation even for cases where the original memory access relation
was not changed and the SCEV based access expression could be code generated
without any issue.
This is an experimental option for better testing the isl ast expression
generation. The default behavior of Polly remains unchanged. We also exclude
a couple of cases for which the AST expression is not yet working.
llvm-svn: 287694
Drop instructions that do not influence the memory impact of a basic block.
They are not needed to reproduce the original bug (verified) and will cause
random test noise if we would decide to only model the instructions that
have visible side-effects.
llvm-svn: 287626
Add two store instructions at the end of basic blocks that are required to
reproduce the original bug to ensure we always process and model these basic
blocks. This makes this test case stable even in case we would decide to bail
out early of basic blocks which do not modify the global state. Also add
additional check lines to verify how we model the basic block.
llvm-svn: 287625
Providing the context to the ast generator allows for additional simplifcations
and -- more importantly -- allows to generate loops with only partially bounded
domains, assuming the domains are bounded for all parameter configurations
that are valid as defined by the context.
This change fixes the crash reported in http://llvm.org/PR30956
The original reason why we did not include the context when generating an
AST was that CLooG and later isl used to sometimes transfer some of the
constraints that bound the size of parameters from the context into the
generated AST. This resulted in operations with very large constants, which
sometimes introduced problematic integer overflows. The latest versions of
the isl AST generator are careful to not introduce such constants.
Reported-by: Eli Friedman <efriedma@codeaurora.org>
llvm-svn: 286442
We do not need the size of the outermost dimension in most cases, but if we
allocate memory for newly created arrays, that size is needed.
Reviewed-by: Michael Kruse <llvm@meinersbur.de>
Differential Revision: https://reviews.llvm.org/D23991
llvm-svn: 281234
Change the code around setNewAccessRelation to allow to use a an existing array
element for memory instead of an ad-hoc alloca. This facility will be used for
DeLICM/DeGVN to convert scalar dependencies into regular ones.
The changes necessary include:
- Make the code generator use the implicit locations instead of the alloca ones.
- A test case
- Make the JScop importer accept changes of scalar accesses for that test case.
- Adapt the MemoryAccess interface to the fact that the MemoryKind can change.
They are named (get|is)OriginalXXX() to get the status of the memory access
before any change by setNewAccessRelation() (some properties such as
getIncoming() do not change even if the kind is changed and are still
required). To get the modified properties, there is (get|is)LatestXXX(). The
old accessors without Original|Latest become synonyms of the
(get|is)OriginalXXX() to not make functional changes in unrelated code.
Differential Revision: https://reviews.llvm.org/D23962
llvm-svn: 280408
Dump polyhedral descriptions of Scops optimized with the isl scheduling
optimizer and the set of post-scheduling transformations applied
on the schedule tree to be able to check the work of the IslScheduleOptimizer
pass at the polyhedral level.
Reviewed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: https://reviews.llvm.org/D23740
llvm-svn: 279395
We already invalidated a couple of critical values earlier on, but we now
invalidate all instructions contained in a scop after the scop has been code
generated. This is necessary as later scops may otherwise obtain SCEV
expressions that reference values in the earlier scop that before dominated
the later scop, but which had been moved into the conditional branch and
consequently do not dominate the later scop any more. If these very values are
then used during code generation of the later scop, we generate used that are
dominated by the values they use.
This fixes: http://llvm.org/PR28984
llvm-svn: 279047
This will make it easier to switch the default of Polly's invariant load
hoisting strategy and also makes it very clear that these test cases
indeed require invariant code hoisting to work.
llvm-svn: 278667
This is the third patch to apply the BLIS matmul optimization pattern on matmul
kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf).
BLIS implements gemm as three nested loops around a macro-kernel, plus two
packing routines. The macro-kernel is implemented in terms of two additional
loops around a micro-kernel. The micro-kernel is a loop around a rank-1
(i.e., outer product) update. In this change we perform replacement of
the access relations and create empty arrays, which are steps to implement
the packing transformation. In subsequent changes we will implement copying
to created arrays.
Reviewed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: http://reviews.llvm.org/D22187
llvm-svn: 278666
In case some code -- not guarded by control flow -- would be emitted directly in
the start block, it may happen that this code would use uninitalized scalar
values if the scalar initialization is only emitted at the end of the start
block. This is not a problem today in normal Polly, as all statements are
emitted in their own basic blocks, but Polly-ACC emits host-to-device copy
statements into the start block.
Additional Polly-ACC test coverage will be added in subsequent changes that
improve the handling of PHI nodes in Polly-ACC.
llvm-svn: 278124
Extend the jscop interface to allow the user to export arrays. It is required
that already existing arrays of the list of arrays correspond to arrays
of the SCoP. Each array that is appended to the list will be newly created.
Furthermore, we allow the user to modify access expressions to reference
any array in case it has the same element type.
Reviewed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: https://reviews.llvm.org/D22828
llvm-svn: 277263
It seems the order in which we generated memory accesses changed such that
the import of these updated memory accesses failed for the 'loop3' statement
in this test case. Unfortunately, the existing CHECK lines were not strict
enough to catch this. Hence, besides fixing the order of the memory access
lines we also ensure that the memory access changes are both clearly visibly
and well checked.
llvm-svn: 276247
With this update the isl AST generation extracts disjunctive constraints early
on. As a result, code that previously resulted in two branches with (close-to)
identical code within them:
if (P <= -1) {
for (int c0 = 0; c0 < N; c0 += 1)
Stmt_store(c0);
} else if (P >= 1)
for (int c0 = 0; c0 < N; c0 += 1)
Stmt_store(c0);
results now in only a single branch body:
if (P <= -1 || P >= 1)
for (int c0 = 0; c0 < N; c0 += 1)
Stmt_store(c0);
This resolves http://llvm.org/PR27559
Besides the above change, this isl update brings better simplification of
sets/maps containing existentially quantified dimensions and fixes a bug in
isl's coalescing.
llvm-svn: 272500
As these test cases will be changed in a subsequent commit, we expand and
tighten them to make the subsequent changes to them more obvious. As part of
this we add more context to some test cases and add CHECK-NEXT lines to ensure
no intermediate lines are missed by accident.
llvm-svn: 272499
IntToPtr and PtrToInt instructions are basically no-ops that we can handle as
such. In order to generate them properly as parameters we had to improve the
ScopExpander, though the change is the first in the direction of a more
aggressive scalar synthetization.
This patch was originally contributed by Johannes Doerfert in r271888, but was
in conflict with the revert in r272483. This is a recommit with some minor
adjustment to the test cases to take care of differing instruction names.
llvm-svn: 272485
The recent expression type changes still need more discussion, which will happen
on phabricator or on the mailing list. The precise list of commits reverted are:
- "Refactor division generation code"
- "[NFC] Generate runtime checks after the SCoP"
- "[FIX] Determine insertion point during SCEV expansion"
- "Look through IntToPtr & PtrToInt instructions"
- "Use minimal types for generated expressions"
- "Temporarily promote values to i64 again"
- "[NFC] Avoid unnecessary comparison for min/max expressions"
- "[Polly] Fix -Wunused-variable warnings (NFC)"
- "[NFC] Simplify min/max expression generation"
- "Simplify the type adjustment in the IslExprBuilder"
Some of them are just reverted as we would otherwise get conflicts. I will try
to re-commit them if possible.
llvm-svn: 272483
This patch refactors the code generation for divisions. This allows to
always generate a shift for a power-of-two division and to utilize
information about constant divisors in order to truncate the result
type.
llvm-svn: 271898
We now generate runtime checks __after__ the SCoP code generation and
not before, though they are still inserted at the same position int
the code. This allows to modify the runtime check during SCoP code
generation.
llvm-svn: 271894
IntToPtr and PtrToInt instructions are basically no-ops that we can handle as
such. In order to generate them properly as parameters we had to improve the
ScopExpander, though the change is the first in the direction of a more
aggressive scalar synthetization.
llvm-svn: 271888