The test failed since commit
bc10888dc "DomTree: Make PostDomTree indifferent to block successors swap"
which is a re-commit of
c35585e20 "DomTree: Make PostDomTree immune to block successors swap"
This will allow us to use the datalayout to disambiguate other
constructs in IR, like load alignment. Split off from D78403.
Differential Revision: https://reviews.llvm.org/D78413
This is equivalent in terms of LLVM IR semantics, but we want to
transition away from using MaybeAlign to represent the alignment of
these instructions.
Differential Revision: https://reviews.llvm.org/D77984
The option is passed as argv to ISL's command line option parser.
Polly's own own command line options take precedence over options passed
as `-polly-isl-arg`. For instance,
`-polly-isl-arg=--schedule-outer-coincidence` will be ignored in favor
of `-polly-opt-outer-coincidence`.
Reviewed By: grosser
Differential Revision: https://reviews.llvm.org/D77303
This reverts commit 80a34ae311 with fixes.
Previously, since bots turning on EXPENSIVE_CHECKS are essentially turning on
MachineVerifierPass by default on X86 and the fact that
inline-asm-avx-v-constraint-32bit.ll and inline-asm-avx512vl-v-constraint-32bit.ll
are not expected to generate functioning machine code, this would go
down to `report_fatal_error` in MachineVerifierPass. Here passing
`-verify-machineinstrs=0` to make the intent explicit.
This reverts commit 80a34ae311 with fixes.
On bots llvm-clang-x86_64-expensive-checks-ubuntu and
llvm-clang-x86_64-expensive-checks-debian only,
llc returns 0 for these two tests unexpectedly. I tweaked the RUN line a little
bit in the hope that LIT is the culprit since this change is not in the
codepath these tests are testing.
llvm\test\CodeGen\X86\inline-asm-avx-v-constraint-32bit.ll
llvm\test\CodeGen\X86\inline-asm-avx512vl-v-constraint-32bit.ll
Static chunked OpenMP scheduling has not been treated correctly.
This patch fixes the problem that threads would not process their
(work-)chunks as intended.
Differential Revision: https://reviews.llvm.org/D61081
Since the removal of extensions nodes from schedule trees in r362257 it
is possible to emit parallel code for SCoPs containing
matrix-multiplications. However, the code looking for references used in
outlined statement was not prepared to handle CopyStmts introduced by
the matrix-matrix multiplication detection.
In this case, CopyStmts do not introduce references in addition to the
ones captured by MemoryAccesses, i.e. we change the assertion to accept
CopyStmts and add a regression test for this case.
This fixes llvm.org/PR43164
llvm-svn: 372188
https://reviews.llvm.org/D61934, committed as r362687, r363540, r363364
and r363147, made some emitted instruction nus/nsw. Add these falgs to
Polly's regression tests.
This should fix
Polly :: Isl/CodeGen/partial_write_in_region_with_loop.ll
Polly :: Isl/CodeGen/scev_expansion_in_nonaffine.ll
llvm-svn: 363599
At the end of a region statement, the PHINode must be generated
while the current IRBuilder's block is the region's exit node. For
obvious reasons: The PHINode references the region's exiting block.
A partial write would insert new control flow, i.e. insert new basic
blocks between the exiting blocks and the current block.
We fix this by generating the PHI nodes (region exit values) before
generating any MemoryAccess's stores.
This should fix the AOSP buildbot.
Reported-by: Eli Friedman <efriedma@quicinc.com>
llvm-svn: 361204
The ParallelLoopGenerator class is changed such that GNU OpenMP specific
code was removed, allowing to use it as super class in a
template-pattern. Therefore, the code has been reorganized and one may
not use the ParallelLoopGenerator directly anymore, instead specific
implementations have to be provided. These implementations contain the
library-specific code. As such, the "GOMP" (code completely taken from
the existing backend) and "KMP" variant were created.
For "check-polly" all tests that involved "GOMP": equivalents were added
that test the new functionalities, like static scheduling and different
chunk sizes. "docs/UsingPollyWithClang.rst" shows how the alternative
backend may be used.
Patch by Michael Halkenhäuser <michaelhalk@web.de>
Differential Revision: https://reviews.llvm.org/D59100
llvm-svn: 356434
compiler identification lines in test-cases.
(Doing so only because it's then easier to search for references which
are actually important and need fixing.)
llvm-svn: 351200
Summary:
Update all rdtscp callsites in PerfMonitor so that they conform with the signature changes introduced in r341698.
Reviewers: grosser, bollu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D51928
llvm-svn: 341946
Summary:
This initiates a discussion on changing Polly accordingly while re-applying r335197 (D48338).
I have never worked on Polly. The proposed change to param_div_div_div_2.ll is not educated, but just patterns that match the output.
All LLVM files are already reviewed in D48338.
Reviewers: jdoerfert, bollu, efriedma
Subscribers: jlebar, sanjoy, hiraditya, llvm-commits, bixia
Differential Revision: https://reviews.llvm.org/D48453
llvm-svn: 335292
Add the options -polly-codegen-trace-stmts and
-polly-codegen-trace-scalars. When enabled, adds a call to the
beginning of every generated statement that prints the executed
statement instance. With -polly-codegen-trace-scalars, it also prints
the value of all scalars that are used in the statement, and PHIs
defined in the beginning of the statement.
Differential Revision: https://reviews.llvm.org/D45743
llvm-svn: 330864
A check in assert-builds was meant to verify that a load provides a
value in all statement instances (i.e. its domain). The domain is
commonly gist'ed within the parameter context to contain fewer
constraints. However, statement instances outside the context are
no valid executions, hence the value provided can be undefined.
Refine the check for valid loads to only needed to be defined within
the SCoP context.
In addition, the JSONImporter had to be changed to allow importing
access relations that are broader than the current access relation,
but still defined over all statement instances.
This should fix the compiler crash in test-suite's oggenc of the
-polly-process-unprofitable buildbot.
llvm-svn: 329655
Summary:
When checking the parallelism of a scheduling dimension, we first check if excluding reduction dependences the loop is parallel or not.
If the loop is not parallel, then we need to return the minimal dependence distance of all data dependences, including the previously subtracted reduction dependences.
Reviewers: grosser, Meinersbur, efriedma, eli.friedman, jdoerfert, bollu
Reviewed By: Meinersbur
Subscribers: llvm-commits, pollydev
Tags: #polly
Differential Revision: https://reviews.llvm.org/D45236
llvm-svn: 329214
Splitting basic blocks into multiple statements if there are now
additional scalar dependencies gives more freedom to the scheduler, but
more statements also means higher compile-time complexity. Switch to
finer statement granularity, the additional compile time should be
limited by the number of operations quota.
The regression tests are written for the -polly-stmt-granularity=bb
setting, therefore we add that flag to those tests that break with the
new default. Some of the tests only fail because the statements are
named differently due to a basic block resulting in multiple statements,
but which are removed during simplification of statements without
side-effects. Previous commits tried to reduce this effect, but it is
not completely avoidable.
Differential Revision: https://reviews.llvm.org/D42151
llvm-svn: 324169
Memory transfer instructions take two pointers. It is not defined to
which of those a noalias annotation applies. To ensure correctness,
do not add noalias annotations to memcpy/memmove instructions anymore.
The caused a miscompile with test-suite's MultiSource/Applications/obsequi.
Since r321138, the MemCpyOpt pass would remove memcpy/memmove calls if
known to copy uninitialized memory. In that case, it was initialized
by another memcpy, but the annotation for the target pointer said
it would not alias. The annotation was actually meant for the source
pointer, which was was an alloca and could not alias with the target
pointer.
llvm-svn: 321371
Isl does not allow generating isl_ast_expr from an isl_pw_aff that has an
empty domain (i.e. has no pieces). We already detected the case if the
isl_pw_aff comes with an empty domain.
isl_ast_build also considers the domain empty if it is disjoint with the
parameter context (e.g. parameters values that we exclude by runtime
versioning).
Intersect the access relation domain with the parameter context to
also detect such practically empty access domains. The effective
pointer used in the generated code is unimportand because it will never
be executed.
This fixes llvm.org/PR35362
llvm-svn: 318806
When collecting base pointers that need to be made available in parallel
subfunctions, use the base pointer associated with the latest
ScopArrayInfo, instead of the original one.
llvm-svn: 316983
After rL315683 (improve SCEV to calculate max BETakenCount when end
bound of loop is variant and loop is of form {Start,+1, Stride} LT End)
this test in polly started failing.
However, as discussed in https://reviews.llvm.org/rL315683,
this polly test is not a loops bound test and the MaxBECount calculated by
SCEV looks correct. The max BECount is the value calculated even when the end
bound of loop is invariant.
As discussed with Tobias offline, I'm marking this as an XFAIL, until he
gets a chance to update the testcase, so the build bot goes to green.
llvm-svn: 315912
This test XFAILs two test that start to fail when verifying DT's
DFS numbers, as per Tobias' suggestion.
Related VerifyDFSNumbers patch: D38331.
llvm-svn: 314800
In case a PHI node follows an error block we can assume that the incoming value
can only come from the node that is not an error block. As a result, conditions
that seemed non-affine before are now in fact affine.
This is a recommit of r312663 after fixing
test/Isl/CodeGen/phi_after_error_block_outside_of_scop.ll
llvm-svn: 314075
Such RTCs may introduce integer wrapping intrinsics with more than 64 bit,
which are translated to library calls on AOSP that are not part of the
runtime and will consequently cause linker errors.
Thanks to Eli Friedman for reporting this issue and reducing the test case.
llvm-svn: 314065
Since -polly-codegen reports itself to preserve DependenceInfo and IslAstInfo,
we might get those analysis that were computed by a different ScopInfo for a
different Scop structure. This would be unfortunate because DependenceInfo and
IslAstInfo hold references to resources allocated by
ScopInfo/ScopBuilder/Scop (e.g. isl_id). If -polly-codegen and
DependenceInfo/IslAstInfo do not agree on which Scop to use, unpredictable
things can happen.
When the ScopInfo/Scop object is freed, there is a high probability that the
new ScopInfo/Scop object will be created at the same heap position with the
same address. Comparing whether the Scop or ScopInfo address is the expected
therefore is unreliable.
Instead, we compare the address of the isl_ctx object. Both, DependenceInfo
and IslAstInfo must hold a reference to the isl_ctx object to ensure it is
not freed before the destruction of those analyses which might happen after
the destruction of the Scop/ScopInfo they refer to. Hence, the isl_ctx
will not be freed and its address not reused as long there is a
DependenceInfo or IslAstInfo around.
This fixes llvm.org/PR34441
llvm-svn: 313842
The type of NewValue might change due to ScalarEvolution
looking though bitcasts. The synthesized NewValue therefore
becomes the type before the bitcast.
llvm-svn: 312718
In certain situations, the context in the isl_ast_build could result for the
min/max locations of our alias sets to become empty, which would cause an
internal error in isl, which is then unable to derive a value for these
expressions. Check these conditions before code generating expressions and
instead assume that alias check succeeded. This is valid, as the corresponding
memory accesses will not be executed under any valid context.
This fixed llvm.org/PR34432. Thanks to Qirun Zhang for reporting.
llvm-svn: 312455
The adds code generation support for the previous commit.
This patch has been re-applied, after the memory issue in the previous patch
has been fixed.
llvm-svn: 312211
This patch allows annotating of metadata in ir instruction
(with "polly_split_after"), which specifies where to split a particular
scop statement.
Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in>
Differential Revision: https://reviews.llvm.org/D36402
llvm-svn: 312107
Whether a partial write is tautological/unsatisfiable not only
depends on the access domain, but also on the domain covered
by its node in the AST.
In the example below, there are two instances of Stmt_cond_false. It may have a partial write access that is not executed in instance Stmt_cond_false(0).
for (int c0 = 0; c0 < tmp5; c0 += 1) {
Stmt_for_body344(c0);
if (tmp5 >= c0 + 2)
Stmt_cond_false(c0);
Stmt_cond_end(c0);
}
if (tmp5 <= 0) {
Stmt_for_body344(0);
Stmt_cond_false(0);
Stmt_cond_end(0);
}
Isl cannot derive a subscript for an array element that is never accessed.
This caused an error in that no subscript expression has been generated
in IslNodeBuilder::createNewAccesses, but BlockGenerator expected one
to exist because there is an execution of that write, just not in that
ast node.
Fixed by instead of determining whether the access domain is empty,
inspect whether isl generated a constant "false" ast expression in
the current ast node.
This should fix a compiler crash of the aosp buildbot.
llvm-svn: 311663
Summary:
There is no need to emit alias metadata for scalars, as basicaa will easily
distinguish them from arrays. This reduces the size of the metadata we generate.
This is especially useful after we moved to -polly-position=before-vectorizer,
where a lot more scalar dependences are introduced, which increased the size of
the alias analysis metadata and made us commonly reach the limits after which
we do not emit alias metadata that have been introduced to prevent quadratic
growth of this alias metadata.
This improves 2mm performance from 1.5 seconds to 0.17 seconds.
Reviewers: Meinersbur, bollu, singam-sanjay
Reviewed By: Meinersbur
Subscribers: pollydev, llvm-commits
Tags: #polly
Differential Revision: https://reviews.llvm.org/D37028
llvm-svn: 311498
Summary:
Before, if we fail to parse a jscop file, this will be reported as an
error and importing is aborted. However, this isn't actually strong
enough, since although the import is aborted, the scop has already been
modified and is very likely broken. Instead, make this a hard failure
and throw an LLVM error. This new behaviour requires small changes to
the tests for the legacy pass, namely using `not` to verify the error.
Further, fixed the jscop file for the
base_pointer_load_is_inst_inside_invariant_1 testcase.
Reviewed By: Meinersbur
Split out of D36578.
llvm-svn: 310599
Codegen with -polly-parallel queried the unmapped MemoryAccess, but only
the MemoryKind after mapping is relevant for codegen.
This should fix various fails of the
perf-x86_64-penryn-O3-polly-parallel-fast buildbot.
llvm-svn: 310466
Summary:
This consists instances of two changes:
- Accept any order of checks for a specific loop form, that appear in different order in the new vs legacy-PM.
- Remove checks for specific regions.
Reviewers: grosser
Reviewed By: grosser
Subscribers: pollydev, llvm-commits
Tags: #polly
Differential Revision: https://reviews.llvm.org/D35837
llvm-svn: 308976
When performing invariant load hoisting we check that invariant load expressions
are not too complex. Up to this commit, we performed this check by counting the
sum of dimensions in the access range as a very simple heuristic. This heuristic
is a little too conservative, as it prevents hoisting for any scops with a
very large number of parameters. Hence, we update the heuristic to only count
existentially quantified dimensions and set dimensions. We expect this to still
detect the problematic expressions in h264 because of which this check was
originally introduced.
For some unknown reason, this complexity check was originally committed in
IslNodeBuilder. It really belongs in ScopInfo, as there is no point in
optimizing a program which we could have known earlier cannot be code generated.
The benefit of running the check early is that we can avoid to even hoist checks
that are expensive to code generate as invariant loads. This can be seen in
the changed tests, where we now indeed detect the scop, but just not invariant
load hoist the complicated access.
We also improve the formatting of the code, document it, and use isl++ to
simplify expressions.
llvm-svn: 308659