Recursively traversing the operand tree leads to an exponential blowup
if instructions are used multiple times due to every path leading to an
additional copy of the instructions after forwarding. This problem was
marked as a TODO in the code and was reported as a bug in llvm.org/PR47340.
Fix by caching already visited instructions and returning the cached
version when already visited. Instead of calling forwardTree() twice,
return a ForwardingAction structure that contains a lambda which will
carry-out the forwarding when requested. The lambdas are executed in
reverse-postorder to mimic the previous recursive calls unless there
is a reuse.
Fixes llvm.org/PR47340
Polly incorrectly dropped the address space specified for a load instruction when it vectorized the code.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D88907
While we haven't encountered an earth-shattering problem with this yet,
by now it is pretty evident that trying to model the ptr->int cast
implicitly leads to having to update every single place that assumed
no such cast could be needed. That is of course the wrong approach.
Let's back this out, and re-attempt with some another approach,
possibly one originally suggested by Eli Friedman in
https://bugs.llvm.org/show_bug.cgi?id=46786#c20
which should hopefully spare us this pain and more.
This reverts commits 1fb6104293,
7324616660,
aaafe350bb,
e92a8e0c74.
I've kept&improved the tests though.
This relands commit 1c021c64ca which was
reverted in commit 17cec6a11a because
an assertion was being triggered, since `BuildConstantFromSCEV()`
wasn't updated to handle the case where the constant we want to truncate
is actually a pointer. I was unsuccessful in coming up with a test case
where we'd end there with constant zext/sext of a pointer,
so i didn't handle those cases there until there is a test case.
Original commit message:
While we indeed can't treat them as no-ops, i believe we can/should
do better than just modelling them as `unknown`. `inttoptr` story
is complicated, but for `ptrtoint`, it seems straight-forward
to model it just as a zext-or-trunc of unknown.
This may be important now that we track towards
making inttoptr/ptrtoint casts not no-op,
and towards preventing folding them into loads/etc
(see D88979/D88789/D88788)
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D88806
> While we indeed can't treat them as no-ops, i believe we can/should
> do better than just modelling them as `unknown`. `inttoptr` story
> is complicated, but for `ptrtoint`, it seems straight-forward
> to model it just as a zext-or-trunc of unknown.
>
> This may be important now that we track towards
> making inttoptr/ptrtoint casts not no-op,
> and towards preventing folding them into loads/etc
> (see D88979/D88789/D88788)
>
> Reviewed By: mkazantsev
>
> Differential Revision: https://reviews.llvm.org/D88806
It caused the following assert during Chromium builds:
llvm/lib/IR/Constants.cpp:1868:
static llvm::Constant *llvm::ConstantExpr::getTrunc(llvm::Constant *, llvm::Type *, bool):
Assertion `C->getType()->isIntOrIntVectorTy() && "Trunc operand must be integer"' failed.
See code review for a link to a reproducer.
This reverts commit 1c021c64ca.
While we indeed can't treat them as no-ops, i believe we can/should
do better than just modelling them as `unknown`. `inttoptr` story
is complicated, but for `ptrtoint`, it seems straight-forward
to model it just as a zext-or-trunc of unknown.
This may be important now that we track towards
making inttoptr/ptrtoint casts not no-op,
and towards preventing folding them into loads/etc
(see D88979/D88789/D88788)
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D88806
VirtualUse of type UseKind::Inter expects the definition of a
llvm::Value to be represented in another statement. In the bug report
that statement has been removed due to its domain being empty.
Scop::InstStmtMap for the llvm::Value's defintion still pointed to the
removed statement, which resulted in the use-after-free.
The defintion statement was removed by Simplify because it was
considered to not be reachable by other uses; trivially because it is
never executed due to its empty domain. However, no such thing happend
to the using statement using the value altough its domain is also empty.
Fix by always removing statements with empty domains in Simplify since
these are not properly analyzable. A UseKind::Inter should always have a
statement with its defintion due to LLVM's SSA form.
Scop::removeStmtNotInDomainMap() also removes statements with empty
domains but does so without considering the context as used by
Simplify's analyzes.
In another angle, InstStmtMap pointing to removed statements should not
happen either and ForwardOpTree would have bailed out if the llvm::Value
definition was not represented by a statement. This will be corrected in
a followup-commit.
This fixes llvm.org/PR47098
The test failed since commit
bc10888dc "DomTree: Make PostDomTree indifferent to block successors swap"
which is a re-commit of
c35585e20 "DomTree: Make PostDomTree immune to block successors swap"
The schedule of a fused loop has one isl_space per statement, such that
a conversion to a isl_map fails. However, the prevectorization is
interested in the schedule space only: Converting to the non-union
representation only after extracting the schedule range fixes the problem.
This fixes llvm.org/PR46578
The member LastSchedule was never set, such that printScop would always
print "n/a" instead of the last schedule.
To ensure that the isl_ctx lives as least as long as the stored
schedule, also store a shared_ptr.
Also set the schedule tree output style to ISL_YAML_STYLE_BLOCK to avoid
printing everything on a single line.
`opt -polly-opt-isl -analyze` will be used in the next commit.
If we don't know anything about the alignment of a pointer, Align(1) is
still correct: all pointers are at least 1-byte aligned.
Included in this patch is a bugfix for an issue discovered during this
cleanup: pointers with "dereferenceable" attributes/metadata were
assumed to be aligned according to the type of the pointer. This
wasn't intentional, as far as I can tell, so Loads.cpp was fixed to
stop making this assumption. Frontends may need to be updated. I
updated clang's handling of C++ references, and added a release note for
this.
Differential Revision: https://reviews.llvm.org/D80072
For IR generated by a compiler, this is really simple: you just take the
datalayout from the beginning of the file, and apply it to all the IR
later in the file. For optimization testcases that don't care about the
datalayout, this is also really simple: we just use the default
datalayout.
The complexity here comes from the fact that some LLVM tools allow
overriding the datalayout: some tools have an explicit flag for this,
some tools will infer a datalayout based on the code generation target.
Supporting this properly required plumbing through a bunch of new
machinery: we want to allow overriding the datalayout after the
datalayout is parsed from the file, but before we use any information
from it. Therefore, IR/bitcode parsing now has a callback to allow tools
to compute the datalayout at the appropriate time.
Not sure if I covered all the LLVM tools that want to use the callback.
(clang? lli? Misc IR manipulation tools like llvm-link?). But this is at
least enough for all the LLVM regression tests, and IR without a
datalayout is not something frontends should generate.
This change had some sort of weird effects for certain CodeGen
regression tests: if the datalayout is overridden with a datalayout with
a different program or stack address space, we now parse IR based on the
overridden datalayout, instead of the one written in the file (or the
default one, if none is specified). This broke a few AVR tests, and one
AMDGPU test.
Outside the CodeGen tests I mentioned, the test changes are all just
fixing CHECK lines and moving around datalayout lines in weird places.
Differential Revision: https://reviews.llvm.org/D78403
After the update to ISL to isl-0.22.1-87-gfee05a13 and its change of
isl_*_dim returning -1 instead of 0, the -1 got wrapped-around to
UINT_MAX because Polly often uses 'unsigned' type to represent
dimensions, as ISL did before this patch. This may happen in normal
executions after an out-of-quota.
Fix by catching the error-case earlier.
This will allow us to use the datalayout to disambiguate other
constructs in IR, like load alignment. Split off from D78403.
Differential Revision: https://reviews.llvm.org/D78413
This is equivalent in terms of LLVM IR semantics, but we want to
transition away from using MaybeAlign to represent the alignment of
these instructions.
Differential Revision: https://reviews.llvm.org/D77984
The option is passed as argv to ISL's command line option parser.
Polly's own own command line options take precedence over options passed
as `-polly-isl-arg`. For instance,
`-polly-isl-arg=--schedule-outer-coincidence` will be ignored in favor
of `-polly-opt-outer-coincidence`.
Reviewed By: grosser
Differential Revision: https://reviews.llvm.org/D77303
This reverts commit 80a34ae311 with fixes.
Previously, since bots turning on EXPENSIVE_CHECKS are essentially turning on
MachineVerifierPass by default on X86 and the fact that
inline-asm-avx-v-constraint-32bit.ll and inline-asm-avx512vl-v-constraint-32bit.ll
are not expected to generate functioning machine code, this would go
down to `report_fatal_error` in MachineVerifierPass. Here passing
`-verify-machineinstrs=0` to make the intent explicit.
This reverts commit 80a34ae311 with fixes.
On bots llvm-clang-x86_64-expensive-checks-ubuntu and
llvm-clang-x86_64-expensive-checks-debian only,
llc returns 0 for these two tests unexpectedly. I tweaked the RUN line a little
bit in the hope that LIT is the culprit since this change is not in the
codepath these tests are testing.
llvm\test\CodeGen\X86\inline-asm-avx-v-constraint-32bit.ll
llvm\test\CodeGen\X86\inline-asm-avx512vl-v-constraint-32bit.ll
Static chunked OpenMP scheduling has not been treated correctly.
This patch fixes the problem that threads would not process their
(work-)chunks as intended.
Differential Revision: https://reviews.llvm.org/D61081
The primary motivation is to fix an assertion failure in
isl_basic_map_alloc_equality:
isl_assert(ctx, room_for_con(bmap, 1), return -1);
Although the assertion does not occur anymore, I could not identify
which of ISL's commits fixed it.
Compared to the previous ISL version, Polly requires some changes for this update
* Since ISL commit
20d3574 "perform parameter alignment by modifying both arguments to function"
isl_*_gist_* and similar functions do not always align the paramter
list anymore. This caused the parameter lists in JScop files to
become out-of-sync. Since many regression tests use JScop files with
a fixed parameter list and order, we explicitly call align_params to
ensure a predictable parameter list.
* ISL changed some return types to isl_size, a typedef of (signed) int.
This caused some issues where the return type was unsigned int before:
- No overload for std::max(unsigned,isl_size)
- It cause additional 'mixed signed/unsigned comparison' warnings.
Since they do not break compilation, and sizes larger than 2^31
were never supported, I am going to fix it separately.
* With the change to isl_size, commit
57d547 "isl_*_list_size: return isl_size"
also changed the return value in case of an error from 0 to -1. This
caused undefined looping over isl_iterator since the 'end iterator'
got index -1, never reached from the 'begin iterator' with index 0.
* Some internal changes in ISL caused the number of operations to
increase when determining access ranges to determine aliasing
overlaps. In one test, this caused exceeding the default limit of
800000. The operations-limit was disabled for this test.
Previously, the enums didn't account for all the possible cases, which
could cause misleading results (particularly for a "switch" on
FunctionModRefBehavior).
Fixes regression in polly from recent patch to add writeonly to memset.
While I'm here, also fix a few dubious uses of the FMRB_* enum values.
Differential Revision: https://reviews.llvm.org/D73154
There's quite a lot of references to Polly in the LLVM CMake codebase. However
the registration pattern used by Polly could be useful to other external
projects: thanks to that mechanism it would be possible to develop LLVM
extension without touching the LLVM code base.
This patch has two effects:
1. Remove all code specific to Polly in the llvm/clang codebase, replaicing it
with a generic mechanism
2. Provide a generic mechanism to register compiler extensions.
A compiler extension is similar to a pass plugin, with the notable difference
that the compiler extension can be configured to be built dynamically (like
plugins) or statically (like regular passes).
As a result, people willing to add extra passes to clang/opt can do it using a
separate code repo, but still have their pass be linked in clang/opt as built-in
passes.
Differential Revision: https://reviews.llvm.org/D61446
Commit 395124 "NVPTX: Don't insert an extra empty line at the end of the last section"
changed the length of the kernel payload. Update the regression test to the new binary size.
ScopBuilder::buildEqivClassBlockStmts creates ScopStmts for instruction
groups in basic block and inserts these ScopStmts into Scop::StmtMap,
however, as described in llvm.org/PR38358, comment #5, StmtScops are
inserted into vector ScopStmt[BB] in wrong order. As a result,
ScopBuilder::buildSchedule creates wrong order sequence node.
Looking closer to code, it's clear there is no equivalent classes with
interleaving isOrderedInstruction(memory access) instructions after
joinOrderedInstructions. Afterwards, ScopStmts need to be created and
inserted in the original order of memory access instructions, however,
at the moment ScopStmts are inserted in the order of leader instructions
which are probably not memory access instructions.
The fix is simple with a standalone loop scanning
isOrderedInstruction(memory access) instructions in basic block and
inserting elements into LeaderToInstList one by one. The patch also
removes double reversing operations which are now unnecessary.
New test preserve-equiv-class-order-in-basic_block.ll is also added.
Differential Revision: https://reviews.llvm.org/D68941
llvm-svn: 375192
Since the removal of extensions nodes from schedule trees in r362257 it
is possible to emit parallel code for SCoPs containing
matrix-multiplications. However, the code looking for references used in
outlined statement was not prepared to handle CopyStmts introduced by
the matrix-matrix multiplication detection.
In this case, CopyStmts do not introduce references in addition to the
ones captured by MemoryAccesses, i.e. we change the assertion to accept
CopyStmts and add a regression test for this case.
This fixes llvm.org/PR43164
llvm-svn: 372188
Function joinOrderedInstructions merges instructions when a leader is encountered twice.
It also notices that leaders in SeenLeaders may lose their leadership in previous merging,
and tries to handle the case using following code:
Instruction *PrevLeader = UnionFind.getLeaderValue(SeenLeaders.back());
However, this is wrong because it always gets leader for the last element of SeenLeaders,
and I believe it's wrong even we get leader for Prev here. As a result, Statements in cases
like the one in patch aren't merged as expected. After investigation, I believe it's
unnecessary to get leader instruction at all. This is based on fact: Although leaders in
SeenLeaders could lose leadership, they only lose to others in SeenLeaders, in other words,
one existing leader will be chosen as new leader of merged equivalent statements. We can
take advantage of this and simply check if current leader equals to Prev and break merging
if it does.
The patch also adds a new test.
Patch by bin.narwal <bin.narwal@gmail.com>
Differential Revision: https://reviews.llvm.org/D67007
llvm-svn: 371801
https://reviews.llvm.org/D61934, committed as r362687, r363540, r363364
and r363147, made some emitted instruction nus/nsw. Add these falgs to
Polly's regression tests.
This should fix
Polly :: Isl/CodeGen/partial_write_in_region_with_loop.ll
Polly :: Isl/CodeGen/scev_expansion_in_nonaffine.ll
llvm-svn: 363599
Extension nodes make schedule trees are less flexible: Many operations,
such as rescheduling, do not work on such schedule trees with extension.
As such, some functionality such as determining parallel loops in isl's
AST are disabled.
Currently, only the pattern-matching generalized matrix-matrix
multiplication optimization adds extension nodes (to add copy-in
statements).
This patch removes all extension nodes as the last step of the schedule
optimization by hoisting the extension node's added domain up to the
root domain node. All following passes can assume that schedule trees
work without restrictions, including the parallelism test. Mark the
outermost loop of the optimized matrix-matrix multiplication as parallel
such that -polly-parallel is able to parallelize that loop.
Differential Revision: https://reviews.llvm.org/D58202
llvm-svn: 362257
isl_map_from_union_map cannot determine the map's space if the union_map
is empty. polly::singleton was designed for this case. We pass the
expected map space to avoid crashing in isl_map_from_union_map.
This fixes an issue found by the aosp buildbot. Thanks to Eli Friedman
for the reproducer.
llvm-svn: 361290
At the end of a region statement, the PHINode must be generated
while the current IRBuilder's block is the region's exit node. For
obvious reasons: The PHINode references the region's exiting block.
A partial write would insert new control flow, i.e. insert new basic
blocks between the exiting blocks and the current block.
We fix this by generating the PHI nodes (region exit values) before
generating any MemoryAccess's stores.
This should fix the AOSP buildbot.
Reported-by: Eli Friedman <efriedma@quicinc.com>
llvm-svn: 361204
In certain cases, it's possible for delinearization to decide one of the
array dimensions should be some function of an induction variable inside
the scop. Make sure if this happens, we refuse to use those dimensions
for delinearization.
Usually, we end up rejecting the scop before it actually crashes, but it
looks like it's possible to slip past other checks in certain cases
involving smax expressions.
Fixes a crash that started showing up this week on the polly AOSP
builder. As far as I can tell, this is a longstanding issue, though;
it was just exposed by better SCEV analysis of smin expressions.
Differential Revision: https://reviews.llvm.org/D61807
llvm-svn: 360708
PHI nodes (reads) could point to multiple instances of predecessor
blocks (PHI writes) when in an invalid context. Fix by removing PHI
instances that are in an invalid or ouside assumed context.
This fixes llvm.org/PR41656.
llvm-svn: 360454
The ParallelLoopGenerator class is changed such that GNU OpenMP specific
code was removed, allowing to use it as super class in a
template-pattern. Therefore, the code has been reorganized and one may
not use the ParallelLoopGenerator directly anymore, instead specific
implementations have to be provided. These implementations contain the
library-specific code. As such, the "GOMP" (code completely taken from
the existing backend) and "KMP" variant were created.
For "check-polly" all tests that involved "GOMP": equivalents were added
that test the new functionalities, like static scheduling and different
chunk sizes. "docs/UsingPollyWithClang.rst" shows how the alternative
backend may be used.
Patch by Michael Halkenhäuser <michaelhalk@web.de>
Differential Revision: https://reviews.llvm.org/D59100
llvm-svn: 356434
compiler identification lines in test-cases.
(Doing so only because it's then easier to search for references which
are actually important and need fixing.)
llvm-svn: 351200
IslAst could mark two nested outer loops as "OutermostParallel". It
caused that the code generator tried to OpenMP-parallelize both loops,
which it is not prepared loop.
It was because the recursive AST build algorithm managed a flag
"InParallelFor" to ensure that no nested loop is also marked as
"OutermostParallel". Unfortunatetly the same flag was used by nodes
marked as SIMD, and reset to false after the SIMD node. Since loops can
be marked as SIMD inside "OutermostParallel" loops, the recursive
algorithm again tried to mark loops as "OutermostParellel" although
still nested inside another "OutermostParallel" loop.
The fix exposed another bug: The function "astScheduleDimIsParallel" was
only called when a loop was potentially "OutermostParallel" or
"InnermostParallel", but as a side-effect also determines the minimum
dependence distance. Hence, changing when we need to know whether a loop
is "OutermostParallel" also changed which loop was annotated with
"#pragma minimal dependence distance".
Moreover, some complex condition linked with "InParallelFor" determined
whether a loop should be an "InnermostParallel" loop. It missed some
situations where it would not use mark as such although being inside an
SIMD mark node, and therefore not be annotated using "#pragma simd".
The changes in particular:
1. Split the "InParallelFor" flag into an "InParallelFor" and an
"InSIMD" flag.
2. Unconditionally call "astScheduleDimIsParallel" for its side-effects
and store the result in "InParallel" for later use.
3. Simplify the condition when a loop is "InnermostParallel".
Fixes llvm.org/PR33153 and llvm.org/PR38073.
llvm-svn: 343212
Summary:
Update all rdtscp callsites in PerfMonitor so that they conform with the signature changes introduced in r341698.
Reviewers: grosser, bollu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D51928
llvm-svn: 341946
The domain generation used nullptr to mark the domain of an error block
as never-executed. Later, nullptr domains are recreated with a
zero-tuple domain that then mismatches with the expected domain the
error block within the loop.
Instead of using nullptr, assign an empty domain which preserves the
expected space. Remove empty domains during SCoP simplification.
Fixes llvm.org/PR38218.
llvm-svn: 338646
Summary:
This patch changes the return types for ocl_get_* functions during SPIR code generation. Because these functions return size_t types, the return type needs to be changed to the actual size of size_t on the device.
Based on work by Michal Babej and Pekka Jääskeläinen
Patch by: Alain Denzler
Reviewers: grosser, philip.pfaffe, bollu
Reviewed By: grosser, philip.pfaffe
Subscribers: nemanjai, kbarton, llvm-commits
Differential Revision: https://reviews.llvm.org/D48774
llvm-svn: 336080
Summary: This patch aims to provide support for detecting load patterns which are collectively invariant but right now `isHoistableLoad()` is checking each load instruction individually which cannot detect the load pattern as a whole.
Patch by: Sahil Girish Yerawar
Reviewers: bollu, philip.pfaffe, Meinersbur
Reviewed By: philip.pfaffe, Meinersbur
Differential Revision: https://reviews.llvm.org/D48026
llvm-svn: 335949
Summary:
This initiates a discussion on changing Polly accordingly while re-applying r335197 (D48338).
I have never worked on Polly. The proposed change to param_div_div_div_2.ll is not educated, but just patterns that match the output.
All LLVM files are already reviewed in D48338.
Reviewers: jdoerfert, bollu, efriedma
Subscribers: jlebar, sanjoy, hiraditya, llvm-commits, bixia
Differential Revision: https://reviews.llvm.org/D48453
llvm-svn: 335292
The default statement granularity changed in a recent change by Micheal. To
avoid forwad-porting the testcases, enable the legacy behaviour again in these tests.
llvm-svn: 333105
statement naming
- A recent ppcg/isl update caused the grid/block size upper bounds to
deviate by one from the oracle. This is not an effect that's visible at
runtime.
- Statement naming changed in polly. Update the testcases.
llvm-svn: 333090
An assertion was not prepared to be passed a nullptr because the
out-of-quota limit was exceeded. Bail-out before the assertion
since the assertion does not apply on out-of-quote.
This fixes llvm.org/PR37477.
llvm-svn: 332488
nullptr is not a valid affine expression, and none of the callers check
for null, so we eventually hit an isl error and crash.
Instead, invalidate the scop and return a constant zero.
Differential Revision: https://reviews.llvm.org/D46445
llvm-svn: 332309
The condition was introduced in r267142 to mitigate a long compile-time
case. In r306087, a max-computation limit was introduced that should
handle the same case while leaving the max disjuncts heuristic it
should have replaced intact.
Today, the max disjuncts bail-out causes problems in that it prematurely
stops SCoPs from being detected, e.g. in SPEC's lbm. This would hit less
like if isl_set_coalesce would be called after isl_set_remove_divs
(which makes more basic_set likely to be coalescable) instead of before.
This patch tries to remove the premature max-disjuncts bail-out
condition by using simple_hull() to reduce the computational overhead,
instead of directly invalidating that SCoP.
Differential Revision: https://reviews.llvm.org/D45066
Contributed-by: Sahil Girish Yerawar <cs15btech11044@iith.ac.in>
llvm-svn: 331891
This test case does not require undef to be present in branch
conditions. Replace these undef values with true/false values to clarify
the control-flow required to reach the loop under testing.
llvm-svn: 331744
Summary:
Occasionally you need an include or similar things to be configured
when making a new testcase. Allow passing these to the script and down to the
compiler calls.
Reviewers: grosser, Meinersbur, bollu
Reviewed By: Meinersbur
Subscribers: bollu, llvm-commits, pollydev
Differential Revision: https://reviews.llvm.org/D46359
llvm-svn: 331364
Add the options -polly-codegen-trace-stmts and
-polly-codegen-trace-scalars. When enabled, adds a call to the
beginning of every generated statement that prints the executed
statement instance. With -polly-codegen-trace-scalars, it also prints
the value of all scalars that are used in the statement, and PHIs
defined in the beginning of the statement.
Differential Revision: https://reviews.llvm.org/D45743
llvm-svn: 330864
The current statement domain derivation algorithm does not (always)
consider that different exit blocks of a loop can have different
conditions to be reached.
From the code
for (int i = n; ; i-=2) {
if (i <= 0) goto even;
if (i <= 1) goto odd;
A[i] = i;
}
even:
A[0] = 42;
return;
odd:
A[1] = 21;
return;
Polly currently derives the following domains:
Stmt_even_critedge
Domain :=
[n] -> { Stmt_even_critedge[] };
Stmt_odd
Domain :=
[n] -> { Stmt_odd[] : (1 + n) mod 2 = 0 and n > 0 };
while the domain for the odd case is correct, Stmt_even is assumed to be
executed unconditionally, which is obviously wrong. While projecting out
the loop dimension in `adjustDomainDimensions`, it does not consider
that there are other exit condition that have matched before.
I don't know a how to fix this without changing a lot of code. Therefore
This patch rejects loops with multiple exist blocks to fix the
miscompile of test-suite's uuencode.
The odd condition is transformed by LLVM to
%cmp1 = icmp eq i64 %indvars.iv, 1
such that the project_out in adjustDomainDimensions() indeed only
matches for odd n (using this condition only, we'd have an infinite loop
otherwise).
The even condition manifests as
%cmp = icmp slt i64 %indvars.iv, 3
Because buildDomainsWithBranchConstraints() does not consider other exit
conditions, it has to assume that the induction variable will eventually
be lower than 3 and taking this exit.
IMHO we need to reuse the algorithm that determines the number of
iterations (addLoopBoundsToHeaderDomain) to determine which exit
condition applies first. It has to happen in
buildDomainsWithBranchConstraints() because the result will need to
propagate to successor BBs. Currently addLoopBoundsToHeaderDomain() just
look for union of all backedge conditions (which means leaving not the
loop here). The patch in llvm.org/PR35465 changes it to look for exit
conditions instead. This is required because there might be other exit
conditions that do not alternatively go back to the loop header.
Differential Revision: https://reviews.llvm.org/D45649
llvm-svn: 330858
Add the switch -polly-debug-func to define the name of a debug
function. This function is ignored for any validity check.
Its purpose is to allow to observe a value after transformation by a
SCoP, and to follow which statements are executed in which order. For
instance, consider the following code:
static void dbg_printf(int sum, int i) {
fprintf(stderr, "The value of sum is %d, i=%d\n", sum, i);
fflush(stderr);
}
void func(int n) {
int sum = 0;
for (int i = 0; i < 16; i+=1) {
sum += i;
dbg_printf(sum, i);
}
}
Executing this after Polly's codegen with -polly-debug-func=dbg_printf
reveals the new execution order and the assumed values at that point of
execution.
Differential Revision: https://reviews.llvm.org/D45728
llvm-svn: 330466
Summary:
As of rL329273, LLVM has a mechanism to load new-pm plugins in opt. Use
this API in Polly.
Reviewers: grosser, Meinersbur, bollu
Reviewed By: grosser, Meinersbur
Subscribers: lksbhm, bollu, pollydev, llvm-commits
Differential Revision: https://reviews.llvm.org/D45484
llvm-svn: 330181
A check in assert-builds was meant to verify that a load provides a
value in all statement instances (i.e. its domain). The domain is
commonly gist'ed within the parameter context to contain fewer
constraints. However, statement instances outside the context are
no valid executions, hence the value provided can be undefined.
Refine the check for valid loads to only needed to be defined within
the SCoP context.
In addition, the JSONImporter had to be changed to allow importing
access relations that are broader than the current access relation,
but still defined over all statement instances.
This should fix the compiler crash in test-suite's oggenc of the
-polly-process-unprofitable buildbot.
llvm-svn: 329655
Commit r329640 introduced the removal of all MemoryAccesses of a Scop.
It accidentally continued iterating over a vector whose iterators
have been invalidated by a MemoryAccess removal.
Make a copy of the MemoryAccesses to remove to iterate over while
removing them.
llvm-svn: 329653
Removing a statement left its MemoryAccesses in some lists and maps of
the SCoP. Which lists depends on at which phase of the SCoP
construction the statement is deleted. Follow-up passes could still see
the already deleted MemoryAccesses by iterating through these
lists/maps, resulting in an access violation.
When removing a ScopStmt, also remove all its MemoryAccesses by using
the same mechnism that removes a MemoryAccess.
llvm-svn: 329640
This patch removes the heuristic in
- Polly :: lib/Support/ScopHelper.cpp
The heuristic forces blocks that directly follow a loop header to not to be considered error blocks.
It was introduced in r249611 with the following commit message:
> This replaces the support for user defined error functions by a
> heuristic that tries to determine if a call to a non-pure function
> should be considered "an error". If so the block is assumed not to be
> executed at runtime. While treating all non-pure function calls as
> errors will allow a lot more regions to be analyzed, it will also
> cause us to dismiss a lot again due to an infeasible runtime context.
> This patch tries to limit that effect. A non-pure function call is
> considered an error if it is executed only in conditionally with
> regards to a cheap but simple heuristic.
In the code below `CCK_Abort2()` would be considered as an error block, but not `CCK_Abort1()` due to this heuristic.
```
for (int i = 0; i < n; i+=1) {
if (ErrorCondition1)
CCK_Abort1(); // No __attribute__((noreturn))
if (ErrorCondition2)
CCK_Abort2(); // No __attribute__((noreturn))
}
```
This does not seem useful. Checking error conditions in the beginning of some work is quite common. It causes a switch default-case to be not considered an error block in SPEC's cactuBSSN. The comment justifying the heuristic mentions a "load", which does not seem to be applicable here. It has been proposed to remove the heuristic.
In addition, the patch fixes the following test cases:
- Polly :: ScopDetect/mod_ref_read_pointer.ll
- Polly :: ScopInfo/max-loop-depth.ll
- Polly :: ScopInfo/mod_ref_access_pointee_arguments.ll
- Polly :: ScopInfo/mod_ref_read_pointee_arguments.ll
- Polly :: ScopInfo/mod_ref_read_pointer.ll
- Polly :: ScopInfo/mod_ref_read_pointers.ll
The test cases failed after removing the heuristic.
Differential Revision: https://reviews.llvm.org/D45274
Contributed-by: Lorenzo Chelini <l.chelini@icloud.com>
llvm-svn: 329548
Summary:
When checking the parallelism of a scheduling dimension, we first check if excluding reduction dependences the loop is parallel or not.
If the loop is not parallel, then we need to return the minimal dependence distance of all data dependences, including the previously subtracted reduction dependences.
Reviewers: grosser, Meinersbur, efriedma, eli.friedman, jdoerfert, bollu
Reviewed By: Meinersbur
Subscribers: llvm-commits, pollydev
Tags: #polly
Differential Revision: https://reviews.llvm.org/D45236
llvm-svn: 329214
Summary:
When building polly as part of the monorepo (actually, as part of any setup
using LLVM_ENABLE_PROJECTS), the LLVMPolly library used in the lit tests ends
up in a different directory in the build tree than in an in-tree build
Reviewers: Meinersbur, grosser, bollu
Reviewed By: Meinersbur
Subscribers: mgorny, bollu, pollydev, llvm-commits
Differential Revision: https://reviews.llvm.org/D44078
llvm-svn: 326702
isl does not guarantee that set dimension ids will be preserved, so using them
to carry information is not a good idea. Furthermore, the loop information can
be derived without problem from the statement itself. As this even requires
less code than propagating loop information on set dimension ids, starting from
this commit we just derive the loop information in collectSurroundingLoops
directly from the IR.
Interestingly this also results in a couple of isl sets to take a simpler
representation.
llvm-svn: 326664
This update:
- Removes several deprecated functions (e.g., isl_band).
- Improves the pretty-printing of sets by detecting modulos and "false"
equalities.
- Minor improvements to coalescing and increased robustness of the isl
scheduler.
This update does not yet include isl commit isl-0.18-90-gd00cb45
(isl_pw_*_alloc: add missing check for compatible spaces, Wed Sep 6 12:18:04
2017 +0200), as this additional check is too tight and unfortunately causes
two test case failures in Polly. A patch has been submitted to isl and will be
included in the next isl update for Polly.
llvm-svn: 325557
Two or more PHIs mutually using each other directly or indirectly as
incoming value could cause that a PHI WRITE be added before the PHI READ
(i.e. it overwrites the current incoming value with the next incoming
value before it being read).
Fix by ensuring that the PHI WRITE and PHI READ are in the same statement.
This should fix the miscompile of SingleSource/Benchmark/Misc/whetstone
from the test-suite.
llvm-svn: 324934
Splitting basic blocks into multiple statements if there are now
additional scalar dependencies gives more freedom to the scheduler, but
more statements also means higher compile-time complexity. Switch to
finer statement granularity, the additional compile time should be
limited by the number of operations quota.
The regression tests are written for the -polly-stmt-granularity=bb
setting, therefore we add that flag to those tests that break with the
new default. Some of the tests only fail because the statements are
named differently due to a basic block resulting in multiple statements,
but which are removed during simplification of statements without
side-effects. Previous commits tried to reduce this effect, but it is
not completely avoidable.
Differential Revision: https://reviews.llvm.org/D42151
llvm-svn: 324169
Theoretically, a PHI write can be added to any statement that represents
the incoming basic block. We previously always chose the last because
the incoming value's definition is guaranteed to be defined.
With this patch the PHI write is added to the statement that defines the
incoming value. It avoids the requirement for a scalar dependency between
the defining statement and the statement containing the write. As such the
logic for -polly-stmt-granularity=scalar-indep that ensures that there is
such scalar dependencies can be removed.
Differential Revision: https://reviews.llvm.org/D42147
llvm-svn: 323284
except Darwin and Windows. This prevents inserting an environment
variable with an empty name (which is illegal and leads to a Python
exception) on any of the BSDs.
llvm-svn: 323041
Summary:
Upstream LLVM is changing the the prototypes of the @llvm.memcpy/memmove/memset
intrinsics. This change updates the polly tests for this change.
The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.
This change removes the alignment argument in favour of placing the alignment
attribute on the source and destination pointers of the memory intrinsic call.
For example, code which used to read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)
At this time the source and destination alignments must be the same (Step 1).
Step 2 of the change, to be landed shortly, will relax that contraint and allow
the source and destination to have different alignments.
llvm-svn: 322963
The goal is to have -polly-stmt-granularity=bb and
-polly-stmt-granularity=scalar-indep to have the same names if there is
just one statement per basic block.
This fixes a fluke when Polybench's jacobi-2d is optimized differently
depending on the -polly-stmt-granularity option, although both options
create the same SCoP, just with different statement names.
The new naming scheme is:
With -polly-use-llvm-names=0:
Stmt<BBIdx as decimal><Idx within BB as letter>
With -polly-use-llvm-names=1:
Stmt_BBName_<Idx within BB as letter>
The <Idx within BB> suffix is omitted for the main statement of a BB. The
main statement is either the one containing the first store or call
(those cannot be removed by the simplifyer), or if there is no such
instruction, the first. If after simplification there is just a single
statement left, it should be the main statement and have the same names as
with -polly-stmt-granularity=bb.
Differential Revision: https://reviews.llvm.org/D42136
llvm-svn: 322852
isl_val_get_num_si crashes on overflow, so don't use it on arbitrary
integers.
Testcase only crashes on platforms where long is 32 bits because of the
signature of isl_val_get_num_si; not sure if it's possible to write a
testcase which crashes if long is 64 bits.
There are a few other places in polly which use isl_val_get_num_si;
they probably need to be fixed as well. I don't think polly uses any
of the other "long" isl APIs in an unsafe manner.
Differential Revision: https://reviews.llvm.org/D42129
llvm-svn: 322766
Memory transfer instructions take two pointers. It is not defined to
which of those a noalias annotation applies. To ensure correctness,
do not add noalias annotations to memcpy/memmove instructions anymore.
The caused a miscompile with test-suite's MultiSource/Applications/obsequi.
Since r321138, the MemCpyOpt pass would remove memcpy/memmove calls if
known to copy uninitialized memory. In that case, it was initialized
by another memcpy, but the annotation for the target pointer said
it would not alias. The annotation was actually meant for the source
pointer, which was was an alloca and could not alias with the target
pointer.
llvm-svn: 321371
If an out-of-quota error occurred, the last error would be
isl_error_quota unless a different error occured. We typically check
whether the max-operations occured by comparing to that error value
after leaving the quota guard. This would check whether there ever
was a quota-error, not just in the last quota guards.
The observable bug occurred if the max-operations limit was reached in
DeLICM, and if -polly-dependences-computout=0, DependenceInfo would
think that the quota for computing dependencies was the reason,
i.e., fail the operation even if the calculation itself was successful.
Fix by reseting the last error to isl_error_none when entering a
quota guard, signaling that no quota error occured unless in the
guard's scope.
llvm-svn: 321329
Summary:
This can be seen as a follow-up on my previous differential [D33411](https://reviews.llvm.org/D33411).
We received a bug report where this error was triggered. I have tried my best to recreate the issue in a minimal lit testcase which is also part of this differential.
I only handle return instructions as predecessors to a virtual TLR-exit right now. From inspecting the codebase, it seems `unreachable` instructions may also be of interest here. If requested, I can extend my patches to consider them as well. I would also apply this on `ScopHelper.cpp::isErrorBlock` (see D33411), of course.
Reviewers: philip.pfaffe, bollu
Reviewed By: bollu
Subscribers: Meinersbur, pollydev, llvm-commits
Tags: #polly
Differential Revision: https://reviews.llvm.org/D40492
llvm-svn: 319431
Isl does not allow generating isl_ast_expr from an isl_pw_aff that has an
empty domain (i.e. has no pieces). We already detected the case if the
isl_pw_aff comes with an empty domain.
isl_ast_build also considers the domain empty if it is disjoint with the
parameter context (e.g. parameters values that we exclude by runtime
versioning).
Intersect the access relation domain with the parameter context to
also detect such practically empty access domains. The effective
pointer used in the generated code is unimportand because it will never
be executed.
This fixes llvm.org/PR35362
llvm-svn: 318806
Summary:
Most changes are mechanical, but in one place I changed the program semantics
by fixing a likely bug:
In `Scop::hasFeasibleRuntimeContext()`, I'm now explicitely handling the
error-case. Before, when the call to `addNonEmptyDomainConstraints()`
returned a null set, this (probably) accidentally worked because
isl_bool_error converts to true. I'm checking for nullptr now.
Reviewers: grosser, Meinersbur, bollu
Reviewed By: Meinersbur
Subscribers: nemanjai, kbarton, pollydev, llvm-commits
Differential Revision: https://reviews.llvm.org/D39971
llvm-svn: 318632