As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.
In some build environments, the C++ compiler is unable to infer the
correct type for the DenseMap::insert in isErrorBlock. Typing out
std::make_pair helps.
SplitBlockPredecessors is unable to insert an additional BasicBlock
between an indirectbr/callbr terminator and the successor blocks.
This is needed by Polly to normalize the control flow before emitting
its optimzed code.
This patches rejects regions entered by an indirectbr/callbr to not fail
later at code generation.
This fixes llvm.org/PR51964
Recommit with "REQUIRES: asserts" in test that uses statistics.
SplitBlockPredecessors is unable to insert an additional BasicBlock
between an indirectbr/callbr terminator and the successor blocks.
This is needed by Polly to normalize the control flow before emitting
its optimzed code.
This patches rejects regions entered by an indirectbr/callbr to not fail
later at code generation.
This fixes llvm.org/PR51964
The function was intended to catch OpenMP functions such as
get_thread_id(). If matched, the call would be considered synthesizable.
There were a few problems with this:
* get_thread_id() is not 'const' in the sense of have the gcc manual
defines it: "do not examine any values except their arguments".
get_thread_id() reads OpenCL runtime libreary global state.
What was inteded was probably 'speculable'.
* isConstCall was implemented using mayReadOrWriteMemory(). 'const' is
stricter than that, mayReadOrWriteMemory is e.g. true for malloc(),
since it may only read/write addresses that are considered
inaccessible fro the application. However, malloc is certainly not
speculable.
* Values that are isConstCall were not handled consistently throughout
Polly. In particular, it was not considered for referenced values
(OpenMP outlining and PollyACC).
Fix by removing special handling for isConstCall entirely.
getMetadata() currently uses a weird API where it populates a
structure passed to it, and optionally merges into it. Instead,
we can return the AAMDNodes and provide a separate merge() API.
This makes usages more compact.
Differential Revision: https://reviews.llvm.org/D109852
Code outside the SCoP will be executed recardless of the code versioning
runtime check introduced by CodeGeneration. Assumption made based on
that these are never executed in Polly-optimized code does not hold.
This fixes the miscompilation of MultiSource/Applications/lambda-0.1.3
Compilation of the file insn-attrtab.c of the SPEC CPU 2017 502.gcc_r
benchmark takes excessive time (> 30min) with Polly enabled. Most time
is spent in the isErrorBlock function querying the DominatorTree.
The isErrorBlock is invoked redundantly over the course of ScopDetection
and ScopBuilder. This patch introduces a caching mechanism for its
result.
Instead of a free function, isErrorBlock is moved to ScopDetection where
its cache map resides. This also means that many functions directly or
indirectly calling isErrorBlock are not "const" anymore. The
DetectionContextMap was marked as "mutable", but IMHO it never should
have been since it stores the detection result.
502.gcc_r only takes excessive time with the new pass manager. The
reason seeams to be that it invalidates the ScopDetection analysis more
often than the legacy pass manager, for unknown reasons.
Avoid doing the detection work inside the constructor. In addition to
polymorphism being unintuitive in constructors and other design problems
such as if an exception is thrown, the ScopDetection class is usable
without detection in the sense of "no Scop found" or "function skipped".
DetectionContext objects are stored as values in a DenseMap. When the
DenseMap reaches its maximum load factor, it is resized and all its
objects moved to a new memory allocation. Unfortunately Scop object have
a reference to its DetectionContext. When the DenseMap resizes, all the
DetectionContexts reference now point to invalid memory, even if caused
by an unrelated DetectionContext.
Even worse, NewPM's ScopPassManager called isMaxRegionInScop with the
Verify=true parameter before each pass. This caused the old
DetectionContext to be removed an a new on created and re-verified.
Of course, the Scop object was already created pointing to the old
DetectionContext. Because the new DetectionContext would
usually be stored at the same position in the DenseMap, the reference
would usually reference the new DetectionContext of the same Region.
Usually.
If not, the old position still points to memory in the DenseMap
allocation (unless also a resizing occurs) such that tools like Valgrind
and AddressSanitizer would not be able to diagnose this.
Instead of storing the DetectionContext inside the DenseMap, use a
std::unique_ptr to a DetectionContext allocation, i.e. it will not move
around anymore. This also allows use to remove the very strange
DetectionContext(const DetectionContext &&)
copy/move(?) constructor. DetectionContext objects now are neither
copied nor moved.
As a result, every re-verification of a DetectionContext will use a new
allocation. Therefore, once a Scop object has been created using a
DetectionContext, it must not be re-verified (the Scop data structure
requires its underlying Region to not change before code generation
anyway). The NewPM may call isMaxRegionInScop only with
Validate=false parameter.
Currently, we have some confusion in the codebase regarding the
meaning of LocationSize::unknown(): Some parts (including most of
BasicAA) assume that LocationSize::unknown() only allows accesses
after the base pointer. Some parts (various callers of AA) assume
that LocationSize::unknown() allows accesses both before and after
the base pointer (but within the underlying object).
This patch splits up LocationSize::unknown() into
LocationSize::afterPointer() and LocationSize::beforeOrAfterPointer()
to make this completely unambiguous. I tried my best to determine
which one is appropriate for all the existing uses.
The test changes in cs-cs.ll in particular illustrate a previously
clearly incorrect AA result: We were effectively assuming that
argmemonly functions were only allowed to access their arguments
after the passed pointer, but not before it. I'm pretty sure that
this was not intentional, and it's certainly not specified by
LangRef that way.
Differential Revision: https://reviews.llvm.org/D91649
Previously, the enums didn't account for all the possible cases, which
could cause misleading results (particularly for a "switch" on
FunctionModRefBehavior).
Fixes regression in polly from recent patch to add writeonly to memset.
While I'm here, also fix a few dubious uses of the FMRB_* enum values.
Differential Revision: https://reviews.llvm.org/D73154
Previously, the polly unit tests were stuck in a infinite loop.
There was an edge case in StringRef::count() introduced by 9f6b13e5cc, where an empty 'Str' would cause the function to never exit.
Also fixed usage in polly.
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles touches affected_files header
342380 95 3604 llvm/include/llvm/ADT/STLExtras.h
314730 234 1345 llvm/include/llvm/InitializePasses.h
307036 118 2602 llvm/include/llvm/ADT/APInt.h
213049 59 3611 llvm/include/llvm/Support/MathExtras.h
170422 47 3626 llvm/include/llvm/Support/Compiler.h
162225 45 3605 llvm/include/llvm/ADT/Optional.h
158319 63 2513 llvm/include/llvm/ADT/Triple.h
140322 39 3598 llvm/include/llvm/ADT/StringRef.h
137647 59 2333 llvm/include/llvm/Support/Error.h
131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
The Loads.h API changed so that a Type parameter is now mandatory in
preparation for pointer types being opaque. Unfortunately I don't build
polly routinely and it still had some uses. This just provides the
(obvious) load type in each case.
llvm-svn: 365470
In certain cases, it's possible for delinearization to decide one of the
array dimensions should be some function of an induction variable inside
the scop. Make sure if this happens, we refuse to use those dimensions
for delinearization.
Usually, we end up rejecting the scop before it actually crashes, but it
looks like it's possible to slip past other checks in certain cases
involving smax expressions.
Fixes a crash that started showing up this week on the polly AOSP
builder. As far as I can tell, this is a longstanding issue, though;
it was just exposed by better SCEV analysis of smin expressions.
Differential Revision: https://reviews.llvm.org/D61807
llvm-svn: 360708
This removes unused includes (and forward declarations) as
suggested by include-what-you-use. If a transitive include of a removed
include is required to compile a file, I added the required header (or
forward declaration if suggested by include-what-you-use).
This should reduce compilation time and reduce the number of iterative
recompilations when a header was changed.
llvm-svn: 357209
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
This patch should not introduce any behavior changes. It consists of
mostly one of two changes:
1. Replacing fall through comments with the LLVM_FALLTHROUGH macro
2. Inserting 'break' before falling through into a case block consisting
of only 'break'.
We were already using this warning with GCC, but its warning behaves
slightly differently. In this patch, the following differences are
relevant:
1. GCC recognizes comments that say "fall through" as annotations, clang
doesn't
2. GCC doesn't warn on "case N: foo(); default: break;", clang does
3. GCC doesn't warn when the case contains a switch, but falls through
the outer case.
I will enable the warning separately in a follow-up patch so that it can
be cleanly reverted if necessary.
Reviewers: alexfh, rsmith, lattner, rtrieu, EricWF, bollu
Differential Revision: https://reviews.llvm.org/D53950
llvm-svn: 345882
This removes the primary remaining API producing `TerminatorInst` which
will reduce the rate at which code is introduced trying to use it and
generally make it much easier to remove the remaining APIs across the
codebase.
Also clean up some of the stragglers that the previous mechanical update
of variables missed.
Users of LLVM and out-of-tree code generally will need to update any
explicit variable types to handle this. Replacing `TerminatorInst` with
`Instruction` (or `auto`) almost always works. Most of these edits were
made in prior commits using the perl one-liner:
```
perl -i -ple 's/TerminatorInst(\b.* = .*getTerminator\(\))/Instruction\1/g'
```
This also my break some rare use cases where people overload for both
`Instruction` and `TerminatorInst`, but these should be easily fixed by
removing the `TerminatorInst` overload.
llvm-svn: 344504
The general-purpose add() now sometimes adds unexpected loop-variant
pointers to the AliasSetTracker, so certain loops would be rejected with
-polly-allow-modref-calls. Use addUnknown() instead, which has the old
behavior.
I'm not completely convinced the resulting behavior is actually
correct: ScopDetection::isValidAccess seems to mostly ignore
"unknown" instructions in the AliasSetTracker. But it's not any worse
than what was happening before.
Committing without pre-commit review to unbreak the buildbots; the
following tests were failing:
test/ScopInfo/mod_ref_access_pointee_arguments.ll
test/ScopInfo/mod_ref_read_pointee_arguments.ll
test/ScopInfo/multidim_2d_with_modref_call_2.ll
llvm-svn: 342010
This is a bit awkward in a handful of places where we didn't even have
an instruction and now we have to see if we can build one. But on the
whole, this seems like a win and at worst a reasonable cost for removing
`TerminatorInst`.
All of this is part of the removal of `TerminatorInst` from the
`Instruction` type hierarchy.
llvm-svn: 340701
The method AliasSetTracker::getAliasSetForPointer was removed and replaced by AliasSetTracker::getAliasSetFor for the restructuring in r339930.
Since Polly uses AliasSetTracker::getAliasSetForPointer, a temporary fix has been committed in r339937 with a comment:
Can someone from polly please migrate usage and then delete the wrapper?
This commit is doing exactly that.
llvm-svn: 340072
Summary: This patch aims to provide support for detecting load patterns which are collectively invariant but right now `isHoistableLoad()` is checking each load instruction individually which cannot detect the load pattern as a whole.
Patch by: Sahil Girish Yerawar
Reviewers: bollu, philip.pfaffe, Meinersbur
Reviewed By: philip.pfaffe, Meinersbur
Differential Revision: https://reviews.llvm.org/D48026
llvm-svn: 335949
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
Differential Revision: https://reviews.llvm.org/D44978
llvm-svn: 332352
The current statement domain derivation algorithm does not (always)
consider that different exit blocks of a loop can have different
conditions to be reached.
From the code
for (int i = n; ; i-=2) {
if (i <= 0) goto even;
if (i <= 1) goto odd;
A[i] = i;
}
even:
A[0] = 42;
return;
odd:
A[1] = 21;
return;
Polly currently derives the following domains:
Stmt_even_critedge
Domain :=
[n] -> { Stmt_even_critedge[] };
Stmt_odd
Domain :=
[n] -> { Stmt_odd[] : (1 + n) mod 2 = 0 and n > 0 };
while the domain for the odd case is correct, Stmt_even is assumed to be
executed unconditionally, which is obviously wrong. While projecting out
the loop dimension in `adjustDomainDimensions`, it does not consider
that there are other exit condition that have matched before.
I don't know a how to fix this without changing a lot of code. Therefore
This patch rejects loops with multiple exist blocks to fix the
miscompile of test-suite's uuencode.
The odd condition is transformed by LLVM to
%cmp1 = icmp eq i64 %indvars.iv, 1
such that the project_out in adjustDomainDimensions() indeed only
matches for odd n (using this condition only, we'd have an infinite loop
otherwise).
The even condition manifests as
%cmp = icmp slt i64 %indvars.iv, 3
Because buildDomainsWithBranchConstraints() does not consider other exit
conditions, it has to assume that the induction variable will eventually
be lower than 3 and taking this exit.
IMHO we need to reuse the algorithm that determines the number of
iterations (addLoopBoundsToHeaderDomain) to determine which exit
condition applies first. It has to happen in
buildDomainsWithBranchConstraints() because the result will need to
propagate to successor BBs. Currently addLoopBoundsToHeaderDomain() just
look for union of all backedge conditions (which means leaving not the
loop here). The patch in llvm.org/PR35465 changes it to look for exit
conditions instead. This is required because there might be other exit
conditions that do not alternatively go back to the loop header.
Differential Revision: https://reviews.llvm.org/D45649
llvm-svn: 330858
Add the switch -polly-debug-func to define the name of a debug
function. This function is ignored for any validity check.
Its purpose is to allow to observe a value after transformation by a
SCoP, and to follow which statements are executed in which order. For
instance, consider the following code:
static void dbg_printf(int sum, int i) {
fprintf(stderr, "The value of sum is %d, i=%d\n", sum, i);
fflush(stderr);
}
void func(int n) {
int sum = 0;
for (int i = 0; i < 16; i+=1) {
sum += i;
dbg_printf(sum, i);
}
}
Executing this after Polly's codegen with -polly-debug-func=dbg_printf
reveals the new execution order and the assumed values at that point of
execution.
Differential Revision: https://reviews.llvm.org/D45728
llvm-svn: 330466
Loads before the SCoP are always invariant within the SCoP and
therefore are no "required invariant loads". An assertion failes in
ScopBuilder when it finds such an invariant load.
Fix by not adding such loads to the required invariant load list. This
likely will cause the region to be not considered a valid SCoP.
We may want to unconditionally accept instructions defined before
the region as valid invariant conditions instead of rejecting them.
This fixes a compilation crash of SPEC CPU2006 453.povray's
render.cpp.
llvm-svn: 314636
In case a PHI node follows an error block we can assume that the incoming value
can only come from the node that is not an error block. As a result, conditions
that seemed non-affine before are now in fact affine.
This is a recommit of r312663 after fixing
test/Isl/CodeGen/phi_after_error_block_outside_of_scop.ll
llvm-svn: 314075
This reverts commit
r312410 - [ScopDetect/Info] Look through PHIs that follow an error block
The commit caused generation of invalid IR due to accessing a parameter
that does not dominate the SCoP.
llvm-svn: 312663
In case a PHI node follows an error block we can assume that the incoming value
can only come from the node that is not an error block. As a result, conditions
that seemed non-affine before are now in fact affine.
llvm-svn: 312410
In cases where the entry block of a scop was not contained in a loop that was
part of the scop region and at the same time there was a loop surrounding the
scop, we missed to count the loops in the scop and consequently did not consider
the scop profitable. We correct this by only moving to the loop parent, in case
the current loop is loop contained in the scop.
This increases the number of loops in COSMO which we assume to be profitable
from 3974 to 4981.
llvm-svn: 311863