Read-only values (values defined before the SCoP) require special
handing with -polly-analyze-read-only-scalars=true (which is the
default). If active, each use of a value requires a read access.
When a copied value uses a read-only value, we must also ensure that
such a MemoryAccess is available or is created.
Differential Revision: https://reviews.llvm.org/D35764
llvm-svn: 308876
Summary:
- We were using `.count` in `StringRef`, which matches substrings.
- We may want to use this for equality as well.
- Generalise this, so allow regexes as a parameter to `polly-only-func`.
Differential Revision: https://reviews.llvm.org/D35728
llvm-svn: 308875
Change the indention of the last brace to align with the opening line.
Before:
Instructions {
%val = fadd double %arg, 2.100000e+01
store double %val, double* %A
}
After:
Instructions {
%val = fadd double %arg, 2.100000e+01
store double %val, double* %A
}
llvm-svn: 308828
Print a statement's instruction on dump() regardless of
-polly-print-instructions. dump() is supposed to be used in the debugger
only and never in regression tests. While debugging, get all the
information we have and we are not bound to break anything. For non-dump
purposes of print, forward the setting of -polly-print-instructions as
parameters.
Some calls to print() had to be changed because the
PollyPrintInstructions setting is only available in ScopInfo.cpp.
In ScheduleOptimizer.cpp, dump() was used in regression tests.
That's not what dump() is for.
The print parameter "PrintInstructions" will also be useful for an
explicit print SCoP pass in a future patch.
llvm-svn: 308746
When performing invariant load hoisting we check that invariant load expressions
are not too complex. Up to this commit, we performed this check by counting the
sum of dimensions in the access range as a very simple heuristic. This heuristic
is a little too conservative, as it prevents hoisting for any scops with a
very large number of parameters. Hence, we update the heuristic to only count
existentially quantified dimensions and set dimensions. We expect this to still
detect the problematic expressions in h264 because of which this check was
originally introduced.
For some unknown reason, this complexity check was originally committed in
IslNodeBuilder. It really belongs in ScopInfo, as there is no point in
optimizing a program which we could have known earlier cannot be code generated.
The benefit of running the check early is that we can avoid to even hoist checks
that are expensive to code generate as invariant loads. This can be seen in
the changed tests, where we now indeed detect the scop, but just not invariant
load hoist the complicated access.
We also improve the formatting of the code, document it, and use isl++ to
simplify expressions.
llvm-svn: 308659
When constructing a schedule true and there are multiple statements for
a basic block, create a sequence node for these statements.
Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in>
Differential Revision: https://reviews.llvm.org/D35679
llvm-svn: 308635
We are working towards removing uses of Scop::getStmtFor(BB). In this
patch, we remove dependency of Scop::getLastStmtFor(BB) on
getStmtFor(BB). To do so, we get the list of all statements
corresponding to the BB and then fetch the last one.
Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in>
Differential Revision: https://reviews.llvm.org/D35665
llvm-svn: 308633
Introduce previously missing PHIReads analogous the the already existing
PHIWrites/ValueWrites/ValueReads maps. PHIReads was initially not
required and the later introduced lookupPHIReadOf() used a linear
search instead.
With PHIReads, lookupPHIReadOf() can now also do a map lookup and remove
any surprising performance/behaviour differences to lookupPHIWriteOf(),
lookupValueWriteOf() and lookupValueReadOf().
llvm-svn: 308630
Use a mark-and-sweep algorithm to find and remove unused instructions
and MemoryAccesses. This is useful in particular to remove scalar
writes that are never used anywhere. A scalar write in a loop induces
a write-after-write dependency that stops the loop iterations to be
rescheduled. Such writes can be a result of previous transformations
such as DeLICM and operand tree forwarding.
It adds a new class VirtualInstruction that represents an instruction in
a particular statement. At the moment an instruction can only belong to
the statement that represents a BasicBlock. In the future, instructions
can be in one of multiple statements representing a BasicBlock
(Nandini's work), in different statements than its BasicBlock would
indicate, and even multiple statements at once (by forwarding operand
trees). It also integrates nicely with the VirtualUse class.
ScopStmt::contains(Instruction*) currently uses the instruction's parent
BasicBlock to check whether it contains the instruction. It will need to
check the actual statement list when one of the aforementioned features
become possible.
Differential Revision: https://reviews.llvm.org/D35656
llvm-svn: 308626
Since there will be no more a 1:1 correspondence between statements
and basic blocks, we would like to get rid of the method getStmtFor(BB)
and its uses. Here we remove one of its uses in ScopBuilder by fetching
the statement in which the instruction lies.
Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in>
Differential Revision: https://reviews.llvm.org/D35610
llvm-svn: 308610
This is one possible solution to implement wrap-arounds for integers in
unsigned icmp operations. For example,
store i32 -1, i32* %A_addr
%0 = load i32, i32* %A_addr
%1 = icmp ult i32 %0, 0
%1 should hold false, because under the assumption of unsigned integers,
-1 should wrap around to 2^32-1. However, previously. it was assumed
that the MSB (Most Significant Bit - aka the Sign bit) was never set for
integers in unsigned operations.
This patch modifies the buildConditionSets function in ScopInfo.cpp to
give better information about the integers in these unsigned
comparisons.
Contributed-by: Annanay Agarwal <cs14btech11001@iith.ac.in>
Differential Revision: https://reviews.llvm.org/D35464
llvm-svn: 308608
Before this patch, ScalarDefUseChain was a tool used by DeLICM to find
all reads and writes of scalar accesses. It iterated once over all
accesses and stores the accesses into maps.
By integrating it into the Scop class, we can keep the maps up-to-date
without the need for recomputing them. It will be needed for more than
DeLICM in the future, such as SCoP simplification, code movement between
virtual statements, and array expansion (GSoC project).
Compared to ScalarUseDefChain, we save two maps by finding the ScopStmt
a Def/PHIRead must reside in, and use its already existing lookup
function to find the MemoryAccess.
Differential Revision: https://reviews.llvm.org/D35631
llvm-svn: 308495
Once statements are split, a BasicBlock will comprise of multiple
statements. To prepare for this change in future, we introduce a list
of statements in the statement map.
Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in>
Differential Revision: https://reviews.llvm.org/D35301
llvm-svn: 308318
Utilizing newer LLVM diagnostic remark API in order to enable use of
opt-viewer tool. Polly Diagnostic Remarks also now appear in YAML
remark file.
In this patch, I've added the OptimizationRemarkEmitter into certain
classes where remarks are being emitted and update the remark emit calls
itself. I also provide each remark a BasicBlock or Instruction from where
it is being called, in order to compute the hotness of the remark.
Patch by Tarun Rajendran!
Differential Revision: https://reviews.llvm.org/D35399
llvm-svn: 308233
Summary: Since there will be no more a 1-1 correspondence between statements and basic block, we would like to get rid of the method `getStmtFor(BB)` and its uses. Here we remove one of its uses in PolyhedralInfo, as suggested by Michael Sir.
Reviewers: grosser, Meinersbur, bollu
Reviewed By: grosser
Subscribers: pollydev
Tags: #polly
Differential Revision: https://reviews.llvm.org/D35300
llvm-svn: 308220
Summary:
We do not keep domain constraints on access functions when building the
scop. Hence, for consistency reasons, it makes also sense to not include
them when storing a new access function. This change results in simpler
access functions that make output easier to read.
This patch also helps to make DeLICMed memory accesses to be understood by
our matrix multiplication pattern matching pass. Further changes to the
matrix multiplication pattern matching are needed for this to work, so the
corresponding test case will be added in a future commit.
Reviewers: Meinersbur, bollu, gareevroman, efriedma, huihuiz, sebpop, simbuerg
Subscribers: pollydev, llvm-commits
Tags: #polly
Differential Revision: https://reviews.llvm.org/D35237
llvm-svn: 308215
This patch makes sure that in case a loop is not fully contained within a region
that later forms a SCoP, none of the loop backedges are allowed to be part of
the region. We currently do not support the situation where only some of a loops
backedges are part of a scop. Today, this can break both scop modeling and code
generation. One such breaking test case is for example
test/ScopDetectionDiagnostics/loop_partially_in_scop-2.ll, where we totally
forgot to code generate some of the backedges. Fortunately, it is commonly not
necessary to support these partial loops, it is way more common that either
no backedge is included in a region or all loop backedge are included.
This fixes a recent miscompile in
MultiSource/Benchmarks/MiBench/consumer-typeset which was exposed after
r306477.
llvm-svn: 308113
We need to relax constraints on invariant loads so that they do not
create fake RAW dependences. So, we do not consider invariant loads as
scalar dependences in a region.
During these changes, it turned out that we do not consider `llvm::Value`
replacements correctly within `PPCGCodeGeneration` and `ISLNodeBuilder`.
The replacements dictated by `ValueMap` were not being followed in all
places. This was fixed in this commit. There is no clean way to decouple
this change because this bug only seems to arise when the relaxed
version of invariant load hoisting was enabled.
Differential Revision: https://reviews.llvm.org/D35120
llvm-svn: 307907
Summary:
Add a sequence number that identifies a ptx_kernel's parent Scop within a function to it's name to differentiate it from other kernels produced from the same function, yet different Scops.
Kernels produced from different Scops can end up having the same name. Consider a function with 2 Scops and each Scop being able to produce just one kernel. Both of these kernels have the name "kernel_0". This can lead to the wrong kernel being launched when the runtime picks a kernel from its cache based on the name alone. This patch supplements D33985, by differentiating kernels across Scops as well.
Previously (even before D33985) while profiling kernels generated through JIT e.g. Julia, [[ https://groups.google.com/d/msg/polly-dev/J1j587H3-Qw/mR-jfL16BgAJ | kernels associated with different functions, and even different SCoPs within a function, would be grouped together due to the common name ]]. This patch prevents this grouping and the kernels are reported separately.
Reviewers: grosser, bollu
Reviewed By: grosser
Subscribers: mehdi_amini, nemanjai, pollydev, kbarton
Tags: #polly
Differential Revision: https://reviews.llvm.org/D35176
llvm-svn: 307814
Summary:
Since r306667, propagateInvalidStmtDomains gets a reference to an
InvalidDomainMap. As part of the branch leading to return false, the respective
domain is freed. It is, however, not removed from the InvalidDomainMap, leaking
a pointer to a freed object which results in a use-after-free. Fix this be
removing the domain from the map before returning.
We tried to derive a test case that reliably failes, but did not succeed in
producing one. Hence, for now the failures in our LNT bots must be sufficient
to keep this issue tested.
Reviewers: grosser, Meinersbur, bollu
Subscribers: bollu, nandini12396, pollydev, llvm-commits
Differential Revision: https://reviews.llvm.org/D34971
llvm-svn: 307499
Summary:
Introduce a "hybrid" `-polly-target` option to optimise code for either the GPU or CPU.
When this target is selected, PPCGCodeGeneration will attempt first to optimise a Scop. If the Scop isn't modified, it is then sent to the passes that form the CPU pipeline, i.e. IslScheduleOptimizerPass, IslAstInfoWrapperPass and CodeGeneration.
In case the Scop is modified, it is marked to be skipped by the subsequent CPU optimisation passes.
Reviewers: grosser, Meinersbur, bollu
Reviewed By: grosser
Subscribers: kbarton, nemanjai, pollydev
Tags: #polly
Differential Revision: https://reviews.llvm.org/D34054
llvm-svn: 306863
ScopStmts were being used in the computation of the Domain of the SCoPs
in ScopInfo. Once statements are split, there will not be a 1-to-1
correspondence between Stmts and Basic blocks. Thus this patch avoids
the use of getStmtFor() by creating a map of BB to InvalidDomain and
using it to compute the domain of the statements.
Contributed-by: Nanidini Singhal <cs15mtech01004@iith.ac.in>
Differential Revision: https://reviews.llvm.org/D33942
llvm-svn: 306667
This patch aims to implement the option of allocating new arrays created
by polly on heap instead of stack. To enable this option, a key named
'allocation' must be written in the imported json file with the value
'heap'.
We need such a feature because in a next iteration, we will implement a
mechanism of maximal static expansion which will need a way to allocate
arrays on heap. Indeed, the expansion is very costly in terms of memory
and doing the allocation on stack is not worth considering.
The malloc and the free are added respectively at polly.start and
polly.exiting such that there is no use-after-free (for instance in case
of Scop in a loop) and such that all memory cells allocated with a
malloc are free'd when we don't need them anymore.
We also add :
- In the class ScopArrayInfo, we add a boolean as member called IsOnHeap
which represents the fact that the array in allocated on heap or not.
- A new branch in the method allocateNewArrays in the ISLNodeBuilder for
the case of heap allocation. allocateNewArrays now takes a BBPair
containing polly.start and polly.exiting. allocateNewArrays takes this
two blocks and add the malloc and free calls respectively to
polly.start and polly.exiting.
- As IntPtrTy for the malloc call, we use the DataLayout one.
To do that, we have modified :
- createScopArrayInfo and getOrCreateScopArrayInfo such that it returns
a non-const SAI, in order to be able to call setIsOnHeap in the
JSONImporter.
- executeScopConditionnaly such that it return both start block and end
block of the scop, because we need this two blocs to be able to add
the malloc and the free calls at the right position.
Differential Revision: https://reviews.llvm.org/D33688
llvm-svn: 306540
This reduces the compilation time of one reduced test case from Android from
16 seconds to 100 mseconds (we bail out), without negatively impacting any
other test case we currently have.
We still saw occasionally compilation timeouts on the AOSP buildbot. Hopefully,
those will go away with this change.
llvm-svn: 306235
During the construction of MemoryAccesses in ScopBuilder, BasicBlocks
were used in function parameters, assuming that the ScopStmt an be
directly derived from it. This won't be true anymore once we split
BasicBlocks into multiple ScopStmt. As a preparation for such a change
in the future, we instead pass the ScopStmt and avoid the use of
getStmtFor().
There are two occasions where a kind of mapping from BasicBlock to
ScopStmt is still required.
1. Get the statement representing the incoming block of a `PHINode`
using `getLastStmtOf`.
2. One statement is required to write a scalar to be readable by those
which need it. This is most often the statement which contains its
definition, which we get using `getStmtFor(Instruction*)`.
Differential Revision: https://reviews.llvm.org/D34369
llvm-svn: 306132