This is the fourth patch to apply the BLIS matmul optimization pattern on matmul
kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf).
BLIS implements gemm as three nested loops around a macro-kernel, plus two
packing routines. The macro-kernel is implemented in terms of two additional
loops around a micro-kernel. The micro-kernel is a loop around a rank-1
(i.e., outer product) update. In this change we perform copying to created
arrays, which is the last step to implement the packing transformation.
Reviewed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: https://reviews.llvm.org/D23260
llvm-svn: 281441
The alias to the array element is read-only and a primitive type (pointer),
therefore use the value directly instead of a reference to it.
llvm-svn: 281311
We do not need the size of the outermost dimension in most cases, but if we
allocate memory for newly created arrays, that size is needed.
Reviewed-by: Michael Kruse <llvm@meinersbur.de>
Differential Revision: https://reviews.llvm.org/D23991
llvm-svn: 281234
When running the clang static analyser to check for memory issues, this code
originally showed a double free, as the analyser was unable to understand that
isl_set_free always returns NULL and consequently later uses of the isl object
we just freed will never be reached. Without this knowledge, the analyser has
to issue a warning.
We refactor the code to make it clear that for empty maps the current loop
iteration is aborted.
llvm-svn: 280940
When running the clang static analyser to check for memory issues, this code
originally showed a double free, as the analyser was unable to understand that
isl_union_map_free always returns NULL and consequently later uses of the isl
object we just freed will never be reached. Without this knowledge, the analyser
has to issue a warning.
We refactor the code to make it clear that for empty maps the current loop
iteration is aborted.
llvm-svn: 280938
... but instead rely on the assumptions that we derive for load/store
instructions.
Before we were able to delinearize arrays, we used GEP pointer instructions
to derive information about the likely range of induction variables, which
gave us more freedom during loop scheduling. Today, this is not needed
any more as we delinearize multi-dimensional memory accesses and as part
of this process also "assume" that all accesses to these arrays remain
inbounds. The old derive-assumptions-from-GEP code has consequently become
mostly redundant. We drop it both to clean up our code, but also to improve
compile time. This change reduces the scop construction time for 3mm in
no-asserts mode on my machine from 48 to 37 ms.
llvm-svn: 280601
Without reductions we do not need a flat union_map schedule describing
the computation we want to perform, but can work purely on the schedule
tree. This reduces the dependence computation and scheduling time from 33ms
to 25ms. Another 30% reduction.
llvm-svn: 280558
In case we do not compute reduction dependences or dependences that are more
fine-grained than statement level dependences, we can avoid the corresponding
part of the dependence analysis all together. For the 3mm benchmark, this
reduces scheduling + dependence analysis time from 62ms to 33ms for a no-asserts
build. The majority of the compile time is anyhow spent in the LLVM backends,
when doing code generation. Nevertheless, there is no need to waste compile time
either.
llvm-svn: 280557
LLVM's coding guideline suggests to not use @brief for one-sentence doxygen
comments to improve readability. Switch this once and for all to ensure people
do not copy @brief comments from other parts of Polly, when writing new code.
llvm-svn: 280468
Change the code around setNewAccessRelation to allow to use a an existing array
element for memory instead of an ad-hoc alloca. This facility will be used for
DeLICM/DeGVN to convert scalar dependencies into regular ones.
The changes necessary include:
- Make the code generator use the implicit locations instead of the alloca ones.
- A test case
- Make the JScop importer accept changes of scalar accesses for that test case.
- Adapt the MemoryAccess interface to the fact that the MemoryKind can change.
They are named (get|is)OriginalXXX() to get the status of the memory access
before any change by setNewAccessRelation() (some properties such as
getIncoming() do not change even if the kind is changed and are still
required). To get the modified properties, there is (get|is)LatestXXX(). The
old accessors without Original|Latest become synonyms of the
(get|is)OriginalXXX() to not make functional changes in unrelated code.
Differential Revision: https://reviews.llvm.org/D23962
llvm-svn: 280408
There are some constraints on maps that can be access relations. In builds with assertions enabled, verify
- The access domain is the same space as the statement's domain (modulo parameters).
- Whether an access is defined for every instance of the statement. (codegen does not yet support partial access relations)
- Whether the access range links to an array, represented by a ScopArrayInfo.
- The number of access dimensions equals the dimensions of the array.
- The array is not an indirect access. (also not supported by codegen)
Differential Revision: https://reviews.llvm.org/D23916
llvm-svn: 280404
getAccessFunctions() is dead code and the 'BB' argument
of getOrCreateAccessFunctions() is not used. This patch deletes
getAccessFunctions and transforms AccFuncMap into
a std::vector<std::unique_ptr<MemoryAccess>> AccessFunctions.
Reviewed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: https://reviews.llvm.org/D23759
llvm-svn: 279394
Normally this is ensured when adding PHI nodes, but as PHI node dependences
do not need to be added in case all incoming blocks are within the same
non-affine region, this was missed.
This corrects an issue visible in LNT's sqlite3, in case invariant load hoisting
was disabled.
llvm-svn: 278792
With invariant load hoisting enabled the LLVM buildbots currently show some
miscompiles, which are possibly caused by invariant load hosting itself.
Confirming and fixing this requires a more in-depth analysis. To meanwhile get
back green buildbots that allow us to observe other regressions, we disable
invariant code hoisting temporarily. The relevant bug is tracked at:
http://llvm.org/PR28985
llvm-svn: 278681
The function expandRegion() frees Region* objects again when it determines that
these are not valid SCoPs. However, the DetectionContext added to the
DetectionContextMap still holds a reference. The validity is checked using the
ValidRegions lookup table. When a new Region is added to that list, it might
share the same address, such that the DetectionContext contains two
Region* associations that are in ValidRegions, but that are unrelated and of
which one has already been free.
Also remove the DetectionContext when not a valid expansion.
llvm-svn: 278062
When entering the dependence computation and the max_operations is set, the
operations counter may have already exceeded the counter, thus aborting any ISL
computation from the start. The counter is reset at the end of the dependence
calculation such that a follow-up recomputation might succeed, ie. the success
of the first dependence calculation depends on unrelated ISL operations that
happened before, giving it a disadvantage to the following calculations.
This patch resets the operations counter at the beginning of the dependence
recalculation to not depend on previous actions. Otherwise additional
preprocessing of the Scop that aims to improve its schedulability (eg. DeLICM)
do have the effect that DependenceInfo and hence the scheduling fail more
likely, contraproductive to the goal of said preprocessing.
llvm-svn: 277810
Otherwise, we would try to re-optimize them with Polly-ACC and possibly even
generate kernels that try to offload themselves, which does not work as the
GPURuntime is not available on the accelerator and also does not make any
sense.
llvm-svn: 277589
Extend the jscop interface to allow the user to export arrays. It is required
that already existing arrays of the list of arrays correspond to arrays
of the SCoP. Each array that is appended to the list will be newly created.
Furthermore, we allow the user to modify access expressions to reference
any array in case it has the same element type.
Reviewed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: https://reviews.llvm.org/D22828
llvm-svn: 277263
Adding a new pass PolyhedralInfo. This pass will be the interface to Polly.
Initially, we will provide the following interface:
- #IsParallel(Loop *L) - return a bool depending on whether the loop is
parallel or not for the given program order.
Patch by Utpal Bora <cs14mtech11017@iith.ac.in>
Differential Revision: https://reviews.llvm.org/D21486
llvm-svn: 276637
Do not process SCoPs with infeasible runtime context in the new
ScopInfoWrapperPass. Do not compute dependences for such SCoPs in the new
DependenceInfoWrapperPass.
Patch by Utpal Bora <cs14mtech11017@iith.ac.in>
Differential Revision: https://reviews.llvm.org/D22402
llvm-svn: 276631
Summary: LLVM adds a new value FMRB_DoesNotReadMemory in the enumeration.
Reviewers: andrew.w.kaylor, chrisj, zinob, grosser, jdoerfert
Subscribers: Meinersbur, pollydev
Differential Revision: http://reviews.llvm.org/D22109
llvm-svn: 275085
Commit r275056 introduced a gcc compile failure due to us using two
types named 'Type', the first being the newly introduced member variable
'Type' the second being llvm::Type. We resolve this issue by renaming
the newly introduced member variable to AccessType.
llvm-svn: 275057
Summary:
With a struct we can use named accessors instead of generic std::get<3>()
calls. This increases readability of the source code.
Reviewers: jdoerfert
Subscribers: pollydev, llvm-commits
Differential Revision: http://reviews.llvm.org/D21955
llvm-svn: 275056
We now compute the invalid context of memory accesses only for the domain under
which the memory access is executed. Without limiting ourselves to this
restricted domain, invalid accesses outside of the domain of actually executed
statement instances may result in the execution domain of the statement to
become empty despite the fact that the statement will actually be executed. As a
result, such scops would use unitialized values for their computations which
results in incorrect computations.
This fixes http://llvm.org/PR27944 and unbreaks the
-polly-position=before-vectorizer buildbots.
llvm-svn: 275053
For llvm the memory accesses from nonaffine loops should be visible,
however for polly those nonaffine loops should be invisible/boxed.
This fixes llvm.org/PR28245
Cointributed-by: Huihui Zhang <huihuiz@codeaurora.org>
Differential Revision: http://reviews.llvm.org/D21591
llvm-svn: 274842
This ensures that the error status set with -polly-on-isl-error-abort is
maintained even after running DependenceInfo and ScheduleOptimizer. Both
passes temporarily set the error status to CONTINUE as the dependence
analysis uses a compute-out and the scheduler may not be able to derive
a schedule. In both cases we want to not abort, but to handle the error
gracefully. Before this commit, we always set the error reporting to ABORT
after these passes. After this commit, we use the error reporting mode that was
active earlier.
This comes without a test case as this would require us to introduce (memory)
errors which would trigger the isl errors.
llvm-svn: 274272
It is only used internally by the ScopInfo pass. By moving it into its
own header file we avoid it being processed that use only ScopInfo.
llvm-svn: 273983
The methods in ScopBuilder are used for the construction of a Scop,
while the remaining classes of ScopInfo are required by all passes that
use Polly's polyhedral analysis.
llvm-svn: 273982
This function is used by both ScopInfo and ScopBuilder. A common
location for this function is required when ScopInfo and ScopBuilder are
separated into separate files in the next commit.
llvm-svn: 273981
Reject and report regions that contains loops overlapping nonaffine region.
This situation typically happens in the presence of inifinite loops.
This addresses bug llvm.org/PR28071.
Differential Revision: http://reviews.llvm.org/D21312
Contributed-by: Huihui Zhang <huihuiz@codeaurora.org>
llvm-svn: 273905
This patch addresses:
- A new function pass to compute polyhedral dependences. This is
required to avoid the region pass manager.
- Stores a map of Scop to Dependence object for all the scops present
in a function. By default, access wise dependences are stored.
Patch by Utpal Bora <cs14mtech11017@iith.ac.in>
Differential Revision: http://reviews.llvm.org/D21105
llvm-svn: 273881
This patch adds a new function pass ScopInfoWrapperPass so that the
polyhedral description of a region, the SCoP, can be constructed and
used in a function pass.
Patch by Utpal Bora <cs14mtech11017@iith.ac.in>
Differential Revision: http://reviews.llvm.org/D20962
llvm-svn: 273856
1. SCoP object is not owned by ScopBuilder. It just creates a SCoP and
hand over ownership through getScop() method.
2. ScopInfoRegionPass owns the SCoP object for a given region.
Patch by Utpal Bora <cs14mtech11017@iith.ac.in>
Differential Revision: http://reviews.llvm.org/D20912
llvm-svn: 273855
llvm commonly adds a comment to the closing brace of a namespace to indicate
which namespace is closed. clang-tidy provides with llvm-namespace-comment
a handy tool to check for this habit. We use it to ensure we consitently use
namespace comments in Polly.
There are slightly different styles in how namespaces are closed in LLVM. As
there is no large difference between the different comment styles we go for the
style clang-tidy suggests by default.
To reproduce this fix run:
for i in `ls tools/polly/lib/*/*.cpp`; \
clang-tidy -checks='-*,llvm-namespace-comment' -p build $i -fix \
-header-filter=".*"; \
done
This cleanup was suggested by Eugene Zelenko <eugene.zelenko@gmail.com> in
http://reviews.llvm.org/D21488 and was split out to increase readability.
llvm-svn: 273621
Instead of using 0 or NULL use the C++11 nullptr symbol when referencing null
pointers.
This cleanup was suggested by Eugene Zelenko <eugene.zelenko@gmail.com> in
http://reviews.llvm.org/D21488 and was split out to increase readability.
llvm-svn: 273435
The 'Color' enum is only used for irreducible control flow detection. Johannes
already moved this enum in r270054 from ScopDetection.h to ScopDetection.cpp to
limit its scope to a single cpp file. We now move it into the only function
where this enum is needed to make clear that it is only needed locally in this
single function.
Thanks to Johannes for pointing out this cleanup opportunity.
llvm-svn: 272462
Created a new pass ScopInfoRegionPass. As name suggests, it is a
region pass and it is there to preserve compatibility with our
existing Polly passes. ScopInfoRegionPass will return a SCoP object
for a valid region while the creation of the SCoP stays in the
ScopInfo class.
Contributed-by: Utpal Bora <cs14mtech11017@iith.ac.in>
Reviewed-by: Tobias Grosser <tobias@grosser.es>,
Johannes Doerfert <doerfert@cs.uni-saarland.de>
Differential Revision: http://reviews.llvm.org/D20770
llvm-svn: 271259