This is the fourth patch to apply the BLIS matmul optimization pattern on matmul
kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf).
BLIS implements gemm as three nested loops around a macro-kernel, plus two
packing routines. The macro-kernel is implemented in terms of two additional
loops around a micro-kernel. The micro-kernel is a loop around a rank-1
(i.e., outer product) update. In this change we perform copying to created
arrays, which is the last step to implement the packing transformation.
Reviewed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: https://reviews.llvm.org/D23260
llvm-svn: 281441
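The loop structure described above can be sketched in plain C++. This is only an illustration of the BLIS-style layering, not the code Polly generates: the block sizes MC, KC, NC, MR, NR and all function names are hypothetical, and the matrices are assumed to be row-major.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical block sizes; real values would be derived from cache sizes.
constexpr std::size_t MC = 4, KC = 4, NC = 4, MR = 2, NR = 2;

// Computes C (MxN) += A (MxK) * B (KxN); all matrices are row-major.
void packedGemm(std::size_t M, std::size_t N, std::size_t K,
                const std::vector<double> &A, const std::vector<double> &B,
                std::vector<double> &C) {
  std::vector<double> PackedA(MC * KC), PackedB(KC * NC);
  for (std::size_t jc = 0; jc < N; jc += NC) { // outer loop 1
    std::size_t nc = std::min(NC, N - jc);
    for (std::size_t pc = 0; pc < K; pc += KC) { // outer loop 2
      std::size_t kc = std::min(KC, K - pc);
      // Packing routine 1: copy a kc x nc panel of B into a created array.
      for (std::size_t p = 0; p < kc; ++p)
        for (std::size_t j = 0; j < nc; ++j)
          PackedB[p * NC + j] = B[(pc + p) * N + (jc + j)];
      for (std::size_t ic = 0; ic < M; ic += MC) { // outer loop 3
        std::size_t mc = std::min(MC, M - ic);
        // Packing routine 2: copy an mc x kc block of A into a created array.
        for (std::size_t i = 0; i < mc; ++i)
          for (std::size_t p = 0; p < kc; ++p)
            PackedA[i * KC + p] = A[(ic + i) * K + (pc + p)];
        // Macro-kernel: two loops around the micro-kernel.
        for (std::size_t jr = 0; jr < nc; jr += NR)
          for (std::size_t ir = 0; ir < mc; ir += MR)
            // Micro-kernel: a loop over p around a rank-1 update of a tile.
            for (std::size_t p = 0; p < kc; ++p)
              for (std::size_t i = ir; i < std::min(ir + MR, mc); ++i)
                for (std::size_t j = jr; j < std::min(jr + NR, nc); ++j)
                  C[(ic + i) * N + (jc + j)] +=
                      PackedA[i * KC + p] * PackedB[p * NC + j];
      }
    }
  }
}

// Compares the packed version against a straightforward triple loop.
bool packedMatchesNaive(std::size_t M, std::size_t N, std::size_t K) {
  std::vector<double> A(M * K), B(K * N), Packed(M * N, 0.0), Naive(M * N, 0.0);
  for (std::size_t i = 0; i < A.size(); ++i)
    A[i] = static_cast<double>(i % 5) + 1.0;
  for (std::size_t i = 0; i < B.size(); ++i)
    B[i] = static_cast<double>(i % 7) - 3.0;
  packedGemm(M, N, K, A, B, Packed);
  for (std::size_t i = 0; i < M; ++i)
    for (std::size_t j = 0; j < N; ++j)
      for (std::size_t p = 0; p < K; ++p)
        Naive[i * N + j] += A[i * K + p] * B[p * N + j];
  return Packed == Naive;
}
```

Calling packedMatchesNaive(5, 7, 6) exercises partial blocks in all three dimensions, because none of the sizes is a multiple of the block sizes.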
This line makes BUILD_SHARED_LIBS=ON work for Polly-ACC. Without it, ld
complains about missing isl symbols when constructing the shared library.
llvm-svn: 281396
The alias to the array element is read-only and a primitive type (pointer),
therefore use the value directly instead of a reference to it.
llvm-svn: 281311
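As an illustration of the pattern (the function and variable names are hypothetical, not taken from the patch), a range-based loop over pointers can take each element by value:

```cpp
#include <cassert>
#include <vector>

// Sums the integers that the stored pointers refer to.
int sumPointees(const std::vector<const int *> &Ptrs) {
  int Sum = 0;
  // The element is a read-only pointer, so it is copied by value here
  // rather than bound as 'const int *const &P'.
  for (const int *P : Ptrs)
    Sum += *P;
  return Sum;
}
```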
The -fvisibility=hidden flag was used for the integrated Integer
Set Library (and PPCG) to keep their definitions local to Polly. The
motivation was to be loaded into a DragonEgg-powered GCC, where GCC
might itself use ISL for its Graphite extension. The symbols of Polly's
ISL and GCC's ISL would clash.
The DragonEgg project is not actively developed anymore, but Polly's
unittests need to call ISL functions to set up a testing environment.
Unfortunately, the -fvisibility=hidden flag means that the ISL symbols
are not available to the gtest executable as it resides outside of
libPolly when linked dynamically. Currently, CMake links a second copy
of ISL into the unittests, which leads to subtle bugs. We observed
that two isl_ids for isl_id_none exist, one for each library
instance. Because isl_id's are compared by address, isl_id_none could
happen to be different from isl_id_none, depending on which library
instance set the address and does the comparison.
Also remove the FORCE_STATIC flag which was introduced to keep the ISL
symbols visible inside the same libPolly shared object, even when built
with BUILD_SHARED_LIBS.
Differential Revision: https://reviews.llvm.org/D24460
llvm-svn: 281242
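The address-comparison problem can be reproduced in a small stand-alone sketch. The types below are hypothetical stand-ins and do not reflect how isl implements its ids; each struct merely plays the role of one linked copy of the library with its own storage for the "none" id.

```cpp
#include <cassert>

// Hypothetical stand-in for an id object that is compared by address.
struct Id {
  const char *Name;
};

// Stands in for the first linked copy of the library.
struct LibraryCopyA {
  static const Id *none() {
    static Id None{"none"};
    return &None;
  }
};

// Stands in for the second linked copy, with its own 'none' object.
struct LibraryCopyB {
  static const Id *none() {
    static Id None{"none"};
    return &None;
  }
};

// Ids are equal only if they are the same object.
bool sameId(const Id *L, const Id *R) { return L == R; }
```

Within one copy, sameId(LibraryCopyA::none(), LibraryCopyA::none()) holds, but mixing the copies yields two different addresses for what is logically the same id.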
We do not need the size of the outermost dimension in most cases, but if we
allocate memory for newly created arrays, that size is needed.
Reviewed-by: Michael Kruse <llvm@meinersbur.de>
Differential Revision: https://reviews.llvm.org/D23991
llvm-svn: 281234
Instead of aborting, we now bail out gracefully in case the kernel IR we
generate is invalid. This can currently happen in case the SCoP stores
pointer values, which we model as arrays, as data values into other arrays. In
this case, the original pointer value is not available on the device and can
consequently not be stored. As detecting this ahead of time is not so easy, we
detect these situations after the invalid IR has been generated and bail out.
llvm-svn: 281193
If these arrays have never been accessed we failed to derive an upper bound
of the accesses and consequently a size for the outermost dimension. We
now explicitly check for empty access sets and then just use zero as size
for the outermost dimension.
llvm-svn: 281165
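A minimal sketch of the empty-set check, using a plain vector of accessed indices as a hypothetical stand-in for the isl access set (the function name is also hypothetical):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the size needed for the outermost dimension, given the indices
// accessed in that dimension. No upper bound exists for an empty access
// set, so zero is returned explicitly in that case.
std::int64_t outermostDimSize(const std::vector<std::int64_t> &Accessed) {
  if (Accessed.empty())
    return 0;
  return *std::max_element(Accessed.begin(), Accessed.end()) + 1;
}
```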
The -polly-flatten-schedule pass reduces the number of scattering
dimensions in its isl_union_map form to make them easier to understand.
It is not meant to be used in production, only for debugging and
regression tests.
To illustrate how it can make sets simpler, here is a lifetime set
computed by the proposed DeLICM pass without flattening:
{ Stmt_reduction_for[0, 4] -> [0, 2, o2, o3] : o2 < 0;
Stmt_reduction_for[0, 4] -> [0, 1, o2, o3] : o2 >= 5;
Stmt_reduction_for[0, 4] -> [0, 1, 4, o3] : o3 > 0;
Stmt_reduction_for[0, i1] -> [0, 1, i1, 1] : 0 <= i1 <= 3;
Stmt_reduction_for[0, 4] -> [0, 2, 0, o3] : o3 <= 0 }
And here is the same lifetime for a semantically identical one-dimensional
schedule:
{ Stmt_reduction_for[0, i1] -> [2 + 3i1] : 0 <= i1 <= 4 }
Differential Revision: https://reviews.llvm.org/D24310
llvm-svn: 280948
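One way to see why a multi-dimensional schedule can be replaced by a one-dimensional one is the sketch below. It is not the pass's implementation: it assumes a two-dimensional schedule whose inner dimension is bounded by a known extent, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// A two-dimensional time point, ordered lexicographically.
using Time2D = std::pair<std::int64_t, std::int64_t>;

// Maps (t0, t1) to a single time point.
// Assumes 0 <= T.second < InnerExtent.
std::int64_t flatten(Time2D T, std::int64_t InnerExtent) {
  return T.first * InnerExtent + T.second;
}

// Exhaustively checks that flattening keeps the execution order of all
// time points in [0, OuterExtent) x [0, InnerExtent).
bool flatteningPreservesOrder(std::int64_t OuterExtent,
                              std::int64_t InnerExtent) {
  for (std::int64_t a0 = 0; a0 < OuterExtent; ++a0)
    for (std::int64_t a1 = 0; a1 < InnerExtent; ++a1)
      for (std::int64_t b0 = 0; b0 < OuterExtent; ++b0)
        for (std::int64_t b1 = 0; b1 < InnerExtent; ++b1) {
          Time2D A{a0, a1}, B{b0, b1};
          bool Before2D = A < B; // lexicographic comparison
          bool Before1D = flatten(A, InnerExtent) < flatten(B, InnerExtent);
          if (Before2D != Before1D)
            return false;
        }
  return true;
}
```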
... to preserve reference counting logic.
In practice the missing assignment would not have caused any issues. We still
fix it as the code is wrong and it also causes noise in the clang static
analysis runs.
llvm-svn: 280946
When running the clang static analyser to check for memory issues, this code
originally showed a double free, as the analyser was unable to understand that
isl_set_free always returns NULL and consequently later uses of the isl object
we just freed will never be reached. Without this knowledge, the analyser has
to issue a warning.
We refactor the code to make it clear that for empty maps the current loop
iteration is aborted.
llvm-svn: 280940
When running the clang static analyser to check for memory issues, this code
originally showed a double free, as the analyser was unable to understand that
isl_union_map_free always returns NULL and consequently later uses of the isl
object we just freed will never be reached. Without this knowledge, the analyser
has to issue a warning.
We refactor the code to make it clear that for empty maps the current loop
iteration is aborted.
llvm-svn: 280938
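The shape of this refactoring can be sketched without isl. Set, makeSet and freeSet are hypothetical stand-ins; the only property borrowed from the text above is that the free function always returns a null pointer.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a reference-counted isl object.
struct Set {
  bool Empty;
};

int LiveObjects = 0; // tracks leaks and double frees in this sketch

Set *makeSet(bool Empty) {
  ++LiveObjects;
  return new Set{Empty};
}

// Frees the object and always returns a null pointer.
Set *freeSet(Set *S) {
  if (S) {
    delete S;
    --LiveObjects;
  }
  return nullptr;
}

// Counts the non-empty sets. For an empty set the iteration is aborted
// with an explicit 'continue', so no later use of the freed object is
// reachable and no knowledge about freeSet's return value is required.
int countNonEmpty(const std::vector<bool> &EmptyFlags) {
  int Count = 0;
  for (bool Empty : EmptyFlags) {
    Set *S = makeSet(Empty);
    if (S->Empty) {
      freeSet(S);
      continue;
    }
    ++Count;
    freeSet(S);
  }
  return Count;
}
```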
Disable some Visual C++ warnings on ISL. These are not reported by GCC/Clang in
the ISL build system. We do not intend to fix them in the Polly in-tree copy,
hence disable these warnings.
llvm-svn: 280811
The check-polly-tests target runs regression/unit tests but without checking
formatting. This is useful to avoid having to reload a file in an open editor
(which, e.g., clears the undo buffer and moves the cursor/window position) when
running polly-update-format.
After this change, the following test targets exist:
- check-polly-unittests to run unittests only
- check-polly-tests to run unit and regression tests
- polly-check-format to check formatting using clang-format
- check-polly to run them all
As a side effect, when running check-polly, polly-check-format and the tests run
in parallel (instead of polly-check-format running first).
Differential Revision: https://reviews.llvm.org/D24191
llvm-svn: 280654
... but instead rely on the assumptions that we derive for load/store
instructions.
Before we were able to delinearize arrays, we used GEP pointer instructions
to derive information about the likely range of induction variables, which
gave us more freedom during loop scheduling. Today, this is not needed
any more as we delinearize multi-dimensional memory accesses and as part
of this process also "assume" that all accesses to these arrays remain
inbounds. The old derive-assumptions-from-GEP code has consequently become
mostly redundant. We drop it both to clean up our code and to improve
compile time. This change reduces the scop construction time for 3mm in
no-asserts mode on my machine from 48 to 37 ms.
llvm-svn: 280601
Without reductions we do not need a flat union_map schedule describing
the computation we want to perform, but can work purely on the schedule
tree. This reduces the dependence computation and scheduling time from 33ms
to 25ms, a further reduction of about 24%.
llvm-svn: 280558
In case we do not compute reduction dependences or dependences that are more
fine-grained than statement level dependences, we can avoid the corresponding
part of the dependence analysis altogether. For the 3mm benchmark, this
reduces scheduling + dependence analysis time from 62ms to 33ms for a no-asserts
build. The majority of the compile time is anyhow spent in the LLVM backends,
when doing code generation. Nevertheless, there is no need to waste compile time
either.
llvm-svn: 280557
We replace the options
  -polly-code-generator=none
                       =isl
with the options
  -polly-code-generation=none
                        =ast
                        =full
This allows us to measure the overhead of Polly itself, versus the compile
time increases due to us generating more IR and consequently the LLVM backends
spending more time on this IR.
We also use this opportunity to rename the option. The original name was
introduced at a point where we still had two code generators: CLooG and the
isl AST generator. Since we only have one AST generator left, there is no need
to distinguish between 'isl' and something else. However, being able to disable
code generation altogether has proven useful for debugging. Hence, we
rename and extend this option to make it a good fit for its new use case.
llvm-svn: 280554
LLVM's coding guidelines suggest not using @brief for one-sentence doxygen
comments to improve readability. Switch this once and for all to ensure people
do not copy @brief comments from other parts of Polly, when writing new code.
llvm-svn: 280468
Change the code around setNewAccessRelation to allow using an existing array
element for memory instead of an ad-hoc alloca. This facility will be used for
DeLICM/DeGVN to convert scalar dependencies into regular ones.
The changes necessary include:
- Make the code generator use the implicit locations instead of the alloca ones.
- A test case
- Make the JScop importer accept changes of scalar accesses for that test case.
- Adapt the MemoryAccess interface to the fact that the MemoryKind can change.
They are named (get|is)OriginalXXX() to get the status of the memory access
before any change by setNewAccessRelation() (some properties such as
getIncoming() do not change even if the kind is changed and are still
required). To get the modified properties, there is (get|is)LatestXXX(). The
old accessors without Original|Latest become synonyms of the
(get|is)OriginalXXX() to not make functional changes in unrelated code.
Differential Revision: https://reviews.llvm.org/D23962
llvm-svn: 280408
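The accessor naming scheme can be sketched with a heavily simplified class. The class name, the two-value enum and setNewKind are hypothetical simplifications; only the Original/Latest split and the old accessor acting as a synonym for the Original variant follow the description above.

```cpp
#include <cassert>

// Simplified stand-in for the kinds of memory an access can refer to.
enum class MemoryKind { Array, Scalar };

class AccessSketch {
  MemoryKind OriginalKind;
  MemoryKind NewKind;
  bool HasNewKind = false;

public:
  explicit AccessSketch(MemoryKind Kind)
      : OriginalKind(Kind), NewKind(Kind) {}

  // Models the effect of a changed access relation on the kind.
  void setNewKind(MemoryKind Kind) {
    NewKind = Kind;
    HasNewKind = true;
  }

  // Status before any change.
  MemoryKind getOriginalKind() const { return OriginalKind; }
  bool isOriginalScalarKind() const {
    return OriginalKind == MemoryKind::Scalar;
  }

  // Status after the most recent change, if there was one.
  MemoryKind getLatestKind() const {
    return HasNewKind ? NewKind : OriginalKind;
  }
  bool isLatestScalarKind() const {
    return getLatestKind() == MemoryKind::Scalar;
  }

  // Old accessor without Original|Latest: a synonym for the Original
  // variant, so unrelated callers see no functional change.
  MemoryKind getKind() const { return getOriginalKind(); }
};
```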
There are some constraints on maps that can be access relations. In builds with
assertions enabled, verify that:
- The access domain is the same space as the statement's domain (modulo parameters).
- The access is defined for every instance of the statement (codegen does not yet support partial access relations).
- The access range links to an array, represented by a ScopArrayInfo.
- The number of access dimensions equals the dimensions of the array.
- The array is not an indirect access (also not supported by codegen).
Differential Revision: https://reviews.llvm.org/D23916
llvm-svn: 280404
isl_val_int_from_ui takes an 'unsigned long', which is only 32 bits wide on
32-bit and LLP64 Windows systems. Hence, make sure we do not use it with
constants that are larger than 32 bits.
Reported-by: Michael Kruse <llvm@meinersbur.de>
llvm-svn: 279824
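A width check of this kind can be written portably. The helper below is a hypothetical illustration and not part of the patch; it only decides whether a 64-bit constant survives conversion to a given unsigned type, such as 'unsigned long'.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Returns true if Value can be represented in UIntT without truncation.
// With UIntT = unsigned long the result depends on the platform: the type
// has 64 bits on LP64 systems but only 32 bits on LLP64 Windows.
template <typename UIntT>
constexpr bool fitsIn(std::uint64_t Value) {
  return Value <= std::numeric_limits<UIntT>::max();
}
```

For example, fitsIn<unsigned long>(Value) can guard a call that takes an 'unsigned long', and fitsIn<std::uint32_t> models the 32-bit case on any host.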
This improves the readability of failing test results, as gtest always prints
the first argument as the 'expected value'.
In the previous commit we already changed the tests for isl_valFromAPInt. In
this commit, the tests for IslValToAPInt follow.
Suggested-by: Michael Kruse <llvm@meinersbur.de>
llvm-svn: 279817
The recent unit tests we gained made clear that the semantics of
isl_valFromAPInt are not clear, due to missing documentation. In this change we
document both the calling interface as well as the implementation of
isl_valFromAPInt.
We also make the implementation easier to read by removing integer wrapping in
abs() when passing in the minimal integer value for a given bitwidth. Even
though wrapping and subsequently interpreting the result as unsigned value gives
the correct result, this is far from obvious. Instead, we explicitly add one
more bit to the input type to ensure that abs will never wrap. This change did
not uncover a bug in the old implementation, but was introduced to increase
readability.
We update the tests to add a test case for this special case and use this
opportunity to also test a number larger than 64 bit. Finally, we order the
arguments of the test cases to make sure the expected output is first. This
helps readability in case of failing test cases as gtest assumes the first value
to be the expected value.
Reviewed-by: Michael Kruse <llvm@meinersbur.de>
Differential Revision: https://reviews.llvm.org/D23917
llvm-svn: 279815
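The widening idea can be shown with fixed-width integers instead of APInt. This is a sketch under that assumption, with a hypothetical function name: the value is extended to a wider type before taking the absolute value, so the minimal value of the narrow type cannot wrap.

```cpp
#include <cassert>
#include <cstdint>

// Returns |V| for a 32-bit signed value. Negating INT32_MIN directly would
// overflow, so the value is first sign-extended to 64 bits. One extra bit
// would already be enough; the fixed-width types used here add 32.
std::uint64_t magnitude(std::int32_t V) {
  std::int64_t Wide = V;
  return static_cast<std::uint64_t>(Wide < 0 ? -Wide : Wide);
}
```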
The recent unit tests we gained made clear that the semantics of APIntFromVal
are not clear, due to missing documentation. In this change we document both
the calling interface as well as the implementation of APIntFromVal. We also
make the implementation easier to read by removing the use of magic numbers.
Finally, we add tests to check the bitwidth of the created values as well as
the correct modeling of very large numbers.
Reviewed-by: Michael Kruse <llvm@meinersbur.de>
Differential Revision: https://reviews.llvm.org/D23910
llvm-svn: 279813
Remove the unused function get_system_libs. Instead, run
'llvm-config --system-libs' to determine which libraries are required in
addition to LLVM's for linking an executable. At the moment these are the unittests
that link to gtest and transitively depend on these system libs.
llvm-svn: 279743