[Docs] Add a performance document.
Summary:
Add a document which describes:
- GEMM performance comparison.
- An experiment that measures the compile time impact of enabling Polly when
  compiling LLVM+Clang+Polly.

Contributed-by: Theodoros Theodoridis <theodoros.theodoridis@inf.ethz.ch>

Differential Revision: https://reviews.llvm.org/D38330

llvm-svn: 314419
parent c965b30e54
commit 99d3567b0d
@ -0,0 +1,57 @@
.. include:: <isonum.txt>

==================================================
Performance
==================================================

High-Performance Generalized Matrix Multiplication
--------------------------------------------------

Polly automatically detects and optimizes generalized matrix multiplication,
the computation C |larr| α ⊗ C ⊕ β ⊗ A ⊗ B, where A, B, and C are three
appropriately sized matrices, the ⊕ and ⊗ operations originate from the
corresponding matrix semiring, and α and β are constants, with β not equal to
zero. This allows Polly to obtain a highly optimized form structured similarly
to the expert implementation of GEMM found in GotoBLAS and its successors. The
performance evaluation of GEMM is shown in the following figure.

.. image:: images/GEMM_double.png
    :align: center

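To make the semiring formulation above concrete, the following is a minimal reference sketch of C |larr| α ⊗ C ⊕ β ⊗ A ⊗ B in plain Python. It is not Polly input and not part of Polly: the function name ``gemm`` and its ``add``/``mul`` parameters are hypothetical, chosen only to show how ⊕ and ⊗ can be swapped, here between ordinary arithmetic and the (min, +) semiring.

```python
import operator


def gemm(C, A, B, alpha, beta, add=operator.add, mul=operator.mul):
    """Return alpha (x) C (+) beta (x) A (x) B over the semiring (add, mul).

    Hypothetical helper for illustration only; matrices are lists of rows.
    """
    n, p, m = len(A), len(B), len(B[0])
    assert all(len(row) == p for row in A), "A must be n x p"
    assert len(C) == n and all(len(row) == m for row in C), "C must be n x m"

    result = []
    for i in range(n):
        out_row = []
        for j in range(m):
            # Start from the scaled old value of C[i][j] ...
            acc = mul(alpha, C[i][j])
            # ... and accumulate the scaled products along the shared dimension.
            for k in range(p):
                acc = add(acc, mul(beta, mul(A[i][k], B[k][j])))
            out_row.append(acc)
        result.append(out_row)
    return result


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = [[10.0, 10.0], [10.0, 10.0]]

# Ordinary (+, *) arithmetic: the classic GEMM.
plus_times = gemm(C, A, B, alpha=2.0, beta=1.0)

# (min, +) semiring: "addition" is min and "multiplication" is +.
# Here 0.0 is the multiplicative identity, not the semiring zero (+inf).
min_plus = gemm(C, A, B, alpha=0.0, beta=0.0, add=min, mul=operator.add)

print(plus_times)
print(min_plus)
```

The triple loop over ``i``, ``j``, and ``k`` has the general shape of the computation described above; whether a particular source-level loop nest is recognized by Polly depends on the Polly version and options, which this sketch does not model.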
Compile Time Impact of Polly
----------------------------

Clang+LLVM+Polly are compiled using Clang on an Intel(R) Core(TM) i7-7700 based system. The experiment
is run in two configurations, with and without Polly enabled, in order to measure Polly's compile time impact.

The following versions are used:

- Polly (git hash 0db98a4837b6f233063307bb9184374175401922)
- Clang (git hash 3e1d04a92b51ed36163995c96c31a0e4bbb1561d)
- LLVM (git hash 0265ec7ebad69a47f5c899d95295b5eb41aba68e)

`ninja <https://ninja-build.org/>`_ is used as the build system.

In both configurations, the whole compilation was performed five times. The compile times, in seconds, are shown in the following table.

+----------------------------+
|   Compile Time (seconds)   |
+--------------+-------------+
|Polly Disabled|Polly Enabled|
+==============+=============+
|964           |977          |
+--------------+-------------+
|964           |980          |
+--------------+-------------+
|967           |981          |
+--------------+-------------+
|967           |981          |
+--------------+-------------+
|968           |982          |
+--------------+-------------+

The median compile time is 967 seconds with Polly disabled and 981 seconds with Polly enabled, an overhead of about 1.4% (14 seconds).

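The medians and the overhead figure can be re-derived from the table with a few lines of Python. This is only a sketch: it assumes the overhead is the relative difference between the two medians, which the text does not state explicitly, and the variable names are illustrative.

```python
from statistics import median

# Compile times in seconds, copied from the table above.
polly_disabled = [964, 964, 967, 967, 968]
polly_enabled = [977, 980, 981, 981, 982]

median_disabled = median(polly_disabled)
median_enabled = median(polly_enabled)

# Assumed definition: overhead relative to the Polly-disabled median, in percent.
overhead_percent = 100.0 * (median_enabled - median_disabled) / median_disabled

print(f"median without Polly: {median_disabled} s")
print(f"median with Polly:    {median_enabled} s")
print(f"overhead:             {overhead_percent:.1f}%")
```

Using the median rather than the mean keeps a single unusually slow or fast build from skewing the comparison.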
@ -25,6 +25,7 @@ Using Polly
UsingPollyWithClang
HowToManuallyUseTheIndividualPiecesOfPolly
TipsAndTricks
Performance

* `A list of Polly passes <http://polly.llvm.org/documentation/passes.html>`_
