forked from OSchip/llvm-project
e0d466342b
Dimensions of band nodes can be implicitly permuted by the algorithm applied during the schedule generation. For example, in case of the following matrix-matrix multiplication, for (i = 0; i < 1024; i++) for (k = 0; k < 1024; k++) for (j = 0; j < 1024; j++) C[i][j] += A[i][k] * B[k][j]; it can produce the following schedule tree domain: "{ Stmt_for_body6[i0, i1, i2] : 0 <= i0 <= 1023 and 0 <= i1 <= 1023 and 0 <= i2 <= 1023 }" child: schedule: "[{ Stmt_for_body6[i0, i1, i2] -> [(i0)] }, { Stmt_for_body6[i0, i1, i2] -> [(i1)] }, { Stmt_for_body6[i0, i1, i2] -> [(i2)] }]" permutable: 1 coincident: [ 1, 1, 0 ] The current implementation of the pattern matching optimizations relies on the initial ordering of dimensions. Otherwise, it can produce the miscompilation (e.g., [1]). This patch helps to restore the initial ordering of dimensions by recreating the band node when the corresponding conditions are satisfied. Refs.: [1] - https://bugs.llvm.org/show_bug.cgi?id=32500 Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D31741 llvm-svn: 299662 |
||
---|---|---|
.. | ||
CodeGen/OpenMP | ||
DeLICM | ||
DeadCodeElimination | ||
DependenceInfo | ||
FlattenSchedule | ||
GPGPU | ||
Isl | ||
PruneUnprofitable | ||
ScheduleOptimizer | ||
ScopDetect | ||
ScopDetectionDiagnostics | ||
ScopInfo | ||
Simplify | ||
Unit | ||
UnitIsl | ||
CMakeLists.txt | ||
README | ||
create_ll.sh | ||
lit.cfg | ||
lit.site.cfg.in | ||
polly.ll | ||
update_check.py |
README
place tests here