forked from OSchip/llvm-project
42402c9e89
This is the first patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf).

BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update.

In this change we create the BLIS micro-kernel by applying a combination of tiling and unrolling. In subsequent changes we will add the extraction of the BLIS macro-kernel and implement the packing transformation.

Contributed-by: Roman Gareev <gareevroman@gmail.com>
Reviewed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: http://reviews.llvm.org/D21140
llvm-svn: 273397
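The loop structure described in the commit message can be sketched in plain C. This is only an illustration of the BLIS layering (three loops, macro-kernel, micro-kernel, rank-1 update), not the code Polly generates: the function names, the row-major layout, and the block sizes `NC`, `KC`, `MC`, `NR`, `MR` are assumptions chosen for readability, and the two packing routines are left out.

```c
#include <stddef.h>

/* Hypothetical block sizes; real values are derived from the cache
 * hierarchy and register file of the target machine. */
enum { NC = 8, KC = 4, MC = 4, NR = 2, MR = 2 };

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Micro-kernel: a loop over p around a rank-1 (outer product) update
 * of an mr x nr tile of C. */
static void micro_kernel(size_t mr, size_t nr, size_t kc,
                         const double *A, size_t lda,
                         const double *B, size_t ldb,
                         double *C, size_t ldc)
{
    for (size_t p = 0; p < kc; ++p)
        for (size_t i = 0; i < mr; ++i)
            for (size_t j = 0; j < nr; ++j)
                C[i * ldc + j] += A[i * lda + p] * B[p * ldb + j];
}

/* Macro-kernel: two additional loops around the micro-kernel. */
static void macro_kernel(size_t mc, size_t nc, size_t kc,
                         const double *A, size_t lda,
                         const double *B, size_t ldb,
                         double *C, size_t ldc)
{
    for (size_t jr = 0; jr < nc; jr += NR)
        for (size_t ir = 0; ir < mc; ir += MR)
            micro_kernel(min_sz(MR, mc - ir), min_sz(NR, nc - jr), kc,
                         A + ir * lda, lda,
                         B + jr, ldb,
                         C + ir * ldc + jr, ldc);
}

/* C += A * B for row-major A (m x k), B (k x n), C (m x n):
 * three nested loops around the macro-kernel. Packing is omitted. */
void gemm_blis_sketch(size_t m, size_t n, size_t k,
                      const double *A, const double *B, double *C)
{
    for (size_t jc = 0; jc < n; jc += NC)
        for (size_t pc = 0; pc < k; pc += KC)
            for (size_t ic = 0; ic < m; ic += MC)
                macro_kernel(min_sz(MC, m - ic), min_sz(NC, n - jc),
                             min_sz(KC, k - pc),
                             A + ic * k + pc, k,
                             B + pc * n + jc, n,
                             C + ic * n + jc, n);
}
```

In this sketch the micro-kernel bounds are clamped with `min_sz` so that matrix sizes that are not multiples of the block sizes still work. An optimizer would typically handle full tiles, where the bounds are compile-time constants, separately so that the inner loops can be fully unrolled.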
| File |
|---|
| 2012-03-16-Empty-Domain.ll |
| 2012-04-16-Trivially-vectorizable-loops.ll |
| 2013-04-11-Empty-Domain-two.ll |
| computeout.ll |
| full_partial_tile_separation.ll |
| line-tiling-2.ll |
| line-tiling.ll |
| one-dimensional-band.ll |
| outer_coincidence.ll |
| pattern-matching-based-opts.ll |
| pattern-matching-based-opts_2.ll |
| pattern-matching-based-opts_3.ll |
| prevectorization-without-tiling.ll |
| prevectorization.ll |
| rectangular-tiling.ll |