forked from OSchip/llvm-project
9802268ad3
In loop-vectorize, the interleave count and vectorization factor depend on the number of target registers. Currently, register pressure is not estimated separately for different register classes (in particular, scalar float types should not be counted together with scalar int types), so the estimate is inaccurate. Specifically, this causes too much interleaving/unrolling, which results in too many register spills in the loop body and hurts performance. We therefore need to classify registers into classes at the IR level. Importantly, these are abstract register classes, not the target register classes the backend provides in its .td files. They are used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit some set of those types. For example, on the POWER target the register counts are special when VSX is enabled: there are 32 int scalar registers (GPR) and 64 float scalar registers (VSR), while int and float vector registers both number 64 (VSR). So there should be 2 register classes when VSX is enabled and 3 register classes when VSX is not enabled. On the POWER target, this gives a large (about +30%) performance improvement in one specific SPEC2017 benchmark (503.bwaves_r), with no other obvious regressions. Differential revision: https://reviews.llvm.org/D67148 llvm-svn: 374634 |
||
---|---|---|
agg-interleave-a2.ll | ||
large-loop-rdx.ll | ||
lit.local.cfg | ||
massv-altivec.ll | ||
massv-calls.ll | ||
massv-nobuiltin.ll | ||
massv-unsupported.ll | ||
pr30990.ll | ||
reg-usage.ll | ||
small-loop-rdx.ll | ||
stride-vectorization.ll | ||
vectorize-only-for-real.ll | ||
vsx-tsvc-s173.ll |
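
The per-class accounting described in the commit message can be illustrated with a minimal sketch. This is not the code from the patch: the names `RegClass`, `ValueKind`, `classForKind`, `numRegisters`, and `exceedsBudget` are hypothetical, and the count of 32 registers for each non-VSX class is an assumption added for illustration (the commit message only gives the VSX-enabled counts).

```cpp
#include <map>
#include <vector>

// Hypothetical abstract register classes (illustrative names, not LLVM's).
enum class RegClass { GPR, FPR, VR, VSR };

// Coarse kinds of IR value types that compete for registers.
enum class ValueKind { IntScalar, FloatScalar, Vector };

// Map a value kind to an abstract register class.
// With VSX: 2 classes (GPR, VSR). Without VSX: 3 classes (GPR, FPR, VR).
RegClass classForKind(ValueKind kind, bool hasVSX) {
  if (kind == ValueKind::IntScalar)
    return RegClass::GPR;
  if (hasVSX)
    return RegClass::VSR;
  return kind == ValueKind::FloatScalar ? RegClass::FPR : RegClass::VR;
}

// Register budget per class. 64 for VSR comes from the commit message;
// 32 for the remaining classes is an assumption made for this sketch.
unsigned numRegisters(RegClass rc) {
  return rc == RegClass::VSR ? 64u : 32u;
}

// Returns true if, for any class, the number of simultaneously live
// values exceeds that class's register budget (i.e. spills are likely).
bool exceedsBudget(const std::vector<ValueKind>& live, bool hasVSX) {
  std::map<RegClass, unsigned> usage;
  for (ValueKind kind : live)
    ++usage[classForKind(kind, hasVSX)];
  for (const auto& entry : usage)
    if (entry.second > numRegisters(entry.first))
      return true;
  return false;
}
```

Under these assumptions, 40 live float scalars fit when VSX is enabled (budget 64) but exceed the budget when it is not (budget 32), while 30 live int scalars plus 30 live float scalars fit in both cases because they are counted against separate classes rather than one shared pool.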