llvm-project/llvm/test/Analysis/CostModel
Dorit Nuzman e0e0f1ddb0 [AVX2] [TTI CostModel] Add cost of interleaved loads/stores for AVX2
The cost of an interleaved access was only implemented for AVX512. For other
X86 targets an overly conservative base cost was returned, which prevented
vectorization in cases where vectorizing is actually profitable.
This patch starts to add costs for AVX2 for the most prominent cases of
interleaved accesses (strides 3 and 4 on chars, for now).
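
For illustration only (this is not part of the patch): a minimal C sketch of
the kind of loops the message refers to, assuming packed 8-bit RGB and RGBA
pixels. The function names are hypothetical. The first loop reads with
stride 3, the second writes with stride 4; whether a given compiler actually
vectorizes them depends on the target and on its cost model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stride-3 interleaved load: split packed RGB bytes into three planes.
 * Each output element i reads rgb[3*i], rgb[3*i+1] and rgb[3*i+2]. */
void deinterleave_rgb(const uint8_t *rgb, uint8_t *r, uint8_t *g,
                      uint8_t *b, size_t n)
{
    assert(rgb && r && g && b);
    for (size_t i = 0; i < n; ++i) {
        r[i] = rgb[3 * i + 0];
        g[i] = rgb[3 * i + 1];
        b[i] = rgb[3 * i + 2];
    }
}

/* Stride-4 interleaved store: pack four planes into RGBA bytes.
 * Each input element i writes rgba[4*i] through rgba[4*i+3]. */
void interleave_rgba(const uint8_t *r, const uint8_t *g, const uint8_t *b,
                     const uint8_t *a, uint8_t *rgba, size_t n)
{
    assert(r && g && b && a && rgba);
    for (size_t i = 0; i < n; ++i) {
        rgba[4 * i + 0] = r[i];
        rgba[4 * i + 1] = g[i];
        rgba[4 * i + 2] = b[i];
        rgba[4 * i + 3] = a[i];
    }
}
```

When such a loop is vectorized, the loads or stores have to be combined with
shuffles that separate or merge the interleaved elements, and that shuffle
overhead is what an interleaved-access cost is meant to account for.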

Note 1: Improvements of up to ~4x were observed in some of EEMBC's rgb
workloads. There is also a known issue of 15-30% degradations on some of these
workloads, associated with an interleaved access followed by type
promotion/widening. The resulting shuffle sequence is currently inefficient and
will be improved by a series of patches that extend the X86InterleavedAccess pass
(such as D34601 and more to follow).

Note 2: The costs in this patch do not reflect port pressure penalties, which can
be very dominant in the case of interleaved accesses since most of the shuffle
operations are restricted to a single port. Further tuning, which may incorporate
these considerations, will be done on top of the upcoming improved shuffle
sequences (that is, along with the above-mentioned work to extend the
X86InterleavedAccess pass).


Differential Revision: https://reviews.llvm.org/D34023

llvm-svn: 306238
2017-06-25 08:26:25 +00:00
..
AArch64     Revert r291254: [AArch64] Reduce vector insert/extract cost for Falkor       2017-05-24 16:48:39 +00:00
AMDGPU      AMDGPU: Make some packed shuffles free                                       2017-05-10 21:29:33 +00:00
ARM         [TTI/CostModel] Correct the way getGEPCost() calls isLegalAddressingMode()   2016-12-03 01:57:24 +00:00
PowerPC     [PPC] Give unaligned memory access lower cost on processor that supports it  2017-02-17 22:29:39 +00:00
SystemZ     [SystemZ] Modelling of costs of divisions with a constant power of 2.        2017-05-17 12:46:26 +00:00
X86         [AVX2] [TTI CostModel] Add cost of interleaved loads/stores for AVX2         2017-06-25 08:26:25 +00:00
no_info.ll  Roll forward r243250                                                         2015-07-26 19:10:03 +00:00