[X86][Costmodel] `getReplicationShuffleCost()`: promote 1-bit-wide elements to 16 bits when AVX512BW is available

We get pretty lucky here. AVX512F does not provide any instructions
to convert between a `k` vector mask and a vector,
but AVX512BW adds `{k}<->nX{i8,i16}` conversions,
and, as it happens, AVX512BW also gives us an i16 shuffle.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113915
Roman Lebedev 2021-11-19 15:55:07 +03:00
parent 92d279fd6d
commit a50fdd3fc9
2 changed files with 86 additions and 15 deletions

@@ -3664,6 +3664,13 @@ X86TTIImpl::getReplicationShuffleCost(Type *EltTy, int ReplicationFactor,
     if (!ST->hasVBMI())
       PromEltTyBits = 32; // promote to i32, AVX512F.
     break;                // AVX512VBMI
+  case 1:
+    // There is no support for shuffling i1 elements. We *must* promote.
+    if (ST->hasBWI()) {
+      PromEltTyBits = 16; // promote to i16, AVX512BW.
+      break;
+    }
+    return bailout();
   default:
     return bailout();
   }
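
For readers without the surrounding function at hand, the decision the new `case 1:` makes can be read in isolation. The following is only a minimal standalone sketch, not the LLVM code: `Features` and `promotedBitsForI1` are hypothetical names invented here, and `std::nullopt` stands in for the `bailout()` path of the patch.

```cpp
#include <optional>

// Hypothetical stand-in for the subtarget query used in the patch
// (ST->hasBWI()); not an LLVM type.
struct Features {
  bool hasBWI = false; // is AVX512BW available?
};

// Returns the element width, in bits, that i1 elements get promoted to
// before the replication shuffle is costed, or std::nullopt where the
// patch would call bailout().
std::optional<unsigned> promotedBitsForI1(const Features &ST) {
  // i1 elements cannot be shuffled directly, so promotion is mandatory.
  if (ST.hasBWI)
    return 16u; // AVX512BW: k <-> vector conversions plus an i16 shuffle.
  return std::nullopt; // No AVX512BW: no k <-> vector conversion, give up.
}
```

With `Features{true}` the sketch yields 16, matching `PromEltTyBits = 16`; with `Features{false}` it yields no value, matching the `return bailout();` fallthrough.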

File diff suppressed because one or more lines are too long