[AArch64] Increase cost of v2i64 multiplies

The cost of a v2i64 multiply was special cased in D92208 as scalarized
into 4*extract + 2*insert + 2*mul. Scalarizing to/from gpr registers are
expensive though, and the cost wasn't high enough to prevent vectorizing
in places where it can be detrimental for performance. This increases it
so that the costs of copying to/from GPRs is increased to 2 each, with
the total cost increasing to 14. So long as umull/smull are handled
correctly (as in D123006) this seems to lead to better vectorization
factors and better performance.

Differential Revision: https://reviews.llvm.org/D123007
This commit is contained in:
David Green 2022-04-04 17:42:20 +01:00
parent 96039b73d8
commit 750bf3582a
5 changed files with 58 additions and 58 deletions

View File

@ -1841,15 +1841,15 @@ InstructionCost AArch64TTIImpl::getArithmeticInstrCost(
// as elements are extracted from the vectors and the muls scalarized.
// As getScalarizationOverhead is a bit too pessimistic, we estimate the
// cost for a i64 vector directly here, which is:
// - four i64 extracts,
// - two i64 inserts, and
// - two muls.
// So, for a v2i64 with LT.First = 1 the cost is 8, and for a v4i64 with
// LT.first = 2 the cost is 16. If both operands are extensions it will not
// - four 2-cost i64 extracts,
// - two 2-cost i64 inserts, and
// - two 1-cost muls.
// So, for a v2i64 with LT.First = 1 the cost is 14, and for a v4i64 with
// LT.first = 2 the cost is 28. If both operands are extensions it will not
// need to scalarize so the cost can be cheaper (smull or umull).
if (LT.second != MVT::v2i64 || isWideningInstruction(Ty, Opcode, Args))
return LT.first;
return LT.first * 8;
return LT.first * 14;
case ISD::ADD:
case ISD::XOR:
case ISD::OR:

View File

@ -357,9 +357,9 @@ define i32 @smul(i32 %arg) {
; RECIP-NEXT: Cost Model: Found an estimated cost of 44 for instruction: %V4I64 = call { <4 x i64>, <4 x i1> } @llvm.smul.with.overflow.v4i64(<4 x i64> undef, <4 x i64> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 88 for instruction: %V8I64 = call { <8 x i64>, <8 x i1> } @llvm.smul.with.overflow.v8i64(<8 x i64> undef, <8 x i64> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %I32 = call { i32, i1 } @llvm.smul.with.overflow.i32(i32 undef, i32 undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 24 for instruction: %V4I32 = call { <4 x i32>, <4 x i1> } @llvm.smul.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 48 for instruction: %V8I32 = call { <8 x i32>, <8 x i1> } @llvm.smul.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 96 for instruction: %V16I32 = call { <16 x i32>, <16 x i1> } @llvm.smul.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 36 for instruction: %V4I32 = call { <4 x i32>, <4 x i1> } @llvm.smul.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 72 for instruction: %V8I32 = call { <8 x i32>, <8 x i1> } @llvm.smul.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 144 for instruction: %V16I32 = call { <16 x i32>, <16 x i1> } @llvm.smul.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %I16 = call { i16, i1 } @llvm.smul.with.overflow.i16(i16 undef, i16 undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %V8I16 = call { <8 x i16>, <8 x i1> } @llvm.smul.with.overflow.v8i16(<8 x i16> undef, <8 x i16> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %V16I16 = call { <16 x i16>, <16 x i1> } @llvm.smul.with.overflow.v16i16(<16 x i16> undef, <16 x i16> undef)
@ -439,9 +439,9 @@ define i32 @umul(i32 %arg) {
; RECIP-NEXT: Cost Model: Found an estimated cost of 42 for instruction: %V4I64 = call { <4 x i64>, <4 x i1> } @llvm.umul.with.overflow.v4i64(<4 x i64> undef, <4 x i64> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 84 for instruction: %V8I64 = call { <8 x i64>, <8 x i1> } @llvm.umul.with.overflow.v8i64(<8 x i64> undef, <8 x i64> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %I32 = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 undef, i32 undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 23 for instruction: %V4I32 = call { <4 x i32>, <4 x i1> } @llvm.umul.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 46 for instruction: %V8I32 = call { <8 x i32>, <8 x i1> } @llvm.umul.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 92 for instruction: %V16I32 = call { <16 x i32>, <16 x i1> } @llvm.umul.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 35 for instruction: %V4I32 = call { <4 x i32>, <4 x i1> } @llvm.umul.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 70 for instruction: %V8I32 = call { <8 x i32>, <8 x i1> } @llvm.umul.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 140 for instruction: %V16I32 = call { <16 x i32>, <16 x i1> } @llvm.umul.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %I16 = call { i16, i1 } @llvm.umul.with.overflow.i16(i16 undef, i16 undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 15 for instruction: %V8I16 = call { <8 x i16>, <8 x i1> } @llvm.umul.with.overflow.v8i16(<8 x i16> undef, <8 x i16> undef)
; RECIP-NEXT: Cost Model: Found an estimated cost of 30 for instruction: %V16I16 = call { <16 x i16>, <16 x i1> } @llvm.umul.with.overflow.v16i16(<16 x i16> undef, <16 x i16> undef)

View File

@ -1554,15 +1554,15 @@ define void @extmulv2(<2 x i8> %i8, <2 x i16> %i16, <2 x i32> %i32, <2 x i64> %i
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %zl2_8_32 = zext <2 x i8> %i8 to <2 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %azl_8_32 = mul <2 x i32> %zl1_8_32, %zl2_8_32
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %sw_8_64 = sext <2 x i8> %i8 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %asw_8_64 = mul <2 x i64> %i64, %sw_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %asw_8_64 = mul <2 x i64> %i64, %sw_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %sl1_8_64 = sext <2 x i8> %i8 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %sl2_8_64 = sext <2 x i8> %i8 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %asl_8_64 = mul <2 x i64> %sl1_8_64, %sl2_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %asl_8_64 = mul <2 x i64> %sl1_8_64, %sl2_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %zw_8_64 = zext <2 x i8> %i8 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %azw_8_64 = mul <2 x i64> %i64, %zw_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %azw_8_64 = mul <2 x i64> %i64, %zw_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %zl1_8_64 = zext <2 x i8> %i8 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %zl2_8_64 = zext <2 x i8> %i8 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %azl_8_64 = mul <2 x i64> %zl1_8_64, %zl2_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %azl_8_64 = mul <2 x i64> %zl1_8_64, %zl2_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %sw_16_32 = sext <2 x i16> %i16 to <2 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %asw_16_32 = mul <2 x i32> %i32, %sw_16_32
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %sl1_16_32 = sext <2 x i16> %i16 to <2 x i32>
@ -1574,22 +1574,22 @@ define void @extmulv2(<2 x i8> %i8, <2 x i16> %i16, <2 x i32> %i32, <2 x i64> %i
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %zl2_16_32 = zext <2 x i16> %i16 to <2 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %azl_16_32 = mul <2 x i32> %zl1_16_32, %zl2_16_32
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %sw_16_64 = sext <2 x i16> %i16 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %asw_16_64 = mul <2 x i64> %i64, %sw_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %asw_16_64 = mul <2 x i64> %i64, %sw_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %sl1_16_64 = sext <2 x i16> %i16 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %sl2_16_64 = sext <2 x i16> %i16 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %asl_16_64 = mul <2 x i64> %sl1_16_64, %sl2_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %asl_16_64 = mul <2 x i64> %sl1_16_64, %sl2_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %zw_16_64 = zext <2 x i16> %i16 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %azw_16_64 = mul <2 x i64> %i64, %zw_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %azw_16_64 = mul <2 x i64> %i64, %zw_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %zl1_16_64 = zext <2 x i16> %i16 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %zl2_16_64 = zext <2 x i16> %i16 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %azl_16_64 = mul <2 x i64> %zl1_16_64, %zl2_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %azl_16_64 = mul <2 x i64> %zl1_16_64, %zl2_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %sw_32_64 = sext <2 x i32> %i32 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %asw_32_64 = mul <2 x i64> %i64, %sw_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %asw_32_64 = mul <2 x i64> %i64, %sw_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %sl1_32_64 = sext <2 x i32> %i32 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %sl2_32_64 = sext <2 x i32> %i32 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %asl_32_64 = mul <2 x i64> %sl1_32_64, %sl2_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %zw_32_64 = zext <2 x i32> %i32 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %azw_32_64 = mul <2 x i64> %i64, %zw_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %azw_32_64 = mul <2 x i64> %i64, %zw_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %zl1_32_64 = zext <2 x i32> %i32 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %zl2_32_64 = zext <2 x i32> %i32 to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %azl_32_64 = mul <2 x i64> %zl1_32_64, %zl2_32_64
@ -1693,15 +1693,15 @@ define void @extmulv4(<4 x i8> %i8, <4 x i16> %i16, <4 x i32> %i32, <4 x i64> %i
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %zl2_8_32 = zext <4 x i8> %i8 to <4 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %azl_8_32 = mul <4 x i32> %zl1_8_32, %zl2_8_32
; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %sw_8_64 = sext <4 x i8> %i8 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %asw_8_64 = mul <4 x i64> %i64, %sw_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %asw_8_64 = mul <4 x i64> %i64, %sw_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %sl1_8_64 = sext <4 x i8> %i8 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %sl2_8_64 = sext <4 x i8> %i8 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %asl_8_64 = mul <4 x i64> %sl1_8_64, %sl2_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %asl_8_64 = mul <4 x i64> %sl1_8_64, %sl2_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %zw_8_64 = zext <4 x i8> %i8 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %azw_8_64 = mul <4 x i64> %i64, %zw_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %azw_8_64 = mul <4 x i64> %i64, %zw_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %zl1_8_64 = zext <4 x i8> %i8 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %zl2_8_64 = zext <4 x i8> %i8 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %azl_8_64 = mul <4 x i64> %zl1_8_64, %zl2_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %azl_8_64 = mul <4 x i64> %zl1_8_64, %zl2_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %sw_16_32 = sext <4 x i16> %i16 to <4 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %asw_16_32 = mul <4 x i32> %i32, %sw_16_32
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %sl1_16_32 = sext <4 x i16> %i16 to <4 x i32>
@ -1713,22 +1713,22 @@ define void @extmulv4(<4 x i8> %i8, <4 x i16> %i16, <4 x i32> %i32, <4 x i64> %i
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %zl2_16_32 = zext <4 x i16> %i16 to <4 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %azl_16_32 = mul <4 x i32> %zl1_16_32, %zl2_16_32
; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %sw_16_64 = sext <4 x i16> %i16 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %asw_16_64 = mul <4 x i64> %i64, %sw_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %asw_16_64 = mul <4 x i64> %i64, %sw_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %sl1_16_64 = sext <4 x i16> %i16 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %sl2_16_64 = sext <4 x i16> %i16 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %asl_16_64 = mul <4 x i64> %sl1_16_64, %sl2_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %asl_16_64 = mul <4 x i64> %sl1_16_64, %sl2_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %zw_16_64 = zext <4 x i16> %i16 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %azw_16_64 = mul <4 x i64> %i64, %zw_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %azw_16_64 = mul <4 x i64> %i64, %zw_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %zl1_16_64 = zext <4 x i16> %i16 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %zl2_16_64 = zext <4 x i16> %i16 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %azl_16_64 = mul <4 x i64> %zl1_16_64, %zl2_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %azl_16_64 = mul <4 x i64> %zl1_16_64, %zl2_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %sw_32_64 = sext <4 x i32> %i32 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %asw_32_64 = mul <4 x i64> %i64, %sw_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %asw_32_64 = mul <4 x i64> %i64, %sw_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %sl1_32_64 = sext <4 x i32> %i32 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %sl2_32_64 = sext <4 x i32> %i32 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %asl_32_64 = mul <4 x i64> %sl1_32_64, %sl2_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %zw_32_64 = zext <4 x i32> %i32 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %azw_32_64 = mul <4 x i64> %i64, %zw_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %azw_32_64 = mul <4 x i64> %i64, %zw_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %zl1_32_64 = zext <4 x i32> %i32 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %zl2_32_64 = zext <4 x i32> %i32 to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %azl_32_64 = mul <4 x i64> %zl1_32_64, %zl2_32_64
@ -1832,15 +1832,15 @@ define void @extmulv8(<8 x i8> %i8, <8 x i16> %i16, <8 x i32> %i32, <8 x i64> %i
; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %zl2_8_32 = zext <8 x i8> %i8 to <8 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %azl_8_32 = mul <8 x i32> %zl1_8_32, %zl2_8_32
; CHECK-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %sw_8_64 = sext <8 x i8> %i8 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %asw_8_64 = mul <8 x i64> %i64, %sw_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %asw_8_64 = mul <8 x i64> %i64, %sw_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %sl1_8_64 = sext <8 x i8> %i8 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %sl2_8_64 = sext <8 x i8> %i8 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %asl_8_64 = mul <8 x i64> %sl1_8_64, %sl2_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %asl_8_64 = mul <8 x i64> %sl1_8_64, %sl2_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %zw_8_64 = zext <8 x i8> %i8 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %azw_8_64 = mul <8 x i64> %i64, %zw_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %azw_8_64 = mul <8 x i64> %i64, %zw_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %zl1_8_64 = zext <8 x i8> %i8 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %zl2_8_64 = zext <8 x i8> %i8 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %azl_8_64 = mul <8 x i64> %zl1_8_64, %zl2_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %azl_8_64 = mul <8 x i64> %zl1_8_64, %zl2_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %sw_16_32 = sext <8 x i16> %i16 to <8 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %asw_16_32 = mul <8 x i32> %i32, %sw_16_32
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %sl1_16_32 = sext <8 x i16> %i16 to <8 x i32>
@ -1852,22 +1852,22 @@ define void @extmulv8(<8 x i8> %i8, <8 x i16> %i16, <8 x i32> %i32, <8 x i64> %i
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %zl2_16_32 = zext <8 x i16> %i16 to <8 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %azl_16_32 = mul <8 x i32> %zl1_16_32, %zl2_16_32
; CHECK-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %sw_16_64 = sext <8 x i16> %i16 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %asw_16_64 = mul <8 x i64> %i64, %sw_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %asw_16_64 = mul <8 x i64> %i64, %sw_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %sl1_16_64 = sext <8 x i16> %i16 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %sl2_16_64 = sext <8 x i16> %i16 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %asl_16_64 = mul <8 x i64> %sl1_16_64, %sl2_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %asl_16_64 = mul <8 x i64> %sl1_16_64, %sl2_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %zw_16_64 = zext <8 x i16> %i16 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %azw_16_64 = mul <8 x i64> %i64, %zw_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %azw_16_64 = mul <8 x i64> %i64, %zw_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %zl1_16_64 = zext <8 x i16> %i16 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %zl2_16_64 = zext <8 x i16> %i16 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %azl_16_64 = mul <8 x i64> %zl1_16_64, %zl2_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %azl_16_64 = mul <8 x i64> %zl1_16_64, %zl2_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %sw_32_64 = sext <8 x i32> %i32 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %asw_32_64 = mul <8 x i64> %i64, %sw_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %asw_32_64 = mul <8 x i64> %i64, %sw_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %sl1_32_64 = sext <8 x i32> %i32 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %sl2_32_64 = sext <8 x i32> %i32 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %asl_32_64 = mul <8 x i64> %sl1_32_64, %sl2_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %zw_32_64 = zext <8 x i32> %i32 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %azw_32_64 = mul <8 x i64> %i64, %zw_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %azw_32_64 = mul <8 x i64> %i64, %zw_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %zl1_32_64 = zext <8 x i32> %i32 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %zl2_32_64 = zext <8 x i32> %i32 to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %azl_32_64 = mul <8 x i64> %zl1_32_64, %zl2_32_64
@ -1971,15 +1971,15 @@ define void @extmulv16(<16 x i8> %i8, <16 x i16> %i16, <16 x i32> %i32, <16 x i6
; CHECK-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %zl2_8_32 = zext <16 x i8> %i8 to <16 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %azl_8_32 = mul <16 x i32> %zl1_8_32, %zl2_8_32
; CHECK-NEXT: Cost Model: Found an estimated cost of 15 for instruction: %sw_8_64 = sext <16 x i8> %i8 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %asw_8_64 = mul <16 x i64> %i64, %sw_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %asw_8_64 = mul <16 x i64> %i64, %sw_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 15 for instruction: %sl1_8_64 = sext <16 x i8> %i8 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 15 for instruction: %sl2_8_64 = sext <16 x i8> %i8 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %asl_8_64 = mul <16 x i64> %sl1_8_64, %sl2_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %asl_8_64 = mul <16 x i64> %sl1_8_64, %sl2_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 15 for instruction: %zw_8_64 = zext <16 x i8> %i8 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %azw_8_64 = mul <16 x i64> %i64, %zw_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %azw_8_64 = mul <16 x i64> %i64, %zw_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 15 for instruction: %zl1_8_64 = zext <16 x i8> %i8 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 15 for instruction: %zl2_8_64 = zext <16 x i8> %i8 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %azl_8_64 = mul <16 x i64> %zl1_8_64, %zl2_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %azl_8_64 = mul <16 x i64> %zl1_8_64, %zl2_8_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %sw_16_32 = sext <16 x i16> %i16 to <16 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %asw_16_32 = mul <16 x i32> %i32, %sw_16_32
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %sl1_16_32 = sext <16 x i16> %i16 to <16 x i32>
@ -1991,22 +1991,22 @@ define void @extmulv16(<16 x i8> %i8, <16 x i16> %i16, <16 x i32> %i32, <16 x i6
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %zl2_16_32 = zext <16 x i16> %i16 to <16 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %azl_16_32 = mul <16 x i32> %zl1_16_32, %zl2_16_32
; CHECK-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %sw_16_64 = sext <16 x i16> %i16 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %asw_16_64 = mul <16 x i64> %i64, %sw_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %asw_16_64 = mul <16 x i64> %i64, %sw_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %sl1_16_64 = sext <16 x i16> %i16 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %sl2_16_64 = sext <16 x i16> %i16 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %asl_16_64 = mul <16 x i64> %sl1_16_64, %sl2_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %asl_16_64 = mul <16 x i64> %sl1_16_64, %sl2_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %zw_16_64 = zext <16 x i16> %i16 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %azw_16_64 = mul <16 x i64> %i64, %zw_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %azw_16_64 = mul <16 x i64> %i64, %zw_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %zl1_16_64 = zext <16 x i16> %i16 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %zl2_16_64 = zext <16 x i16> %i16 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %azl_16_64 = mul <16 x i64> %zl1_16_64, %zl2_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %azl_16_64 = mul <16 x i64> %zl1_16_64, %zl2_16_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %sw_32_64 = sext <16 x i32> %i32 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %asw_32_64 = mul <16 x i64> %i64, %sw_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %asw_32_64 = mul <16 x i64> %i64, %sw_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %sl1_32_64 = sext <16 x i32> %i32 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %sl2_32_64 = sext <16 x i32> %i32 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %asl_32_64 = mul <16 x i64> %sl1_32_64, %sl2_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %zw_32_64 = zext <16 x i32> %i32 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %azw_32_64 = mul <16 x i64> %i64, %zw_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %azw_32_64 = mul <16 x i64> %i64, %zw_32_64
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %zl1_32_64 = zext <16 x i32> %i32 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %zl2_32_64 = zext <16 x i32> %i32 to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %azl_32_64 = mul <16 x i64> %zl1_32_64, %zl2_32_64

View File

@ -368,7 +368,7 @@ define void @vi64() {
; CHECK-LABEL: 'vi64'
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %c2 = add <2 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %d2 = sub <2 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %e2 = mul <2 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %e2 = mul <2 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %f2 = ashr <2 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %g2 = lshr <2 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %h2 = shl <2 x i64> undef, undef
@ -377,7 +377,7 @@ define void @vi64() {
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %k2 = xor <2 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %c4 = add <4 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %d4 = sub <4 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %e4 = mul <4 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %e4 = mul <4 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %f4 = ashr <4 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %g4 = lshr <4 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %h4 = shl <4 x i64> undef, undef
@ -386,7 +386,7 @@ define void @vi64() {
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %k4 = xor <4 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %c8 = add <8 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %d8 = sub <8 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %e8 = mul <8 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %e8 = mul <8 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %f8 = ashr <8 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %g8 = lshr <8 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %h8 = shl <8 x i64> undef, undef
@ -395,7 +395,7 @@ define void @vi64() {
; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %k8 = xor <8 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %c16 = add <16 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %d16 = sub <16 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %e16 = mul <16 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %e16 = mul <16 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %f16 = ashr <16 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %g16 = lshr <16 x i64> undef, undef
; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %h16 = shl <16 x i64> undef, undef

View File

@ -113,7 +113,7 @@ define <8 x i32> @t12(<8 x i32> %a, <8 x i32> %b) {
define <2 x i64> @t13(<2 x i64> %a, <2 x i64> %b) {
; THROUGHPUT-LABEL: 't13'
; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %1 = mul nsw <2 x i64> %a, %b
; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %1 = mul nsw <2 x i64> %a, %b
; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret <2 x i64> %1
;
%1 = mul nsw <2 x i64> %a, %b
@ -122,7 +122,7 @@ define <2 x i64> @t13(<2 x i64> %a, <2 x i64> %b) {
define <4 x i64> @t14(<4 x i64> %a, <4 x i64> %b) {
; THROUGHPUT-LABEL: 't14'
; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %1 = mul nsw <4 x i64> %a, %b
; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %1 = mul nsw <4 x i64> %a, %b
; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret <4 x i64> %1
;
%1 = mul nsw <4 x i64> %a, %b