Commit Graph

738 Commits

Author SHA1 Message Date
Caroline Concatto f2b749be15 [CostModel][SVE] Add cost model for shuffle reverse with i1 and scalable vector
This patch adds the cost model for experimental.vector.reverse
with scalable vector types: nxv16i1, nxv8i1, nxv4i1 and nxv2i1.
These types are missing from the previous cost model patch D95603.

The cost model for experimental.vector.reverse with 1 bit mask is used by
loop vectorization in the patch D95363

Differential Revision: https://reviews.llvm.org/D97758
2021-03-04 18:52:59 +00:00
Alexey Bataev 60470ac7ff [Cost]Add tests for boolean and/or reductions, NFC.
Tests with the default costs for boolean and/or reductions.

Differential Revision: https://reviews.llvm.org/D97793
2021-03-03 12:34:30 -08:00
Juneyoung Lee c89d9d8a48 [TTI] Consider select form of and/or i1 as having arithmetic cost
This is a patch that updates the cost of `select i1 a, b, false` to be equivalent to that of `and i1 a, b`
as well as the cost of `select i1 a, true, b` equivalent to `or i1 a, b`.

Until now, these selects were folded into and/or i1 by InstCombine, but the transformation is poison-unsafe.
This is a step towards removing the unsafe transformation. D93065 has relevant transformations linked.
These selects should be translated into the assemblies as and/or i1 do in the same manner. The cost should be equivalent.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D97360
2021-03-02 02:18:19 +09:00
Sander de Smalen f870c551f0 [AArch64] NFC: Cleanup some SVE cost-model tests.
Moved some of the `sve-getIntrinsicCost-<..>` into a single sve-intrinsics.ll
file, and simplified the tests a bit by bundling all the intrinsics in one
function (instead of testing one intrinsic per function). That makes it easier
to see the cost of the intrinsics.
2021-03-01 13:26:31 +00:00
Stelios Ioannou 30cb9c03b5 [AArch64] Add abs intrinsic costs
This patch adds cost-modelling for abs vector intrinsic.

Change-Id: I89007971bfb15f5b4a02a2eadfd43018e9a73976
2021-02-25 09:31:52 +00:00
David Green dd2dbf7ee2 [TTI] Change getOperandsScalarizationOverhead to take Type args
As a followup to D95291, getOperandsScalarizationOverhead was still
using a VF as a vector factor if the arguments were scalar, and would
assert on certain matrix intrinsics with differently sized vector
arguments. This patch removes the VF arg, instead passing the Types
through directly. This should allow it to more accurately compute the
cost without having to guess at which operands will be vectorized,
something difficult with more complex intrinsics.

This adjusts one SVE test as it is now calling the wrong intrinsic vs
veccall. Without invalid InstructCosts the cost of the scalarized
intrinsic is too low. This should get fixed when the cost of
scalarization is accounted for with scalable types.

Differential Revision: https://reviews.llvm.org/D96287
2021-02-23 13:04:59 +00:00
David Green 33ba220611 [ARM] Ensure types provided to getIntrinsicCost are valid
It appears that pointer types were causing issues for the min/max cost
code in getIntrinsicInstrCost. This makes sure that when matching
icmp/select to a min/max, we only do that for normal int or float types.
2021-02-18 14:00:23 +00:00
David Green 1a6744e3dc [ARM] Add larger than legal ICmp costs
A v8i32 compare will produce a v8i1 predicate, but during codegen the
v8i32 will be split into two v4i32, potentially requiring two v4i1
predicates to be merged into a single v8i1. Because this merging of two
v4i1's into a v8i1 is very expensive, we need to make the cost of the
compare equally high.

This patch adds the cost of that to ARMTTIImpl::getCmpSelInstrCost.
Because we don't know whether the user of the predicate can be split,
and the cost model is mostly pre-instruction, we may be pessimistic but
that should only be for larger and legal types. This also adds min/max
detection to the costmodel where it can be detected, to keep those in
line with the cost of simple min/max instructions. Otherwise for the
most part, costs that were already expensive have become more expensive.

Differential Revision: https://reviews.llvm.org/D96692
2021-02-18 11:42:17 +00:00
David Green 1fbb3287fc [ARM] MVE ICmp costing tests. NFC 2021-02-18 10:50:34 +00:00
Stanislav Mekhanoshin a8d9d50762 [AMDGPU] gfx90a support
Differential Revision: https://reviews.llvm.org/D96906
2021-02-17 16:01:32 -08:00
David Green 6d835c5fcd [ARM] Add MVE abs costs
Similar to min/max, this increases the accuracy of abs intrinsics costs
under MVE.
2021-02-17 14:21:09 +00:00
David Green 415deff10b [ARM] MVE abs intrinsic costs. NFC 2021-02-17 13:54:17 +00:00
David Green 0a98efb049 [ARM] Add some basic Min/Max costs
This adds basic MVE costs for SMIN/SMAX/UMIN/UMAX, as well as MINNUM and
MAXNUM representing fmin and fmax. It tightens up the costs, not using a
ICmp+Select cost.

Differential Revision: https://reviews.llvm.org/D96603
2021-02-15 15:06:19 +00:00
Caroline Concatto b52e6c5891 [CostModel]Add cost model for experimental.vector.reverse
This patch uses the function getShuffleCost with SK_Reverse to compute the cost
for experimental.vector.reverse.
For scalable vector type, it adds a table will the legal types on
AArch64TTIImpl::getShuffleCost to not assert in BasicTTIImpl::getShuffleCost,
and for fixed vector, it relies on the existing cost model in BasicTTIImpl.

Depends on D94883

Differential Revision: https://reviews.llvm.org/D95603
2021-02-15 14:23:57 +00:00
David Green 6abe362ed7 [ARM] Fix duplicate fdiv tests, changing them to frem. NFC 2021-02-13 15:16:11 +00:00
David Green 7c2e061188 [ARM] Extra vector shuffle tests of various kinds. NFC 2021-02-13 15:03:10 +00:00
David Green b7c3de8d5a [ARM] MVE min/max cost tests. NFC 2021-02-13 11:12:12 +00:00
Sander de Smalen 1d42ba254f [BasicTTIImpl] Fix getCastInstrCost for scalable vectors by querying for ElementCount.
This fixes an overly restrictive assumption that the vector is a FixedVectorType,
in code that tries to calculate the cost of a cast operation when splitting
a too-wide vector. The algorithm works the same for scalable vectors, so this
patch removes the cast<FixedVectorType>.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D96253
2021-02-12 08:28:52 +00:00
Sander de Smalen 63d787e5d4 [CostModel] An extending load to illegal type is not free.
COST(zext (<4 x i32> load(...) to <4 x i64>)) != 0 when
<4 x i64> is an illegal result type that requires splitting
of the operation.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D96250
2021-02-12 07:59:21 +00:00
David Green b1ef919aad [ARM] Add CostKind to getMVEVectorCostFactor.
This adds the CostKind to getMVEVectorCostFactor, so that it can
automatically account for CodeSize costs, where it returns a cost of 1
not the MVEFactor used for Throughput/Latency. This helps simplify the
caller code and allows us to get the codesize cost more correct in more
cases.
2021-02-11 15:33:59 +00:00
Caroline Concatto 2cbcf3e297 [AArch64][SVE]Add cost model for broadcast shuffle
This patch adds a cost model for  SK_Broadcast in
AArch64TTIImpl::getShuffleCost with scalable vector.
Without this patch, the scalable vector type relies on  BasicTTIImpl cost
implementation and assert.

Differential Revision: https://reviews.llvm.org/D95598
2021-02-03 09:53:22 +00:00
David Green 0175cd00a1 [AArch64] Add vector saturating add intrinsic costs
This adds sadd.sat, uadd.sat, ssub.sat and usub.sat costs for AArch64,
similar to how they were recently added for ARM.

Differential Revision: https://reviews.llvm.org/D95292
2021-01-27 10:38:32 +00:00
Sander de Smalen b9417c3616 [CostModel] Handle CTLZ and CCTZ in getTypeBasedIntrinsicInstrCost
Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D95355
2021-01-26 14:37:51 +00:00
David Green 06ab7953e9 [AArch64] Saturating add cost tests. NFC 2021-01-24 13:49:17 +00:00
David Sherwood 83e7a96c06 Fix build failure caused by 2e080eb00a 2021-01-22 09:56:53 +00:00
David Sherwood 2e080eb00a [SVE] Add support for scalable vectorization of loops with selects and cmps
I have removed an unnecessary assert in LoopVectorizationCostModel::getInstructionCost
that prevented a cost being calculated for select instructions when using
scalable vectors. In addition, I have changed AArch64TTIImpl::getCmpSelInstrCost
to only do special cost calculations for fixed width vectors and fall
back to the base version for scalable vectors.

I have added a simple cost model test for cmps and selects:

  test/Analysis/CostModel/sve-cmpsel.ll

and some simple tests that show we vectorize loops with cmp and select:

  test/Transforms/LoopVectorize/AArch64/sve-basic-vec.ll

Differential Revision: https://reviews.llvm.org/D95039
2021-01-22 09:48:13 +00:00
Jeroen Dobbelaere 121cac01e8 [noalias.decl] Look through llvm.experimental.noalias.scope.decl
Just like llvm.assume, there are a lot of cases where we can just ignore llvm.experimental.noalias.scope.decl.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93042
2021-01-19 20:09:42 +01:00
David Green 6a563eef13 [ARM] Expand vXi1 VSELECT's
We have no lowering for VSELECT vXi1, vXi1, vXi1, so mark them as
expanded to turn them into a series of logical operations.

Differential Revision: https://reviews.llvm.org/D94946
2021-01-19 17:56:50 +00:00
David Green f373b30923 [ARM] Add MVE add.sat costs
This adds some basic MVE sadd_sat/ssub_sat/uadd_sat/usub_sat costs,
based on when the instruction is legal. With smaller than legal types
that are promoted we generate shr(qadd(shl, shl)), so the cost is 4
appropriately.

Differential Revision: https://reviews.llvm.org/D94958
2021-01-19 15:38:46 +00:00
David Green 54e38440e7 [ARM] Expand add.sat/sub.sat cost checks. NFC 2021-01-19 15:06:06 +00:00
Caroline Concatto 172f1f8952 [AArch64][SVE]Add cost model for vector reduce for scalable vector
This patch computes the cost for vector.reduce<operand> for scalable vectors.
The cost is split into two parts:  the legalization cost and the horizontal
reduction.

Differential Revision: https://reviews.llvm.org/D93639
2021-01-19 11:54:16 +00:00
David Green dcefcd51e0 [ARM] Update trunc costs
We did not have specific costs for larger than legal truncates that were
not otherwise cheap (where they were next to stores, for example). As
MVE does not have a dedicated instruction for them (and we do not use
loads/stores yet), they should be expensive as they get expanded to a
series of lane moves.

Differential Revision: https://reviews.llvm.org/D94260
2021-01-11 08:59:28 +00:00
David Green 0c8b748f32 [ARM] Additional trunc cost tests. NFC 2021-01-11 08:35:16 +00:00
Caroline Concatto 01c190e907 [AArch64][CostModel]Fix gather scatter cost model
This patch fixes a bug introduced in the patch:
https://reviews.llvm.org/D93030

This patch pulls the test for scalable vector to be the first instruction
to be checked. This avoids the Gather and Scatter cost model for AArch64 to
compute the number of vector elements for something that is not a vector and
therefore crashing.
2021-01-07 14:02:08 +00:00
Peter Waller 3e357ecd44 [llvm][NFC] Disallow all warnings in TypeSize tests
This is a follow-up to a request from a reviewer [0]. The text may change in
the future and these tests should not produce any warning output.

[0] https://reviews.llvm.org/D91806#inline-879243

Reviewed By: sdesmalen, david-arm

Differential Revision: https://reviews.llvm.org/D94161
2021-01-06 17:17:07 +00:00
Caroline Concatto 060cfd9795 [AArch64][SVE]Add cost model for masked gather and scatter for scalable vector.
A new TTI interface has been added 'Optional <unsigned>getMaxVScale' that
    returns the maximum vscale for a given target.
    When known getMaxVScale is used to compute the cost of masked gather scatter
    for scalable vector.

    Depends on D92094

    Differential Revision: https://reviews.llvm.org/D93030
2021-01-04 13:59:58 +00:00
Juneyoung Lee c1f3033697 Update inselt tests at llvm/test/Analysis to have poison as shufflevector's placeholder (NFC)
File listed by:

grep -R -E "^[^;]*shufflevector <.*> .*, <.*> undef" . | grep inseltpoison

Updated with:

sed -i -E 's/shufflevector <(.*)> (.*), <(.*)> undef/shufflevector <\1> \2, <\3> poison/g' $1
2020-12-31 17:12:37 +09:00
Juneyoung Lee 3036547248 Precommit analysis/etc tests for inselt poison placeholder
This adds tests in directories missing from https://reviews.llvm.org/rGdb7a2f347f132b3920415013d62d1adfb18d8d58
2020-12-24 12:14:24 +09:00
Bradley Smith e0b9c5df26 [CostModel] Add costs for llvm.experimental.vector.{extract,insert} intrinsics
Adds cost model support for the new llvm.experimental.vector.{extract,insert}
intrinsics, using the existing getExtractSubvectorOverhead and
getInsertSubvectorOverhead functions for shuffles.

Previously this case would throw an assertion.

Differential Revision: https://reviews.llvm.org/D93043
2020-12-16 13:39:04 +00:00
Caroline Concatto 60e4698b9a [CostModel]Replace FixedVectorType by VectorType in costgetIntrinsicInstrCost
This patch replaces FixedVectorType by VectorType in getIntrinsicInstrCost
in BasicTTIImpl.h. It re-arranges the scalable type test earlier return
and add tests for scalable types.

Depends on D91532

Differential Revision: https://reviews.llvm.org/D92094
2020-12-16 13:06:23 +00:00
David Green a4823377fd [ARM] Add basic masked load/store costs
This adds some basic MVE masked load/store costs, notably changing the
cost of legal loads/stores to the MVECostFactor and the cost of
scalarized instructions to 8*NumElts.

Differential Revision: https://reviews.llvm.org/D86538
2020-12-12 15:26:32 +00:00
Simon Pilgrim db900995ed [CostModel][X86] getGatherScatterOpCost - use default implementation for alt costkinds
Noticed while looking at D92701 - we only really handle TCK_RecipThroughput gather/scatter costs - for now drop back to the default implementation for non-legal gathers/scatters.
2020-12-06 14:08:26 +00:00
Sanjay Patel 136f98e523 [x86] adjust cost model values for minnum/maxnum with fast-math-flags
Without FMF, we lower these intrinsics into something like this:

vmaxsd	%xmm0, %xmm1, %xmm2
vcmpunordsd	%xmm0, %xmm0, %xmm0
vblendvpd	%xmm0, %xmm1, %xmm2, %xmm0

But if we can ignore NANs, the single min/max instruction is enough
because there is no need to fix up the x86 logic that corresponds to
X > Y ? X : Y.

We probably want to make other adjustments for FP intrinsics with FMF
to account for specialized codegen (for example, FSQRT).

Differential Revision: https://reviews.llvm.org/D92337
2020-12-01 10:45:53 -05:00
Sanjay Patel 40dc535b5a [x86] add tests for maxnum/minnum with nnan; NFC 2020-11-30 14:30:28 -05:00
Sjoerd Meijer 5110ff0817 [AArch64][CostModel] Fix cost for mul <2 x i64>
This was modeled to have a cost of 1, but since we do not have a MUL.2d this is
scalarized into vector inserts/extracts and scalar muls.

Motivating precommitted test is test/Transforms/SLPVectorizer/AArch64/mul.ll,
which we don't want to SLP vectorize.

Test Transforms/LoopVectorize/AArch64/extractvalue-no-scalarization-required.ll
unfortunately needed changing, but the reason is documented in
LoopVectorize.cpp:6855:

  // The cost of executing VF copies of the scalar instruction. This opcode
  // is unknown. Assume that it is the same as 'mul'.

which I will address next as a follow up of this.

Differential Revision: https://reviews.llvm.org/D92208
2020-11-30 11:36:55 +00:00
Simon Pilgrim 969918e177 [DAG] Legalize umin(x,y) -> sub(x,usubsat(x,y)) and umax(x,y) -> add(x,usubsat(y,x)) iff usubsat is legal
If usubsat() is legal, this is likely to result in smaller codegen expansion than the default cmp+select codegen expansion.

Allows us to move the x86-specific lowering to the generic expansion code.

Differential Revision: https://reviews.llvm.org/D92183
2020-11-27 11:18:58 +00:00
Sjoerd Meijer a3b1fcbc0c [AArch64][CostModel] Precommit some vector mul tests. NFC.
The cost-model is not getting the cost right for a mul with <2 x i64>
operands, i.e. we don't have a MUL.2d, and this is precommitting some
tests before adjusting this.
2020-11-26 13:23:11 +00:00
Florian Hahn 926681b6be
[CostModel] Add basic implementation of getGatherScatterOpCost.
Add a basic implementation of getGatherScatterOpCost to BasicTTIImpl.

The implementation estimates the cost of scalarizing the loads/stores,
the cost of packing/extracting the individual lanes and the cost of
only selecting enabled lanes.

This more accurately reflects the current cost on targets like AArch64.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D91984
2020-11-26 12:02:25 +00:00
Simon Pilgrim 385a27d6cd [CostModel][X86] Refresh ISD::ABS costs
Update costs now that D92095 and D92102 have tweaked the SSE2 implementation

The SSE42 BLENDVPD cost can actually be used on SSE41 as we don't attempt to generate PCMPGT anymore

Add scalar i16/i32/i64 costs as we can do this cheaply with CMOV
2020-11-25 18:40:19 +00:00
Florian Hahn 14c0185bfe
[AArch64] Add scatter cost model tests. 2020-11-23 18:36:56 +00:00