Commit Graph

902 Commits

Author SHA1 Message Date
Craig Topper d32ed9b27e [RISCV] Use a ComplexPattern to merge the PatFrags for removing unneeded masks on shift amounts.
Rather than having patterns with and without an AND, use a
ComplexPattern to handle both cases.

Reduces the isel table by about 700 bytes.
2021-02-12 14:03:23 -08:00
Craig Topper 1697cc78b1 [RISCV] Add support for integer fixed vector setcc
I believe I've covered all orderings of splat operands here. Better
canonicalization in lowering might help reduce this. I did not handle
the immediate adjustments needed for set(u)gt/set(u)lt.

Testing here is limited to byte types because the scalable vector
type used for masks for the store is calculated assuming 8 byte
elements. But for the setcc its based on the element count of the
container type for the setcc input. So they don't agree. We'll need
to enhanced D96352 to handle this I think.

Differential Revision: https://reviews.llvm.org/D96443
2021-02-12 09:29:41 -08:00
Craig Topper 875c76de2b [RISCV] Add support for matching .vx and .vi forms of binary instructions for fixed vectors.
Unlike scalable vectors, I'm only using a ComplexPattern for
the immediate itself. The vmv_v_x is matched explicitly. We igore
the VL argument when matching a binary operator, but we do check
it when matching splat directly.

I left out tests for vXi64 as they fail on rv32 right now.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96365
2021-02-12 09:18:10 -08:00
luxufan feaf1d81e3 [RISCV] Change parseVTypeI function
Change parseVTypeI function to Make the added vset instruction test cases report more concrete error message.

Differential Revision: https://reviews.llvm.org/D96218
2021-02-12 19:38:34 +08:00
Fraser Cormack e88da1d677 [RISCV] Add support for integer fixed min/max
This patch extends the initial fixed-length vector support to include
smin, smax, umin, and umax.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96491
2021-02-12 09:19:45 +00:00
Craig Topper 7a7836b4d8 [RISCV] Add a pattern for a scalable vector mask vnot.
We can use a vnand.mm with the same register for both inputs.
This avoids materializing an alls ones constant with vmset.mm.
2021-02-11 15:34:58 -08:00
ShihPo Hung 9e62c9146d [RISCV] Initial support for insert/extract subvector
This patch handles cast-like insert_subvector & extract_subvector
in which case:
1. index starts from 0.
2. inserting a fixed-width vector into a scalable vector,
   or extracting a fixed-width vector from a scalable vector.

Reviewed By: craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D96352
2021-02-11 14:35:49 -08:00
Craig Topper 033b1bd185 [RISCV] Add support loads, stores, and splats of vXi1 fixed vectors.
This refines how we determine which masks types are legal and adds
support for loads, stores, and all ones/zeros splats.

I left a fixme in store handling where I think we need to zero
extra bits if the type isn't a multiple of a byte. If I remember
right from X86 there was some case we could have a store of a
1, 2, or 4 bit mask and have a scalar zextload that then expected the
bits to be 0. Its tricky to zero the bits with RVV. We need to do
something like round VL up, zero a register, lower the VL back down,
then do a tail undisturbed move into the zero register. Another
option might be to generate a mask of 1/2/4 bits set with a VL of 8
and use that to mask off the bits.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96468
2021-02-11 09:13:16 -08:00
Jessica Clarke ca606dc988 [RISCV] More whitespace and comment typo fixes in RISCVInstrInfoC.td 2021-02-11 02:32:36 +00:00
Jessica Clarke 0973ce8596 [RISCV] Fix whitespace in RISCVInstrInfoC.td 2021-02-11 02:23:09 +00:00
Craig Topper 350ab4e617 [RISCV] Use OperandTransform field of ImmLeaf to slightly simplify a couple bitmanip patterns. NFC
This binds the SDNodeXForm to the ImmLeaf so we only need to mention
the ImmLeaf in both the input and output pattern.
2021-02-10 17:52:07 -08:00
Craig Topper fc4d780eaf [RISCV] Remove superfluous semicolon. NFC 2021-02-10 11:20:29 -08:00
Craig Topper cb161b3a88 [RISCV] Add support for matching .vf forms of fadd/fsub/fmul/fdiv/fma for fixed vectors.
fma+neg will come in a different patch since I haven't done it for .vv
yet either.

Differential Revision: https://reviews.llvm.org/D96375
2021-02-10 10:16:27 -08:00
Craig Topper 0c254b4a69 [RISCV] Add support for selecting vrgather.vx/vi for fixed vector splat shuffles.
The test cases extract a fixed element from a vector and splat it
into a vector. This gets DAG combined into a splat shuffle.

I've used some very wide vectors in the test to make sure we have
at least a couple tests where the element doesn't fit into the
uimm5 immediate of vrgather.vi so we fall back to vrgather.vx.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96186
2021-02-10 10:01:56 -08:00
Fraser Cormack a3c74d6d53 [RISCV] Add support for selecting vid.v from build_vector
This patch optimizes a build_vector "index sequence" and lowers it to
the existing custom RISCVISD::VID node. This pattern is common in
autovectorized code.

The custom node was updated to allow it to be used by both scalable and
fixed-length vectors, thus avoiding pattern duplication.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96332
2021-02-10 10:58:40 +00:00
Craig Topper 18ff7e045a [RISCV] Make the min and max vector width command line options more consistent and check their relationship to each other. 2021-02-09 10:47:23 -08:00
Craig Topper fd5adae02c [RISCV] Remove SRO* and SLO* instructions from bitmanip.
As of the current draft these are no longer being considered
for the bitmanip spec. It wasn't clear what sub extension they
belonged in in the 0.93 spec.

So remove them. They can always be added back if something changes.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96157
2021-02-09 09:35:05 -08:00
Nemanja Ivanovic f6e4b9fc06 [RISCV] Fix shared libs build
Commit a2d19bad07 introduced a
dependency in the RISCV disassembler on two additional libraries
(MC, RISCVDesc) which wasn't added to the CMakeLists.txt. This
causes shared library builds to break. This patch just adds them
to fix failures seen on some bots, such as the PPC64LE Multistage.
2021-02-09 06:14:25 -06:00
Hsiangkai Wang a2d19bad07 [RISCV] Use whole register load/store for generic load/store.
In vector v0.10, there are whole vector register load/store
instructions. I suggest to use the whole register load/store
instructions for generic load/store for scalable vector types. It could
save up vset{i}vl{i} for these load/store.

For fractional LMUL, I keep to use vle{eew}.v/vse{eew}.v instructions to
load/store partial vector registers.

Differential Revision: https://reviews.llvm.org/D95853
2021-02-09 15:52:04 +08:00
Hsiangkai Wang a5b07a221a [RISCV] Initial support of LoopVectorizer for RISC-V Vector.
Define an option -riscv-vector-bits-max to specify the maximum vector
bits for vectorizer. Loop vectorizer will use the value to check if it
is safe to use the whole vector registers to vectorize the loop.

It is not the optimum solution for loop vectorizing for scalable vector.
It assumed the whole vector registers will be used to vectorize the code.
If it is possible, we should configure vl to do vectorize instead of
using whole vector registers.

We only consider LMUL = 1 in this patch.

This patch just an initial work for loop vectorizer for RISC-V Vector.

Differential Revision: https://reviews.llvm.org/D95659
2021-02-09 06:32:18 +08:00
Craig Topper b49aaed8c7 [RISCV] Use _COMMUTABLE fma pseudos for fixed vectors.
This matches what we do in the VLMAX SDNode patterns.
2021-02-08 11:27:23 -08:00
Craig Topper 8d8cafa32e [RISCV] Add support for splat fixed length build_vectors using RVV.
Building on the fixed vector support from D95705

I've added ISD nodes for vmv.v.x and vfmv.v.f and switched to
lowering the intrinsics to it. This allows us to share the same
isel patterns for both.

This doesn't handle splats of i64 on RV32 yet. The build_vector
gets converted to a vXi32 build_vector+bitcast during type
legalization. Not sure the best way to handle this at the moment.

Differential Revision: https://reviews.llvm.org/D96108
2021-02-08 11:12:56 -08:00
Craig Topper b8d719fbe8 [RISCV] Add support for fixed vector FMA.
Follow up to D95705. Does not include the commuting support from D95800.

Differential Revision: https://reviews.llvm.org/D96103
2021-02-08 11:12:56 -08:00
Craig Topper a719b667a9 [RISCV] Add initial support for converting fixed vectors to scalable vectors during lowering to use RVV instructions.
This is an alternative to D95563.

This is modeled after a similar feature for AArch64's SVE that uses
predicated scalable vector instructions.a

Rather than use predication, this patch uses an explicit VL operand.
I've limited it to always use LMUL=1 for now, but we can improve this
in the future.

This requires a bunch of new ISD opcodes to carry the VL operand.
I think we can probably lower intrinsics to these ISD opcodes to
cut down on the size of the isel table. Which is why I've added
patterns for all integer/float types and not just LMUL=1.

I'm only testing one vector width right now, but the width is
programmable via the command line.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D95705
2021-02-08 10:41:30 -08:00
Craig Topper b7b4f4cbc3 [RISCV] Make scalable vector FMA commutable for register allocation.
This adds support for commuting operands and converting between
vfmadd and vfmacc to avoid register copies.

To avoid messing up intrinsic behavior, I've added new pseudo
instructions that have the isCommutable flag set. These pseudos also
force a tail agnostic policy. The intrinsic version still use
the tail undisturbed policy.

For best results it looks like we need to start with fmadd and only
pick fmacc if its beneficial. MachineCSE commutes without contraining
the operands and then commutes back if it didn't help with CSE. So
I've made sure that when the operand choice isn't constrained, we
will keep fmadd for MachineCSE and when it does the second commute,
we get back the original instruction.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D95800
2021-02-08 10:05:33 -08:00
Craig Topper cc2c45dc54 [RISCV] Use SplatPat/SplatPat_simm5 to handle PseudoVMV_V_X_/PseudoVMV_V_I_ selection as well.
This ensures that we'll match immediates consistently regardless
of whether we match them as a standalone splat or as part of
another operation.

While I was there I added complexities to the simm5/uimm5 patterns so
we didn't have to assume that the 1 on the non-immediate was lower
than what tablegen inferred.

I had to make a minor tweak to tablegen to fix one place that
didn't expect to see a ComplexPattern that wasn't a "leaf".

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96199
2021-02-08 09:48:27 -08:00
Mikael Holmen eb8c27c60c [RISCV] Use std::make_tuple to make some toolchains happy again
My toolchain (LLVM 8.0, libstdc++ 5.4.0) complained with:

12:38:19 ../lib/Target/RISCV/RISCVISelLowering.cpp:1717:12: error: chosen constructor is explicit in copy-initialization
12:38:19     return {RISCVISD::VECREDUCE_FADD, Op.getOperand(0),
12:38:19            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
12:38:19 /proj/flexasic/app/llvm/8.0/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.4.0/../../../../include/c++/5.4.0/tuple:479:19: note: explicit constructor declared here
12:38:19         constexpr tuple(_UElements&&... __elements)
12:38:19                   ^
12:38:19 ../lib/Target/RISCV/RISCVISelLowering.cpp:1720:12: error: chosen constructor is explicit in copy-initialization
12:38:19     return {RISCVISD::VECREDUCE_SEQ_FADD, Op.getOperand(1), Op.getOperand(0)};
12:38:19            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
12:38:19 /proj/flexasic/app/llvm/8.0/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.4.0/../../../../include/c++/5.4.0/tuple:479:19: note: explicit constructor declared here
12:38:19         constexpr tuple(_UElements&&... __elements)
12:38:19                   ^
12:38:19 2 errors generated.

This commit adds explicit calls to std::make_tuple to work around
the problem.
2021-02-08 14:37:25 +01:00
Fraser Cormack b46aac125d [RISCV] Support the scalable-vector fadd reduction intrinsic
This patch adds support for both the fadd reduction intrinsic, in both
the ordered and unordered modes.

The fmin and fmax intrinsics are not currently supported due to a
discrepancy between the LLVM semantics and the RVV ISA behaviour with
regards to signaling NaNs. This behaviour is likely fixed in version 2.3
of the RISC-V F/D/Q extension, but until then the intrinsics can be left
unsupported.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95870
2021-02-08 09:52:27 +00:00
Craig Topper 3c767b96dc [RISCV] Correct types in tablegen multiclasses found by D95874. 2021-02-05 11:55:58 -08:00
Fraser Cormack e046c0c28b [RISCV] Support scalable-vector integer reduction intrinsics
This patch adds support for the integer reduction intrinsics supported
by RVV. This excludes "mul" which has no corresponding instruction.

The reduction instructions in RVV have slightly complicated type
constraints given they always produce a single "M1" vector register.

They are lowered to custom nodes including the second "scalar" reduction
operand to simplify the patterns and in the hope that they can be useful
for future DAG combines.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95620
2021-02-05 10:10:08 +00:00
Fraser Cormack c3eb2da6c4 [RISCV] Optimize sign-extended EXTRACT_VECTOR_ELT nodes
This patch custom-legalizes all integer EXTRACT_VECTOR_ELT nodes where
SEW < XLEN to VMV_S_X nodes to help the compiler infer sign bits from
the result. This allows us to eliminate redundant sign extensions.

For parity, all integer EXTRACT_VECTOR_ELT nodes are legalized this way
so that we don't need TableGen patterns for some and not others.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95741
2021-02-05 10:05:22 +00:00
Fraser Cormack af48d2bfc2 [RISCV] Add patterns for scalable-vector fsqrt
This patch adds support for lowering the sqrt intrinsic to the RVV
vfsqrt instruction.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96012
2021-02-05 09:39:19 +00:00
Craig Topper 6b280ce34c [RISCV] Use LLVMScalarOrSameVectorWidth to make avoid needing to mention the index type for vrgatherei16 intrinsics.
Add .vv to the intrinsic name to be consistent with D95979.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D95981
2021-02-04 20:26:45 -08:00
Craig Topper 25ff302a79 [RISCV] Split vrgather intrinsics into separate vrgather.vv and vrgather.vx intrinsics.
The vrgather.vv instruction uses a vector of indices with the same
SEW as operand 0. The vrgather.vx instructions use a scalar index
operand of XLen bits.

By splitting this into 2 intrinsics we are able to use LLVMatchType
in the definition to avoid specifying the type for the index operand
when creating the IR for the intrinsic. For .vv it will match the
operand 0 type. And for .vx it will match the type of the vl operand
we already needed to specify a type for.

I'm considering splitting more intrinsics. This was a somewhat
odd one because the .vx doesn't use the element type, it always
use XLen.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D95979
2021-02-04 19:50:12 -08:00
Hsiangkai Wang 63baeec66e [RISCV] Load/store vector mask types.
Use vle1.v/vse1.v to load/store vector mask types.

Differential Revision: https://reviews.llvm.org/D93364
2021-02-03 13:44:15 +08:00
Hsiangkai Wang c7189ba785 [RISCV] Add new vector instructions in v0.10.
* Add new vector instructions in v0.10.
 - load/store for mask value vle1.v vse1.v
 - vsetivli for 0-31 immediate vector length.
* Rename vector instructions in v0.10.
 - vfrsqrte7 -> vfrsqrt7
 - vfrece7 -> vfrec7
* Reserve memory width encodings for EEW>128b.

Differential Revision: https://reviews.llvm.org/D95781
2021-02-03 13:28:58 +08:00
Fraser Cormack b4106f9c7b [RISCV] Fix incorrect RVV sdiv/udiv lowering
Due to a clerical error, the sdiv operation was mapping to vdivu and
udiv to vdiv, when the opposite mapping is the correct one.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95869
2021-02-02 18:35:53 +00:00
Craig Topper c4fd1981a7 [RISCV] Correct types in tablegen multiclasses found by D95874. 2021-02-02 10:39:47 -08:00
Craig Topper 912306ef21 [RISCV] Use a ComplexPattern to merge isel patterns for vector load/store with GPR and FrameIndex addresses.
This reduces the isel table size by about 3000 bytes.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D95844
2021-02-02 10:20:52 -08:00
Craig Topper e7f9a83499 [RISCV] Replace NoX0 SDNodeXForm with a ComplexPattern to do the selection of the VL operand.
I think this is a more standard way of doing this.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D95833
2021-02-02 00:08:58 -08:00
Craig Topper 72b31ad4b8 [RISCV] Add scalable vector support for floating point FMA instructions
A follow up patch will add support for commuting operands or
changing opcode to vfmacc and friends.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D95662
2021-02-01 09:52:43 -08:00
Craig Topper 6a3ab66625 [RISCV] Update comment text from D95774. NFC 2021-02-01 09:52:43 -08:00
Craig Topper 1097ee61bf [RISCV] Optimize (srl (and X, 0xffff), C) -> (srli (slli X, 16), 16 + C).
Rather than materializing the 0xffff immediate for the AND, use
a shift left to remove the upper bits and then shift in zeros
from the right.

This pattern occurs when type legalizing an i16 right shift.

I've implemented this with custom selection code for a number of
reasons. I've limited this to the AND having a single use. We need
to compensate for SimplifyDemandedBits altering the AND mask. I'm
using *W opcodes on RV64. We may want to generlize this in the
future. For all these reason it seemed easiest to do it this way.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D95774
2021-02-01 09:37:55 -08:00
Craig Topper 44cc5abbf9 [RISCV] Custom lower fshl/fshr with Zbt extension.
We need to add a mask to the shift amount for these operations
to use the FSR/FSL instructions. We were previously doing this
in isel patterns, but custom lowering will make the mask
visible to optimizations earlier.
2021-01-31 17:49:15 -08:00
Craig Topper 3fdf2a56dd [RISCV] Use MVT instead of EVT in RISCVISelDAGToDAG.cpp
All this code runs post type legalization so we should have
exclusively legal types. The methods on MVT should be more
efficient than EVT.
2021-01-30 15:57:15 -08:00
Hsiangkai Wang 9847023660 [RISCV] Update the version number to v0.10 for vector. 2021-01-30 07:55:58 +08:00
Hsiangkai Wang 282aca10ae [RISCV] Update the version number to v0.10 for vector.
v0.10 is tagged in V specification. Update the version to v0.10.

Differential Revision: https://reviews.llvm.org/D95680
2021-01-30 07:20:05 +08:00
Hsiangkai Wang e08b67f3a8 [NFC][RISCV] Remove redundant pseudo instructions for vector load/store.
Not all combinations of SEW and LMUL we need to support. For example, we
only need to support [M1, M2, M4, M8] for SEW = 64. There is no need to
define pseudos for PseudoVLSE64MF8, PseudoVLSE64MF4, and PseudoVLSE64MF2.

Differential Revision: https://reviews.llvm.org/D95667
2021-01-30 07:20:05 +08:00
Kazu Hirata 046cfb8565 [llvm] Forward-declare formatted_raw_ostream (NFC)
Various *TargetStreamer.h need formatted_raw_ostream but rely on a
forward declaration of formatted_raw_ostream in MCStreamer.h.  This
patch adds forward declarations right in *TargetStreamer.h.

While we are at it, this patch removes the one in MCStreamer.h, where
it is unnecessary.
2021-01-28 22:21:13 -08:00
Christudasan Devadasan 892e4567e1 Support a list of CostPerUse values
This patch allows targets to define multiple cost
values for each register so that the cost model
can be more flexible and better used during the
register allocation as per the target requirements.

For AMDGPU the VGPR allocation will be more efficient
if the register cost can be associated dynamically
based on the calling convention.

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D86836
2021-01-29 10:14:52 +05:30