Commit Graph

394 Commits

Author SHA1 Message Date
Fraser Cormack 0c5b789c73 [RISCV] Support fixed-length vectors in the calling convention
This patch adds fixed-length vector support to the calling convention
when RVV is used to lower fixed-length vectors. The scheme follows the
regular vector calling convention for the argument/return registers, but
uses scalable vector container types as the LocVTs, and converts to/from
the fixed-length vector value types as required.

Fixed-length vector types may be split when the combination of minimum
VLEN and the maximum allowable LMUL is not large enough to fully contain
the vector. In this case the behaviour differs between fixed-length
vectors passed as parameters and as return values:
1. For return values, vectors must be passed entirely via registers or
via the stack.
2. For parameters, unlike scalar values, split vectors continue to be
passed by value, and are split across multiple registers until there are
no remaining registers. Thus vector parameters may be found partly in
registers and partly on the stack.

As with scalable vectors, the first fixed-length mask vector is passed
via v0. Split mask fixed-length vectors are passed first via v0 and then
via the next available vector register: v8,v9,etc.

The handling of vector return values uses all available argument
registers v8-v23 which does not adhere to the calling convention we're
supposedly implementing, but since this issue affects both fixed-length
and scalable-vector values, it was left as-is.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97954
2021-03-15 10:43:51 +00:00
Hsiangkai Wang a81dff1e58 [RISCV] Support inline asm for vector instructions.
Types of fractional LMUL and LMUL=1 are all using VR register class. When
using inline asm, it will use the first type in the register class as the
type for the register. It is not necessary the same as the value type. We
need to use INSERT_SUBVECTOR/EXTRACT_SUBVECToR/BITCAST to make it legal
to put the value in the corresponding register class.

Differential Revision: https://reviews.llvm.org/D97480
2021-03-15 11:02:18 +08:00
Craig Topper 51151828ac [RISCV] Teach normaliseSetCC to canonicalize X > -1 to X >= 0 and X < 1 to 0 >= X.
This allows the use of BGE with X0 instead of puting -1/1 in a
register.

Reviewed By: jrtc27

Differential Revision: https://reviews.llvm.org/D98542
2021-03-12 11:50:10 -08:00
Fraser Cormack 641f5700f9 [RISCV] Optimize INSERT_VECTOR_ELT sequences
This patch optimizes the codegen for INSERT_VECTOR_ELT in various ways.
Primarily, it removes the use of vslidedown during lowering, and the
vector element is inserted entirely using vslideup with a custom VL and
slide index.

Additionally, lowering of i64-element vectors on RV32 has been optimized
in several ways. When the 64-bit value to insert is the same as the
sign-extension of the lower 32-bits, the codegen can follow the regular
path. When this is not possible, a new sequence of two i32 vslide1up
instructions is used to get the vector element into a vector. This
sequence was suggested by @craig.topper. From there, the value is slid
into the final position for more consistent lowering across RV32 and
RV64.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98250
2021-03-12 09:13:38 +00:00
Fraser Cormack 4d2d5855c7 [RISCV] Fix up stale VECREDUCE comments. NFC.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98399
2021-03-12 08:49:46 +00:00
Craig Topper 1d26bbcf9b [RISCV] Return false from isShuffleMaskLegal except for splats.
We don't support any other shuffles currently.

This changes the bswap/bitreverse tests that check for this in
their expansion code. Previously we expanded a byte swapping
shuffle through memory. Now we're scalarizing and doing bit
operations on scalars to swap bytes.

In the future we can probably use vrgather.vx to do a byte swap
shuffle.
2021-03-11 20:02:49 -08:00
Craig Topper c82f442954 [RISCV] Support fixed vector copysign.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98394
2021-03-11 09:57:24 -08:00
Craig Topper 0dff8a9627 [RISCV] Handle vmv.x.s intrinsic for i64 vectors on RV32.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98372
2021-03-11 09:39:50 -08:00
Craig Topper 9c841cb8e8 [RISCV] Support extract_vector_elt for fixed and scalable masked registers.
This uses a really simple approach of converting to an i8 vector
and extracting. This is probably not the best approach especially
if you know the index is constant.

Other ideas:
-Store to stack temporary using vse1, load as scalar and shift.
-Sort of bitcast the vector to a vector of i8, slide down the
 appropriate 8 bit element, copy to scalar, shift down the
 correct bit within the 8 bits we extracted. Not exactly sure
 how to describe such a bitcast from i1 vector to i8 vector
 within the type system for elements less than 8.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98310
2021-03-11 09:26:44 -08:00
Craig Topper 9106d04554 [RISCV][SelectionDAG] Introduce an ISD::SPLAT_VECTOR_PARTS node that can represent a splat of 2 i32 values into a nxvXi64 vector for riscv32.
On riscv32, i64 isn't a legal scalar type but we would like to
support scalable vectors of i64.

This patch introduces a new node that can represent a splat made
of multiple scalar values. I've used this new node to solve the current
crashes we experience when getConstant is used after type legalization.

For RISCV, we are now default expanding SPLAT_VECTOR to SPLAT_VECTOR_PARTS
when needed and then handling the SPLAT_VECTOR_PARTS later during
LegalizeOps. I've remove the special case I previously put in for
ABS for D97991 as the default expansion is now able to succesfully
use getConstant.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98004
2021-03-10 09:46:18 -08:00
Craig Topper 0c73a506e8 [RISCV] Starting fixing issues that prevent us from testing vXi64 intrinsics on RV32.
Currently we crash in type legalization any time an intrinsic
uses a scalar i64 on RV32.

This patch adds support for type legalizing this to prevent
crashing. I don't promise that it uses the best possible codegen
just that it is functional.

This first version handles 3 cases. vmv.v.x intrinsic, vmv.s.x
intrinsic and intrinsics that take a scalar input, splat it and
then do some operation.

For vmv.v.x we'll either rely on hardware sign extension for
constants or we'll convert it to multiple splats and bit
manipulation.

For vmv.s.x we use a really unoptimal sequence inspired by what
we do for an INSERT_VECTOR_ELT.

For the third case we'll either try to use the .vi form for
constants or convert to a complicated splat and bitmanip and use
the .vv form of the operation.

I've renamed the ExtendOperand field to SplatOperand now use it
specifically for the third case. The first two cases are handled
by custom lowering specifically for those intrinsics.

I haven't updated all tests yet, but I tried to cover a subset
that includes single-width, widening, and narrowing.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97895
2021-03-10 09:45:38 -08:00
Craig Topper 1e39118638 [RISCV] Manually split vector operands to VECREDUCE when handling vXi64 vectors on RV32.
The type legalizer will visit the result before the operands. To
avoid creating an illegal target specific node or falling back to
scalarization, we need to manually split vector operands.

This still doesn't handle the case of non-power of 2 operands
which need to be widened. I'm not sure the type legalizer is
ready for it. I think we would need to insert an
INSERT_SUBVECTOR with the power of 2 type we want, with an undef
first operand, and the non-power of 2 orignal operand as the vector
to insert. Then fill in the neutral elements into the elements the
padded elements. Alternatively we INSERT_SUBVECTOR into a neutral vector.
From there we carry on splitting if needed to get to a legal type
then do the target specific code.

The problem with this is the type legalizer doesn't know how to
widen an insert_subvector yet. We would need to add that including
the handling for a non-undef first vector.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98292
2021-03-10 09:27:38 -08:00
Craig Topper 351844edf1 [RISCV] Add support for VECTOR_REVERSE for scalable vector types.
I've left mask registers to a future patch as we'll need
to convert them to full vectors, shuffle, and then truncate.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97609
2021-03-09 10:03:45 -08:00
Craig Topper 77ac3166e5 [RISCV] Add support for fixed vector reductions.
I've included tests that require type legalization to split the
vector. The i64 version of these scalarizes on RV32 due to type
legalization visiting the result before the vector type. So we
have to abort our custom expansion to avoid creating target
specific nodes with an illegal type. Then type legalization ends
up scalarizing. We might be able to fix this by doing custom
splitting for large vectors in our handler to get down to a legal
type.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98102
2021-03-09 09:39:59 -08:00
Craig Topper 1c7ad4dd88 [RISCV] Don't modify the SEW immediate on the V extension pseudo instructions after inserting VSETVLI.
Previously we set the value to -1, but the SEW information could
be useful for scheduling.

Reviewed By: frasercrmck, rogfer01

Differential Revision: https://reviews.llvm.org/D98062
2021-03-09 09:02:19 -08:00
Craig Topper 72ecf2f43f [RISCV] Optimize fixed vector ABS. Fix crash on scalable vector ABS for SEW=64 with RV32.
The default fixed vector expansion uses sra+xor+add since it can't
see that smax is legal due to our custom handling. So we select
smax(X, sub(0, X)) manually.

Scalable vectors are able to use the smax expansion automatically
for most cases. It crashes in one case because getConstant can't build a
SPLAT_VECTOR for nxvXi64 when i64 scalars aren't legal. So
we manually emit a SPLAT_VECTOR_I64 for that case.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97991
2021-03-09 08:51:03 -08:00
Craig Topper 7a64cc4a76 [RISCV] Make use of DAG.getNeutralElement in lowerVECREDUCE to avoid repeating the same list of constants. NFC
Reviewed By: frasercrmck, khchen

Differential Revision: https://reviews.llvm.org/D98091
2021-03-08 09:11:10 -08:00
Fraser Cormack 18173c57bd [RISCV] Add new entry points to getContainerForFixedLengthVector
While working on adding fixed-length vectors to the calling convention,
it was necessary to be able to query for a fixed-length vector container
type without access to an instance of SelectionDAG.

This patch modifies the "main" getContainerForFixedLengthVector function
to use an instance of TargetLowering rather than SelectionDAG, and
preserves the SelectionDAG overload as a wrapper.

An additional non-static version of the function was also added to
simplify the common case in RISCVTargetLowering.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97925
2021-03-08 09:26:19 +00:00
Craig Topper c91b3c9e63 [RISCV] Fold (select_cc (setlt X, Y), 0, ne, trueV, falseV) -> (select_cc X, Y, lt, trueV, falseV)
A setcc can be created during LegalizeDAG after select_cc has been
created. This combine will enable us to fold these late setccs.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D98132
2021-03-07 09:44:56 -08:00
Craig Topper fdbd5d3206 [RISCV] Fold (select_cc (xor X, Y), 0, eq/ne, trueV, falseV) -> (select_cc X, Y, eq/ne, trueV, falseV)
This pattern occurs when lowering for overflow operations
introduce an xor after select_cc has already been formed.

I had to rework another combine that looked for select_cc of an xor
with 1. That xor will now get combined away so we just need to
look for the RHS of the select_cc being 1.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D98130
2021-03-07 09:29:55 -08:00
Fraser Cormack 8e7ceffd0b [RISCV] Fix crash when inserting large fixed-length subvectors
This patch addresses a compiler crash resulting from passing a
fixed-length type to one that expects scalable vector types. An
assertion was added to prevent this regressing in the future.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97868
2021-03-04 09:27:16 +00:00
Fraser Cormack d8e1d2ebf4 [RISCV] Preserve fixed-length VL on insert_vector_elt in more cases
This patch fixes up one case where the fixed-length-vector VL was
dropped (falling back to VLMAX) when inserting vector elements, as the
code would lower via ISD::INSERT_VECTOR_ELT (at index 0) which loses the
fixed-length vector information.

To this end, a custom node, VMV_S_XF_VL, was introduced to carry the VL
operand through to the final instruction. This node wraps the RVV
vmv.s.x and vmv.s.f instructions, which were being selected by
insert_vector_elt anyway.

There should be no observable difference in scalable-vector codegen.

There is still one outstanding drop from fixed-length VL to VLMAX, when
an i64 element is inserted into a vector on RV32; the splat (which is
custom legalized) has no notion of the original fixed-length vector
type.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97842
2021-03-04 09:21:10 +00:00
Fraser Cormack c1695ddf7d [RISCV] Support fixed-length INSERT_VECTOR_ELT
This patch enables support for lowering INSERT_VECTOR_ELT on
fixed-length vector types. The strategy follows that for scalable vector
types.

This patch also includes a quick fix to prevent the compiler infinitely
looping between lowering BUILD_VECTOR as VECTOR_SHUFFLE and back again.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97698
2021-03-02 16:48:38 +00:00
Fraser Cormack de2b70010a [RISCV] Lower CONCAT_VECTORS to INSERT_SUBVECTOR nodes
The default expansion of CONCAT_VECTORS goes through the stack. This
patch avoids that penalty by custom-lowering CONCAT_VECTORS to a series
of INSERT_SUBVECTOR nodes. Futher optimizations are possible, but this
is a good start.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97692
2021-03-02 11:13:59 +00:00
Fraser Cormack 3fea9226ee [RISCV] Support INSERT_SUBVECTOR on vector masks
Like with EXTRACT_SUBVECTOR, INSERT_SUBVECTOR poses a problem
for vector masks as RVV isn't able to slide mask types around. We choose
instead to bitcast to equivalently-sized i8 types where we can, else we
zero-extend, perform the operation, and truncate back down.

One test was left disabled due to a crash in the legalizer.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97559
2021-03-01 12:04:11 +00:00
Fraser Cormack e80ca3af82 [RISCV] Fix INSERT/EXTRACT_SUBVECTOR on fractional LMUL types
This patch fixes a bug where the lowering for INSERT_SUBVECTOR and
EXTRACT_SUBVECTOR would insist on first extracting a register-aligned
LMUL1 vector type before perfoming the slide up/down. This was even if
the vector was a fractional LMUL type, in which case the aligned
EXTRACT_SUBVECTOR was invalid.

This issue only occurred for scalable vector types, but a variety of
tests for both scalable and fixed-length vectors have been added to
ensure this does not regress in the future.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97556
2021-03-01 11:51:05 +00:00
Fraser Cormack 4ea734e6ec [RISCV] Unify scalable- and fixed-vector INSERT_SUBVECTOR lowering
This patch unifies the two disparate paths for lowering INSERT_SUBVECTOR
operations under one roof. Consequently, with this patch it is possible to
support any fixed-length subvector insertion, not just "cast-like" ones.

As before, support for the insertion of mask vectors will come in a
separate patch.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97543
2021-03-01 11:38:47 +00:00
Fraser Cormack bd4d421688 [RISCV] Support EXTRACT_SUBVECTOR on vector masks
This patch adds support for extracting subvectors from vector masks.
This can be either extracting a scalable vector from another, or a fixed-length
vector from a fixed-length or scalable vector.

Since RVV lacks a way to slide vector masks down on an element-wise
basis and we don't know the true length of the vector registers, in many
cases we must resort to using equivalently-sized i8 vectors to perform
the operation. When this is not possible we fall back and extend to a
suitable i8 vector.

Support was also added for fixed-length truncation to mask types.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97475
2021-03-01 11:20:09 +00:00
Fraser Cormack 37014db013 [RISCV] Use existing method for the LMUL1 type. NFCI.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97467
2021-02-26 09:44:05 +00:00
Craig Topper d7fca3f0bf [RISCV] Support fixed vector extract_element for FP types. 2021-02-25 16:30:28 -08:00
Fraser Cormack 02f435db0b [RISCV] Support fixed-length vector i2fp/fp2i conversions
This patch extends the support for scalable-vector int->fp and fp->int
conversions by additionally handling fixed-length vectors.

The existing scalable-vector lowering re-expresses widening/narrowing by
x4+ conversions as standard nodes. The fixed-length vector support slots
in at "the end" of this process by lowering the now equally-sized and
widening/narrowing by x2 nodes to our custom VL versions.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97374
2021-02-25 13:47:58 +00:00
Fraser Cormack 9620ce90d7 [RISCV] Support fixed-length vector FP_ROUND & FP_EXTEND
This patch extends the support for vector FP_ROUND and FP_EXTEND by
including support for fixed-length vector types. Since fixed-length
vectors use "VL" nodes and scalable vectors can use the standard nodes,
there is slightly more to do in the fixed-length case. A helper function
was introduced to try and reduce the divergent paths. It is expected
that this function will similarly come in useful for lowering the
int-to-fp and fp-to-int operations for fixed-length vectors.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97301
2021-02-25 12:16:06 +00:00
Fraser Cormack 84413e1947 [RISCV] Support fixed-length vector truncates
This patch extends support for our custom-lowering of scalable-vector
truncates to include those of fixed-length vectors. It does this by
co-opting the custom RISCVISD::TRUNCATE_VECTOR node and adding mask and
VL operands. This avoids unnecessary duplication of patterns and
inflation of the ISel table.

Some truncates go through CONCAT_VECTORS which currently isn't
efficiently handled, as it goes through the stack. This can be improved
upon in the future.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97202
2021-02-25 12:11:34 +00:00
Fraser Cormack 3bc5ed3875 [RISCV] Support fixed-length vector sign/zero extension
This patch adds support for the custom lowering sign- and zero-extension
of fixed-length vector types. It does so through custom nodes. Since the
source and destination types are (necessarily) of different sizes, it is
possible that the source type is legal whilst the larger destination
type isn't. In this case the legalization makes heavy use of
EXTRACT_SUBVECTOR.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97194
2021-02-25 12:05:17 +00:00
Fraser Cormack 821f8bb29a [RISCV] Unify scalable- and fixed-vector EXTRACT_SUBVECTOR lowering
This patch unifies the two disparate paths for lowering
EXTRACT_SUBVECTOR operations under one roof. Consequently, with this
patch it is possible to support any fixed-length subvector extraction,
not just "cast-like" ones.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97192
2021-02-25 11:46:57 +00:00
Craig Topper efcdd598b7 [RISCV] Teach VSETVLI inserter to use VSETIVLI when possible.
We always create the VL operand using a register, but if we can
determine that it came from an ADDI X0, imm with a sufficiently
small immediate, we can use VSETIVLI.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97332
2021-02-24 16:07:33 -08:00
Craig Topper 086670d367 [RISCV] Support fixed vector extract element. Use VL=1 for scalable vector extract element.
I've changed to use VL=1 for slidedown and shifts to avoid extra
element processing that we don't need.

The i64 fixed vector handling on i32 isn't great if the vector type
isn't legal due to an ordering issue in type legalization. If the
vector type isn't legal, we fall back to default legalization
which will bitcast the vector to vXi32 and use two independent extracts.
Doing better will require handling several different cases by
manually inserting insert_subvector/extract_subvector to adjust the type
to a legal vector before emitting custom nodes.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97319
2021-02-24 10:17:00 -08:00
Fraser Cormack dd68f3cf28 [RISCV] Support insertion of misaligned subvectors
This patch extends the support for RVV INSERT_SUBVECTOR to cover those
which don't align to a vector register boundary. Like the support for
EXTRACT_SUBVECTOR in D96959, it accomplishes this by extracting the
nearest register-sized subvector (a subregister operation), then sliding
the vector down with VSLIDEDOWN, inserting the subvector to the first
position, and sliding the vector back up again afterwards.

Unlike subvector extraction, for vectors that occupy less than a full
vector register we must preserve the untouched elements. We do this by
lowering to an LMUL=1 INSERT_SUBVECTOR using the above method and
lowering that to a VSLIDEUP with a zero offset. This uses a
tail-undisturbed policy and so has the effect of "sliding in" the
subvector elements while preserving the surrounding ones.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96972
2021-02-23 10:31:06 +00:00
Craig Topper 1cd2a5a7da [RISCV] Add isel support for bitcasts between fixed vector types.
This should fix the issue reported in D96972.

I don't have a good test case for this without those changes.

Differential Revision: https://reviews.llvm.org/D97082
2021-02-22 12:05:46 -08:00
Craig Topper 1aeb927fed [RISCV] Custom isel the rest of the vector load/store intrinsics.
A previous patch moved the index versions. This moves the rest.
I also removed the custom lowering for VLEFF since we can now
do everything directly in the isel handling.

I had to update getLMUL to handle mask registers to index the
pseudo table correctly for VLE1/VSE1.

This is good for another 15K reduction in llc size.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97097
2021-02-22 09:53:46 -08:00
Fraser Cormack 3e1317fd32 [RISCV] Support extraction of misaligned subvectors
This patch extends the support for RVV EXTRACT_SUBVECTOR to cover those
which don't align to a vector register boundary. It accomplishes this by
extracting the nearest register-sized subvector (a subregister
operation), then sliding the vector down with VSLIDEDOWN and extracting
the subvector from the first position (a COPY operation).

Since this procedure involves the use of VSCALE and multiplication, the
handling of such operations is done during lowering to simplify the
implementation and make use of DAG combining. This necessitated moving
some helper functions from RISCVISelDAGToDAG to RISCVTargetLowering.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96959
2021-02-20 15:43:54 +00:00
Craig Topper 98dff5e804 [RISCV] Move SHFLI matching to DAG combine. Add 32-bit support for RV64
We previously used isel patterns for this, but that used quite
a bit of space in the isel table due to OR being associative
and commutative. It also wouldn't handle shifts/ands being in
reversed order.

This generalizes the shift/and matching from GREVI to
take the expected mask table as input so we can reuse it for
SHFLI.

There is no SHFLIW instruction, but we can promote a 32-bit
SHFLI to i64 on RV64. As long as bit 4 of the control bit isn't
set, a 64-bit SHFLI will preserve 33 sign bits if the input had
at least 33 sign bits. ComputeNumSignBits has been updated to
account for that to avoid sext.w in the tests.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96661
2021-02-19 10:07:12 -08:00
Craig Topper 156fc07e19 [RISCV] Add support for fixed vector MULHU/MULHS.
This uses to division by constant optimization to use MULHU/MULHS.

Reviewed By: frasercrmck, arcbbb

Differential Revision: https://reviews.llvm.org/D96934
2021-02-18 09:15:08 -08:00
Craig Topper 792627be35 [RISCV] Add support for fixed vector sign/zero extend from mask types.
Due to vXi64 on RV32, I've directly emitted this using _VL ISD
opcodes. If it wasn't for that we could just use fixed vector
BUILD_VECTOR and VSELECT and let those each be legalized.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96910
2021-02-18 09:08:10 -08:00
Craig Topper 016eca8f90 [RISCV] Guard LowerINSERT_VECTOR_ELT against fixed vectors.
The type legalizer can call this code based on the scalar type so
we need to verify the vector type is a scalable vector.

I think due to how type legalization visits nodes, the vector type
will have already been legalized so we don't have an issue with
using MVT here like we did for EXTRACT_VECTOR_ELT.
I've added a test just in case.
2021-02-17 19:27:08 -08:00
Craig Topper 00c4e0a8f6 [RISCV] Guard the ISD::EXTRACT_VECTOR_ELT handling in ReplaceNodeResults against fixed vectors and non-MVT types.
The type legalizer is calling this code based on the scalar type so
we need to verify the input type is a scalable vector.

The vector type has also not been legalized yet when this is called
so we need to use EVT for it.
2021-02-17 18:25:38 -08:00
Craig Topper 3bdd02735b [RISCV] Localize RISCVZvlssegTable to RISCVISelDAGToDAG.cpp, the only place it is used. 2021-02-17 11:37:28 -08:00
Fraser Cormack d81161646a [RISCV] Add support for fixed vector vselect
This patch adds support for fixed-length vector vselect. It does so by
lowering them to a custom unmasked VSELECT_VL node with a vector length
operand.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96768
2021-02-17 10:59:00 +00:00
Craig Topper 07ca13fe07 [RISCV] Add support for fixed vector mask logic operations.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96741
2021-02-16 09:34:00 -08:00
Fraser Cormack 04977ce5ce [RISCV] Fix a crash in fixed-length build_vector lowering
Non-splatted non-integer build_vector nodes were mistakenly being
lowered as VID expressions, which should not happen. VID can only be
used to select integer build_vector nodes.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96718
2021-02-16 10:25:15 +00:00
Fraser Cormack b870199020 [RISCV] Add patterns for scalable-vector fabs & fcopysign
The patterns mostly follow the scalar counterparts, save for some extra
optimizations to match the vector/scalar forms.

The patch adds a DAGCombine for ISD::FCOPYSIGN to try and reorder
ISD::FNEG around any ISD::FP_EXTEND or ISD::FP_TRUNC of the second
operand. This helps us achieve better codegen to match vfsgnjn.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96028
2021-02-16 10:21:09 +00:00
Craig Topper 7ba2e1c601 [RISCV] Add support for fixed vector floating point setcc.
This is annoying because the condition code legalization belongs
to LegalizeDAG, but our custom handler runs in Legalize vector ops
which occurs earlier.

This adds some of the mask binary operations so that we can combine
multiple compares that we need for expansion.

I've also fixed up RISCVISelDAGToDAG.cpp to handle copies of masks.

This patch contains a subset of the integer setcc patch as well.
That patch is dependent on the integer binary ops patch. I'll rebase
based on what order the patches go in.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96567
2021-02-15 12:52:25 -08:00
Fraser Cormack 4bd5bd4009 [RISCV] Convert VSLIDE(UP|DOWN) nodes to "VL" versions (NFC)
This patch prepares the RISCV VSLIDEUP and VSLIDEDOWN custom nodes to
ones carrying additional mask and vector-length operands. This is
primarily so they can be used by both systems.

This also takes the opportunity to create some helper functions to deal
with the common task of getting the default (unmasked) VL operands.

Reviewed By: craig.topper, arcbbb

Differential Revision: https://reviews.llvm.org/D96505
2021-02-15 10:32:56 +00:00
Craig Topper 4220a81c84 [RISCV] Add support for fixed vector fabs 2021-02-12 15:33:36 -08:00
Craig Topper 36658376d5 [RISCV] Add support for fixed vector sqrt. 2021-02-12 15:33:29 -08:00
Craig Topper 1697cc78b1 [RISCV] Add support for integer fixed vector setcc
I believe I've covered all orderings of splat operands here. Better
canonicalization in lowering might help reduce this. I did not handle
the immediate adjustments needed for set(u)gt/set(u)lt.

Testing here is limited to byte types because the scalable vector
type used for masks for the store is calculated assuming 8 byte
elements. But for the setcc its based on the element count of the
container type for the setcc input. So they don't agree. We'll need
to enhanced D96352 to handle this I think.

Differential Revision: https://reviews.llvm.org/D96443
2021-02-12 09:29:41 -08:00
Fraser Cormack e88da1d677 [RISCV] Add support for integer fixed min/max
This patch extends the initial fixed-length vector support to include
smin, smax, umin, and umax.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96491
2021-02-12 09:19:45 +00:00
Craig Topper 033b1bd185 [RISCV] Add support loads, stores, and splats of vXi1 fixed vectors.
This refines how we determine which masks types are legal and adds
support for loads, stores, and all ones/zeros splats.

I left a fixme in store handling where I think we need to zero
extra bits if the type isn't a multiple of a byte. If I remember
right from X86 there was some case we could have a store of a
1, 2, or 4 bit mask and have a scalar zextload that then expected the
bits to be 0. Its tricky to zero the bits with RVV. We need to do
something like round VL up, zero a register, lower the VL back down,
then do a tail undisturbed move into the zero register. Another
option might be to generate a mask of 1/2/4 bits set with a VL of 8
and use that to mask off the bits.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96468
2021-02-11 09:13:16 -08:00
Craig Topper 0c254b4a69 [RISCV] Add support for selecting vrgather.vx/vi for fixed vector splat shuffles.
The test cases extract a fixed element from a vector and splat it
into a vector. This gets DAG combined into a splat shuffle.

I've used some very wide vectors in the test to make sure we have
at least a couple tests where the element doesn't fit into the
uimm5 immediate of vrgather.vi so we fall back to vrgather.vx.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96186
2021-02-10 10:01:56 -08:00
Fraser Cormack a3c74d6d53 [RISCV] Add support for selecting vid.v from build_vector
This patch optimizes a build_vector "index sequence" and lowers it to
the existing custom RISCVISD::VID node. This pattern is common in
autovectorized code.

The custom node was updated to allow it to be used by both scalable and
fixed-length vectors, thus avoiding pattern duplication.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96332
2021-02-10 10:58:40 +00:00
Hsiangkai Wang a5b07a221a [RISCV] Initial support of LoopVectorizer for RISC-V Vector.
Define an option -riscv-vector-bits-max to specify the maximum vector
bits for vectorizer. Loop vectorizer will use the value to check if it
is safe to use the whole vector registers to vectorize the loop.

It is not the optimum solution for loop vectorizing for scalable vector.
It assumed the whole vector registers will be used to vectorize the code.
If it is possible, we should configure vl to do vectorize instead of
using whole vector registers.

We only consider LMUL = 1 in this patch.

This patch just an initial work for loop vectorizer for RISC-V Vector.

Differential Revision: https://reviews.llvm.org/D95659
2021-02-09 06:32:18 +08:00
Craig Topper 8d8cafa32e [RISCV] Add support for splat fixed length build_vectors using RVV.
Building on the fixed vector support from D95705

I've added ISD nodes for vmv.v.x and vfmv.v.f and switched to
lowering the intrinsics to it. This allows us to share the same
isel patterns for both.

This doesn't handle splats of i64 on RV32 yet. The build_vector
gets converted to a vXi32 build_vector+bitcast during type
legalization. Not sure the best way to handle this at the moment.

Differential Revision: https://reviews.llvm.org/D96108
2021-02-08 11:12:56 -08:00
Craig Topper b8d719fbe8 [RISCV] Add support for fixed vector FMA.
Follow up to D95705. Does not include the commuting support from D95800.

Differential Revision: https://reviews.llvm.org/D96103
2021-02-08 11:12:56 -08:00
Craig Topper a719b667a9 [RISCV] Add initial support for converting fixed vectors to scalable vectors during lowering to use RVV instructions.
This is an alternative to D95563.

This is modeled after a similar feature for AArch64's SVE that uses
predicated scalable vector instructions.a

Rather than use predication, this patch uses an explicit VL operand.
I've limited it to always use LMUL=1 for now, but we can improve this
in the future.

This requires a bunch of new ISD opcodes to carry the VL operand.
I think we can probably lower intrinsics to these ISD opcodes to
cut down on the size of the isel table. Which is why I've added
patterns for all integer/float types and not just LMUL=1.

I'm only testing one vector width right now, but the width is
programmable via the command line.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D95705
2021-02-08 10:41:30 -08:00
Craig Topper b7b4f4cbc3 [RISCV] Make scalable vector FMA commutable for register allocation.
This adds support for commuting operands and converting between
vfmadd and vfmacc to avoid register copies.

To avoid messing up intrinsic behavior, I've added new pseudo
instructions that have the isCommutable flag set. These pseudos also
force a tail agnostic policy. The intrinsic version still use
the tail undisturbed policy.

For best results it looks like we need to start with fmadd and only
pick fmacc if its beneficial. MachineCSE commutes without contraining
the operands and then commutes back if it didn't help with CSE. So
I've made sure that when the operand choice isn't constrained, we
will keep fmadd for MachineCSE and when it does the second commute,
we get back the original instruction.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D95800
2021-02-08 10:05:33 -08:00
Mikael Holmen eb8c27c60c [RISCV] Use std::make_tuple to make some toolchains happy again
My toolchain (LLVM 8.0, libstdc++ 5.4.0) complained with:

12:38:19 ../lib/Target/RISCV/RISCVISelLowering.cpp:1717:12: error: chosen constructor is explicit in copy-initialization
12:38:19     return {RISCVISD::VECREDUCE_FADD, Op.getOperand(0),
12:38:19            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
12:38:19 /proj/flexasic/app/llvm/8.0/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.4.0/../../../../include/c++/5.4.0/tuple:479:19: note: explicit constructor declared here
12:38:19         constexpr tuple(_UElements&&... __elements)
12:38:19                   ^
12:38:19 ../lib/Target/RISCV/RISCVISelLowering.cpp:1720:12: error: chosen constructor is explicit in copy-initialization
12:38:19     return {RISCVISD::VECREDUCE_SEQ_FADD, Op.getOperand(1), Op.getOperand(0)};
12:38:19            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
12:38:19 /proj/flexasic/app/llvm/8.0/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.4.0/../../../../include/c++/5.4.0/tuple:479:19: note: explicit constructor declared here
12:38:19         constexpr tuple(_UElements&&... __elements)
12:38:19                   ^
12:38:19 2 errors generated.

This commit adds explicit calls to std::make_tuple to work around
the problem.
2021-02-08 14:37:25 +01:00
Fraser Cormack b46aac125d [RISCV] Support the scalable-vector fadd reduction intrinsic
This patch adds support for both the fadd reduction intrinsic, in both
the ordered and unordered modes.

The fmin and fmax intrinsics are not currently supported due to a
discrepancy between the LLVM semantics and the RVV ISA behaviour with
regards to signaling NaNs. This behaviour is likely fixed in version 2.3
of the RISC-V F/D/Q extension, but until then the intrinsics can be left
unsupported.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95870
2021-02-08 09:52:27 +00:00
Fraser Cormack e046c0c28b [RISCV] Support scalable-vector integer reduction intrinsics
This patch adds support for the integer reduction intrinsics supported
by RVV. This excludes "mul" which has no corresponding instruction.

The reduction instructions in RVV have slightly complicated type
constraints given they always produce a single "M1" vector register.

They are lowered to custom nodes including the second "scalar" reduction
operand to simplify the patterns and in the hope that they can be useful
for future DAG combines.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95620
2021-02-05 10:10:08 +00:00
Fraser Cormack c3eb2da6c4 [RISCV] Optimize sign-extended EXTRACT_VECTOR_ELT nodes
This patch custom-legalizes all integer EXTRACT_VECTOR_ELT nodes where
SEW < XLEN to VMV_S_X nodes to help the compiler infer sign bits from
the result. This allows us to eliminate redundant sign extensions.

For parity, all integer EXTRACT_VECTOR_ELT nodes are legalized this way
so that we don't need TableGen patterns for some and not others.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95741
2021-02-05 10:05:22 +00:00
Craig Topper 44cc5abbf9 [RISCV] Custom lower fshl/fshr with Zbt extension.
We need to add a mask to the shift amount for these operations
to use the FSR/FSL instructions. We were previously doing this
in isel patterns, but custom lowering will make the mask
visible to optimizations earlier.
2021-01-31 17:49:15 -08:00
Fraser Cormack fc2f27ccf3 [RISCV] Add support for RVV int<->fp & fp<->fp conversions
This patch adds support for the full range of vector int-to-float,
float-to-int, and float-to-float conversions on legal types.

Many conversions are supported natively in RVV so are lowered with
patterns. These include conversions between (element) types of the same
size, and those that are half/double the size of the input. When
conversions take place between types that are less than half or more
than double the size we must lower them using sequences of instructions
which go via intermediate types.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95447
2021-01-28 09:50:32 +00:00
Craig Topper a40e01e442 [RISCV] Rework fault first only load isel.
-Remove the ISD opcode for READ_VL. Just emit the MachineSDNode directly.
-Move segmented fault first only load intrinsic handling completely to
 RISCVISelDAGToDAG.cpp and emit the ReadVL MachineSDNode there
 instead of lowering to ISD opcodes first.
2021-01-27 11:51:41 -08:00
Craig Topper 04570e98c8 [RISCV] Group the legal vector types into lists we can iterator over in the RISCVISelLowering constructor
Remove the RISCVVMVTs namespace because I don't think it provides
a lot of value. If we change the mappings we'd likely have to add
or remove things from the list anyway.

Add a wrapper around addRegisterClass that can determine the
register class from the fixed size of the type.

Reviewed By: frasercrmck, rogfer01

Differential Revision: https://reviews.llvm.org/D95491
2021-01-27 10:20:12 -08:00
Fraser Cormack 9a75a808c2 [RISCV] Fix a codegen crash in getSetCCResultType
This patch fixes some crashes coming from
`RISCVISelLowering::getSetCCResultType`, which would occasionally return
an EVT constructed from an invalid MVT, which has a null Type pointer.

The attached test shows this happening currently for some fixed-length
vectors, which hit this issue when the V extension was enabled, even
though they're not legal types under the V extension. The fix was also
pre-emptively extended to scalable vectors which can't be represented as
an MVT, even though a test case couldn't be found for them.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95434
2021-01-27 10:22:54 +00:00
Craig Topper f9d7f77267 [RISCV] Have customLegalizeToWOp truncate to the original type instead of i32 now that we use it for i8/i16 as well.
239cfbccb0 add support for legalizing
i8/i16 UDIV/UREM/SDIV to use *W instructions. So we need to truncate
to i8/i16 if we're legalizing one of those.
2021-01-26 10:50:03 -08:00
Hsiangkai Wang b69932b550 [RISCV] Implement vlsegff intrinsics.
Differential Revision: https://reviews.llvm.org/D95303
2021-01-26 12:02:43 +08:00
Fraser Cormack 15141cd115 [RISCV] Add RVV insertelt/extractelt scalable-vector patterns
Original patch by @rogfer01.

This patch adds support for insertelt and extractelt operations on
scalable vectors.

Special care must be taken on RV32 when dealing with i64 vectors as
there are no straightforward ways to insert a 64-bit element without a
register of that size. To that end, both are custom-lowered to different
sequences.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94615
2021-01-25 22:03:52 +00:00
Craig Topper 239cfbccb0 [RISCV] Custom type legalize i8/i16 UDIV/UREM/SDIV on RV64 so we can use divuw/remuw/divw.
This makes our i8/i16 codegen more similar to the i32 codegen.

I've also added computeKnownBits support for DIVUW/REMUW so
that we can remove zero extending ANDs from the output. Without
this we end up turning DIVUW/REMUW back into DIVU/REMU via some
isel patterns.

Reviewed By: frasercrmck, luismarques

Differential Revision: https://reviews.llvm.org/D95322
2021-01-25 10:47:22 -08:00
Craig Topper 4eb4f8963f [RISCV] Use sign extend for i32 arguments and returns in makeLibCall on RV64.
As far as I know 32 bits arguments and returns on RV64 are always
sign extended to i64. So I think we should be taking this into
account around libcalls.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D95285
2021-01-25 09:33:48 -08:00
Fraser Cormack fde2466171 [SelectionDAG] Support scalable-vector splats in more cases
This patch adds support for scalable-vector splats in DAGCombiner's
`isConstantOrConstantVector` and `ISD::matchUnaryPredicate` functions,
which enable the SelectionDAG div/rem-by-constant optimizations for
scalable vector types.

It also fixes up one case where the UDIV optimization was generating a
SETCC without first consulting the target for its preferred SETCC result
type.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94501
2021-01-25 10:58:15 +00:00
Craig Topper 4d5aa760a7 [RISCV] Add support for rev8 and orc.b to Zbb.
These instructions use a portion of the encodings for grevi and
gorci. The full encodings are only supported with Zbp. Note,
rev8 has a different encoding between rv32 and rv64.

Zbb is closer to being finalized that Zbp which has motivated
some decisions in this patch.

I'm treating rev8 and orc.b as separate instructions when
either Zbb or Zbp is enabled. This allows us to print to suggest
that either feature needs to be enabled to support these mnemonics.
I had tried to put HasStdExtZbbAndNotZbp on the Zbb instructions,
but that caused a diagnostic that said Zbp is required if neither
feature is enabled. We should really mention Zbb since its closer
to final.

This does require extra isel patterns for the different cases so
that bswap will always print as rev8 in assembly listing since
we can't use an InstAlias.

llvm-objdump disassembling should always pick the rev8 or orc.b
instructions. llvm-mc parsing and printing text will not convert
the grevi/gorci spellings to rev8/gorc.b. We could probably fix
this with a special case in processInstruction in the assembly
parser if it its important.

Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D94944
2021-01-22 12:49:10 -08:00
Craig Topper 3b5430eb0d [RISCV] Add a VL output to vleff intrinsics.
The fault-only-first-load instructions can reduce VL if an element
other than element 0 triggers a memory fault. This can be used to
vectorize loops with data dependent exit conditions like strcmp or
strlen.

This patch adds a VL output to these intrinsics so that the new
VL value can be captured by software. This will be expanded to
'csrr gpr, vl' after the vleff instruction during SelectionDAG.

By doing this with one intrinsic we are able to guarantee that the
csrr reads the VL value produced by the vleff instruction. Having
it as a separate intrinsic would make it impossible to guarantee
ordering without making every other vector intrinsic have side
effects.

The intrinsics are expanded during lowering into two ISD nodes
that are glued together. These ISD nodes will go
through isel separately, but should maintain the glue so that they
get emitted adjacently by InstrEmitter.

I've only ran the chain through the vleff instruction, allowing
the READ_VL to be deleted if it is unused.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D94286
2021-01-21 17:19:58 -08:00
Hsiangkai Wang 6e360460f1 [RISCV] Use v8-v23 as argument registers to conform to the proposal.
The maximum LMUL is 8. We need 16 vector registers for two LMUL-8
arguments. The modification follows the proposal of psABI in
https://github.com/riscv/riscv-elf-psabi-doc/pull/171

Differential Revision: https://reviews.llvm.org/D95134
2021-01-22 07:55:24 +08:00
Michael Munday 4ab0f51a75 Recommit "[RISCV] Legalize select when Zbt extension available"
This recommits 71ed4b6ce5 with
the polarity of some of the pattern corrected.

Original commit message:
The custom expansion of select operations in the RISC-V backend
interferes with the matching of cmov instructions. Legalizing
select when the Zbt extension is available solves that problem.

Reviewed By: luismarques, craig.topper

Differential Revision: https://reviews.llvm.org/D93767
2021-01-21 12:07:44 -08:00
Craig Topper 9d792fef57 [RISCV] Remove unnecessary APInt copy. NFC
getAPIntValue returns a const APInt& so keep it as a reference.
2021-01-20 10:33:09 -08:00
Hsiangkai Wang 8ca4b174d7 [RISCV] Implement vlseg intrinsics.
For Zvlsseg, we need continuous vector registers for the values. We need
to define new register classes for the different combinations of (number
of fields and LMUL). For example,

when the number of fields(NF) = 3, LMUL = 2, the values will be assigned
to (V0M2, V2M2, V4M2), (V2M2, V4M2, V6M2), (V4M2, V6M2, V8M2), ...

We define the vlseg intrinsics with multiple outputs. There is no way to
describe the codegen patterns with multiple outputs in the tablegen
files. We do the codegen in RISCVISelDAGToDAG and use EXTRACT_SUBREG to
extract the values of output.

The multiple scalable vector values will be put into a struct. This
patch is depended on the support for scalable vector struct.

Differential Revision: https://reviews.llvm.org/D94229
2021-01-20 14:26:04 +08:00
Craig Topper ce8b3937dd [RISCV] Add DAG combine to turn (setcc X, 1, setne) -> (setcc X, 0, seteq) if we can prove X is 0/1.
If we are able to compare with 0 instead of 1, we might be able
to fold the setcc into a beqz/bnez.

Often these setccs start life as an xor that gets converted to
a setcc by DAG combiner's rebuildSetcc. I looked into a detecting
(xor X, 1) and converting to (seteq X, 0) based on boolean contents
being 0/1 in rebuildSetcc instead of using computeKnownBits. It was
very perturbing to AMDGPU tests which I didn't look closely at.
It had a few changes on a couple other targets, but didn't seem
to be much if any improvement.

Reviewed By: lenary

Differential Revision: https://reviews.llvm.org/D94730
2021-01-19 11:21:48 -08:00
Fraser Cormack 9c6a00fe99 [RISCV] Add ISel patterns for scalable mask exts & truncs
Original patch by @rogfer01.

This patch adds support for sign-, zero-, and any-extension from
scalable mask vector types to integer vector types, as well as
truncation in the opposite direction.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94590
2021-01-19 18:13:15 +00:00
Fraser Cormack ac603c8d38 [RISCV] Add scalable vector truncate patterns
Original patch by @rogfer01.

This patch supports vector truncates, which on RVV must be done in a
series of instructions truncating by one power-of-two at a time. This is
done through custom-lowering and a custom node to avoid LLVM
re-combining the split TRUNCATE nodes.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94796
2021-01-18 10:18:43 +00:00
Craig Topper 383b6501ff [RISCV] Use tail agnostic policy for instructions with tied defs if the use operand is IMPLICIT_DEF.
The vcompress intrinsic is defined such that it requires a tail
undisturbed policy. This patch makes it so we can use the tail
agnostic policy if the user has passed vundefined to the dest
operand.

We need to do something similar for masked policy, but we need
annotation of which instructions use the mask policy first.

Not sure if this is sufficient for scheduling or if we'll need to
select different pseudos that don't have a tied def.

Reviewed By: evandro

Differential Revision: https://reviews.llvm.org/D94566
2021-01-17 23:47:58 -08:00
Craig Topper 86e604c4d6 [RISCV] Add implementation of targetShrinkDemandedConstant to optimize AND immediates.
SimplifyDemandedBits can remove set bits from immediates from instructions
like AND/OR/XOR. This can prevent them from being efficiently
codegened on RISCV.

This adds an initial version that tries to keep or form 12 bit
sign extended immediates for AND operations to enable use of ANDI.
If that doesn't work we'll try to create a 32 bit sign extended immediate
to use LUI+ADDIW.

More optimizations are possible for different size immediates or
different operations. But this is a good starting point that already
has test coverage.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D94628
2021-01-15 11:14:14 -08:00
Craig Topper b894a9fb23 [RISCV] Optimize select_cc after fp compare expansion
Some FP compares expand to a sequence ending with (xor X, 1) to invert the result. If
the consumer is a select_cc we can likely get rid of this xor by fixing
up the select_cc condition.

This patch combines (select_cc (xor X, 1), 0, setne, trueV, falseV) -
(select_cc X, 0, seteq, trueV, falseV) if we can prove X is 0/1.

Reviewed By: lenary

Differential Revision: https://reviews.llvm.org/D94546
2021-01-14 13:41:40 -08:00
Craig Topper 387d3c2479 [RISCV] Merge Utils library into MCTargetDesc
MCTargetDesc includes headers from Utils and Utils includes headers
from MCTargetDesc. So from a library layering perspective it makes sense
for them to be in the same library. I guess the other option might be to
move the tablegen includes from RISCVMCTargetDesc.h to RISCVBaseInfo.h
so that RISCVBaseInfo.h didn't need to include RISCVMCTargetDesc.h.
Everything else that depends on Utils also depends on MCTargetDesc so
having one library seemed simpler.

Differential Revision: https://reviews.llvm.org/D93168
2021-01-14 11:47:30 -08:00
Sam Elliott 7c9c2a2ea5 Revert "[RISCV] Legalize select when Zbt extension available"
We found issues with this patch in additional testing. Backing out while
we work on a fix.

This reverts commit 71ed4b6ce5.
2021-01-14 16:44:34 +00:00
Craig Topper dfc1901d51 [RISCV] Custom lower ISD::VSCALE.
This patch custom lowers ISD::VSCALE into a csrr vlenb followed
by a shift right by 3 followed by a multiply by the scale amount.

I've added computeKnownBits support to indicate that the csrr vlenb
always produces 3 trailng bits of 0s so the shift right is "exact".
This allows the shift and multiply sequence to be nicely optimized
into a single shift or removed completely when the scale amount is
a power of 2.

The non power of 2 case multiplying by 24 is still producing
suboptimal code. We could remove the right shift and use a
multiply by 3. Hopefully we can improve DAG combine to fix that
since it's not unique to this sequence.

This replaces D94144.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D94249
2021-01-13 17:14:49 -08:00
Michael Munday 71ed4b6ce5 [RISCV] Legalize select when Zbt extension available
The custom expansion of select operations in the RISC-V backend
interferes with the matching of cmov instructions. Legalizing
select when the Zbt extension is available solves that problem.

Reviewed By: lenary, craig.topper

Differential Revision: https://reviews.llvm.org/D93767
2021-01-12 21:24:38 +00:00
Fraser Cormack 37b41bd087 [RISCV] Add scalable vector fcmp ISel patterns
Original patch by @rogfer01.

All ordered comparisons except ONE are supported natively, and all
unordered comparisons except UNE are expanded into sequences involving
explicit NaN checks and mask arithmetic.

Additionally, we expand GT,OGT,GE,OGE to their swapped-operand versions, and
pattern-match those back to the "original", swapping operands once more. This
way we catch both operations and both "vf" and "fv" forms with fewer patterns.

Also add support for floating-point splat_vector, with an optimization for
splatting fpimm0.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94242
2021-01-11 19:38:56 +00:00
Craig Topper 5cf73dca77 [RISCV] Convert most of the information about RVV Pseudos into bits in TSFlags.
This patch moves all but the BaseInstr to bits in TSFlags.

For the index fields, we can just use a bit to indicate their presence.
The locations of the operands are well defined.

This reduces the llc binary by about 32K on my build. It also
removes the binary search of the table from the custom inserter.
Instead we just check that the SEW op is present.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D94375
2021-01-10 19:15:45 -08:00
Ben Shi 55f0a1b066 [RISCV] Optimize multiplication with constant
1. Break MUL with specific constant to a SLLI and an ADD/SUB on riscv32
   with the M extension.
2. Break MUL with specific constant to two SLLI and an ADD/SUB, if the
   constant needs a pair of LUI/ADDI to construct.

Reviewed by: craig.topper

Differential Revision: https://reviews.llvm.org/D93619
2021-01-09 10:37:21 +08:00
Craig Topper c68faed041 [RISCV] Return a vXi1 vector type from getSetCCResultType if V extension is enabled.
nvxXi1 types are legal with V extension and that's the result
vmseq/vmsne/vmslt/etc instructions return.

No test cases yet because the setcc isel patterns aren't in
and we'll need more than basic tests to observe this. I locally
tested that this plus D947078, D94168, D94142, and D94149
was enough to be able to handle the overflow result from
llvm.sadd.overflow.
2021-01-06 11:50:15 -08:00
Fraser Cormack 1d4411e9ea [RISCV] Add vector integer min/max ISel patterns
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94012
2021-01-05 09:15:50 +00:00
Craig Topper 79cbb003c5 [RISCV] Don't use tail agnostic policy on instructions where destination is tied to source
If the destination is tied, then user has some control of the
register used for input. They would have the ability to control
the value of any tail elements. By using tail agnostic we take
this option away from them.

Its not clear that the intrinsics are defined such that this isn't
supposed to work. And undisturbed is a valid implementation for agnostic
so code wouldn't even fail to work on all systems if we always used
agnostic.

The vcompress intrinsic is defined to require tail undisturbed so
at minimum we need this for that instruction or need to redefine
the intrinsic.

I've made an exception here for vmv.s.x/fmv.s.f and reduction
instructions which only write to element 0 regardless of the tail
policy. This allows us to keep the agnostic policy on those which
should allow better redundant vsetvli removal.

An enhancement would be to check for undef input and keep the
agnostic policy, but we don't have good test coverage for that yet.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D93878
2020-12-29 10:37:58 -08:00
Fraser Cormack aebb4a6052 [RISCV] Rewrite and simplify helper function. NFC.
Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D93851
2020-12-29 11:29:44 +00:00
Kazu Hirata d6ff5cf995 [Target] Use llvm::any_of (NFC) 2020-12-24 19:43:26 -08:00
Fraser Cormack 1a7ac29a89 [RISCV] Add ISel support for RVV vector/scalar forms
This patch extends the SDNode ISel support for RVV from only the
vector/vector instructions to include the vector/scalar and
vector/immediate forms.

It uses splat_vector to carry the scalar in each case, except when
XLEN<SEW (RV32 SEW=64) when a custom node `SPLAT_VECTOR_I64` is used for
type-legalization and to encode the fact that the value is sign-extended
to SEW. When the scalar is a full 64-bit value we use a sequence to
materialize the constant into the vector register.

The non-intrinsic ISel patterns have also been split into their own
file.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93312
2020-12-23 20:16:18 +00:00
Nandor Licker 0586f048d7 [RISCV] Basic jump table lowering
This patch enables jump table lowering in the RISC-V backend.

In addition to the test case included, the new lowering was
tested by compiling the OCaml runtime and running it under qemu.

Differential Revision: https://reviews.llvm.org/D92097
2020-12-22 15:05:54 +00:00
Craig Topper 09468a9148 [RISCV] Sign extend constant arguments to V intrinsics when promoting to XLen.
The default behavior for any_extend of a constant is to zero extend.
This occurs inside of getNode rather than allowing type legalization
to promote the constant which would sign extend. By using sign extend
with getNode the constant will be sign extended. This gives a better
chance for isel to find a simm5 immediate since all xlen bits are
examined there.

For instructions that use a uimm5 immediate, this change only affects
constants >= 128 for i8 or >= 32768 for i16. Constants that large
already wouldn't have been eligible for uimm5 and would need to use a
scalar register.

If the instruction isn't able to use simm5 or the immediate is
too large, we'll need to materialize the immediate in a register.
As far as I know constants with all 1s in the upper bits should
materialize as well or better than all 0s.

Longer term we should probably have a SEW aware PatFrag to ignore
the bits above SEW before checking simm5.

I updated about half the test cases in some tests to use a negative
constant to get coverage for this.

Reviewed By: evandro

Differential Revision: https://reviews.llvm.org/D93487
2020-12-18 11:43:38 -08:00
Craig Topper 86d282baed [RISCV] Add intrinsics for vmv.x.s and vmv.s.x
This adds intrinsics for vmv.x.s and vmv.s.x.

I've used stricter type constraints on these intrinsics than what we've been doing on the arithmetic intrinsics so far. This will allow us to not need to pass the scalar type to the Intrinsic::getDeclaration call when creating these intrinsics.

A custom ISD is used for vmv.x.s in order to implement the change in computeNumSignBitsForTargetNode which can remove sign extends on the result.

I also modified the MC layer description of these instructions to show the tied source/dest operand. This is different than what we do for masked instructions where we drop the tied source operand when converting to MC. But it is a more accurate description of the instruction. We can't do this for masked instructions since we use the same MC instruction for masked and unmasked. Tools like llvm-mca operate in the MC layer and rely on ins/outs and Uses/Defs for analysis so I don't know if we'll be able to maintain the current behavior for masked instructions. So I went with the accurate description here since it was easy.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D93365
2020-12-18 10:30:48 -08:00
Monk Chiang ee2cb90e3b [RISCV] Define vsadd/vsaddu/vssub/vssubu intrinsics.
We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>
Co-Authored-by: Monk Chiang <monk.chiang@sifive.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93366
2020-12-18 10:24:24 +08:00
Hsiangkai Wang f03609b5c7 [RISCV] V does not imply F.
If users want to use vector floating point instructions, they need to
specify 'F' extension additionally.

Differential Revision: https://reviews.llvm.org/D93282
2020-12-17 10:57:36 +08:00
Craig Topper 028efac2d7 [RISCV] Only custom legalize i32 arguments to vector intrinsics on RV64. 2020-12-15 13:54:41 -08:00
Hsiangkai Wang 14a91d676b [RISCV][NFC] Define scalable vectors for half types.
This is a preperation work for vfadd intrinsics.

Differential Revision: https://reviews.llvm.org/D93275
2020-12-15 16:23:22 +08:00
Hsiangkai Wang a6805a0e02 [RISCV] Define vadd/vsub/vrsub intrinsics and lower to V instructions.
This patch is based on the proposal from Roger Ferrer Ibanez.
http://lists.llvm.org/pipermail/llvm-dev/2020-October/145850.html

Differential Revision: https://reviews.llvm.org/D93013
2020-12-15 12:56:49 +08:00
Craig Topper b90e2d850e [RISCV] Use tail agnostic policy for vsetvli instruction emitted in the custom inserter
The compiler is making no effort to preserve upper elements. To do so would require another source operand tied with the destination and a different intrinsic interface to give control of this source to the programmer.

This patch changes the tail policy to agnostic so that the CPU doesn't need to make an effort to preserve them.

This is consistent with the RVV intrinsic spec here https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#configuration-setting

Differential Revision: https://reviews.llvm.org/D93080
2020-12-10 19:48:03 -08:00
Craig Topper a1ae3c6ac9 [RISCV][LegalizeDAG] Expand SETO and SETUO comparisons. Teach LegalizeDAG to expand SETUO expansion when UNE isn't legal.
If SETUNE isn't legal, UO can use the NOT of the SETO expansion.

Removes some complex isel patterns. Most of the test changes are
from using XORI instead of SEQZ.

Differential Revision: https://reviews.llvm.org/D92008
2020-12-10 09:15:52 -08:00
Fraser Cormack af5fd65895 [RISCV] Fix missing def operand when creating VSETVLI pseudos
The register operand was not being marked as a def when it should be. No tests
for this in the main branch as there are not yet any pseudos without a
non-negative VLIndex.

Also change the type of a virtual register operand from unsigned to Register
and adjust formatting.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D92823
2020-12-09 09:35:28 +00:00
Craig Topper 846f576bea [RISCV] Add a table showing the layout of the fields in VTYPE. Rename MaskedOffAgnostic->MaskAgnostic. NFC 2020-12-08 20:41:57 -08:00
Craig Topper a64998be99 [RISCV] Share VTYPE encoding code between the assembler and the CustomInserter for adding VSETVLI before vector instructions
This merges the SEW and LMUL enums that each used into singles enums in RISCVBaseInfo.h. The patch also adds a new encoding helper to take SEW, LMUL, tail agnostic, mask agnostic and turn it into a vtype immediate.

I also stopped storing the Encoding in the VTYPE operand in the assembler. It is easy to calculate when adding the operand which should only happen once per instruction.

Differential Revision: https://reviews.llvm.org/D92813
2020-12-08 16:04:20 -08:00
Craig Topper 5c819eb389 [RISCV] Form GORCI from (or (rotl/rotr X, Bitwidth/2), X).
A rotate by half the bitwidth swaps the bottom and top half which is the same as one of the MSB GREVI stage.

We have to do this as a special combine because we prefer to keep (rotl/rotr X, BitWidth/2) as a rotate rather than a single stage GREVI.

Differential Revision: https://reviews.llvm.org/D92286
2020-12-07 10:28:04 -08:00
Craig Topper 5baef6353e [RISCV] Initial infrastructure for code generation of the RISC-V V-extension
The companion RFC (http://lists.llvm.org/pipermail/llvm-dev/2020-October/145850.html) gives lots of details on the overall strategy, but we summarize it here:

LLVM IR involving vector types is going to be selected using pseudo instructions (only MachineInstr). These pseudo instructions contain dummy operands to represent the vector type being operated and the vector length for the operation.
These two dummy operands, as set by instruction selection, will be used by the custom inserter to prepend every operation with an appropriate vsetvli instruction that ensures the vector architecture is properly configured for the operation. Not in this patch: later passes will remove the redundant vsetvli instructions.
Register classes of tuples of vector registers are used to represent vector register groups (LMUL > 1).
Those pseudos are eventually lowered into the actual instructions when emitting the MCInsts.
About the patch:

Because there is a bit of initial infrastructure required, this is the minimal patch that allows us to select instructions for 3 LLVM IR instructions: load, add and store vectors of integers. LLVM IR operations have "whole-vector" semantics (as in they generate values for all the elements).

Later patches will extend the information represented in TableGen.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Evandro Menezes <evandro.menezes@sifive.com>
Co-Authored-by: Craig Topper <craig.topper@sifive.com>

Differential Revision: https://reviews.llvm.org/D89449
2020-12-04 11:39:30 -08:00
Craig Topper 3fcdf9ca78 [RISCV] Rename FPCCToExtend->FPOpToExpand and FPOpToExtend->FPOpToExpand. NFC
These are used to call setOperationAction/setCondCodeAction with
the Expand action so it seems that Expand is a better name than
Extend.
2020-12-03 16:00:49 -08:00
Craig Topper a18d5e3e9f [RISCV] Merge FMV_H_X_RV32/FMV_H_X_RV64 into a single opcode. Same with FMV_X_ANYEXTH_RV32/RV64
Rather than having a different opcode for RV32 and RV64. Let's just say the integer type is XLenVT and use a single opcode for both modes.

Differential Revision: https://reviews.llvm.org/D92538
2020-12-03 11:12:40 -08:00
Craig Topper e52a91e156 [RISCV] Add f16 to isFMAFasterThanFMulAndFAdd now that the Zfh extension is supported 2020-12-02 20:31:43 -08:00
Hsiangkai Wang f7bc7c2981 [RISCV] Support Zfh half-precision floating-point extension.
Support "Zfh" extension according to
https://github.com/riscv/riscv-isa-manual/blob/zfh/src/zfh.tex

Differential Revision: https://reviews.llvm.org/D90738
2020-12-03 09:16:33 +08:00
Craig Topper bfc4f29f46 [RISCV] Combine (GORCI (GORCI x, C2), C1) -> (GORCI x, C1|C2).
Unlike GREVI, GORCI stages can't be undone, but they are
redundant if done more than once.

Differential Revision: https://reviews.llvm.org/D92295
2020-11-30 08:42:46 -08:00
Craig Topper 76d1026b59 [RISCV] Custom legalize bswap/bitreverse to GREVI with Zbp extension to enable them to combine with other GREVI instructions
This enables bswap/bitreverse to combine with other GREVI patterns or each other without needing to add more special cases to the DAG combine or new DAG combines.

I've also enabled the existing GREVI combine for GREVIW so that it can pick up the i32 bswap/bitreverse on RV64 after they've been type legalized to GREVIW.

Differential Revision: https://reviews.llvm.org/D92253
2020-11-30 08:30:40 -08:00
Craig Topper cbbd7021f1 [RISCV] Only combine (or (GREVI x, shamt), x) -> GORCI if shamt is a power of 2.
GORCI performs an OR between each stage. So we need to ensure only
one stage is active before doing this combine.

Initial attempts at finding a test case for this failed due to
the order things get combined. It's most likely that we'll form
one stage of GREVI then combine to GORCI before the two stages of
GREVI are able to be formed and combined with each other to form
a multi stage GREVI.

Differential Revision: https://reviews.llvm.org/D92289
2020-11-30 08:10:39 -08:00
Craig Topper 8709d9d872 [RISCV] Replace getSimpleValueType() with getValueType() in DAG combines to prevent asserts with weird types. 2020-11-27 12:49:12 -08:00
Craig Topper ed95cafbc5 [RISCV] Add an implementation of isFMAFasterThanFMulAndFAdd
Start with an assumption that FMA is faster than Fmul+FAdd. If thats not true
on some particular implementation we can add a tuning parameter in the future.

I've update the fmuladd test cases and added new test cases for fast math flag
based contraction.

Differential Revision: https://reviews.llvm.org/D91987
2020-11-25 15:07:34 -08:00
Craig Topper 751b0d970e [RISCV] Make SMIN/SMAX/UMIN/UMAX legal with Zbb extension.
This is the logically correct thing to do. But it generates worse
code for i32 umin/umax on the rv64 due to type legalize requesting
zext even though the arguments are sext. Maybe we can teach type
legalizer to use sext for umin/umax for RISCV.

It's also producing possibly worse code on i64 on RV32 since we
still end up with selects that become branches. But this seems
like something we could improve in type legalization or DAG combine.

Hopefully this makes D92095 work for RISCV with Zbb.
2020-11-25 12:48:43 -08:00
Craig Topper c26e8697d7 [RISCV] Custom type legalize i32 fshl/fshr on RV64 with Zbt.
This adds custom opcodes for FSLW/FSRW so we can type legalize
fshl/fshr without needing to match a sign_extend_inreg.

I've used the operand order from fshl/fshr to make the isel
pattern similar to the non-W form. It was also hard to decide
another order since the register instruction has the shift amount
as the second operand, but the immediate instruction has it as
the third operand.

Differential Revision: https://reviews.llvm.org/D91479
2020-11-25 10:01:47 -08:00
Luís Marques a8dc2110cd [RISCV] Add GHC calling convention
This is a special calling convention to be used by the GHC compiler.

Patch by Andreas Schwab (schwab)

Differential Revision: https://reviews.llvm.org/D89788
2020-11-24 22:35:23 +00:00
Luís Marques e4d9380245 Revert "[RISCV] Add GHC calling convention"
This reverts commit f8317bb256 due to lack of
proper attribution.
2020-11-24 22:34:20 +00:00
Luís Marques f8317bb256 [RISCV] Add GHC calling convention
This is a special calling convention to be used by the GHC compiler.

Differential Revision: https://reviews.llvm.org/D89788
2020-11-24 21:56:28 +00:00
Fraser Cormack ca1f2f2716 [RISCV] Combine GREVI sequences
This combine step performs the following type of transformation:

    rev.p a0, a0   # grevi a0, a0, 0b01
    rev2.n a0, a0  # grevi a0, a0, 0b10
    -->
    rev.n a0, a0   # grevi a0, a0, 0b11

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D91877
2020-11-24 12:07:13 +00:00
Craig Topper 84b8222705 [RISCV] Use separate Lo and Hi MemOperands when expanding BuildPairF64Pseudo and SplitF64Pseudo.
We generate two 4 byte loads or two stores as part of the expansion.
Previously the MemOperand was set the same for both to cover the
full 8 bytes. Now we set a separate 4 byte mem operand for each
with a 4 byte offset for the high part.
2020-11-22 00:46:12 -08:00
Craig Topper 6a1d8b91ed [RISCV] Custom type legalize i32 bswap/bitreverse to GREVIW on RV64 with Zbp extension
Previously we required a sra to pattern match these properly in isel. If the consumer didn't need the result sign extended we'll have an srl instead of sra and fail to match.

This patch switches to custom legalizing to GREVIW using portions of D91259.

Differential Revision: https://reviews.llvm.org/D91457
2020-11-20 10:41:01 -08:00
Craig Topper 78767b7f8e [RISCV] Add RISCVISD::ROLW/RORW use those for custom legalizing i32 rotl/rotr on RV64IZbb.
This should result in better utilization of RORIW since we
don't need to look for a SIGN_EXTEND_INREG that may not exist.

Also remove rotl/rotr isel matching to GREVI and just prefer RORI.
This is to keep consistency so we don't have to match ROLW/RORW
to GREVIW as well. I imagine RORI/RORIW performance will be the
same or better than GREVI.

Differential Revision: https://reviews.llvm.org/D91449
2020-11-20 10:25:47 -08:00
Fraser Cormack 1ac9b54831 [RISCV] Lower GREVI and GORCI as custom nodes
This moves the recognition of GREVI and GORCI from TableGen patterns
into a DAGCombine. This is done primarily to match "deeper" patterns in
the future, like (grevi (grevi x, 1) 2) -> (grevi x, 3).

TableGen is not best suited to matching patterns such as these as the compile
time of the DAG matchers quickly gets out of hand due to the expansion of
commutative permutations.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D91259
2020-11-19 18:11:42 +00:00
Fraser Cormack fe9dc2e54a [RISCV] Use a macro to simplify getTargetNodeName
Similar to the X86 and AMDGPU targets, this uses a macro to cut down on
repetitive and error-prone code when converting RISCVISD node names to
strings in getTargetNodeName.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D91414
2020-11-16 09:33:47 +00:00
Craig Topper 637f19c36b [RISCV] Remove traces of Glue from RISCVISD::SELECT_CC
We were creating RISCVISD::SELECT_CC nodes with Glue output that was never being used, and the tablegen SDNode had the SDNPInGlue flag instead of the SDNPOutGlue flag.

Since we don't seem to need the Glue just get rid of it from both places.

Differential Revision: https://reviews.llvm.org/D91199
2020-11-11 09:30:48 -08:00
Craig Topper 5d3fd3df94 [RISCV] Make ctlz/cttz cheap to speculatively execute so CodeGenPrepare won't insert a zero check.
Add additional isel patterns for ctzw/clzw instructions.

Differential Revision: https://reviews.llvm.org/D91040
2020-11-09 10:13:45 -08:00
Craig Topper 4265cbaa34 [RISCV] Make SIGN_EXTEND_INREG from i8/i16 legal when Zbb extension is enabled.
This produces better code for sign extend to i64 on RV32 target.

Differential Revision: https://reviews.llvm.org/D91023
2020-11-09 10:13:45 -08:00
Craig Topper ce5f4f22e9 [RISCV] Use the 'si' lib call for (double (fp_to_sint/uint i32 X)) when F extension is enabled.
D80526 added custom lowering to pick the si lib call on RV64, but this custom handling is only enabled when the F and D extension are both disabled. This prevents the si library call from being used for double when F is enabled but D is not.

This patch changes the behavior so we always enable the Custom hook on RV64 and decide in ReplaceNodeResults if we should emit a libcall based on whether the FP type should be softened or not.

Differential Revision: https://reviews.llvm.org/D90817
2020-11-05 10:46:45 -08:00
Craig Topper ce1270fc7e [RISCV] Remove shadow register list passed to AllocateReg when allocating FP registers for calling convention
The _F and _D registers are already sub/super registers. When one gets allocated all its aliases are already marked as allocated. We don't need to explicitly shadow it too.

I believe shadow is for calling conventions like 64-bit Windows on X86 where have rules like this

CCIfType<[i32], CCAssignToRegWithShadow<[ECX , EDX , R8D , R9D ],
                                         [XMM0, XMM1, XMM2, XMM3]>>

For that calling convention the argument number determines which register is used regardless of how many scalars or vectors came before it.

Removing this removes a question I had in D90738.

Differential Revision: https://reviews.llvm.org/D90801
2020-11-05 09:49:42 -08:00
Simon Pilgrim 36920d5f9d [RISCV] Avoid std::pair<> in FPReg StringSwitch to avoid MSVC compile failures. NFCI.
As discussed on D90322, some MSVC builds are failing with is_trivially_copyable static asserts (see D86126) - we can avoid this by not using the std::pair<unsigned,unsigned> which held both the FP+DP Registers, just handle the FP register and convert to DP on the fly.
2020-11-02 11:30:57 +00:00
Craig Topper a76cd10fcd [RISCV] Use 'unsigned' instead of Register in getRegForInlineAsmConstraint. NFC
The return value of this interface still uses an 'unsigned' on all
targets. So we convert Register back to unsigned at the end.

I'm hoping this will prevent the issue that caused the revert of
D90322.
2020-11-01 10:16:52 -08:00
Craig Topper 6915c76e10 [RISCV] Don't use DCI.CombineTo to replace a single result. NFCI
Just return the new node, which is the standard practice.

I also noticed what appeared to be an unnecessary attempt at
creating an ANY_EXTEND where the type should already be correct.
I replace with an assert to verify the type.

Differential Revision: https://reviews.llvm.org/D90444
2020-10-30 10:46:32 -07:00
Craig Topper 74b078294f [RISCV] Improve worklist management in the DAG combine for SLLW/SRLW/SRAW
This combine makes two calls to SimplifyDemandedBits, one for the LHS and one
for the RHS. If the LHS call returns true, we don't make the RHS call. When
SimplifyDemandedBits makes a change, it will add the nodes around the change to
the DAG combiner worklist. If the simplification happens on the first recursion
step, the N will get added to the worklist. But if the simplification happens
deeper in the recursion, then N will not be revisited until the next time the
DAG combiner runs.

This patch explicitly addes N to the worklist anytime a Simplification is made.
Without this we might miss additional simplifications on the LHS or never
simplify the RHS. Special care also needs to be taken to not add N if it has
been CSEd by the simplification. There are similar examples in DAGCombiner and
the X86 target, but I don't have a test for it for RISC-V. I've also returned
SDValue(N, 0) instead of SDValue() so DAGCombiner knows a change was made and
will update its Statistic variable.

The test here was constructed so that 2 simplifications happen to the LHS.
Without this fix one happens in the post type legalization DAG combine and the
other happens after LegalizeDAG. This prevents the RHS from ever being
simplified causing the left and right shift to clear the upper 32 bits of the
RHS to be left behind.

Differential Revision: https://reviews.llvm.org/D90339
2020-10-29 14:52:53 -07:00
StephenFan a96921afa7 [RISCV] eliminate the repetition declare of SDLoc DL
Differential revision: https://reviews.llvm.org/D85002
2020-08-03 10:24:30 +08:00