Optimize eliminate FP mechanism. This time optimize a function which has
no call but fixed stack objects. LLVM eliminates FP on such functions now.
Also, optimize GOT/PLT registers save/restore instructions if a given
function doesn't uses them. In addition, remove generating mechanism of
`.cfi` instructions since those are taken from other architectures and not
inspected yet. Update regression tests, also.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D92251
Change the way to truncate i64 to i32 in I64 registers. VE assumed
sext values previously. Change it to zext values this time to make
it match to the LLVM behaviour.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D92226
Optimize emitSPAdjustment function to generate as small as possible
instructions to adjust SP.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D92174
VE Vector Predicated (VVP) SDNodes form an intermediate layer between VE
vector instructions and the initial SDNodes.
We introduce 'vvp_add' with isel and tests as the first of these VVP
nodes. VVP nodes have a mask and explicit vector length operand, which
we will make proper use of later.
Reviewed By: kaz7
Differential Revision: https://reviews.llvm.org/D91802
A getAdjustedFrameSize function may need to handle larger than 32 bits
integer, so change int to uint64_t.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D91862
Implement getMinimumJumpTableEntries() to specify threshold for jump
table genaration. We use 8 for the case of PIC mode to relieve the
impact of PIC calculation required to implement PIC mode jump table.
Update jump table regression test also.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D91785
This defines the vec_broadcast SDNode along with lowering and isel code.
We also remove unused type mappings for the vector register classes (all vector MVTs that are not used in the ISA go).
We will implement support for short vectors later by intercepting nodes with illegal vector EVTs before LLVM has had a chance to widen them.
Reviewed By: kaz7
Differential Revision: https://reviews.llvm.org/D91646
Implement JumpTable to make BRIND work on VE. Update an existing
br_jt regression test also.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D91582
Add lvm/svm intrinsic instructions and a regression test. Change
RegisterInfo to specify that VM0/VMP0 are constant and reserved
registers. This modifies a vst regression test, so update it.
Also add pseudo instructions for VM512 register classes
and mechanism to expand them after register allocation.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D91541
This defines a 'fastcc' for the VE target and implements vreg-to-vreg
copy for parameter passing. The 'fastcc' extends the standard CC for
SX-Aurora with register passing of vector-typed parameters and return
values.
Reviewed By: kaz7
Differential Revision: https://reviews.llvm.org/D90842
The VE backend represents vector instructions with an explicit 'i32'
vector length operand. In the VE ISA, the vector length is always read
from the VL hardware register. The LVLGen pass inserts 'lvl'
instructions as necessary to set VL to the right value before each
vector instruction.
Reviewed By: kaz7
Differential Revision: https://reviews.llvm.org/D91416
Change the default type of v64 register class from v512i32 to v256f64.
Add a regression test also.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D91301
Optimize address calculations using LEA/LEASL instructions.
Update comments in VEISelLowering.cpp also. Update an
existing regression test optimized by this modification.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D90878
`Replace ISD::SREM handling with KnownBits::srem to reduce code
duplication` (bf04e34383) changed
the result of rem.ll regression test. So, updating it.
`+vpu` controls whether VEISelLowering adds any vregs. This defaults to
`-vpu` to have scalar code generation out of the box. We bring up
vector isel under the `+vpu` flag. Once vector isel is stable we switch
to `+vpu` and advertise vregs and vops in TTI.
Reviewed By: kaz7
Differential Revision: https://reviews.llvm.org/D90465
BRIND and BR_JT are not implmented yet, so expand them atm.
Add regression tests too.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D90283
Support atomic load instruction and add a regression test.
VE uses release consitency, so need to insert fence around
atomic instructions. This patch enable AtomicExpandPass
and use emitLeadingFence and emitTrailingFence mechanism
for such purpose.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D90135
Support atomic fence instruction and add a regression test.
Add MEMBARRIER pseudo insturction also to use it as a barrier
against to the compiler optimizations.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D90112
Add setcc for fp128 and clean existing ISel patterns. Also add
a regression test.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D89683
Add missing ISel patterns related to select_cc DAG nodes.
Add regression test of all combination of possible scalar types.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D89672
Support br_cc instruction comparing fp128 values. Add a br_cc.ll
regression test for all kind of br_cc instructions. And, clean
existing branch regression tests, this time. Clean a brcond.ll
regression test for brcond instruction. Remove mixed branch1.ll
regression test.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D89627
Add an ISel pattern for fp128 select instruction and optimize generated
code for other types' select. instructions. Add a regression test also.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D89509
VE doesn't have SHL_PARTS/SRA_PARTS/SRL_PARTS instructions, so need
to expand them. Add regression tests too.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D89396
VE doesn't have instruction for copysign, so expand it. Add a
regression test also.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D89228
VE doesn't have fneg or frem instruction, so change them to expand. Add
regression tests also.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D89205
VE doesn't have BRCOND instruction, so need to expand it. Also add
a regression test.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D89173
Support register and frame-index pair correctly as operands of
generic load/store instrucitons, e.g. LD1BZXrri, STLrri, and etc.
Add regression tests also.
Differential Revision: https://reviews.llvm.org/D88779
Support f128 using VE instructions. Update regression tests.
I've noticed there is no load or store i128 test, so I add them too.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D86035
VE has only 64 bits AND/OR/XOR instructions. We pretended that VE has 32 bits
instructions also, but doing it increase the number of generated instructions.
Therefore, we decide to promote 32 bits operations and use only 64 bits
instructions in back end. We also avoid pretending that VE has 32 bits LEA
instruction. Update regression tests also.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D85726
Change bitreverse/bswap/ctlz/ctpop/cttz regression tests to support i128
and signext/zeroext i32 types. This patch also change the way to support
i32 types using 64 bits VE instructions.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D85712
Change to expand MULHU/MULHS/UMUL_LOHI/SMUL_LOHI for i32 and i64 since
those instructions are not available on Aurora SX VE. Some of them
are used in expansion of i128 multiply, so need to modify them to
support i128. Then, update basic arithmetic regression tests of
i128 and signed/unsigned i32 typed integer values.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D85490