llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	77f14c96e5	[RISCV] Use stack temporary to splat two GPRs into SEW=64 vector on RV32. Rather than splatting each separately and doing bit manipulation to merge them in the vector domain, copy the data to the stack and splat it using a strided load with x0 stride. At least on some implementations this vector load is optimized to not do a load for each element. This is equivalent to how we move i64 to f64 on RV32. I've only implemented this for the intrinsic fallbacks in this patch. I think we do similar splatting/shifting/oring in other places. If this is approved, I'll refactor the others to share the code. Differential Revision: https://reviews.llvm.org/D101002	2021-04-22 09:50:07 -07:00
Craig Topper	5a9a8c7cd4	[RISCV] Add more nxvi64 vector intrinsic tests for RV32. NFC This confirms we handle most instructions gracefully. We do currently fail for vslide1up and vslide1down though.	2021-04-01 20:34:28 -07:00
Hsiangkai Wang	a2d19bad07	[RISCV] Use whole register load/store for generic load/store. In vector v0.10, there are whole vector register load/store instructions. I suggest to use the whole register load/store instructions for generic load/store for scalable vector types. It could save up vset{i}vl{i} for these load/store. For fractional LMUL, I keep to use vle{eew}.v/vse{eew}.v instructions to load/store partial vector registers. Differential Revision: https://reviews.llvm.org/D95853	2021-02-09 15:52:04 +08:00
Hsiangkai Wang	6e360460f1	[RISCV] Use v8-v23 as argument registers to conform to the proposal. The maximum LMUL is 8. We need 16 vector registers for two LMUL-8 arguments. The modification follows the proposal of psABI in https://github.com/riscv/riscv-elf-psabi-doc/pull/171 Differential Revision: https://reviews.llvm.org/D95134	2021-01-22 07:55:24 +08:00
Craig Topper	79cbb003c5	[RISCV] Don't use tail agnostic policy on instructions where destination is tied to source If the destination is tied, then user has some control of the register used for input. They would have the ability to control the value of any tail elements. By using tail agnostic we take this option away from them. It's not clear that the intrinsics are defined such that this isn't supposed to work. And undisturbed is a valid implementation for agnostic so code wouldn't even fail to work on all systems if we always used agnostic. The vcompress intrinsic is defined to require tail undisturbed so at minimum we need this for that instruction or need to redefine the intrinsic. I've made an exception here for vmv.s.x/fmv.s.f and reduction instructions which only write to element 0 regardless of the tail policy. This allows us to keep the agnostic policy on those which should allow better redundant vsetvli removal. An enhancement would be to check for undef input and keep the agnostic policy, but we don't have good test coverage for that yet. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D93878	2020-12-29 10:37:58 -08:00
Hsiangkai Wang	dd5281e7cc	[RISCV] Define vector mul/div/rem intrinsics. Define vector mul/div/rem intrinsics and lower them to V instructions. We work with @rogfer01 from BSC to come out this patch. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com> Differential Revision: https://reviews.llvm.org/D93380	2020-12-17 11:50:17 +08:00

6 Commits