llvm-project

Commit Graph


Author

SHA1

Message

Date

Simon Tatham

9e37892773

[ARM,MVE] Add intrinsics for vector get/set lane.

This adds the `vgetq_lane` and `vsetq_lane` families, to copy between
a scalar and a specified lane of a vector.
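
As a rough illustration of what "copy between a scalar and a specified lane" means, here is a portable C++ model. It does not use `arm_mve.h`; the names `Int32x4`, `get_lane` and `set_lane` are hypothetical stand-ins, and the lane index is a template parameter only to mimic a compile-time immediate.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 128-bit vector of four 32-bit lanes.
using Int32x4 = std::array<std::int32_t, 4>;

// Models a "get lane" operation: read one lane out as a scalar.
// The lane index is a compile-time constant, like an immediate operand.
template <std::size_t Lane>
std::int32_t get_lane(const Int32x4 &v) {
    static_assert(Lane < 4, "lane index out of range");
    return v[Lane];
}

// Models a "set lane" operation: return a copy of the vector with one
// lane replaced by the scalar; the input vector is left unchanged.
template <std::size_t Lane>
Int32x4 set_lane(std::int32_t value, Int32x4 v) {
    static_assert(Lane < 4, "lane index out of range");
    v[Lane] = value;
    return v;
}
```

For example, `set_lane<1>(99, v)` yields a new vector whose lane 1 is 99, and `get_lane<1>` on that result reads the 99 back.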

One of the new `vgetq_lane` intrinsics returns a `float16_t`, which
causes a compile error if `%clang_cc1` doesn't get the option
`-fallow-half-arguments-and-returns`. The driver passes that option to
cc1 already, but I've had to edit all the explicit cc1 command lines
in the existing MVE intrinsics tests.

A couple of fixes are included for the code I wrote up front in
MveEmitter to support lane-index immediates (and which nothing has
tested until now): the type was wrong (`uint32_t` instead of `int`)
and the range was off by one.
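
A minimal sketch of the kind of range check a lane-index immediate needs, assuming 128-bit vectors. The commit message does not say in which direction the bound was off, so this only shows the intended rule; `lane_count` and `is_valid_lane` are hypothetical names, not MveEmitter functions.

```cpp
#include <cassert>

// Assumption: vectors are 128 bits wide.
constexpr int kVectorBits = 128;

// Number of lanes for a given element width in bits.
constexpr int lane_count(int element_bits) {
    return kVectorBits / element_bits;
}

// A lane index is a signed int and is valid when
// 0 <= index < lane_count, so the largest valid index is
// lane_count - 1, not lane_count.
constexpr bool is_valid_lane(int element_bits, int index) {
    return index >= 0 && index < lane_count(element_bits);
}
```

With 16-bit elements there are eight lanes, so indices 0 through 7 are accepted and 8 is rejected.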

I've also added a method of bypassing the default promotion to `i32`
that is done by the MveEmitter code generation: it's sensible to
promote short scalars like `i16` to `i32` if they're going to be
passed to custom IR intrinsics representing a machine instruction
operating on GPRs, but not if they're going to be passed to standard
IR operations like `insertelement` which expect the exact type.

Reviewers: ostannard, MarkMurrayARM, dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D70188

2019-11-15 09:53:58 +00:00

Simon Tatham

a12f588ebb

[ARM,MVE] Add intrinsics for contiguous load/stores.

This patch adds the ACLE intrinsics for all the MVE load and store
instructions not already handled by D69791. These ones don't need new
IR intrinsics, because they can be implemented in terms of standard
LLVM IR constructions.

Some of the load and store instructions access less than 128 bits of
memory, sign/zero extending each value to a wider vector lane on load
or truncating it on store. These are represented in IR by a load of a
shorter vector followed by a zext/sext, and conversely, a trunc
followed by a short store. Existing ISel patterns already recognize
those combinations and turn them into the right MVE instructions.
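
The "short load then extend" and "truncate then short store" shapes can be modelled in plain C++ as below. This is a scalar sketch of the semantics only, not the generated IR; the function names and the choice of 8-bit memory elements with 16-bit lanes are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Widening load: read eight 8-bit values (64 bits of memory), then
// sign-extend each one into a 16-bit lane of a 128-bit vector.
inline std::array<std::int16_t, 8>
widening_load_s8_to_s16(const std::int8_t *src) {
    std::array<std::int8_t, 8> narrow{};
    for (std::size_t i = 0; i < narrow.size(); ++i)
        narrow[i] = src[i];                            // load of the shorter vector
    std::array<std::int16_t, 8> wide{};
    for (std::size_t i = 0; i < wide.size(); ++i)
        wide[i] = static_cast<std::int16_t>(narrow[i]); // sign extension
    return wide;
}

// Narrowing store: truncate each 16-bit lane to its low 8 bits, then
// store the resulting eight bytes (64 bits of memory).
inline void narrowing_store_s16_to_u8(std::uint8_t *dst,
                                      const std::array<std::int16_t, 8> &wide) {
    for (std::size_t i = 0; i < wide.size(); ++i)
        dst[i] = static_cast<std::uint8_t>(wide[i]);    // truncation
}
```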

The predicated forms of all these instructions are represented in the
same way, except that the ordinary load/store operation is replaced
with the existing intrinsics @llvm.masked.{load,store}. These are
currently only code-generated as predicated MVE load/store
instructions if you give LLVM the `-enable-arm-maskedldst` option; so
I've done that in the LLVM codegen test. When we make that the
default, that option can be removed.
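
For readers unfamiliar with masked memory operations, the following C++ sketch models their lane-wise behaviour: a disabled lane is neither read nor written, and a masked load fills disabled lanes from a pass-through vector. The function names and parameter order here are hypothetical and do not mirror the IR intrinsic signatures.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Masked load: enabled lanes are read from memory; disabled lanes keep
// the corresponding pass-through value and memory is not touched.
template <typename T, std::size_t N>
std::array<T, N> masked_load(const T *src,
                             const std::array<bool, N> &mask,
                             const std::array<T, N> &passthru) {
    std::array<T, N> result = passthru;
    for (std::size_t i = 0; i < N; ++i)
        if (mask[i])
            result[i] = src[i];
    return result;
}

// Masked store: only enabled lanes are written; memory behind
// disabled lanes is left as it was.
template <typename T, std::size_t N>
void masked_store(T *dst,
                  const std::array<T, N> &value,
                  const std::array<bool, N> &mask) {
    for (std::size_t i = 0; i < N; ++i)
        if (mask[i])
            dst[i] = value[i];
}
```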

In the Tablegen backend, I've had to add a handful of extra support
features:

* We need to be able to make clang::Address objects out of a
  pointer and an alignment (previously we only needed these when the
  user passed us an existing one).

* We can now specify vector types that aren't 128 bits wide (for use
  in those intermediate values in IR); the parametrized type system
  can make one starting from two existing vector types (using the lane
  count of one and the element type of the other).

* I've added support for code generation of pointer casts, and for
  specifying LLVM types as operands to IRBuilder operations (for zext
  and sext, though I think they'll come in useful again).

* Now not all IR construction operations need to be specified as
  Builder.CreateFoo; some don't involve a Builder at all, and one
  passes it as a parameter to a tiny static helper function in
  CGBuiltin.cpp.
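
The idea of building a vector type from the lane count of one type and the element type of another can be sketched with C++ templates. This is only an analogy for the Tablegen type system described in the list above; `Vec` and `MixedVec` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical description of a vector type: element type plus lane count.
template <typename Elem, std::size_t Lanes>
struct Vec {
    using element = Elem;
    static constexpr std::size_t lanes = Lanes;
    static constexpr std::size_t bits = sizeof(Elem) * 8 * Lanes;
};

// Derive a new vector type using the lane count of LaneSource and the
// element type of ElemSource. The result need not be 128 bits wide.
template <typename LaneSource, typename ElemSource>
using MixedVec = Vec<typename ElemSource::element, LaneSource::lanes>;
```

Taking the lane count of an 8 x 16-bit vector and the element type of a 16 x 8-bit vector gives an 8 x 8-bit vector of 64 bits, the kind of intermediate value a widening load would use.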

Reviewers: ostannard, MarkMurrayARM, dmgreen

Subscribers: kristof.beyls, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D70088

2019-11-13 12:47:00 +00:00

2 Commits