[globalisel][docs] Rework GMIR documentation and add an early GenericOpcode reference
Summary:
Rework the GMIR documentation to focus more on the end user than the implementation and tie it in to the MIR document. There was also some out-of-date information which has been removed.

The quality of the GenericOpcode reference is highly variable and drops sharply as I worked through them all but we've got to start somewhere :-). It would be great if others could expand on this too as there is an awful lot to get through.

Also fix a typo in the definition of G_FLOG. Previously, the comments said we had two base-2's (G_FLOG and G_FLOG2).

Reviewers: aemerson, volkan, rovka, arsenm
Reviewed By: rovka
Subscribers: wdng, arphaman, jfb, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69545
This commit is contained in:
parent
041f35c468
commit
ad0dfb0a25
|
@ -3,38 +3,35 @@
|
|||
Generic Machine IR
|
||||
==================
|
||||
|
||||
Machine IR operates on physical registers, register classes, and (mostly)
|
||||
target-specific instructions.
|
||||
|
||||
To bridge the gap with LLVM IR, GlobalISel introduces "generic" extensions to
|
||||
Machine IR:
|
||||
|
||||
.. contents::
|
||||
:local:
|
||||
|
||||
``NOTE``:
|
||||
The generic MIR (GMIR) representation still contains references to IR
|
||||
constructs (such as ``GlobalValue``). Removing those should let us write more
|
||||
accurate tests, or delete IR after building the initial MIR. However, it is
|
||||
not part of the GlobalISel effort.
|
||||
Generic MIR (gMIR) is an intermediate representation that shares the same data
|
||||
structures as :doc:`MachineIR (MIR) <../MIRLangRef>` but has more relaxed
|
||||
constraints. As the compilation pipeline proceeds, these constraints are
|
||||
gradually tightened until gMIR has become MIR.
|
||||
|
||||
The rest of this document will assume that you are familiar with the concepts
|
||||
in :doc:`MachineIR (MIR) <../MIRLangRef>` and will highlight the differences
|
||||
between MIR and gMIR.
|
||||
|
||||
.. _gmir-instructions:
|
||||
|
||||
Generic Instructions
|
||||
--------------------
|
||||
Generic Machine Instructions
|
||||
----------------------------
|
||||
|
||||
The main addition is support for pre-isel generic machine instructions (e.g.,
|
||||
``G_ADD``). Like other target-independent instructions (e.g., ``COPY`` or
|
||||
``PHI``), these are available on all targets.
|
||||
.. note::
|
||||
|
||||
``TODO``:
|
||||
While we're progressively adding instructions, one kind in particular exposes
|
||||
interesting problems: compares and how to represent condition codes.
|
||||
Some targets (x86, ARM) have generic comparisons setting multiple flags,
|
||||
which are then used by predicated variants.
|
||||
Others (IR) specify the predicate in the comparison and users just get a single
|
||||
bit. SelectionDAG uses SETCC/CONDBR vs BR_CC (and similar for select) to
|
||||
represent this.
|
||||
This section expands on :ref:`mir-instructions` from the MIR Language
|
||||
Reference.
|
||||
|
||||
Whereas MIR deals largely in Target Instructions and only has a small set of
|
||||
target independent opcodes such as ``COPY``, ``PHI``, and ``REG_SEQUENCE``,
|
||||
gMIR defines a rich collection of ``Generic Opcodes`` which are target
|
||||
independent and describe operations which are typically supported by targets.
|
||||
One example is ``G_ADD`` which is the generic opcode for an integer addition.
|
||||
More information on each of the generic opcodes can be found at
|
||||
:doc:`GenericOpcode`.
|
||||
|
||||
The ``MachineIRBuilder`` class wraps the ``MachineInstrBuilder`` and provides
|
||||
a convenient way to create these generic instructions.
|
||||
|
@ -44,50 +41,109 @@ a convenient way to create these generic instructions.
|
|||
Generic Virtual Registers
|
||||
-------------------------
|
||||
|
||||
Generic instructions operate on a new kind of register: "generic" virtual
|
||||
registers. As opposed to non-generic vregs, they are not assigned a Register
|
||||
Class. Instead, generic vregs have a :ref:`gmir-llt`, and can be assigned
|
||||
a :ref:`gmir-regbank`.
|
||||
.. note::
|
||||
|
||||
``MachineRegisterInfo`` tracks the same information that it does for
|
||||
non-generic vregs (e.g., use-def chains). Additionally, it also tracks the
|
||||
:ref:`gmir-llt` of the register, and, instead of the ``TargetRegisterClass``,
|
||||
its :ref:`gmir-regbank`, if any.
|
||||
This section expands on :ref:`mir-registers` from the MIR Language
|
||||
Reference.
|
||||
|
||||
For simplicity, most generic instructions only accept generic vregs:
|
||||
Generic virtual registers are like virtual registers but they are not assigned a
|
||||
Register Class constraint. Instead, generic virtual registers have less strict
|
||||
constraints starting with a :ref:`gmir-llt` and then further constrained to a
|
||||
:ref:`gmir-regbank`. Eventually they will be constrained to a register class
|
||||
at which point they become normal virtual registers.
|
||||
|
||||
* instead of immediates, they use a gvreg defined by an instruction
|
||||
materializing the immediate value (see :ref:`irtranslator-constants`).
|
||||
* instead of physical register, they use a gvreg defined by a ``COPY``.
|
||||
Generic virtual registers can be used with all the virtual register APIs
|
||||
provided by ``MachineRegisterInfo``. In particular, the def-use chain APIs can
|
||||
be used without needing to distinguish them from non-generic virtual registers.
|
||||
|
||||
``NOTE``:
|
||||
We started with an alternative representation, where MRI tracks a size for
|
||||
each gvreg, and instructions have lists of types.
|
||||
That had two flaws: the type and size are redundant, and there was no generic
|
||||
way of getting a given operand's type (as there was no 1:1 mapping between
|
||||
instruction types and operands).
|
||||
We considered putting the type in some variant of MCInstrDesc instead:
|
||||
See `PR26576 <http://llvm.org/PR26576>`_: [GlobalISel] Generic MachineInstrs
|
||||
need a type but this increases the memory footprint of the related objects
|
||||
For simplicity, most generic instructions only accept virtual registers (both
|
||||
generic and non-generic). There are some exceptions to this but in general:
|
||||
|
||||
* instead of immediates, they use a generic virtual register defined by an
|
||||
instruction that materializes the immediate value (see
|
||||
:ref:`irtranslator-constants`). Typically this is a G_CONSTANT or a
|
||||
G_FCONSTANT. One example of an exception to this rule is G_SEXT_INREG where
|
||||
having an immediate is mandatory.
|
||||
* instead of physical registers, they use a generic virtual register that is
|
||||
either defined by a ``COPY`` from the physical register or used by a ``COPY``
|
||||
that defines the physical register.
|
||||
|
||||
.. admonition:: Historical Note
|
||||
|
||||
We started with an alternative representation, where MRI tracks a size for
|
||||
each generic virtual register, and instructions have lists of types.
|
||||
That had two flaws: the type and size are redundant, and there was no generic
|
||||
way of getting a given operand's type (as there was no 1:1 mapping between
|
||||
instruction types and operands).
|
||||
We considered putting the type in some variant of MCInstrDesc instead:
|
||||
See `PR26576 <http://llvm.org/PR26576>`_: [GlobalISel] Generic MachineInstrs
|
||||
need a type but this increases the memory footprint of the related objects
|
||||
|
||||
.. _gmir-regbank:
|
||||
|
||||
Register Bank
|
||||
-------------
|
||||
|
||||
A Register Bank is a set of register classes defined by the target.
|
||||
A bank has a size, which is the maximum store size of all covered classes.
|
||||
A Register Bank is a set of register classes defined by the target. This
|
||||
definition is rather loose so let's talk about what they can achieve.
|
||||
|
||||
In general, cross-class copies inside a bank are expected to be cheaper than
|
||||
copies across banks. They are also coalesceable by the register coalescer,
|
||||
whereas cross-bank copies are not.
|
||||
Suppose we have a processor that has two register files, A and B. These are
|
||||
equal in every way and support the same instructions for the same cost. They're
|
||||
just physically stored apart and each instruction can only access registers from
|
||||
A or from B but never a mix of the two. If we want to perform an operation
|
||||
on data that's split between the two register files, we must first copy all
|
||||
the data into a single register file.
|
||||
|
||||
Also, equivalent operations can be performed on different banks using different
|
||||
instructions.
|
||||
Given a processor like this, we would benefit from clustering related data
|
||||
together into one register file so that we minimize the cost of copying data
|
||||
back and forth to satisfy the (possibly conflicting) requirements of all the
|
||||
instructions. Register Banks are a means to constrain the register allocator to
|
||||
use a particular register file for a virtual register.
|
||||
|
||||
For example, X86 can be seen as having 3 main banks: general-purpose, x87, and
|
||||
vector (which could be further split into a bank per domain for single vs
|
||||
double precision instructions).
|
||||
In practice, register files A and B are rarely equal. They can typically store
|
||||
the same data but there are usually some restrictions on what operations you can
|
||||
do on each register file. A fairly common pattern is for one of them to be
|
||||
accessible to integer operations and the other accessible to floating point
|
||||
operations. To accommodate this, let's rename A and B to GPR (general purpose
|
||||
registers) and FPR (floating point registers).
|
||||
|
||||
We now have some additional constraints that limit us. An operation like G_FMUL
|
||||
has to happen in FPR and G_ADD has to happen in GPR. However, even though this
|
||||
prescribes a lot of the assignments we still have some freedom. A G_LOAD can
|
||||
happen in both GPR and FPR, and which we want depends on who is going to consume
|
||||
the loaded data. Similarly, G_FNEG can happen in both GPR and FPR. If we assign
|
||||
it to FPR, then we'll use floating point negation. However, if we assign it to
|
||||
GPR then we can equivalently G_XOR the sign bit with 1 to invert it.
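The GPR alternative for G_FNEG can be sketched outside the compiler. The following Python model is only an illustration, assuming an IEEE-754 single-precision value; the helper names are hypothetical and are not part of LLVM.

```python
import struct

SIGN_BIT_F32 = 0x80000000  # bit 31 holds the sign of an IEEE-754 binary32 value


def f32_to_bits(value: float) -> int:
    """Reinterpret a float as its 32-bit IEEE-754 encoding."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_to_f32(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE-754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def fneg_via_xor(value: float) -> float:
    """Negate by flipping only the sign bit, as an integer XOR on GPR would."""
    return bits_to_f32(f32_to_bits(value) ^ SIGN_BIT_F32)
```

Because only the sign bit changes, the result matches a floating point negation for ordinary values and for signed zeros alike.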
|
||||
|
||||
In summary, Register Banks are a means of disambiguating between seemingly
|
||||
equivalent choices based on some analysis of the differences when each choice
|
||||
is applied in a given context.
|
||||
|
||||
To give some concrete examples:
|
||||
|
||||
AArch64
|
||||
|
||||
AArch64 has three main banks. GPR for integer operations, FPR for floating
|
||||
point and also for the NEON vector instruction set. The third is CCR and
|
||||
describes the condition code register used for predication.
|
||||
|
||||
MIPS
|
||||
|
||||
MIPS has five main banks of which many programs only really use one or two.
|
||||
GPR is the general purpose bank for integer operations. FGR or CP1 is for
|
||||
the floating point operations as well as the MSA vector instructions and a
|
||||
few other application specific extensions. CP0 is for system registers and
|
||||
few programs will use it. CP2 and CP3 are for any application specific
|
||||
coprocessors that may be present in the chip. Arguably, there is also a sixth
|
||||
for the LO and HI registers but these are only used for the result of a few
|
||||
operations and it's of questionable value to model distinctly from GPR.
|
||||
|
||||
X86
|
||||
|
||||
X86 can be seen as having 3 main banks: general-purpose, x87, and
|
||||
vector (which could be further split into a bank per domain for single vs
|
||||
double precision instructions). It also looks like there are arguably a few
|
||||
more potential banks such as one for the AVX512 Mask Registers.
|
||||
|
||||
Register banks are described by a target-provided API,
|
||||
:ref:`RegisterBankInfo <api-registerbankinfo>`.
|
||||
|
@ -108,7 +164,6 @@ as size and number of vector lanes:
|
|||
* ``sN`` for scalars
|
||||
* ``pN`` for pointers
|
||||
* ``<N x sM>`` for vectors
|
||||
* ``unsized`` for labels, etc..
|
||||
|
||||
``LLT`` is intended to replace the usage of ``EVT`` in SelectionDAG.
|
||||
|
||||
|
@ -122,14 +177,13 @@ Here are some LLT examples and their ``EVT`` and ``Type`` equivalents:
|
|||
``s32`` ``i32`` ``i32``
|
||||
``s32`` ``f32`` ``float``
|
||||
``s17`` ``i17`` ``i17``
|
||||
``s16`` N/A ``{i8, i8}``
|
||||
``s32`` N/A ``[4 x i8]``
|
||||
``s16`` N/A ``{i8, i8}`` [#abi-dependent]_
|
||||
``s32`` N/A ``[4 x i8]`` [#abi-dependent]_
|
||||
``p0`` ``iPTR`` ``i8*``, ``i32*``, ``%opaque*``
|
||||
``p2`` ``iPTR`` ``i8 addrspace(2)*``
|
||||
``<4 x s32>`` ``v4f32`` ``<4 x float>``
|
||||
``s64`` ``v1f64`` ``<1 x double>``
|
||||
``<3 x s32>`` ``v3i32`` ``<3 x i32>``
|
||||
``unsized`` ``Other`` ``label``
|
||||
============= ========= ======================================
|
||||
|
||||
|
||||
|
@ -143,16 +197,23 @@ to SelectionDAG where address space is an attribute on operations.
|
|||
This representation better supports pointers having different sizes depending
|
||||
on their address space.
|
||||
|
||||
``NOTE``:
|
||||
Currently, LLT requires at least 2 elements in vectors, but some targets have
|
||||
the concept of a '1-element vector'. Representing them as their underlying
|
||||
scalar type is a nice simplification.
|
||||
.. note::
|
||||
|
||||
``TODO``:
|
||||
Currently, non-generic virtual registers, defined by non-pre-isel-generic
|
||||
instructions, cannot have a type, and thus cannot be used by a pre-isel generic
|
||||
instruction. Instead, they are given a type using a COPY. We could relax that
|
||||
and allow types on all vregs: this would reduce the number of MI required when
|
||||
emitting target-specific MIR early in the pipeline. This should purely be
|
||||
a compile-time optimization.
|
||||
.. caution::
|
||||
|
||||
Is this still true? I thought we'd removed the 1-element vector concept.
|
||||
Hypothetically, it could be distinct from a scalar but I think we failed to
|
||||
find a real occurrence.
|
||||
|
||||
Currently, LLT requires at least 2 elements in vectors, but some targets have
|
||||
the concept of a '1-element vector'. Representing them as their underlying
|
||||
scalar type is a nice simplification.
|
||||
|
||||
.. rubric:: Footnotes
|
||||
|
||||
.. [#abi-dependent] This mapping is ABI dependent. Here we've assumed no additional padding is required.
|
||||
|
||||
Generic Opcode Reference
|
||||
------------------------
|
||||
|
||||
The Generic Opcodes that are available are described at :doc:`GenericOpcode`.
|
||||
|
|
|
@ -0,0 +1,658 @@
|
|||
|
||||
.. _gmir-opcodes:
|
||||
|
||||
Generic Opcodes
|
||||
===============
|
||||
|
||||
.. contents::
|
||||
:local:
|
||||
|
||||
.. note::
|
||||
|
||||
This documentation does not yet fully account for vectors. Many of the
|
||||
scalar/integer/floating-point operations can also take vectors.
|
||||
|
||||
Constants
|
||||
---------
|
||||
|
||||
G_IMPLICIT_DEF
|
||||
^^^^^^^^^^^^^^
|
||||
|
||||
An undefined value.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%0:_(s32) = G_IMPLICIT_DEF
|
||||
|
||||
G_CONSTANT
|
||||
^^^^^^^^^^
|
||||
|
||||
An integer constant.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%0:_(s32) = G_CONSTANT i32 1
|
||||
|
||||
G_FCONSTANT
|
||||
^^^^^^^^^^^
|
||||
|
||||
A floating point constant.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%0:_(s32) = G_FCONSTANT float 1.0
|
||||
|
||||
G_FRAME_INDEX
|
||||
^^^^^^^^^^^^^
|
||||
|
||||
The address of an object in the stack frame.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%1:_(p0) = G_FRAME_INDEX %stack.0.ptr0
|
||||
|
||||
G_GLOBAL_VALUE
|
||||
^^^^^^^^^^^^^^
|
||||
|
||||
The address of a global value.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%0(p0) = G_GLOBAL_VALUE @var_local
|
||||
|
||||
G_BLOCK_ADDR
|
||||
^^^^^^^^^^^^
|
||||
|
||||
The address of a basic block.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%0:_(p0) = G_BLOCK_ADDR blockaddress(@test_blockaddress, %ir-block.block)
|
||||
|
||||
Integer Extension and Truncation
|
||||
--------------------------------
|
||||
|
||||
G_ANYEXT
|
||||
^^^^^^^^
|
||||
|
||||
Extend the underlying scalar type of an operation, leaving the high bits
|
||||
unspecified.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%1:_(s32) = G_ANYEXT %0:_(s16)
|
||||
|
||||
G_SEXT
|
||||
^^^^^^
|
||||
|
||||
Sign extend the underlying scalar type of an operation, copying the sign bit
|
||||
into the newly-created space.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%1:_(s32) = G_SEXT %0:_(s16)
|
||||
|
||||
G_SEXT_INREG
|
||||
^^^^^^^^^^^^
|
||||
|
||||
Sign extend a value from an arbitrary bit position, copying the sign bit
|
||||
into all bits above it. This is equivalent to a shl + ashr pair with an
|
||||
appropriate shift amount. $sz is an immediate (MachineOperand::isImm()
|
||||
returns true) to allow targets to have some bitwidths legal and others
|
||||
lowered. This opcode is particularly useful if the target has sign-extension
|
||||
instructions that are cheaper than the constituent shifts as the optimizer is
|
||||
able to make decisions on whether it's better to hang on to the G_SEXT_INREG
|
||||
or to lower it and optimize the individual shifts.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%1:_(s32) = G_SEXT_INREG %0:_(s32), 16
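The equivalence with a shl + ashr pair can be checked with a small Python model. This is a sketch only: registers are modelled as unsigned integers of ``width`` bits and the function name is hypothetical.

```python
def g_sext_inreg(value: int, width: int, sz: int) -> int:
    """Model G_SEXT_INREG on a `width`-bit register as shl followed by ashr."""
    mask = (1 << width) - 1
    shift = width - sz
    # shl: move bit (sz - 1) into the sign position of the register.
    shifted = (value << shift) & mask
    # ashr: reinterpret as signed so that Python's >> replicates the sign bit.
    if shifted & (1 << (width - 1)):
        shifted -= 1 << width
    return (shifted >> shift) & mask
```

For the example above (``sz`` of 16 on an ``s32``), an input of ``0x0000FFFF`` becomes ``0xFFFFFFFF`` while ``0x12347FFF`` becomes ``0x00007FFF``.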
|
||||
|
||||
G_ZEXT
|
||||
^^^^^^
|
||||
|
||||
Zero extend the underlying scalar type of an operation, putting zero bits
|
||||
into the newly-created space.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%1:_(s32) = G_ZEXT %0:_(s16)
|
||||
|
||||
G_TRUNC
|
||||
^^^^^^^
|
||||
|
||||
Truncate the underlying scalar type of an operation. This is equivalent to
|
||||
G_EXTRACT for scalar types, but acts elementwise on vectors.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%1:_(s16) = G_TRUNC %0:_(s32)
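As a rough illustration of the scalar cases of G_ZEXT, G_SEXT and G_TRUNC, the following Python sketch models registers as unsigned integers. The function names are hypothetical, and G_ANYEXT is left out because its high bits are unspecified.

```python
def _mask(bits: int) -> int:
    """All-ones value of the given bit width."""
    return (1 << bits) - 1


def g_zext(value: int, src_bits: int, dst_bits: int) -> int:
    """Zero extend: the newly created high bits are zero."""
    return value & _mask(src_bits)


def g_sext(value: int, src_bits: int, dst_bits: int) -> int:
    """Sign extend: the source sign bit is copied into the new high bits."""
    value &= _mask(src_bits)
    sign = 1 << (src_bits - 1)
    # (value ^ sign) - sign converts to a signed Python int, then we wrap
    # the result back into dst_bits of two's complement.
    return ((value ^ sign) - sign) & _mask(dst_bits)


def g_trunc(value: int, dst_bits: int) -> int:
    """Truncate: keep only the low dst_bits."""
    return value & _mask(dst_bits)
```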
|
||||
|
||||
Type Conversions
|
||||
----------------
|
||||
|
||||
G_INTTOPTR
|
||||
^^^^^^^^^^
|
||||
|
||||
Convert an integer to a pointer.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%1:_(p0) = G_INTTOPTR %0:_(s32)
|
||||
|
||||
G_PTRTOINT
|
||||
^^^^^^^^^^
|
||||
|
||||
Convert a pointer to an integer.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%1:_(s32) = G_PTRTOINT %0:_(p0)
|
||||
|
||||
G_BITCAST
|
||||
^^^^^^^^^
|
||||
|
||||
Reinterpret a value as a new type. This is usually done without changing any
|
||||
bits but this is not always the case due to a subtlety in the definition of the
|
||||
:ref:`LLVM-IR Bitcast Instruction <i_bitcast>`.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%1:_(s64) = G_BITCAST %0:_(<2 x s32>)
|
||||
|
||||
G_ADDRSPACE_CAST
|
||||
^^^^^^^^^^^^^^^^
|
||||
|
||||
Convert a pointer to an address space to a pointer to another address space.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%1:_(p1) = G_ADDRSPACE_CAST %0:_(p0)
|
||||
|
||||
.. caution::
|
||||
|
||||
:ref:`i_addrspacecast` doesn't mention what happens if the cast is simply
|
||||
invalid (i.e. if the address spaces are disjoint).
|
||||
|
||||
Scalar Operations
|
||||
-----------------
|
||||
|
||||
G_EXTRACT
|
||||
^^^^^^^^^
|
||||
|
||||
Extract a register of the specified size, starting from the block given by
|
||||
index. This will almost certainly be mapped to sub-register COPYs after
|
||||
register banks have been selected.
|
||||
|
||||
G_INSERT
|
||||
^^^^^^^^
|
||||
|
||||
Insert a smaller register into a larger one at the specified bit-index.
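The bit manipulation behind G_EXTRACT and G_INSERT can be sketched in Python as follows. This is an illustrative model only: it assumes the index of both opcodes is a bit offset counted from the least significant bit, and the function names are hypothetical.

```python
def g_extract(src: int, size: int, index: int) -> int:
    """Return `size` bits of `src`, starting at bit `index`."""
    return (src >> index) & ((1 << size) - 1)


def g_insert(dst: int, src: int, src_size: int, index: int) -> int:
    """Overwrite `src_size` bits of `dst`, starting at bit `index`, with `src`."""
    field = ((1 << src_size) - 1) << index
    return (dst & ~field) | ((src << index) & field)
```

For example, extracting 8 bits at index 8 from ``0x12345678`` gives ``0x56``, and inserting ``0xAB`` at index 16 gives ``0x12AB5678``.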
|
||||
|
||||
G_MERGE_VALUES
|
||||
^^^^^^^^^^^^^^
|
||||
|
||||
Concatenate multiple registers of the same size into a wider register.
|
||||
The input operands are always ordered from lowest bits to highest:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%0:(s32) = G_MERGE_VALUES %bits_0_7:(s8), %bits_8_15:(s8),
|
||||
%bits_16_23:(s8), %bits_24_31:(s8)
|
||||
|
||||
G_UNMERGE_VALUES
|
||||
^^^^^^^^^^^^^^^^
|
||||
|
||||
Extract multiple registers of the specified size, starting from blocks given by
|
||||
indexes. This will almost certainly be mapped to sub-register COPYs after
|
||||
register banks have been selected.
|
||||
The output operands are always ordered from lowest bits to highest:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%bits_0_7:(s8), %bits_8_15:(s8),
|
||||
%bits_16_23:(s8), %bits_24_31:(s8) = G_UNMERGE_VALUES %0:(s32)
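The lowest-bits-first operand ordering of G_MERGE_VALUES and G_UNMERGE_VALUES can be illustrated with a short Python model. The function names are hypothetical and registers are modelled as plain unsigned integers.

```python
def g_merge_values(parts: list[int], part_bits: int) -> int:
    """Concatenate equally sized parts; parts[0] supplies the lowest bits."""
    mask = (1 << part_bits) - 1
    result = 0
    for position, part in enumerate(parts):
        result |= (part & mask) << (position * part_bits)
    return result


def g_unmerge_values(value: int, part_bits: int, count: int) -> list[int]:
    """Split a value into `count` parts; element 0 holds the lowest bits."""
    mask = (1 << part_bits) - 1
    return [(value >> (position * part_bits)) & mask for position in range(count)]
```

Merging the ``s8`` values ``0x78, 0x56, 0x34, 0x12`` (in that operand order) yields the ``s32`` value ``0x12345678``, and unmerging that value returns the same four parts in the same order.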
|
||||
|
||||
G_BSWAP
|
||||
^^^^^^^
|
||||
|
||||
Reverse the order of the bytes in a scalar
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%1:_(s32) = G_BSWAP %0:_(s32)
|
||||
|
||||
G_BITREVERSE
|
||||
^^^^^^^^^^^^
|
||||
|
||||
Reverse the order of the bits in a scalar
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%1:_(s32) = G_BITREVERSE %0:_(s32)
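To make the difference between G_BSWAP and G_BITREVERSE concrete, here is a small Python sketch. It assumes a scalar whose width is a multiple of 8 bits for the byte swap, and the function names are hypothetical.

```python
def g_bswap(value: int, width: int) -> int:
    """Reverse the byte order of a `width`-bit scalar."""
    value &= (1 << width) - 1
    return int.from_bytes(value.to_bytes(width // 8, "little"), "big")


def g_bitreverse(value: int, width: int) -> int:
    """Reverse the bit order of a `width`-bit scalar."""
    value &= (1 << width) - 1
    # Render as a fixed-width binary string, reverse it, and parse it back.
    return int(format(value, f"0{width}b")[::-1], 2)
```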
|
||||
|
||||
Integer Operations
|
||||
-------------------
|
||||
|
||||
G_ADD, G_SUB, G_MUL, G_AND, G_OR, G_XOR, G_SDIV, G_UDIV, G_SREM, G_UREM
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
These each perform their respective integer arithmetic on a scalar.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%2:_(s32) = G_ADD %0:_(s32), %1:_(s32)
|
||||
|
||||
G_SHL, G_LSHR, G_ASHR
|
||||
^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Shift the bits of a scalar left or right inserting zeros (sign-bit for G_ASHR).
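The three shifts differ only in the bits that are shifted in. The Python model below is a sketch under the assumption that the shift amount is smaller than the register width; the function names are hypothetical.

```python
def g_shl(value: int, amount: int, width: int) -> int:
    """Shift left, inserting zeros at the low end."""
    return (value << amount) & ((1 << width) - 1)


def g_lshr(value: int, amount: int, width: int) -> int:
    """Logical shift right, inserting zeros at the high end."""
    return (value & ((1 << width) - 1)) >> amount


def g_ashr(value: int, amount: int, width: int) -> int:
    """Arithmetic shift right, inserting copies of the sign bit."""
    mask = (1 << width) - 1
    value &= mask
    if value >> (width - 1):
        value -= 1 << width  # reinterpret as a negative Python int
    return (value >> amount) & mask
```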
|
||||
|
||||
G_ICMP
|
||||
^^^^^^
|
||||
|
||||
Perform integer comparison producing non-zero (true) or zero (false). It's
|
||||
target specific whether a true value is 1, ~0U, or some other non-zero value.
|
||||
|
||||
G_SELECT
|
||||
^^^^^^^^
|
||||
|
||||
Select between two values depending on a zero/non-zero value.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%5:_(s32) = G_SELECT %4(s1), %6, %2
|
||||
|
||||
G_PTR_ADD
|
||||
^^^^^^^^^
|
||||
|
||||
Add an offset to a pointer measured in addressable units. Addressable units are
|
||||
typically bytes but this can vary between targets.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%2:_(p0) = G_PTR_ADD %0:_(p0), %1:_(s64)
|
||||
|
||||
G_PTR_MASK
|
||||
^^^^^^^^^^
|
||||
|
||||
Zero the least significant N bits of a pointer.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%1:_(p0) = G_PTR_MASK %0, 3
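As a minimal sketch of the masking, the Python function below treats the pointer as a plain integer address, which is a simplification; the function name is hypothetical.

```python
def g_ptr_mask(pointer: int, num_bits: int) -> int:
    """Clear the `num_bits` least significant bits of an address."""
    return pointer & ~((1 << num_bits) - 1)
```

With ``num_bits`` of 3, as in the example above, the address ``0x1007`` is rounded down to the 8-byte aligned address ``0x1000``.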
|
||||
|
||||
G_SMIN, G_SMAX, G_UMIN, G_UMAX
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Take the minimum/maximum of two values.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%5:_(s32) = G_SMIN %6, %2
|
||||
|
||||
G_UADDO, G_SADDO, G_USUBO, G_SSUBO, G_SMULO, G_UMULO
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Perform the requested arithmetic and produce a carry output in addition to the
|
||||
normal result.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%3:_(s32), %4:_(s1) = G_UADDO %0, %1
|
||||
|
||||
G_UADDE, G_SADDE, G_USUBE, G_SSUBE
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Perform the requested arithmetic and consume a carry input in addition to the
|
||||
normal input. Also produce a carry output in addition to the normal result.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%3:_(s32), %4:_(s1) = G_UADDE %0, %1, %2:_(s1)
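One way to see how the carry-producing and carry-consuming opcodes fit together is a wide addition split into narrower pieces. The Python sketch below models the unsigned variants only; the function names, including ``add64_via_32``, are hypothetical.

```python
def g_uaddo(lhs: int, rhs: int, width: int) -> tuple[int, int]:
    """Unsigned add producing (result, carry_out)."""
    total = lhs + rhs
    return total & ((1 << width) - 1), total >> width


def g_uadde(lhs: int, rhs: int, carry_in: int, width: int) -> tuple[int, int]:
    """Unsigned add consuming carry_in and producing (result, carry_out)."""
    total = lhs + rhs + carry_in
    return total & ((1 << width) - 1), total >> width


def add64_via_32(lhs: int, rhs: int) -> tuple[int, int]:
    """Add two 64-bit values using only 32-bit additions."""
    low_mask = 0xFFFFFFFF
    low, carry = g_uaddo(lhs & low_mask, rhs & low_mask, 32)
    high, carry_out = g_uadde(lhs >> 32, rhs >> 32, carry, 32)
    return (high << 32) | low, carry_out
```

The low halves are added with G_UADDO and the resulting carry feeds the G_UADDE that adds the high halves.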
|
||||
|
||||
G_UMULH, G_SMULH
|
||||
^^^^^^^^^^^^^^^^
|
||||
|
||||
Multiply two numbers at twice the incoming bit width (unsigned for G_UMULH, signed for G_SMULH) and return
|
||||
the high half of the result
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%2:_(s32) = G_UMULH %0, %1
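The high-half multiplies can be modelled in Python, where integers are arbitrary precision and the double-width product is therefore exact. This is a sketch with hypothetical function names, not a description of how any target implements them.

```python
def _to_signed(value: int, width: int) -> int:
    """Interpret a `width`-bit pattern as a two's complement integer."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def g_umulh(lhs: int, rhs: int, width: int) -> int:
    """High half of the unsigned double-width product."""
    mask = (1 << width) - 1
    return ((lhs & mask) * (rhs & mask)) >> width


def g_smulh(lhs: int, rhs: int, width: int) -> int:
    """High half of the signed double-width product."""
    product = _to_signed(lhs, width) * _to_signed(rhs, width)
    return (product >> width) & ((1 << width) - 1)
```

The same bit pattern can give different results: ``0xFFFFFFFF * 0xFFFFFFFF`` has a high half of ``0xFFFFFFFE`` when unsigned but ``0`` when signed, since the signed product is ``(-1) * (-1) = 1``.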
|
||||
|
||||
G_CTLZ, G_CTTZ, G_CTPOP
|
||||
^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Count leading zeros, trailing zeros, or number of set bits
|
||||
|
||||
G_CTLZ_ZERO_UNDEF, G_CTTZ_ZERO_UNDEF
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Count leading zeros or trailing zeros. If the value is zero then the result is
|
||||
undefined.
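The bit-counting opcodes can be sketched in Python as follows. The function names are hypothetical. The model assumes that the non-``_ZERO_UNDEF`` variants return the bit width for a zero input, mirroring the LLVM IR ``llvm.ctlz`` and ``llvm.cttz`` intrinsics when zero is a defined input; the ``_ZERO_UNDEF`` variants are not modelled for that input.

```python
def g_ctpop(value: int, width: int) -> int:
    """Count the set bits in a `width`-bit scalar."""
    return bin(value & ((1 << width) - 1)).count("1")


def g_ctlz(value: int, width: int) -> int:
    """Count leading zeros; a zero input yields `width` (assumption)."""
    return width - (value & ((1 << width) - 1)).bit_length()


def g_cttz(value: int, width: int) -> int:
    """Count trailing zeros; a zero input yields `width` (assumption)."""
    value &= (1 << width) - 1
    if value == 0:
        return width
    # value & -value isolates the lowest set bit.
    return (value & -value).bit_length() - 1
```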
|
||||
|
||||
Floating Point Operations
|
||||
-------------------------
|
||||
|
||||
G_FCMP
|
||||
^^^^^^
|
||||
|
||||
Perform floating point comparison producing non-zero (true) or zero
|
||||
(false). It's target specific whether a true value is 1, ~0U, or some other
|
||||
non-zero value.
|
||||
|
||||
G_FNEG
|
||||
^^^^^^
|
||||
|
||||
Floating point negation
|
||||
|
||||
G_FPEXT
|
||||
^^^^^^^
|
||||
|
||||
Convert a floating point value to a larger type
|
||||
|
||||
G_FPTRUNC
|
||||
^^^^^^^^^
|
||||
|
||||
Convert a floating point value to a narrower type
|
||||
|
||||
G_FPTOSI, G_FPTOUI, G_SITOFP, G_UITOFP
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Convert between integer and floating point
|
||||
|
||||
G_FABS
|
||||
^^^^^^
|
||||
|
||||
Take the absolute value of a floating point value
|
||||
|
||||
G_FCOPYSIGN
|
||||
^^^^^^^^^^^
|
||||
|
||||
Copy the value of the first operand, replacing the sign bit with that of the
|
||||
second operand.
|
||||
|
||||
G_FCANONICALIZE
|
||||
^^^^^^^^^^^^^^^
|
||||
|
||||
See :ref:`i_intr_llvm_canonicalize`
|
||||
|
||||
G_FMINNUM
|
||||
^^^^^^^^^
|
||||
|
||||
Perform floating-point minimum on two values.
|
||||
|
||||
In the case where a single input is a NaN (either signaling or quiet),
|
||||
the non-NaN input is returned.
|
||||
|
||||
The return value of (FMINNUM 0.0, -0.0) could be either 0.0 or -0.0.
|
||||
|
||||
G_FMAXNUM
|
||||
^^^^^^^^^
|
||||
|
||||
Perform floating-point maximum on two values.
|
||||
|
||||
In the case where a single input is a NaN (either signaling or quiet),
|
||||
the non-NaN input is returned.
|
||||
|
||||
The return value of (FMAXNUM 0.0, -0.0) could be either 0.0 or -0.0.
|
||||
|
||||
G_FMINNUM_IEEE
|
||||
^^^^^^^^^^^^^^
|
||||
|
||||
Perform floating-point minimum on two values, following the IEEE-754 2008
|
||||
definition. This differs from FMINNUM in the handling of signaling NaNs. If one
|
||||
input is a signaling NaN, returns a quiet NaN.
|
||||
|
||||
G_FMAXNUM_IEEE
|
||||
^^^^^^^^^^^^^^
|
||||
|
||||
Perform floating-point maximum on two values, following the IEEE-754 2008
|
||||
definition. This differs from FMAXNUM in the handling of signaling NaNs. If one
|
||||
input is a signaling NaN, returns a quiet NaN.
|
||||
|
||||
G_FMINIMUM
|
||||
^^^^^^^^^^
|
||||
|
||||
NaN-propagating minimum that also treats -0.0 as less than 0.0. While
|
||||
FMINNUM_IEEE follows IEEE 754-2008 semantics, FMINIMUM follows IEEE 754-2018
|
||||
draft semantics.
|
||||
|
||||
G_FMAXIMUM
|
||||
^^^^^^^^^^
|
||||
|
||||
NaN-propagating maximum that also treats -0.0 as less than 0.0. While
|
||||
FMAXNUM_IEEE follows IEEE 754-2008 semantics, FMAXIMUM follows IEEE 754-2018
|
||||
draft semantics.
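The contrast between G_FMINNUM and G_FMINIMUM can be sketched in Python. The function names are hypothetical, and signaling NaNs (and therefore G_FMINNUM_IEEE) are not modelled because Python floats do not expose them; the maximum variants follow the same pattern.

```python
import math


def g_fminnum(lhs: float, rhs: float) -> float:
    """Minimum that returns the non-NaN input when exactly one input is NaN."""
    if math.isnan(lhs):
        return rhs
    if math.isnan(rhs):
        return lhs
    return min(lhs, rhs)


def g_fminimum(lhs: float, rhs: float) -> float:
    """NaN-propagating minimum that treats -0.0 as less than 0.0."""
    if math.isnan(lhs) or math.isnan(rhs):
        return math.nan
    if lhs == rhs == 0.0:
        # Equal-comparing zeros: prefer the one with the negative sign.
        return lhs if math.copysign(1.0, lhs) < 0 else rhs
    return min(lhs, rhs)
```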
|
||||
|
||||
G_FADD, G_FSUB, G_FMUL, G_FDIV, G_FREM
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Perform the specified floating point arithmetic.
|
||||
|
||||
G_FMA
|
||||
^^^^^
|
||||
|
||||
Perform a fused multiply-add (i.e. without the intermediate rounding step).
|
||||
|
||||
G_FMAD
|
||||
^^^^^^
|
||||
|
||||
Perform a non-fused multiply-add (i.e. with the intermediate rounding step).
|
||||
|
||||
G_FPOW
|
||||
^^^^^^
|
||||
|
||||
Raise the first operand to the power of the second.
|
||||
|
||||
G_FEXP, G_FEXP2
|
||||
^^^^^^^^^^^^^^^
|
||||
|
||||
Calculate the base-e or base-2 exponential of a value
|
||||
|
||||
G_FLOG, G_FLOG2, G_FLOG10
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Calculate the base-e, base-2, or base-10 logarithm of a value respectively.
|
||||
|
||||
G_FCEIL, G_FCOS, G_FSIN, G_FSQRT, G_FFLOOR, G_FRINT, G_FNEARBYINT
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
These correspond to the standard C functions of the same name.
|
||||
|
||||
G_INTRINSIC_TRUNC
|
||||
^^^^^^^^^^^^^^^^^
|
||||
|
||||
Returns the operand rounded to the nearest integer not larger in magnitude than the operand.
|
||||
|
||||
G_INTRINSIC_ROUND
|
||||
^^^^^^^^^^^^^^^^^
|
||||
|
||||
Returns the operand rounded to the nearest integer.
|
||||
|
||||
Vector Specific Operations
|
||||
--------------------------
|
||||
|
||||
G_CONCAT_VECTORS
|
||||
^^^^^^^^^^^^^^^^
|
||||
|
||||
Concatenate two vectors to form a longer vector.
|
||||
|
||||
G_BUILD_VECTOR, G_BUILD_VECTOR_TRUNC
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Create a vector from multiple scalar registers. No implicit
|
||||
conversion is performed (i.e. the result element type must be the
|
||||
same as all source operands)
|
||||
|
||||
The _TRUNC version truncates the larger operand types to fit the
|
||||
destination vector elt type.
|
||||
|
||||
G_INSERT_VECTOR_ELT
|
||||
^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Insert an element into a vector
|
||||
|
||||
G_EXTRACT_VECTOR_ELT
|
||||
^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Extract an element from a vector
|
||||
|
||||
G_SHUFFLE_VECTOR
|
||||
^^^^^^^^^^^^^^^^
|
||||
|
||||
Concatenate two vectors and shuffle the elements according to the mask operand.
|
||||
The mask operand should be an IR Constant which exactly matches the
|
||||
corresponding mask for the IR shufflevector instruction.
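The concatenate-then-select behaviour can be illustrated with Python lists standing in for vectors. This is a sketch: the function name is hypothetical, and it assumes that an undefined mask element is written as ``-1`` and produces ``None`` here.

```python
def g_shuffle_vector(first: list, second: list, mask: list[int]) -> list:
    """Select elements from the concatenation of two vectors by mask index."""
    combined = list(first) + list(second)
    # A negative mask entry stands for an undefined element in this model.
    return [None if index < 0 else combined[index] for index in mask]
```

For two 2-element inputs, mask indices 0 and 1 refer to the first vector and indices 2 and 3 refer to the second.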
|
||||
|
||||
Memory Operations
|
||||
-----------------
|
||||
|
||||
G_LOAD, G_SEXTLOAD, G_ZEXTLOAD
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Generic load. Expects a MachineMemOperand in addition to explicit
|
||||
operands. If the result size is larger than the memory size, the
|
||||
high bits are undefined, sign-extended, or zero-extended respectively.
|
||||
|
||||
Only G_LOAD is valid if the result is a vector type. If the result is larger
|
||||
than the memory size, the high elements are undefined (i.e. this is not a
|
||||
per-element, vector anyextload)
|
||||
|
||||
G_INDEXED_LOAD
|
||||
^^^^^^^^^^^^^^
|
||||
|
||||
Generic indexed load. Combines a GEP with a load. $newaddr is set to $base + $offset.
|
||||
If $am is 0 (post-indexed), then the value is loaded from $base; if $am is 1 (pre-indexed)
|
||||
then the value is loaded from $newaddr.
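The pre-indexed and post-indexed behaviour can be summarised with a toy Python model in which memory is a dictionary from address to value. The function name is hypothetical and the model ignores types and MachineMemOperands.

```python
def g_indexed_load(memory: dict, base: int, offset: int, am: int) -> tuple:
    """Return (loaded value, newaddr) for a pre- or post-indexed load."""
    new_addr = base + offset
    # am == 1 (pre-indexed) loads from the updated address,
    # am == 0 (post-indexed) loads from the original base.
    load_addr = new_addr if am else base
    return memory[load_addr], new_addr
```

In both modes ``$newaddr`` is ``$base + $offset``; only the address that is actually loaded from differs.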
|
||||
|
||||
G_INDEXED_SEXTLOAD
|
||||
^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Same as G_INDEXED_LOAD except that the load performed is sign-extending, as with G_SEXTLOAD.
|
||||
|
||||
G_INDEXED_ZEXTLOAD
|
||||
^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Same as G_INDEXED_LOAD except that the load performed is zero-extending, as with G_ZEXTLOAD.
|
||||
|
||||
G_STORE
|
||||
^^^^^^^
|
||||
|
||||
Generic store. Expects a MachineMemOperand in addition to explicit operands.
|
||||
|
||||
G_INDEXED_STORE
|
||||
^^^^^^^^^^^^^^^
|
||||
|
||||
Combines a store with a GEP. See description of G_INDEXED_LOAD for indexing behaviour.
|
||||
|
||||
G_ATOMIC_CMPXCHG_WITH_SUCCESS
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Generic atomic cmpxchg with internal success check. Expects a
|
||||
MachineMemOperand in addition to explicit operands.
|
||||
|
||||
G_ATOMIC_CMPXCHG
|
||||
^^^^^^^^^^^^^^^^
|
||||
|
||||
Generic atomic cmpxchg. Expects a MachineMemOperand in addition to explicit
|
||||
operands.
|
||||
|
||||
G_ATOMICRMW_XCHG, G_ATOMICRMW_ADD, G_ATOMICRMW_SUB, G_ATOMICRMW_AND, G_ATOMICRMW_NAND, G_ATOMICRMW_OR, G_ATOMICRMW_XOR, G_ATOMICRMW_MAX, G_ATOMICRMW_MIN, G_ATOMICRMW_UMAX, G_ATOMICRMW_UMIN, G_ATOMICRMW_FADD, G_ATOMICRMW_FSUB
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Generic atomicrmw. Expects a MachineMemOperand in addition to explicit
|
||||
operands.
|
||||
|
||||
G_FENCE
|
||||
^^^^^^^
|
||||
|
||||
.. caution::
|
||||
|
||||
I couldn't find any documentation on this at the time of writing.
|
||||
|
||||
Control Flow
|
||||
------------
|
||||
|
||||
G_PHI
|
||||
^^^^^
|
||||
|
||||
Implement the φ node in the SSA graph representing the function.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%1(s8) = G_PHI %7(s8), %bb.0, %3(s8), %bb.1
|
||||
|
||||
G_BR
|
||||
^^^^
|
||||
|
||||
Unconditional branch
|
||||
|
||||
G_BRCOND
|
||||
^^^^^^^^
|
||||
|
||||
Conditional branch
|
||||
|
||||
G_BRINDIRECT
|
||||
^^^^^^^^^^^^
|
||||
|
||||
Indirect branch
|
||||
|
||||
G_BRJT
|
||||
^^^^^^
|
||||
|
||||
Indirect branch to jump table entry
|
||||
|
||||
G_JUMP_TABLE
|
||||
^^^^^^^^^^^^
|
||||
|
||||
.. caution::
|
||||
|
||||
I found no documentation for this instruction at the time of writing.
|
||||
|
||||
G_INTRINSIC, G_INTRINSIC_W_SIDE_EFFECTS
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Call an intrinsic
|
||||
|
||||
The _W_SIDE_EFFECTS version is considered to have unknown side-effects and
|
||||
as such cannot be reordered across other side-effecting instructions.
|
||||
|
||||
.. note::
|
||||
|
||||
Unlike SelectionDAG, there is no _VOID variant. Both of these are permitted
|
||||
to have zero, one, or multiple results.
|
||||
|
||||
Variadic Arguments
|
||||
------------------
|
||||
|
||||
G_VASTART
|
||||
^^^^^^^^^
|
||||
|
||||
.. caution::
|
||||
|
||||
I found no documentation for this instruction at the time of writing.
|
||||
|
||||
G_VAARG
|
||||
^^^^^^^
|
||||
|
||||
.. caution::
|
||||
|
||||
I found no documentation for this instruction at the time of writing.
|
||||
|
||||
Other Operations
|
||||
----------------
|
||||
|
||||
G_DYN_STACKALLOC
|
||||
^^^^^^^^^^^^^^^^
|
||||
|
||||
Dynamically realign the stack pointer to the specified alignment
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
%8:_(p0) = G_DYN_STACKALLOC %7(s64), 32
|
||||
|
||||
.. caution::
|
||||
|
||||
What does it mean for the immediate to be 0? It happens in the tests
|
|
@ -50,6 +50,7 @@ the following sections.
|
|||
:maxdepth: 1
|
||||
|
||||
GMIR
|
||||
GenericOpcode
|
||||
Pipeline
|
||||
Porting
|
||||
Resources
|
||||
|
|
|
@ -13954,6 +13954,8 @@ Examples
|
|||
Specialised Arithmetic Intrinsics
|
||||
---------------------------------
|
||||
|
||||
.. _i_intr_llvm_canonicalize:
|
||||
|
||||
'``llvm.canonicalize.*``' Intrinsic
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
|
|
|
@ -345,6 +345,8 @@ specified in brackets after the block's definition:
|
|||
|
||||
``Alignment`` is specified in bytes, and must be a power of two.
|
||||
|
||||
.. _mir-instructions:
|
||||
|
||||
Machine Instructions
|
||||
--------------------
|
||||
|
||||
|
@ -407,6 +409,8 @@ The syntax for bundled instructions is the following:
|
|||
The first instruction is often a bundle header. The instructions between ``{``
|
||||
and ``}`` are bundled with the first instruction.
|
||||
|
||||
.. _mir-registers:
|
||||
|
||||
Registers
|
||||
---------
|
||||
|
||||
|
|
|
@ -406,8 +406,9 @@ public:
|
|||
|
||||
/// Build and insert \p Res = G_PTR_ADD \p Op0, \p Op1
|
||||
///
|
||||
/// G_PTR_ADD adds \p Op1 bytes to the pointer specified by \p Op0,
|
||||
/// storing the resulting pointer in \p Res.
|
||||
/// G_PTR_ADD adds \p Op1 addressable units to the pointer specified by \p Op0,
|
||||
/// storing the resulting pointer in \p Res. Addressable units are typically
|
||||
/// bytes but this can vary between targets.
|
||||
///
|
||||
/// \pre setBasicBlock or setMI must have been called.
|
||||
/// \pre \p Res and \p Op0 must be generic virtual registers with pointer
|
||||
|
|
|
@ -670,7 +670,7 @@ def G_FEXP2 : GenericInstruction {
|
|||
let hasSideEffects = 0;
|
||||
}
|
||||
|
||||
// Floating point base-2 logarithm of a value.
|
||||
// Floating point base-e logarithm of a value.
|
||||
def G_FLOG : GenericInstruction {
|
||||
let OutOperandList = (outs type0:$dst);
|
||||
let InOperandList = (ins type0:$src1);
|
||||
|
|