From b9683d3c53d6043d7cbeabf451860c027557be96 Mon Sep 17 00:00:00 2001 From: Dmitry Preobrazhensky Date: Wed, 25 Sep 2019 12:38:35 +0000 Subject: [PATCH] [AMDGPU][MC][DOC] Updated AMD GPU assembler description. Summary of changes: - Updated to reflect recent changes in assembler; - Minor bugfixing and improvements. llvm-svn: 372857 --- llvm/docs/AMDGPU/AMDGPUAsmGFX8.rst | 6 +- llvm/docs/AMDGPU/AMDGPUAsmGFX9.rst | 6 +- llvm/docs/AMDGPU/gfx10_bimm16.rst | 2 +- llvm/docs/AMDGPU/gfx10_bimm32.rst | 2 +- .../AMDGPU/gfx10_data_mimg_atomic_cmp.rst | 2 +- .../AMDGPU/gfx10_data_mimg_atomic_reg.rst | 2 +- llvm/docs/AMDGPU/gfx10_fimm16.rst | 3 +- llvm/docs/AMDGPU/gfx10_fimm32.rst | 3 +- llvm/docs/AMDGPU/gfx10_hwreg.rst | 35 +- llvm/docs/AMDGPU/gfx10_label.rst | 23 +- llvm/docs/AMDGPU/gfx10_msg.rst | 60 +- llvm/docs/AMDGPU/gfx10_perm_smem.rst | 3 +- llvm/docs/AMDGPU/gfx10_simm16.rst | 2 +- llvm/docs/AMDGPU/gfx10_uimm16.rst | 2 +- llvm/docs/AMDGPU/gfx10_waitcnt.rst | 47 +- llvm/docs/AMDGPU/gfx7_bimm16.rst | 2 +- llvm/docs/AMDGPU/gfx7_bimm32.rst | 2 +- .../docs/AMDGPU/gfx7_data_mimg_atomic_cmp.rst | 2 +- .../docs/AMDGPU/gfx7_data_mimg_atomic_reg.rst | 2 +- llvm/docs/AMDGPU/gfx7_fimm32.rst | 3 +- llvm/docs/AMDGPU/gfx7_hwreg.rst | 35 +- llvm/docs/AMDGPU/gfx7_label.rst | 23 +- llvm/docs/AMDGPU/gfx7_msg.rst | 58 +- llvm/docs/AMDGPU/gfx7_simm16.rst | 2 +- llvm/docs/AMDGPU/gfx7_uimm16.rst | 2 +- llvm/docs/AMDGPU/gfx7_waitcnt.rst | 46 +- llvm/docs/AMDGPU/gfx8_bimm16.rst | 2 +- llvm/docs/AMDGPU/gfx8_bimm32.rst | 2 +- .../docs/AMDGPU/gfx8_data_mimg_atomic_cmp.rst | 2 +- .../docs/AMDGPU/gfx8_data_mimg_atomic_reg.rst | 2 +- llvm/docs/AMDGPU/gfx8_fimm16.rst | 3 +- llvm/docs/AMDGPU/gfx8_fimm32.rst | 3 +- llvm/docs/AMDGPU/gfx8_hwreg.rst | 35 +- llvm/docs/AMDGPU/gfx8_imask.rst | 66 +++ llvm/docs/AMDGPU/gfx8_imm4.rst | 25 - llvm/docs/AMDGPU/gfx8_label.rst | 23 +- llvm/docs/AMDGPU/gfx8_msg.rst | 58 +- llvm/docs/AMDGPU/gfx8_perm_smem.rst | 3 +- llvm/docs/AMDGPU/gfx8_simm16.rst | 2 +- 
llvm/docs/AMDGPU/gfx8_uimm16.rst | 2 +- llvm/docs/AMDGPU/gfx8_waitcnt.rst | 46 +- llvm/docs/AMDGPU/gfx9_bimm16.rst | 2 +- llvm/docs/AMDGPU/gfx9_bimm32.rst | 2 +- .../docs/AMDGPU/gfx9_data_mimg_atomic_cmp.rst | 2 +- .../docs/AMDGPU/gfx9_data_mimg_atomic_reg.rst | 2 +- llvm/docs/AMDGPU/gfx9_fimm16.rst | 3 +- llvm/docs/AMDGPU/gfx9_fimm32.rst | 3 +- llvm/docs/AMDGPU/gfx9_hwreg.rst | 35 +- llvm/docs/AMDGPU/gfx9_imask.rst | 66 +++ llvm/docs/AMDGPU/gfx9_imm4.rst | 25 - llvm/docs/AMDGPU/gfx9_label.rst | 23 +- llvm/docs/AMDGPU/gfx9_msg.rst | 60 +- llvm/docs/AMDGPU/gfx9_perm_smem.rst | 3 +- llvm/docs/AMDGPU/gfx9_simm16.rst | 2 +- llvm/docs/AMDGPU/gfx9_uimm16.rst | 2 +- llvm/docs/AMDGPU/gfx9_waitcnt.rst | 47 +- llvm/docs/AMDGPUModifierSyntax.rst | 296 ++++++---- llvm/docs/AMDGPUOperandSyntax.rst | 545 ++++++++---------- 58 files changed, 1058 insertions(+), 709 deletions(-) create mode 100644 llvm/docs/AMDGPU/gfx8_imask.rst delete mode 100644 llvm/docs/AMDGPU/gfx8_imm4.rst create mode 100644 llvm/docs/AMDGPU/gfx9_imask.rst delete mode 100644 llvm/docs/AMDGPU/gfx9_imm4.rst diff --git a/llvm/docs/AMDGPU/AMDGPUAsmGFX8.rst b/llvm/docs/AMDGPU/AMDGPUAsmGFX8.rst index 9c762aa0f838..a0e514e29b2c 100644 --- a/llvm/docs/AMDGPU/AMDGPUAsmGFX8.rst +++ b/llvm/docs/AMDGPU/AMDGPUAsmGFX8.rst @@ -566,7 +566,7 @@ SOPC s_cmp_lg_u64 :ref:`ssrc0`, :ref:`ssrc1` s_cmp_lt_i32 :ref:`ssrc0`, :ref:`ssrc1` s_cmp_lt_u32 :ref:`ssrc0`, :ref:`ssrc1` - s_set_gpr_idx_on :ref:`ssrc`, :ref:`imm4` + s_set_gpr_idx_on :ref:`ssrc`, :ref:`imask` s_setvskip :ref:`ssrc0`, :ref:`ssrc1` SOPK @@ -624,7 +624,7 @@ SOPP s_nop :ref:`imm16` s_sendmsg :ref:`msg` s_sendmsghalt :ref:`msg` - s_set_gpr_idx_mode :ref:`imm4` + s_set_gpr_idx_mode :ref:`imask` s_set_gpr_idx_off s_sethalt :ref:`imm16` s_setkill :ref:`imm16` @@ -1756,7 +1756,7 @@ VOPC gfx8_fimm16 gfx8_fimm32 gfx8_hwreg - gfx8_imm4 + gfx8_imask gfx8_label gfx8_msg gfx8_param diff --git a/llvm/docs/AMDGPU/AMDGPUAsmGFX9.rst b/llvm/docs/AMDGPU/AMDGPUAsmGFX9.rst index 
3ffb9fd5c8db..8ce056c8caf8 100644 --- a/llvm/docs/AMDGPU/AMDGPUAsmGFX9.rst +++ b/llvm/docs/AMDGPU/AMDGPUAsmGFX9.rst @@ -736,7 +736,7 @@ SOPC s_cmp_lg_u64 :ref:`ssrc0`, :ref:`ssrc1` s_cmp_lt_i32 :ref:`ssrc0`, :ref:`ssrc1` s_cmp_lt_u32 :ref:`ssrc0`, :ref:`ssrc1` - s_set_gpr_idx_on :ref:`ssrc`, :ref:`imm4` + s_set_gpr_idx_on :ref:`ssrc`, :ref:`imask` s_setvskip :ref:`ssrc0`, :ref:`ssrc1` SOPK @@ -796,7 +796,7 @@ SOPP s_nop :ref:`imm16` s_sendmsg :ref:`msg` s_sendmsghalt :ref:`msg` - s_set_gpr_idx_mode :ref:`imm4` + s_set_gpr_idx_mode :ref:`imask` s_set_gpr_idx_off s_sethalt :ref:`imm16` s_setkill :ref:`imm16` @@ -2010,7 +2010,7 @@ VOPC gfx9_fimm16 gfx9_fimm32 gfx9_hwreg - gfx9_imm4 + gfx9_imask gfx9_label gfx9_msg gfx9_param diff --git a/llvm/docs/AMDGPU/gfx10_bimm16.rst b/llvm/docs/AMDGPU/gfx10_bimm16.rst index 00e9b71b92e7..689ac46d94ba 100644 --- a/llvm/docs/AMDGPU/gfx10_bimm16.rst +++ b/llvm/docs/AMDGPU/gfx10_bimm16.rst @@ -10,5 +10,5 @@ imm16 =========================== -An :ref:`integer_number`. The value is truncated to 16 bits. +A 16-bit :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range -32768..65535. diff --git a/llvm/docs/AMDGPU/gfx10_bimm32.rst b/llvm/docs/AMDGPU/gfx10_bimm32.rst index c4b87df907f7..7e1bd334ebfc 100644 --- a/llvm/docs/AMDGPU/gfx10_bimm32.rst +++ b/llvm/docs/AMDGPU/gfx10_bimm32.rst @@ -10,5 +10,5 @@ imm32 =========================== -An :ref:`integer_number`. The value is truncated to 32 bits. +An :ref:`integer_number` or an :ref:`absolute_expression`. The value is truncated to 32 bits. 
diff --git a/llvm/docs/AMDGPU/gfx10_data_mimg_atomic_cmp.rst b/llvm/docs/AMDGPU/gfx10_data_mimg_atomic_cmp.rst index bd257cf570b8..6b5cd6814350 100644 --- a/llvm/docs/AMDGPU/gfx10_data_mimg_atomic_cmp.rst +++ b/llvm/docs/AMDGPU/gfx10_data_mimg_atomic_cmp.rst @@ -21,7 +21,7 @@ Optionally may serve as an output data: * :ref:`dmask` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword. * :ref:`tfe` adds 1 dword if specified. - Note. The surface data format is indicated in the image resource constant but not in the instruction. + Note: the surface data format is indicated in the image resource constant but not in the instruction. *Operands:* :ref:`v` diff --git a/llvm/docs/AMDGPU/gfx10_data_mimg_atomic_reg.rst b/llvm/docs/AMDGPU/gfx10_data_mimg_atomic_reg.rst index 930dab31321d..7b74290a5652 100644 --- a/llvm/docs/AMDGPU/gfx10_data_mimg_atomic_reg.rst +++ b/llvm/docs/AMDGPU/gfx10_data_mimg_atomic_reg.rst @@ -21,6 +21,6 @@ Optionally may serve as an output data: * :ref:`dmask` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword. * :ref:`tfe` adds 1 dword if specified. - Note. The surface data format is indicated in the image resource constant but not in the instruction. + Note: the surface data format is indicated in the image resource constant but not in the instruction. *Operands:* :ref:`v` diff --git a/llvm/docs/AMDGPU/gfx10_fimm16.rst b/llvm/docs/AMDGPU/gfx10_fimm16.rst index c4d85be860d6..5c1dccc46892 100644 --- a/llvm/docs/AMDGPU/gfx10_fimm16.rst +++ b/llvm/docs/AMDGPU/gfx10_fimm16.rst @@ -10,5 +10,6 @@ imm32 =========================== -An :ref:`integer_number` or a :ref:`floating-point_number`. The number is converted to *f16* as described :ref:`here`. +A :ref:`floating-point_number`, an :ref:`integer_number`, or an :ref:`absolute_expression`. 
+The value is converted to *f16* as described :ref:`here`. diff --git a/llvm/docs/AMDGPU/gfx10_fimm32.rst b/llvm/docs/AMDGPU/gfx10_fimm32.rst index 7ff84cc051b6..258762c4ed7e 100644 --- a/llvm/docs/AMDGPU/gfx10_fimm32.rst +++ b/llvm/docs/AMDGPU/gfx10_fimm32.rst @@ -10,5 +10,6 @@ imm32 =========================== -An :ref:`integer_number` or a :ref:`floating-point_number`. The value is converted to *f32* as described :ref:`here`. +A :ref:`floating-point_number`, an :ref:`integer_number`, or an :ref:`absolute_expression`. +The value is converted to *f32* as described :ref:`here`. diff --git a/llvm/docs/AMDGPU/gfx10_hwreg.rst b/llvm/docs/AMDGPU/gfx10_hwreg.rst index 64d441bc72f5..56e2c66cbcf2 100644 --- a/llvm/docs/AMDGPU/gfx10_hwreg.rst +++ b/llvm/docs/AMDGPU/gfx10_hwreg.rst @@ -14,18 +14,21 @@ Bits of a hardware register being accessed. The bits of this operand have the following meaning: - ============ =================================== - Bits Description - ============ =================================== - 5:0 Register *id*. - 10:6 First bit *offset* (0..31). - 15:11 *Size* in bits (1..32). - ============ =================================== + ======= ===================== ============ + Bits Description Value Range + ======= ===================== ============ + 5:0 Register *id*. 0..63 + 10:6 First bit *offset*. 0..31 + 15:11 *Size* in bits. 1..32 + ======= ===================== ============ -This operand may be specified as a positive 16-bit :ref:`integer_number` or using the syntax described below. +This operand may be specified as one of the following: + +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..0xFFFF. +* An *hwreg* value described below. 
==================================== ============================================================================ - Syntax Description + Hwreg Value Syntax Description ==================================== ============================================================================ hwreg({0..63}) All bits of a register indicated by its *id*. hwreg(<*name*>) All bits of a register indicated by its *name*. @@ -33,7 +36,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*. ==================================== ============================================================================ -Register *id*, *offset* and *size* must be specified as positive :ref:`integer numbers`. +Numeric values may be specified as positive :ref:`integer numbers` +or :ref:`absolute expressions`. Defined register *names* include: @@ -62,7 +66,16 @@ Examples: .. parsed-literal:: - s_getreg_b32 s2, 0x6 + reg = 1 + offset = 2 + size = 4 + hwreg_enc = reg | (offset << 6) | ((size - 1) << 11) + + s_getreg_b32 s2, 0x1881 + s_getreg_b32 s2, hwreg_enc // the same as above + s_getreg_b32 s2, hwreg(1, 2, 4) // the same as above + s_getreg_b32 s2, hwreg(reg, offset, size) // the same as above + s_getreg_b32 s2, hwreg(15) s_getreg_b32 s2, hwreg(51, 1, 31) s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1) diff --git a/llvm/docs/AMDGPU/gfx10_label.rst b/llvm/docs/AMDGPU/gfx10_label.rst index 288c6c023a5c..40c973eaf675 100644 --- a/llvm/docs/AMDGPU/gfx10_label.rst +++ b/llvm/docs/AMDGPU/gfx10_label.rst @@ -12,19 +12,26 @@ label A branch target which is a 16-bit signed integer treated as a PC-relative dword offset. -This operand may be specified as: +This operand may be specified as one of the following: -* An :ref:`integer_number`. The number is truncated to 16 bits. -* An :ref:`absolute_expression` which must start with an :ref:`integer_number`. 
The value of the expression is truncated to 16 bits. -* A :ref:`symbol` (for example, a label). The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker. +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range -32768..65535. +* A :ref:`symbol` (for example, a label) representing a relocatable address in the same compilation unit where it is referred from. The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker. Examples: .. parsed-literal:: offset = 30 - s_branch loop_end - s_branch 2 + offset - s_branch 32 - loop_end: + label_1: + label_2 = . + 4 + + s_branch 32 + s_branch offset + 2 + s_branch label_1 + s_branch label_2 + s_branch label_3 + s_branch label_4 + + label_3 = label_2 + 4 + label_4: diff --git a/llvm/docs/AMDGPU/gfx10_msg.rst b/llvm/docs/AMDGPU/gfx10_msg.rst index ef531a14db19..3e6c532dd85a 100644 --- a/llvm/docs/AMDGPU/gfx10_msg.rst +++ b/llvm/docs/AMDGPU/gfx10_msg.rst @@ -12,24 +12,29 @@ msg A 16-bit message code. The bits of this operand have the following meaning: - ============ ====================================================== - Bits Description - ============ ====================================================== - 3:0 Message *type*. - 6:4 Optional *operation*. - 9:7 Optional *parameters*. - 15:10 Unused. - ============ ====================================================== + ============ =============================== =============== + Bits Description Value Range + ============ =============================== =============== + 3:0 Message *type*. 0..15 + 6:4 Optional *operation*. 0..7 + 7:7 Unused. \- + 9:8 Optional *stream*. 0..3 + 15:10 Unused. 
\- + ============ =============================== =============== -This operand may be specified as a positive 16-bit :ref:`integer_number` or using the syntax described below: +This operand may be specified as one of the following: - ======================================== ======================================================================== - Syntax Description - ======================================== ======================================================================== - sendmsg(<*type*>) A message identified by its *type*. - sendmsg(<*type*>, <*op*>) A message identified by its *type* and *operation*. - sendmsg(<*type*>, <*op*>, <*stream*>) A message identified by its *type* and *operation* with a stream *id*. - ======================================== ======================================================================== +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..0xFFFF. +* A *sendmsg* value described below. + + ==================================== ==================================================== + Sendmsg Value Syntax Description + ==================================== ==================================================== + sendmsg(<*type*>) A message identified by its *type*. + sendmsg(<*type*>,<*op*>) A message identified by its *type* and *operation*. + sendmsg(<*type*>,<*op*>,<*stream*>) A message identified by its *type* and *operation* + with a stream *id*. + ==================================== ==================================================== *Type* may be specified using message *name* or message *id*. @@ -37,7 +42,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number`. +Numeric values may be specified as positive :ref:`integer numbers` +or :ref:`absolute expressions`. 
Each message type supports specific operations: @@ -60,16 +66,32 @@ Each message type supports specific operations: \ SYSMSG_OP_TTRACE_PC 4 \- ================= ========== ============================== ============ ========== +*Sendmsg* arguments are validated depending on how *type* value is specified: + +* If message *type* is specified by name, arguments values must satisfy limitations detailed in the table above. +* If message *type* is specified as a number, each argument must not exceed corresponding value range (see the first table). + Examples: .. parsed-literal:: + // numeric message code + msg = 0x10 s_sendmsg 0x12 + s_sendmsg msg + 2 + + // sendmsg with strict arguments validation s_sendmsg sendmsg(MSG_INTERRUPT) - s_sendmsg sendmsg(MSG_GET_DOORBELL) - s_sendmsg sendmsg(2, GS_OP_CUT) s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT) s_sendmsg sendmsg(MSG_GS, 2) s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1) s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC) + s_sendmsg sendmsg(MSG_GET_DOORBELL) + + // sendmsg with validation of value range only + msg = 2 + op = 3 + stream = 1 + s_sendmsg sendmsg(msg, op, stream) + s_sendmsg sendmsg(2, GS_OP_CUT) diff --git a/llvm/docs/AMDGPU/gfx10_perm_smem.rst b/llvm/docs/AMDGPU/gfx10_perm_smem.rst index 879b33de1dff..bc12d1566517 100644 --- a/llvm/docs/AMDGPU/gfx10_perm_smem.rst +++ b/llvm/docs/AMDGPU/gfx10_perm_smem.rst @@ -12,7 +12,8 @@ imm3 A bit mask which indicates request permissions. -This operand must be specified as an :ref:`integer_number`. The value is truncated to 7 bits, but only 3 low bits are significant. +This operand must be specified as an :ref:`integer_number` or an :ref:`absolute_expression`. +The value is truncated to 7 bits, but only 3 low bits are significant. 
============ ============================== Bit Number Description diff --git a/llvm/docs/AMDGPU/gfx10_simm16.rst b/llvm/docs/AMDGPU/gfx10_simm16.rst index eb1d171e5290..365600a5b2e3 100644 --- a/llvm/docs/AMDGPU/gfx10_simm16.rst +++ b/llvm/docs/AMDGPU/gfx10_simm16.rst @@ -10,5 +10,5 @@ imm16 =========================== -An :ref:`integer_number`. The value is truncated to 16 bits and then sign-extended to 32 bits. +An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range -32768..65535. diff --git a/llvm/docs/AMDGPU/gfx10_uimm16.rst b/llvm/docs/AMDGPU/gfx10_uimm16.rst index f4bfe8c924ba..8ade8dd60dae 100644 --- a/llvm/docs/AMDGPU/gfx10_uimm16.rst +++ b/llvm/docs/AMDGPU/gfx10_uimm16.rst @@ -10,5 +10,5 @@ imm16 =========================== -An :ref:`integer_number`. The value is truncated to 16 bits and then zero-extended to 32 bits. +An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..65535. diff --git a/llvm/docs/AMDGPU/gfx10_waitcnt.rst b/llvm/docs/AMDGPU/gfx10_waitcnt.rst index e4c4bcdc169f..c861fb0113c9 100644 --- a/llvm/docs/AMDGPU/gfx10_waitcnt.rst +++ b/llvm/docs/AMDGPU/gfx10_waitcnt.rst @@ -14,30 +14,31 @@ Counts of outstanding instructions to wait for. The bits of this operand have the following meaning: - ============ ====================================================== - Bits Description - ============ ====================================================== - 3:0 VM_CNT: vector memory operations count, lower bits. - 6:4 EXP_CNT: export count. - 11:8 LGKM_CNT: LDS, GDS, Constant and Message count. - 15:14 VM_CNT: vector memory operations count, upper bits. 
- ============ ====================================================== + ========== ========= ================================================ ============ + High Bits Low Bits Description Value Range + ========== ========= ================================================ ============ + 15:14 3:0 VM_CNT: vector memory operations count. 0..63 + \- 6:4 EXP_CNT: export count. 0..7 + \- 11:8 LGKM_CNT: LDS, GDS, Constant and Message count. 0..15 + ========== ========= ================================================ ============ -This operand may be specified as a positive 16-bit :ref:`integer_number` -or as a combination of the following symbolic helpers: +This operand may be specified as one of the following: + +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..0xFFFF. +* A combination of *vmcnt*, *expcnt*, *lgkmcnt* and other values described below. ====================== ====================================================================== Syntax Description ====================== ====================================================================== - vmcnt(<*N*>) VM_CNT value. *N* must not exceed the largest VM_CNT value. - expcnt(<*N*>) EXP_CNT value. *N* must not exceed the largest EXP_CNT value. - lgkmcnt(<*N*>) LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value. - vmcnt_sat(<*N*>) VM_CNT value computed as min(*N*, the largest VM_CNT value). - expcnt_sat(<*N*>) EXP_CNT value computed as min(*N*, the largest EXP_CNT value). - lgkmcnt_sat(<*N*>) LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value). + vmcnt(<*N*>) A VM_CNT value. *N* must not exceed the largest VM_CNT value. + expcnt(<*N*>) An EXP_CNT value. *N* must not exceed the largest EXP_CNT value. + lgkmcnt(<*N*>) An LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value. + vmcnt_sat(<*N*>) A VM_CNT value computed as min(*N*, the largest VM_CNT value). 
+ expcnt_sat(<*N*>) An EXP_CNT value computed as min(*N*, the largest EXP_CNT value). + lgkmcnt_sat(<*N*>) An LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value). ====================== ====================================================================== -These helpers may be specified in any order. Ampersands and commas may be used as optional separators. +These values may be specified in any order. Spaces, ampersands and commas may be used as optional separators. *N* is either an :ref:`integer number` or an @@ -47,10 +48,18 @@ Examples: .. parsed-literal:: - s_waitcnt 0 + vm_cnt = 1 + exp_cnt = 2 + lgkm_cnt = 3 + cnt = vm_cnt | (exp_cnt << 4) | (lgkm_cnt << 8) + + s_waitcnt cnt + s_waitcnt 1 | (2 << 4) | (3 << 8) // the same as above + s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3) // the same as above + s_waitcnt vmcnt(vm_cnt) expcnt(exp_cnt) lgkmcnt(lgkm_cnt) // the same as above + s_waitcnt vmcnt(1) s_waitcnt expcnt(2) lgkmcnt(3) - s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3) s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3) s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2) diff --git a/llvm/docs/AMDGPU/gfx7_bimm16.rst b/llvm/docs/AMDGPU/gfx7_bimm16.rst index eb43f9b36a9d..5f1fbc1dcd6c 100644 --- a/llvm/docs/AMDGPU/gfx7_bimm16.rst +++ b/llvm/docs/AMDGPU/gfx7_bimm16.rst @@ -10,5 +10,5 @@ imm16 =========================== -An :ref:`integer_number`. The value is truncated to 16 bits. +A 16-bit :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range -32768..65535. diff --git a/llvm/docs/AMDGPU/gfx7_bimm32.rst b/llvm/docs/AMDGPU/gfx7_bimm32.rst index 4d8f89d7ae39..45d35ff7adeb 100644 --- a/llvm/docs/AMDGPU/gfx7_bimm32.rst +++ b/llvm/docs/AMDGPU/gfx7_bimm32.rst @@ -10,5 +10,5 @@ imm32 =========================== -An :ref:`integer_number`. The value is truncated to 32 bits. +An :ref:`integer_number` or an :ref:`absolute_expression`. The value is truncated to 32 bits. 
diff --git a/llvm/docs/AMDGPU/gfx7_data_mimg_atomic_cmp.rst b/llvm/docs/AMDGPU/gfx7_data_mimg_atomic_cmp.rst index 82c3337aeb2b..0328519d17b6 100644 --- a/llvm/docs/AMDGPU/gfx7_data_mimg_atomic_cmp.rst +++ b/llvm/docs/AMDGPU/gfx7_data_mimg_atomic_cmp.rst @@ -21,7 +21,7 @@ Optionally may serve as an output data: * :ref:`dmask` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword. * :ref:`tfe` adds 1 dword if specified. - Note. The surface data format is indicated in the image resource constant but not in the instruction. + Note: the surface data format is indicated in the image resource constant but not in the instruction. *Operands:* :ref:`v` diff --git a/llvm/docs/AMDGPU/gfx7_data_mimg_atomic_reg.rst b/llvm/docs/AMDGPU/gfx7_data_mimg_atomic_reg.rst index 729548dcb87b..30785e47ff98 100644 --- a/llvm/docs/AMDGPU/gfx7_data_mimg_atomic_reg.rst +++ b/llvm/docs/AMDGPU/gfx7_data_mimg_atomic_reg.rst @@ -21,6 +21,6 @@ Optionally may serve as an output data: * :ref:`dmask` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword. * :ref:`tfe` adds 1 dword if specified. - Note. The surface data format is indicated in the image resource constant but not in the instruction. + Note: the surface data format is indicated in the image resource constant but not in the instruction. *Operands:* :ref:`v` diff --git a/llvm/docs/AMDGPU/gfx7_fimm32.rst b/llvm/docs/AMDGPU/gfx7_fimm32.rst index 70c81891e978..de261aea8dbf 100644 --- a/llvm/docs/AMDGPU/gfx7_fimm32.rst +++ b/llvm/docs/AMDGPU/gfx7_fimm32.rst @@ -10,5 +10,6 @@ imm32 =========================== -An :ref:`integer_number` or a :ref:`floating-point_number`. The value is converted to *f32* as described :ref:`here`. +A :ref:`floating-point_number`, an :ref:`integer_number`, or an :ref:`absolute_expression`. 
+The value is converted to *f32* as described :ref:`here`. diff --git a/llvm/docs/AMDGPU/gfx7_hwreg.rst b/llvm/docs/AMDGPU/gfx7_hwreg.rst index 1e2d96417c33..b303b6ce8f97 100644 --- a/llvm/docs/AMDGPU/gfx7_hwreg.rst +++ b/llvm/docs/AMDGPU/gfx7_hwreg.rst @@ -14,18 +14,21 @@ Bits of a hardware register being accessed. The bits of this operand have the following meaning: - ============ =================================== - Bits Description - ============ =================================== - 5:0 Register *id*. - 10:6 First bit *offset* (0..31). - 15:11 *Size* in bits (1..32). - ============ =================================== + ======= ===================== ============ + Bits Description Value Range + ======= ===================== ============ + 5:0 Register *id*. 0..63 + 10:6 First bit *offset*. 0..31 + 15:11 *Size* in bits. 1..32 + ======= ===================== ============ -This operand may be specified as a positive 16-bit :ref:`integer_number` or using the syntax described below. +This operand may be specified as one of the following: + +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..0xFFFF. +* An *hwreg* value described below. ==================================== ============================================================================ - Syntax Description + Hwreg Value Syntax Description ==================================== ============================================================================ hwreg({0..63}) All bits of a register indicated by its *id*. hwreg(<*name*>) All bits of a register indicated by its *name*. @@ -33,7 +36,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*. ==================================== ============================================================================ -Register *id*, *offset* and *size* must be specified as positive :ref:`integer numbers`. 
+Numeric values may be specified as positive :ref:`integer numbers` +or :ref:`absolute expressions`. Defined register *names* include: @@ -53,7 +57,16 @@ Examples: .. parsed-literal:: - s_getreg_b32 s2, 0x6 + reg = 1 + offset = 2 + size = 4 + hwreg_enc = reg | (offset << 6) | ((size - 1) << 11) + + s_getreg_b32 s2, 0x1881 + s_getreg_b32 s2, hwreg_enc // the same as above + s_getreg_b32 s2, hwreg(1, 2, 4) // the same as above + s_getreg_b32 s2, hwreg(reg, offset, size) // the same as above + s_getreg_b32 s2, hwreg(15) s_getreg_b32 s2, hwreg(51, 1, 31) s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1) diff --git a/llvm/docs/AMDGPU/gfx7_label.rst b/llvm/docs/AMDGPU/gfx7_label.rst index ed2f3a416667..2f6bd68ddc33 100644 --- a/llvm/docs/AMDGPU/gfx7_label.rst +++ b/llvm/docs/AMDGPU/gfx7_label.rst @@ -12,19 +12,26 @@ label A branch target which is a 16-bit signed integer treated as a PC-relative dword offset. -This operand may be specified as: +This operand may be specified as one of the following: -* An :ref:`integer_number`. The number is truncated to 16 bits. -* An :ref:`absolute_expression` which must start with an :ref:`integer_number`. The value of the expression is truncated to 16 bits. -* A :ref:`symbol` (for example, a label). The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker. +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range -32768..65535. +* A :ref:`symbol` (for example, a label) representing a relocatable address in the same compilation unit where it is referred from. The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker. Examples: .. parsed-literal:: offset = 30 - s_branch loop_end - s_branch 2 + offset - s_branch 32 - loop_end: + label_1: + label_2 = . 
+ 4 + + s_branch 32 + s_branch offset + 2 + s_branch label_1 + s_branch label_2 + s_branch label_3 + s_branch label_4 + + label_3 = label_2 + 4 + label_4: diff --git a/llvm/docs/AMDGPU/gfx7_msg.rst b/llvm/docs/AMDGPU/gfx7_msg.rst index 5476053ccc1c..72a895ef5b36 100644 --- a/llvm/docs/AMDGPU/gfx7_msg.rst +++ b/llvm/docs/AMDGPU/gfx7_msg.rst @@ -12,24 +12,29 @@ msg A 16-bit message code. The bits of this operand have the following meaning: - ============ ====================================================== - Bits Description - ============ ====================================================== - 3:0 Message *type*. - 6:4 Optional *operation*. - 9:7 Optional *parameters*. - 15:10 Unused. - ============ ====================================================== + ============ =============================== =============== + Bits Description Value Range + ============ =============================== =============== + 3:0 Message *type*. 0..15 + 6:4 Optional *operation*. 0..7 + 7:7 Unused. \- + 9:8 Optional *stream*. 0..3 + 15:10 Unused. \- + ============ =============================== =============== -This operand may be specified as a positive 16-bit :ref:`integer_number` or using the syntax described below: +This operand may be specified as one of the following: - ======================================== ======================================================================== - Syntax Description - ======================================== ======================================================================== - sendmsg(<*type*>) A message identified by its *type*. - sendmsg(<*type*>, <*op*>) A message identified by its *type* and *operation*. - sendmsg(<*type*>, <*op*>, <*stream*>) A message identified by its *type* and *operation* with a stream *id*. - ======================================== ======================================================================== +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..0xFFFF. 
+* A *sendmsg* value described below. + + ==================================== ==================================================== + Sendmsg Value Syntax Description + ==================================== ==================================================== + sendmsg(<*type*>) A message identified by its *type*. + sendmsg(<*type*>,<*op*>) A message identified by its *type* and *operation*. + sendmsg(<*type*>,<*op*>,<*stream*>) A message identified by its *type* and *operation* + with a stream *id*. + ==================================== ==================================================== *Type* may be specified using message *name* or message *id*. @@ -37,7 +42,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number`. +Numeric values may be specified as positive :ref:`integer numbers` +or :ref:`absolute expressions`. Each message type supports specific operations: @@ -58,15 +64,31 @@ Each message type supports specific operations: \ SYSMSG_OP_TTRACE_PC 4 \- ================= ========== ============================== ============ ========== +*Sendmsg* arguments are validated depending on how *type* value is specified: + +* If message *type* is specified by name, arguments values must satisfy limitations detailed in the table above. +* If message *type* is specified as a number, each argument must not exceed corresponding value range (see the first table). + Examples: .. 
parsed-literal:: + // numeric message code + msg = 0x10 s_sendmsg 0x12 + s_sendmsg msg + 2 + + // sendmsg with strict arguments validation s_sendmsg sendmsg(MSG_INTERRUPT) - s_sendmsg sendmsg(2, GS_OP_CUT) s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT) s_sendmsg sendmsg(MSG_GS, 2) s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1) s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC) + // sendmsg with validation of value range only + msg = 2 + op = 3 + stream = 1 + s_sendmsg sendmsg(msg, op, stream) + s_sendmsg sendmsg(2, GS_OP_CUT) + diff --git a/llvm/docs/AMDGPU/gfx7_simm16.rst b/llvm/docs/AMDGPU/gfx7_simm16.rst index 66e560ecec8c..3e5d3700863f 100644 --- a/llvm/docs/AMDGPU/gfx7_simm16.rst +++ b/llvm/docs/AMDGPU/gfx7_simm16.rst @@ -10,5 +10,5 @@ imm16 =========================== -An :ref:`integer_number`. The value is truncated to 16 bits and then sign-extended to 32 bits. +An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range -32768..65535. diff --git a/llvm/docs/AMDGPU/gfx7_uimm16.rst b/llvm/docs/AMDGPU/gfx7_uimm16.rst index bd0d4c2fb144..d8a1a20528ae 100644 --- a/llvm/docs/AMDGPU/gfx7_uimm16.rst +++ b/llvm/docs/AMDGPU/gfx7_uimm16.rst @@ -10,5 +10,5 @@ imm16 =========================== -An :ref:`integer_number`. The value is truncated to 16 bits and then zero-extended to 32 bits. +An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..65535. diff --git a/llvm/docs/AMDGPU/gfx7_waitcnt.rst b/llvm/docs/AMDGPU/gfx7_waitcnt.rst index 3f5e07d16493..9566e6c67327 100644 --- a/llvm/docs/AMDGPU/gfx7_waitcnt.rst +++ b/llvm/docs/AMDGPU/gfx7_waitcnt.rst @@ -14,29 +14,31 @@ Counts of outstanding instructions to wait for. The bits of this operand have the following meaning: - ============ ====================================================== - Bits Description - ============ ====================================================== - 3:0 VM_CNT: vector memory operations count. - 6:4 EXP_CNT: export count. 
- 12:8 LGKM_CNT: LDS, GDS, Constant and Message count. - ============ ====================================================== + ===== ================================================ ============ + Bits Description Value Range + ===== ================================================ ============ + 3:0 VM_CNT: vector memory operations count. 0..15 + 6:4 EXP_CNT: export count. 0..7 + 12:8 LGKM_CNT: LDS, GDS, Constant and Message count. 0..31 + ===== ================================================ ============ -This operand may be specified as a positive 16-bit :ref:`integer_number` -or as a combination of the following symbolic helpers: +This operand may be specified as one of the following: + +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..0xFFFF. +* A combination of *vmcnt*, *expcnt*, *lgkmcnt* and other values described below. ====================== ====================================================================== Syntax Description ====================== ====================================================================== - vmcnt(<*N*>) VM_CNT value. *N* must not exceed the largest VM_CNT value. - expcnt(<*N*>) EXP_CNT value. *N* must not exceed the largest EXP_CNT value. - lgkmcnt(<*N*>) LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value. - vmcnt_sat(<*N*>) VM_CNT value computed as min(*N*, the largest VM_CNT value). - expcnt_sat(<*N*>) EXP_CNT value computed as min(*N*, the largest EXP_CNT value). - lgkmcnt_sat(<*N*>) LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value). + vmcnt(<*N*>) A VM_CNT value. *N* must not exceed the largest VM_CNT value. + expcnt(<*N*>) An EXP_CNT value. *N* must not exceed the largest EXP_CNT value. + lgkmcnt(<*N*>) An LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value. + vmcnt_sat(<*N*>) A VM_CNT value computed as min(*N*, the largest VM_CNT value). + expcnt_sat(<*N*>) An EXP_CNT value computed as min(*N*, the largest EXP_CNT value). 
+ lgkmcnt_sat(<*N*>) An LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value). ====================== ====================================================================== -These helpers may be specified in any order. Ampersands and commas may be used as optional separators. +These values may be specified in any order. Spaces, ampersands and commas may be used as optional separators. *N* is either an :ref:`integer number` or an @@ -46,10 +48,18 @@ Examples: .. parsed-literal:: - s_waitcnt 0 + vm_cnt = 1 + exp_cnt = 2 + lgkm_cnt = 3 + cnt = vm_cnt | (exp_cnt << 4) | (lgkm_cnt << 8) + + s_waitcnt cnt + s_waitcnt 1 | (2 << 4) | (3 << 8) // the same as above + s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3) // the same as above + s_waitcnt vmcnt(vm_cnt) expcnt(exp_cnt) lgkmcnt(lgkm_cnt) // the same as above + s_waitcnt vmcnt(1) s_waitcnt expcnt(2) lgkmcnt(3) - s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3) s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3) s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2) diff --git a/llvm/docs/AMDGPU/gfx8_bimm16.rst b/llvm/docs/AMDGPU/gfx8_bimm16.rst index ed50e5582329..66875c6024f4 100644 --- a/llvm/docs/AMDGPU/gfx8_bimm16.rst +++ b/llvm/docs/AMDGPU/gfx8_bimm16.rst @@ -10,5 +10,5 @@ imm16 =========================== -An :ref:`integer_number`. The value is truncated to 16 bits. +A 16-bit :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range -32768..65535. diff --git a/llvm/docs/AMDGPU/gfx8_bimm32.rst b/llvm/docs/AMDGPU/gfx8_bimm32.rst index d03c27b1732b..e46bc3c7ac81 100644 --- a/llvm/docs/AMDGPU/gfx8_bimm32.rst +++ b/llvm/docs/AMDGPU/gfx8_bimm32.rst @@ -10,5 +10,5 @@ imm32 =========================== -An :ref:`integer_number`. The value is truncated to 32 bits. +An :ref:`integer_number` or an :ref:`absolute_expression`. The value is truncated to 32 bits. 
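The s_waitcnt bit layout given in the gfx7_waitcnt.rst hunk above (VM_CNT in bits 3:0, EXP_CNT in bits 6:4, LGKM_CNT in bits 12:8) can be sketched in Python as a small packing helper. This is only an illustration and not part of LLVM: the function name, the field table and the `saturate` flag are hypothetical, and the sketch requires all three counters because the text above does not say how omitted counters are encoded.

```python
# Hypothetical helper, not an LLVM API: packs s_waitcnt counters using the
# GFX7 layout from the table above. Each entry is (shift, width in bits).
GFX7_WAITCNT_FIELDS = {
    "vmcnt": (0, 4),    # bits 3:0, value range 0..15
    "expcnt": (4, 3),   # bits 6:4, value range 0..7
    "lgkmcnt": (8, 5),  # bits 12:8, value range 0..31
}


def encode_waitcnt(vmcnt, expcnt, lgkmcnt, saturate=False,
                   fields=GFX7_WAITCNT_FIELDS):
    """Return the integer operand equivalent to vmcnt() expcnt() lgkmcnt().

    With saturate=True every counter is clamped to its largest value,
    mirroring the *_sat forms; otherwise an out-of-range counter is an error.
    """
    counts = {"vmcnt": vmcnt, "expcnt": expcnt, "lgkmcnt": lgkmcnt}
    value = 0
    for name, n in counts.items():
        shift, width = fields[name]
        largest = (1 << width) - 1
        if n < 0:
            raise ValueError(f"{name} must not be negative")
        if n > largest:
            if not saturate:
                raise ValueError(f"{name} must not exceed {largest}")
            n = largest  # min(N, the largest counter value)
        value |= n << shift
    return value


# Same value as the expression 1 | (2 << 4) | (3 << 8) in the examples.
print(hex(encode_waitcnt(1, 2, 3)))
```

For GFX8, where the table in gfx8_waitcnt.rst places LGKM_CNT in bits 11:8, the same helper could be called with a field table whose `lgkmcnt` entry is `(8, 4)`.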
diff --git a/llvm/docs/AMDGPU/gfx8_data_mimg_atomic_cmp.rst b/llvm/docs/AMDGPU/gfx8_data_mimg_atomic_cmp.rst index 80222ead81a8..237b91cadfee 100644 --- a/llvm/docs/AMDGPU/gfx8_data_mimg_atomic_cmp.rst +++ b/llvm/docs/AMDGPU/gfx8_data_mimg_atomic_cmp.rst @@ -21,7 +21,7 @@ Optionally may serve as an output data: * :ref:`dmask` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword. * :ref:`tfe` adds 1 dword if specified. - Note. The surface data format is indicated in the image resource constant but not in the instruction. + Note: the surface data format is indicated in the image resource constant but not in the instruction. *Operands:* :ref:`v` diff --git a/llvm/docs/AMDGPU/gfx8_data_mimg_atomic_reg.rst b/llvm/docs/AMDGPU/gfx8_data_mimg_atomic_reg.rst index 8baf9269b887..35bcdc379a10 100644 --- a/llvm/docs/AMDGPU/gfx8_data_mimg_atomic_reg.rst +++ b/llvm/docs/AMDGPU/gfx8_data_mimg_atomic_reg.rst @@ -21,6 +21,6 @@ Optionally may serve as an output data: * :ref:`dmask` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword. * :ref:`tfe` adds 1 dword if specified. - Note. The surface data format is indicated in the image resource constant but not in the instruction. + Note: the surface data format is indicated in the image resource constant but not in the instruction. *Operands:* :ref:`v` diff --git a/llvm/docs/AMDGPU/gfx8_fimm16.rst b/llvm/docs/AMDGPU/gfx8_fimm16.rst index 5e387f5cb774..7abd5987bcdd 100644 --- a/llvm/docs/AMDGPU/gfx8_fimm16.rst +++ b/llvm/docs/AMDGPU/gfx8_fimm16.rst @@ -10,5 +10,6 @@ imm32 =========================== -An :ref:`integer_number` or a :ref:`floating-point_number`. The number is converted to *f16* as described :ref:`here`. +A :ref:`floating-point_number`, an :ref:`integer_number`, or an :ref:`absolute_expression`. 
+The value is converted to *f16* as described :ref:`here`. diff --git a/llvm/docs/AMDGPU/gfx8_fimm32.rst b/llvm/docs/AMDGPU/gfx8_fimm32.rst index e29e7704b8aa..a1556557f1da 100644 --- a/llvm/docs/AMDGPU/gfx8_fimm32.rst +++ b/llvm/docs/AMDGPU/gfx8_fimm32.rst @@ -10,5 +10,6 @@ imm32 =========================== -An :ref:`integer_number` or a :ref:`floating-point_number`. The value is converted to *f32* as described :ref:`here`. +A :ref:`floating-point_number`, an :ref:`integer_number`, or an :ref:`absolute_expression`. +The value is converted to *f32* as described :ref:`here`. diff --git a/llvm/docs/AMDGPU/gfx8_hwreg.rst b/llvm/docs/AMDGPU/gfx8_hwreg.rst index ffa1ea5afde3..87d788849110 100644 --- a/llvm/docs/AMDGPU/gfx8_hwreg.rst +++ b/llvm/docs/AMDGPU/gfx8_hwreg.rst @@ -14,18 +14,21 @@ Bits of a hardware register being accessed. The bits of this operand have the following meaning: - ============ =================================== - Bits Description - ============ =================================== - 5:0 Register *id*. - 10:6 First bit *offset* (0..31). - 15:11 *Size* in bits (1..32). - ============ =================================== + ======= ===================== ============ + Bits Description Value Range + ======= ===================== ============ + 5:0 Register *id*. 0..63 + 10:6 First bit *offset*. 0..31 + 15:11 *Size* in bits. 1..32 + ======= ===================== ============ -This operand may be specified as a positive 16-bit :ref:`integer_number` or using the syntax described below. +This operand may be specified as one of the following: + +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..0xFFFF. +* An *hwreg* value described below. 
==================================== ============================================================================ - Syntax Description + Hwreg Value Syntax Description ==================================== ============================================================================ hwreg({0..63}) All bits of a register indicated by its *id*. hwreg(<*name*>) All bits of a register indicated by its *name*. @@ -33,7 +36,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*. ==================================== ============================================================================ -Register *id*, *offset* and *size* must be specified as positive :ref:`integer numbers`. +Numeric values may be specified as positive :ref:`integer numbers` +or :ref:`absolute expressions`. Defined register *names* include: @@ -53,7 +57,16 @@ Examples: .. parsed-literal:: - s_getreg_b32 s2, 0x6 + reg = 1 + offset = 2 + size = 4 + hwreg_enc = reg | (offset << 6) | ((size - 1) << 11) + + s_getreg_b32 s2, 0x1881 + s_getreg_b32 s2, hwreg_enc // the same as above + s_getreg_b32 s2, hwreg(1, 2, 4) // the same as above + s_getreg_b32 s2, hwreg(reg, offset, size) // the same as above + s_getreg_b32 s2, hwreg(15) s_getreg_b32 s2, hwreg(51, 1, 31) s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1) diff --git a/llvm/docs/AMDGPU/gfx8_imask.rst b/llvm/docs/AMDGPU/gfx8_imask.rst new file mode 100644 index 000000000000..55e9a244bdcb --- /dev/null +++ b/llvm/docs/AMDGPU/gfx8_imask.rst @@ -0,0 +1,66 @@ +.. + ************************************************** + * * + * Automatically generated file, do not edit! * + * * + ************************************************** + +.. _amdgpu_synid8_imask: + +imask +=========================== + +This operand is a mask which controls indexing mode for operands of subsequent instructions. 
+Bits 0, 1 and 2 control indexing of *src0*, *src1* and *src2*, while bit 3 controls indexing of *dst*. +Value 1 enables indexing and value 0 disables it. + + ===== ======================================== + Bit Meaning + ===== ======================================== + 0 Enables or disables *src0* indexing. + 1 Enables or disables *src1* indexing. + 2 Enables or disables *src2* indexing. + 3 Enables or disables *dst* indexing. + ===== ======================================== + +This operand may be specified as one of the following: + +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..15. +* A *gpr_idx* value described below. + + ==================================== =========================================== + Gpr_idx Value Syntax Description + ==================================== =========================================== + gpr_idx(*<operands>*) Enable indexing for specified *operands* + and disable it for the rest. + *Operands* is a comma-separated list of + values which may include: + + * "SRC0" - enable *src0* indexing. + + * "SRC1" - enable *src1* indexing. + + * "SRC2" - enable *src2* indexing. + + * "DST" - enable *dst* indexing. + + Each of these values may be specified only + once. + + *Operands* list may be empty; this syntax + disables indexing for all operands. + ==================================== =========================================== + +Examples: + +.. parsed-literal:: + + s_set_gpr_idx_mode 0 + s_set_gpr_idx_mode gpr_idx() // the same as above + + s_set_gpr_idx_mode 15 + s_set_gpr_idx_mode gpr_idx(DST,SRC0,SRC1,SRC2) // the same as above + s_set_gpr_idx_mode gpr_idx(SRC0,SRC1,SRC2,DST) // the same as above + + s_set_gpr_idx_mode gpr_idx(DST,SRC1) + diff --git a/llvm/docs/AMDGPU/gfx8_imm4.rst b/llvm/docs/AMDGPU/gfx8_imm4.rst deleted file mode 100644 index a03de76e86ef..000000000000 --- a/llvm/docs/AMDGPU/gfx8_imm4.rst +++ /dev/null @@ -1,25 +0,0 @@ -.. 
- ************************************************** - * * - * Automatically generated file, do not edit! * - * * - ************************************************** - -.. _amdgpu_synid8_imm4: - -imm4 -=========================== - -A positive :ref:`integer_number`. The value is truncated to 4 bits. - -This operand is a mask which controls indexing mode for operands of subsequent instructions. Value 1 enables indexing and value 0 disables it. - - ============ ======================================== - Bit Meaning - ============ ======================================== - 0 Enables or disables *src0* indexing. - 1 Enables or disables *src1* indexing. - 2 Enables or disables *src2* indexing. - 3 Enables or disables *dst* indexing. - ============ ======================================== - diff --git a/llvm/docs/AMDGPU/gfx8_label.rst b/llvm/docs/AMDGPU/gfx8_label.rst index 99e384ee392e..4f10c76f2687 100644 --- a/llvm/docs/AMDGPU/gfx8_label.rst +++ b/llvm/docs/AMDGPU/gfx8_label.rst @@ -12,19 +12,26 @@ label A branch target which is a 16-bit signed integer treated as a PC-relative dword offset. -This operand may be specified as: +This operand may be specified as one of the following: -* An :ref:`integer_number`. The number is truncated to 16 bits. -* An :ref:`absolute_expression` which must start with an :ref:`integer_number`. The value of the expression is truncated to 16 bits. -* A :ref:`symbol` (for example, a label). The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker. +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range -32768..65535. +* A :ref:`symbol` (for example, a label) representing a relocatable address in the same compilation unit where it is referred from. The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker. Examples: .. parsed-literal:: offset = 30 - s_branch loop_end - s_branch 2 + offset - s_branch 32 - loop_end: + label_1: + label_2 = . 
+ 4 + + s_branch 32 + s_branch offset + 2 + s_branch label_1 + s_branch label_2 + s_branch label_3 + s_branch label_4 + + label_3 = label_2 + 4 + label_4: diff --git a/llvm/docs/AMDGPU/gfx8_msg.rst b/llvm/docs/AMDGPU/gfx8_msg.rst index 313d8e68b4b8..0b0b2f307482 100644 --- a/llvm/docs/AMDGPU/gfx8_msg.rst +++ b/llvm/docs/AMDGPU/gfx8_msg.rst @@ -12,24 +12,29 @@ msg A 16-bit message code. The bits of this operand have the following meaning: - ============ ====================================================== - Bits Description - ============ ====================================================== - 3:0 Message *type*. - 6:4 Optional *operation*. - 9:7 Optional *parameters*. - 15:10 Unused. - ============ ====================================================== + ============ =============================== =============== + Bits Description Value Range + ============ =============================== =============== + 3:0 Message *type*. 0..15 + 6:4 Optional *operation*. 0..7 + 7:7 Unused. \- + 9:8 Optional *stream*. 0..3 + 15:10 Unused. \- + ============ =============================== =============== -This operand may be specified as a positive 16-bit :ref:`integer_number` or using the syntax described below: +This operand may be specified as one of the following: - ======================================== ======================================================================== - Syntax Description - ======================================== ======================================================================== - sendmsg(<*type*>) A message identified by its *type*. - sendmsg(<*type*>, <*op*>) A message identified by its *type* and *operation*. - sendmsg(<*type*>, <*op*>, <*stream*>) A message identified by its *type* and *operation* with a stream *id*. - ======================================== ======================================================================== +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..0xFFFF. 
+* A *sendmsg* value described below. + + ==================================== ==================================================== + Sendmsg Value Syntax Description + ==================================== ==================================================== + sendmsg(<*type*>) A message identified by its *type*. + sendmsg(<*type*>,<*op*>) A message identified by its *type* and *operation*. + sendmsg(<*type*>,<*op*>,<*stream*>) A message identified by its *type* and *operation* + with a stream *id*. + ==================================== ==================================================== *Type* may be specified using message *name* or message *id*. @@ -37,7 +42,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number`. +Numeric values may be specified as positive :ref:`integer numbers` +or :ref:`absolute expressions`. Each message type supports specific operations: @@ -58,15 +64,31 @@ Each message type supports specific operations: \ SYSMSG_OP_TTRACE_PC 4 \- ================= ========== ============================== ============ ========== +*Sendmsg* arguments are validated depending on how the *type* value is specified: + +* If the message *type* is specified by name, argument values must satisfy the limitations detailed in the table above. +* If the message *type* is specified as a number, each argument must not exceed the corresponding value range (see the first table). + Examples: .. 
parsed-literal:: + // numeric message code + msg = 0x10 s_sendmsg 0x12 + s_sendmsg msg + 2 + + // sendmsg with strict arguments validation s_sendmsg sendmsg(MSG_INTERRUPT) - s_sendmsg sendmsg(2, GS_OP_CUT) s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT) s_sendmsg sendmsg(MSG_GS, 2) s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1) s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC) + // sendmsg with validation of value range only + msg = 2 + op = 3 + stream = 1 + s_sendmsg sendmsg(msg, op, stream) + s_sendmsg sendmsg(2, GS_OP_CUT) + diff --git a/llvm/docs/AMDGPU/gfx8_perm_smem.rst b/llvm/docs/AMDGPU/gfx8_perm_smem.rst index 0035ac821a72..75d02ddac796 100644 --- a/llvm/docs/AMDGPU/gfx8_perm_smem.rst +++ b/llvm/docs/AMDGPU/gfx8_perm_smem.rst @@ -12,7 +12,8 @@ imm3 A bit mask which indicates request permissions. -This operand must be specified as an :ref:`integer_number`. The value is truncated to 7 bits, but only 3 low bits are significant. +This operand must be specified as an :ref:`integer_number` or an :ref:`absolute_expression`. +The value is truncated to 7 bits, but only 3 low bits are significant. ============ ============================== Bit Number Description diff --git a/llvm/docs/AMDGPU/gfx8_simm16.rst b/llvm/docs/AMDGPU/gfx8_simm16.rst index 730f239b6bea..86161c5400ef 100644 --- a/llvm/docs/AMDGPU/gfx8_simm16.rst +++ b/llvm/docs/AMDGPU/gfx8_simm16.rst @@ -10,5 +10,5 @@ imm16 =========================== -An :ref:`integer_number`. The value is truncated to 16 bits and then sign-extended to 32 bits. +An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range -32768..65535. diff --git a/llvm/docs/AMDGPU/gfx8_uimm16.rst b/llvm/docs/AMDGPU/gfx8_uimm16.rst index a20abcc13448..9da1c60a8a73 100644 --- a/llvm/docs/AMDGPU/gfx8_uimm16.rst +++ b/llvm/docs/AMDGPU/gfx8_uimm16.rst @@ -10,5 +10,5 @@ imm16 =========================== -An :ref:`integer_number`. The value is truncated to 16 bits and then zero-extended to 32 bits. 
+An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..65535. diff --git a/llvm/docs/AMDGPU/gfx8_waitcnt.rst b/llvm/docs/AMDGPU/gfx8_waitcnt.rst index 4bad59417116..29b430e07419 100644 --- a/llvm/docs/AMDGPU/gfx8_waitcnt.rst +++ b/llvm/docs/AMDGPU/gfx8_waitcnt.rst @@ -14,29 +14,31 @@ Counts of outstanding instructions to wait for. The bits of this operand have the following meaning: - ============ ====================================================== - Bits Description - ============ ====================================================== - 3:0 VM_CNT: vector memory operations count. - 6:4 EXP_CNT: export count. - 11:8 LGKM_CNT: LDS, GDS, Constant and Message count. - ============ ====================================================== + ===== ================================================ ============ + Bits Description Value Range + ===== ================================================ ============ + 3:0 VM_CNT: vector memory operations count. 0..15 + 6:4 EXP_CNT: export count. 0..7 + 11:8 LGKM_CNT: LDS, GDS, Constant and Message count. 0..15 + ===== ================================================ ============ -This operand may be specified as a positive 16-bit :ref:`integer_number` -or as a combination of the following symbolic helpers: +This operand may be specified as one of the following: + +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..0xFFFF. +* A combination of *vmcnt*, *expcnt*, *lgkmcnt* and other values described below. ====================== ====================================================================== Syntax Description ====================== ====================================================================== - vmcnt(<*N*>) VM_CNT value. *N* must not exceed the largest VM_CNT value. - expcnt(<*N*>) EXP_CNT value. *N* must not exceed the largest EXP_CNT value. - lgkmcnt(<*N*>) LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value. 
- vmcnt_sat(<*N*>) VM_CNT value computed as min(*N*, the largest VM_CNT value). - expcnt_sat(<*N*>) EXP_CNT value computed as min(*N*, the largest EXP_CNT value). - lgkmcnt_sat(<*N*>) LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value). + vmcnt(<*N*>) A VM_CNT value. *N* must not exceed the largest VM_CNT value. + expcnt(<*N*>) An EXP_CNT value. *N* must not exceed the largest EXP_CNT value. + lgkmcnt(<*N*>) An LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value. + vmcnt_sat(<*N*>) A VM_CNT value computed as min(*N*, the largest VM_CNT value). + expcnt_sat(<*N*>) An EXP_CNT value computed as min(*N*, the largest EXP_CNT value). + lgkmcnt_sat(<*N*>) An LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value). ====================== ====================================================================== -These helpers may be specified in any order. Ampersands and commas may be used as optional separators. +These values may be specified in any order. Spaces, ampersands and commas may be used as optional separators. *N* is either an :ref:`integer number` or an @@ -46,10 +48,18 @@ Examples: .. parsed-literal:: - s_waitcnt 0 + vm_cnt = 1 + exp_cnt = 2 + lgkm_cnt = 3 + cnt = vm_cnt | (exp_cnt << 4) | (lgkm_cnt << 8) + + s_waitcnt cnt + s_waitcnt 1 | (2 << 4) | (3 << 8) // the same as above + s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3) // the same as above + s_waitcnt vmcnt(vm_cnt) expcnt(exp_cnt) lgkmcnt(lgkm_cnt) // the same as above + s_waitcnt vmcnt(1) s_waitcnt expcnt(2) lgkmcnt(3) - s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3) s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3) s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2) diff --git a/llvm/docs/AMDGPU/gfx9_bimm16.rst b/llvm/docs/AMDGPU/gfx9_bimm16.rst index 2c9dc5c52156..6e961167f47a 100644 --- a/llvm/docs/AMDGPU/gfx9_bimm16.rst +++ b/llvm/docs/AMDGPU/gfx9_bimm16.rst @@ -10,5 +10,5 @@ imm16 =========================== -An :ref:`integer_number`. The value is truncated to 16 bits. 
+A 16-bit :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range -32768..65535. diff --git a/llvm/docs/AMDGPU/gfx9_bimm32.rst b/llvm/docs/AMDGPU/gfx9_bimm32.rst index e9b89674a2ef..22286f1d2720 100644 --- a/llvm/docs/AMDGPU/gfx9_bimm32.rst +++ b/llvm/docs/AMDGPU/gfx9_bimm32.rst @@ -10,5 +10,5 @@ imm32 =========================== -An :ref:`integer_number`. The value is truncated to 32 bits. +An :ref:`integer_number` or an :ref:`absolute_expression`. The value is truncated to 32 bits. diff --git a/llvm/docs/AMDGPU/gfx9_data_mimg_atomic_cmp.rst b/llvm/docs/AMDGPU/gfx9_data_mimg_atomic_cmp.rst index 08fe297b4f39..79d10fdb4e96 100644 --- a/llvm/docs/AMDGPU/gfx9_data_mimg_atomic_cmp.rst +++ b/llvm/docs/AMDGPU/gfx9_data_mimg_atomic_cmp.rst @@ -21,7 +21,7 @@ Optionally may serve as an output data: * :ref:`dmask` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword. * :ref:`tfe` adds 1 dword if specified. - Note. The surface data format is indicated in the image resource constant but not in the instruction. + Note: the surface data format is indicated in the image resource constant but not in the instruction. *Operands:* :ref:`v` diff --git a/llvm/docs/AMDGPU/gfx9_data_mimg_atomic_reg.rst b/llvm/docs/AMDGPU/gfx9_data_mimg_atomic_reg.rst index 2037dfd5356b..6889c468d26e 100644 --- a/llvm/docs/AMDGPU/gfx9_data_mimg_atomic_reg.rst +++ b/llvm/docs/AMDGPU/gfx9_data_mimg_atomic_reg.rst @@ -21,6 +21,6 @@ Optionally may serve as an output data: * :ref:`dmask` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword. * :ref:`tfe` adds 1 dword if specified. - Note. The surface data format is indicated in the image resource constant but not in the instruction. + Note: the surface data format is indicated in the image resource constant but not in the instruction. 
*Operands:* :ref:`v` diff --git a/llvm/docs/AMDGPU/gfx9_fimm16.rst b/llvm/docs/AMDGPU/gfx9_fimm16.rst index a438b452eecf..53432d2594a2 100644 --- a/llvm/docs/AMDGPU/gfx9_fimm16.rst +++ b/llvm/docs/AMDGPU/gfx9_fimm16.rst @@ -10,5 +10,6 @@ imm32 =========================== -An :ref:`integer_number` or a :ref:`floating-point_number`. The number is converted to *f16* as described :ref:`here`. +A :ref:`floating-point_number`, an :ref:`integer_number`, or an :ref:`absolute_expression`. +The value is converted to *f16* as described :ref:`here`. diff --git a/llvm/docs/AMDGPU/gfx9_fimm32.rst b/llvm/docs/AMDGPU/gfx9_fimm32.rst index 11103e70529e..e3198735ae7a 100644 --- a/llvm/docs/AMDGPU/gfx9_fimm32.rst +++ b/llvm/docs/AMDGPU/gfx9_fimm32.rst @@ -10,5 +10,6 @@ imm32 =========================== -An :ref:`integer_number` or a :ref:`floating-point_number`. The value is converted to *f32* as described :ref:`here`. +A :ref:`floating-point_number`, an :ref:`integer_number`, or an :ref:`absolute_expression`. +The value is converted to *f32* as described :ref:`here`. diff --git a/llvm/docs/AMDGPU/gfx9_hwreg.rst b/llvm/docs/AMDGPU/gfx9_hwreg.rst index 7ebb38b42fe0..b0c70cb6ccf8 100644 --- a/llvm/docs/AMDGPU/gfx9_hwreg.rst +++ b/llvm/docs/AMDGPU/gfx9_hwreg.rst @@ -14,18 +14,21 @@ Bits of a hardware register being accessed. The bits of this operand have the following meaning: - ============ =================================== - Bits Description - ============ =================================== - 5:0 Register *id*. - 10:6 First bit *offset* (0..31). - 15:11 *Size* in bits (1..32). - ============ =================================== + ======= ===================== ============ + Bits Description Value Range + ======= ===================== ============ + 5:0 Register *id*. 0..63 + 10:6 First bit *offset*. 0..31 + 15:11 *Size* in bits. 
1..32 + ======= ===================== ============ -This operand may be specified as a positive 16-bit :ref:`integer_number` or using the syntax described below. +This operand may be specified as one of the following: + +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..0xFFFF. +* An *hwreg* value described below. ==================================== ============================================================================ - Syntax Description + Hwreg Value Syntax Description ==================================== ============================================================================ hwreg({0..63}) All bits of a register indicated by its *id*. hwreg(<*name*>) All bits of a register indicated by its *name*. @@ -33,7 +36,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*. ==================================== ============================================================================ -Register *id*, *offset* and *size* must be specified as positive :ref:`integer numbers`. +Numeric values may be specified as positive :ref:`integer numbers` +or :ref:`absolute expressions`. Defined register *names* include: @@ -54,7 +58,16 @@ Examples: .. parsed-literal:: - s_getreg_b32 s2, 0x6 + reg = 1 + offset = 2 + size = 4 + hwreg_enc = reg | (offset << 6) | ((size - 1) << 11) + + s_getreg_b32 s2, 0x1881 + s_getreg_b32 s2, hwreg_enc // the same as above + s_getreg_b32 s2, hwreg(1, 2, 4) // the same as above + s_getreg_b32 s2, hwreg(reg, offset, size) // the same as above + s_getreg_b32 s2, hwreg(15) s_getreg_b32 s2, hwreg(51, 1, 31) s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1) diff --git a/llvm/docs/AMDGPU/gfx9_imask.rst b/llvm/docs/AMDGPU/gfx9_imask.rst new file mode 100644 index 000000000000..22d73cb4ec04 --- /dev/null +++ b/llvm/docs/AMDGPU/gfx9_imask.rst @@ -0,0 +1,66 @@ +.. 
+ ************************************************** + * * + * Automatically generated file, do not edit! * + * * + ************************************************** + +.. _amdgpu_synid9_imask: + +imask +=========================== + +This operand is a mask which controls indexing mode for operands of subsequent instructions. +Bits 0, 1 and 2 control indexing of *src0*, *src1* and *src2*, while bit 3 controls indexing of *dst*. +Value 1 enables indexing and value 0 disables it. + + ===== ======================================== + Bit Meaning + ===== ======================================== + 0 Enables or disables *src0* indexing. + 1 Enables or disables *src1* indexing. + 2 Enables or disables *src2* indexing. + 3 Enables or disables *dst* indexing. + ===== ======================================== + +This operand may be specified as one of the following: + +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..15. +* A *gpr_idx* value described below. + + ==================================== =========================================== + Gpr_idx Value Syntax Description + ==================================== =========================================== + gpr_idx(*<operands>*) Enable indexing for specified *operands* + and disable it for the rest. + *Operands* is a comma-separated list of + values which may include: + + * "SRC0" - enable *src0* indexing. + + * "SRC1" - enable *src1* indexing. + + * "SRC2" - enable *src2* indexing. + + * "DST" - enable *dst* indexing. + + Each of these values may be specified only + once. + + *Operands* list may be empty; this syntax + disables indexing for all operands. + ==================================== =========================================== + +Examples: + +.. 
parsed-literal:: + + s_set_gpr_idx_mode 0 + s_set_gpr_idx_mode gpr_idx() // the same as above + + s_set_gpr_idx_mode 15 + s_set_gpr_idx_mode gpr_idx(DST,SRC0,SRC1,SRC2) // the same as above + s_set_gpr_idx_mode gpr_idx(SRC0,SRC1,SRC2,DST) // the same as above + + s_set_gpr_idx_mode gpr_idx(DST,SRC1) + diff --git a/llvm/docs/AMDGPU/gfx9_imm4.rst b/llvm/docs/AMDGPU/gfx9_imm4.rst deleted file mode 100644 index b1c97fb0b29d..000000000000 --- a/llvm/docs/AMDGPU/gfx9_imm4.rst +++ /dev/null @@ -1,25 +0,0 @@ -.. - ************************************************** - * * - * Automatically generated file, do not edit! * - * * - ************************************************** - -.. _amdgpu_synid9_imm4: - -imm4 -=========================== - -A positive :ref:`integer_number`. The value is truncated to 4 bits. - -This operand is a mask which controls indexing mode for operands of subsequent instructions. Value 1 enables indexing and value 0 disables it. - - ============ ======================================== - Bit Meaning - ============ ======================================== - 0 Enables or disables *src0* indexing. - 1 Enables or disables *src1* indexing. - 2 Enables or disables *src2* indexing. - 3 Enables or disables *dst* indexing. - ============ ======================================== - diff --git a/llvm/docs/AMDGPU/gfx9_label.rst b/llvm/docs/AMDGPU/gfx9_label.rst index 32771722f71d..7348fc914887 100644 --- a/llvm/docs/AMDGPU/gfx9_label.rst +++ b/llvm/docs/AMDGPU/gfx9_label.rst @@ -12,19 +12,26 @@ label A branch target which is a 16-bit signed integer treated as a PC-relative dword offset. -This operand may be specified as: +This operand may be specified as one of the following: -* An :ref:`integer_number`. The number is truncated to 16 bits. -* An :ref:`absolute_expression` which must start with an :ref:`integer_number`. The value of the expression is truncated to 16 bits. -* A :ref:`symbol` (for example, a label). 
The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker. +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range -32768..65535. +* A :ref:`symbol` (for example, a label) representing a relocatable address in the same compilation unit where it is referred from. The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker. Examples: .. parsed-literal:: offset = 30 - s_branch loop_end - s_branch 2 + offset - s_branch 32 - loop_end: + label_1: + label_2 = . + 4 + + s_branch 32 + s_branch offset + 2 + s_branch label_1 + s_branch label_2 + s_branch label_3 + s_branch label_4 + + label_3 = label_2 + 4 + label_4: diff --git a/llvm/docs/AMDGPU/gfx9_msg.rst b/llvm/docs/AMDGPU/gfx9_msg.rst index 14dff9050e65..34ede7100381 100644 --- a/llvm/docs/AMDGPU/gfx9_msg.rst +++ b/llvm/docs/AMDGPU/gfx9_msg.rst @@ -12,24 +12,29 @@ msg A 16-bit message code. The bits of this operand have the following meaning: - ============ ====================================================== - Bits Description - ============ ====================================================== - 3:0 Message *type*. - 6:4 Optional *operation*. - 9:7 Optional *parameters*. - 15:10 Unused. - ============ ====================================================== + ============ =============================== =============== + Bits Description Value Range + ============ =============================== =============== + 3:0 Message *type*. 0..15 + 6:4 Optional *operation*. 0..7 + 7:7 Unused. \- + 9:8 Optional *stream*. 0..3 + 15:10 Unused. 
\- + ============ =============================== =============== -This operand may be specified as a positive 16-bit :ref:`integer_number` or using the syntax described below: +This operand may be specified as one of the following: - ======================================== ======================================================================== - Syntax Description - ======================================== ======================================================================== - sendmsg(<*type*>) A message identified by its *type*. - sendmsg(<*type*>, <*op*>) A message identified by its *type* and *operation*. - sendmsg(<*type*>, <*op*>, <*stream*>) A message identified by its *type* and *operation* with a stream *id*. - ======================================== ======================================================================== +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..0xFFFF. +* A *sendmsg* value described below. + + ==================================== ==================================================== + Sendmsg Value Syntax Description + ==================================== ==================================================== + sendmsg(<*type*>) A message identified by its *type*. + sendmsg(<*type*>,<*op*>) A message identified by its *type* and *operation*. + sendmsg(<*type*>,<*op*>,<*stream*>) A message identified by its *type* and *operation* + with a stream *id*. + ==================================== ==================================================== *Type* may be specified using message *name* or message *id*. @@ -37,7 +42,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number`. +Numeric values may be specified as positive :ref:`integer numbers` +or :ref:`absolute expressions`. 
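As a rough illustration of the bit layout in the first table (bits 3:0 for *type*, 6:4 for *operation*, 9:8 for *stream*), a numeric message code can be assembled as sketched below. The helper name `encode_sendmsg` is hypothetical and is not part of the assembler; it checks only the value ranges from that table, not the per-message limitations.

```python
def encode_sendmsg(msg_type, op=0, stream=0):
    """Pack type, operation and stream into a 16-bit message code.

    Hypothetical helper for illustration only: it follows the bit layout
    from the table (3:0 type, 6:4 operation, 9:8 stream) and validates
    value ranges, nothing more.
    """
    if not 0 <= msg_type <= 15:
        raise ValueError("type must be in the range 0..15")
    if not 0 <= op <= 7:
        raise ValueError("operation must be in the range 0..7")
    if not 0 <= stream <= 3:
        raise ValueError("stream must be in the range 0..3")
    # Bit 7 and bits 15:10 are unused and stay zero.
    return msg_type | (op << 4) | (stream << 8)
```

Under these assumptions, `encode_sendmsg(2, 1)` gives the same number as the literal `0x12` used in the examples of this file.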
Each message type supports specific operations: @@ -60,16 +66,32 @@ Each message type supports specific operations: \ SYSMSG_OP_TTRACE_PC 4 \- ================= ========== ============================== ============ ========== +*Sendmsg* arguments are validated depending on how *type* value is specified: + +* If message *type* is specified by name, arguments values must satisfy limitations detailed in the table above. +* If message *type* is specified as a number, each argument must not exceed corresponding value range (see the first table). + Examples: .. parsed-literal:: + // numeric message code + msg = 0x10 s_sendmsg 0x12 + s_sendmsg msg + 2 + + // sendmsg with strict arguments validation s_sendmsg sendmsg(MSG_INTERRUPT) - s_sendmsg sendmsg(MSG_GET_DOORBELL) - s_sendmsg sendmsg(2, GS_OP_CUT) s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT) s_sendmsg sendmsg(MSG_GS, 2) s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1) s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC) + s_sendmsg sendmsg(MSG_GET_DOORBELL) + + // sendmsg with validation of value range only + msg = 2 + op = 3 + stream = 1 + s_sendmsg sendmsg(msg, op, stream) + s_sendmsg sendmsg(2, GS_OP_CUT) diff --git a/llvm/docs/AMDGPU/gfx9_perm_smem.rst b/llvm/docs/AMDGPU/gfx9_perm_smem.rst index 370fb0d67b33..d13c566789cf 100644 --- a/llvm/docs/AMDGPU/gfx9_perm_smem.rst +++ b/llvm/docs/AMDGPU/gfx9_perm_smem.rst @@ -12,7 +12,8 @@ imm3 A bit mask which indicates request permissions. -This operand must be specified as an :ref:`integer_number`. The value is truncated to 7 bits, but only 3 low bits are significant. +This operand must be specified as an :ref:`integer_number` or an :ref:`absolute_expression`. +The value is truncated to 7 bits, but only 3 low bits are significant. 
============ ============================== Bit Number Description diff --git a/llvm/docs/AMDGPU/gfx9_simm16.rst b/llvm/docs/AMDGPU/gfx9_simm16.rst index 47b200a72070..4f734a04239a 100644 --- a/llvm/docs/AMDGPU/gfx9_simm16.rst +++ b/llvm/docs/AMDGPU/gfx9_simm16.rst @@ -10,5 +10,5 @@ imm16 =========================== -An :ref:`integer_number`. The value is truncated to 16 bits and then sign-extended to 32 bits. +An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range -32768..65535. diff --git a/llvm/docs/AMDGPU/gfx9_uimm16.rst b/llvm/docs/AMDGPU/gfx9_uimm16.rst index 4d1fe1de3f20..dd3c9b4a2995 100644 --- a/llvm/docs/AMDGPU/gfx9_uimm16.rst +++ b/llvm/docs/AMDGPU/gfx9_uimm16.rst @@ -10,5 +10,5 @@ imm16 =========================== -An :ref:`integer_number`. The value is truncated to 16 bits and then zero-extended to 32 bits. +An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..65535. diff --git a/llvm/docs/AMDGPU/gfx9_waitcnt.rst b/llvm/docs/AMDGPU/gfx9_waitcnt.rst index 015a51ae8c32..342f8c7fdb9e 100644 --- a/llvm/docs/AMDGPU/gfx9_waitcnt.rst +++ b/llvm/docs/AMDGPU/gfx9_waitcnt.rst @@ -14,30 +14,31 @@ Counts of outstanding instructions to wait for. The bits of this operand have the following meaning: - ============ ====================================================== - Bits Description - ============ ====================================================== - 3:0 VM_CNT: vector memory operations count, lower bits. - 6:4 EXP_CNT: export count. - 11:8 LGKM_CNT: LDS, GDS, Constant and Message count. - 15:14 VM_CNT: vector memory operations count, upper bits. - ============ ====================================================== + ========== ========= ================================================ ============ + High Bits Low Bits Description Value Range + ========== ========= ================================================ ============ + 15:14 3:0 VM_CNT: vector memory operations count. 
0..63 + \- 6:4 EXP_CNT: export count. 0..7 + \- 11:8 LGKM_CNT: LDS, GDS, Constant and Message count. 0..15 + ========== ========= ================================================ ============ -This operand may be specified as a positive 16-bit :ref:`integer_number` -or as a combination of the following symbolic helpers: +This operand may be specified as one of the following: + +* An :ref:`integer_number` or an :ref:`absolute_expression`. The value must be in the range 0..0xFFFF. +* A combination of *vmcnt*, *expcnt*, *lgkmcnt* and other values described below. ====================== ====================================================================== Syntax Description ====================== ====================================================================== - vmcnt(<*N*>) VM_CNT value. *N* must not exceed the largest VM_CNT value. - expcnt(<*N*>) EXP_CNT value. *N* must not exceed the largest EXP_CNT value. - lgkmcnt(<*N*>) LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value. - vmcnt_sat(<*N*>) VM_CNT value computed as min(*N*, the largest VM_CNT value). - expcnt_sat(<*N*>) EXP_CNT value computed as min(*N*, the largest EXP_CNT value). - lgkmcnt_sat(<*N*>) LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value). + vmcnt(<*N*>) A VM_CNT value. *N* must not exceed the largest VM_CNT value. + expcnt(<*N*>) An EXP_CNT value. *N* must not exceed the largest EXP_CNT value. + lgkmcnt(<*N*>) An LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value. + vmcnt_sat(<*N*>) A VM_CNT value computed as min(*N*, the largest VM_CNT value). + expcnt_sat(<*N*>) An EXP_CNT value computed as min(*N*, the largest EXP_CNT value). + lgkmcnt_sat(<*N*>) An LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value). ====================== ====================================================================== -These helpers may be specified in any order. Ampersands and commas may be used as optional separators. +These values may be specified in any order. 
Spaces, ampersands and commas may be used as optional separators. *N* is either an :ref:`integer number` or an @@ -47,10 +48,18 @@ Examples: .. parsed-literal:: - s_waitcnt 0 + vm_cnt = 1 + exp_cnt = 2 + lgkm_cnt = 3 + cnt = vm_cnt | (exp_cnt << 4) | (lgkm_cnt << 8) + + s_waitcnt cnt + s_waitcnt 1 | (2 << 4) | (3 << 8) // the same as above + s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3) // the same as above + s_waitcnt vmcnt(vm_cnt) expcnt(exp_cnt) lgkmcnt(lgkm_cnt) // the same as above + s_waitcnt vmcnt(1) s_waitcnt expcnt(2) lgkmcnt(3) - s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3) s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3) s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2) diff --git a/llvm/docs/AMDGPUModifierSyntax.rst b/llvm/docs/AMDGPUModifierSyntax.rst index d66e94dcb91a..526016d4f4f8 100644 --- a/llvm/docs/AMDGPUModifierSyntax.rst +++ b/llvm/docs/AMDGPUModifierSyntax.rst @@ -34,19 +34,21 @@ Specifies an immediate unsigned 8-bit offset, in bytes. The default value is 0. Used with DS instructions which have 2 addresses. - =================== ===================================================== + =================== ==================================================================== Syntax Description - =================== ===================================================== + =================== ==================================================================== offset:{0..0xFF} Specifies an unsigned 8-bit offset as a positive - :ref:`integer number `. - =================== ===================================================== + :ref:`integer number ` + or an :ref:`absolute expression`. + =================== ==================================================================== Examples: .. parsed-literal:: - offset:255 offset:0xff + offset:2-x + offset:-x-y .. _amdgpu_synid_ds_offset16: @@ -57,12 +59,13 @@ Specifies an immediate unsigned 16-bit offset, in bytes. The default value is 0. Used with DS instructions which have 1 address. 
- ==================== ====================================================== + ==================== ==================================================================== Syntax Description - ==================== ====================================================== + ==================== ==================================================================== offset:{0..0xFFFF} Specifies an unsigned 16-bit offset as a positive - :ref:`integer number `. - ==================== ====================================================== + :ref:`integer number ` + or an :ref:`absolute expression`. + ==================== ==================================================================== Examples: @@ -70,6 +73,7 @@ Examples: offset:65535 offset:0xffff + offset:-x-y .. _amdgpu_synid_sw_offset16: @@ -95,7 +99,7 @@ See AMD documentation for more information. *mask* is a 5 character sequence which specifies how to transform the bits of the - lane *id*. + lane *id*. The following characters are allowed: @@ -116,7 +120,7 @@ See AMD documentation for more information. size and must be equal to 2, 4, 8, 16 or 32. The second numeric parameter is an index of the - lane being broadcasted. + lane being broadcasted. The index must not exceed group size. offset:swizzle(SWAP,{1..16}) Specifies a swap mode. @@ -128,7 +132,7 @@ See AMD documentation for more information. Reverses the lanes for groups of 2, 4, 8, 16 or 32 lanes. ======================================================= =========================================================== -Numeric parameters may be specified as either :ref:`integer numbers` or +Note: numeric values may be specified as either :ref:`integer numbers` or :ref:`absolute expressions`. 
Examples: @@ -137,7 +141,7 @@ Examples: offset:255 offset:0xffff - offset:swizzle(QUAD_PERM, 0, 1, 2 ,3) + offset:swizzle(QUAD_PERM, 0, 1, 2, 3) offset:swizzle(BITMASK_PERM, "01pi0") offset:swizzle(BROADCAST, 2, 0) offset:swizzle(SWAP, 8) @@ -212,19 +216,20 @@ Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0. Cannot be used with *global/scratch* opcodes. GFX9 only. - ================= ====================================================== + ================= ==================================================================== Syntax Description - ================= ====================================================== + ================= ==================================================================== offset:{0..4095} Specifies a 12-bit unsigned offset as a positive - :ref:`integer number `. - ================= ====================================================== + :ref:`integer number ` + or an :ref:`absolute expression`. + ================= ==================================================================== Examples: .. parsed-literal:: offset:4095 - offset:0xff + offset:x-0xff .. _amdgpu_synid_flat_offset13s: @@ -235,12 +240,13 @@ Specifies an immediate signed 13-bit offset, in bytes. The default value is 0. Can be used with *global/scratch* opcodes only. GFX9 only. - ============================ ======================================================= - Syntax Description - ============================ ======================================================= - offset:{-4096..4095} Specifies a 13-bit signed offset as an - :ref:`integer number `. 
- ============================ ======================================================= + ===================== ==================================================================== + Syntax Description + ===================== ==================================================================== + offset:{-4096..4095} Specifies a 13-bit signed offset as an + :ref:`integer number ` + or an :ref:`absolute expression`. + ===================== ==================================================================== Examples: @@ -248,6 +254,7 @@ Examples: offset:-4000 offset:0x10 + offset:-x .. _amdgpu_synid_flat_offset12s: @@ -260,12 +267,13 @@ Can be used with *global/scratch* opcodes only. GFX10 only. - ============================ ======================================================= - Syntax Description - ============================ ======================================================= - offset:{-2048..2047} Specifies a 12-bit signed offset as an - :ref:`integer number `. - ============================ ======================================================= + ===================== ==================================================================== + Syntax Description + ===================== ==================================================================== + offset:{-2048..2047} Specifies a 12-bit signed offset as an + :ref:`integer number ` + or an :ref:`absolute expression`. + ===================== ==================================================================== Examples: @@ -273,6 +281,7 @@ Examples: offset:-2000 offset:0x10 + offset:-x+y .. _amdgpu_synid_flat_offset11: @@ -285,19 +294,20 @@ Cannot be used with *global/scratch* opcodes. GFX10 only. 
- ================= ====================================================== + ================= ==================================================================== Syntax Description - ================= ====================================================== + ================= ==================================================================== offset:{0..2047} Specifies an 11-bit unsigned offset as a positive - :ref:`integer number `. - ================= ====================================================== + :ref:`integer number ` + or an :ref:`absolute expression`. + ================= ==================================================================== Examples: .. parsed-literal:: offset:2047 - offset:0xff + offset:x+0xff dlc ~~~ @@ -340,19 +350,18 @@ dmask Specifies which channels (image components) are used by the operation. By default, no channels are used. - =============== ===================================================== + =============== ==================================================================== Syntax Description - =============== ===================================================== + =============== ==================================================================== dmask:{0..15} Specifies image channels as a positive - :ref:`integer number `. + :ref:`integer number ` + or an :ref:`absolute expression`. - Each bit corresponds to one of 4 image - components (RGBA). + Each bit corresponds to one of 4 image components (RGBA). - If the specified bit value - is 0, the component is not used, value 1 means - that the component is used. - =============== ===================================================== + If the specified bit value is 0, the component is not used, + value 1 means that the component is used. + =============== ==================================================================== This modifier has some limitations depending on instruction kind: @@ -373,7 +382,7 @@ Examples: dmask:0xf dmask:0b1111 - dmask:3 + dmask:x|y|z .. 
_amdgpu_synid_unorm: @@ -468,7 +477,7 @@ Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7. Each 16-bit data element occupies 1 VGPR. GFX8.1, GFX9 and GFX10 support data packing. - Each pair of 16-bit data elements + Each pair of 16-bit data elements occupies 1 VGPR. ======================================== ================================================ @@ -684,18 +693,19 @@ offset12 Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0. - =============================== ====================================================== - Syntax Description - =============================== ====================================================== - offset:{0..0xFFF} Specifies a 12-bit unsigned offset as a positive - :ref:`integer number `. - =============================== ====================================================== + ================== ==================================================================== + Syntax Description + ================== ==================================================================== + offset:{0..0xFFF} Specifies a 12-bit unsigned offset as a positive + :ref:`integer number ` + or an :ref:`absolute expression`. + ================== ==================================================================== Examples: .. parsed-literal:: - offset:0 + offset:x+y offset:0x10 glc @@ -782,14 +792,18 @@ GFX10 only. dpp8_sel ~~~~~~~~ -Selects which lane to pull data from, within a group of 8 lanes. This is a mandatory modifier. +Selects which lanes to pull data from, within a group of 8 lanes. This is a mandatory modifier. There is no default value. GFX10 only. -The *dpp8_sel* modifier must specify exactly 8 values, each ranging from 0 to 7. +The *dpp8_sel* modifier must specify exactly 8 values. First value selects which lane to read from to supply data into lane 0. -Second value controls value for lane 1 and so on. +Second value controls lane 1 and so on. 
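The selection rule for *dpp8_sel* can be sketched as follows. This is only a model under stated assumptions: the function name `apply_dpp8_sel` and the list-based representation of lanes are invented for the example, each selector is assumed to be a lane index 0..7 within its group of 8, and inactive lanes and the *fi* modifier are ignored.

```python
def apply_dpp8_sel(lanes, sel):
    """Model dpp8 lane selection on a list of per-lane values.

    Within each group of 8 lanes, lane i receives the value held by
    lane sel[i] of the same group. Hypothetical helper, not an
    assembler or hardware API.
    """
    if len(sel) != 8 or any(not 0 <= s <= 7 for s in sel):
        raise ValueError("dpp8_sel needs exactly 8 values in the range 0..7")
    if len(lanes) % 8 != 0:
        raise ValueError("lane count must be a multiple of 8")
    result = []
    for base in range(0, len(lanes), 8):
        group = lanes[base:base + 8]
        # The first selector supplies lane 0, the second lane 1, and so on.
        result.extend(group[s] for s in sel)
    return result
```

For instance, a selector list of `[7, 6, 5, 4, 3, 2, 1, 0]` reverses the values inside every group of 8 lanes in this model.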
+ +Each value may be specified as either +an :ref:`integer number` or +an :ref:`absolute expression`. =============================================================== =========================== Syntax Description @@ -811,7 +825,7 @@ fi Controls interaction with inactive lanes for *dpp8* instructions. The default value is zero. -Note. *Inactive* lanes are those whose :ref:`exec` mask bit is zero. +Note: *inactive* lanes are those whose :ref:`exec` mask bit is zero. GFX10 only. @@ -822,6 +836,9 @@ GFX10 only. fi:1 Fetch pre-exist values from inactive lanes. ==================================== ===================================================== +Note: numeric values may be specified as either :ref:`integer numbers` or +:ref:`absolute expressions`. + DPP/DPP16 Modifiers ------------------- @@ -837,7 +854,7 @@ There is no default value. GFX8 and GFX9 only. Use :ref:`dpp16_ctrl` for GFX10. -Note. The lanes of a wavefront are organized in four *rows* and four *banks*. +Note: the lanes of a wavefront are organized in four *rows* and four *banks*. ======================================== ================================================ Syntax Description @@ -856,7 +873,7 @@ Note. The lanes of a wavefront are organized in four *rows* and four *banks*. row_ror:{1..15} Row rotate right by 1-15 threads. ======================================== ================================================ -Note: Numeric parameters may be specified as either +Note: numeric values may be specified as either :ref:`integer numbers` or :ref:`absolute expressions`. @@ -877,7 +894,7 @@ There is no default value. GFX10 only. Use :ref:`dpp_ctrl` for GFX8 and GFX9. -Note. The lanes of a wavefront are organized in four *rows* and four *banks*. +Note: the lanes of a wavefront are organized in four *rows* and four *banks*. (There are only two rows in *wave32* mode.) ======================================== ==================================================== @@ -894,7 +911,7 @@ Note. 
The lanes of a wavefront are organized in four *rows* and four *banks*. row_ror:{1..15} Row rotate right by 1-15 threads. ======================================== ==================================================== -Note: Numeric parameters may be specified as either +Note: numeric values may be specified as either :ref:`integer numbers` or :ref:`absolute expressions`. @@ -912,21 +929,21 @@ row_mask Controls which rows are enabled for data sharing. By default, all rows are enabled. -Note. The lanes of a wavefront are organized in four *rows* and four *banks*. +Note: the lanes of a wavefront are organized in four *rows* and four *banks*. (There are only two rows in *wave32* mode.) - ======================================== ===================================================== - Syntax Description - ======================================== ===================================================== - row_mask:{0..15} Specifies a *row mask* as a positive - :ref:`integer number `. + ================= ==================================================================== + Syntax Description + ================= ==================================================================== + row_mask:{0..15} Specifies a *row mask* as a positive + :ref:`integer number ` + or an :ref:`absolute expression`. - Each of 4 bits in the mask controls one - row (0 - disabled, 1 - enabled). + Each of 4 bits in the mask controls one row + (0 - disabled, 1 - enabled). - In *wave32* mode the values should be limited to - {0..7}. - ======================================== ===================================================== + In *wave32* mode the values should be limited to 0..7. + ================= ==================================================================== Examples: @@ -934,7 +951,7 @@ Examples: row_mask:0xf row_mask:0b1010 - row_mask:0b1111 + row_mask:x|y .. _amdgpu_synid_bank_mask: @@ -943,18 +960,19 @@ bank_mask Controls which banks are enabled for data sharing. 
By default, all banks are enabled. -Note. The lanes of a wavefront are organized in four *rows* and four *banks*. +Note: the lanes of a wavefront are organized in four *rows* and four *banks*. (There are only two rows in *wave32* mode.) - ======================================== ======================================================= - Syntax Description - ======================================== ======================================================= - bank_mask:{0..15} Specifies a *bank mask* as a positive - :ref:`integer number `. + ================== ==================================================================== + Syntax Description + ================== ==================================================================== + bank_mask:{0..15} Specifies a *bank mask* as a positive + :ref:`integer number ` + or an :ref:`absolute expression`. - Each of 4 bits in the mask controls one - bank (0 - disabled, 1 - enabled). - ======================================== ======================================================= + Each of 4 bits in the mask controls one bank + (0 - disabled, 1 - enabled). + ================== ==================================================================== Examples: @@ -962,7 +980,7 @@ Examples: bank_mask:0x3 bank_mask:0b0011 - bank_mask:0b1111 + bank_mask:x&y .. _amdgpu_synid_bound_ctrl: @@ -988,7 +1006,7 @@ fi Controls interaction with *inactive* lanes for *dpp16* instructions. The default value is zero. -Note. *Inactive* lanes are those whose :ref:`exec` mask bit is zero. +Note: *inactive* lanes are those whose :ref:`exec` mask bit is zero. GFX10 only. @@ -1001,6 +1019,9 @@ GFX10 only. fi:1 Fetch pre-exist values from inactive lanes. ======================================== ================================================== +Note: numeric values may be specified as either :ref:`integer numbers` or +:ref:`absolute expressions`. + SDWA Modifiers -------------- @@ -1037,7 +1058,6 @@ Selects which bits in the destination are affected. 
By default, all bits are aff dst_sel:WORD_1 Use bits 31:16. ======================================== ================================================ - .. _amdgpu_synid_dst_unused: dst_unused @@ -1151,7 +1171,7 @@ operands (both source and destination). First value controls src0, second value and so on, except that the last value controls destination. The value 0 selects the low bits, while 1 selects the high bits. -Note. op_sel modifier affects 16-bit operands only. For 32-bit operands the value specified +Note: op_sel modifier affects 16-bit operands only. For 32-bit operands the value specified by op_sel must be 0. GFX9 and GFX10 only. @@ -1164,6 +1184,10 @@ GFX9 and GFX10 only. op_sel:[{0..1},{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands. ======================================== ============================================================ +Note: numeric values may be specified as either +:ref:`integer numbers` or +:ref:`absolute expressions`. + Examples: .. parsed-literal:: @@ -1189,7 +1213,7 @@ Integer clamping is not supported by GFX7. For floating point operations, clamp modifier indicates that the result must be clamped to the range [0.0, 1.0]. By default, there is no clamping. -Note. Clamp modifier is applied after :ref:`output modifiers` (if any). +Note: clamp modifier is applied after :ref:`output modifiers` (if any). ======================================== ================================================ Syntax Description @@ -1205,12 +1229,12 @@ omod Specifies if an output modifier must be applied to the result. By default, no output modifiers are applied. -Note. Output modifiers are applied before :ref:`clamping` (if any). +Note: output modifiers are applied before :ref:`clamping` (if any). Output modifiers are valid for f32 and f64 floating point results only. They must not be used with f16. -Note. *v_cvt_f16_f32* is an exception. This instruction produces f16 result +Note: *v_cvt_f16_f32* is an exception. 
This instruction produces f16 result but accepts output modifiers. ======================================== ================================================ @@ -1221,6 +1245,16 @@ but accepts output modifiers. div:2 Multiply the result by 0.5. ======================================== ================================================ +Note: numeric values may be specified as either :ref:`integer numbers` or +:ref:`absolute expressions`. + +Examples: + +.. parsed-literal:: + + mul:2 + mul:x // x must be equal to 2 or 4 + .. _amdgpu_synid_vop3_operand_modifiers: VOP3 Operand Modifiers @@ -1233,15 +1267,19 @@ Operand modifiers are not used separately. They are applied to source operands. abs ~~~ -Computes absolute value of its operand. Applied before :ref:`neg` (if any). -Valid for floating point operands only. +Computes the absolute value of its operand. Must be applied before :ref:`neg` +(if any). Valid for floating point operands only. - ======================================== ================================================ + ======================================== ==================================================== Syntax Description - ======================================== ================================================ - abs(<operand>) Get absolute value of operand. - \|<operand>| The same as above. - ======================================== ================================================ + ======================================== ==================================================== + abs(<operand>) Get the absolute value of a floating-point operand. + \|<operand>| The same as above (an SP3 syntax). + ======================================== ==================================================== + +Note: avoid using SP3 syntax with operands specified as expressions because the trailing '|' +may be misinterpreted. Such operands should be enclosed into additional parentheses as shown +in examples below.
Examples: @@ -1249,28 +1287,50 @@ Examples: abs(v36) \|v36| + abs(x|y) // ok + \|(x|y)| // additional parentheses are required .. _amdgpu_synid_neg: neg ~~~ -Computes negative value of its operand. Applied after :ref:`abs` (if any). -Valid for floating point operands only. +Computes the negative value of its operand. Must be applied after :ref:`abs` +(if any). Valid for floating point operands only. - ======================================== ================================================ - Syntax Description - ======================================== ================================================ - neg(<operand>) Get negative value of operand. - -<operand> The same as above. - ======================================== ================================================ + ================== ==================================================== + Syntax Description + ================== ==================================================== + neg(<operand>) Get the negative value of a floating-point operand. + The operand may include an optional + :ref:`abs` modifier. + -<operand> The same as above (an SP3 syntax). + ================== ==================================================== + +Note: SP3 syntax is supported with limitations because of a potential ambiguity. +Currently it is allowed in the following cases: + +* Before a register. +* Before an :ref:`abs` modifier. +* Before an SP3 :ref:`abs` modifier. + +In all other cases "-" is handled as a part of an expression that follows the sign. Examples: .. parsed-literal:: + // Operands with negate modifiers neg(v[0]) - -v4 + neg(1.0) + neg(abs(v0)) + -v5 + -abs(v5) + -\|v5| + + // Operands without negate modifiers + -1 + -x+y VOP3P Modifiers --------------- @@ -1304,6 +1364,10 @@ The value 0 selects the low bits, while 1 selects the high bits. op_sel:[{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands.
================================= ============================================================= +Note: numeric values may be specified as either +:ref:`integer numbers` or +:ref:`absolute expressions`. + Examples: .. parsed-literal:: @@ -1333,6 +1397,10 @@ The value 0 selects the low bits, while 1 selects the high bits. op_sel_hi:[{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands. =================================== ============================================================= +Note: numeric values may be specified as either +:ref:`integer numbers` or +:ref:`absolute expressions`. + Examples: .. parsed-literal:: @@ -1367,6 +1435,10 @@ This modifier is valid for floating point operands only. neg_lo:[{0..1},{0..1},{0..1}] Select affected operands for instructions with 3 source operands. ================================ ================================================================== +Note: numeric values may be specified as either +:ref:`integer numbers` or +:ref:`absolute expressions`. + Examples: .. parsed-literal:: @@ -1401,6 +1473,10 @@ This modifier is valid for floating point operands only. neg_hi:[{0..1},{0..1},{0..1}] Select affected operands for instructions with 3 source operands. =============================== ================================================================== +Note: numeric values may be specified as either +:ref:`integer numbers` or +:ref:`absolute expressions`. + Examples: .. parsed-literal:: @@ -1419,7 +1495,7 @@ VOP3P V_MAD_MIX Modifiers ------------------------- *v_mad_mix_f32*, *v_mad_mixhi_f16* and *v_mad_mixlo_f16* instructions -use *op_sel* and *op_sel_hi* modifiers +use *op_sel* and *op_sel_hi* modifiers in a manner different from *regular* VOP3P instructions. See a description below. @@ -1449,6 +1525,10 @@ By default, low bits are used for all operands. op_sel:[{0..1},{0..1},{0..1}] Select location of each 16-bit source operand. 
=============================== ================================================ +Note: numeric values may be specified as either +:ref:`integer numbers` or +:ref:`absolute expressions`. + Examples: .. parsed-literal:: @@ -1477,6 +1557,10 @@ The location of 16 bits in the operand may be specified by op_sel_hi:[{0..1},{0..1},{0..1}] Select size of each source operand. ======================================== ==================================== +Note: numeric values may be specified as either +:ref:`integer numbers` or +:ref:`absolute expressions`. + Examples: .. parsed-literal:: diff --git a/llvm/docs/AMDGPUOperandSyntax.rst b/llvm/docs/AMDGPUOperandSyntax.rst index 523c5ac7179c..c20da0047296 100644 --- a/llvm/docs/AMDGPUOperandSyntax.rst +++ b/llvm/docs/AMDGPUOperandSyntax.rst @@ -38,7 +38,8 @@ Assembler currently supports sequences of 1, 2, 3, 4, 8 and 16 *vector* register =================================================== ==================================================================== **v**\ A single 32-bit *vector* register. - *N* must be a decimal integer number. + *N* must be a decimal + :ref:`integer number`. **v[**\ \ **]** A single 32-bit *vector* register. *N* may be specified as an @@ -51,10 +52,11 @@ Assembler currently supports sequences of 1, 2, 3, 4, 8 and 16 *vector* register or :ref:`absolute expressions`. **[v**\ , \ **v**\ , ... **v**\ \ **]** A sequence of (\ *K-N+1*\ ) *vector* registers. - Register indices must be specified as decimal integer numbers. + Register indices must be specified as decimal + :ref:`integer numbers`. =================================================== ==================================================================== -Note. *N* and *K* must satisfy the following conditions: +Note: *N* and *K* must satisfy the following conditions: * *N* <= *K*. * 0 <= *N* <= 255. @@ -77,26 +79,27 @@ Examples: .. 
_amdgpu_synid_nsa: -*Image* instructions may use special *NSA* (Non-Sequential Address) syntax for *image addresses*: +GFX10 *Image* instructions may use special *NSA* (Non-Sequential Address) syntax for *image addresses*: - =================================================== ==================================================================== - Syntax Description - =================================================== ==================================================================== - **[v**\ , \ **v**\ , ... **v**\ \ **]** A sequence of *vector* registers. At least one register - must be specified. + ===================================== ================================================= + Syntax Description + ===================================== ================================================= + **[Vm**, \ **Vn**, ... **Vk**\ **]** A sequence of 32-bit *vector* registers. + Each register may be specified using a syntax + defined :ref:`above`. - In contrast with standard syntax described above, registers in - this sequence are not required to have consecutive indices. - Moreover, the same register may appear in the list more than once. - =================================================== ==================================================================== - -Note. Reqister indices must be in the range 0..255. They must be specified as decimal integer numbers. + In contrast with standard syntax, registers + in *NSA* sequence are not required to have + consecutive indices. Moreover, the same register + may appear in the list more than once. + ===================================== ================================================= Examples: .. parsed-literal:: - [v32,v1,v2] + [v32,v1,v[2]] + [v[32],v[1:1],[v2]] [v4,v4,v4,v4] .. _amdgpu_synid_s: @@ -126,7 +129,9 @@ Sequences of 4 and more *scalar* registers must be quad-aligned. 
======================================================== ==================================================================== **s**\ A single 32-bit *scalar* register. - *N* must be a decimal integer number. + *N* must be a decimal + :ref:`integer number`. + **s[**\ \ **]** A single 32-bit *scalar* register. *N* may be specified as an @@ -137,12 +142,14 @@ Sequences of 4 and more *scalar* registers must be quad-aligned. *N* and *K* may be specified as :ref:`integer numbers` or :ref:`absolute expressions`. + **[s**\ , \ **s**\ , ... **s**\ \ **]** A sequence of (\ *K-N+1*\ ) *scalar* registers. - Register indices must be specified as decimal integer numbers. + Register indices must be specified as decimal + :ref:`integer numbers`. ======================================================== ==================================================================== -Note. *N* and *K* must satisfy the following conditions: +Note: *N* and *K* must satisfy the following conditions: * *N* must be properly aligned based on sequence size. * *N* <= *K*. @@ -210,7 +217,8 @@ Sequences of 4 and more *ttmp* registers must be quad-aligned. ============================================================= ==================================================================== **ttmp**\ A single 32-bit *ttmp* register. - *N* must be a decimal integer number. + *N* must be a decimal + :ref:`integer number`. **ttmp[**\ \ **]** A single 32-bit *ttmp* register. *N* may be specified as an @@ -223,10 +231,11 @@ Sequences of 4 and more *ttmp* registers must be quad-aligned. or :ref:`absolute expressions`. **[ttmp**\ , \ **ttmp**\ , ... **ttmp**\ \ **]** A sequence of (\ *K-N+1*\ ) *ttmp* registers. - Register indices must be specified as decimal integer numbers. + Register indices must be specified as decimal + :ref:`integer numbers`. ============================================================= ==================================================================== -Note. 
*N* and *K* must satisfy the following conditions: +Note: *N* and *K* must satisfy the following conditions: * *N* must be properly aligned based on sequence size. * *N* <= *K*. @@ -266,8 +275,8 @@ Trap base address, 64-bits wide. Holds the pointer to the current trap handler p Syntax Description Availability ================== ======================================================================= ============= tba 64-bit *trap base address* register. GFX7, GFX8 - [tba] 64-bit *trap base address* register (an alternative syntax). GFX7, GFX8 - [tba_lo,tba_hi] 64-bit *trap base address* register (an alternative syntax). GFX7, GFX8 + [tba] 64-bit *trap base address* register (an SP3 syntax). GFX7, GFX8 + [tba_lo,tba_hi] 64-bit *trap base address* register (an SP3 syntax). GFX7, GFX8 ================== ======================================================================= ============= High and low 32 bits of *trap base address* may be accessed as separate registers: @@ -277,8 +286,8 @@ High and low 32 bits of *trap base address* may be accessed as separate register ================== ======================================================================= ============= tba_lo Low 32 bits of *trap base address* register. GFX7, GFX8 tba_hi High 32 bits of *trap base address* register. GFX7, GFX8 - [tba_lo] Low 32 bits of *trap base address* register (an alternative syntax). GFX7, GFX8 - [tba_hi] High 32 bits of *trap base address* register (an alternative syntax). GFX7, GFX8 + [tba_lo] Low 32 bits of *trap base address* register (an SP3 syntax). GFX7, GFX8 + [tba_hi] High 32 bits of *trap base address* register (an SP3 syntax). GFX7, GFX8 ================== ======================================================================= ============= Note that *tba*, *tba_lo* and *tba_hi* are not accessible as assembler registers in GFX9 and GFX10, @@ -295,8 +304,8 @@ Trap memory address, 64-bits wide. 
Syntax Description Availability ================= ======================================================================= ================== tma 64-bit *trap memory address* register. GFX7, GFX8 - [tma] 64-bit *trap memory address* register (an alternative syntax). GFX7, GFX8 - [tma_lo,tma_hi] 64-bit *trap memory address* register (an alternative syntax). GFX7, GFX8 + [tma] 64-bit *trap memory address* register (an SP3 syntax). GFX7, GFX8 + [tma_lo,tma_hi] 64-bit *trap memory address* register (an SP3 syntax). GFX7, GFX8 ================= ======================================================================= ================== High and low 32 bits of *trap memory address* may be accessed as separate registers: @@ -306,8 +315,8 @@ High and low 32 bits of *trap memory address* may be accessed as separate regist ================= ======================================================================= ================== tma_lo Low 32 bits of *trap memory address* register. GFX7, GFX8 tma_hi High 32 bits of *trap memory address* register. GFX7, GFX8 - [tma_lo] Low 32 bits of *trap memory address* register (an alternative syntax). GFX7, GFX8 - [tma_hi] High 32 bits of *trap memory address* register (an alternative syntax). GFX7, GFX8 + [tma_lo] Low 32 bits of *trap memory address* register (an SP3 syntax). GFX7, GFX8 + [tma_hi] High 32 bits of *trap memory address* register (an SP3 syntax). GFX7, GFX8 ================= ======================================================================= ================== Note that *tma*, *tma_lo* and *tma_hi* are not accessible as assembler registers in GFX9 and GFX10, @@ -324,8 +333,8 @@ Flat scratch address, 64-bits wide. Holds the base address of scratch memory. Syntax Description ================================== ================================================================ flat_scratch 64-bit *flat scratch* address register. - [flat_scratch] 64-bit *flat scratch* address register (an alternative syntax). 
- [flat_scratch_lo,flat_scratch_hi] 64-bit *flat scratch* address register (an alternative syntax). + [flat_scratch] 64-bit *flat scratch* address register (an SP3 syntax). + [flat_scratch_lo,flat_scratch_hi] 64-bit *flat scratch* address register (an SP3 syntax). ================================== ================================================================ High and low 32 bits of *flat scratch* address may be accessed as separate registers: @@ -335,8 +344,8 @@ High and low 32 bits of *flat scratch* address may be accessed as separate regis ========================= ========================================================================= flat_scratch_lo Low 32 bits of *flat scratch* address register. flat_scratch_hi High 32 bits of *flat scratch* address register. - [flat_scratch_lo] Low 32 bits of *flat scratch* address register (an alternative syntax). - [flat_scratch_hi] High 32 bits of *flat scratch* address register (an alternative syntax). + [flat_scratch_lo] Low 32 bits of *flat scratch* address register (an SP3 syntax). + [flat_scratch_hi] High 32 bits of *flat scratch* address register (an SP3 syntax). ========================= ========================================================================= .. _amdgpu_synid_xnack: @@ -355,8 +364,8 @@ received an *XNACK* due to a vector memory operation. Syntax Description ============================== ===================================================== xnack_mask 64-bit *xnack mask* register. - [xnack_mask] 64-bit *xnack mask* register (an alternative syntax). - [xnack_mask_lo,xnack_mask_hi] 64-bit *xnack mask* register (an alternative syntax). + [xnack_mask] 64-bit *xnack mask* register (an SP3 syntax). + [xnack_mask_lo,xnack_mask_hi] 64-bit *xnack mask* register (an SP3 syntax). 
============================== ===================================================== High and low 32 bits of *xnack mask* may be accessed as separate registers: @@ -366,8 +375,8 @@ High and low 32 bits of *xnack mask* may be accessed as separate registers: ===================== ============================================================== xnack_mask_lo Low 32 bits of *xnack mask* register. xnack_mask_hi High 32 bits of *xnack mask* register. - [xnack_mask_lo] Low 32 bits of *xnack mask* register (an alternative syntax). - [xnack_mask_hi] High 32 bits of *xnack mask* register (an alternative syntax). + [xnack_mask_lo] Low 32 bits of *xnack mask* register (an SP3 syntax). + [xnack_mask_hi] High 32 bits of *xnack mask* register (an SP3 syntax). ===================== ============================================================== .. _amdgpu_synid_vcc: @@ -385,8 +394,8 @@ Note that GFX10 H/W does not use high 32 bits of *vcc* in *wave32* mode. Syntax Description ================ ========================================================================= vcc 64-bit *vector condition code* register. - [vcc] 64-bit *vector condition code* register (an alternative syntax). - [vcc_lo,vcc_hi] 64-bit *vector condition code* register (an alternative syntax). + [vcc] 64-bit *vector condition code* register (an SP3 syntax). + [vcc_lo,vcc_hi] 64-bit *vector condition code* register (an SP3 syntax). ================ ========================================================================= High and low 32 bits of *vector condition code* may be accessed as separate registers: @@ -396,8 +405,8 @@ High and low 32 bits of *vector condition code* may be accessed as separate regi ================ ========================================================================= vcc_lo Low 32 bits of *vector condition code* register. vcc_hi High 32 bits of *vector condition code* register. - [vcc_lo] Low 32 bits of *vector condition code* register (an alternative syntax). 
- [vcc_hi] High 32 bits of *vector condition code* register (an alternative syntax). + [vcc_lo] Low 32 bits of *vector condition code* register (an SP3 syntax). + [vcc_hi] High 32 bits of *vector condition code* register (an SP3 syntax). ================ ========================================================================= .. _amdgpu_synid_m0: @@ -412,7 +421,7 @@ including register indexing and bounds checking. Syntax Description =========== =================================================== m0 A 32-bit *memory* register. - [m0] A 32-bit *memory* register (an alternative syntax). + [m0] A 32-bit *memory* register (an SP3 syntax). =========== =================================================== .. _amdgpu_synid_exec: @@ -430,8 +439,8 @@ Note that GFX10 H/W does not use high 32 bits of *exec* in *wave32* mode. Syntax Description ===================== ================================================================= exec 64-bit *execute mask* register. - [exec] 64-bit *execute mask* register (an alternative syntax). - [exec_lo,exec_hi] 64-bit *execute mask* register (an alternative syntax). + [exec] 64-bit *execute mask* register (an SP3 syntax). + [exec_lo,exec_hi] 64-bit *execute mask* register (an SP3 syntax). ===================== ================================================================= High and low 32 bits of *execute mask* may be accessed as separate registers: @@ -441,8 +450,8 @@ High and low 32 bits of *execute mask* may be accessed as separate registers: ===================== ================================================================= exec_lo Low 32 bits of *execute mask* register. exec_hi High 32 bits of *execute mask* register. - [exec_lo] Low 32 bits of *execute mask* register (an alternative syntax). - [exec_hi] High 32 bits of *execute mask* register (an alternative syntax). + [exec_lo] Low 32 bits of *execute mask* register (an SP3 syntax). + [exec_hi] High 32 bits of *execute mask* register (an SP3 syntax). 
===================== ================================================================= .. _amdgpu_synid_vccz: @@ -452,7 +461,7 @@ vccz A single bit flag indicating that the :ref:`vcc` is all zeros. -Note. When GFX10 operates in *wave32* mode, this register reflects state of :ref:`vcc_lo`. +Note: when GFX10 operates in *wave32* mode, this register reflects state of :ref:`vcc_lo`. .. _amdgpu_synid_execz: @@ -461,7 +470,7 @@ execz A single bit flag indicating that the :ref:`exec` is all zeros. -Note. When GFX10 operates in *wave32* mode, this register reflects state of :ref:`exec_lo`. +Note: when GFX10 operates in *wave32* mode, this register reflects state of :ref:`exec_lo`. .. _amdgpu_synid_scc: @@ -495,19 +504,20 @@ GFX10 only. .. _amdgpu_synid_constant: -constant --------- +inline constant +--------------- -A set of integer and floating-point *inline* constants and values: +An *inline constant* is an integer or a floating-point value encoded as a part of an instruction. +Compare *inline constants* with :ref:`literals`. + +Inline constants include: * :ref:`iconst` * :ref:`fconst` * :ref:`ival` -In contrast with :ref:`literals`, these operands are encoded as a part of instruction. - If a number may be encoded as either -a :ref:`literal` or +a :ref:`literal` or a :ref:`constant`, assembler selects the latter encoding as more efficient. @@ -516,17 +526,14 @@ assembler selects the latter encoding as more efficient. iconst ~~~~~~ -An :ref:`integer number` +An :ref:`integer number` or +an :ref:`absolute expression` encoded as an *inline constant*. Only a small fraction of integer numbers may be encoded as *inline constants*. They are enumerated in the table below. Other integer numbers have to be encoded as :ref:`literals`. -Integer *inline constants* are converted to -:ref:`expected operand type` -as described :ref:`here`. 
- ================================== ==================================== Value Note ================================== ==================================== @@ -548,10 +555,6 @@ Only a small fraction of floating-point numbers may be encoded as *inline consta They are enumerated in the table below. Other floating-point numbers have to be encoded as :ref:`literals`. -Floating-point *inline constants* are converted to -:ref:`expected operand type` -as described :ref:`here`. - ===================== ===================================================== ================== Value Note Availability ===================== ===================================================== ================== @@ -594,21 +597,18 @@ These operands provide read-only access to H/W registers. literal ------- -A literal is a 64-bit value which is encoded as a separate 32-bit dword in the instruction stream. +A *literal* is a 64-bit value encoded as a separate 32-bit dword in the instruction stream. +Compare *literals* with :ref:`inline constants`. If a number may be encoded as either -a :ref:`literal` or +a :ref:`literal` or an :ref:`inline constant`, assembler selects the latter encoding as more efficient. Literals may be specified as :ref:`integer numbers`, -:ref:`floating-point numbers` or -:ref:`expressions` -(expressions are currently supported for 32-bit operands only). - -A 64-bit literal value is converted by assembler -to an :ref:`expected operand type` -as described :ref:`here`. +:ref:`floating-point numbers`, +:ref:`absolute expressions` or +:ref:`relocatable expressions`. An instruction may use only one literal but several operands may refer the same literal. @@ -617,30 +617,38 @@ An instruction may use only one literal but several operands may refer the same uimm8 ----- -A 8-bit positive :ref:`integer number`. -The value is encoded as part of the opcode so it is free to use. +A 8-bit :ref:`integer number` +or an :ref:`absolute expression`. +The value must be in the range 0..0xFF. .. 
_amdgpu_synid_uimm32: uimm32 ------ -A 32-bit positive :ref:`integer number`. -The value is stored as a separate 32-bit dword in the instruction stream. +A 32-bit :ref:`integer number` +or an :ref:`absolute expression`. +The value must be in the range 0..0xFFFFFFFF. .. _amdgpu_synid_uimm20: uimm20 ------ -A 20-bit positive :ref:`integer number`. +A 20-bit :ref:`integer number` +or an :ref:`absolute expression`. + +The value must be in the range 0..0xFFFFF. .. _amdgpu_synid_uimm21: uimm21 ------ -A 21-bit positive :ref:`integer number`. +A 21-bit :ref:`integer number` +or an :ref:`absolute expression`. + +The value must be in the range 0..0x1FFFFF. .. WARNING:: Assembler currently supports 20-bit offsets only. Use :ref:`uimm20` as a replacement. @@ -649,7 +657,10 @@ A 21-bit positive :ref:`integer number`. simm21 ------ -A 21-bit :ref:`integer number`. +A 21-bit :ref:`integer number` +or an :ref:`absolute expression`. + +The value must be in the range -0x100000..0x0FFFFF. .. WARNING:: Assembler currently supports 20-bit unsigned offsets only. Use :ref:`uimm20` as a replacement. @@ -678,27 +689,20 @@ Integer Numbers --------------- Integer numbers are 64 bits wide. -They may be specified in binary, octal, hexadecimal and decimal formats: +They are converted to :ref:`expected operand type` +as described :ref:`here`. - ============== ==================================== - Format Syntax - ============== ==================================== - Decimal [-]?[1-9][0-9]* - Binary [-]?0b[01]+ - Octal [-]?0[0-7]+ - Hexadecimal [-]?0x[0-9a-fA-F]+ - \ [-]?[0x]?[0-9][0-9a-fA-F]*[hH] - ============== ==================================== +Integer numbers may be specified in binary, octal, hexadecimal and decimal formats: -Examples: - -.. 
parsed-literal:: - - -1234 - 0b1010 - 010 - 0xff - 0ffh + ============ =============================== ======== + Format Syntax Example + ============ =============================== ======== + Decimal [-]?[1-9][0-9]* -1234 + Binary [-]?0b[01]+ 0b1010 + Octal [-]?0[0-7]+ 010 + Hexadecimal [-]?0x[0-9a-fA-F]+ 0xff + \ [-]?[0x]?[0-9][0-9a-fA-F]*[hH] 0ffh + ============ =============================== ======== .. _amdgpu_synid_floating-point_number: @@ -706,31 +710,29 @@ Floating-Point Numbers ---------------------- All floating-point numbers are handled as double (64 bits wide). +They are converted to +:ref:`expected operand type` +as described :ref:`here`. Floating-point numbers may be specified in hexadecimal and decimal formats: - ============== ======================================================== ======================================================== - Format Syntax Note - ============== ======================================================== ======================================================== - Decimal [-]?[0-9]*[.][0-9]*([eE][+-]?[0-9]*)? Must include either a decimal separator or an exponent. - Hexadecimal [-]0x[0-9a-fA-F]*(.[0-9a-fA-F]*)?[pP][+-]?[0-9a-fA-F]+ - ============== ======================================================== ======================================================== - -Examples: - -.. parsed-literal:: - - -1.234 - 234e2 - -0x1afp-10 - 0x.1afp10 + ============ ======================================================== ====================== ==================== + Format Syntax Examples Note + ============ ======================================================== ====================== ==================== + Decimal [-]?[0-9]*[.][0-9]*([eE][+-]?[0-9]*)? -1.234, 234e2 Must include either + a decimal separator + or an exponent. + Hexadecimal [-]0x[0-9a-fA-F]*(.[0-9a-fA-F]*)?[pP][+-]?[0-9a-fA-F]+ -0x1afp-10, 0x.1afp10 + ============ ======================================================== ====================== ==================== .. 
_amdgpu_synid_expression: Expressions =========== -An expression specifies an address or a numeric value. +An expression is evaluated to a 64-bit integer. +Note that floating-point expressions are not supported. + There are two kinds of expressions: * :ref:`Absolute`. @@ -741,10 +743,14 @@ There are two kinds of expressions: Absolute Expressions -------------------- -The value of an absolute expression remains the same after program relocation. +The value of an absolute expression does not change after program relocation. Absolute expressions must not include unassigned and relocatable values such as labels. +Absolute expressions are evaluated to 64-bit integer values and converted to +:ref:`expected operand type` +as described :ref:`here`. + Examples: .. parsed-literal:: @@ -760,46 +766,39 @@ Relocatable Expressions The value of a relocatable expression depends on program relocation. Note that use of relocatable expressions is limited with branch targets -and 32-bit :ref:`literals`. +and 32-bit integer operands. -Addition information about relocation may be found :ref:`here`. - -Examples: +A relocatable expression is evaluated to a 64-bit integer value +which depends on operand kind and :ref:`relocation type` +of symbol(s) used in the expression. For example, if an instruction refers a label, +this reference is evaluated to an offset from the address after the instruction +to the label address: .. parsed-literal:: - y = x + 10 // x is not yet defined. Undefined symbols are assumed to be PC-relative. - z = . + label: + v_add_co_u32_e32 v0, vcc, label, v1 // 'label' operand is evaluated to -4 -Expression Data Type --------------------- +Note that values of relocatable expressions are usually unknown at assembly time; +they are resolved later by a linker and converted to +:ref:`expected operand type` +as described :ref:`here`. -Expressions and operands of expressions are interpreted as 64-bit integers. 
+Operands and Operations +----------------------- -Expressions may include 64-bit :ref:`floating-point numbers` (double). -However these operands are also handled as 64-bit integers -using binary representation of specified floating-point numbers. -No conversion from floating-point to integer is performed. - -Examples: - -.. parsed-literal:: - - x = 0.1 // x is assigned an integer 4591870180066957722 which is a binary representation of 0.1. - y = x + x // y is a sum of two integer values; it is not equal to 0.2! - -Syntax ------- - -Expressions are composed of -:ref:`symbols`, -:ref:`integer numbers`, -:ref:`floating-point numbers`, -:ref:`binary operators`, -:ref:`unary operators` and subexpressions. +Expressions are composed of 64-bit integer operands and operations. +Operands include :ref:`integer numbers` +and :ref:`symbols`. Expressions may also use "." which is a reference to the current PC (program counter). +:ref:`Unary` and :ref:`binary` +operations produce 64-bit integer results. + +Syntax of Expressions +--------------------- + The syntax of expressions is shown below:: expr ::= expr binop expr | primaryexpr ; @@ -887,7 +886,7 @@ They operate on and produce 64-bit integers. Symbols ------- -A symbol is a named 64-bit value, representing a relocatable +A symbol is a named 64-bit integer value, representing a relocatable address or an absolute (non-relocatable) number. Symbol names have the following syntax: @@ -907,128 +906,78 @@ The table below provides several examples of syntax used for symbol definition. A symbol may be used before it is declared or assigned; unassigned symbols are assumed to be PC-relative. -Addition information about symbols may be found :ref:`here`. +Additional information about symbols may be found :ref:`here`. .. 
_amdgpu_synid_conv: -Conversions -=========== +Type and Size Conversion +======================== This section describes what happens when a 64-bit :ref:`integer number`, a -:ref:`floating-point numbers` or a -:ref:`symbol` +:ref:`floating-point number` or an +:ref:`expression` is used for an operand which has a different type or size. -Depending on operand kind, this conversion is performed by either assembler or AMDGPU H/W: +.. _amdgpu_synid_int_conv: -* Values encoded as :ref:`inline constants` are handled by H/W. -* Values encoded as :ref:`literals` are converted by assembler. +Conversion of Integer Values +---------------------------- -.. _amdgpu_synid_const_conv: +Instruction operands may be specified as 64-bit :ref:`integer numbers` or +:ref:`absolute expressions`. These values are converted to +the :ref:`expected operand type` using the following steps: -Inline Constants ----------------- +1. *Validation*. Assembler checks if the input value may be truncated without loss to the required *truncation width* +(see the table below). There are two cases when this operation is enabled: -.. _amdgpu_synid_int_const_conv: + * The truncated bits are all 0. + * The truncated bits are all 1 and the value after truncation has its MSB bit set. -Integer Inline Constants -~~~~~~~~~~~~~~~~~~~~~~~~ +In all other cases assembler triggers an error. -Integer :ref:`inline constants` -may be thought of as 64-bit -:ref:`integer numbers`; -when used as operands they are truncated to the size of -:ref:`expected operand type`. -No data type conversions are performed. +2. *Conversion*. The input value is converted to the expected type as described in the table below. +Depending on operand kind, this conversion is performed by either assembler or AMDGPU H/W (or both). 
-Examples:
+    ============== ================= =============== ====================================================================
+    Expected type  Truncation Width  Conversion      Description
+    ============== ================= =============== ====================================================================
+    i16, u16, b16  16                num.u16         Truncate to 16 bits.
+    i32, u32, b32  32                num.u32         Truncate to 32 bits.
+    i64            32                {-1,num.i32}    Truncate to 32 bits and then sign-extend the result to 64 bits.
+    u64, b64       32                {0,num.u32}     Truncate to 32 bits and then zero-extend the result to 64 bits.
+    f16            16                num.u16         Use low 16 bits as an f16 value.
+    f32            32                num.u32         Use low 32 bits as an f32 value.
+    f64            32                {num.u32,0}     Use low 32 bits of the number as high 32 bits
+                                                     of the result; low 32 bits of the result are zeroed.
+    ============== ================= =============== ====================================================================
+
+Examples of enabled conversions:

 .. parsed-literal::

     // GFX9

-    v_add_u16 v0, -1, 0                   // v0 = 0xFFFF
-    v_add_f16 v0, -1, 0                   // v0 = 0xFFFF (NaN)
+    v_add_u16 v0, -1, 0                   // src0 = 0xFFFF
+    v_add_f16 v0, -1, 0                   // src0 = 0xFFFF (NaN)
+                                          //
+    v_add_u32 v0, -1, 0                   // src0 = 0xFFFFFFFF
+    v_add_f32 v0, -1, 0                   // src0 = 0xFFFFFFFF (NaN)
+                                          //
+    v_add_u16 v0, 0xff00, v0              // src0 = 0xff00
+    v_add_u16 v0, 0xffffffffffffff00, v0  // src0 = 0xff00
+    v_add_u16 v0, -256, v0                // src0 = 0xff00
+                                          //
+    s_bfe_i64 s[0:1], 0xffefffff, s3      // src0 = 0xffffffffffefffff
+    s_bfe_u64 s[0:1], 0xffefffff, s3      // src0 = 0x00000000ffefffff
+    v_ceil_f64_e32 v[0:1], 0xffefffff     // src0 = 0xffefffff00000000 (-1.7976922776554302e308)
+                                          //
+    x = 0xffefffff                        //
+    s_bfe_i64 s[0:1], x, s3               // src0 = 0xffffffffffefffff
+    s_bfe_u64 s[0:1], x, s3               // src0 = 0x00000000ffefffff
+    v_ceil_f64_e32 v[0:1], x              // src0 = 0xffefffff00000000 (-1.7976922776554302e308)

-    v_add_u32 v0, -1, 0                   // v0 = 0xFFFFFFFF
-    v_add_f32 v0, -1, 0                   // v0 = 0xFFFFFFFF (NaN)
-
-.. _amdgpu_synid_fp_const_conv:
-
-Floating-Point Inline Constants
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Floating-point :ref:`inline constants`
-may be thought of as 64-bit
-:ref:`floating-point numbers`;
-when used as operands they are converted to a floating-point number of
-:ref:`expected operand size`.
-
-Examples:
-
-.. parsed-literal::
-
-    // GFX9
-
-    v_add_f16 v0, 1.0, 0                  // v0 = 0x3C00 (1.0)
-    v_add_u16 v0, 1.0, 0                  // v0 = 0x3C00
-
-    v_add_f32 v0, 1.0, 0                  // v0 = 0x3F800000 (1.0)
-    v_add_u32 v0, 1.0, 0                  // v0 = 0x3F800000
-
-
-.. _amdgpu_synid_lit_conv:
-
-Literals
---------
-
-.. _amdgpu_synid_int_lit_conv:
-
-Integer Literals
-~~~~~~~~~~~~~~~~
-
-Integer :ref:`literals`
-are specified as 64-bit :ref:`integer numbers`.
-
-When used as operands they are converted to
-:ref:`expected operand type` as described below.
-
-    ============== ============== =============== ====================================================================
-    Expected type  Condition      Result          Note
-    ============== ============== =============== ====================================================================
-    i16, u16, b16  cond(num,16)   num.u16         Truncate to 16 bits.
-    i32, u32, b32  cond(num,32)   num.u32         Truncate to 32 bits.
-    i64            cond(num,32)   {-1,num.i32}    Truncate to 32 bits and then sign-extend the result to 64 bits.
-    u64, b64       cond(num,32)   { 0,num.u32}    Truncate to 32 bits and then zero-extend the result to 64 bits.
-    f16            cond(num,16)   num.u16         Use low 16 bits as an f16 value.
-    f32            cond(num,32)   num.u32         Use low 32 bits as an f32 value.
-    f64            cond(num,32)   {num.u32,0}     Use low 32 bits of the number as high 32 bits
-                                                  of the result; low 32 bits of the result are zeroed.
-    ============== ============== =============== ====================================================================
-
-The condition *cond(X,S)* indicates if a 64-bit number *X*
-can be converted to a smaller size *S* by truncation of upper bits.
-There are two cases when the conversion is possible:
-
-* The truncated bits are all 0.
-* The truncated bits are all 1 and the value after truncation has its MSB bit set.
-
-Examples of valid literals:
-
-.. parsed-literal::
-
-    // GFX9
-                                          // Literal value after conversion:
-    v_add_u16 v0, 0xff00, v0              // 0xff00
-    v_add_u16 v0, 0xffffffffffffff00, v0  // 0xff00
-    v_add_u16 v0, -256, v0                // 0xff00
-                                          // Literal value after conversion:
-    s_bfe_i64 s[0:1], 0xffefffff, s3      // 0xffffffffffefffff
-    s_bfe_u64 s[0:1], 0xffefffff, s3      // 0x00000000ffefffff
-    v_ceil_f64_e32 v[0:1], 0xffefffff     // 0xffefffff00000000 (-1.7976922776554302e308)
-
-Examples of invalid literals:
+Examples of disabled conversions:

 .. parsed-literal::

@@ -1037,49 +986,57 @@ Examples of invalid literals:
     v_add_u16 v0, 0x1ff00, v0               // truncated bits are not all 0 or 1
     v_add_u16 v0, 0xffffffffffff00ff, v0    // truncated bits do not match MSB of the result

-.. _amdgpu_synid_fp_lit_conv:
+.. _amdgpu_synid_fp_conv:

-Floating-Point Literals
-~~~~~~~~~~~~~~~~~~~~~~~
+Conversion of Floating-Point Values
+-----------------------------------

-Floating-point :ref:`literals` are specified as 64-bit
-:ref:`floating-point numbers`.
+Instruction operands may be specified as 64-bit :ref:`floating-point numbers`.
+These values are converted to the :ref:`expected operand type` using the following steps:

-When used as operands they are converted to
-:ref:`expected operand type` as described below.
+1. *Validation*. Assembler checks if the input f64 number can be converted
+to the *required floating-point type* (see the table below) without overflow or underflow.
+Precision loss is allowed. If this conversion is not possible, assembler triggers an error.

-    ============== ============== ================= =================================================================
-    Expected type  Condition      Result            Note
-    ============== ============== ================= =================================================================
-    i16, u16, b16  cond(num,16)   f16(num)          Convert to f16 and use bits of the result as an integer value.
-    i32, u32, b32  cond(num,32)   f32(num)          Convert to f32 and use bits of the result as an integer value.
-    i64, u64, b64  false          \-                Conversion disabled because of an unclear semantics.
-    f16            cond(num,16)   f16(num)          Convert to f16.
-    f32            cond(num,32)   f32(num)          Convert to f32.
-    f64            true           {num.u32.hi,0}    Use high 32 bits of the number as high 32 bits of the result;
-                                                    zero-fill low 32 bits of the result.
+2. *Conversion*. The input value is converted to the expected type as described in the table below.
+Depending on operand kind, this is performed by either assembler or AMDGPU H/W (or both).

-                                                    Note that the result may differ from the original number.
-    ============== ============== ================= =================================================================
+    ============== ================ ================= =================================================================
+    Expected type  Required FP Type Conversion        Description
+    ============== ================ ================= =================================================================
+    i16, u16, b16  f16              f16(num)          Convert to f16 and use bits of the result as an integer value.
+    i32, u32, b32  f32              f32(num)          Convert to f32 and use bits of the result as an integer value.
+    i64, u64, b64  \-               \-                Conversion disabled.
+    f16            f16              f16(num)          Convert to f16.
+    f32            f32              f32(num)          Convert to f32.
+    f64            f64              {num.u32.hi,0}    Use high 32 bits of the number as high 32 bits of the result;
+                                                      zero-fill low 32 bits of the result.

-The condition *cond(X,S)* indicates if an f64 number *X* can be converted
-to a smaller *S*-bit floating-point type without overflow or underflow.
-Precision lost is allowed.
+                                                      Note that the result may differ from the original number.
+    ============== ================ ================= =================================================================

-Examples of valid literals:
+Examples of enabled conversions:

 .. parsed-literal::

     // GFX9

-    v_add_f16 v1, 65500.0, v2
-    v_add_f32 v1, 65600.0, v2
+    v_add_f16 v0, 1.0, 0                  // src0 = 0x3C00 (1.0)
+    v_add_u16 v0, 1.0, 0                  // src0 = 0x3C00
+                                          //
+    v_add_f32 v0, 1.0, 0                  // src0 = 0x3F800000 (1.0)
+    v_add_u32 v0, 1.0, 0                  // src0 = 0x3F800000

-    // Literal value before conversion: 1.7976931348623157e308 (0x7fefffffffffffff)
-    // Literal value after conversion: 1.7976922776554302e308 (0x7fefffff00000000)
+    // src0 before conversion:
+    //   1.7976931348623157e308 = 0x7fefffffffffffff
+    // src0 after conversion:
+    //   1.7976922776554302e308 = 0x7fefffff00000000
     v_ceil_f64 v[0:1], 1.7976931348623157e308

-Examples of invalid literals:
+    v_add_f16 v1, 65500.0, v2             // ok for f16.
+    v_add_f32 v1, 65600.0, v2             // ok for f32, but would result in overflow for f16.
+
+Examples of disabled conversions:

 .. parsed-literal::

@@ -1087,25 +1044,35 @@ Examples of invalid literals:

     v_add_f16 v1, 65600.0, v2    // overflow

-.. _amdgpu_synid_exp_conv:
+.. _amdgpu_synid_rl_conv:

-Expressions
-~~~~~~~~~~~
+Conversion of Relocatable Values
+--------------------------------

-Expressions operate with and result in 64-bit integers.
+:ref:`Relocatable expressions`
+may be used with 32-bit integer operands and jump targets.

-When used as operands they are truncated to
-:ref:`expected operand size`.
-No data type conversions are performed.
+When the value of a relocatable expression is resolved by a linker, it is
+converted as needed and truncated to the operand size. The conversion depends
+on :ref:`relocation type` and operand kind.

-Examples:
+For example, when a 32-bit operand of an instruction refers to a relocatable expression *expr*,
+this reference is evaluated to a 64-bit offset from the address after the
+instruction to the address being referenced, *counted in bytes*.
+Then the value is truncated to 32 bits and encoded as a literal:

 .. parsed-literal::

-    // GFX9
+    expr = .
+    v_add_co_u32_e32 v0, vcc, expr, v1  // 'expr' operand is evaluated to -4
+                                        // and then truncated to 0xFFFFFFFC

-    x = 0.1
-    v_sqrt_f32 v0, x                    // v0 = [low 32 bits of 0.1 (double)]
-    v_sqrt_f32 v0, (0.1 + 0)            // the same as above
-    v_sqrt_f32 v0, 0.1                  // v0 = [0.1 (double) converted to float]
+As another example, when a branch instruction refers to a label,
+this reference is evaluated to an offset from the address after the
+instruction to the label address, *counted in dwords*.
+Then the value is truncated to 16 bits:

+.. parsed-literal::
+
+    label:
+    s_branch label  // 'label' operand is evaluated to -1 and truncated to 0xFFFF