//===-- AMDGPUBaseInfo.cpp - AMDGPU Base encoding information--------------===//
//
//                     The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
#include "AMDGPUBaseInfo.h"
#include "AMDGPU.h"
[AMDGPU] Assembler: better support for immediate literals in assembler.
Summary:
Previously the assembler parsed all literals as either 32-bit integers or 32-bit floating-point values. Because of this we couldn't support f64 literals.
E.g. in the instruction "v_fract_f64 v[0:1], 0.5", the literal 0.5 was encoded as the 32-bit literal 0x3f000000, which is incorrect and will be interpreted as 3.0517578125E-5 instead of 0.5. The correct encoding is inline constant 240 (optimal) or at least the 32-bit literal 0x3FE00000.
This change alters the way immediate literals are parsed. All literals are always parsed as 64-bit values, either integer or floating-point. We then convert the parsed literal to the correct form based on the type of the parsed operand (was the literal floating-point or binary) and the type of the expected instruction operand (is this an f32/64 or b32/64 instruction).
Here are the rules for converting literals:
- We parsed an fp literal:
  - Instruction expects a 64-bit operand:
    - If the parsed literal is inlinable (e.g. v_fract_f64_e32 v[0:1], 0.5)
      - then we do nothing with this literal
    - Else if the literal is not inlinable but the instruction requires it to be inlined (e.g. this is e64 encoding, v_fract_f64_e64 v[0:1], 1.5)
      - report error
    - Else the literal is not inlinable but we can encode it as an additional 32-bit literal constant
      - If the instruction expects an fp operand type (f64)
        - Check if the low 32 bits of the literal are zeroes (e.g. v_fract_f64 v[0:1], 1.5)
          - If so then do nothing
        - Else (e.g. v_fract_f64 v[0:1], 3.1415)
          - report a warning that the low 32 bits will be set to zeroes and precision will be lost
          - set the low 32 bits of the literal to zeroes
      - Instruction expects an integer operand type (e.g. s_mov_b64_e32 s[0:1], 1.5)
        - report error as it is unclear how to encode this literal
  - Instruction expects a 32-bit operand:
    - Convert the parsed 64-bit fp literal to 32-bit fp. Allow loss of precision but not overflow or underflow
    - Is this literal inlinable and are we required to inline the literal (e.g. v_trunc_f32_e64 v0, 0.5)
      - do nothing
      - Else report error
    - Do nothing. We can encode any other 32-bit fp literal (e.g. v_trunc_f32 v0, 10000000.0)
- Parsed a binary literal:
  - Is this literal inlinable (e.g. v_trunc_f32_e32 v0, 35)
    - do nothing
  - Else, are we required to inline this literal (e.g. v_trunc_f32_e64 v0, 35)
    - report error
  - Else, the literal is not inlinable and we are not required to inline it
    - Are the high 32 bits of the literal zeroes or the same as the sign bit (of the 32-bit value)
      - do nothing (e.g. v_trunc_f32 v0, 0xdeadbeef)
    - Else
      - report error (e.g. v_trunc_f32 v0, 0x123456789abcdef0)
This change requires that we know the operand types of an instruction (are they f32/64 or b32/64). I added several new register operands (they extend the previous register operands) and set their operand types accordingly:
'''
enum OperandType {
  OPERAND_REG_IMM32_INT,
  OPERAND_REG_IMM32_FP,
  OPERAND_REG_INLINE_C_INT,
  OPERAND_REG_INLINE_C_FP,
}
'''
This is not working yet:
- Several tests are failing
- Problems with predicate methods for inline immediates
- LLVM-generated assembler parts try to select e64 encoding before e32.
More changes are required for several AsmOperands.
Reviewers: vpykhtin, tstellarAMD
Subscribers: arsenm, kzhuravl, artem.tamazov
Differential Revision: https://reviews.llvm.org/D22922
llvm-svn: 281050
2016-09-09 22:44:04 +08:00
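The f64 rule in the commit message above (an fp literal for a 64-bit fp operand that cannot be inlined keeps only its high 32 bits) can be illustrated with a small standalone sketch. This is not the assembler's code: `EncodedF64Literal`, `encodeF64AsLiteral32`, and `checkLiteralExamples` are hypothetical names introduced here, and the sketch assumes an IEEE-754 64-bit `double`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical result type for illustration; not part of the LLVM sources.
struct EncodedF64Literal {
  uint32_t High32;    // bits that would go into the 32-bit literal slot
  bool LostPrecision; // true when nonzero low 32 bits had to be dropped
};

// Splits a double into its high and low 32-bit halves and reports whether
// dropping the low half changes the value.
inline EncodedF64Literal encodeF64AsLiteral32(double Value) {
  static_assert(sizeof(double) == sizeof(uint64_t), "expects a 64-bit double");
  uint64_t Bits;
  std::memcpy(&Bits, &Value, sizeof(Bits));
  uint32_t Low = static_cast<uint32_t>(Bits & 0xffffffffu);
  uint32_t High = static_cast<uint32_t>(Bits >> 32);
  return {High, Low != 0};
}

// The literals quoted in the commit message.
inline void checkLiteralExamples() {
  assert(encodeF64AsLiteral32(0.5).High32 == 0x3FE00000u);
  assert(!encodeF64AsLiteral32(1.5).LostPrecision);   // low 32 bits are zero
  assert(encodeF64AsLiteral32(3.1415).LostPrecision); // would trigger the warning
}
```

In the scheme described above, the `LostPrecision` case corresponds to the warning path, and the error paths (forced inlining, integer operand) are decided before this step.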
#include "SIDefines.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/GlobalValue.h"
#include "llvm/MC/MCContext.h"
#include "llvm/MC/MCInstrInfo.h"
#include "llvm/MC/MCRegisterInfo.h"
#include "llvm/MC/MCSectionELF.h"
#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/MC/SubtargetFeature.h"

#define GET_SUBTARGETINFO_ENUM
#include "AMDGPUGenSubtargetInfo.inc"
#undef GET_SUBTARGETINFO_ENUM

#define GET_REGINFO_ENUM
#include "AMDGPUGenRegisterInfo.inc"
#undef GET_REGINFO_ENUM

#define GET_INSTRINFO_NAMED_OPS
#define GET_INSTRINFO_ENUM
#include "AMDGPUGenInstrInfo.inc"
#undef GET_INSTRINFO_NAMED_OPS
#undef GET_INSTRINFO_ENUM

namespace {

/// \returns Bit mask for given bit \p Shift and bit \p Width.
unsigned getBitMask(unsigned Shift, unsigned Width) {
  return ((1 << Width) - 1) << Shift;
}

/// \brief Packs \p Src into \p Dst for given bit \p Shift and bit \p Width.
///
/// \returns Packed \p Dst.
unsigned packBits(unsigned Src, unsigned Dst, unsigned Shift, unsigned Width) {
  // Clear the destination field, then insert the (masked) source bits.
  Dst &= ~getBitMask(Shift, Width);
  Dst |= (Src << Shift) & getBitMask(Shift, Width);
  return Dst;
}

/// \brief Unpacks bits from \p Src for given bit \p Shift and bit \p Width.
///
/// \returns Unpacked bits.
unsigned unpackBits(unsigned Src, unsigned Shift, unsigned Width) {
  return (Src & getBitMask(Shift, Width)) >> Shift;
}

/// \returns Vmcnt bit shift.
unsigned getVmcntBitShift() { return 0; }

/// \returns Vmcnt bit width.
unsigned getVmcntBitWidth() { return 4; }

/// \returns Expcnt bit shift.
unsigned getExpcntBitShift() { return 4; }

/// \returns Expcnt bit width.
unsigned getExpcntBitWidth() { return 3; }

/// \returns Lgkmcnt bit shift.
unsigned getLgkmcntBitShift() { return 8; }

/// \returns Lgkmcnt bit width.
unsigned getLgkmcntBitWidth() { return 4; }

} // anonymous namespace

namespace llvm {
namespace AMDGPU {

IsaVersion getIsaVersion(const FeatureBitset &Features) {

  if (Features.test(FeatureISAVersion7_0_0))
    return {7, 0, 0};

  if (Features.test(FeatureISAVersion7_0_1))
    return {7, 0, 1};

  if (Features.test(FeatureISAVersion7_0_2))
    return {7, 0, 2};

  if (Features.test(FeatureISAVersion8_0_0))
    return {8, 0, 0};

  if (Features.test(FeatureISAVersion8_0_1))
    return {8, 0, 1};

  if (Features.test(FeatureISAVersion8_0_2))
    return {8, 0, 2};

  if (Features.test(FeatureISAVersion8_0_3))
    return {8, 0, 3};

  if (Features.test(FeatureISAVersion8_0_4))
    return {8, 0, 4};

  if (Features.test(FeatureISAVersion8_1_0))
    return {8, 1, 0};

  return {0, 0, 0};
}

void initDefaultAMDKernelCodeT(amd_kernel_code_t &Header,
                               const FeatureBitset &Features) {

  IsaVersion ISA = getIsaVersion(Features);

  memset(&Header, 0, sizeof(Header));

  Header.amd_kernel_code_version_major = 1;
  Header.amd_kernel_code_version_minor = 0;
  Header.amd_machine_kind = 1; // AMD_MACHINE_KIND_AMDGPU
  Header.amd_machine_version_major = ISA.Major;
  Header.amd_machine_version_minor = ISA.Minor;
  Header.amd_machine_version_stepping = ISA.Stepping;
  Header.kernel_code_entry_byte_offset = sizeof(Header);
  // wavefront_size is specified as a power of 2: 2^6 = 64 threads.
  Header.wavefront_size = 6;
  // These alignment values are specified in powers of two, so alignment =
  // 2^n. The minimum alignment is 2^4 = 16.
  Header.kernarg_segment_alignment = 4;
  Header.group_segment_alignment = 4;
  Header.private_segment_alignment = 4;
}
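
The header fields set above store sizes as exponents, so `wavefront_size = 6` stands for 64 threads and an alignment of `4` stands for 16. A minimal standalone sketch of that convention follows; `decodePowerOfTwo` and `encodePowerOfTwo` are hypothetical helper names used only for illustration and are not part of this file.

```cpp
#include <cassert>
#include <cstdint>

// Decodes a field stored as log2(n) back to n, e.g. 6 -> 64.
inline uint64_t decodePowerOfTwo(unsigned Log2Value) {
  assert(Log2Value < 64 && "shift would overflow a 64-bit value");
  return uint64_t{1} << Log2Value;
}

// Encodes a power-of-two value n as log2(n), e.g. 64 -> 6.
inline unsigned encodePowerOfTwo(uint64_t Value) {
  assert(Value != 0 && (Value & (Value - 1)) == 0 && "not a power of two");
  unsigned Log2 = 0;
  while (Value > 1) {
    Value >>= 1;
    ++Log2;
  }
  return Log2;
}
```

With these helpers, `decodePowerOfTwo(6)` gives the 64-thread wavefront and `encodePowerOfTwo(16)` gives the value 4 used for the three segment alignments.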

MCSection *getHSATextSection(MCContext &Ctx) {
  return Ctx.getELFSection(".hsatext", ELF::SHT_PROGBITS,
                           ELF::SHF_ALLOC | ELF::SHF_WRITE |
                           ELF::SHF_EXECINSTR |
                           ELF::SHF_AMDGPU_HSA_AGENT |
                           ELF::SHF_AMDGPU_HSA_CODE);
}

MCSection *getHSADataGlobalAgentSection(MCContext &Ctx) {
  return Ctx.getELFSection(".hsadata_global_agent", ELF::SHT_PROGBITS,
                           ELF::SHF_ALLOC | ELF::SHF_WRITE |
                           ELF::SHF_AMDGPU_HSA_GLOBAL |
                           ELF::SHF_AMDGPU_HSA_AGENT);
}

MCSection *getHSADataGlobalProgramSection(MCContext &Ctx) {
  return Ctx.getELFSection(".hsadata_global_program", ELF::SHT_PROGBITS,
                           ELF::SHF_ALLOC | ELF::SHF_WRITE |
                           ELF::SHF_AMDGPU_HSA_GLOBAL);
}

MCSection *getHSARodataReadonlyAgentSection(MCContext &Ctx) {
  return Ctx.getELFSection(".hsarodata_readonly_agent", ELF::SHT_PROGBITS,
                           ELF::SHF_ALLOC | ELF::SHF_AMDGPU_HSA_READONLY |
                           ELF::SHF_AMDGPU_HSA_AGENT);
}

bool isGroupSegment(const GlobalValue *GV) {
  return GV->getType()->getAddressSpace() == AMDGPUAS::LOCAL_ADDRESS;
}

bool isGlobalSegment(const GlobalValue *GV) {
  return GV->getType()->getAddressSpace() == AMDGPUAS::GLOBAL_ADDRESS;
}

bool isReadOnlySegment(const GlobalValue *GV) {
  return GV->getType()->getAddressSpace() == AMDGPUAS::CONSTANT_ADDRESS;
}

bool shouldEmitConstantsToTextSection(const Triple &TT) {
  return TT.getOS() != Triple::AMDHSA;
}

int getIntegerAttribute(const Function &F, StringRef Name, int Default) {
  Attribute A = F.getFnAttribute(Name);
  int Result = Default;

  if (A.isStringAttribute()) {
    StringRef Str = A.getValueAsString();
    if (Str.getAsInteger(0, Result)) {
      LLVMContext &Ctx = F.getContext();
      Ctx.emitError("can't parse integer attribute " + Name);
    }
  }

  return Result;
}

std::pair<int, int> getIntegerPairAttribute(const Function &F,
                                            StringRef Name,
                                            std::pair<int, int> Default,
                                            bool OnlyFirstRequired) {
  Attribute A = F.getFnAttribute(Name);
  if (!A.isStringAttribute())
    return Default;

  LLVMContext &Ctx = F.getContext();
  std::pair<int, int> Ints = Default;
  std::pair<StringRef, StringRef> Strs = A.getValueAsString().split(',');
  if (Strs.first.trim().getAsInteger(0, Ints.first)) {
    Ctx.emitError("can't parse first integer attribute " + Name);
    return Default;
  }
  if (Strs.second.trim().getAsInteger(0, Ints.second)) {
    if (!OnlyFirstRequired || Strs.second.trim().size()) {
      Ctx.emitError("can't parse second integer attribute " + Name);
      return Default;
    }
  }

  return Ints;
AMDGPU: allow specifying a workgroup size that needs to fit in a compute unit
Summary:
For GL_ARB_compute_shader we need to support workgroup sizes of at least 1024. However, if we want to allow large workgroup sizes, we may need to use fewer registers, as we have to run more waves per SIMD.
This patch adds an attribute to specify the maximum workgroup size the compiled program needs to support. It defaults to 256, as that has no wave restrictions.
Reducing the number of registers available is done similarly to how the registers were reserved for chips with the sgpr init bug.
Reviewers: mareko, arsenm, tstellarAMD, nhaehnle
Subscribers: FireBurn, kerberizer, llvm-commits, arsenm
Differential Revision: http://reviews.llvm.org/D18340
Patch By: Bas Nieuwenhuizen
llvm-svn: 266337
2016-04-15 00:27:07 +08:00
}
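
The parsing rule in `getIntegerPairAttribute` (split on the first comma, trim both halves, and tolerate a missing second value only when `OnlyFirstRequired` is set) can be tried out without LLVM. The following is a rough standalone sketch, not the function above: `trimBlanks`, `parseInt`, `parseIntPair`, and `checkPairExamples` are hypothetical names, errors are signalled only by returning the default, and parsing is base 10 via `std::from_chars` (C++17), whereas `StringRef::getAsInteger(0, ...)` auto-detects the radix.

```cpp
#include <cassert>
#include <charconv>
#include <string_view>
#include <system_error>
#include <utility>

// Trims ASCII spaces and tabs from both ends.
inline std::string_view trimBlanks(std::string_view S) {
  const char *Blanks = " \t";
  size_t Begin = S.find_first_not_of(Blanks);
  if (Begin == std::string_view::npos)
    return {};
  size_t End = S.find_last_not_of(Blanks);
  return S.substr(Begin, End - Begin + 1);
}

// Parses a base-10 int. Returns false (leaving Out untouched) unless the
// whole string is a valid integer.
inline bool parseInt(std::string_view S, int &Out) {
  if (S.empty())
    return false;
  int Tmp = 0;
  auto [Ptr, Ec] = std::from_chars(S.data(), S.data() + S.size(), Tmp);
  if (Ec != std::errc() || Ptr != S.data() + S.size())
    return false;
  Out = Tmp;
  return true;
}

// Mirrors the control flow of the function above on a plain string.
inline std::pair<int, int> parseIntPair(std::string_view Text,
                                        std::pair<int, int> Default,
                                        bool OnlyFirstRequired) {
  size_t Comma = Text.find(',');
  std::string_view First = trimBlanks(Text.substr(0, Comma));
  std::string_view Second;
  if (Comma != std::string_view::npos)
    Second = trimBlanks(Text.substr(Comma + 1));

  std::pair<int, int> Ints = Default;
  if (!parseInt(First, Ints.first))
    return Default;
  if (!parseInt(Second, Ints.second)) {
    // An absent second value is fine only when it is optional.
    if (!OnlyFirstRequired || !Second.empty())
      return Default;
  }
  return Ints;
}

inline void checkPairExamples() {
  assert(parseIntPair("128, 256", {1, 1}, false) == std::make_pair(128, 256));
  assert(parseIntPair("64", {1, 2}, true) == std::make_pair(64, 2));
  assert(parseIntPair("64", {1, 2}, false) == std::make_pair(1, 2));
}
```

Note that when the second value is optional and absent, the result keeps the default's second component, which matches the behaviour of the LLVM function.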

unsigned getWaitcntBitMask(IsaVersion Version) {
  unsigned Vmcnt = getBitMask(getVmcntBitShift(), getVmcntBitWidth());
  unsigned Expcnt = getBitMask(getExpcntBitShift(), getExpcntBitWidth());
  unsigned Lgkmcnt = getBitMask(getLgkmcntBitShift(), getLgkmcntBitWidth());
  return Vmcnt | Expcnt | Lgkmcnt;
}

unsigned getVmcntBitMask(IsaVersion Version) {
  return (1 << getVmcntBitWidth()) - 1;
}

unsigned getExpcntBitMask(IsaVersion Version) {
  return (1 << getExpcntBitWidth()) - 1;
}

unsigned getLgkmcntBitMask(IsaVersion Version) {
  return (1 << getLgkmcntBitWidth()) - 1;
}

unsigned decodeVmcnt(IsaVersion Version, unsigned Waitcnt) {
  return unpackBits(Waitcnt, getVmcntBitShift(), getVmcntBitWidth());
}

unsigned decodeExpcnt(IsaVersion Version, unsigned Waitcnt) {
  return unpackBits(Waitcnt, getExpcntBitShift(), getExpcntBitWidth());
}

unsigned decodeLgkmcnt(IsaVersion Version, unsigned Waitcnt) {
  return unpackBits(Waitcnt, getLgkmcntBitShift(), getLgkmcntBitWidth());
}

void decodeWaitcnt(IsaVersion Version, unsigned Waitcnt,
                   unsigned &Vmcnt, unsigned &Expcnt, unsigned &Lgkmcnt) {
  Vmcnt = decodeVmcnt(Version, Waitcnt);
  Expcnt = decodeExpcnt(Version, Waitcnt);
  Lgkmcnt = decodeLgkmcnt(Version, Waitcnt);
}

unsigned encodeVmcnt(IsaVersion Version, unsigned Waitcnt, unsigned Vmcnt) {
  return packBits(Vmcnt, Waitcnt, getVmcntBitShift(), getVmcntBitWidth());
}

unsigned encodeExpcnt(IsaVersion Version, unsigned Waitcnt, unsigned Expcnt) {
  return packBits(Expcnt, Waitcnt, getExpcntBitShift(), getExpcntBitWidth());
}

unsigned encodeLgkmcnt(IsaVersion Version, unsigned Waitcnt, unsigned Lgkmcnt) {
  return packBits(Lgkmcnt, Waitcnt, getLgkmcntBitShift(), getLgkmcntBitWidth());
}

unsigned encodeWaitcnt(IsaVersion Version,
                       unsigned Vmcnt, unsigned Expcnt, unsigned Lgkmcnt) {
  unsigned Waitcnt = getWaitcntBitMask(Version);
  Waitcnt = encodeVmcnt(Version, Waitcnt, Vmcnt);
  Waitcnt = encodeExpcnt(Version, Waitcnt, Expcnt);
  Waitcnt = encodeLgkmcnt(Version, Waitcnt, Lgkmcnt);
  return Waitcnt;
}
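
To see the waitcnt layout in numbers, here is a small standalone sketch that reuses the shift and width values from the getters in this file (vmcnt at bits 0-3, expcnt at bits 4-6, lgkmcnt at bits 8-11). The `Field` struct and the `fieldMask`, `packField`, `unpackField`, `encodeWaitcntSketch`, and `checkWaitcntExamples` names are hypothetical and exist only for this illustration; the `IsaVersion` parameter is dropped because the functions above do not use it.

```cpp
#include <cassert>

// Hypothetical description of one counter field: bit offset and bit width.
struct Field {
  unsigned Shift;
  unsigned Width;
};

// Values copied from the getters in this file.
constexpr Field VmcntField{0, 4};
constexpr Field ExpcntField{4, 3};
constexpr Field LgkmcntField{8, 4};

constexpr unsigned fieldMask(Field F) {
  return ((1u << F.Width) - 1) << F.Shift;
}

// Replaces field F inside Word with Value; excess high bits are dropped.
inline unsigned packField(unsigned Word, Field F, unsigned Value) {
  return (Word & ~fieldMask(F)) | ((Value << F.Shift) & fieldMask(F));
}

inline unsigned unpackField(unsigned Word, Field F) {
  return (Word & fieldMask(F)) >> F.Shift;
}

// Starts from all counter bits set, then overwrites each field in turn.
inline unsigned encodeWaitcntSketch(unsigned Vm, unsigned Exp, unsigned Lgkm) {
  unsigned Word = fieldMask(VmcntField) | fieldMask(ExpcntField) |
                  fieldMask(LgkmcntField);
  Word = packField(Word, VmcntField, Vm);
  Word = packField(Word, ExpcntField, Exp);
  Word = packField(Word, LgkmcntField, Lgkm);
  return Word;
}

inline void checkWaitcntExamples() {
  // vmcnt=3, expcnt=2, lgkmcnt=1 -> 0x3 | 0x20 | 0x100
  assert(encodeWaitcntSketch(3, 2, 1) == 0x123u);
  assert(unpackField(0x123u, ExpcntField) == 2u);
}
```

Bit 7 lies between the expcnt and lgkmcnt fields and belongs to none of them, which is why the combined mask in this sketch is 0xF7F rather than 0xFFF.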

unsigned getInitialPSInputAddr(const Function &F) {
  return getIntegerAttribute(F, "InitialPSInputAddr", 0);
}

bool isShader(CallingConv::ID cc) {
  switch(cc) {
    case CallingConv::AMDGPU_VS:
    case CallingConv::AMDGPU_GS:
    case CallingConv::AMDGPU_PS:
    case CallingConv::AMDGPU_CS:
      return true;
    default:
      return false;
  }
}

bool isCompute(CallingConv::ID cc) {
  return !isShader(cc) || cc == CallingConv::AMDGPU_CS;
}

bool isSI(const MCSubtargetInfo &STI) {
  return STI.getFeatureBits()[AMDGPU::FeatureSouthernIslands];
}

bool isCI(const MCSubtargetInfo &STI) {
  return STI.getFeatureBits()[AMDGPU::FeatureSeaIslands];
}

bool isVI(const MCSubtargetInfo &STI) {
  return STI.getFeatureBits()[AMDGPU::FeatureVolcanicIslands];
}

unsigned getMCReg(unsigned Reg, const MCSubtargetInfo &STI) {

  switch(Reg) {
  default: break;
  case AMDGPU::FLAT_SCR:
    assert(!isSI(STI));
    return isCI(STI) ? AMDGPU::FLAT_SCR_ci : AMDGPU::FLAT_SCR_vi;

  case AMDGPU::FLAT_SCR_LO:
    assert(!isSI(STI));
    return isCI(STI) ? AMDGPU::FLAT_SCR_LO_ci : AMDGPU::FLAT_SCR_LO_vi;

  case AMDGPU::FLAT_SCR_HI:
    assert(!isSI(STI));
    return isCI(STI) ? AMDGPU::FLAT_SCR_HI_ci : AMDGPU::FLAT_SCR_HI_vi;
  }
  return Reg;
}

bool isSISrcOperand(const MCInstrDesc &Desc, unsigned OpNo) {
  unsigned OpType = Desc.OpInfo[OpNo].OperandType;

  return OpType == AMDGPU::OPERAND_REG_IMM32_INT ||
         OpType == AMDGPU::OPERAND_REG_IMM32_FP ||
         OpType == AMDGPU::OPERAND_REG_INLINE_C_INT ||
         OpType == AMDGPU::OPERAND_REG_INLINE_C_FP;
}

bool isSISrcFPOperand(const MCInstrDesc &Desc, unsigned OpNo) {
  unsigned OpType = Desc.OpInfo[OpNo].OperandType;

  return OpType == AMDGPU::OPERAND_REG_IMM32_FP ||
         OpType == AMDGPU::OPERAND_REG_INLINE_C_FP;
}

bool isSISrcInlinableOperand(const MCInstrDesc &Desc, unsigned OpNo) {
  unsigned OpType = Desc.OpInfo[OpNo].OperandType;

  return OpType == AMDGPU::OPERAND_REG_INLINE_C_INT ||
         OpType == AMDGPU::OPERAND_REG_INLINE_C_FP;
}

// Avoid using MCRegisterClass::getSize, since that function will go away
// (move from MC* level to Target* level). Return size in bits.
unsigned getRegBitWidth(unsigned RCID) {
  switch (RCID) {
  case AMDGPU::SGPR_32RegClassID:
  case AMDGPU::VGPR_32RegClassID:
  case AMDGPU::VS_32RegClassID:
  case AMDGPU::SReg_32RegClassID:
  case AMDGPU::SReg_32_XM0RegClassID:
    return 32;
  case AMDGPU::SGPR_64RegClassID:
  case AMDGPU::VS_64RegClassID:
  case AMDGPU::SReg_64RegClassID:
  case AMDGPU::VReg_64RegClassID:
    return 64;
  case AMDGPU::VReg_96RegClassID:
    return 96;
  case AMDGPU::SGPR_128RegClassID:
  case AMDGPU::SReg_128RegClassID:
  case AMDGPU::VReg_128RegClassID:
    return 128;
  case AMDGPU::SReg_256RegClassID:
  case AMDGPU::VReg_256RegClassID:
    return 256;
  case AMDGPU::SReg_512RegClassID:
  case AMDGPU::VReg_512RegClassID:
    return 512;
  default:
    llvm_unreachable("Unexpected register class");
  }
}

unsigned getRegBitWidth(const MCRegisterClass &RC) {
  return getRegBitWidth(RC.getID());
}

unsigned getRegOperandSize(const MCRegisterInfo *MRI, const MCInstrDesc &Desc,
                           unsigned OpNo) {
  unsigned RCID = Desc.OpInfo[OpNo].RegClass;
  return getRegBitWidth(MRI->getRegClass(RCID)) / 8;
|
AMDGPU] Assembler: better support for immediate literals in assembler.
Summary:
Prevously assembler parsed all literals as either 32-bit integers or 32-bit floating-point values. Because of this we couldn't support f64 literals.
E.g. in instruction "v_fract_f64 v[0:1], 0.5", literal 0.5 was encoded as 32-bit literal 0x3f000000, which is incorrect and will be interpreted as 3.0517578125E-5 instead of 0.5. Correct encoding is inline constant 240 (optimal) or 32-bit literal 0x3FE00000 at least.
With this change the way immediate literals are parsed is changed. All literals are always parsed as 64-bit values either integer or floating-point. Then we convert parsed literals to correct form based on information about type of operand parsed (was literal floating or binary) and type of expected instruction operands (is this f32/64 or b32/64 instruction).
Here are the rules for converting literals:
- We parsed an fp literal:
  - Instruction expects a 64-bit operand:
    - If the parsed literal is inlinable (e.g. v_fract_f64_e32 v[0:1], 0.5)
      - then we do nothing with this literal
    - Else if the literal is not inlinable but the instruction requires it to be inlined (e.g. this is the e64 encoding, v_fract_f64_e64 v[0:1], 1.5)
      - report error
    - Else the literal is not inlinable but we can encode it as an additional 32-bit literal constant
      - If the instruction expects an fp operand type (f64)
        - Check if the low 32 bits of the literal are zeroes (e.g. v_fract_f64 v[0:1], 1.5)
          - If so then do nothing
        - Else (e.g. v_fract_f64 v[0:1], 3.1415)
          - report warning that the low 32 bits will be set to zeroes and precision will be lost
          - set the low 32 bits of the literal to zeroes
      - Instruction expects an integer operand type (e.g. s_mov_b64_e32 s[0:1], 1.5)
        - report error as it is unclear how to encode this literal
  - Instruction expects a 32-bit operand:
    - Convert the parsed 64-bit fp literal to 32-bit fp. Allow loss of precision but not overflow or underflow
    - If we are required to inline the literal (e.g. v_trunc_f32_e64 v0, 0.5)
      - If the literal is inlinable, do nothing
      - Else report error
    - Otherwise do nothing. We can encode any other 32-bit fp literal (e.g. v_trunc_f32 v0, 10000000.0)
- Parsed a binary literal:
  - If this literal is inlinable (e.g. v_trunc_f32_e32 v0, 35)
    - do nothing
  - Else, if we are required to inline this literal (e.g. v_trunc_f32_e64 v0, 35)
    - report error
  - Else, the literal is not inlinable and we are not required to inline it
    - If the high 32 bits of the literal are zeroes or all equal to the sign bit of the low 32 bits
      - do nothing (e.g. v_trunc_f32 v0, 0xdeadbeef)
    - Else
      - report error (e.g. v_trunc_f32 v0, 0x123456789abcdef0)
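The rules above can be sketched as two small decision functions. This is a rough illustration, not the code from the patch: `LiteralAction`, `classifyFp64Literal`, `classifyIntLiteral` and `doubleBits` are hypothetical names, and the `Inlinable` / `MustInline` flags are assumed to be computed elsewhere (for example by predicates such as `isInlinableLiteral64`).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical result type for this sketch.
enum class LiteralAction { Keep, WarnAndZeroLow32, Error };

// Hypothetical helper: raw IEEE-754 bit pattern of a double.
inline uint64_t doubleBits(double D) {
  uint64_t B;
  std::memcpy(&B, &D, sizeof(B));
  return B;
}

// fp literal parsed, instruction expects a 64-bit operand.
inline LiteralAction classifyFp64Literal(uint64_t Bits, bool Inlinable,
                                         bool MustInline, bool OperandIsFP) {
  if (Inlinable)
    return LiteralAction::Keep;   // encoded as an inline constant
  if (MustInline)
    return LiteralAction::Error;  // e.g. e64 encoding with 1.5
  if (!OperandIsFP)
    return LiteralAction::Error;  // unclear how to encode fp as b64
  // Only the high 32 bits fit in the extra literal dword.
  return (Bits & 0xffffffffull) == 0 ? LiteralAction::Keep
                                     : LiteralAction::WarnAndZeroLow32;
}

// Binary (integer) literal parsed.
inline LiteralAction classifyIntLiteral(int64_t Value, bool Inlinable,
                                        bool MustInline) {
  if (Inlinable)
    return LiteralAction::Keep;
  if (MustInline)
    return LiteralAction::Error;
  const uint64_t U = static_cast<uint64_t>(Value);
  const uint32_t High = static_cast<uint32_t>(U >> 32);
  const uint32_t Low = static_cast<uint32_t>(U);
  // Accept zero-extended or sign-extended 32-bit values only.
  const uint32_t SignFill = (Low >> 31) ? 0xffffffffu : 0u;
  return (High == 0 || High == SignFill) ? LiteralAction::Keep
                                         : LiteralAction::Error;
}
```

For instance, `classifyFp64Literal(doubleBits(3.1415), false, false, true)` yields `WarnAndZeroLow32`, while `classifyIntLiteral(0x123456789abcdef0, false, false)` yields `Error`, matching the examples in the list.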
This change requires knowing the operand types of each instruction (f32/64 or b32/64). I added several new register operands (they extend the previous register operands) and set their operand types accordingly:
'''
enum OperandType {
OPERAND_REG_IMM32_INT,
OPERAND_REG_IMM32_FP,
OPERAND_REG_INLINE_C_INT,
OPERAND_REG_INLINE_C_FP,
}
'''
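As a rough sketch of how these operand types could drive the conversion rules, the enum can be paired with two small predicates. The helper names `isFPOperand` and `requiresInlineConstant` are hypothetical, and the reading of `INLINE_C` as "register or inline constant only" is an assumption based on the enumerator names, not something stated in this message.

```cpp
#include <cassert>

// Enumerators copied from the commit message; the numeric values used
// here are arbitrary and only serve this illustration.
enum OperandType {
  OPERAND_REG_IMM32_INT,
  OPERAND_REG_IMM32_FP,
  OPERAND_REG_INLINE_C_INT,
  OPERAND_REG_INLINE_C_FP,
};

// Hypothetical: does the operand expect a floating-point value?
constexpr bool isFPOperand(OperandType T) {
  return T == OPERAND_REG_IMM32_FP || T == OPERAND_REG_INLINE_C_FP;
}

// Hypothetical: must a literal in this operand be an inline constant?
// Assumes INLINE_C operands cannot take an extra 32-bit literal.
constexpr bool requiresInlineConstant(OperandType T) {
  return T == OPERAND_REG_INLINE_C_INT || T == OPERAND_REG_INLINE_C_FP;
}

static_assert(isFPOperand(OPERAND_REG_IMM32_FP), "fp operand");
static_assert(!requiresInlineConstant(OPERAND_REG_IMM32_INT), "imm32 allowed");
```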
This is not working yet:
- Several tests are failing
- Problems with predicate methods for inline immediates
- The LLVM-generated assembler parts try to select the e64 encoding before e32.
More changes are required for several AsmOperands.
Reviewers: vpykhtin, tstellarAMD
Subscribers: arsenm, kzhuravl, artem.tamazov
Differential Revision: https://reviews.llvm.org/D22922
llvm-svn: 281050
2016-09-09 22:44:04 +08:00
|
|
|
}
|
|
|
|
|
2016-12-06 06:26:17 +08:00
|
|
|
bool isInlinableLiteral64(int64_t Literal, bool HasInv2Pi) {
|
2016-09-09 22:44:04 +08:00
|
|
|
// Integers in the range [-16, 64] are always inlinable.
if (Literal >= -16 && Literal <= 64)
|
|
|
|
return true;
|
|
|
|
|
2016-12-06 06:26:17 +08:00
|
|
|
uint64_t Val = static_cast<uint64_t>(Literal);
|
|
|
|
return (Val == DoubleToBits(0.0)) ||
|
|
|
|
(Val == DoubleToBits(1.0)) ||
|
|
|
|
(Val == DoubleToBits(-1.0)) ||
|
|
|
|
(Val == DoubleToBits(0.5)) ||
|
|
|
|
(Val == DoubleToBits(-0.5)) ||
|
|
|
|
(Val == DoubleToBits(2.0)) ||
|
|
|
|
(Val == DoubleToBits(-2.0)) ||
|
|
|
|
(Val == DoubleToBits(4.0)) ||
|
|
|
|
(Val == DoubleToBits(-4.0)) ||
|
|
|
|
(Val == 0x3fc45f306dc9c882 && HasInv2Pi); // 1/(2*pi) as a 64-bit pattern
|
2016-09-09 22:44:04 +08:00
|
|
|
}
|
|
|
|
|
2016-12-06 06:26:17 +08:00
|
|
|
bool isInlinableLiteral32(int32_t Literal, bool HasInv2Pi) {
|
2016-09-09 22:44:04 +08:00
|
|
|
// Integers in the range [-16, 64] are always inlinable.
if (Literal >= -16 && Literal <= 64)
|
|
|
|
return true;
|
|
|
|
|
2016-12-06 06:26:17 +08:00
|
|
|
uint32_t Val = static_cast<uint32_t>(Literal);
|
|
|
|
return (Val == FloatToBits(0.0f)) ||
|
|
|
|
(Val == FloatToBits(1.0f)) ||
|
|
|
|
(Val == FloatToBits(-1.0f)) ||
|
|
|
|
(Val == FloatToBits(0.5f)) ||
|
|
|
|
(Val == FloatToBits(-0.5f)) ||
|
|
|
|
(Val == FloatToBits(2.0f)) ||
|
|
|
|
(Val == FloatToBits(-2.0f)) ||
|
|
|
|
(Val == FloatToBits(4.0f)) ||
|
|
|
|
(Val == FloatToBits(-4.0f)) ||
|
|
|
|
(Val == 0x3e22f983 && HasInv2Pi); // 1/(2*pi) as a 32-bit pattern
|
2016-09-09 22:44:04 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
|
2015-06-27 05:15:07 +08:00
|
|
|
} // End namespace AMDGPU
|
|
|
|
} // End namespace llvm
|