AMDGPU: Make getTgtMemIntrinsic table-driven for resource-based intrinsics
Summary:
Avoids having to list all intrinsics manually.
This is in preparation for the new dimension-aware image intrinsics,
which I'd rather not have to list here by hand.
Change-Id: If7ced04998397ef68c4cb8f7de66b5050fb767e5
Reviewers: arsenm, rampitec, b-sumner
Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D44937
llvm-svn: 328938
2018-04-02 01:09:07 +08:00
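As a rough illustration of what "table-driven" means here, the sketch below models a keyed lookup over a sorted table in plain C++. It is not the code TableGen generates: the row type `RsrcIntrinsicRow`, the function `lookupRow`, and the numeric intrinsic IDs are all hypothetical, and only the field names (`Intr`, `RsrcArg`, `IsImage`) come from the file.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical row type. The field names mirror the RsrcIntrinsic class,
// but the intrinsic IDs used below are made up for illustration.
struct RsrcIntrinsicRow {
  unsigned Intr;    // intrinsic ID, used as the primary key
  uint8_t RsrcArg;  // index of the resource operand
  bool IsImage;     // true for image intrinsics
};

// Rows must stay sorted by Intr for the binary search below to work.
constexpr std::array<RsrcIntrinsicRow, 3> Table = {{
    {10, 2, false},
    {25, 1, false},
    {40, 0, true},
}};

// Returns the row whose key equals Intr, or nullptr if there is none.
const RsrcIntrinsicRow *lookupRow(unsigned Intr) {
  auto It = std::lower_bound(
      Table.begin(), Table.end(), Intr,
      [](const RsrcIntrinsicRow &Row, unsigned Key) { return Row.Intr < Key; });
  if (It == Table.end() || It->Intr != Intr)
    return nullptr;
  return &*It;
}
```

With a table like this, a caller asks for one row by key and handles the null case, instead of enumerating every intrinsic in a switch statement.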
//===-- AMDGPUSearchableTables.td - ------------------------*- tablegen -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

//===----------------------------------------------------------------------===//
// Resource intrinsics table.
//===----------------------------------------------------------------------===//

class RsrcIntrinsic<AMDGPURsrcIntrinsic intr> {
  Intrinsic Intr = !cast<Intrinsic>(intr);
  bits<8> RsrcArg = intr.RsrcArg;
  bit IsImage = intr.IsImage;
}

def RsrcIntrinsics : GenericTable {
  let FilterClass = "RsrcIntrinsic";
  let Fields = ["Intr", "RsrcArg", "IsImage"];

  let PrimaryKey = ["Intr"];
  let PrimaryKeyName = "lookupRsrcIntrinsic";
}

foreach intr = !listconcat(AMDGPUBufferIntrinsics,
AMDGPU: Dimension-aware image intrinsics
Summary:
These new image intrinsics contain the texture type as part of
their name and have each component of the address/coordinate as
individual parameters.
This is a preparatory step for implementing the A16 feature, where
coordinates are passed as half-floats or -ints, but the Z compare
value and texel offsets are still full dwords, making it difficult
or impossible to distinguish between A16 on or off in the old-style
intrinsics.
Additionally, these intrinsics pass the 'texfailpolicy' and
'cachectrl' as i32 bit fields to reduce operand clutter and allow
for future extensibility.
v2:
- gather4 supports 2darray images
- fix a bug with 1D images on SI
Change-Id: I099f309e0a394082a5901ea196c3967afb867f04
Reviewers: arsenm, rampitec, b-sumner
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D44939
llvm-svn: 329166
2018-04-04 18:58:54 +08:00
                           AMDGPUImageDimIntrinsics,
                           AMDGPUImageDimAtomicIntrinsics) in {
  def : RsrcIntrinsic<!cast<AMDGPURsrcIntrinsic>(intr)>;
}

[AMDGPU][SILoadStoreOptimizer] Merge TBUFFER loads/stores
Summary: Extend SILoadStoreOptimizer to merge tbuffer loads and stores.
Reviewers: nhaehnle
Reviewed By: nhaehnle
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69794
2019-11-21 05:30:02 +08:00
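The buffer-format tables that follow are keyed on (`BitsPerComp`, `NumComponents`, `NumFormat`), which suggests how a merge could pick the wider format. The sketch below is a plain C++ model under that assumption; it is not taken from SILoadStoreOptimizer. `BufferFormatInfo`, `findFormat`, and `mergedFormat` are hypothetical names, the four rows are copied from the GFX9 table, and returning 0 for "no matching format" is a convention of this sketch only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the GcnBufferFormatBase fields.
struct BufferFormatInfo {
  uint8_t Format, BitsPerComp, NumComponents, NumFormat, DataFormat;
};

// Four rows copied from the GFX9 table: the 32-bit UINT formats.
constexpr BufferFormatInfo Gfx9Rows[] = {
    {0x44, 32, 1, 4, 4},   // FORMAT_32_UINT
    {0x4B, 32, 2, 4, 11},  // FORMAT_32_32_UINT
    {0x4D, 32, 3, 4, 13},  // FORMAT_32_32_32_UINT
    {0x4E, 32, 4, 4, 14},  // FORMAT_32_32_32_32_UINT
};

// Linear stand-in for the keyed lookup on
// (BitsPerComp, NumComponents, NumFormat).
const BufferFormatInfo *findFormat(uint8_t Bpc, uint8_t NumC, uint8_t NFmt) {
  for (const BufferFormatInfo &Row : Gfx9Rows)
    if (Row.BitsPerComp == Bpc && Row.NumComponents == NumC &&
        Row.NumFormat == NFmt)
      return &Row;
  return nullptr;
}

// Format code covering both accesses, or 0 if the table has no such row.
// Assumes a merge is only attempted when component size and numeric
// format agree.
uint8_t mergedFormat(const BufferFormatInfo &A, const BufferFormatInfo &B) {
  if (A.BitsPerComp != B.BitsPerComp || A.NumFormat != B.NumFormat)
    return 0;
  uint8_t NumC = static_cast<uint8_t>(A.NumComponents + B.NumComponents);
  const BufferFormatInfo *M = findFormat(A.BitsPerComp, NumC, A.NumFormat);
  return M ? M->Format : 0;
}
```

In this model, two `FORMAT_32_UINT` accesses combine into `FORMAT_32_32_UINT` (0x4B), while a combination that would need five components finds no row and is rejected.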
class GcnBufferFormatBase<bits<8> f, bits<8> bpc, bits<8> numc, bits<8> nfmt, bits<8> dfmt> {
  bits<8> Format = f;
  bits<8> BitsPerComp = bpc;
  bits<8> NumComponents = numc;
  bits<8> NumFormat = nfmt;
  bits<8> DataFormat = dfmt;
}

class Gfx9BufferFormat<bits<8> f, bits<8> bpc, bits<8> numc, bits<8> nfmt, bits<8> dfmt> : GcnBufferFormatBase<f, bpc, numc, nfmt, dfmt>;
class Gfx10PlusBufferFormat<bits<8> f, bits<8> bpc, bits<8> numc, bits<8> nfmt, bits<8> dfmt> : GcnBufferFormatBase<f, bpc, numc, nfmt, dfmt>;

class GcnBufferFormatTable : GenericTable {
  let CppTypeName = "GcnBufferFormatInfo";
  let Fields = ["Format", "BitsPerComp", "NumComponents", "NumFormat", "DataFormat"];
  let PrimaryKey = ["BitsPerComp", "NumComponents", "NumFormat"];
}

def Gfx9BufferFormat : GcnBufferFormatTable {
  let FilterClass = "Gfx9BufferFormat";
  let PrimaryKeyName = "getGfx9BufferFormatInfo";
}
def Gfx10PlusBufferFormat : GcnBufferFormatTable {
  let FilterClass = "Gfx10PlusBufferFormat";
  let PrimaryKeyName = "getGfx10PlusBufferFormatInfo";
}

def getGfx9BufferFormatInfo : SearchIndex {
  let Table = Gfx9BufferFormat;
  let Key = ["Format"];
}
def getGfx10PlusBufferFormatInfo : SearchIndex {
  let Table = Gfx10PlusBufferFormat;
  let Key = ["Format"];
}

// Buffer formats with equal component sizes (GFX9 and earlier)
def : Gfx9BufferFormat< /*FORMAT_8_UNORM*/ 0x01, 8, 1, /*NUM_FORMAT_UNORM*/ 0, /*DATA_FORMAT_8*/ 1>;
def : Gfx9BufferFormat< /*FORMAT_8_SNORM*/ 0x11, 8, 1, /*NUM_FORMAT_SNORM*/ 1, /*DATA_FORMAT_8*/ 1>;
def : Gfx9BufferFormat< /*FORMAT_8_USCALED*/ 0x21, 8, 1, /*NUM_FORMAT_USCALED*/ 2, /*DATA_FORMAT_8*/ 1>;
def : Gfx9BufferFormat< /*FORMAT_8_SSCALED*/ 0x31, 8, 1, /*NUM_FORMAT_SSCALED*/ 3, /*DATA_FORMAT_8*/ 1>;
def : Gfx9BufferFormat< /*FORMAT_8_UINT*/ 0x41, 8, 1, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_8*/ 1>;
def : Gfx9BufferFormat< /*FORMAT_8_SINT*/ 0x51, 8, 1, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_8*/ 1>;
def : Gfx9BufferFormat< /*FORMAT_16_UNORM*/ 0x02, 16, 1, /*NUM_FORMAT_UNORM*/ 0, /*DATA_FORMAT_16*/ 2>;
def : Gfx9BufferFormat< /*FORMAT_16_SNORM*/ 0x12, 16, 1, /*NUM_FORMAT_SNORM*/ 1, /*DATA_FORMAT_16*/ 2>;
def : Gfx9BufferFormat< /*FORMAT_16_USCALED*/ 0x22, 16, 1, /*NUM_FORMAT_USCALED*/ 2, /*DATA_FORMAT_16*/ 2>;
def : Gfx9BufferFormat< /*FORMAT_16_SSCALED*/ 0x32, 16, 1, /*NUM_FORMAT_SSCALED*/ 3, /*DATA_FORMAT_16*/ 2>;
def : Gfx9BufferFormat< /*FORMAT_16_UINT*/ 0x42, 16, 1, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_16*/ 2>;
def : Gfx9BufferFormat< /*FORMAT_16_SINT*/ 0x52, 16, 1, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_16*/ 2>;
def : Gfx9BufferFormat< /*FORMAT_16_FLOAT*/ 0x72, 16, 1, /*NUM_FORMAT_FLOAT*/ 7, /*DATA_FORMAT_16*/ 2>;
def : Gfx9BufferFormat< /*FORMAT_8_8_UNORM*/ 0x03, 8, 2, /*NUM_FORMAT_UNORM*/ 0, /*DATA_FORMAT_8_8*/ 3>;
def : Gfx9BufferFormat< /*FORMAT_8_8_SNORM*/ 0x13, 8, 2, /*NUM_FORMAT_SNORM*/ 1, /*DATA_FORMAT_8_8*/ 3>;
def : Gfx9BufferFormat< /*FORMAT_8_8_USCALED*/ 0x23, 8, 2, /*NUM_FORMAT_USCALED*/ 2, /*DATA_FORMAT_8_8*/ 3>;
def : Gfx9BufferFormat< /*FORMAT_8_8_SSCALED*/ 0x33, 8, 2, /*NUM_FORMAT_SSCALED*/ 3, /*DATA_FORMAT_8_8*/ 3>;
def : Gfx9BufferFormat< /*FORMAT_8_8_UINT*/ 0x43, 8, 2, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_8_8*/ 3>;
def : Gfx9BufferFormat< /*FORMAT_8_8_SINT*/ 0x53, 8, 2, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_8_8*/ 3>;
def : Gfx9BufferFormat< /*FORMAT_32_UINT*/ 0x44, 32, 1, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_32*/ 4>;
def : Gfx9BufferFormat< /*FORMAT_32_SINT*/ 0x54, 32, 1, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_32*/ 4>;
def : Gfx9BufferFormat< /*FORMAT_32_FLOAT*/ 0x74, 32, 1, /*NUM_FORMAT_FLOAT*/ 7, /*DATA_FORMAT_32*/ 4>;
def : Gfx9BufferFormat< /*FORMAT_16_16_UNORM*/ 0x05, 16, 2, /*NUM_FORMAT_UNORM*/ 0, /*DATA_FORMAT_16_16*/ 5>;
def : Gfx9BufferFormat< /*FORMAT_16_16_SNORM*/ 0x15, 16, 2, /*NUM_FORMAT_SNORM*/ 1, /*DATA_FORMAT_16_16*/ 5>;
def : Gfx9BufferFormat< /*FORMAT_16_16_USCALED*/ 0x25, 16, 2, /*NUM_FORMAT_USCALED*/ 2, /*DATA_FORMAT_16_16*/ 5>;
def : Gfx9BufferFormat< /*FORMAT_16_16_SSCALED*/ 0x35, 16, 2, /*NUM_FORMAT_SSCALED*/ 3, /*DATA_FORMAT_16_16*/ 5>;
def : Gfx9BufferFormat< /*FORMAT_16_16_UINT*/ 0x45, 16, 2, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_16_16*/ 5>;
def : Gfx9BufferFormat< /*FORMAT_16_16_SINT*/ 0x55, 16, 2, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_16_16*/ 5>;
def : Gfx9BufferFormat< /*FORMAT_16_16_FLOAT*/ 0x75, 16, 2, /*NUM_FORMAT_FLOAT*/ 7, /*DATA_FORMAT_16_16*/ 5>;
def : Gfx9BufferFormat< /*FORMAT_8_8_8_8_UNORM*/ 0x0A, 8, 4, /*NUM_FORMAT_UNORM*/ 0, /*DATA_FORMAT_8_8_8_8*/ 10>;
def : Gfx9BufferFormat< /*FORMAT_8_8_8_8_SNORM*/ 0x1A, 8, 4, /*NUM_FORMAT_SNORM*/ 1, /*DATA_FORMAT_8_8_8_8*/ 10>;
def : Gfx9BufferFormat< /*FORMAT_8_8_8_8_USCALED*/ 0x2A, 8, 4, /*NUM_FORMAT_USCALED*/ 2, /*DATA_FORMAT_8_8_8_8*/ 10>;
def : Gfx9BufferFormat< /*FORMAT_8_8_8_8_SSCALED*/ 0x3A, 8, 4, /*NUM_FORMAT_SSCALED*/ 3, /*DATA_FORMAT_8_8_8_8*/ 10>;
def : Gfx9BufferFormat< /*FORMAT_8_8_8_8_UINT*/ 0x4A, 8, 4, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_8_8_8_8*/ 10>;
def : Gfx9BufferFormat< /*FORMAT_8_8_8_8_SINT*/ 0x5A, 8, 4, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_8_8_8_8*/ 10>;
def : Gfx9BufferFormat< /*FORMAT_32_32_UINT*/ 0x4B, 32, 2, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_32_32*/ 11>;
def : Gfx9BufferFormat< /*FORMAT_32_32_SINT*/ 0x5B, 32, 2, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_32_32*/ 11>;
def : Gfx9BufferFormat< /*FORMAT_32_32_FLOAT*/ 0x7B, 32, 2, /*NUM_FORMAT_FLOAT*/ 7, /*DATA_FORMAT_32_32*/ 11>;
def : Gfx9BufferFormat< /*FORMAT_16_16_16_16_UNORM*/ 0x0C, 16, 4, /*NUM_FORMAT_UNORM*/ 0, /*DATA_FORMAT_16_16_16_16*/ 12>;
def : Gfx9BufferFormat< /*FORMAT_16_16_16_16_SNORM*/ 0x1C, 16, 4, /*NUM_FORMAT_SNORM*/ 1, /*DATA_FORMAT_16_16_16_16*/ 12>;
def : Gfx9BufferFormat< /*FORMAT_16_16_16_16_USCALED*/ 0x2C, 16, 4, /*NUM_FORMAT_USCALED*/ 2, /*DATA_FORMAT_16_16_16_16*/ 12>;
def : Gfx9BufferFormat< /*FORMAT_16_16_16_16_SSCALED*/ 0x3C, 16, 4, /*NUM_FORMAT_SSCALED*/ 3, /*DATA_FORMAT_16_16_16_16*/ 12>;
def : Gfx9BufferFormat< /*FORMAT_16_16_16_16_UINT*/ 0x4C, 16, 4, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_16_16_16_16*/ 12>;
def : Gfx9BufferFormat< /*FORMAT_16_16_16_16_SINT*/ 0x5C, 16, 4, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_16_16_16_16*/ 12>;
def : Gfx9BufferFormat< /*FORMAT_16_16_16_16_FLOAT*/ 0x7C, 16, 4, /*NUM_FORMAT_FLOAT*/ 7, /*DATA_FORMAT_16_16_16_16*/ 12>;
def : Gfx9BufferFormat< /*FORMAT_32_32_32_UINT*/ 0x4D, 32, 3, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_32_32_32*/ 13>;
def : Gfx9BufferFormat< /*FORMAT_32_32_32_SINT*/ 0x5D, 32, 3, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_32_32_32*/ 13>;
def : Gfx9BufferFormat< /*FORMAT_32_32_32_FLOAT*/ 0x7D, 32, 3, /*NUM_FORMAT_FLOAT*/ 7, /*DATA_FORMAT_32_32_32*/ 13>;
def : Gfx9BufferFormat< /*FORMAT_32_32_32_32_UINT*/ 0x4E, 32, 4, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_32_32_32_32*/ 14>;
def : Gfx9BufferFormat< /*FORMAT_32_32_32_32_SINT*/ 0x5E, 32, 4, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_32_32_32_32*/ 14>;
def : Gfx9BufferFormat< /*FORMAT_32_32_32_32_FLOAT*/ 0x7E, 32, 4, /*NUM_FORMAT_FLOAT*/ 7, /*DATA_FORMAT_32_32_32_32*/ 14>;

// Buffer formats with equal component sizes (GFX10 and later)
def : Gfx10PlusBufferFormat< /*FORMAT_8_UNORM*/ 0x01, 8, 1, /*NUM_FORMAT_UNORM*/ 0, /*DATA_FORMAT_8*/ 1>;
def : Gfx10PlusBufferFormat< /*FORMAT_8_SNORM*/ 0x02, 8, 1, /*NUM_FORMAT_SNORM*/ 1, /*DATA_FORMAT_8*/ 1>;
def : Gfx10PlusBufferFormat< /*FORMAT_8_USCALED*/ 0x03, 8, 1, /*NUM_FORMAT_USCALED*/ 2, /*DATA_FORMAT_8*/ 1>;
def : Gfx10PlusBufferFormat< /*FORMAT_8_SSCALED*/ 0x04, 8, 1, /*NUM_FORMAT_SSCALED*/ 3, /*DATA_FORMAT_8*/ 1>;
def : Gfx10PlusBufferFormat< /*FORMAT_8_UINT*/ 0x05, 8, 1, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_8*/ 1>;
def : Gfx10PlusBufferFormat< /*FORMAT_8_SINT*/ 0x06, 8, 1, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_8*/ 1>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_UNORM*/ 0x07, 16, 1, /*NUM_FORMAT_UNORM*/ 0, /*DATA_FORMAT_16*/ 2>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_SNORM*/ 0x08, 16, 1, /*NUM_FORMAT_SNORM*/ 1, /*DATA_FORMAT_16*/ 2>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_USCALED*/ 0x09, 16, 1, /*NUM_FORMAT_USCALED*/ 2, /*DATA_FORMAT_16*/ 2>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_SSCALED*/ 0x0A, 16, 1, /*NUM_FORMAT_SSCALED*/ 3, /*DATA_FORMAT_16*/ 2>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_UINT*/ 0x0B, 16, 1, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_16*/ 2>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_SINT*/ 0x0C, 16, 1, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_16*/ 2>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_FLOAT*/ 0x0D, 16, 1, /*NUM_FORMAT_FLOAT*/ 7, /*DATA_FORMAT_16*/ 2>;
def : Gfx10PlusBufferFormat< /*FORMAT_8_8_UNORM*/ 0x0E, 8, 2, /*NUM_FORMAT_UNORM*/ 0, /*DATA_FORMAT_8_8*/ 3>;
def : Gfx10PlusBufferFormat< /*FORMAT_8_8_SNORM*/ 0x0F, 8, 2, /*NUM_FORMAT_SNORM*/ 1, /*DATA_FORMAT_8_8*/ 3>;
def : Gfx10PlusBufferFormat< /*FORMAT_8_8_USCALED*/ 0x10, 8, 2, /*NUM_FORMAT_USCALED*/ 2, /*DATA_FORMAT_8_8*/ 3>;
def : Gfx10PlusBufferFormat< /*FORMAT_8_8_SSCALED*/ 0x11, 8, 2, /*NUM_FORMAT_SSCALED*/ 3, /*DATA_FORMAT_8_8*/ 3>;
def : Gfx10PlusBufferFormat< /*FORMAT_8_8_UINT*/ 0x12, 8, 2, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_8_8*/ 3>;
def : Gfx10PlusBufferFormat< /*FORMAT_8_8_SINT*/ 0x13, 8, 2, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_8_8*/ 3>;
def : Gfx10PlusBufferFormat< /*FORMAT_32_UINT*/ 0x14, 32, 1, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_32*/ 4>;
def : Gfx10PlusBufferFormat< /*FORMAT_32_SINT*/ 0x15, 32, 1, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_32*/ 4>;
def : Gfx10PlusBufferFormat< /*FORMAT_32_FLOAT*/ 0x16, 32, 1, /*NUM_FORMAT_FLOAT*/ 7, /*DATA_FORMAT_32*/ 4>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_16_UNORM*/ 0x17, 16, 2, /*NUM_FORMAT_UNORM*/ 0, /*DATA_FORMAT_16_16*/ 5>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_16_SNORM*/ 0x18, 16, 2, /*NUM_FORMAT_SNORM*/ 1, /*DATA_FORMAT_16_16*/ 5>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_16_USCALED*/ 0x19, 16, 2, /*NUM_FORMAT_USCALED*/ 2, /*DATA_FORMAT_16_16*/ 5>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_16_SSCALED*/ 0x1A, 16, 2, /*NUM_FORMAT_SSCALED*/ 3, /*DATA_FORMAT_16_16*/ 5>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_16_UINT*/ 0x1B, 16, 2, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_16_16*/ 5>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_16_SINT*/ 0x1C, 16, 2, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_16_16*/ 5>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_16_FLOAT*/ 0x1D, 16, 2, /*NUM_FORMAT_FLOAT*/ 7, /*DATA_FORMAT_16_16*/ 5>;
def : Gfx10PlusBufferFormat< /*FORMAT_8_8_8_8_UNORM*/ 0x38, 8, 4, /*NUM_FORMAT_UNORM*/ 0, /*DATA_FORMAT_8_8_8_8*/ 10>;
def : Gfx10PlusBufferFormat< /*FORMAT_8_8_8_8_SNORM*/ 0x39, 8, 4, /*NUM_FORMAT_SNORM*/ 1, /*DATA_FORMAT_8_8_8_8*/ 10>;
def : Gfx10PlusBufferFormat< /*FORMAT_8_8_8_8_USCALED*/ 0x3A, 8, 4, /*NUM_FORMAT_USCALED*/ 2, /*DATA_FORMAT_8_8_8_8*/ 10>;
def : Gfx10PlusBufferFormat< /*FORMAT_8_8_8_8_SSCALED*/ 0x3B, 8, 4, /*NUM_FORMAT_SSCALED*/ 3, /*DATA_FORMAT_8_8_8_8*/ 10>;
def : Gfx10PlusBufferFormat< /*FORMAT_8_8_8_8_UINT*/ 0x3C, 8, 4, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_8_8_8_8*/ 10>;
def : Gfx10PlusBufferFormat< /*FORMAT_8_8_8_8_SINT*/ 0x3D, 8, 4, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_8_8_8_8*/ 10>;
def : Gfx10PlusBufferFormat< /*FORMAT_32_32_UINT*/ 0x3E, 32, 2, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_32_32*/ 11>;
def : Gfx10PlusBufferFormat< /*FORMAT_32_32_SINT*/ 0x3F, 32, 2, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_32_32*/ 11>;
def : Gfx10PlusBufferFormat< /*FORMAT_32_32_FLOAT*/ 0x40, 32, 2, /*NUM_FORMAT_FLOAT*/ 7, /*DATA_FORMAT_32_32*/ 11>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_16_16_16_UNORM*/ 0x41, 16, 4, /*NUM_FORMAT_UNORM*/ 0, /*DATA_FORMAT_16_16_16_16*/ 12>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_16_16_16_SNORM*/ 0x42, 16, 4, /*NUM_FORMAT_SNORM*/ 1, /*DATA_FORMAT_16_16_16_16*/ 12>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_16_16_16_USCALED*/ 0x43, 16, 4, /*NUM_FORMAT_USCALED*/ 2, /*DATA_FORMAT_16_16_16_16*/ 12>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_16_16_16_SSCALED*/ 0x44, 16, 4, /*NUM_FORMAT_SSCALED*/ 3, /*DATA_FORMAT_16_16_16_16*/ 12>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_16_16_16_UINT*/ 0x45, 16, 4, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_16_16_16_16*/ 12>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_16_16_16_SINT*/ 0x46, 16, 4, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_16_16_16_16*/ 12>;
def : Gfx10PlusBufferFormat< /*FORMAT_16_16_16_16_FLOAT*/ 0x47, 16, 4, /*NUM_FORMAT_FLOAT*/ 7, /*DATA_FORMAT_16_16_16_16*/ 12>;
def : Gfx10PlusBufferFormat< /*FORMAT_32_32_32_UINT*/ 0x48, 32, 3, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_32_32_32*/ 13>;
def : Gfx10PlusBufferFormat< /*FORMAT_32_32_32_SINT*/ 0x49, 32, 3, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_32_32_32*/ 13>;
def : Gfx10PlusBufferFormat< /*FORMAT_32_32_32_FLOAT*/ 0x4A, 32, 3, /*NUM_FORMAT_FLOAT*/ 7, /*DATA_FORMAT_32_32_32*/ 13>;
def : Gfx10PlusBufferFormat< /*FORMAT_32_32_32_32_UINT*/ 0x4B, 32, 4, /*NUM_FORMAT_UINT*/ 4, /*DATA_FORMAT_32_32_32_32*/ 14>;
def : Gfx10PlusBufferFormat< /*FORMAT_32_32_32_32_SINT*/ 0x4C, 32, 4, /*NUM_FORMAT_SINT*/ 5, /*DATA_FORMAT_32_32_32_32*/ 14>;
def : Gfx10PlusBufferFormat< /*FORMAT_32_32_32_32_FLOAT*/ 0x4D, 32, 4, /*NUM_FORMAT_FLOAT*/ 7, /*DATA_FORMAT_32_32_32_32*/ 14>;

class SourceOfDivergence<Intrinsic intr> {
  Intrinsic Intr = intr;
}

def SourcesOfDivergence : GenericTable {
  let FilterClass = "SourceOfDivergence";
  let Fields = ["Intr"];

  let PrimaryKey = ["Intr"];
  let PrimaryKeyName = "lookupSourceOfDivergence";
}

def : SourceOfDivergence<int_amdgcn_workitem_id_x>;
def : SourceOfDivergence<int_amdgcn_workitem_id_y>;
def : SourceOfDivergence<int_amdgcn_workitem_id_z>;
def : SourceOfDivergence<int_amdgcn_interp_mov>;
def : SourceOfDivergence<int_amdgcn_interp_p1>;
def : SourceOfDivergence<int_amdgcn_interp_p2>;
[AMDGPU] Add intrinsics for 16 bit interpolation
Summary:
Added the intrinsics llvm.amdgcn.interp.p1.f16() and
llvm.amdgcn.interp.p2.f16() and related LIT test.
The p1 intrinsic generates code appropriate for both 16 and 32
bank LDS.
Reviewers: #amdgpu, dstuttard, arsenm, tpr
Reviewed By: #amdgpu, arsenm
Subscribers: jvesely, mgorny, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D46754
llvm-svn: 352357
2019-01-28 21:48:59 +08:00
def : SourceOfDivergence<int_amdgcn_interp_p1_f16>;
def : SourceOfDivergence<int_amdgcn_interp_p2_f16>;
def : SourceOfDivergence<int_amdgcn_mbcnt_hi>;
def : SourceOfDivergence<int_amdgcn_mbcnt_lo>;
def : SourceOfDivergence<int_r600_read_tidig_x>;
def : SourceOfDivergence<int_r600_read_tidig_y>;
def : SourceOfDivergence<int_r600_read_tidig_z>;
def : SourceOfDivergence<int_amdgcn_atomic_inc>;
def : SourceOfDivergence<int_amdgcn_atomic_dec>;
def : SourceOfDivergence<int_amdgcn_ds_fadd>;
def : SourceOfDivergence<int_amdgcn_ds_fmin>;
def : SourceOfDivergence<int_amdgcn_ds_fmax>;
def : SourceOfDivergence<int_amdgcn_buffer_atomic_swap>;
def : SourceOfDivergence<int_amdgcn_buffer_atomic_add>;
def : SourceOfDivergence<int_amdgcn_buffer_atomic_sub>;
def : SourceOfDivergence<int_amdgcn_buffer_atomic_smin>;
def : SourceOfDivergence<int_amdgcn_buffer_atomic_umin>;
def : SourceOfDivergence<int_amdgcn_buffer_atomic_smax>;
def : SourceOfDivergence<int_amdgcn_buffer_atomic_umax>;
def : SourceOfDivergence<int_amdgcn_buffer_atomic_and>;
def : SourceOfDivergence<int_amdgcn_buffer_atomic_or>;
def : SourceOfDivergence<int_amdgcn_buffer_atomic_xor>;
def : SourceOfDivergence<int_amdgcn_buffer_atomic_cmpswap>;
def : SourceOfDivergence<int_amdgcn_raw_buffer_atomic_swap>;
def : SourceOfDivergence<int_amdgcn_raw_buffer_atomic_add>;
def : SourceOfDivergence<int_amdgcn_raw_buffer_atomic_sub>;
def : SourceOfDivergence<int_amdgcn_raw_buffer_atomic_smin>;
def : SourceOfDivergence<int_amdgcn_raw_buffer_atomic_umin>;
def : SourceOfDivergence<int_amdgcn_raw_buffer_atomic_smax>;
def : SourceOfDivergence<int_amdgcn_raw_buffer_atomic_umax>;
def : SourceOfDivergence<int_amdgcn_raw_buffer_atomic_and>;
def : SourceOfDivergence<int_amdgcn_raw_buffer_atomic_or>;
def : SourceOfDivergence<int_amdgcn_raw_buffer_atomic_xor>;
AMDGPU: add missing llvm.amdgcn.{raw,struct}.buffer.atomic.{inc,dec}
Summary:
Wrapping increment/decrement. These aren't exposed by many APIs...
Change-Id: I1df25c7889de5a5ba76468ad8e8a2597efa9af6c
Reviewers: arsenm, tpr, dstuttard
Subscribers: kzhuravl, jvesely, wdng, yaxunl, t-tye, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65283
llvm-svn: 367821
2019-08-05 17:36:06 +08:00
def : SourceOfDivergence<int_amdgcn_raw_buffer_atomic_inc>;
def : SourceOfDivergence<int_amdgcn_raw_buffer_atomic_dec>;
def : SourceOfDivergence<int_amdgcn_raw_buffer_atomic_cmpswap>;
def : SourceOfDivergence<int_amdgcn_struct_buffer_atomic_swap>;
def : SourceOfDivergence<int_amdgcn_struct_buffer_atomic_add>;
def : SourceOfDivergence<int_amdgcn_struct_buffer_atomic_sub>;
def : SourceOfDivergence<int_amdgcn_struct_buffer_atomic_smin>;
def : SourceOfDivergence<int_amdgcn_struct_buffer_atomic_umin>;
def : SourceOfDivergence<int_amdgcn_struct_buffer_atomic_smax>;
def : SourceOfDivergence<int_amdgcn_struct_buffer_atomic_umax>;
def : SourceOfDivergence<int_amdgcn_struct_buffer_atomic_and>;
def : SourceOfDivergence<int_amdgcn_struct_buffer_atomic_or>;
def : SourceOfDivergence<int_amdgcn_struct_buffer_atomic_xor>;
def : SourceOfDivergence<int_amdgcn_struct_buffer_atomic_inc>;
def : SourceOfDivergence<int_amdgcn_struct_buffer_atomic_dec>;
def : SourceOfDivergence<int_amdgcn_struct_buffer_atomic_cmpswap>;
def : SourceOfDivergence<int_amdgcn_ps_live>;
def : SourceOfDivergence<int_amdgcn_ds_swizzle>;
def : SourceOfDivergence<int_amdgcn_ds_ordered_add>;
def : SourceOfDivergence<int_amdgcn_ds_ordered_swap>;
def : SourceOfDivergence<int_amdgcn_permlane16>;
def : SourceOfDivergence<int_amdgcn_permlanex16>;
def : SourceOfDivergence<int_amdgcn_mov_dpp>;
def : SourceOfDivergence<int_amdgcn_mov_dpp8>;
def : SourceOfDivergence<int_amdgcn_update_dpp>;
AMDGPU: llvm.amdgcn.writelane is a source of divergence
Summary:
Consider:
%r = call i32 @llvm.amdgcn.writelane(i32 0, i32 1, i32 2)
This produces a value that is 0 on lane 1, and 2 everywhere else; i.e.,
it is divergent.
Reported-by: Marek Olsak <Marek.Olsak@amd.com>
Reviewers: arsenm, foad, mareko
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74400
2020-02-11 21:40:00 +08:00
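The divergence argument in the commit message can be checked with a small lane-by-lane model. This is a sketch in plain C++, not the hardware or LLVM implementation: `simulateWritelane`, `isUniform`, and the 64-lane `WaveSize` are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr unsigned WaveSize = 64; // assumed wave size for this sketch

// Models llvm.amdgcn.writelane(Src, Lane, Old) across a whole wave:
// the selected lane receives Src, every other lane keeps Old.
std::array<int32_t, WaveSize> simulateWritelane(int32_t Src, unsigned Lane,
                                                int32_t Old) {
  std::array<int32_t, WaveSize> Result;
  Result.fill(Old);
  Result[Lane % WaveSize] = Src;
  return Result;
}

// A value is uniform when every lane holds the same value.
bool isUniform(const std::array<int32_t, WaveSize> &V) {
  for (int32_t X : V)
    if (X != V[0])
      return false;
  return true;
}
```

Calling `simulateWritelane(0, 1, 2)` mirrors the IR call quoted above: all three operands are uniform constants, yet lane 1 ends up holding 0 while the other lanes hold 2, so the result is not uniform.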
def : SourceOfDivergence<int_amdgcn_writelane>;

def : SourceOfDivergence<int_amdgcn_mfma_f32_4x4x1f32>;
def : SourceOfDivergence<int_amdgcn_mfma_f32_4x4x4f16>;
def : SourceOfDivergence<int_amdgcn_mfma_i32_4x4x4i8>;
def : SourceOfDivergence<int_amdgcn_mfma_f32_4x4x2bf16>;
def : SourceOfDivergence<int_amdgcn_mfma_f32_16x16x1f32>;
def : SourceOfDivergence<int_amdgcn_mfma_f32_16x16x4f32>;
def : SourceOfDivergence<int_amdgcn_mfma_f32_16x16x4f16>;
def : SourceOfDivergence<int_amdgcn_mfma_f32_16x16x16f16>;
def : SourceOfDivergence<int_amdgcn_mfma_i32_16x16x4i8>;
def : SourceOfDivergence<int_amdgcn_mfma_i32_16x16x16i8>;
def : SourceOfDivergence<int_amdgcn_mfma_f32_16x16x2bf16>;
def : SourceOfDivergence<int_amdgcn_mfma_f32_16x16x8bf16>;
def : SourceOfDivergence<int_amdgcn_mfma_f32_32x32x1f32>;
def : SourceOfDivergence<int_amdgcn_mfma_f32_32x32x2f32>;
def : SourceOfDivergence<int_amdgcn_mfma_f32_32x32x4f16>;
def : SourceOfDivergence<int_amdgcn_mfma_f32_32x32x8f16>;
def : SourceOfDivergence<int_amdgcn_mfma_i32_32x32x4i8>;
def : SourceOfDivergence<int_amdgcn_mfma_i32_32x32x8i8>;
def : SourceOfDivergence<int_amdgcn_mfma_f32_32x32x2bf16>;
def : SourceOfDivergence<int_amdgcn_mfma_f32_32x32x4bf16>;

// The dummy boolean output is divergent from the IR's perspective,
// but the mask results are uniform. These produce a divergent and
// uniform result, so the returned struct is collectively divergent.
// isAlwaysUniform can override the extract of the uniform component.
def : SourceOfDivergence<int_amdgcn_if>;
def : SourceOfDivergence<int_amdgcn_else>;
def : SourceOfDivergence<int_amdgcn_loop>;

foreach intr = AMDGPUImageDimAtomicIntrinsics in
  def : SourceOfDivergence<intr>;