Commit Graph

159 Commits

Author SHA1 Message Date
Matt Arsenault 4bec7d4261 Reapply "AMDGPU: Fix handling of alignment padding in DAG argument lowering"
Reverts r337079 with fix for msan error.

llvm-svn: 337535
2018-07-20 09:05:08 +00:00
Evgeniy Stepanov 1971ba097d Revert "AMDGPU: Fix handling of alignment padding in DAG argument lowering"
This reverts commit r337021.

WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x1415cd65 in void write_signed<long>(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:95:7
    #1 0x1415c900 in llvm::write_integer(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:121:3
    #2 0x1472357f in llvm::raw_ostream::operator<<(long) /code/llvm-project/llvm/lib/Support/raw_ostream.cpp:117:3
    #3 0x13bb9d4 in llvm::raw_ostream::operator<<(int) /code/llvm-project/llvm/include/llvm/Support/raw_ostream.h:210:18
    #4 0x3c2bc18 in void printField<unsigned int, &(amd_kernel_code_s::amd_kernel_code_version_major)>(llvm::StringRef, amd_kernel_code_s const&, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:78:23
    #5 0x3c250ba in llvm::printAmdKernelCodeField(amd_kernel_code_s const&, int, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:104:5
    #6 0x3c27ca3 in llvm::dumpAmdKernelCode(amd_kernel_code_s const*, llvm::raw_ostream&, char const*) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:113:5
    #7 0x3a46e6c in llvm::AMDGPUTargetAsmStreamer::EmitAMDKernelCodeT(amd_kernel_code_s const&) /code/llvm-project/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp:161:3
    #8 0xd371e4 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:204:26

[...]

Uninitialized value was created by an allocation of 'KernelCode' in the stack frame of function '_ZN4llvm16AMDGPUAsmPrinter21EmitFunctionBodyStartEv'
    #0 0xd36650 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:192

llvm-svn: 337079
2018-07-14 01:20:53 +00:00
Matt Arsenault de95077780 AMDGPU: Fix handling of alignment padding in DAG argument lowering
This was completely broken if there was ever a struct argument, as
this information is thrown away during the argument analysis.

The offsets as passed in to LowerFormalArguments are not useful,
as they partially depend on the legalized result register type,
and they don't consider the alignment in the first place.

Ignore the Ins array, and instead figure out from the raw IR type
what we need to do. This seems to fix the padding computation
if the DAG lowering is forced (and stops breaking arguments
following padded arguments if the arguments were only partially
lowered in the IR)

llvm-svn: 337021
2018-07-13 16:40:25 +00:00
Tom Stellard 5bfbae5cb1 AMDGPU: Refactor Subtarget classes
Summary:
This is a follow-up to r335942.
- Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget
- Rename AMDGPUCommonSubtarget to AMDGPUSubtarget
- Merge R600Subtarget::Generation and GCNSubtarget::Generation into
  AMDGPUSubtarget::Generation.

Reviewers: arsenm, jvesely

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D49037

llvm-svn: 336851
2018-07-11 20:59:01 +00:00
Scott Linder 2ad2c18b82 [AMDGPU] Refactor HSAMetadataStream::emitKernel (NFC)
Move all metadata construction into AMDGPUHSAMetadataStreamer.

Differential Revision: https://reviews.llvm.org/D48176

llvm-svn: 336707
2018-07-10 17:31:32 +00:00
Matt Arsenault f5be3ad7f8 AMDGPU: Don't use struct type for argument layout
This was introducing unnecessary padding after the explicit
arguments, depending on the alignment of the total struct type.
Also has the side effect of avoiding creating an extra GEP for
the offset from the base kernel argument to the explicit kernel
argument offset.

llvm-svn: 335999
2018-06-29 17:31:42 +00:00
Matt Arsenault 75e7192ba3 AMDGPU: Remove MFI::ABIArgOffset
We have too many mechanisms for tracking the various offsets
used for kernel arguments, so remove one. There's still a lot of
confusion with these because there are two different "implicit"
argument areas located at the beginning and end of the kernarg
segment.

Additionally, the offset was determined based on the memory
size of the split element types. This would break in a future
commit where v3i32 is decomposed into separate i32 pieces.

llvm-svn: 335830
2018-06-28 10:18:55 +00:00
Konstantin Zhuravlyov 30f03b3bc0 AMDGPU/NFC: Fix typo in comment
llvm-svn: 335707
2018-06-27 05:36:03 +00:00
Konstantin Zhuravlyov e004b3d97b AMDGPU: Remove ability to reserve VGPRs for debugger
Differential Revision: https://reviews.llvm.org/D48234

llvm-svn: 335288
2018-06-21 20:28:19 +00:00
Scott Linder 1e8c2c705d [AMDGPU] Update assembler for HSA Code Object v3
Update AMDGPU assembler syntax behind the code-object-v3 feature:

* Replace/rename most AMDGPU assembler directives/symbols and document them.
* Provide more diagnostics (e.g. values out of range, missing values, repeated
  values).
* Provide path for backwards compatibility, even with underlying descriptor
  changes.

Differential Revision: https://reviews.llvm.org/D47736

llvm-svn: 335281
2018-06-21 19:38:56 +00:00
Konstantin Zhuravlyov 766c77efd7 AMDGPU/AMDHSA: Remove GridWorkGroupCountX/Y/Z
and everything that comes with it from implementation
and v3 header files.

Leave definition in v2 header files for backwards
compatibility.

Differential Revision: https://reviews.llvm.org/D48191

llvm-svn: 335267
2018-06-21 18:36:04 +00:00
Konstantin Zhuravlyov ce25bc3e82 AMDHSA/NFC: Code object v3 updates (additional):
- Move section selection and alignment to AMDGPUAsmPrinter

llvm-svn: 334521
2018-06-12 18:33:51 +00:00
Konstantin Zhuravlyov 00f2cb1116 AMDHSA: Code object v3 updates
- Do not emit following assembler directives:
  - .hsa_code_object_version
  - .hsa_code_object_isa
  - .amd_amdgpu_isa
  - .amd_amdgpu_hsa_metadata
  - .amd_amdgpu_pal_metadata
- Do not emit .note entries
- Cleanup and bring in sync kernel descriptor header file
- Emit kernel descriptor into .rodata with appropriate relocations and
  alignments

llvm-svn: 334519
2018-06-12 18:02:46 +00:00
Konstantin Zhuravlyov 2ca6b1f2ba AMDGPU: Always set COMPUTE_PGM_RSRC2.ENABLE_TRAP_HANDLER to zero for AMDHSA as
it is set by CP

Differential Revision: https://reviews.llvm.org/D47392

llvm-svn: 333451
2018-05-29 19:09:13 +00:00
Matt Arsenault ceafc55e5a AMDGPU: Pass function directly instead of MachineFunction
These functions just query the underlying IR function,
so pass it directly.

llvm-svn: 333442
2018-05-29 17:42:50 +00:00
Stanislav Mekhanoshin 1c538423dc [AMDGPU] Add perf hints to functions
This is adoption of HSAIL perfhint pass. Two types of hints are produced:

1. Function is memory bound.
2. Kernel can use wave limiter.

Currently these hints are used in the scheduler. If a function is suspected
to be memory bound we allow occupancy to decrease to 4 waves in the course
of scheduling.

Differential Revision: https://reviews.llvm.org/D46992

llvm-svn: 333289
2018-05-25 17:25:12 +00:00
Tom Stellard c501501055 AMDGPU: Split R600 AsmPrinter code into its own class
Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D47245

llvm-svn: 333219
2018-05-24 20:02:01 +00:00
Tom Stellard 44b30b4537 AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers
Summary:
MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instuction
and register defintions, which are huge so we only want to include
them where needed.

This will also make it easier if we want to split the R600 and GCN
definitions into separate tablegenerated files.

I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h
because it uses some enums from the header to initialize default values
for the SIMachineFunction class, so I ended up having to remove includes of
SIMachineFunctionInfo.h from headers too.

Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D46272

llvm-svn: 332930
2018-05-22 02:03:23 +00:00
Yaxun Liu deba150c27 [AMDGPU] Fix compilation failure when IR contains comdat
Remove a useless SwitchSection which also causes compilation failure
when IR contains comdat.

The SwitchSection is useless because the current section is already
correct text section for the function therefore no need to switch.

It causes compilation failure for comdat because functions with comdat
has specific text section, not the default .text section.

Since HIP uses comdat, this bug caused failures for HIP.

Differential Revision: https://reviews.llvm.org/D46770

llvm-svn: 332137
2018-05-11 20:40:14 +00:00
Shiva Chen 801bf7ebbe [DebugInfo] Examine all uses of isDebugValue() for debug instructions.
Because we create a new kind of debug instruction, DBG_LABEL, we need to
check all passes which use isDebugValue() to check MachineInstr is debug
instruction or not. When expelling debug instructions, we should expel
both DBG_VALUE and DBG_LABEL. So, I create a new function,
isDebugInstr(), in MachineInstr to check whether the MachineInstr is
debug instruction or not.

This patch has no new test case. I have run regression test and there is
no difference in regression test.

Differential Revision: https://reviews.llvm.org/D45342

Patch by Hsiangkai Wang.

llvm-svn: 331844
2018-05-09 02:42:00 +00:00
Tim Renouf fd8d4af3bc [AMDGPU] Ensure there are enough registers for wave dispatch
Summary:
This fixes the number of SGPRs and VGPRs in the *_RSRC1 register to
allow for registers set up in wave dispatch, even if those registers are
not used in the shader.

Re-landed after noticing that the buildbot failure from 329808 seemed to
be unrelated.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45503

Change-Id: I6575f0e0d2a528d1319d0b289f0ebe4510fa5771
llvm-svn: 329826
2018-04-11 17:18:36 +00:00
Tim Renouf 8ca33bfcf3 Revert "[AMDGPU] Ensure there are enough registers for wave dispatch"
This reverts 329808. That change caused a report of a failure in
test/CodeGen/MIR/AMDGPU/mir-canon-multi.mir that I didn't see. I suspect
it is an expensive-check-only error.

Change-Id: I8133f26f15e7d5ec2b09c687c12cd70e918461b0
llvm-svn: 329811
2018-04-11 14:27:41 +00:00
Tim Renouf f26b723491 [AMDGPU] Ensure there are enough registers for wave dispatch
Summary:
This fixes the number of SGPRs and VGPRs in the *_RSRC1 register to
allow for registers set up in wave dispatch, even if those registers are
not used in the shader.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45503

Change-Id: I6575f0e0d2a528d1319d0b289f0ebe4510fa5771
llvm-svn: 329808
2018-04-11 14:02:41 +00:00
Matt Arsenault 03ae399d50 AMDGPU: Support realigning stack
While the stack access instructions don't care about
alignment > 4, some transformations on the pointer calculation
do make assumptions based on knowing the low bits of a pointer
are 0. If a stack object ends up being accessed through its
absolute address (relative to the kernel scratch wave offset),
the addressing expression may depend on the stack frame being
properly aligned. This was breaking in a testcase due to the
add->or combine.

I think some of the SP/FP handling logic is still backwards,
and overly simplistic to support all of the stack features.
Code which tries to modify the SP with inline asm for example
or variable sized objects will probably require redoing this.

llvm-svn: 328831
2018-03-29 21:30:06 +00:00
David Blaikie 6054e650ff Move TargetLoweringObjectFile from CodeGen to Target to fix layering
It's implemented in Target & include from other Target headers, so the
header should be in Target.

llvm-svn: 328392
2018-03-23 23:58:19 +00:00
Tim Renouf 807ecc3d66 [AMDGPU] do not generate .AMDGPU.config for amdpal os type
Summary:
Now we generate PAL metadata for the amdpal os type, there is no need to
generate the .AMDGPU.config section.

Reviewers: arsenm, nhaehnle, dstuttard

Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D37760

Change-Id: I303c5fad66656ce97293da60621afac6595b4c18
llvm-svn: 324346
2018-02-06 13:39:38 +00:00
Dmitry Preobrazhensky 3afbd825a3 [AMDGPU][MC][GFX8][GFX9] Added XNACK_MASK support
See bug 35764: https://bugs.llvm.org/show_bug.cgi?id=35764

Differential Revision: https://reviews.llvm.org/D41614

Reviewers: vpykhtin, artem.tamazov, arsenm
llvm-svn: 322189
2018-01-10 14:22:19 +00:00
Dmitry Preobrazhensky 2713495318 [AMDGPU][MC] Added support of 256- and 512-bit tuples of ttmp registers
See bug 35561: https://bugs.llvm.org/show_bug.cgi?id=35561

This patch also affects implementation of SGPR and VGPR registers though changes are cosmetic.

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D41437

llvm-svn: 321359
2017-12-22 15:18:06 +00:00
Matthias Braun f1caa2833f MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

llvm-svn: 320884
2017-12-15 22:22:58 +00:00
Tim Renouf cead41d42f [AMDGPU] add labels to +DumpCode output
Summary:
+DumpCode is a hack to embed disassembly in the ELF file. This commit
fixes it to include labels, to make it slightly more useful.

Reviewers: arsenm, kzhuravl

Subscribers: nhaehnle, timcorringham, dstuttard, llvm-commits, t-tye, yaxunl, wdng, kzhuravl

Differential Revision: https://reviews.llvm.org/D40169

llvm-svn: 320146
2017-12-08 14:09:34 +00:00
Konstantin Zhuravlyov 06ae4ec78e AMDGPU: Add num spilled s/vgprs to metadata
This was requested by tools.

Differential Revision: https://reviews.llvm.org/D40321

llvm-svn: 319192
2017-11-28 17:51:08 +00:00
David Blaikie b3bde2ea50 Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490
2017-11-17 01:07:10 +00:00
Matt Arsenault 9ba465a972 AMDGPU: Error on stack size overflow
llvm-svn: 318189
2017-11-14 20:33:14 +00:00
Konstantin Zhuravlyov 435151ad75 AMDGPU: Fix set but not used warnings related to AMDGPUAS
Differential Revision: https://reviews.llvm.org/D39499

llvm-svn: 317114
2017-11-01 19:12:38 +00:00
Konstantin Zhuravlyov 8d5e9e110c AMDGPU: Rename MaxFlatWorkgroupSize to MaxFlatWorkGroupSize for consistency
Differential Revision: https://reviews.llvm.org/D38957

llvm-svn: 316097
2017-10-18 17:31:09 +00:00
Konstantin Zhuravlyov 7dabe9ced7 AMDGPU: Start generating metadata for MaxFlatWorkGroupSize
Differential Revision: https://reviews.llvm.org/D38958

llvm-svn: 316024
2017-10-17 20:03:21 +00:00
Konstantin Zhuravlyov 8c18f5b3d4 AMDGPU: Don't use TargetStreamer if it has not been initialized
Fixes cfe/trunk/test/Misc/backend-resource-limit-diagnostics.cl
test after r315808

We may hit few other similar issues, but I want to discuss good
solution offline.

llvm-svn: 315830
2017-10-14 22:16:26 +00:00
Konstantin Zhuravlyov a01d8b0b63 AMDGPU: Bring HSA metadata on par with the specification
Differential Revision: https://reviews.llvm.org/D38753

llvm-svn: 315821
2017-10-14 19:03:51 +00:00
Konstantin Zhuravlyov eda425edd4 AMDGPU: Do not emit deprecated notes for code object v3
Differential Revision: https://reviews.llvm.org/D38749

llvm-svn: 315810
2017-10-14 15:59:07 +00:00
Konstantin Zhuravlyov 9c05b2bc3b AMDGPU: Add support for isa version note
- Emit NT_AMD_AMDGPU_ISA
  - Add assembler parsing for isa version directive
    - If isa version directive does not match command line arguments, then return error

Differential Revision: https://reviews.llvm.org/D38748

llvm-svn: 315808
2017-10-14 15:40:33 +00:00
Konstantin Zhuravlyov 516651b154 AMDGPU/NFC: Minor clean ups in HSA metadata
- Use HSA metadata streamer directly from AMDGPUAsmPrinter
  - Make naming consistent with PAL metadata

Differential Revision: https://reviews.llvm.org/D38746

llvm-svn: 315526
2017-10-11 22:59:35 +00:00
Konstantin Zhuravlyov c3beb6a075 AMDGPU/NFC: Minor clean ups in PAL metadata
- Move PAL metadata definitions to AMDGPUMetadata
  - Make naming consistent with HSA metadata

Differential Revision: https://reviews.llvm.org/D38745

llvm-svn: 315523
2017-10-11 22:41:09 +00:00
Konstantin Zhuravlyov a63b0f9d20 AMDGPU/NFC: Rename code object metadata as HSA metadata
- Rename AMDGPUCodeObjectMetadata to AMDGPUMetadata (PAL metadata will be included in this file in the follow up change)
  - Rename AMDGPUCodeObjectMetadataStreamer to AMDGPUHSAMetadataStreamer
  - Introduce HSAMD namespace
  - Other minor name changes in function and test names

llvm-svn: 315522
2017-10-11 22:18:53 +00:00
Tim Renouf 72800f0436 [AMDGPU] implemented pal metadata
Summary:
For the amdpal OS type:

We write an AMDGPU_PAL_METADATA record in the .note section in the ELF
(or as an assembler directive). It contains key=value pairs of 32 bit
ints. It is a merge of metadata from codegen of the shaders, and
metadata provided by the frontend as _amdgpu_pal_metadata IR metadata.
Where both sources have a key=value with the same key, the two values
are ORed together.

This .note record is part of the amdpal ABI and will be documented in
docs/AMDGPUUsage.rst in a future commit.

Eventually the amdpal OS type will stop generating the .AMDGPU.config
section once the frontend has safely moved over to using the .note
records above instead of .AMDGPU.config.

Reviewers: arsenm, nhaehnle, dstuttard

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D37753

llvm-svn: 314829
2017-10-03 19:03:52 +00:00
Tim Renouf ef1ae8ffac [AMDGPU] calling conventions for AMDPAL OS type
Summary:
This commit adds comments on how the AMDPAL OS type overloads the
existing AMDGPU_ calling conventions used by Mesa, and adds a couple of
new ones.

Reviewers: arsenm, nhaehnle, dstuttard

Subscribers: mehdi_amini, kzhuravl, wdng, yaxunl, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D37752

llvm-svn: 314502
2017-09-29 09:51:22 +00:00
Tim Renouf 132291589f [AMDGPU] AMDPAL scratch buffer support
Summary:
Added support for scratch (including spilling) for OS type amdpal:
generates code to set up the scratch descriptor if it is needed.

With amdpal, the scratch resource descriptor is loaded from offset 0 of
the global information table. The low 32 bits of the address of the
global information table is passed in s0.

Added amdgpu-git-ptr-high function attribute to hard-wire the high 32
bits of the address of the global information table. If the function
attribute is not specified, or is 0xffffffff, then the backend generates
code to use the high 32 bits of pc.

The documentation for the AMDPAL ABI will be added in a later commit.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye

Differential Revision: https://reviews.llvm.org/D37483

llvm-svn: 314501
2017-09-29 09:49:35 +00:00
Matt Arsenault 22cdb61a78 AMDGPU: Fix not accounting for tail call resource usage
If the only call in a function is a tail call, the
function isn't considered to have a call since it's a
type of return.

llvm-svn: 312561
2017-09-05 18:36:36 +00:00
Matt Arsenault 71bcbd451f AMDGPU: Start adding tail call support
Handle the sibling call cases.

llvm-svn: 310753
2017-08-11 20:42:08 +00:00
Matt Arsenault 36cd1859f3 AMDGPU: Fix assert on n inline asm constraint
llvm-svn: 310515
2017-08-09 20:09:35 +00:00
Matt Arsenault 2738ede6b2 AMDGPU: Restore using MRI to find highest used regs
If there are no calls, this is a faster path than
searching the entire program for calls.

This was supposed to be left in r309781.
Fixes unused variable warning.

llvm-svn: 309832
2017-08-02 17:15:01 +00:00