Commit Graph

5 Commits

Author SHA1 Message Date
Jay Foad 4bdab2e86a [AMDGPU] Fix offset for REL32_HI relocs
The addend in a REL32 reloc needs to be adjusted to account for the
offset from the PC value returned by the s_getpc instruction to the
point where the reloc is applied. This was being done correctly for
(GOTPC)REL32_LO but not for (GOTPC)REL32_HI. This will only make a
difference if the target symbol happens to get loaded almost exactly
a multiple of 4G away from the relocated instructions.

Differential Revision: https://reviews.llvm.org/D86938
2020-09-02 10:55:55 +01:00
Carl Ritson 5bf2a9dd40 [AMDGPU] Update VMEM scalar write hazard mitigation sequence
Using s_waitcnt_depctr 0xffe3 is potentially faster than v_nop.

Reviewed By: rampitec, foad

Differential Revision: https://reviews.llvm.org/D83872
2020-07-16 11:37:45 +09:00
Jay Foad 590964c835 [AMDGPU] More accurate gfx10 latencies
Differential Revision: https://reviews.llvm.org/D81012
2020-06-04 10:29:32 +01:00
Scott Linder 0e9368cc8c [AMDGPU] Move frame pointer from s34 to s33
Remove the gap left between the stack pointer (s32) and frame pointer
(s34) now that the scratch wave offset is no longer a part of the
calling convention ABI.

Update llvm/docs/AMDGPUUsage.rst to reflect the change.

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75657
2020-03-19 15:35:16 -04:00
Scott Linder 60b1967c39 [AMDGPU] Add Scratch Wave Offset to Scratch Buffer Descriptor in entry functions
Add the scratch wave offset to the scratch buffer descriptor (SRSrc) in
the entry function prologue. This allows us to removes the scratch wave
offset register from the calling convention ABI.

As part of this change, allow the use of an inline constant zero for the
SOffset of MUBUF instructions accessing the stack in entry functions
when a frame pointer is not requested/required. Entry functions with
calls still need to set up the calling convention ABI stack pointer
register, and reference it in order to address arguments of called
functions. The ABI stack pointer register remains unswizzled, but is now
wave-relative instead of queue-relative.

Non-entry functions also use an inline constant zero SOffset for
wave-relative scratch access, but continue to use the stack and frame
pointers as before. When the stack or frame pointer is converted to a
swizzled offset it is now scaled directly, as the scratch wave offset no
longer needs to be subtracted first.

Update llvm/docs/AMDGPUUsage.rst to reflect these changes to the calling
convention.

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75138
2020-03-19 15:35:16 -04:00