llvm-project

Commit Graph

Author	SHA1	Message	Date
Scott Linder	60b1967c39	[AMDGPU] Add Scratch Wave Offset to Scratch Buffer Descriptor in entry functions Add the scratch wave offset to the scratch buffer descriptor (SRSrc) in the entry function prologue. This allows us to removes the scratch wave offset register from the calling convention ABI. As part of this change, allow the use of an inline constant zero for the SOffset of MUBUF instructions accessing the stack in entry functions when a frame pointer is not requested/required. Entry functions with calls still need to set up the calling convention ABI stack pointer register, and reference it in order to address arguments of called functions. The ABI stack pointer register remains unswizzled, but is now wave-relative instead of queue-relative. Non-entry functions also use an inline constant zero SOffset for wave-relative scratch access, but continue to use the stack and frame pointers as before. When the stack or frame pointer is converted to a swizzled offset it is now scaled directly, as the scratch wave offset no longer needs to be subtracted first. Update llvm/docs/AMDGPUUsage.rst to reflect these changes to the calling convention. Tags: #llvm Differential Revision: https://reviews.llvm.org/D75138	2020-03-19 15:35:16 -04:00
Matt Arsenault	4b47213951	AMDGPU: Switch backend default max workgroup size to 1024 Previously this would default to 256, not the maximum supported size of 1024. Using a maximum lower than the hardware maximum requires language runtimes to enforce this limit for correctness, which no language has correctly done. Switch the default to the conservatively correct maximum, and force frontends to opt-in to the more optimal 256 default maximum. I don't really understand why the changes in occupancy-levels.ll increased the computed occupancy, which I expected to decrease. I'm not sure if these tests should be forcing the old maximum.	2019-11-13 07:11:02 +05:30
Stanislav Mekhanoshin	2594fa8593	[AMDGPU] Fix high occupancy calculation and print it We had couple places which still return 10 as a maximum occupancy. Fixed. Also print comment about occupancy as compiler see it. Differential Revision: https://reviews.llvm.org/D65423 llvm-svn: 367381	2019-07-31 01:07:10 +00:00
Stanislav Mekhanoshin	7b5a54e369	[AMDGPU] Fixed occupancy calculation for gfx10 Differential Revision: https://reviews.llvm.org/D65010 llvm-svn: 366616	2019-07-19 21:29:51 +00:00
Stanislav Mekhanoshin	0846c125f9	[AMDGPU] gfx1010 core wave32 changes Differential Revision: https://reviews.llvm.org/D63204 llvm-svn: 363934	2019-06-20 15:08:34 +00:00
Stanislav Mekhanoshin	5d00c3060e	[AMDGPU] gfx1010 wave32 metadata Differential Revision: https://reviews.llvm.org/D63207 llvm-svn: 363577	2019-06-17 16:48:56 +00:00
Fangrui Song	e29e30b139	[llvm-readobj] Change -long-option to --long-option in tests. NFC We use both -long-option and --long-option in tests. Switch to --long-option for consistency. In the "llvm-readelf" mode, -long-option is discouraged as it conflicts with grouped short options and it is not accepted by GNU readelf. While updating the tests, change llvm-readobj -s to llvm-readobj -S to reduce confusion ("s" is --section-headers in llvm-readobj but --symbols in llvm-readelf). llvm-svn: 359649	2019-05-01 05:27:20 +00:00
Tim Renouf	ed0b9af997	[AMDGPU] Switched HSA metadata to use MsgPackDocument Summary: MsgPackDocument is the lighter-weight replacement for MsgPackTypes. This commit switches AMDGPU HSA metadata processing to use MsgPackDocument instead of MsgPackTypes. Differential Revision: https://reviews.llvm.org/D57024 Change-Id: I0751668013abe8c87db01db1170831a76079b3a6 llvm-svn: 356081	2019-03-13 18:55:50 +00:00
Scott Linder	f5b36e56fb	[AMDGPU] Emit MessagePack HSA Metadata for v3 code object Continue to present HSA metadata as YAML in ASM and when output by tools (e.g. llvm-readobj), but encode it in Messagepack in the code object. Differential Revision: https://reviews.llvm.org/D48179 llvm-svn: 348963	2018-12-12 19:39:27 +00:00

9 Commits