llvm-project/llvm/lib
Marek Olsak b953cc36e2 AMDGPU: Merge S_BUFFER_LOAD_DWORD_IMM into x2, x4
Summary:
Only constant offsets (*_IMM opcodes) are merged.
It reuses code for LDS load/store merging.
It relies on the scheduler to group loads.
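
As a rough illustration of the idea above (not the code in this patch), the sketch below merges neighbouring constant-offset loads pairwise: two x1 loads become an x2, and two x2 loads become an x4. The `Load` type, its field names, the dword offset unit, and `merge_loads` are all hypothetical. The sketch only looks at loads that are already next to each other in the list, which stands in for relying on the scheduler to group them, and it ignores alignment and any other hardware constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Load:
    base: str        # hypothetical name for the buffer resource operand
    offset: int      # immediate offset, assumed here to be in dwords
    width: int = 1   # number of dwords loaded (1, 2 or 4)


def _merge_pass(loads):
    """One left-to-right pass that merges neighbouring loads of equal width."""
    out, i, changed = [], 0, False
    while i < len(loads):
        cur = loads[i]
        nxt = loads[i + 1] if i + 1 < len(loads) else None
        if (nxt is not None and cur.base == nxt.base
                and cur.width == nxt.width and cur.width in (1, 2)):
            lo, hi = sorted((cur, nxt), key=lambda l: l.offset)
            # Only merge when the two loads cover one contiguous range.
            if hi.offset == lo.offset + lo.width:
                out.append(Load(lo.base, lo.offset, lo.width * 2))
                i += 2
                changed = True
                continue
        out.append(cur)
        i += 1
    return out, changed


def merge_loads(loads):
    """Repeat the pairwise pass until nothing changes (x1 -> x2 -> x4)."""
    loads, changed = list(loads), True
    while changed:
        loads, changed = _merge_pass(loads)
    return loads


if __name__ == "__main__":
    group = [Load("rsrc", o) for o in range(4)]
    print(merge_loads(group))
```

Loads with a gap between their offsets, or with different bases, are left untouched by this sketch.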

The results are mixed, but I think they are mostly positive. Most shaders
are affected, so only the total stats are shown here:

 SGPRS: 2072198 -> 2151462 (3.83 %)
 VGPRS: 1628024 -> 1634612 (0.40 %)
 Spilled SGPRs: 7883 -> 8942 (13.43 %)
 Spilled VGPRs: 97 -> 101 (4.12 %)
 Scratch size: 1488 -> 1492 (0.27 %) dwords per thread
 Code Size: 60222620 -> 52940672 (-12.09 %) bytes
 Max Waves: 374337 -> 373066 (-0.34 %)
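
The percentages in parentheses are relative changes, (after - before) / before. A minimal sketch that recomputes them from the totals above follows; the helper name `pct_change` is made up for illustration.

```python
def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Totals copied from the stats above: (before, after).
stats = {
    "SGPRS": (2072198, 2151462),
    "VGPRS": (1628024, 1634612),
    "Spilled SGPRs": (7883, 8942),
    "Spilled VGPRs": (97, 101),
    "Scratch size": (1488, 1492),
    "Code Size": (60222620, 52940672),
    "Max Waves": (374337, 373066),
}

if __name__ == "__main__":
    for name, (before, after) in stats.items():
        print(f"{name}: {before} -> {after} ({pct_change(before, after):.2f} %)")
```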

There is a 13.4% increase in SGPR spilling, and DiRT Showdown spills a few
more VGPRs (now 37), but there is a 12% decrease in code size.

These are the new stats for SGPR spilling. We already spill a lot of SGPRs,
so it's uncertain whether more spilling will make any difference, since
SGPRs are always spilled to VGPRs:

 SGPR SPILLING APPS   Shaders SpillSGPR AvgPerSh
 alien_isolation         2938       100      0.0
 batman_arkham_origins    589         6      0.0
 bioshock-infinite       1769         4      0.0
 borderlands2            3968        22      0.0
 counter_strike_glob..   1142        60      0.1
 deus_ex_mankind_div..   1410        79      0.1
 dirt-showdown            533         4      0.0
 dirt_rally               364      1163      3.2
 divinity                1052         2      0.0
 dota2                   1747         7      0.0
 f1-2015                  776      1515      2.0
 grid_autosport          1767      1505      0.9
 hitman                  1413       273      0.2
 left_4_dead_2           1762         4      0.0
 life_is_strange         1296        26      0.0
 mad_max                  358        96      0.3
 metro_2033_redux        2670        60      0.0
 payday2                 1362        22      0.0
 portal                   474         3      0.0
 saints_row_iv           1704         8      0.0
 serious_sam_3_bfe        392      1348      3.4
 shadow_of_mordor        1418        12      0.0
 shadow_warrior          3956       239      0.1
 talos_principle          324      1735      5.4
 thea                     172        17      0.1
 tomb_raider             1449       215      0.1
 total_war_warhammer      242        56      0.2
 ue4_effects_cave         295        55      0.2
 ue4_elemental            572        12      0.0
 unigine_tropics          210        56      0.3
 unigine_valley           278       152      0.5
 victor_vran             1262        84      0.1
 yofrankie                 82         2      0.0

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D38949

llvm-svn: 317751
2017-11-09 01:52:23 +00:00
Analysis Add an @llvm.sideeffect intrinsic 2017-11-08 21:59:51 +00:00
AsmParser [IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag 2017-11-06 16:27:15 +00:00
BinaryFormat Simplify. 2017-10-19 01:32:18 +00:00
Bitcode [IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag 2017-11-06 16:27:15 +00:00
CodeGen Let replaceVTableHolder accept any type. 2017-11-08 22:04:43 +00:00
DebugInfo Convert FileOutputBuffer to Expected. NFC. 2017-11-08 01:05:44 +00:00
Demangle [ItaniumDemangle] Fix a exponential string copying bug 2017-05-28 23:24:52 +00:00
ExecutionEngine ExecutionEngine: make COFF Thumb2 assertions non-tautological 2017-10-22 20:51:25 +00:00
FuzzMutate FuzzMutate: Fix arch parsing in FuzzerCLI 2017-10-17 02:39:40 +00:00
Fuzzer [libFuzzer] Delete llvm/lib/Fuzzer 2017-10-16 20:48:19 +00:00
IR Let replaceVTableHolder accept any type. 2017-11-08 22:04:43 +00:00
IRReader Move the stripping of invalid debug info from the Verifier to AutoUpgrade. 2017-10-02 18:31:29 +00:00
LTO [LTO][ThinLTO] Use the linker resolutions to mark global values as dso_local. 2017-11-04 17:04:39 +00:00
LineEditor
Linker Linker: Create a function declaration when moving a non-prevailing alias of function type. 2017-08-10 01:07:44 +00:00
MC NFC: Rename MCSafeSEHFragment to MCSymbolIdFragment 2017-11-08 18:57:02 +00:00
Object Revert r317046, "Object: Move some code from ELF.h into ELF.cpp." 2017-11-03 21:30:06 +00:00
ObjectYAML Revert "Reapply: Allow yaml2obj to order implicit sections for ELF" 2017-11-08 01:31:20 +00:00
Option Reverting r315590; it did not include changes for llvm-tblgen, which is causing link errors for several people. 2017-10-15 14:32:27 +00:00
Passes [PassManager, SimplifyCFG] Revert r316908 and r316869. 2017-11-06 00:32:01 +00:00
ProfileData GCOV: Move GCOV from IR & Support into ProfileData to fix layering 2017-11-03 20:57:10 +00:00
Support [FileOutputBuffer] Move factory methods out of their classes. 2017-11-08 22:57:48 +00:00
TableGen [globalisel][regbank] Warn about MIR ambiguities when register bank/class names clash. 2017-11-01 22:13:05 +00:00
Target AMDGPU: Merge S_BUFFER_LOAD_DWORD_IMM into x2, x4 2017-11-09 01:52:23 +00:00
Testing Force #define GTEST_LANG_CXX11. 2017-10-27 21:12:28 +00:00
ToolDrivers [COFF] Improve the check for functions that should get an extra underscore 2017-10-23 09:08:13 +00:00
Transforms Add an @llvm.sideeffect intrinsic 2017-11-08 21:59:51 +00:00
WindowsManifest Fix bug 34608 by moving private header out of public header. 2017-09-14 23:01:13 +00:00
XRay [XRay][tools] Support arg1 logging entries in the basic logging mode 2017-10-05 05:18:17 +00:00
CMakeLists.txt Moving libFuzzer from LLVM to compiler-rt. 2017-08-21 23:25:12 +00:00
LLVMBuild.txt Re-apply "Introduce FuzzMutate library" 2017-08-21 22:57:06 +00:00