The support is disabled by default. So far there is instruction
selection, spilling, and frame elimination. It also changes SP
from unswizzled to swizzled as used by flat scratch instructions,
so it cannot be mixed with MUBUF stack access.
At the very least missing:
- GlobalISel;
- Some optimizations in frame elimination in between vector
and scalar ALU;
- It shall finally allow to always materialize frame index
as an SGPR, but that is not implemented and frame elimination
cannot handle it yet;
- Unaligned and/or multidword flat scratch shall work, but it
is legalized now for MUBUF;
- Operand folding cannot optimize FI like with MUBUF yet;
- It will need scaling the value of the SP/FP in the DWARF
expression to recover the unswizzled scratch address;
Differential Revision: https://reviews.llvm.org/D89170
Always prefer to clobber input SGPRs and restore them after the
spill. This applies to both spills to VGPRs and scratch.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D81914
Explicitly set the exec mask for SGPR spills and reloads.
This fixes a bug where SGPR spills to memory could be incorrect
if the exec mask was 0 (or differed between spill and reload).
Additionally pack scalar subregisters (upto 16/32 per VGPR),
so that the majority of scalar types can be spilt or reloaded
with a simple memory access. This should amortize some of the
additional overhead of manipulating the exec mask.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D80282