It was discovered that an extra register COPY remained when a (variable
length) memory operation was expanded with a loop and the involved address
register(s) had another use afterwards.
A simple fix for this is to COPY the address registers before the loop and
use that new vreg instead.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D112065
All instructions must have a correct size value close to emission when
SystemZLongBranch runs, or a necessary branch relaxation may be missed.
This patch also adds an assert for instruction sizes in SystemZLongBranch.
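
A minimal sketch of why the size values matter, not the SystemZLongBranch code itself. It assumes the short branch form holds a signed 16-bit halfword offset; the names `branchDistance` and `fitsShortBranch` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Signed byte distance from the start of instruction From to the start of
// instruction To, computed only from the recorded per-instruction sizes.
static int64_t branchDistance(const std::vector<unsigned> &Sizes,
                              std::size_t From, std::size_t To) {
  assert(From <= Sizes.size() && To <= Sizes.size());
  int64_t Dist = 0;
  if (To >= From) {
    for (std::size_t I = From; I < To; ++I)
      Dist += Sizes[I];
  } else {
    for (std::size_t I = To; I < From; ++I)
      Dist -= Sizes[I];
  }
  return Dist;
}

// Assumption: the short branch form stores a signed 16-bit halfword count.
static bool fitsShortBranch(int64_t ByteDist) {
  if (ByteDist % 2 != 0)
    return false;
  int64_t Halfwords = ByteDist / 2;
  return Halfwords >= INT16_MIN && Halfwords <= INT16_MAX;
}
```

If one entry in `Sizes` under-reports the bytes that are finally emitted, `fitsShortBranch` can answer true for a branch that is actually out of range, which is the missed relaxation described above.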
Review: Ulrich Weigand
This pseudo is expanded very late (AsmPrinter) and therefore has to have a
correct size value, or the branch relaxation pass may make a wrong decision.
Review: Ulrich Weigand
- This patch provides the initial implementation for lowering a call on z/OS according to the XPLINK64 calling convention
- A series of changes have been made to SystemZCallingConv.td to account for these additional XPLINK64 changes including adding a new helper function to shadow the stack along with allocation of a register wherever appropriate
- For the cases of copying an f64 to a gr64 and an f128 / 128-bit vector type to a gr64, a `CCBitConvertToType` has been added, and the value is bitcast appropriately in the lowering phase
- Support for the ADA register (R5) will be provided in a later patch.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D111662
Inspired by D111968, provide an isNegatedPowerOf2() wrapper instead of obfuscating code with (-Value).isPowerOf2() patterns, which I'm sure are likely avenues for typos.
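
For illustration only, the same idea on plain 64-bit integers rather than on APInt. The free functions below are hypothetical stand-ins, not the LLVM API.

```cpp
#include <cstdint>

// True if V has exactly one bit set.
static bool isPowerOf2(uint64_t V) { return V != 0 && (V & (V - 1)) == 0; }

// True if -V is a power of two. The negation is done in unsigned arithmetic
// so that the most negative value does not overflow.
static bool isNegatedPowerOf2(int64_t V) {
  return V < 0 && isPowerOf2(0 - static_cast<uint64_t>(V));
}
```

A call such as `isNegatedPowerOf2(-8)` states the intent directly, where the open-coded form hides it behind a negation that is easy to drop by mistake.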
Differential Revision: https://reviews.llvm.org/D111998
This patch fixes the bug that consisted of treating variable / immediate
length mem operations (such as memcpy, memset, ...) differently. The variable
length case needs to have the length minus 1 passed due to the use of EXRL
target instructions. However, the DAGCombiner can convert a register length
argument into a constant one, and whenever that happened the operation would
cover one byte too few.
This is also a refactoring that reduces the number of opcodes and variants
involved. For any opcode (variable or constant length), only the length minus
one is passed on to the ISD node. The rest of the logic is now instead
handled during isel pseudo expansion.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D111729
This PR implements saving of the XPLINK callee-saved registers
on z/OS.
Reviewed By: uweigand, Kai
Differential Revision: https://reviews.llvm.org/D111653
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
These commits seem to cause test failures in compiler-rt.
Revert "[SystemZ] Implement memcmp of variable length with CLC."
This reverts commit 7a4e9a0c73.
Revert "[SystemZ] Implement memcpy of variable length with MVC."
This reverts commit c6c13c58ee.
Following the same pattern of memset/memcpy, this patch implements a variable
length memcmp with a CLC loop followed by an EXRL instruction.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D107380
Instead of making a memcpy libcall, emit an MVC loop and an EXRL instruction
the same way as is already done for memset 0.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D106874
Note that SystemZMnemonicSpellCheck is defined in
SystemZGenAsmMatcher.inc, which SystemZAsmParser.cpp includes.
Identified with readability-redundant-declaration.
- This patch adds in the GOFFMCAsmInfo interfaces for the z/OS target.
- This patch decouples the previously existing SystemZMCAsmInfo interface for the ELF target and the z/OS target.
- This patch also removes a small test in SystemZAsmLexerTest.cpp. The test is set up for s390x-ibm-linux (the SystemZ ELF triple) but checks a function that is overridden only for the z/OS target. The test cannot simply be switched to a z/OS triple because missing support still prevents it from running successfully (an assert in AsmParser.cpp due to missing GOFFAsmParser support)
Reviewed By: uweigand, abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D110077
This patch replaces hard-coded uses of SystemZ::R15D with calls to the getStackPointerRegister function. Uses in the LowerCall function are left unchanged to avoid merge conflicts with an expected upcoming patch.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D109702
- This patch adds in the GOFF mangling support to the LLVM data layout string. A corresponding additional line has been added into the data layout section in the language reference documentation.
- Furthermore, this patch also sets the right data layout string for the z/OS target in the SystemZ backend.
Reviewed By: uweigand, Kai, abhina.sreeskantharajan, MaskRay
Differential Revision: https://reviews.llvm.org/D109362
The type legalizer has by default no method of doing this bitcast other than
storing and reloading the value from stack.
This patch implements a custom lowering of this operation using extractions
of subregs (z13 and earlier using FP128 register pairs), or of vector
elements (with 'vector enhancements 1' using VR128 FP registers).
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D110346
This simplifies the API and addresses a FIXME in
TwoAddressInstructionPass::convertInstTo3Addr.
Differential Revision: https://reviews.llvm.org/D110229
Based off a discussion on D110100, we should be avoiding default CostKinds whenever possible.
This initial patch removes them from the 'inner' target implementation callbacks - these should only be used by the main TTI calls, so this should guarantee that we don't cause changes in CostKind by missing it in an inner call. This exposed a few missing arguments in getGEPCost and reduction cost calls that I've cleaned up.
Differential Revision: https://reviews.llvm.org/D110242
SystemZ adds the EXRL target instructions at the end of each file. This must
be done before debug info emission since that may end the text section, and
therefore this is now done in emitConstantPools() (instead of in
emitEndOfAsmFile).
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D109513
The .machine directive can be used in assembly files to specify the ISA for
the instructions following it.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D109660
This patch adds class SystemZFrameLowering which is a SystemZ-specific class
detailing special registers used by calling conventions on the target.
SystemZELFFrameLowering and SystemZXPLINKFrameLowering implement this class
for ELF and XPLINK64 respectively. Previous functionality in SystemZFrameLowering
is moved to SystemZELFFrameLowering. SystemZXPLINKFrameLowering can then be
implemented in future patches.
Reviewed By: uweigand, Kai
Differential Revision: https://reviews.llvm.org/D108777
On some architectures such as Arm and X86 the encoding for a nop may
change depending on the subtarget in operation at the time of
encoding. This change replaces the per module MCSubtargetInfo retained
by the target's AsmBackend in favour of passing through the local
MCSubtargetInfo in operation at the time.
On Arm using the architectural NOP instruction can have a performance
benefit on some implementations.
For Arm I've deleted the copy of the AsmBackend's MCSubtargetInfo to
limit the chances of this causing problems in the future. I've not
done this for other targets such as X86 as there is more frequent use
of the MCSubtargetInfo and it looks to be for stable properties that
we would not expect to vary per function.
This change required threading STI through MCNopsFragment and
MCBoundaryAlignFragment.
I've attempted to take into account the in tree experimental backends.
Differential Revision: https://reviews.llvm.org/D45962
The backend generally uses 64-bit immediates (e.g. what
MachineOperand::getImm() returns), so use that for analyzeCompare()
and optimizeCompareInst() as well. This avoids truncation for
targets that support immediates larger than 32 bits. In particular, we
can avoid the bugprone value normalization hack in the AArch64
target.
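
A small standalone illustration of the truncation hazard, independent of the actual analyzeCompare() signature. The helper names are hypothetical.

```cpp
#include <cstdint>

// True if Imm can be held in a 32-bit signed value without loss.
static bool fitsInt32(int64_t Imm) {
  return Imm >= INT32_MIN && Imm <= INT32_MAX;
}

// What remains of Imm when only the low 32 bits are kept.
static uint32_t low32(int64_t Imm) { return static_cast<uint32_t>(Imm); }
```

For a compare against `(int64_t{1} << 32) + 5`, `fitsInt32` is false and `low32` yields 5, so an interface limited to 32 bits would silently describe a different comparison.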
This is a followup to D108076.
Differential Revision: https://reviews.llvm.org/D108875
This patch changes the SpecialRegisters field from a raw pointer to a unique_ptr. This is better practice, and allows us to remove the definition of the dtor for the SystemZSubtarget class.
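
The ownership pattern can be sketched as follows. All class names here are hypothetical models, and the register numbers are placeholders for illustration rather than values taken from the backend.

```cpp
#include <memory>

struct SpecialRegistersModel {
  virtual ~SpecialRegistersModel() = default;
  virtual unsigned stackPointerRegister() const = 0;
};

struct ELFRegistersModel : SpecialRegistersModel {
  unsigned stackPointerRegister() const override { return 15; } // placeholder
};

struct XPLINK64RegistersModel : SpecialRegistersModel {
  unsigned stackPointerRegister() const override { return 4; } // placeholder
};

class SubtargetModel {
  // Owning pointer: the object is released automatically, so the class
  // needs no user-written destructor.
  std::unique_ptr<SpecialRegistersModel> SpecialRegisters;

  static std::unique_ptr<SpecialRegistersModel> create(bool IsZOS) {
    if (IsZOS)
      return std::make_unique<XPLINK64RegistersModel>();
    return std::make_unique<ELFRegistersModel>();
  }

public:
  explicit SubtargetModel(bool IsZOS) : SpecialRegisters(create(IsZOS)) {}

  unsigned stackPointerRegister() const {
    return SpecialRegisters->stackPointerRegister();
  }
};
```

With a raw pointer the class would need a destructor calling `delete`; with `std::unique_ptr` the compiler-generated destructor is sufficient.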
Reviewed By: uweigand, Kai
Differential Revision: https://reviews.llvm.org/D108639
I'm not sure this is the best way to approach this,
but the situation is hard to detect unless we explicitly call it out when refusing to advise unrolling.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D107271
- This patch consists of the bare basic code needed in order to generate some assembly for the z/OS target.
- Only the .text and the .bss sections are added for now.
- The relevant MCSectionGOFF/Symbol interfaces have been added. This enables us to print out the GOFF machine code sections.
- This patch enables us to add simple lit tests wherever possible, and contribute to the testing coverage for the z/OS target
- Further improvements and additions will be made in future patches.
Reviewed By: tmatheson
Differential Revision: https://reviews.llvm.org/D106380
This patch adds support for the next-generation arch14
CPU architecture to the SystemZ backend.
This includes:
- Basic support for the new processor and its features.
- Detection of arch14 as host processor.
- Assembler/disassembler support for new instructions.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining __VEC__ == 10304.
Note: No currently available Z system supports the arch14
architecture. Once new systems become available, the
official system name will be added as a supported -march name.
Don't use a local MachineOperand copy in SystemZAsmPrinter::PrintAsmOperand()
and change the register as it may break the MRI tracking of register
uses. Use an MCOperand instead.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D105757
The odd register of a (128 bit) register pair is accessed with the 'N' code
in an inline assembly operand.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D105502
Benchmarking has shown that it is worthwhile to implement a variable length
memset of 0 with XC (exclusive or) like gcc does, instead of using a libcall.
This requires the use of the EXecute Relative Long (EXRL) instruction which
can now be done in a framework that can also be used with other target
instructions (not just XC).
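
A simplified model of the mechanism, assuming that the execute step ORs the low byte of a register into the length field of the target instruction and that the field holds (length - 1). The function names are hypothetical and this is not the code the backend emits.

```cpp
#include <cstdint>

// Length field actually used when the target instruction is executed:
// the encoded field ORed with the low byte of the register.
static uint8_t executedLengthField(uint8_t EncodedField, uint64_t Reg) {
  return static_cast<uint8_t>(EncodedField | (Reg & 0xFF));
}

// Exclusive-or of a buffer with itself over LenField + 1 bytes, which
// leaves those bytes zero.
static void xcSelfModel(uint8_t *Buf, uint8_t LenField) {
  for (unsigned I = 0; I <= LenField; ++I)
    Buf[I] ^= Buf[I];
}

// Clears LenMinus1 + 1 bytes (at most 256) with one executed instruction
// whose encoded length field is zero.
static void clearUpTo256(uint8_t *Buf, uint64_t LenMinus1) {
  xcSelfModel(Buf, executedLengthField(0, LenMinus1));
}
```

Because the target instruction is encoded with a zero length field, the register alone determines how many bytes are cleared at run time.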
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D103865
Add support for the .reloc directive along the lines of
other back-ends.
This fixes a regression after https://reviews.llvm.org/D104080
was merged, since that patch presupposed support for .reloc.
- Currently, the emitting of labels in the parsePrimaryExpr function is case sensitive. It just takes the identifier and emits it.
- However, for HLASM the emitting of labels is case independent. We emit them in upper case only, to enforce case independence. So we need to ensure that we emit upper case at the time of parsing the label (in `parseAsHLASMLabel`), and also when we are processing a PC-relative relocatable expression (in `parsePrimaryExpr`)
- To achieve this, a new MCAsmInfo attribute has been introduced, which corresponding targets can override if needed.
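
The behaviour described in the list above can be sketched with a plain function. The `UpperCaseLabels` flag is a hypothetical stand-in for the new MCAsmInfo attribute, whose real name is not given here.

```cpp
#include <cctype>
#include <string>

// Returns the name under which a label is emitted. When UpperCaseLabels is
// set, every spelling of the same identifier maps to one upper-case symbol.
static std::string emitLabelName(std::string Name, bool UpperCaseLabels) {
  if (UpperCaseLabels)
    for (char &C : Name)
      C = static_cast<char>(std::toupper(static_cast<unsigned char>(C)));
  return Name;
}
```

Applying the same function both where the label is defined and where it is referenced in an expression keeps the two spellings in agreement.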
Reviewed By: abhina.sreeskantharajan, uweigand
Differential Revision: https://reviews.llvm.org/D104715
Since this method can apply to cmpxchg operations, make sure it's clear
what value we're actually retrieving. This will help ensure we don't
accidentally ignore the failure ordering of cmpxchg in the future.
We could potentially introduce a getOrdering() method on AtomicSDNode
that asserts the operation isn't cmpxchg, but not sure that's
worthwhile.
Differential Revision: https://reviews.llvm.org/D103338
The implementation of subword atomics does not actually
guarantee the result is zero-extended, which caused
build bot failures after https://reviews.llvm.org/D101342
landed.
Support virtual, physical and tied i128 register operands in inline assembly.
On SystemZ, i128 is not really supported: it is not a legal type, and generally
such a value will be split into two i64 parts. There are however some
instructions that require a pair of two GPR64 registers contained in the
untyped GR128 register class.
For inline assembly operands, it proved to be very cumbersome to first follow
the general behavior of splitting an i128 operand into two parts and then
later rebuild the INLINEASM MI to have one GR128 register. Instead, some
minor common code changes were made to SelectionDAGBuilder to only create one
GR128 register part to begin with. In particular:
- getNumRegisters() now has an optional parameter "RegisterVT" which is
passed by AddInlineAsmOperands() and GetRegistersForValue().
- The bitcasting in GetRegistersForValue is not performed if RegVT is
Untyped.
- The RC for a tied use in AddInlineAsmOperands() is now computed either from
the tied def (virtual register), or by getMinimalPhysRegClass() (physical
register).
- InstrEmitter.cpp:EmitCopyFromReg() has been fixed so that the register
class (DstRC) can also be computed for an illegal type.
In the SystemZ backend getNumRegisters(), splitValueIntoRegisterParts() and
joinRegisterPartsIntoValue() have been implemented to handle i128 operands.
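
As a conceptual illustration of splitting and rejoining a 128-bit value, the sketch below works on a 16-byte big-endian array. It assumes that the even register of the pair holds the high-order half and the odd register the low-order half; the type and function names are hypothetical and unrelated to the real hook signatures.

```cpp
#include <array>
#include <cstdint>

// Model of a pair of two 64-bit registers.
struct RegPairModel {
  uint64_t Even; // assumed to hold the high-order 64 bits
  uint64_t Odd;  // assumed to hold the low-order 64 bits
};

// Splits a 128-bit value, given as 16 big-endian bytes, into the pair.
static RegPairModel splitI128(const std::array<uint8_t, 16> &Bytes) {
  RegPairModel P{0, 0};
  for (int I = 0; I < 8; ++I) {
    P.Even = (P.Even << 8) | Bytes[I];
    P.Odd = (P.Odd << 8) | Bytes[I + 8];
  }
  return P;
}

// Rebuilds the 16 big-endian bytes from the pair.
static std::array<uint8_t, 16> joinI128(RegPairModel P) {
  std::array<uint8_t, 16> Bytes{};
  for (int I = 7; I >= 0; --I) {
    Bytes[I] = static_cast<uint8_t>(P.Even & 0xFF);
    Bytes[I + 8] = static_cast<uint8_t>(P.Odd & 0xFF);
    P.Even >>= 8;
    P.Odd >>= 8;
  }
  return Bytes;
}
```

Splitting and then joining returns the original bytes, which is the round trip an inline assembly operand has to survive when it is carried in a single register pair.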
Differential Revision: https://reviews.llvm.org/D100788
Review: Ulrich Weigand