This patch changes the location of the frame-record (FP, LR) to the
bottom of the callee-saved area. According to the AAPCS, the location of
the frame-record within the stack frame is unspecified (section 5.2.3,
"The Frame Pointer"), so the compiler should be free to choose a
different location.
The reason for changing the location of the frame-record is to prepare
the frame for allocating an SVE area below the callee-saves. This way the
compiler can use the VL-scaled addressing modes to directly access SVE
objects from the frame-pointer.
         :       :           :       :
         | stack |           | stack |
         | args  |           | args  |
         +-------+           +-------+
         | x30   |           | x19   |
         | x29   |           | x20   |
   FP -> |- - - -|           | x21   |
         | x19   |    ==>    | x22   |
         | x20   |           |- - - -|
         | x21   |           | x30   |
         | x22   |           | x29   |
         +-------+           +-------+ <- FP
         |///////|           |///////| // realignment gap
         |- - - -|           |- - - -|
         |spills/|           |spills/|
         | locals|           | locals|
   SP -> +-------+           +-------+ <- SP
Things to point out:
- The algorithm that finds a paired register must be prevented from
accidentally pairing LR with a callee-saved register other than FP,
since FP and LR should always be paired together when the frame
has a frame-record.
- For Darwin platforms the location of the frame-record is unchanged,
since the unwind encoding does not allow for encoding this position
dynamically and other tools currently depend on the former layout.
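As a concrete illustration of the VL-scaled access this enables (a
minimal sketch with a hypothetical helper name, not code from the
patch): with the frame-record at the bottom of the callee-saved area,
an SVE object at scalable slot N sits at FP + N * VL, which is exactly
what an addressing mode like "ldr z0, [x29, #-2, mul vl]" computes.

  #include <cstdint>

  // Hypothetical model of a VL-scaled frame access: the object's address
  // is the frame pointer plus a slot index scaled by the vector length.
  uint8_t *sveSlotAddress(uint8_t *fp, int64_t scalableSlot,
                          uint64_t vlBytes) {
    return fp + scalableSlot * static_cast<int64_t>(vlBytes);
  }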
Reviewers: efriedma, rovka, rengolin, thegameg, greened, t.p.northover
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D65653
llvm-svn: 368987
Summary:
This patch adds initial support for the SVE calling convention such that
SVE types can be passed as arguments and return values to/from a
subroutine.
The SVE AAPCS states [1]:
z0-z7 are used to pass scalable vector arguments to a subroutine,
and to return scalable vector results from a function. If a
subroutine takes arguments in scalable vector or predicate
registers, or if it is a function that returns results in such
registers, it must ensure that the entire contents of z8-z23 are
preserved across the call. In other cases it need only preserve the
low 64 bits of z8-z15, as described in §5.1.2.
p0-p3 are used to pass scalable predicate arguments to a subroutine
and to return scalable predicate results from a function. If a
subroutine takes arguments in scalable vector or predicate
registers, or if it is a function that returns results in these
registers, it must ensure that p4-p15 are preserved across the call.
In other cases it need not preserve any scalable predicate register
contents.
SVE predicate and data registers are passed indirectly (i.e. spilled to the
stack with their address passed instead) if they exceed the registers used
for argument passing defined by the PCS referenced above. Until SVE stack
support is merged we can't spill SVE registers to the stack, so an
llvm_unreachable is currently used where we will eventually handle this.
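As a rough sketch of the classification described above (the enum,
struct, and function here are hypothetical stand-ins, not the actual
LLVM implementation):

  #include <cstdio>
  #include <cstdlib>

  enum class SVEArgLoc { ZReg, PReg, Indirect };

  struct SVEArgState {
    unsigned NextZ = 0; // z0-z7 are available for data arguments
    unsigned NextP = 0; // p0-p3 are available for predicate arguments
  };

  SVEArgLoc classifySVEArg(SVEArgState &S, bool IsPredicate) {
    if (IsPredicate && S.NextP < 4) { ++S.NextP; return SVEArgLoc::PReg; }
    if (!IsPredicate && S.NextZ < 8) { ++S.NextZ; return SVEArgLoc::ZReg; }
    // Out of registers: spill and pass the address instead. Until SVE
    // stack support lands, this path is an llvm_unreachable in the backend.
    fprintf(stderr, "indirect SVE argument passing not implemented\n");
    abort();
  }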
[1] https://static.docs.arm.com/100986/0000/100986_0000.pdf
Reviewed By: ostannard
Differential Revision: https://reviews.llvm.org/D65448
llvm-svn: 367859
Summary:
Avoids duplicating generated static helpers for calling convention
analysis.
This also means you can modify AArch64CallingConv.td without recompiling
the AArch64ISelLowering.cpp monolith, so it provides faster incremental
rebuilds.
Saves 12K in llc.exe, but adds a new object file, which is large.
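Conceptually the refactor looks like this (a hedged sketch with a
stand-in name; the real routines are the TableGen-generated
CC_AArch64_* functions):

  // Before: the generated include defined static helpers, so every .cpp
  // including it carried a private copy of each one.
  // After: a small header declares the function once and a dedicated
  // .cpp provides the single shared definition.
  bool CC_Demo(unsigned ValNo);                      // header: declaration
  bool CC_Demo(unsigned ValNo) { return ValNo < 8; } // one .cpp: definition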
Reviewers: efriedma, t.p.northover
Subscribers: mgorny, javed.absar, kristof.beyls, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D56948
llvm-svn: 352430
Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Emit pseudo instructions indicating unwind codes corresponding to each
instruction inside the prologue/epilogue. These are used by the MCLayer to
populate the .xdata section.
Differential Revision: https://reviews.llvm.org/D50288
llvm-svn: 345701
This patch adds codegen support for the saving/restoring
V8-V23 for functions specified with the aarch64_vector_pcs
calling convention attribute, as added in patch D51477.
Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar, MatzeB
Reviewed By: thegameg
Differential Revision: https://reviews.llvm.org/D51479
llvm-svn: 342049
The implementation of shadow call stack on aarch64 is quite different to
the implementation on x86_64. Instead of reserving a segment register for
the shadow call stack, we reserve the platform register, x18. Any function
that spills lr to sp also spills it to the shadow call stack, a pointer to
which is stored in x18.
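A conceptual model of the scheme (C++ stand-ins for the registers; the
actual instrumentation is roughly a post-indexed store on entry and a
pre-indexed load on return, along the lines of "str x30, [x18], #8" and
"ldr x30, [x18, #-8]!"):

  #include <cstdint>

  extern uint64_t *shadow_sp; // models the pointer kept in x18

  inline void scsPush(uint64_t lr) { *shadow_sp++ = lr; } // str x30, [x18], #8
  inline uint64_t scsPop() { return *--shadow_sp; }       // ldr x30, [x18, #-8]!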
Differential Revision: https://reviews.llvm.org/D45239
llvm-svn: 329236
This makes sure that alloca() function calls properly probe the
stack as needed.
Differential Revision: https://reviews.llvm.org/D42356
llvm-svn: 325433
The previous SwiftCC support for AAPCS64 was partially correct. It
set up swiftself parameters in the proper register but failed to set up
swifterror in the correct register. This would break compilation of
swift code for non-Darwin AAPCS64 conforming environments.
llvm-svn: 313956
Pass parameters properly in calls to such functions (pass all
floats in integer registers), and handle va_start properly (allocate
stack immediately below the arguments on the stack, to save the
register arguments into a single contiguous array).
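For illustration (a hedged example, not taken from the patch): because
floats are passed in integer registers here, va_arg can simply walk one
contiguous array of 8-byte slots regardless of the argument's type.

  #include <cstdarg>

  double sum(int n, ...) {
    va_list ap;
    va_start(ap, n); // points into the single contiguous save area
    double total = 0;
    for (int i = 0; i < n; ++i)
      total += va_arg(ap, double); // was passed in an x-register
    va_end(ap);
    return total;
  }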
Differential Revision: https://reviews.llvm.org/D35006
llvm-svn: 307928
Functions that have a dynamic alloca require a base register which is defined to
be X19 on AArch64 and r6 on ARM. We have defined the swifterror register to be
the same register. Use a different callee save register for swifterror instead:
X21 on AArch64
R8 on ARM
rdar://30433803
llvm-svn: 294551
We still don't really have an equivalent of "AssertXExt" in DAG, so we don't
exploit the guarantees on the receiving side yet, but this should produce
conservatively correct code on iOS ABIs.
llvm-svn: 282069
They're basically i64 for AArch64, but we'll leave them intact for stranger
targets. Also add some tests for the (very few) other cases we can handle right
now.
llvm-svn: 276689
It is very likely that the swiftself parameter is alive throughout most
of a function, so putting it into a callee save register should
avoid spills for the callers with only a minimal amount of extra spills
in the callees.
Currently the generated code is correct but unnecessarily spills and
reloads arguments passed in callee save registers, I will address this
in upcoming patches.
This also adds a missing check that, for tail calls, the preserved value
of the caller must be the same as the callee's parameter.
Differential Revision: http://reviews.llvm.org/D19007
llvm-svn: 266251
This change adds support for the preserve_most calling convention to the AArch64 backend, similar to how it was done for X86-64.
There is also a subsequent patch on top of this one to add tail-call support for this calling convention.
Differential Revision: http://reviews.llvm.org/D18016
llvm-svn: 263092
The access function has a short entry and a short exit; the initialization
block is only run the first time. To improve the performance, we want to
have a short frame at the entry and exit.
We explicitly handle most of the CSRs via copies. Only the CSRs that are not
handled via copies will be in CSR_SaveList.
Frame lowering and prologue/epilogue insertion will generate a short frame
in the entry and exit according to CSR_SaveList. The majority of the CSRs
will be handled by the register allocator, which will try to spill and
reload them in the initialization block.
We add CSRsViaCopy; it will be explicitly handled during lowering.
1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target
supports it for the given machine function and the function has only return
exits). We also call TLI->initializeSplitCSR to perform initialization.
2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to
virtual registers at beginning of the entry block and copies from virtual
registers to CSRsViaCopy at beginning of the exit blocks.
3> we also need to make sure the explicit copies will not be eliminated.
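A rough model of the effect of those copies (plain C++ standing in for
the machine IR; the helpers are hypothetical and the real work happens
in initializeSplitCSR/insertCopiesSplitCSR):

  unsigned long run_init_block();          // cold: executes only once
  unsigned long compute(unsigned long v);

  extern unsigned long x19; // models a CSR handled via copies

  unsigned long access_function(bool needs_init) {
    unsigned long saved = x19;  // entry-block copy: CSR -> vreg
    if (needs_init)
      x19 = run_init_block();   // spills/reloads confined to this cold path
    unsigned long result = compute(x19);
    x19 = saved;                // exit-block copy: vreg -> CSR
    return result;
  }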
The target independent portion was committed as r255353.
rdar://problem/23557469
Differential Revision: http://reviews.llvm.org/D15341
llvm-svn: 255821
Summary:
This change is part of a series of commits dedicated to having a single
DataLayout during compilation by always using the one owned by the
module.
This patch is quite boring overall, except for some ugliness in
AsmPrinter, which has a getDataLayout function but has some clients
that use it without a Module (llvm-dsymutil, llvm-dwarfdump), so
some methods are taking a DataLayout as parameter.
Reviewers: echristo
Subscribers: yaron.keren, rafael, llvm-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D11090
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 242386
The nest attribute is currently supported on the x86 (32-bit) and x86-64
backends, but not on ARM (32-bit) or AArch64. This patch adds support for
nest to the AArch64 backend.
Register x18 is used by GCC for this purpose and hence is used here.
As discussed on the GCC mailing list the register choice is an ABI issue
and so choosing the same register as GCC means __builtin_call_with_static_chain
is compatible.
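As an illustration of that compatibility point (hedged: it requires a
compiler providing the builtin, which both GCC and Clang do), the static
chain can be set from C/C++ like so:

  extern "C" int callee(int x); // also receives a static chain in x18

  int callWithChain(int x, void *chain) {
    return __builtin_call_with_static_chain(callee(x), chain);
  }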
Patch by Stephen Cross.
llvm-svn: 241794
Move DataLayout back to the TargetMachine from TargetSubtargetInfo
derived classes.
Since global data alignment, layout, and mangling are often based on the
DataLayout, move it to the TargetMachine. This ensures that global
data is going to be laid out and mangled consistently if the subtarget
changes on a per function basis. Prior to this all targets(*) have
had subtarget dependent code moved out and onto the TargetMachine.
*One target hasn't been migrated as part of this change: R600. The
R600 port has, as a subtarget feature, the size of pointers and
this affects global data layout. I've currently hacked in a FIXME
to enable progress, but the port needs to be updated to either pass
the 64-bitness to the TargetMachine, or fix the DataLayout to
avoid subtarget dependent features.
llvm-svn: 227113
Original patch by Luke Iannini. Minor improvements and test added by
Erik de Castro Lopo.
Differential Revision: http://reviews.llvm.org/D6877
From: Erik de Castro Lopo <erikd@mega-nerd.com>
llvm-svn: 226473
The AAPCS treats small structs and homogeneous floating (or vector) aggregates
specially, and guarantees they either get passed as a contiguous block of
registers, or prevent any future use of those registers and get passed on the
stack.
This concept can fit quite neatly into LLVM's own type system, mapping an HFA
to [N x float] and so on, and small structs to [N x i64]. Doing so allows
front-ends to emit AAPCS compliant code without having to duplicate the
register counting logic.
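For example (an illustrative mapping, not taken from the commit): a C
struct of three floats is an HFA, so a front-end can lower it to
[3 x float] and rely on the backend to pass it either in s0-s2 or
entirely on the stack.

  struct HFA3 { float a, b, c; }; // homogeneous floating-point aggregate

  // Passed as [3 x float] in IR; lands in s0-s2 or wholly on the stack.
  float sumHFA(HFA3 v) { return v.a + v.b + v.c; }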
llvm-svn: 222903
This teaches the AArch64 backend to handle the operations on v4f16 and
v8f16 which are exposed by NEON intrinsics, plus the add, sub, mul and
div operations.
llvm-svn: 216555
to get the subtarget and that's accessible from the MachineFunction
now. This helps clear the way for smaller changes where getting
a subtarget will require passing in a MachineFunction/Function as
well.
llvm-svn: 214988
This means the output of LowerFormalArguments returns a lowered
SDValue with the correct type (expected in SelectionDAGBuilder).
Without this, an assertion under a DEBUG macro triggers when those
types are passed on the stack.
llvm-svn: 210102
We can eliminate the custom C++ code in favour of some TableGen to
check the same things. Functionality should be identical, except for a
buffer overrun that was present in the C++ code and meant WebKit
failed if any small argument needed to be passed on the stack.
llvm-svn: 209636
This commit starts with a "git mv ARM64 AArch64" and continues out
from there, renaming the C++ classes, intrinsics, and other
target-local objects for consistency.
"ARM64" test directories are also moved, and tests that began their
life in ARM64 use an arm64 triple, those from AArch64 use an aarch64
triple. Both should be equivalent though.
This finishes the AArch64 merge, and everyone should feel free to
continue committing as normal now.
llvm-svn: 209577