As a linker is allowed to clobber r12 on function calls, the code
transformation that hardens indirect calls is not correct in case a
linker does so. Similarly, the transformation is not correct when
register lr is used.
This patch makes sure that r12 or lr are not used for indirect calls
when harden-sls-blr is enabled.
Differential Revision: https://reviews.llvm.org/D92469
In most of lib/Target we know that we are not dealing with scalable
types so it's perfectly fine to replace TypeSize comparison operators
with their fixed width equivalents, making use of getFixedSize()
and so on.
Differential Revision: https://reviews.llvm.org/D89101
There's no reason to involve the hassle of a virtual method targets
have to override for a simple boolean.
Not sure exactly what's going on with Mips, but it seems to define its
own totally separate handler classes.
When the byref attribute is added, there will need to be two similar
functions for the existing cases which have an associate value copy,
and byref which does not. Most, but not all of the existing uses will
use the existing version.
The associated size function added by D82679 also needs to
contextually differ, and will help eliminate a few places still
relying on pointee element types.
Summary:
Half-precision floating point arguments and returns are currently
promoted to either float or int32 in clang's CodeGen and there's
no existing support for the lowering of `half` arguments and returns
from IR in AArch32's backend.
Such frontend coercions, implemented as coercion through memory
in clang, can cause a series of issues in argument lowering, as causing
arguments to be stored on the wrong bits on big-endian architectures
and incurring in missing overflow detections in the return of certain
functions.
This patch introduces the handling of half-precision arguments and returns in
the backend using the actual "half" type on the IR. Using the "half"
type the backend is able to properly enforce the AAPCS' directions for
those arguments, making sure they are stored on the proper bits of the
registers and performing the necessary floating point convertions.
Reviewers: rjmccall, olista01, asl, efriedma, ostannard, SjoerdMeijer
Reviewed By: ostannard
Subscribers: stuij, hiraditya, dmgreen, llvm-commits, chill, dnsampaio, danielkiss, kristof.beyls, cfe-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D75169
Summary:
G_GEP is rather poorly named. It's a simple pointer+scalar addition and
doesn't support any of the complexities of getelementptr. I therefore
propose that we rename it. There's a G_PTR_MASK so let's follow that
convention and go with G_PTR_ADD
Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm
Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69734
The default implementation of isIncomingArgumentHandler could lead
to generating incorrect code.
Make it a pure virtual method, so that targets know they have to
override it to produce correct code.
NFC
Differential Revision: https://reviews.llvm.org/D69187
llvm-svn: 375277
On AArch64, s128 types have to be split into s64 GPRs when passed as arguments.
This change adds the generic support in call lowering for dealing with multiple
registers, for incoming and outgoing args.
Support for splitting for return types not yet implemented.
Differential Revision: https://reviews.llvm.org/D66180
llvm-svn: 370822
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).
Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor
Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&
Depends on D65919
Reviewers: arsenm, bogner, craig.topper, RKSimon
Reviewed By: arsenm
Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65962
llvm-svn: 369041
I've now needed to add an extra parameter to this call twice recently. Not only
is the signature getting extremely unwieldy, but just updating all of the
callsites and implementations is a pain. Putting the parameters in a struct
sidesteps both issues.
llvm-svn: 368408
Summary:
This will make it possible to improve IPRA by taking into account
register usage in indirect calls.
NFC yet; this is just laying the groundwork to start building
up patches to take advantage of the information for improved register
allocation.
Reviewers: aditya_nandakumar, volkan, qcolombet, arsenm, rovka, aemerson, paquette
Subscribers: sdardis, wdng, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65488
llvm-svn: 367476
Migrate CallLowering::lowerReturnVal to use the same infrastructure as
lowerCall/FormalArguments and remove the now obsolete code path from
splitToValueTypes.
Forgot to push this earlier.
llvm-svn: 366308
Change the interface of CallLowering::lowerCall to accept several
virtual registers for each argument, instead of just one. This is a
follow-up to D46018.
CallLowering::lowerReturn was similarly refactored in D49660 and
lowerFormalArguments in D63549.
With this change, we no longer pack the virtual registers generated for
aggregates into one big lump before delegating to the target. Therefore,
the target can decide itself whether it wants to handle them as separate
pieces or use one big register.
ARM and AArch64 have been updated to use the passed in virtual registers
directly, which means we no longer need to generate so many
merge/extract instructions.
NFCI for AMDGPU, Mips and X86.
Differential Revision: https://reviews.llvm.org/D63551
llvm-svn: 364512
Change the interface of CallLowering::lowerCall to accept several
virtual registers for the call result, instead of just one. This is a
follow-up to D46018.
CallLowering::lowerReturn was similarly refactored in D49660 and
lowerFormalArguments in D63549.
With this change, we no longer pack the virtual registers generated for
aggregates into one big lump before delegating to the target. Therefore,
the target can decide itself whether it wants to handle them as separate
pieces or use one big register.
ARM and AArch64 have been updated to use the passed in virtual registers
directly, which means we no longer need to generate so many
merge/extract instructions.
NFCI for AMDGPU, Mips and X86.
Differential Revision: https://reviews.llvm.org/D63550
llvm-svn: 364511
Change the interface of CallLowering::lowerFormalArguments to accept
several virtual registers for each formal argument, instead of just one.
This is a follow-up to D46018.
CallLowering::lowerReturn was similarly refactored in D49660. lowerCall
will be refactored in the same way in follow-up patches.
With this change, we forward the virtual registers generated for
aggregates to CallLowering. Therefore, the target can decide itself
whether it wants to handle them as separate pieces or use one big
register. We also copy the pack/unpackRegs helpers to CallLowering to
facilitate this.
ARM and AArch64 have been updated to use the passed in virtual registers
directly, which means we no longer need to generate so many
merge/extract instructions.
AArch64 seems to have had a bug when lowering e.g. [1 x i8*], which was
put into a s64 instead of a p0. Added a test-case which illustrates the
problem more clearly (it crashes without this patch) and fixed the
existing test-case to expect p0.
AMDGPU has been updated to unpack into the virtual registers for
kernels. I think the other code paths fall back for aggregates, so this
should be NFC.
Mips doesn't support aggregates yet, so it's also NFC.
x86 seems to have code for dealing with aggregates, but I couldn't find
the tests for it, so I just added a fallback to DAGISel if we get more
than one virtual register for an argument.
Differential Revision: https://reviews.llvm.org/D63549
llvm-svn: 364510
Allow CallLowering::ArgInfo to contain more than one virtual register.
This is useful when passes split aggregates into several virtual
registers, but need to also provide information about the original type
to the call lowering. Used in follow-up patches.
Differential Revision: https://reviews.llvm.org/D63548
llvm-svn: 364509
Avoids using a plain unsigned for registers throughoug codegen.
Doesn't attempt to change every register use, just something a little
more than the set needed to build after changing the return type of
MachineOperand::getReg().
llvm-svn: 364191
Bail out on function arguments/returns with types aggregating an
unsupported type. This fixes cases where we would happily and
incorrectly lower functions taking e.g. [1 x i64] parameters, when we
don't even support plain i64 yet.
llvm-svn: 359540
required to be passed as different register types. E.g. <2 x i16> may need to
be passed as a larger <2 x i32> type, so formal arg lowering needs to be able
truncate it back. Likewise, when dealing with returns of these types, they need
to be widened in the appropriate way back.
Differential Revision: https://reviews.llvm.org/D60425
llvm-svn: 358032
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Allow varargs functions to be called, both in arm and thumb mode. This
boils down to choosing the correct calling convention, which we can
easily test by making sure arm_aapcscc is used instead of
arm_aapcs_vfpcc when the callee is variadic.
llvm-svn: 351424
We currently handle all aggregates by creating one large LLT, and letting the
legalizer deal with splitting them up. However using this approach means that
we can't support big endian code correctly.
This patch changes the way that the IRTranslator deals with aggregate values,
by splitting them up into their constituent element values. To do this, parts
of the translator need to be modified to deal with multiple VRegs for a single
Value.
A new Value to VReg mapper is introduced to help keep compile time under
control, currently there is no measurable impact on CTMark despite the extra
code being generated in some cases.
Patch is based on the original work of Tim Northover.
Differential Revision: https://reviews.llvm.org/D46018
llvm-svn: 332449
Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it.
The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly.
Differential Revision: https://reviews.llvm.org/D45017
llvm-svn: 328806
This is used by llvm tblgen as well as by LLVM Targets, so the only
common place is Support for now. (maybe we need another target for these
sorts of things - but for now I'm at least making them correct & we can
make them better if/when people have strong feelings)
llvm-svn: 328395
Currently we assert that only non target specific opcodes can have
missing RegisterClass constraints in the MCDesc. The backend can have
instructions with register operands but don't have RegisterClass
constraints (say using unknown_class) in which case the instruction
defining the register will constrain it.
Change the assert to only fire if a def has no regclass.
https://reviews.llvm.org/D43409
llvm-svn: 326142
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).
llvm-svn: 318490
We're currently bailing out for Thumb targets while lowering formal
parameters, but there used to be some other checks before it, which
could've caused some functions (e.g. those without formal parameters) to
sneak through unnoticed.
llvm-svn: 317312
We were generating BLX for all the calls, which was incorrect in most
cases. Update ARMCallLowering to generate BL for direct calls, and BLX,
BX_CALL or BMOVPCRX_CALL for indirect calls.
llvm-svn: 316570
We end up creating COPY's that are either truncating/extending and this
should be illegal.
https://reviews.llvm.org/D37640
Patch for X86 and ARM by igorb, rovka
llvm-svn: 315240
ARMv4 doesn't support the "BX" instruction, which has been introduced
with ARMv4t. Adjust the call lowering and tail call implementation
accordingly.
Further changes are necessary to ensure that presence of the v4t feature
is correctly set. Most importantly, the "generic" CPU for thumb-*
triples should include ARMv4t, since thumb mode without thumb support
would naturally be pointless.
Add a couple of asserts to ensure thumb instructions are not emitted
without CPU support.
Differential Revision: https://reviews.llvm.org/D37030
llvm-svn: 311921