Summary:
Check calling convention in AMDGPUMachineFunction::isKernel
This will be used for AMDGPU_HSA_KERNEL symbol type in output ELF.
Also, in the future unused non-kernels may be optimized.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19917
llvm-svn: 268719
This is a step towards removing the rampant undefined behaviour in
SelectionDAG, which is a part of llvm.org/PR26808.
We rename SelectionDAGISel::Select to SelectImpl and update targets to
match, and then change Select to return void and consolidate the
sketchy behaviour we're trying to get away from there.
Next, we'll update backends to implement `void Select(...)` instead of
SelectImpl and eventually drop the base Select implementation.
llvm-svn: 268693
Given something like:
ldr r0, .LCPI0_0 (== pc-rel var)
add r0, pc
ldr r1, .LCPI0_1 (== pc-rel var)
add r1, pc
we cannot combine the 2 ldr instructions and litpools because they get added to
a different pc to form the correct address. I think the original logic came
from a time when we fused the LDRpci/PICADD instructions into one
pseudo-instruction so the PC was always immediately at-hand. That's no longer
the case.
Should fix general-dynamic TLS access on Linux, and quite possibly other -fPIC
code that relies on litpools (e.g. v6m and -Oz compilations) though trivial
tweaks of the .ll test didn't provoke anything.
llvm-svn: 268662
Summary:
Discovered by Dave Airlie, fixes an assertion in Khronos OpenGL CTS
GL43-CTS.shader_storage_buffer_object.advanced-matrix.
In this particular case, the buffer load intrinsic fed into a uniform
conditional branch, and led the brcond lowering down the wrong path.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19931
llvm-svn: 268650
Summary:
Version 2 is now the default. If you want to emit version 1, use
the amdgcn--amdhsa-amdcov1 triple.
Reviewers: arsenm, kzhuravl
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19283
llvm-svn: 268647
The instruction A2_tfrpi has a 64-bit operand, while the corresponding
intrinsic takes a 32-bit value. The actual value has only 8 significant
bits, so the difference is only in the type used to represent it.
In order to map the intrinsic to the instruction, the operand needs to
be extended to the correct type.
llvm-svn: 268635
This backend was supposed to generate C++ code which will re-construct
the LLVM IR passed as input. This seems to me to have very marginal
usefulness in the first place.
However, the code has never been updated to use IRBuilder, which makes
its current value negative -- people who look at the output may be
steered to use the *wrong* C++ APIs to construct IR.
Furthermore, it's generated code that doesn't compile since at least
2013.
Differential Revision: http://reviews.llvm.org/D19942
llvm-svn: 268631
[mips] On error, ParseDirective should always return false to signify that the
directive was understood.
Reviewers: dsanders, vkalintiris, sdardis
Subscribers: dsanders, llvm-commits, sdardis
Differential Revision: http://reviews.llvm.org/D19929
llvm-svn: 268630
Both Linux and kFreeBSD use glibc, so follow similiar code paths.
Add isTargetGlibc to check for this, and use it instead of isTargetLinux
in a few places.
Fixes PR22248 for kFreeBSD.
Differential Revision: http://reviews.llvm.org/D19104
llvm-svn: 268624
The result type of setcc is dependent on whether or not AVX512 is
present.
We had an X86-specific DAG-combine which assumed that the result type
should be i8 when it could be i1.
This meant that we would generate illegal setccs which LowerSETCC did
not like.
Instead, use an appropriate type and zero extend to i8.
Also, there were some scenarios where the fold should have fired but
didn't because we were overly cautious about the types. This meant that
we generated:
shrl $31, %edi
andl $1, %edi
kmovw %edi, %k0
kxnorw %k0, %k0, %k1
kshiftrw $15, %k1, %k1
kxorw %k1, %k0, %k0
kmovw %k0, %eax
instead of:
testl %edi, %edi
setns %al
This fixes PR27638.
llvm-svn: 268609
The code here is recursively Select-ing a new Node to avoid issues
where N is CSE'd during replaceDAGValue and stops being valid. We can
accomplish the same goal in a more principled way by using a
HandleSDNode.
This is essentially a less dodgy fix for PR25733 than the original
attempt back in r255120.
llvm-svn: 268590
This introduces a SystemZ-specific "backchain" attribute on function, which
enables writing the frame backchain link as specified by the ABI. This will
be used to implement -mbackchain option in clang.
Differential Revision: http://reviews.llvm.org/D19889
Fixed in this version: added RegState::Define and RegState::Kill on R1D
in prologue.
llvm-svn: 268581
This introduces a SystemZ-specific "backchain" attribute on function, which
enables writing the frame backchain link as specified by the ABI. This will
be used to implement -mbackchain option in clang.
Differential Revision: http://reviews.llvm.org/D19889
llvm-svn: 268571
The new register classes allow to tell the machine verifier that it is
fine to use RIP for address accesses in x32 mode. Prior to that patch,
we would complain that we are using a GR64 in place of GR32, whereas it
is actually fine to use GR64 for x32 as long as the 32 high bits are 0s.
RIP has this property and is used for RIP-relative addressing.
This partially fixes http://llvm.org/PR27481.
llvm-svn: 268567
This patch adds support for estimating the square root, its reciprocal and
division or reciprocal using the combiner generic reciprocal machinery.
llvm-svn: 268539
Summary:
Currently, when checking if a stack is "BigStack" or not, it doesn't count into spills and arguments. Therefore, LLVM won't reserve spill slot for this actually "BigStack". This may cause scavenger failure.
Reviewers: rengolin
Subscribers: aemerson, rengolin, tberghammer, danalbert, srhines, llvm-commits
Differential Revision: http://reviews.llvm.org/D19896
llvm-svn: 268529
This patch corresponds to review:
http://reviews.llvm.org/D18592
It allows the PPC back end to generate the xxspltw instruction where we
previously only emitted vspltw.
llvm-svn: 268516
Use std::make_pair instead of constructor
Use C++11 loop
Reuse helper var
Reviewers: tstellardAMD
Subsribers: arsenm
Differential Revision: http://reviews.llvm.org/D19787
llvm-svn: 268503
As requested by Rafael Espindola in his post-commit comments on r268036. This
makes the previous behaviour the default while still allowing verification of
IAS.
llvm-svn: 268496
Modification of previously existing code (variable rename only), with unit test added.
Differential Revision: http://reviews.llvm.org/D19368
llvm-svn: 268493
This code implements builtin_setjmp and builtin_longjmp exception handling intrinsics for 32-bit Sparc back-ends.
The code started as a mash-up of the PowerPC and X86 versions, although there are sufficient differences to both that had to be made for Sparc handling.
Note: I have manual tests running. I'll work on a unit test and add that to the rest of this diff in the next day.
Also, this implementation is only for 32-bit Sparc. I haven't focussed on a 64-bit version, although I have left the code in a prepared state for implementing this, including detecting pointer size and comments indicating where I suspect there may be differences.
Differential Revision: http://reviews.llvm.org/D19798
llvm-svn: 268483
i1 is now a legal type for X86 with AVX512.
There were some paths in X86FastISel which were not quite ready to see
an i1 value: they were not quite sure how to deal with sign/zero extends
for call arguments.
DTRT by extending to i8 for zeroext and bailing out of FastISel for
signext.
This fixes PR27591.
llvm-svn: 268470
This patch changes the TargetMachine arguments to be const. This is
required for {D19265}, and was requested to be done in a separate patch.
Patch by Jacob Hansen!
Differential Revision: http://reviews.llvm.org/D19797
llvm-svn: 268389
Summary:
It's always zero for SelectionDAG and is never read by the MIPS backend so
do the same for FastISel.
Reviewers: sdardis
Subscribers: dsanders, llvm-commits, sdardis
Differential Revision: http://reviews.llvm.org/D19863
llvm-svn: 268386
Summary:
This is much closer to the way MIPS relocation expressions work
(%hi(foo + 2) rather than %hi(foo) + 2) and removes the need for the
various bodges in MipsAsmParser::evaluateRelocExpr().
Removing those bodges ensures that the constant stored in MCValue is the
full 32 or 64-bit (depending on ABI) offset from the symbol. This will be used
to correct the %hi/%lo matching needed to sort the relocation table correctly.
As part of this:
* Gave MCExpr::print() the ability to omit parenthesis when emitting a
symbol reference inside a MipsMCExpr operator like %hi(X). Without this
we print things like %lo(($L1)).
* %hi(%neg(%gprel(X))) is now three MipsMCExpr's instead of one. Most of
the related special cases have been removed or moved to MipsMCExpr. We
can remove the rest as we gain support for the less common relocations
when they are not part of this specific combination.
* Renamed MipsMCExpr::VariantKind and the enum prefix ('VK_') to avoid confusion
with MCSymbolRefExpr::VariantKind and its prefix (also 'VK_').
* fixup_Mips_GOT_Local and fixup_Mips_GOT_Global were found to be identical
and merged into fixup_Mips_GOT.
* MO_GOT16 and MO_GOT turned out to be identical and have been merged into
MO_GOT.
* VK_Mips_GOT and VK_Mips_GOT16 turned out to be the same thing so they
have been merged into MEK_GOT
Reviewers: sdardis
Subscribers: dsanders, sdardis, llvm-commits
Differential Revision: http://reviews.llvm.org/D19716
llvm-svn: 268379