Summary:
On targets that do not support FP16 natively LLVM currently legalizes
vectors of FP16 values by scalarizing them and promoting to FP32. This
causes problems for the following code:
void foo(int, ...);
typedef __attribute__((neon_vector_type(4))) __fp16 float16x4_t;
void bar(float16x4_t x) {
foo(42, x);
}
According to the AAPCS (appendix A.2) float16x4_t is a containerized
vector fundamental type, so 'foo' expects that the 4 16-bit FP values
are packed into 2 32-bit registers, but instead bar promotes them to
4 single precision values.
Since we already handle scalar FP16 values in the frontend by
bitcasting them to/from integers, this patch adds similar handling for
vector types and homogeneous FP16 vector aggregates.
One existing test required some adjustments because we now generate
more bitcasts (so the patch changes the test to target a machine with
native FP16 support).
Reviewers: eli.friedman, olista01, SjoerdMeijer, javed.absar, efriedma
Reviewed By: javed.absar, efriedma
Subscribers: efriedma, kristof.beyls, cfe-commits, chrib
Differential Revision: https://reviews.llvm.org/D50507
llvm-svn: 342034
- Add a command line options -msign-return-address to enable return address
signing
- Armv8.3a added instructions to sign the return address to help mitigate
against ROP attacks
- This patch adds command line options to generate function attributes that
signal to the back whether return address signing instructions should be
added
Differential revision: https://reviews.llvm.org/D49793
llvm-svn: 340019
The "Procedure Call Procedure Call Standard for the ARM® Architecture"
(https://static.docs.arm.com/ihi0042/f/IHI0042F_aapcs.pdf), specifies that
composite types are passed according to their "natural alignment", i.e. the
alignment before alignment adjustment on the entire composite is applied.
The same applies for AArch64 ABI.
Clang, however, used the adjusted alignment.
GCC already implements the ABI correctly. With this patch Clang becomes
compatible with GCC and passes such arguments in accordance with AAPCS.
Differential Revision: https://reviews.llvm.org/D46013
llvm-svn: 338279
Summary:
Clang supports the GNU style ``__attribute__((interrupt))`` attribute on RISCV targets.
Permissible values for this parameter are user, supervisor, and machine.
If there is no parameter, then it defaults to machine.
Reference: https://gcc.gnu.org/onlinedocs/gcc/RISC-V-Function-Attributes.html
Based on initial patch by Zhaoshi Zheng.
Reviewers: asb, aaron.ballman
Reviewed By: asb, aaron.ballman
Subscribers: rkruppe, the_o, aaron.ballman, MartinMosbeck, brucehoult, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, cfe-commits
Differential Revision: https://reviews.llvm.org/D48412
llvm-svn: 338045
Update clang to treat fp128 as a valid base type for homogeneous aggregate
passing and returning.
Differential Revision: https://reviews.llvm.org/D48044
llvm-svn: 336308
The WebAssembly backend in particular benefits from being
able to distinguish between varargs functions (...) and prototype-less
C functions.
Differential Revision: https://reviews.llvm.org/D48443
llvm-svn: 335510
Currently clang set kernel calling convention for CUDA/HIP after
arranging function, which causes incorrect kernel function type since
it depends on calling convention.
This patch moves setting kernel convention before arranging
function.
Differential Revision: https://reviews.llvm.org/D47733
llvm-svn: 334457
This is similar to the LLVM change https://reviews.llvm.org/D46290.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46320
llvm-svn: 331834
Some targets need special LLVM calling convention for CUDA kernel.
This patch does that through a TargetCodeGenInfo hook.
It only affects amdgcn target.
Patch by Greg Rodgers.
Revised and lit tests added by Yaxun Liu.
Differential Revision: https://reviews.llvm.org/D45223
llvm-svn: 330447
The force_align_arg_pointer attribute was using a hardcoded 16-byte
alignment value which in combination with -mstack-alignment=32 (or
larger) would produce a misaligned stack which could result in crashes
when accessing stack buffers using aligned AVX load/store instructions.
Fix the issue by using the "stackrealign" function attribute instead
of using a hardcoded 16-byte alignment.
Patch By: Gramner
Differential Revision: https://reviews.llvm.org/D45812
llvm-svn: 330331
This reverts r328795 which introduced an issue with referencing __global__
function templates. More details in the original review D44747.
llvm-svn: 329099
This patch sets target specific calling convention for CUDA kernels in IR.
Patch by Greg Rodgers.
Revised and lit test added by Yaxun Liu.
Differential Revision: https://reviews.llvm.org/D44747
llvm-svn: 328795
ObjC and ObjC++ pass non-trivial structs in a way that is incompatible
with each other. For example:
typedef struct {
id f0;
__weak id f1;
} S;
// this code is compiled in c++.
extern "C" {
void foo(S s);
}
void caller() {
// the caller passes the parameter indirectly and destructs it.
foo(S());
}
// this function is compiled in c.
// 'a' is passed directly and is destructed in the callee.
void foo(S a) {
}
This patch fixes the incompatibility by passing and returning structs
with __strong or weak fields using the C ABI in C++ mode. __strong and
__weak fields in a struct do not cause the struct to be destructed in
the caller and __strong fields do not cause the struct to be passed
indirectly.
Also, this patch fixes the microsoft ABI bug mentioned here:
https://reviews.llvm.org/D41039?id=128767#inline-364710
rdar://problem/38887866
Differential Revision: https://reviews.llvm.org/D44908
llvm-svn: 328731
r327219 added wrappers to std::sort which randomly shuffle the container before
sorting. This will help in uncovering non-determinism caused due to undefined
sorting order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of
std::sort.
llvm-svn: 328636
Add two additional implicit arguments for OpenCL for the AMDGPU target using the AMDHSA runtime to support device enqueue.
Differential Revision: https://reviews.llvm.org/D44696
llvm-svn: 328350
- Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target.
- Use a function attribute to communicate to the AMDGPU backend.
Differential Revision: https://reviews.llvm.org/D43735
llvm-svn: 328347
This patch uses the infrastructure added in r326307 for enabling
non-trivial fields to be declared in C structs to allow __weak fields in
C structs in ARC.
This recommits r327206, which was reverted because it caused
module-enabled builders to fail. I discovered that the
CXXRecordDecl::CanPassInRegisters flag wasn't being set correctly in
some cases after I moved it to RecordDecl.
Thanks to Eric Liu for helping me investigate the bug.
rdar://problem/33599681
https://reviews.llvm.org/D44095
llvm-svn: 327870
This patch uses the infrastructure added in r326307 for enabling
non-trivial fields to be declared in C structs to allow __weak fields in
C structs in ARC.
rdar://problem/33599681
Differential Revision: https://reviews.llvm.org/D44095
llvm-svn: 327206
Summary:
This patch is a fix for following issue:
https://bugs.llvm.org/show_bug.cgi?id=31362 The problem was caused by front end
lowering C calling conventions without taking into account calling conventions
enforced by attribute. In this case win64cc was no correctly lowered on targets
other than Windows.
Reviewed By: rnk (Reid Kleckner)
Differential Revision: https://reviews.llvm.org/D43016
Author: belickim <mateusz.belicki@intel.com>
llvm-svn: 324594
I found this while looking at the ppc failures caused by the dso_local
change.
The issue was that the patch would produce the wrong answer for
available_externally. Having ForDefinition_t available in places where
the code can just check the linkage is a bit of a foot gun.
This patch removes the ForDefiniton_t argument in places where the
linkage is already know.
llvm-svn: 324499
When trying to track down a different bug, we discovered
that calling __builtin_va_arg on a vec3f type caused
the SROA pass to issue a warning that there was an illegal
access.
Further research showed that the vec3f type is
alloca'ed as size '12', but the _builtin_va_arg code
on x86_64 was always loading this out of registers as
{double, double}. Thus, the 2nd store into the vec3f
was storing in bytes 12-15!
This patch alters the original implementation which always
assumed {double, double} to use the actual coerced type
instead, so the LLVM-IR generated is a load/GEP/store of
a <2 x float> and a float, rather than a double and a double.
Tests were added for all combinations I could think of that
would fit in 2 FP registers, and all work exactly as expected.
Differential Revision: https://reviews.llvm.org/D42811
llvm-svn: 324098
Pass and return _Float16 as if it were an int or float for ARM, but with the
top 16 bits unspecified, similarly like we already do for __fp16.
We will implement proper half-precision function argument lowering in the ARM
backend soon, but want to use this workaround in the mean time.
Differential Revision: https://reviews.llvm.org/D42318
llvm-svn: 323185
RISCVABIInfo is implemented in terms of XLen, supporting both RV32 and RV64.
Unfortunately we need to count argument registers in the frontend in order to
determine when to emit signext and zeroext attributes. Integer scalars are
extended according to their type up to 32-bits and then sign-extended to XLen
when passed in registers, but are anyext when passed on the stack. This patch
only implements the base integer (soft float) ABIs.
For more information on the RISC-V ABI, see [the ABI
doc](https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md),
my [golden model](https://github.com/lowRISC/riscv-calling-conv-model), and
the [LLVM RISC-V calling convention
patch](https://reviews.llvm.org/D39898#2d1595b4) (specifically the comment
documenting frontend expectations).
Differential Revision: https://reviews.llvm.org/D40023
llvm-svn: 322494
As @rjmccall suggested in D40023, we can get rid of
ABIInfo::shouldSignExtUnsignedType (used to handle cases like the Mips calling
convention where 32-bit integers are always sign extended regardless of the
sign of the type) by adding a SignExt field to ABIArgInfo. In the common case,
this new field is set automatically by ABIArgInfo::getExtend based on the sign
of the type. For targets that want greater control, they can use
ABIArgInfo::getSignExtend or ABIArgInfo::getZeroExtend when necessary. This
change also cleans up logic in CGCall.cpp.
There is no functional change intended in this patch, and all tests pass
unchanged. As noted in D40023, Mips might want to sign-extend unsigned 32-bit
integer return types. A future patch might modify
MipsABIInfo::classifyReturnType to use MipsABIInfo::extendType.
Differential Revision: https://reviews.llvm.org/D41999
llvm-svn: 322396
Darwin uses char * for the variadic list type (va_list). We use the PPC
SVR4 ABI for PPC, which uses a structure type for the va_list. When
constructing the GEP, we would fail due to the incorrect handling for
the va_list. Correct this to use the right type.
llvm-svn: 316599
Summary:
Convert clang::LangAS to a strongly typed enum
Currently both clang AST address spaces and target specific address spaces
are represented as unsigned which can lead to subtle errors if the wrong
type is passed. It is especially confusing in the CodeGen files as it is
not possible to see what kind of address space should be passed to a
function without looking at the implementation.
I originally made this change for our LLVM fork for the CHERI architecture
where we make extensive use of address spaces to differentiate between
capabilities and pointers. When merging the upstream changes I usually
run into some test failures or runtime crashes because the wrong kind of
address space is passed to a function. By converting the LangAS enum to a
C++11 we can catch these errors at compile time. Additionally, it is now
obvious from the function signature which kind of address space it expects.
I found the following errors while writing this patch:
- ItaniumRecordLayoutBuilder::LayoutField was passing a clang AST address
space to TargetInfo::getPointer{Width,Align}()
- TypePrinter::printAttributedAfter() prints the numeric value of the
clang AST address space instead of the target address space.
However, this code is not used so I kept the current behaviour
- initializeForBlockHeader() in CGBlocks.cpp was passing
LangAS::opencl_generic to TargetInfo::getPointer{Width,Align}()
- CodeGenFunction::EmitBlockLiteral() was passing a AST address space to
TargetInfo::getPointerWidth()
- CGOpenMPRuntimeNVPTX::translateParameter() passed a target address space
to Qualifiers::addAddressSpace()
- CGOpenMPRuntimeNVPTX::getParameterAddress() was using
llvm::Type::getPointerTo() with a AST address space
- clang_getAddressSpace() returns either a LangAS or a target address
space. As this is exposed to C I have kept the current behaviour and
added a comment stating that it is probably not correct.
Other than this the patch should not cause any functional changes.
Reviewers: yaxunl, pcc, bader
Reviewed By: yaxunl, bader
Subscribers: jlebar, jholewinski, nhaehnle, Anastasia, cfe-commits
Differential Revision: https://reviews.llvm.org/D38816
llvm-svn: 315871
In OpenCL the kernel function and non-kernel function has different calling conventions.
For certain targets they have different argument ABIs. Also kernels have special function
attributes and metadata for runtime to launch them.
The blocks passed to enqueue_kernel is supposed to be executed as kernels. As such,
the block invoke function should be emitted as kernel with proper calling convention and
argument ABI.
This patch emits enqueued block as kernel. If a block is both called directly and passed
to enqueue_kernel, separate functions will be generated.
Differential Revision: https://reviews.llvm.org/D38134
llvm-svn: 315804
This change will make it possible to use -fsanitize=function on Darwin and
possibly on other platforms. It fixes an issue with the way RTTI is stored into
function prologue data.
On Darwin, addresses stored in prologue data can't require run-time fixups and
must be PC-relative. Run-time fixups are undesirable because they necessitate
writable text segments, which can lead to security issues. And absolute
addresses are undesirable because they break PIE mode.
The fix is to create a private global which points to the RTTI, and then to
encode a PC-relative reference to the global into prologue data.
Differential Revision: https://reviews.llvm.org/D37597
llvm-svn: 313096
This patch adds a flag -fclang-abi-compat that can be used to request that
Clang attempts to be ABI-compatible with some older version of itself.
This is provided on a best-effort basis; right now, this can be used to undo
the ABI change in r310401, reverting Clang to its prior C++ ABI for pass/return
by value of class types affected by that change, and to undo the ABI change in
r262688, reverting Clang to using integer registers rather than SSE registers
for passing <1 x long long> vectors. The intent is that we will maintain this
backwards compatibility path as we make ABI-breaking fixes in future.
The reversion to the old behavior for r310401 is also applied to the PS4 target
since that change is not part of its platform ABI (which is essentially to do
whatever Clang 3.2 did).
llvm-svn: 311823
This patch is intended to enable the use of basic double letter constraints used in GCC extended inline asm {Yi Y2 Yz Y0 Ym Yt}.
Supersedes D35205
llvm counterpart: D36369
Differential Revision: https://reviews.llvm.org/D36371
llvm-svn: 311643
The comment markers accepted by the assembler vary between different targets,
but '//' is always accepted, so we should use that for consistency.
Differential revision: https://reviews.llvm.org/D36666
llvm-svn: 311325
This is causing failures when compiling clang with -O3
as one of the structures used by clang is passed by
value and uses the fastcc calling convention.
Faliures manifest for stage2 mips build.
llvm-svn: 310704
This is an improvement over always using byval for
structs.
This will use registers until ~16 are used, and then
switch back to byval. This needs more work, since I'm
not sure it ever really makes sense to use byval. If
the register limit is exceeded, the arguments still
end up passed on the stack, but with a different ABI.
It also may make sense to base this on number of
registers used for non-struct arguments, rather than
just arguments that appear first in the argument list.
llvm-svn: 310527
OpenCL 2.0 atomic builtin functions have a scope argument which is ideally
represented as synchronization scope argument in LLVM atomic instructions.
Clang supports translating Clang atomic builtin functions to LLVM atomic
instructions. However it currently does not support synchronization scope
of LLVM atomic instructions. Without this, users have to use LLVM assembly
code to implement OpenCL atomic builtin functions.
This patch adds OpenCL 2.0 atomic builtin functions as Clang builtin
functions, which supports generating LLVM atomic instructions with
synchronization scope operand.
Currently only constant memory scope argument is supported. Support of
non-constant memory scope argument will be added later.
Differential Revision: https://reviews.llvm.org/D28691
llvm-svn: 310082