Commit Graph

842 Commits

Author SHA1 Message Date
Momchil Velikov f9d932e673 [clang][AArch64] Correctly align HFA arguments when passed on the stack
When we pass a AArch64 Homogeneous Floating-Point
Aggregate (HFA) argument with increased alignment
requirements, for example

    struct S {
      __attribute__ ((__aligned__(16))) double v[4];
    };

Clang uses `[4 x double]` for the parameter, which is passed
on the stack at alignment 8, whereas it should be at
alignment 16, following Rule C.4 in
AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules)

Currently we don't have a way to express in LLVM IR the
alignment requirements of the function arguments. The align
attribute is applicable to pointers only, and only for some
special ways of passing arguments (e..g byval). When
implementing AAPCS32/AAPCS64, clang resorts to dubious hacks
of coercing to types, which naturally have the needed
alignment. We don't have enough types to cover all the
cases, though.

This patch introduces a new use of the stackalign attribute
to control stack slot alignment, when and if an argument is
passed in memory.

The attribute align is left as an optimizer hint - it still
applies to pointer types only and pertains to the content of
the pointer, whereas the alignment of the pointer itself is
determined by the stackalign attribute.

For byval arguments, the stackalign attribute assumes the
role, previously perfomed by align, falling back to align if
stackalign` is absent.

On the clang side, when passing arguments using the "direct"
style (cf. `ABIArgInfo::Kind`), now we can optionally
specify an alignment, which is emitted as the new
`stackalign` attribute.

Patch by Momchil Velikov and Lucas Prates.

Differential Revision: https://reviews.llvm.org/D98794
2021-04-15 22:58:14 +01:00
Martin Storsjö 8e0f2e89ff [clang] [AArch64] Fix handling of HFAs passed to Windows variadic functions
The documentation says that for variadic functions, all composites
are treated similarly, no special handling of HFAs/HVAs, not even
for the fixed arguments of a variadic function.

Differential Revision: https://reviews.llvm.org/D100467
2021-04-15 22:21:27 +03:00
Martin Storsjö 3637c5c8ec [clang] [AArch64] Fix Windows va_arg handling for larger structs
Aggregate types over 16 bytes are passed by reference.

Contrary to the x86_64 ABI, smaller structs with an odd (non power
of two) are padded and passed in registers.

Differential Revision: https://reviews.llvm.org/D100374
2021-04-14 14:51:53 +03:00
Liu, Chen3 1c4108ab66 [i386] Modify the alignment of __m128/__m256/__m512 vector type according i386 abi.
According to i386 System V ABI:

1. when __m256 are required to be passed on the stack, the stack pointer must be aligned on a 0 mod 32 byte boundary at the time of the call.
2. when __m512 are required to be passed on the stack, the stack pointer must be aligned on a 0 mod 64 byte boundary at the time of the call.

The current method of clang passing __m512 parameter are as follow:

1. when target supports avx512, passing it with 64 byte alignment;
2. when target supports avx, passing it with 32 byte alignment;
3. Otherwise, passing it with 16 byte alignment.

Passing __m256 parameter are as follow:

1. when target supports avx or avx512, passing it with 32 byte alignment;
2. Otherwise, passing it with 16 byte alignment.

This pach will passing __m128/__m256/__m512 following i386 System V ABI and
apply it to Linux only since other System V OS (e.g Darwin, PS4 and FreeBSD) don't
want to spend any effort dealing with the ramifications of ABI breaks at present.

Differential Revision: https://reviews.llvm.org/D78564
2021-04-14 16:44:54 +08:00
Ben Shi 4f173c0c42 [clang][AVR] Support variable decorator '__flash'
Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D96853
2021-04-10 11:23:55 +08:00
Soumi Manna 5d7cb79416 RISCVABIInfo::classifyArgumentType: Fix static analyzer warnings with uninitialized variables warnings - NFCI
Differential Revision: https://reviews.llvm.org/D100172
2021-04-09 15:23:32 +01:00
Jonas Paulsson 9cfd301ec8 [SystemZ] Test for isinf and isfinite in testFPKind().
Recognize BI__builtin_isinf and BI__builtin_isfinite (and a few other opcodes
for finite) in testFPKind() and handle with TDC.

Review: Ulrich Weigand.

Differential Revision: https://reviews.llvm.org/D97901
2021-03-15 15:02:39 -06:00
Nikita Popov 42eb658f65 [OpaquePtrs] Remove some uses of type-less CreateGEP() (NFC)
This removes some (but not all) uses of type-less CreateGEP()
and CreateInBoundsGEP() APIs, which are incompatible with opaque
pointers.

There are a still a number of tricky uses left, as well as many
more variation APIs for CreateGEP.
2021-03-12 21:01:16 +01:00
Min-Yih Hsu 5eb7a5814a [cfe][M68k](7/8) Clang basic support
This is the first patch supporting M68k in Clang
 - Register M68k as a target
 - Target specific CodeGen support
 - Target specific attribute support

Authors: myhsu, m4yers, glaubitz

Differential Revision: https://reviews.llvm.org/D88393
2021-03-08 12:30:57 -08:00
Sean Fertile 3f40dbbbc7 [PowerPC][AIX] Enable passing vectors in variadic functions.
Differential Revision: https://reviews.llvm.org/D97474
2021-03-01 13:08:28 -05:00
Jonas Paulsson e57bd1ff4f [CFE, SystemZ] New target hook testFPKind() for checks of FP values.
The recent commit 00a6254 "Stop traping on sNaN in builtin_isnan" changed the
lowering in constrained FP mode of builtin_isnan from an FP comparison to
integer operations to avoid trapping.

SystemZ has a special instruction "Test Data Class" which is the preferred
way to do this check. This patch adds a new target hook "testFPKind()" that
lets SystemZ emit the s390_tdc intrinsic instead.

testFPKind() takes the BuiltinID as an argument and is expected to soon
handle more opcodes than just 'builtin_isnan'.

Review: Thomas Preud'homme, Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D96568
2021-02-18 12:36:46 -06:00
Yaxun (Sam) Liu 053e61d54e Relands "[HIP] Change default --gpu-max-threads-per-block value to 1024"
This reverts commit e384e94fbe.
2021-02-12 10:53:59 -05:00
George Koehler 018984ae68 [PowerPC] Fix va_arg in C++, Objective-C on 32-bit ELF targets
In the PPC32 SVR4 ABI, a va_list has copies of registers from the function call.
va_arg looked in the wrong registers for (the pointer representation of) an
object in Objective-C, and for some types in C++. Fix va_arg to look in the
general-purpose registers, not the floating-point registers. Also fix va_arg
for some C++ types, like a member function pointer, that are aggregates for
the ABI.

Anthony Richardby found the problem in Objective-C. Eli Friedman suggested
part of this fix.

Fixes https://bugs.llvm.org/show_bug.cgi?id=47921

Reviewed By: efriedma, nemanjai

Differential Revision: https://reviews.llvm.org/D90329
2021-01-23 00:13:36 -05:00
David Truby e5f51fdd65 [clang][aarch64] Precondition isHomogeneousAggregate on isCXX14Aggregate
MSVC on WoA64 includes isCXX14Aggregate in its definition. This is de-facto
specification on that platform, so match msvc's behaviour.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=47611

Co-authored-by: Peter Waller <peter.waller@arm.com>

Differential Revision: https://reviews.llvm.org/D92751
2021-01-12 19:44:01 +00:00
Reid Kleckner ad55d5c3f3 Simplify vectorcall argument classification of HVAs, NFC
This reduces the number of `WinX86_64ABIInfo::classify` call sites from
3 to 1. The call sites were similar, but passed different values for
FreeSSERegs. Use variables instead of `if`s to manage that argument.
2021-01-07 11:14:18 -08:00
Brandon Bergren 6cee9d0cf8 [PowerPC] Support powerpcle target in Clang [3/5]
Add powerpcle support to clang.

For FreeBSD, assume a freestanding environment for now, as we only need it in the first place to build loader, which runs in the OpenFirmware environment instead of the FreeBSD environment.

For Linux, recognize glibc and musl environments to match current usage in Void Linux PPC.

Adjust driver to match current binutils behavior regarding machine naming.

Adjust and expand tests.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D93919
2021-01-02 12:17:58 -06:00
Fangrui Song c70f36865e Use basic_string::find(char) instead of basic_string::find(const char *s, size_type pos=0)
Many (StringRef) cannot be detected by clang-tidy performance-faster-string-find.
2020-12-16 23:28:32 -08:00
Matt Arsenault ef4da3c2ba clang: Add byval on x86_intrcc parameter 0
This will allow removing the special case treatment of the parameter
and avoid depending on the pointer's element type.
2020-12-14 16:34:37 -05:00
Melanie Blower 320af6b138 Create SPIRABIInfo to enable SPIR_FUNC calling convention.
Background: Call to library arithmetic functions for div is emitted by the
compiler and it set wrong “C” calling convention for calls to these functions,
whereas library functions are declared with `spir_function` calling convention.
InstCombine optimization replaces such calls with “unreachable” instruction.
It looks like clang lacks SPIRABIInfo class which should specify default
calling conventions for “system” function calls. SPIR supports only
SPIR_FUNC and SPIR_KERNEL calling convention.

Reviewers: Erich Keane, Anastasia

Differential Revision: https://reviews.llvm.org/D92721
2020-12-12 05:48:20 -08:00
Luís Marques 3af354e863 [Clang][CodeGen][RISCV] Fix hard float ABI for struct with empty struct and complex
Fixes bug 44904.

Differential Revision: https://reviews.llvm.org/D91278
2020-12-08 09:19:05 +00:00
Luís Marques fa8f5bfa4e [Clang][CodeGen][RISCV] Fix hard float ABI test cases with empty struct
The code seemed not to account for the field 1 offset.

Differential Revision: https://reviews.llvm.org/D91270
2020-12-08 09:19:05 +00:00
Jinsong Ji b49b8f096c [PowerPC][Clang] Remove QPX support
Clean up QPX code in clang missed in https://reviews.llvm.org/D83915

Reviewed By: #powerpc, steven.zhang

Differential Revision: https://reviews.llvm.org/D92329
2020-12-07 10:15:39 -05:00
Qiu Chaofan 3fca6a7844 [Clang] Don't adjust align for IBM extended double
Commit 6b1341eb fixed alignment for 128-bit FP types on PowerPC.
However, the quadword alignment adjustment shouldn't be applied to IBM
extended double (ppc_fp128 in IR) values.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D92278
2020-12-02 17:02:26 +08:00
Zarko Todorovski ff8e8c1b14 [AIX] Enabling vector type arguments and return for AIX
This patch enables vector type arguments on AIX.  All non-aggregate Altivec vector types are 16bytes in size and are 16byte aligned.

Reviewed By: Xiangling_L

Differential Revision: https://reviews.llvm.org/D92117
2020-11-27 09:55:52 -05:00
Simon Pilgrim 9d996c01aa TargetInfo.cpp - use castAs<> instead of getAs<> as we dereference the pointer directly. NFCI.
castAs<> will assert the correct cast type instead of just returning null, which we then try to dereference immediately.
2020-11-25 11:38:29 +00:00
Qiu Chaofan 6b1341eb5b [PowerPC] [Clang] Fix alignment of 128-bit float types
According to ELF v2 ABI, both IEEE 128-bit and IBM extended floating
point variables should be quad-word (16 bytes) aligned. Previously, only
vector types are considered aligned as quad-word on PowerPC.

This patch will fix incorrectness of IEEE 128-bit float argument in
va_arg cases.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D91596
2020-11-19 14:22:14 +08:00
Yaxun (Sam) Liu 3f4b5893ef [AMDGPU] Add option -munsafe-fp-atomics
Add an option -munsafe-fp-atomics for AMDGPU target.

When enabled, clang adds function attribute "amdgpu-unsafe-fp-atomics"
to any functions for amdgpu target. This allows amdgpu backend to use
unsafe fp atomic instructions in these functions.

Differential Revision: https://reviews.llvm.org/D91546
2020-11-16 21:52:12 -05:00
Michael Liao 8920ef06a1 [hip] Remove the coercion on aggregate kernel arguments.
- If an aggregate argument is indirectly accessed within kernels, direct
  passing results in unpromotable `alloca`, which degrade performance
  significantly. InferAddrSpace pass is enhanced in
  [D91121](https://reviews.llvm.org/D91121) to take the assumption that
  generic pointers loaded from the constant memory could be regarded
  global ones. The need for the coercion on aggregate arguments is
  mitigated.

Differential Revision: https://reviews.llvm.org/D89980
2020-11-12 21:19:30 -05:00
David Sherwood cea69fa4dc [SVE] Add fatal error for unnamed SVE variadic arguments
We don't currently support passing unnamed variadic SVE arguments
so I've added a fatal error if we hit such cases to prevent any
silent ABI issues in future.

Differential Revision: https://reviews.llvm.org/D90230
2020-10-30 13:35:47 +00:00
Douglas Yung 774ab60125 Add option to use older clang ABI behavior when passing certain union types as function arguments
Recently commit D78699 (commit 26cfb6e562), fixed clang's behavior with respect
to passing a union type through a register to correctly follow the ABI. However,
this is an ABI breaking change with earlier versions of the clang compiler, so we
should add an -fclang-abi-compat option to address this. Additionally, the PS4 ABI
requires the older behavior, so that is added as well.

This change adds a Ver11 value to the ClangABI enum that when it is set (or the
target is the PS4 triple), we skip the ABI fix introduced in D78699.

Differential Revision: https://reviews.llvm.org/D89747
2020-10-19 18:17:34 -07:00
Yaxun (Sam) Liu e384e94fbe Revert "[HIP] Change default --gpu-max-threads-per-block value to 1024"
This reverts commit 187658b8a6 due to
AMDGPU backend issues.
2020-10-15 17:25:55 -04:00
Bevin Hansson 101309fe04 [AST] Change return type of getTypeInfoInChars to a proper struct instead of std::pair.
Followup to D85191.

This changes getTypeInfoInChars to return a TypeInfoChars
struct instead of a std::pair of CharUnits. This lets the
interface match getTypeInfo more closely.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D86447
2020-10-13 13:26:56 +02:00
Liu, Chen3 26cfb6e562 [X86] Passing union type through register
For example:

  union M256 {
    double d;
    __m256 m;
  };
  extern void foo1(union M256 A);
  union M256 m1;
  void test() {
    foo1(m1);
  }

clang will pass m1 through stack which does not follow the ABI.

Differential Revision: https://reviews.llvm.org/D78699
2020-10-09 11:24:29 +08:00
Xiangling Liao 3a7487f903 [FE] Use preferred alignment instead of ABI alignment for complete object when applicable
On some targets, preferred alignment is larger than ABI alignment in some cases. For example,
on AIX we have special power alignment rules which would cause that. Previously, to support
those cases, we added a “PreferredAlignment” field in the `RecordLayout` to store the AIX
special alignment values in “PreferredAlignment” as the community suggested.

However, that patch alone is not enough. There are places in the Clang where `PreferredAlignment`
should have been used instead of ABI-specified alignment. This patch is aimed at fixing those
spots.

Differential Revision: https://reviews.llvm.org/D86790
2020-09-30 10:48:28 -04:00
Yaxun (Sam) Liu 187658b8a6 Recommit "[HIP] Change default --gpu-max-threads-per-block value to 1024"
Recommit 04abbb3a78
2020-09-28 22:43:17 -04:00
Chris Bowler f330d9f163 [PPC] [AIX] Implement calling convention IR for C99 complex types on AIX
Add AIX calling convention logic to Clang for C99 complex types on AIX

Differential Revision: https://reviews.llvm.org/D88130
2020-09-25 07:43:31 -04:00
Momchil Velikov a88c722e68 [AArch64] PAC/BTI code generation for LLVM generated functions
PAC/BTI-related codegen in the AArch64 backend is controlled by a set
of LLVM IR function attributes, added to the function by Clang, based
on command-line options and GCC-style function attributes. However,
functions, generated in the LLVM middle end (for example,
asan.module.ctor or __llvm_gcov_write_out) do not get any attributes
and the backend incorrectly does not do any PAC/BTI code generation.

This patch record the default state of PAC/BTI codegen in a set of
LLVM IR module-level attributes, based on command-line options:

* "sign-return-address", with non-zero value means generate code to
  sign return addresses (PAC-RET), zero value means disable PAC-RET.

* "sign-return-address-all", with non-zero value means enable PAC-RET
  for all functions, zero value means enable PAC-RET only for
  functions, which spill LR.

* "sign-return-address-with-bkey", with non-zero value means use B-key
  for signing, zero value mean use A-key.

This set of attributes are always added for AArch64 targets (as
opposed, for example, to interpreting a missing attribute as having a
value 0) in order to be able to check for conflicts when combining
module attributed during LTO.

Module-level attributes are overridden by function level attributes.
All the decision making about whether to not to generate PAC and/or
BTI code is factored out into AArch64FunctionInfo, there shouldn't be
any places left, other than AArch64FunctionInfo, which directly
examine PAC/BTI attributes, except AArch64AsmPrinter.cpp, which
is/will-be handled by a separate patch.

Differential Revision: https://reviews.llvm.org/D85649
2020-09-25 11:47:14 +01:00
Cullen Rhodes 002f5ab3b1 [clang][aarch64] Fix ILP32 ABI for arm_sve_vector_bits
The element types of scalable vectors are defined in terms of stdint
types in the ACLE. This patch fixes the mapping to builtin types for the
ILP32 ABI when creating VLS types with the arm_sve_vector_bits, where
the mapping is as follows:

  int32_t -> LongTy
  int64_t -> LongLongTy
  uint32_t -> UnsignedLongTy
  uint64_t -> UnsignedLongLongTy

This is implemented by leveraging getBuiltinVectorTypeInfo which is
target agnostic since it calls ASTContext::getIntTypeForBitwidth for
integer types. The element type for svfloat16_t is changed from
Float16Ty to HalfTy when creating VLS types since this is what is used
elsewhere.

For more information, see:

https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#types-varying-by-data-model
https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-support-for-scalable-vectors

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D87358
2020-09-11 09:46:35 +00:00
Yaxun (Sam) Liu 62dbb7e54c Revert "[HIP] Change default --gpu-max-threads-per-block value to 1024"
Temporarily revert commit 04abbb3a78
due to regressions in some HIP apps due backend issues revealed by
this change.

Will re-commit it when backend issues are fixed.
2020-09-02 16:12:28 -04:00
Cullen Rhodes 2ddf795e8c Reland "[CodeGen][AArch64] Support arm_sve_vector_bits attribute"
This relands D85743 with a fix for test
CodeGen/attr-arm-sve-vector-bits-call.c that disables the new pass
manager with '-fno-experimental-new-pass-manager'. Test was failing due
to IR differences with the new pass manager which broke the Fuchsia
builder [1]. Reverted in 2e7041f.

[1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10375

Original summary:

This patch implements codegen for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define vector-length-specific (VLS)
versions of existing vector-length-agnostic (VLA) types.

VLSTs are represented as VectorType in the AST and fixed-length vectors
in the IR everywhere except in function args/return. Implemented in this
patch is codegen support for the following:

  * Implicit casting between VLA <-> VLS types.
  * Coercion of VLS types in function args/return.
  * Mangling of VLS types.

Casting is handled by the CK_BitCast operation, which has been extended
to support the two new vector kinds for fixed-length SVE predicate and
data vectors, where the cast is implemented through memory rather than a
bitcast which is unsupported. Implementing this as a normal bitcast
would require relaxing checks in LLVM to allow bitcasting between
scalable and fixed types. Another option was adding target-specific
intrinsics, although codegen support would need to be added for these
intrinsics. Given this, casting through memory seemed like the best
approach as it's supported today and existing optimisations may remove
unnecessary loads/stores, although there is room for improvement here.

Coercion of VLSTs in function args/return from fixed to scalable is
implemented through the AArch64 ABI in TargetInfo.

The VLA and VLS types are defined by the ACLE to map to the same
machine-level SVE vectors. VLS types are mangled in the same way as:

  __SVE_VLS<typename, unsigned>

where the first argument is the underlying variable-length type and the
second argument is the SVE vector length in bits. For example:

  #if __ARM_FEATURE_SVE_BITS==512
  // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
  typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
  // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
  typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));
  #endif

The latest ACLE specification (00bet5) does not contain details of this
mangling scheme, it will be specified in the next revision.  The
mangling scheme is otherwise defined in the appendices to the Procedure
Call Standard for the Arm Architecture, see [2] for more information.

[1] https://developer.arm.com/documentation/100987/latest
[2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85743
2020-08-28 15:57:09 +00:00
Cullen Rhodes 2e7041fdc2 Revert "[CodeGen][AArch64] Support arm_sve_vector_bits attribute"
Test CodeGen/attr-arm-sve-vector-bits-call.c is failing on some builders
[1][2]. Reverting whilst I investigate.

[1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10375
[2] https://luci-milo.appspot.com/p/fuchsia/builders/ci/clang-linux-x64/b8870800848452818112

This reverts commit 42587345a3.
2020-08-27 21:31:05 +00:00
Cullen Rhodes 42587345a3 [CodeGen][AArch64] Support arm_sve_vector_bits attribute
This patch implements codegen for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define vector-length-specific (VLS)
versions of existing vector-length-agnostic (VLA) types.

VLSTs are represented as VectorType in the AST and fixed-length vectors
in the IR everywhere except in function args/return. Implemented in this
patch is codegen support for the following:

  * Implicit casting between VLA <-> VLS types.
  * Coercion of VLS types in function args/return.
  * Mangling of VLS types.

Casting is handled by the CK_BitCast operation, which has been extended
to support the two new vector kinds for fixed-length SVE predicate and
data vectors, where the cast is implemented through memory rather than a
bitcast which is unsupported. Implementing this as a normal bitcast
would require relaxing checks in LLVM to allow bitcasting between
scalable and fixed types. Another option was adding target-specific
intrinsics, although codegen support would need to be added for these
intrinsics. Given this, casting through memory seemed like the best
approach as it's supported today and existing optimisations may remove
unnecessary loads/stores, although there is room for improvement here.

Coercion of VLSTs in function args/return from fixed to scalable is
implemented through the AArch64 ABI in TargetInfo.

The VLA and VLS types are defined by the ACLE to map to the same
machine-level SVE vectors. VLS types are mangled in the same way as:

  __SVE_VLS<typename, unsigned>

where the first argument is the underlying variable-length type and the
second argument is the SVE vector length in bits. For example:

  #if __ARM_FEATURE_SVE_BITS==512
  // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
  typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
  // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
  typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));
  #endif

The latest ACLE specification (00bet5) does not contain details of this
mangling scheme, it will be specified in the next revision.  The
mangling scheme is otherwise defined in the appendices to the Procedure
Call Standard for the Arm Architecture, see [2] for more information.

[1] https://developer.arm.com/documentation/100987/latest
[2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85743
2020-08-27 15:11:58 +00:00
Matt Arsenault 30eeb742f1 clang: Use byref for aggregate kernel arguments
Add address space to indirect abi info and use it for kernels.

Previously, indirect arguments assumed assumed a stack passed object
in the alloca address space using byval. A stack pointer is unsuitable
for kernel arguments, which are passed in a separate, constant buffer
with a different address space.

Start using the new byref for aggregate kernel arguments. Previously
these were emitted as raw struct arguments, and turned into loads in
the backend. These will lower identically, although with byref you now
have the option of applying an explicit alignment. In the future, a
reasonable implementation would use byref for all kernel arguments
(this would be a practical problem at the moment due to losing things
like noalias on pointer arguments).

This is mostly to avoid fighting the optimizer's treatment of
aggregate load/store. SROA and instcombine both turn aggregate loads
and stores into a long sequence of element loads and stores, rather
than the optimizable memcpy I would expect in this situation. Now an
explicit memcpy will be introduced up-front which is better understood
and helps eliminate the alloca in more situations.

This skips using byref in the case where HIP kernel pointer arguments
in structs are promoted to global pointers. At minimum an additional
patch is needed to allow coercion with indirect arguments. This also
skips using it for OpenCL due to the current workaround used to
support kernels calling kernels. Distinct function bodies would need
to be generated up front instead of emitting an illegal call.
2020-08-06 15:52:26 -04:00
Anatoly Trosinenko 5a07490d76 [ABI][NFC] Fix the confusion of ByVal and ByRef argument names
The second argument of getNaturalAlignIndirect() was `bool ByRef`, but
the implementation was just delegating to getIndirect() with `ByRef`
passed unchanged to `bool ByVal` parameter of getIndirect().

Fix a couple of /*ByRef=*/ comments as well.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D85113
2020-08-06 15:20:18 +03:00
Kazushi (Jam) Marukawa 045e79e77c [VE] Extend integer arguments and return values smaller than 64 bits
In order to follow NEC Aurora SX VE ABI correctly, change to sign/zero
extend integer arguments and return values smaller than 64 bits in clang.
Also update regression test.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D85071
2020-08-04 08:07:05 +09:00
Akira Hatanaka 41b1e97b12 [CodeGen][ObjC] Mark calls to objc_unsafeClaimAutoreleasedReturnValue as
notail on x86-64

This is needed because the epilogue code inserted before tail calls on
x86-64 breaks the handshake between the caller and callee.

Calls to objc_retainAutoreleasedReturnValue used to have the same
problem, which was fixed in https://reviews.llvm.org/D59656.

rdar://problem/66029552

Differential Revision: https://reviews.llvm.org/D84540
2020-08-03 13:25:25 -07:00
Ulrich Weigand 4c5a93bd58 [ABI] Handle C++20 [[no_unique_address]] attribute
Many platform ABIs have special support for passing aggregates that
either just contain a single member of floatint-point type, or else
a homogeneous set of members of the same floating-point type.

When making this determination, any extra "empty" members of the
aggregate type will typically be ignored.  However, in C++ (at least
in all prior versions), no data member would actually count as empty,
even if it's type is an empty record -- it would still be considered
to take up at least one byte of space, and therefore make those ABI
special cases not apply.

This is now changing in C++20, which introduced the [[no_unique_address]]
attribute.  Members of empty record type, if they also carry this
attribute, now do *not* take up any space in the type, and therefore
the ABI special cases for single-element or homogeneous aggregates
should apply.

The C++ Itanium ABI has been updated accordingly, and GCC 10 has
added support for this new case.  This patch now adds support to
LLVM.  This is cross-platform; it affects all platforms that use
the single-element or homogeneous aggregate ABI special case and
implement this using any of the following common subroutines
in lib/CodeGen/TargetInfo.cpp:
  isEmptyField
  isEmptyRecord
  isSingleElementStruct
  isHomogeneousAggregate
2020-07-10 14:01:05 +02:00
Anatoly Trosinenko 67422e4294 [MSP430] Align the _Complex ABI with current msp430-gcc
Assembler output is checked against msp430-gcc 9.2.0.50 from TI.

Reviewed By: asl

Differential Revision: https://reviews.llvm.org/D82646
2020-07-09 18:28:48 +03:00
Ulrich Weigand 80a1b95b8e [SystemZ ABI] Allow class types in GetSingleElementType
The SystemZ ABI specifies that aggregate types with just a single
member of floating-point type shall be passed as if they were just
a scalar of that type.  This applies to both struct and class types
(but not unions).

However, the current ABI support code in clang only checks this
case for struct types, which means that for class types, generated
code does not adhere to the platform ABI.

Fixed by accepting both struct and class types in the
SystemZABIInfo::GetSingleElementType routine.
2020-07-07 19:56:19 +02:00
Erich Keane 2831a317b6 Implement AVX ABI Warning/error
The x86-64 "avx" feature changes how >128 bit vector types are passed,
instead of being passed in separate 128 bit registers, they can be
passed in 256 bit registers.

"avx512f" does the same thing, except it switches from 256 bit registers
to 512 bit registers.

The result of both of these is an ABI incompatibility between functions
compiled with and without these features.

This patch implements a warning/error pair upon an attempt to call a
function that would run afoul of this. First, if a function is called
that would have its ABI changed, we issue a warning.

Second, if said call is made in a situation where the caller and callee
are known to have different calling conventions (such as the case of
'target'), we instead issue an error.

Differential Revision: https://reviews.llvm.org/D82562
2020-07-01 07:14:31 -07:00