Commit Graph

10 Commits

Author SHA1 Message Date
Jun Ma 3afbf89804 [clang][AArch64][SVE] Handle PRValue under VLAT <-> VLST cast
This change fixes the crash that PRValue cannot be handled by
EmitLValue.

Differential Revision: https://reviews.llvm.org/D105097
2021-07-01 10:09:47 +08:00
Fangrui Song d1fd72343c Refactor how -fno-semantic-interposition sets dso_local on default visibility external linkage definitions
The idea is that the CC1 default for ELF should set dso_local on default
visibility external linkage definitions in the default -mrelocation-model pic
mode (-fpic/-fPIC) to match COFF/Mach-O and make output IR similar.

The refactoring is made available by 2820a2ca3a.

Currently only x86 supports local aliases. We move the decision to the driver.
There are three CC1 states:

* -fsemantic-interposition: make some linkages interposable and make default visibility external linkage definitions dso_preemptable.
* (default): selected if the target supports .Lfoo$local: make default visibility external linkage definitions dso_local
* -fhalf-no-semantic-interposition: if neither option is set or the target does not support .Lfoo$local: like -fno-semantic-interposition but local aliases are not used. So references can be interposed if not optimized out.

Add -fhalf-no-semantic-interposition to a few tests using the half-based semantic interposition behavior.
2020-12-31 13:59:45 -08:00
Joe Ellis dad07baf12 [clang][AArch64][SVE] Avoid going through memory for VLAT <-> VLST casts
This change makes use of the llvm.vector.extract intrinsic to avoid
going through memory when performing bitcasts between vector-length
agnostic types and vector-length specific types.

Depends on D91362

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D92761
2020-12-16 12:24:32 +00:00
Roman Lebedev e00f189d39
[InstCombine] Revert rL226781 "Teach InstCombine to canonicalize loads which are only ever stored to always use a legal integer type if one is available." (PR47592)
(it was introduced in https://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html)

This canonicalization seems dubious.

Most importantly, while it does not create `inttoptr` casts by itself,
it may cause them to appear later, see e.g. D88788.

I think it's pretty obvious that it is an undesirable outcome,
by now we've established that seemingly no-op `inttoptr`/`ptrtoint` casts
are not no-op, and are no longer eager to look past them.
Which e.g. means that given
```
%a = load i32
%b = inttoptr %a
%c = inttoptr %a
```
we likely won't be able to tell that `%b` and `%c` is the same thing.

As we can see in D88789 / D88788 / D88806 / D75505,
we can't really teach SCEV about this (not without the https://bugs.llvm.org/show_bug.cgi?id=47592 at least)
And we can't recover the situation post-inlining in instcombine.

So it really does look like this fold is actively breaking
otherwise-good IR, in a way that is not recoverable.
And that means, this fold isn't helpful in exposing the passes
that are otherwise unaware of these patterns it produces.

Thusly, i propose to simply not perform such a canonicalization.
The original motivational RFC does not state what larger problem
that canonicalization was trying to solve, so i'm not sure
how this plays out in the larger picture.

On vanilla llvm test-suite + RawSpeed, this results in
increase of asm instructions and final object size by ~+0.05%
decreases final count of bitcasts by -4.79% (-28990),
ptrtoint casts by -15.41% (-3423),
and of inttoptr casts by -25.59% (-6919, *sic*).
Overall, there's -0.04% less IR blocks, -0.39% instructions.

See https://bugs.llvm.org/show_bug.cgi?id=47592

Differential Revision: https://reviews.llvm.org/D88789
2020-10-06 00:00:30 +03:00
Roman Lebedev aaae13d0c2
[NFC][clang][codegen] Autogenerate a few ARM SVE tests that are being affected by an upcoming patch 2020-10-04 19:54:09 +03:00
David Green ab2ed8bce9 [SVE] Regenerate sve vector bits tests. NFC 2020-09-11 18:51:57 +01:00
Cullen Rhodes f9091e56d3 [clang][aarch64] Drop experimental from __ARM_FEATURE_SVE_BITS macro
The __ARM_FEATURE_SVE_BITS feature macro is specified in the Arm C
Language Extensions (ACLE) for SVE [1] (version 00bet5). From the spec,
where __ARM_FEATURE_SVE_BITS==N:

    When N is nonzero, indicates that the implementation is generating
    code for an N-bit SVE target and that the arm_sve_vector_bits(N)
    attribute is available.

This was defined in D83550 as __ARM_FEATURE_SVE_BITS_EXPERIMENTAL and
enabled under the -msve-vector-bits flag to simplify initial tests.
This patch drops _EXPERIMENTAL now there is support for the feature.

[1] https://developer.arm.com/documentation/100987/latest

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D86720
2020-09-03 09:39:37 +00:00
Cullen Rhodes 2ddf795e8c Reland "[CodeGen][AArch64] Support arm_sve_vector_bits attribute"
This relands D85743 with a fix for test
CodeGen/attr-arm-sve-vector-bits-call.c that disables the new pass
manager with '-fno-experimental-new-pass-manager'. Test was failing due
to IR differences with the new pass manager which broke the Fuchsia
builder [1]. Reverted in 2e7041f.

[1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10375

Original summary:

This patch implements codegen for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define vector-length-specific (VLS)
versions of existing vector-length-agnostic (VLA) types.

VLSTs are represented as VectorType in the AST and fixed-length vectors
in the IR everywhere except in function args/return. Implemented in this
patch is codegen support for the following:

  * Implicit casting between VLA <-> VLS types.
  * Coercion of VLS types in function args/return.
  * Mangling of VLS types.

Casting is handled by the CK_BitCast operation, which has been extended
to support the two new vector kinds for fixed-length SVE predicate and
data vectors, where the cast is implemented through memory rather than a
bitcast which is unsupported. Implementing this as a normal bitcast
would require relaxing checks in LLVM to allow bitcasting between
scalable and fixed types. Another option was adding target-specific
intrinsics, although codegen support would need to be added for these
intrinsics. Given this, casting through memory seemed like the best
approach as it's supported today and existing optimisations may remove
unnecessary loads/stores, although there is room for improvement here.

Coercion of VLSTs in function args/return from fixed to scalable is
implemented through the AArch64 ABI in TargetInfo.

The VLA and VLS types are defined by the ACLE to map to the same
machine-level SVE vectors. VLS types are mangled in the same way as:

  __SVE_VLS<typename, unsigned>

where the first argument is the underlying variable-length type and the
second argument is the SVE vector length in bits. For example:

  #if __ARM_FEATURE_SVE_BITS==512
  // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
  typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
  // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
  typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));
  #endif

The latest ACLE specification (00bet5) does not contain details of this
mangling scheme, it will be specified in the next revision.  The
mangling scheme is otherwise defined in the appendices to the Procedure
Call Standard for the Arm Architecture, see [2] for more information.

[1] https://developer.arm.com/documentation/100987/latest
[2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85743
2020-08-28 15:57:09 +00:00
Cullen Rhodes 2e7041fdc2 Revert "[CodeGen][AArch64] Support arm_sve_vector_bits attribute"
Test CodeGen/attr-arm-sve-vector-bits-call.c is failing on some builders
[1][2]. Reverting whilst I investigate.

[1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10375
[2] https://luci-milo.appspot.com/p/fuchsia/builders/ci/clang-linux-x64/b8870800848452818112

This reverts commit 42587345a3.
2020-08-27 21:31:05 +00:00
Cullen Rhodes 42587345a3 [CodeGen][AArch64] Support arm_sve_vector_bits attribute
This patch implements codegen for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define vector-length-specific (VLS)
versions of existing vector-length-agnostic (VLA) types.

VLSTs are represented as VectorType in the AST and fixed-length vectors
in the IR everywhere except in function args/return. Implemented in this
patch is codegen support for the following:

  * Implicit casting between VLA <-> VLS types.
  * Coercion of VLS types in function args/return.
  * Mangling of VLS types.

Casting is handled by the CK_BitCast operation, which has been extended
to support the two new vector kinds for fixed-length SVE predicate and
data vectors, where the cast is implemented through memory rather than a
bitcast which is unsupported. Implementing this as a normal bitcast
would require relaxing checks in LLVM to allow bitcasting between
scalable and fixed types. Another option was adding target-specific
intrinsics, although codegen support would need to be added for these
intrinsics. Given this, casting through memory seemed like the best
approach as it's supported today and existing optimisations may remove
unnecessary loads/stores, although there is room for improvement here.

Coercion of VLSTs in function args/return from fixed to scalable is
implemented through the AArch64 ABI in TargetInfo.

The VLA and VLS types are defined by the ACLE to map to the same
machine-level SVE vectors. VLS types are mangled in the same way as:

  __SVE_VLS<typename, unsigned>

where the first argument is the underlying variable-length type and the
second argument is the SVE vector length in bits. For example:

  #if __ARM_FEATURE_SVE_BITS==512
  // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
  typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
  // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
  typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));
  #endif

The latest ACLE specification (00bet5) does not contain details of this
mangling scheme, it will be specified in the next revision.  The
mangling scheme is otherwise defined in the appendices to the Procedure
Call Standard for the Arm Architecture, see [2] for more information.

[1] https://developer.arm.com/documentation/100987/latest
[2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85743
2020-08-27 15:11:58 +00:00