Commit Graph

14551 Commits

Author SHA1 Message Date
Arthur Eubanks ad727ab7d9 [NFC] Migrate some callers away from Function/AttributeLists methods that take an index
These methods can be confusing.
2021-08-17 21:05:40 -07:00
Arthur Eubanks 46cf82532c [NFC] Replace Function handling of attributes with less confusing calls
To avoid magic constants and confusing indexes.
2021-08-17 21:05:40 -07:00
Wang, Pengfei 5aeca3b0a5 [CFE][X86] Enable complex _Float16 support
Support complex _Float16 on X86 in C/C++ following the latest X86 psABI. (https://gitlab.com/x86-psABIs)

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D105331
2021-08-18 11:16:14 +08:00
Wang, Pengfei 2379949aad [X86] AVX512FP16 instructions enabling 3/6
Enable FP16 conversion instructions.

Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D105265
2021-08-18 09:03:41 +08:00
Dylan Fleming ef198cd99e [SVE] Remove usage of getMaxVScale for AArch64, in favour of IR Attribute
Removed AArch64 usage of the getMaxVScale interface, replacing it with
the vscale_range(min, max) IR Attribute.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D106277
2021-08-17 14:42:47 +01:00
Wang, Pengfei f1de9d6dae [X86] AVX512FP16 instructions enabling 2/6
Enable FP16 binary operator instructions.

Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D105264
2021-08-15 08:56:33 +08:00
Arthur Eubanks 8e9ffa1dc6 [NFC] Cleanup callers of AttributeList::hasAttributes()
AttributeList::hasAttributes() is confusing, use clearer methods like
hasFnAttrs().
2021-08-13 12:16:52 -07:00
Arthur Eubanks 80ea2bb574 [NFC] Rename AttributeList::getParam/Ret/FnAttributes() -> get*Attributes()
This is more consistent with similar methods.
2021-08-13 11:16:52 -07:00
Arthur Eubanks 92ce6db9ee [NFC] Rename AttributeList::hasFnAttribute() -> hasFnAttr()
This is more consistent with similar methods.
2021-08-13 11:09:18 -07:00
Michael Kruse b1de32d6dd [OMPIRBuilder] Clarify CanonicalLoopInfo. NFC.
Add in-source documentation on how CanonicalLoopInfo is intended to be used. In particular, clarify what parts of a CanonicalLoopInfo is considered part of the loop, that those parts must be side-effect free, and that InsertPoints to instructions outside those parts can be expected to be preserved after method calls implementing loop-associated directives.

CanonicalLoopInfo are now invalidated after it does not describe canonical loop anymore and asserts when trying to use it afterwards.

In addition, rename `createXYZWorkshareLoop` to `applyXYZWorkshareLoop` and remove the update location to avoid that the impression that they insert something from scratch at that location where in reality its InsertPoint is ignored. createStaticWorkshareLoop does not return a CanonicalLoopInfo anymore. First, it was not a canonical loop in the clarified sense (containing side-effects in form of calls to the OpenMP runtime). Second, it is ambiguous which of the two possible canonical loops it should actually return. It will not be needed before a feature expected to be introduced in OpenMP 6.0

Also see discussion in D105706.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D107540
2021-08-12 21:02:19 -05:00
Arnold Schwaighofer 9eb99d2e73 CodeGen: No need to check for isExternC if HasStrictReturn is already false
NFC intended.

Differential Revision: https://reviews.llvm.org/D107841
2021-08-11 07:42:48 -07:00
Wang, Pengfei 6f7f5b54c8 [X86] AVX512FP16 instructions enabling 1/6
1. Enable FP16 type support and basic declarations used by following patches.
2. Enable new instructions VMOVW and VMOVSH.

Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D105263
2021-08-10 12:46:01 +08:00
Michael Liao 6ec36d18ec [cuda] Mark builtin texture/surface reference variable as 'externally_initialized'.
- They need to be preserved even if there's no reference within the
  device code as the host code may need to initialize them based on the
  application logic.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D107718
2021-08-09 13:27:40 -04:00
Roger Ferrer Ibanez bfb77364d0 [OpenMP] Fix accidental reuse of VLA size
We were using an OpaqueValueExpr allocated on the stack to store
the size of a VLA. Because the VLASizeMap in CodegenFunction
uses the address of the expression to avoid recomputing VLAs,
we were accidentally reusing an earlier llvm::Value. This led to
invalid LLVM IR.

This is a temporary solution until VLASizeMap can be pushed and popped
based on the context.

Differential Revision: https://reviews.llvm.org/D107666
2021-08-07 05:55:27 +00:00
Joseph Huber 41a6b50c25 [OpenMP]Fix PR51349: Remove AlwaysInline for if regions.
After D94315 we add the `NoInline` attribute to the outlined function to handle
data environments in the OpenMP if clause. This conflicted with the `AlwaysInline`
attribute added to the outlined function. for better performance in D106799.
The data environments should ideally not require NoInline, but for now this
fixes PR51349.

Reviewed By: mikerice

Differential Revision: https://reviews.llvm.org/D107649
2021-08-06 17:53:04 -04:00
Serge Pavlov 4c4093e6e3 Introduce intrinsic llvm.isnan
This is recommit of the patch 16ff91ebcc,
reverted in 0c28a7c990 because it had
an error in call of getFastMathFlags (base type should be FPMathOperator
but not Instruction). The original commit message is duplicated below:

    Clang has builtin function '__builtin_isnan', which implements C
    library function 'isnan'. This function now is implemented entirely in
    clang codegen, which expands the function into set of IR operations.
    There are three mechanisms by which the expansion can be made.

    * The most common mechanism is using an unordered comparison made by
      instruction 'fcmp uno'. This simple solution is target-independent
      and works well in most cases. It however is not suitable if floating
      point exceptions are tracked. Corresponding IEEE 754 operation and C
      function must never raise FP exception, even if the argument is a
      signaling NaN. Compare instructions usually does not have such
      property, they raise 'invalid' exception in such case. So this
      mechanism is unsuitable when exception behavior is strict. In
      particular it could result in unexpected trapping if argument is SNaN.

    * Another solution was implemented in https://reviews.llvm.org/D95948.
      It is used in the cases when raising FP exceptions by 'isnan' is not
      allowed. This solution implements 'isnan' using integer operations.
      It solves the problem of exceptions, but offers one solution for all
      targets, however some can do the check in more efficient way.

    * Solution implemented by https://reviews.llvm.org/D96568 introduced a
      hook 'clang::TargetCodeGenInfo::testFPKind', which injects target
      specific code into IR. Now only SystemZ implements this hook and it
      generates a call to target specific intrinsic function.

    Although these mechanisms allow to implement 'isnan' with enough
    efficiency, expanding 'isnan' in clang has drawbacks:

    * The operation 'isnan' is hidden behind generic integer operations or
      target-specific intrinsics. It complicates analysis and can prevent
      some optimizations.

    * IR can be created by tools other than clang, in this case treatment
      of 'isnan' has to be duplicated in that tool.

    Another issue with the current implementation of 'isnan' comes from the
    use of options '-ffast-math' or '-fno-honor-nans'. If such option is
    specified, 'fcmp uno' may be optimized to 'false'. It is valid
    optimization in general, but it results in 'isnan' always returning
    'false'. For example, in some libc++ implementations the following code
    returns 'false':

        std::isnan(std::numeric_limits<float>::quiet_NaN())

    The options '-ffast-math' and '-fno-honor-nans' imply that FP operation
    operands are never NaNs. This assumption however should not be applied
    to the functions that check FP number properties, including 'isnan'. If
    such function returns expected result instead of actually making
    checks, it becomes useless in many cases. The option '-ffast-math' is
    often used for performance critical code, as it can speed up execution
    by the expense of manual treatment of corner cases. If 'isnan' returns
    assumed result, a user cannot use it in the manual treatment of NaNs
    and has to invent replacements, like making the check using integer
    operations. There is a discussion in https://reviews.llvm.org/D18513#387418,
    which also expresses the opinion, that limitations imposed by
    '-ffast-math' should be applied only to 'math' functions but not to
    'tests'.

    To overcome these drawbacks, this change introduces a new IR intrinsic
    function 'llvm.isnan', which realizes the check as specified by IEEE-754
    and C standards in target-agnostic way. During IR transformations it
    does not undergo undesirable optimizations. It reaches instruction
    selection, where is lowered in target-dependent way. The lowering can
    vary depending on options like '-ffast-math' or '-ffp-model' so the
    resulting code satisfies requested semantics.

    Differential Revision: https://reviews.llvm.org/D104854
2021-08-06 14:32:27 +07:00
Fangrui Song c38efb4899 [clang] Implement -falign-loops=N (N is a power of 2) for non-LTO
GCC supports multiple forms of -falign-loops=.
-falign-loops= is currently ignored in Clang.

This patch implements the simplest but the most useful form where N is a
power of 2.

The underlying implementation uses a `llvm::TargetOptions` option for now.
Bitcode generation ignores this option.

Differential Revision: https://reviews.llvm.org/D106701
2021-08-05 12:17:50 -07:00
Anshil Gandhi 39dac1f7f6 [clang] Add clang builtins support for gfx90a
Implement target builtins for gfx90a including fadd64, fadd32, add2h,
max and min on various global, flat and ds address spaces for which
intrinsics are implemented.

Differential Revision: https://reviews.llvm.org/D106909
2021-08-05 02:08:06 -06:00
Pavel Asyutchenko 7df405e079 Apply -fmacro-prefix-map to __builtin_FILE()
This matches the behavior of GCC.
Patch does not change remapping logic itself, so adding one simple smoke test should be enough.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D107393
2021-08-04 16:42:14 -07:00
Bradley Smith e57e1e4e00 [clang][AArch64][SVE] Avoid going through memory for fixed/scalable predicate casts
For fixed SVE types, predicates are represented using vectors of i8,
where as for scalable types they are represented using vectors of i1. We
can avoid going through memory for casts between these by bitcasting the
i1 scalable vectors to/from a scalable i8 vector of matching size, which
can then use the existing vector insert/extract logic.

Differential Revision: https://reviews.llvm.org/D106860
2021-08-04 16:10:37 +00:00
Serge Pavlov 0c28a7c990 Revert "Introduce intrinsic llvm.isnan"
This reverts commit 16ff91ebcc.
Several errors were reported mainly test-suite execution time. Reverted
for investigation.
2021-08-04 17:18:15 +07:00
Serge Pavlov 16ff91ebcc Introduce intrinsic llvm.isnan
Clang has builtin function '__builtin_isnan', which implements C
library function 'isnan'. This function now is implemented entirely in
clang codegen, which expands the function into set of IR operations.
There are three mechanisms by which the expansion can be made.

* The most common mechanism is using an unordered comparison made by
  instruction 'fcmp uno'. This simple solution is target-independent
  and works well in most cases. It however is not suitable if floating
  point exceptions are tracked. Corresponding IEEE 754 operation and C
  function must never raise FP exception, even if the argument is a
  signaling NaN. Compare instructions usually does not have such
  property, they raise 'invalid' exception in such case. So this
  mechanism is unsuitable when exception behavior is strict. In
  particular it could result in unexpected trapping if argument is SNaN.

* Another solution was implemented in https://reviews.llvm.org/D95948.
  It is used in the cases when raising FP exceptions by 'isnan' is not
  allowed. This solution implements 'isnan' using integer operations.
  It solves the problem of exceptions, but offers one solution for all
  targets, however some can do the check in more efficient way.

* Solution implemented by https://reviews.llvm.org/D96568 introduced a
  hook 'clang::TargetCodeGenInfo::testFPKind', which injects target
  specific code into IR. Now only SystemZ implements this hook and it
  generates a call to target specific intrinsic function.

Although these mechanisms allow to implement 'isnan' with enough
efficiency, expanding 'isnan' in clang has drawbacks:

* The operation 'isnan' is hidden behind generic integer operations or
  target-specific intrinsics. It complicates analysis and can prevent
  some optimizations.

* IR can be created by tools other than clang, in this case treatment
  of 'isnan' has to be duplicated in that tool.

Another issue with the current implementation of 'isnan' comes from the
use of options '-ffast-math' or '-fno-honor-nans'. If such option is
specified, 'fcmp uno' may be optimized to 'false'. It is valid
optimization in general, but it results in 'isnan' always returning
'false'. For example, in some libc++ implementations the following code
returns 'false':

    std::isnan(std::numeric_limits<float>::quiet_NaN())

The options '-ffast-math' and '-fno-honor-nans' imply that FP operation
operands are never NaNs. This assumption however should not be applied
to the functions that check FP number properties, including 'isnan'. If
such function returns expected result instead of actually making
checks, it becomes useless in many cases. The option '-ffast-math' is
often used for performance critical code, as it can speed up execution
by the expense of manual treatment of corner cases. If 'isnan' returns
assumed result, a user cannot use it in the manual treatment of NaNs
and has to invent replacements, like making the check using integer
operations. There is a discussion in https://reviews.llvm.org/D18513#387418,
which also expresses the opinion, that limitations imposed by
'-ffast-math' should be applied only to 'math' functions but not to
'tests'.

To overcome these drawbacks, this change introduces a new IR intrinsic
function 'llvm.isnan', which realizes the check as specified by IEEE-754
and C standards in target-agnostic way. During IR transformations it
does not undergo undesirable optimizations. It reaches instruction
selection, where is lowered in target-dependent way. The lowering can
vary depending on options like '-ffast-math' or '-ffp-model' so the
resulting code satisfies requested semantics.

Differential Revision: https://reviews.llvm.org/D104854
2021-08-04 15:27:49 +07:00
Alexandros Lamprineas 29b263a34f [Clang][AArch64] Inline assembly support for the ACLE type 'data512_t'
In LLVM IR terms the ACLE type 'data512_t' is essentially an aggregate
type { [8 x i64] }. When emitting code for inline assembly operands,
clang tries to scalarize aggregate types to an integer of the equivalent
length, otherwise it passes them by-reference. This patch adds a target
hook to tell whether a given inline assembly operand is scalarizable
so that clang can emit code to pass/return it by-value.

Differential Revision: https://reviews.llvm.org/D94098
2021-07-31 09:51:28 +01:00
Tarindu Jayatilaka 7a797b2902 Take OptimizationLevel class out of Pass Builder
Pulled out the OptimizationLevel class from PassBuilder in order to be able to access it from within the PassManager and avoid include conflicts.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D107025
2021-07-29 21:57:23 -07:00
Melanie Blower bc5b5ea037 [clang][patch][FPEnv] Make initialization of C++ globals strictfp aware
@kpn pointed out that the global variable initialization functions didn't
have the "strictfp" metadata set correctly, and @rjmccall said that there
was buggy code in SetFPModel and StartFunction, this patch is to solve
those problems. When Sema creates a FunctionDecl, it sets the
FunctionDeclBits.UsesFPIntrin to "true" if the lexical FP settings
(i.e. a combination of command line options and #pragma float_control
settings) correspond to ConstrainedFP mode. That bit is used when CodeGen
starts codegen for a llvm function, and it translates into the
"strictfp" function attribute. See bugs.llvm.org/show_bug.cgi?id=44571

Reviewed By: Aaron Ballman

Differential Revision: https://reviews.llvm.org/D102343
2021-07-29 12:02:37 -04:00
Kai Luo e4902e69e9 [PowerPC] Fix return type of XL compat CAS
`__compare_and_swap*` should return `i32` rather than `i1`.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D107077
2021-07-29 14:49:26 +00:00
Fangrui Song 828767f325 COFF/ELF: Place llvm.global_ctors elements in llvm.used if comdat is used
On ELF, an SHT_INIT_ARRAY outside a section group is a GC root. The current
codegen abuses SHT_INIT_ARRAY in a section group to mean a GC root.

On PE/COFF, the dynamic initialization for `__declspec(selectany)` in a comdat
can be garbage collected by `-opt:ref`.

Call `addUsedGlobal` for the two cases to fix the abuse/bug.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D106925
2021-07-28 11:44:19 -07:00
Matheus Izvekov 4819b751bd [clang] NFC: change uses of `Expr->getValueKind` into `is?Value`
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D100733
2021-07-28 03:09:31 +02:00
Jose M Monsalve Diaz 0276db1416 [OpenMP] Creating the `omp_target_num_teams` and `omp_target_thread_limit` attributes to outlined functions
The device runtime contains several calls to __kmpc_get_hardware_num_threads_in_block
and __kmpc_get_hardware_num_blocks. If the thread_limit and the num_teams are constant,
these calls can be folded to the constant value.

In commit D106033 we have the optimization phase. This commit adds the attributes to
the outlined function for the grid size. the two attributes are `omp_target_num_teams` and
`omp_target_thread_limit`. These values are added as long as they are constant.

Two functions are created `getNumThreadsExprForTargetDirective` and
`getNumTeamsExprForTargetDirective`. The original functions `emitNumTeamsForTargetDirective`
 and `emitNumThreadsForTargetDirective` identify the expresion and emit the code.
However, for the Device version of the outlined function, we cannot emit anything.
Therefore, this is a first attempt to separate emision of code from deduction of the
values.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106298
2021-07-27 17:21:04 -04:00
Thomas Lively 33786576fd [WebAssembly] Codegen for extmul SIMD instructions
Replace the clang builtins and LLVM intrinsics for the SIMD extmul instructions
with normal codegen patterns.

Differential Revision: https://reviews.llvm.org/D106724
2021-07-27 08:41:30 -07:00
Reid Kleckner f9f56488e0 [DebugInfo] Use per-enumerator signedness for DIEnumerator
Allegedly the DWARF backend ignores this field of DIEnumerator, but we
set it nonetheless in case we decide to use it in the future.
Alternatively, we could remove it, but it is simpler to pass down the
signed bit as it is in the AST for now.

Implemented to address comments on D106585
2021-07-26 16:14:28 -07:00
Joseph Huber af000197c4 [OpenMP] Always inline the OpenMP outlined function
This patch adds the always inline attribute to the outlined functions generated
by OpenMP regions. Because there is only a single instance of this function and
it always has internal linkage it is safe to inline in every instance it is
created. This could potentially lead to performance degredation due to
inflated register counts in the parallel region.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106799
2021-07-26 17:27:59 -04:00
Reid Kleckner 3230493299 Fix clang debug info irgen of i128 enums
DIEnumerator stores an APInt as of April 2020, so now we don't need to
truncate the enumerator value to 64 bits. Fixes assertions during IRGen.

Split from D105320, thanks to Matheus Izvekov for the test case and
report.

Differential Revision: https://reviews.llvm.org/D106585
2021-07-26 12:25:29 -07:00
Nemanja Ivanovic 1c50a5da36 [PowerPC] Implement partial vector ld/st builtins for XL compatibility
XL provides functions __vec_ldrmb/__vec_strmb for loading/storing a
sequence of 1 to 16 bytes in big endian order, right justified in the
vector register (regardless of target endianness).
This is equivalent to vec_xl_len_r/vec_xst_len_r which are only
available on Power9.

This patch simply uses the Power9 functions when compiled for Power9,
but provides a more general implementation for Power8.

Differential revision: https://reviews.llvm.org/D106757
2021-07-26 13:19:52 -05:00
Shilei Tian 3274cdc83e [Clang][OpenMP] Remove the mandatory flush for capture for OpenMP 5.1
In OpenMP 5.1:
> If the `write` or `update` clause is specifieded, the atomic operation is not an atomic conditional update for which the comparison fails, and the effective memory ordering is `release`, `acq_rel`, or `seq_cst`, the strong flush on entry to the atomic operation is also a release flush. If the `read` or `update` clause is specified and the effective memory ordering is `acquire`, `acq_rel`, or `seq_cst` then the strong flush on exit from the atomic operation is also an acquire flush.

In OpenMP 5.0:
> If the `write`, `update`, or **`capture`** clause is specified and the `release`, `acq_rel`, or `seq_cst` clause is specified then the strong flush on entry to the atomic operation is also a release flush. If the `read` or `capture` clause is specified and the `acquire`, `acq_rel`, or `seq_cst` clause is specified then the strong flush on exit from the atomic operation is also an acquire flush.

From my understanding, in OpenMP 5.1, `capture` is removed from the requirement for flush, therefore we don't have to enforce it.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D100768
2021-07-26 11:00:44 -04:00
Thomas Lively 85157c0079 [WebAssembly] Codegen for pmin and pmax
Replace the clang builtins and LLVM intrinsics for {f32x4,f64x2}.{pmin,pmax}
with standard codegen patterns. Since wasm_simd128.h uses an integer vector as
the standard single vector type, the IR for the pmin and pmax intrinsic
functions contains bitcasts that would not be there otherwise. Add extra codegen
patterns that can still select the pmin and pmax instructions in the presence of
these bitcasts.

Differential Revision: https://reviews.llvm.org/D106612
2021-07-23 14:49:21 -07:00
Fangrui Song 7290ddd6b1 Revert "[clang] -falign-loops="
This reverts commit 42896eeed9.

Unfinished. Accidentally pushed when reverting a clangd commit.
2021-07-23 09:58:35 -07:00
Fangrui Song 42896eeed9 [clang] -falign-loops= 2021-07-23 09:50:43 -07:00
Yaxun (Sam) Liu 44dbbe6106 [HIP] Preserve ASAN bitcode library functions
Address sanitizer passes may generate call of ASAN bitcode library
functions after bitcode linking in lld, therefore lld cannot add
those symbols since it does not know they will be used later.

To solve this issue, clang emits a reference to a bicode library
function which calls all ASAN functions which need to be
preserved. This basically force all ASAN functions to be
linked in.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D106315
2021-07-23 10:35:52 -04:00
Kai Luo e4ed93cb25 [PowerPC] Implement XL compatible behavior of __compare_and_swap
According to https://www.ibm.com/docs/en/xl-c-and-cpp-aix/16.1?topic=functions-compare-swap-compare-swaplp
XL's `__compare_and_swap` has a weird behavior that

> In either case, the contents of the memory location specified by addr are copied into the memory location specified by old_val_addr.

(unlike c11 `atomic_compare_exchange` specified in http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf)

This patch let clang's implementation follow this behavior.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D106344
2021-07-23 01:16:02 +00:00
Florian Mayer 96c63492cb [hwasan] Use stack safety analysis.
This avoids unnecessary instrumentation.

Reviewed By: eugenis, vitalybuka

Differential Revision: https://reviews.llvm.org/D105703
2021-07-22 16:20:27 -07:00
Alexey Bataev b88a68c45e [OPENMP]Fix PR49787: Codegen for calling __tgt_target_teams_nowait_mapper has too few arguments.
Added missed arguments in
__tgt_target_teams_nowait_mapper/__tgt_target_nowait_mapper runtime
functions calls.

Differential Revision: https://reviews.llvm.org/D106542
2021-07-22 08:44:37 -07:00
Alexey Bataev f828f0a90f Revert "[OPENMP]Fix PR49787: Codegen for calling __tgt_target_teams_nowait_mapper has too few arguments."
This reverts commit b455f7f225 to fix
buildbots.
2021-07-22 08:06:29 -07:00
Alexey Bataev b455f7f225 [OPENMP]Fix PR49787: Codegen for calling __tgt_target_teams_nowait_mapper has too few arguments.
Added missed arguments in
__tgt_target_teams_nowait_mapper/__tgt_target_nowait_mapper runtime
functions calls.

Differential Revision: https://reviews.llvm.org/D106542
2021-07-22 07:53:37 -07:00
Florian Mayer 789a4a2e5c Revert "[hwasan] Use stack safety analysis."
This reverts commit bde9415fef.
2021-07-22 12:16:16 +01:00
Florian Mayer bde9415fef [hwasan] Use stack safety analysis.
This avoids unnecessary instrumentation.

Reviewed By: eugenis, vitalybuka

Differential Revision: https://reviews.llvm.org/D105703
2021-07-22 12:04:54 +01:00
Simon Tatham bd41136746 [clang] Use i64 for the !srcloc metadata on asm IR nodes.
This is part of a patch series working towards the ability to make
SourceLocation into a 64-bit type to handle larger translation units.

!srcloc is generated in clang codegen, and pulled back out by llvm
functions like AsmPrinter::emitInlineAsm that need to report errors in
the inline asm. From there it goes to LLVMContext::emitError, is
stored in DiagnosticInfoInlineAsm, and ends up back in clang, at
BackendConsumer::InlineAsmDiagHandler(), which reconstitutes a true
clang::SourceLocation from the integer cookie.

Throughout this code path, it's now 64-bit rather than 32, which means
that if SourceLocation is expanded to a 64-bit type, this error report
won't lose half of the data.

The compiler will tolerate both of i32 and i64 !srcloc metadata in
input IR without faulting. Test added in llvm/MC. (The semantic
accuracy of the metadata is another matter, but I don't know of any
situation where that matters: if you're reading an IR file written by
a previous run of clang, you don't have the SourceManager that can
relate those source locations back to the original source files.)

Original version of the patch by Mikhail Maltsev.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D105491
2021-07-22 10:24:52 +01:00
Joseph Huber 754eb1c210 [OpenMP] Change `__kmpc_free_shared` to include the paired allocation size
This patch changes `__kmpc_free_shared` to take an additional argument
corresponding to the associated allocation's size. This makes it easier to
implement the allocator in the runtime.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106496
2021-07-21 20:56:21 -04:00
Thomas Lively 8af333cf1a [WebAssembly] Replace @llvm.wasm.popcnt with @llvm.ctpop.v16i8
Use the standard target-independent intrinsic to take advantage of standard
optimizations.

Differential Revision: https://reviews.llvm.org/D106506
2021-07-21 16:45:54 -07:00
Thomas Lively db7efcab7d [WebAssembly] Remove clang builtins for extract_lane and replace_lane
These builtins were added to capture the fact that the underlying Wasm
instructions return i32s and implicitly sign or zero extend the extracted lanes
in the case of the i8x16 and i16x8 variants. But we do sufficient optimizations
during code gen that these low-level details do not need to be exposed to users.

This commit replaces the use of the builtins in wasm_simd128.h with normal
target-independent vector code. As a result, we can switch the relevant
intrinsics to use functions rather than macros and can use more user-friendly
return types rather than trying to precisely expose the underlying Wasm types.
Note, however, that the generated LLVM IR is no different after this change.

Differential Revision: https://reviews.llvm.org/D106500
2021-07-21 16:11:00 -07:00
Thomas Lively 1a57ee1276 [WebAssembly] Codegen for v128.load{32,64}_zero
Replace the experimental clang builtins and LLVM intrinsics for these
instructions with normal instruction selection patterns. The wasm_simd128.h
intrinsics header was already using portable code for the corresponding
intrinsics, so now it produces the correct instructions.

Differential Revision: https://reviews.llvm.org/D106400
2021-07-21 09:02:12 -07:00
Quinn Pham e002d251dd [PowerPC] Floating Point Builtins for XL Compat.
This patch is in a series of patches to provide
builtins for compatibility with the XL compiler.
This patch adds builtins related to floating point
operations

Reviewed By: #powerpc, nemanjai, amyk, NeHuang

Differential Revision: https://reviews.llvm.org/D103986
2021-07-21 08:33:39 -05:00
Simon Tatham 21401a7262 [clang] Introduce SourceLocation::[U]IntTy typedefs.
This is part of a patch series working towards the ability to make
SourceLocation into a 64-bit type to handle larger translation units.

NFC: this patch introduces typedefs for the integer type used by
SourceLocation and makes all the boring changes to use the typedefs
everywhere, but for the moment, they are unconditionally defined to
uint32_t.

Patch originally by Mikhail Maltsev.

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D105492
2021-07-21 10:45:46 +01:00
Albion Fung 2fd1520247 [PowerPC] Implemented mtmsr, mfspr, mtspr Builtins
Implemented builtins for mtmsr, mfspr, mtspr on PowerPC;
the patch is intended for XL Compatibility.

Differential revision: https://reviews.llvm.org/D106130
2021-07-20 17:51:00 -05:00
Albion Fung 3434ac9e39 [PowerPC] Store, load, move from and to registers related builtins
This patch implements store, load, move from and to registers related
builtins, as well as the builtin for stfiw. The patch aims to provide
feature parady with xlC on AIX.

Differential revision: https://reviews.llvm.org/D105946
2021-07-20 15:46:14 -05:00
Victor Huang 1a762f93f8 [PowerPC] Add PowerPC cmpb builtin and emit target indepedent code for XL compatibility
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch add the builtin and emit target independent
code for __cmpb.

Reviewed By: nemanjai, #powerpc

Differential revision: https://reviews.llvm.org/D105194
2021-07-20 13:06:22 -05:00
Melanie Blower ea864c9933 [clang][patch][NFC] Refactor calculation of FunctionDecl to avoid duplicate code 2021-07-20 11:01:22 -04:00
Quinn Pham fd855c24c7 [PowerPC] Restore FastMathFlags of Builder for Vector FDiv Builtins
This patch fixes `__builtin_ppc_recipdivf`, `__builtin_ppc_recipdivd`,
`__builtin_ppc_rsqrtf`, and `__builtin_ppc_rsqrtd`. FastMathFlags are
set to fast immediately before emitting these builtins. Now the flags
are restored to their previous values after the builtins are emitted.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D105984
2021-07-20 09:41:00 -05:00
Stefan Pintilie 02cd937945 [PowerPC][Builtins] Added a number of builtins for compatibility with XL.
Added a number of different builtins that exist in the XL compiler. Most of
these builtins already exist in clang under a different name.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D104386
2021-07-20 08:57:55 -05:00
Florian Mayer 5f08219322 Revert "[hwasan] Use stack safety analysis."
This reverts commit e9c63ed10b.
2021-07-20 10:36:46 +01:00
Florian Mayer e9c63ed10b [hwasan] Use stack safety analysis.
This avoids unnecessary instrumentation.

Reviewed By: eugenis, vitalybuka

Differential Revision: https://reviews.llvm.org/D105703
2021-07-20 10:06:35 +01:00
Quinn Pham 0268e123be [PowerPC] swdiv_nochk Builtins for XL Compat
This patch is in a series of patches to provide builtins for
compatibility with the XL compiler. This patch adds software divide
builtins with no checking. These builtins are each emitted as a fast
fdiv.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D106150
2021-07-19 16:51:10 -05:00
Giorgis Georgakoudis fb0cf01795 Revert "[OpenMP] Codegen aggregate for outlined function captures"
This reverts commit e9c7291cb2.

Fix failing tests
2021-07-19 07:54:26 -07:00
Jamie Schmeiser 73840f9f81 thread_local support for AIX
Summary:
The AIX linker will produce errors on unresolved weak symbols.  Change the
generated code to not check for the initialization function but just call
it and ensure that it always exists.  Also, the AIX atexit routine has a
different name (and signature) so call it correctly.  Update the lit tests
to test on AIX appropriately.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: hubert.reinterpretcast (Hubert Tong)
Differential Revision: https://reviews.llvm.org/D104420
2021-07-19 10:03:22 -04:00
Florian Mayer 807d50100c Revert "[hwasan] Use stack safety analysis."
This reverts commit 12268fe14a.
2021-07-19 12:08:32 +01:00
Florian Mayer 12268fe14a [hwasan] Use stack safety analysis.
This avoids unnecessary instrumentation.

Reviewed By: eugenis, vitalybuka

Differential Revision: https://reviews.llvm.org/D105703
2021-07-19 11:54:44 +01:00
David Blaikie dac582ad3a DebugInfo: Name class templates with default arguments consistently (both direct naming, and as a template argument for a function template)
It's noteworthy that GCC has the same bug here, which is a bit
surprising. Both Clang and GCC's bug is only for function template
arguments that are themselves templates with default template arguments
(f1<t1<int[, missing_default_here]>>). Probably because function name
matching isn't generally necessary - whereas type matching is necessary
for DWARF consumers to associate declarations and definitions across
translation units, so the bug's been addressed there already - but
continued to exist for function templates since it's fairly benign
there.

I came across this while working on a change that could reconstitute
these pretty printed names based on the rest of the DWARF, reducing the
size of the DWARF by not having to encode all the template parameters in
the name string. That reconstitution code can't tell the difference
between a defaulted argument or not, so couldn't create the current
buggy-ish output.

Making the names more consistent between direct and indirect references,
and between function and class templates seems all to the good.

(I fixed the function template version of this a few years back in
9fdd09a4cc - clearly I should've looked
more closely and generalized the code better so it only had to be fixed
once - well, doing that here now)
2021-07-17 23:58:15 -07:00
Nikita Popov 2c68ecccc9 [OpaquePtr] Remove uses of CreateGEP() without element type
Remove uses of to-be-deprecated API. In cases where the correct
element type was not immediately obvious to me, fall back to
explicit getPointerElementType().
2021-07-17 22:56:27 +02:00
Nikita Popov 6225d0cc6e [OpaquePtr] Remove uses of CreateInBoundsGEP() without element type
Remove uses of to-be-deprecated API.

Unfortunately this one mostly just makes the use of
getPointerElementType() explicit, as the correct type to use
wasn't immediately available (deriving it from QualType is left
as an excercise to the reader).
2021-07-17 21:27:16 +02:00
Nikita Popov 4ace6008f2 [OpaquePtr] Remove uses of CreateStructGEP() without element type
Remove uses of to-be-deprecated API.
2021-07-17 18:48:21 +02:00
Nikita Popov 6d3e7c783b [OpaquePtr] Remove uses of CreateConstGEP1_32() without element type
Remove uses of to-be-deprecated API. I've fallen back to calling
getPointerElementType() in some cases where the correct type wasn't
immediately obvious to me.
2021-07-17 18:32:36 +02:00
Nikita Popov 5071360eb1 [OpaquePtr] Remove uses of CGF.Builder.CreateConstInBoundsGEP1_64() without type
Remove uses of to-be-deprecated API.
2021-07-17 17:07:46 +02:00
Nikita Popov 357756ecf6 [OpaquePtr] Remove uses of CreateConstGEP1_64() without element type
Remove uses of to-be-deprecated API.
2021-07-17 16:43:20 +02:00
Nikita Popov 4737eebc0d [OpaquePtr] Remove uses of CreateConstInBoundsGEP2_64() without type
Remove uses of to-be-deprecated API.
2021-07-17 16:42:10 +02:00
Giorgis Georgakoudis e9c7291cb2 [OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3)  forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D102107
2021-07-16 23:27:44 -07:00
Nemanja Ivanovic 35a18a981f [PowerPC] Implement intrinsics for mtfsf[i]
This provides intrinsics for emitting instructions that set the FPSCR (`mtfsf/mtfsfi`).

The patch also conservatively marks the rounding mode as an implicit def for both since they both may set the rounding mode depending on the operands.

Reviewed By: #powerpc, qiucf

Differential Revision: https://reviews.llvm.org/D105957
2021-07-16 16:26:11 -05:00
Victor Huang 4eb107ccba [PowerPC] Add PowerPC population count, reversed load and store related builtins and instrinsics for XL compatibility
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and instrisics for population
count, reversed load and store related operations.

Reviewed By: nemanjai, #powerpc

Differential revision: https://reviews.llvm.org/D106021
2021-07-15 17:23:56 -05:00
Artem Belevich d774b4aa5e [NVPTX, CUDA] Add .and.popc variant of the b1 MMA instruction.
That should allow clang to compile mma.h from CUDA-11.3.

Differential Revision: https://reviews.llvm.org/D105384
2021-07-15 12:02:09 -07:00
Quinn Pham de3956605a [PowerPC] Fix popcntb XL Compat Builtin for 32bit
This patch implements the `__popcntb` XL compatibility builtin for 32bit in the frontend and backend. This patch also updates tests for `__popcntb` and other XL Compat sync related builtins.

Reviewed By: #powerpc, nemanjai, amyk

Differential Revision: https://reviews.llvm.org/D105360
2021-07-15 13:19:47 -05:00
Victor Huang d40e8091bd [PowerPC] Add PowerPC rotate related builtins and emit target independent code for XL compatibility
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and emit target independent
code for rotate related operations.

Reviewed By: nemanjai, #powerpc

Differential revision: https://reviews.llvm.org/D104744
2021-07-15 10:23:54 -05:00
Chuanqi Xu 8a1727ba51 [Coroutines] Run coroutine passes by default
This patch make coroutine passes run by default in LLVM pipeline. Now
the clang and opt could handle IR inputs containing coroutine intrinsics
without special options.
It should be fine. On the one hand, the coroutine passes seems to be stable
since there are already many projects using coroutine feature.
On the other hand, the coroutine passes should do nothing for IR who doesn't
contain coroutine intrinsic.

Test Plan: check-llvm

Reviewed by: lxfind, aeubanks

Differential Revision: https://reviews.llvm.org/D105877
2021-07-15 14:33:40 +08:00
Thomas Lively 4a4229f70f [WebAssembly] Codegen for v128.storeX_lane instructions
Replace the experimental clang builtins and LLVM intrinsics for these
instructions with normal codegen patterns. Resolves PR50435.

Differential Revision: https://reviews.llvm.org/D106019
2021-07-14 16:15:25 -07:00
Thomas Lively 970e090010 [WebAssembly] Codegen for v128.loadX_lane instructions
Replace the experimental clang builtin and LLVM intrinsics for these
instructions with normal codegen patterns. Resolves PR50433.

Differential Revision: https://reviews.llvm.org/D105950
2021-07-14 11:31:53 -07:00
Albion Fung f1aca5ac96 [PowerPC] Fix L[D|W]ARX Implementation
LDARX and LWARX sometimes gets optimized out by the compiler
when it is critical to the correctness of the code. This inline asm generation
ensures that it preserved.

Differential Revision: https://reviews.llvm.org/D105754
2021-07-13 11:02:07 -05:00
Dave MacLachlan 45ffe6341d [clang/objc] Optimize getters for non-atomic, copied properties
Properties that were declared `@property(copy, nonatomic) id foo` make an
unnecessary call to objc_get_property().  This call can be replaced with a
direct access to the backing variable identical to how a `@property(nonatomic)
id foo` would do it.

This reduces codegen by 4 bytes (x86_64/arm64) and removes a cross linkage unit
function call per property declared as copy/nonatomic.

Differential Revision: https://reviews.llvm.org/D105311
2021-07-13 09:22:13 -04:00
Thomas Lively cbabfc63b1 [WebAssembly] Custom combines for f32x4.demote_zero_f64x2
Replace the clang builtin function and LLVM intrinsic for
f32x4.demote_zero_f64x2 with combines from normal SDNodes. Also add missing
combines for i32x4.trunc_sat_zero_f64x2_{s,u}, which share the same pattern.

Differential Revision: https://reviews.llvm.org/D105755
2021-07-12 10:32:18 -07:00
Johannes Doerfert e2cfbfcc0c [OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform
interface for a kernel to set up the device runtime. The OMPIRBuilder is
used for reuse in Flang. A custom state machine will be generated in the
follow up patch.

The "surplus" threads of the "master warp" will not exit early anymore
so we need to use non-aligned barriers. The new runtime will not have an
extra warp but also require these non-aligned barriers.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

This was in parts extracted from D59319.

Reviewed By: ABataev, JonChesterfield

Differential Revision: https://reviews.llvm.org/D101976
2021-07-10 17:53:56 -05:00
Nico Weber d3e7491333 Revert Attributor patch series
Broke check-clang, see https://reviews.llvm.org/D102307#2869065
Ran `git revert -n ebbe149a6f08535ede848a531a601ae6591cfbc5..269416d41908bb670f67af689155d5ab8eea689a`
2021-07-10 16:15:55 -04:00
Johannes Doerfert 1d5711c3ee [OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform
interface for a kernel to set up the device runtime. The OMPIRBuilder is
used for reuse in Flang. A custom state machine will be generated in the
follow up patch.

The "surplus" threads of the "master warp" will not exit early anymore
so we need to use non-aligned barriers. The new runtime will not have an
extra warp but also require these non-aligned barriers.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

This was in parts extracted from D59319.

Reviewed By: ABataev, JonChesterfield

Differential Revision: https://reviews.llvm.org/D101976
2021-07-10 12:32:50 -05:00
Thomas Lively e5220104d0 [WebAssembly] Custom combines for f64x2.promote_low_f32x4
Replace the clang builtin function and LLVM intrinsic previously used to select
the f64x2.promote_low_f32x4 instruction with custom combines from standard
SelectionDAG nodes. Implement the new combines to share code with the similar
combines for f64x2.convert_low_i32x4_{s,u}. Resolves PR50232.

Differential Revision: https://reviews.llvm.org/D105675
2021-07-09 18:59:29 -07:00
Alexey Bataev ab8989ab87 [OPENMP]Fix overlapped mapping for dereferenced pointer members.
If the base is used in a map clause and later we have a memberexpr with
this base, and the member is a pointer, and this pointer is dereferenced
anyhow (subscript, array section, dereference, etc.), such components
should be considered as overlapped, otherwise it may lead to incorrect
size computations, since we try to map a pointee as a part of the whole
struct, which is not true for the pointer members.

Differential Revision: https://reviews.llvm.org/D105562
2021-07-09 12:51:26 -07:00
David Blaikie 768e3af634 PR51034: Debug Info: Remove 'prototyped' from K&R function declarations
Regression caused by 6c9559b67b.
2021-07-09 12:07:36 -07:00
Varun Gandhi 92dcb1d2db [Clang] Introduce Swift async calling convention.
This change is intended as initial setup. The plan is to add
more semantic checks later. I plan to update the documentation
as more semantic checks are added (instead of documenting the
details up front). Most of the code closely mirrors that for
the Swift calling convention. Three places are marked as
[FIXME: swiftasynccc]; those will be addressed once the
corresponding convention is introduced in LLVM.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D95561
2021-07-09 11:50:10 -07:00
David Blaikie 1def2579e1 PR51018: Remove explicit conversions from SmallString to StringRef to future-proof against C++23
C++23 will make these conversions ambiguous - so fix them to make the
codebase forward-compatible with C++23 (& a follow-up change I've made
will make this ambiguous/invalid even in <C++23 so we don't regress
this & it generally improves the code anyway)
2021-07-08 13:37:57 -07:00
Nikita Popov a0ea367562 [CodeGen] Avoid nullptr arg to CreateStructGEP (NFC)
For now just make the getPointerElementType() explicit.
2021-07-08 21:21:43 +02:00
Alexey Bataev f57d396dca [OPENMP]Do no privatize const firstprivates in target regions.
No need to emit private copyfor firstprivate constants in target
regions, we can use the original copy instead.

Differential Revision: https://reviews.llvm.org/D105647
2021-07-08 11:55:37 -07:00
Nikita Popov 693251fb2f [CodeGen] Avoid CreateGEP with nullptr type (NFC)
In preparation for dropping support for it. I've replaced it with
a proper type where the correct type was obvious and left an
explicit getPointerElementType() where it wasn't.
2021-07-08 20:38:54 +02:00
Alexey Bataev b3c80dd894 [OPENMP]Remove const firstprivate allocation as a variable in a constant space.
Current implementation is not compatible with asynchronous target
regions, need to remove it.

Differential Revision: https://reviews.llvm.org/D105375
2021-07-07 05:56:48 -07:00
Hsiangkai Wang 593bf9b4de [Clang][RISCV] Implement vlseg and vlsegff.
Differential Revision: https://reviews.llvm.org/D103527
2021-07-07 13:44:40 +08:00
David Blaikie 6c9559b67b DebugInfo: Mangle K&R declarations for debug info linkage names
This fixes a gap in the `overloadable` attribute support (K&R declared
functions would get mangled symbol names, but that name wouldn't be
represented in the debug info linkage name field for the function) and
in -funique-internal-linkage-names (this came up in review discussion on
D98799) where K&R static declarations would not get the uniqued linkage
names.
2021-07-06 16:28:02 -07:00