Commit Graph

13703 Commits

Author SHA1 Message Date
Fangrui Song b5ef137c11 [gcov] Increment counters with atomicrmw if -fsanitize=thread
Without this patch, `clang --coverage -fsanitize=thread` may fail spuriously
because non-atomic counter increments can be detected as data races.
2020-08-28 16:32:35 -07:00
Cullen Rhodes 2ddf795e8c Reland "[CodeGen][AArch64] Support arm_sve_vector_bits attribute"
This relands D85743 with a fix for test
CodeGen/attr-arm-sve-vector-bits-call.c that disables the new pass
manager with '-fno-experimental-new-pass-manager'. Test was failing due
to IR differences with the new pass manager which broke the Fuchsia
builder [1]. Reverted in 2e7041f.

[1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10375

Original summary:

This patch implements codegen for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define vector-length-specific (VLS)
versions of existing vector-length-agnostic (VLA) types.

VLSTs are represented as VectorType in the AST and fixed-length vectors
in the IR everywhere except in function args/return. Implemented in this
patch is codegen support for the following:

  * Implicit casting between VLA <-> VLS types.
  * Coercion of VLS types in function args/return.
  * Mangling of VLS types.

Casting is handled by the CK_BitCast operation, which has been extended
to support the two new vector kinds for fixed-length SVE predicate and
data vectors, where the cast is implemented through memory rather than a
bitcast which is unsupported. Implementing this as a normal bitcast
would require relaxing checks in LLVM to allow bitcasting between
scalable and fixed types. Another option was adding target-specific
intrinsics, although codegen support would need to be added for these
intrinsics. Given this, casting through memory seemed like the best
approach as it's supported today and existing optimisations may remove
unnecessary loads/stores, although there is room for improvement here.

Coercion of VLSTs in function args/return from fixed to scalable is
implemented through the AArch64 ABI in TargetInfo.

The VLA and VLS types are defined by the ACLE to map to the same
machine-level SVE vectors. VLS types are mangled in the same way as:

  __SVE_VLS<typename, unsigned>

where the first argument is the underlying variable-length type and the
second argument is the SVE vector length in bits. For example:

  #if __ARM_FEATURE_SVE_BITS==512
  // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
  typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
  // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
  typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));
  #endif

The latest ACLE specification (00bet5) does not contain details of this
mangling scheme, it will be specified in the next revision.  The
mangling scheme is otherwise defined in the appendices to the Procedure
Call Standard for the Arm Architecture, see [2] for more information.

[1] https://developer.arm.com/documentation/100987/latest
[2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85743
2020-08-28 15:57:09 +00:00
David Sherwood f4257c5832 [SVE] Make ElementCount members private
This patch changes ElementCount so that the Min and Scalable
members are now private and can only be accessed via the get
functions getKnownMinValue() and isScalable(). In addition I've
added some other member functions for more commonly used operations.
Hopefully this makes the class more useful and will reduce the
need for calling getKnownMinValue().

Differential Revision: https://reviews.llvm.org/D86065
2020-08-28 14:43:53 +01:00
JF Bastien 82d29b397b Add an unsigned shift base sanitizer
It's not undefined behavior for an unsigned left shift to overflow (i.e. to
shift bits out), but it has been the source of bugs and exploits in certain
codebases in the past. As we do in other parts of UBSan, this patch adds a
dynamic checker which acts beyond UBSan and checks other sources of errors. The
option is enabled as part of -fsanitize=integer.

The flag is named: -fsanitize=unsigned-shift-base
This matches shift-base and shift-exponent flags.

<rdar://problem/46129047>

Differential Revision: https://reviews.llvm.org/D86000
2020-08-27 19:50:10 -07:00
Cullen Rhodes 2e7041fdc2 Revert "[CodeGen][AArch64] Support arm_sve_vector_bits attribute"
Test CodeGen/attr-arm-sve-vector-bits-call.c is failing on some builders
[1][2]. Reverting whilst I investigate.

[1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10375
[2] https://luci-milo.appspot.com/p/fuchsia/builders/ci/clang-linux-x64/b8870800848452818112

This reverts commit 42587345a3.
2020-08-27 21:31:05 +00:00
Craig Topper 17ceda99d3 [CodeGen] Use an AttrBuilder to bulk remove 'target-cpu', 'target-features', and 'tune-cpu' before re-adding in CodeGenModule::setNonAliasAttributes.
I think the removeAttributes interface should be faster than
calling removeAttribute 3 times.
2020-08-27 12:54:20 -07:00
Mikhail Maltsev ae1396c7d4 [ARM][BFloat16] Change types of some Arm and AArch64 bf16 intrinsics
This patch adjusts the following ARM/AArch64 LLVM IR intrinsics:
- neon_bfmmla
- neon_bfmlalb
- neon_bfmlalt
so that they take and return bf16 and float types. Previously these
intrinsics used <8 x i8> and <4 x i8> vectors (a rudiment from
implementation lacking bf16 IR type).

The neon_vbfdot[q] intrinsics are adjusted similarly. This change
required some additional selection patterns for vbfdot itself and
also for vector shuffles (in a previous patch) because of SelectionDAG
transformations kicking in and mangling the original code.

This patch makes the generated IR cleaner (less useless bitcasts are
produced), but it does not affect the final assembly.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D86146
2020-08-27 18:43:16 +01:00
Teresa Johnson 7ed8124d46 [HeapProf] Clang and LLVM support for heap profiling instrumentation
See RFC for background:
http://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html

Note that the runtime changes will be sent separately (hopefully this
week, need to add some tests).

This patch includes the LLVM pass to instrument memory accesses with
either inline sequences to increment the access count in the shadow
location, or alternatively to call into the runtime. It also changes
calls to memset/memcpy/memmove to the equivalent runtime version.
The pass is modeled on the address sanitizer pass.

The clang changes add the driver option to invoke the new pass, and to
link with the upcoming heap profiling runtime libraries.

Currently there is no attempt to optimize the instrumentation, e.g. to
aggregate updates to the same memory allocation. That will be
implemented as follow on work.

Differential Revision: https://reviews.llvm.org/D85948
2020-08-27 08:50:35 -07:00
Cullen Rhodes 42587345a3 [CodeGen][AArch64] Support arm_sve_vector_bits attribute
This patch implements codegen for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define vector-length-specific (VLS)
versions of existing vector-length-agnostic (VLA) types.

VLSTs are represented as VectorType in the AST and fixed-length vectors
in the IR everywhere except in function args/return. Implemented in this
patch is codegen support for the following:

  * Implicit casting between VLA <-> VLS types.
  * Coercion of VLS types in function args/return.
  * Mangling of VLS types.

Casting is handled by the CK_BitCast operation, which has been extended
to support the two new vector kinds for fixed-length SVE predicate and
data vectors, where the cast is implemented through memory rather than a
bitcast which is unsupported. Implementing this as a normal bitcast
would require relaxing checks in LLVM to allow bitcasting between
scalable and fixed types. Another option was adding target-specific
intrinsics, although codegen support would need to be added for these
intrinsics. Given this, casting through memory seemed like the best
approach as it's supported today and existing optimisations may remove
unnecessary loads/stores, although there is room for improvement here.

Coercion of VLSTs in function args/return from fixed to scalable is
implemented through the AArch64 ABI in TargetInfo.

The VLA and VLS types are defined by the ACLE to map to the same
machine-level SVE vectors. VLS types are mangled in the same way as:

  __SVE_VLS<typename, unsigned>

where the first argument is the underlying variable-length type and the
second argument is the SVE vector length in bits. For example:

  #if __ARM_FEATURE_SVE_BITS==512
  // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
  typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
  // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
  typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));
  #endif

The latest ACLE specification (00bet5) does not contain details of this
mangling scheme, it will be specified in the next revision.  The
mangling scheme is otherwise defined in the appendices to the Procedure
Call Standard for the Arm Architecture, see [2] for more information.

[1] https://developer.arm.com/documentation/100987/latest
[2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85743
2020-08-27 15:11:58 +00:00
Sander de Smalen 4e9b66de3f [AArch64][SVE] Add missing debug info for ACLE types.
This patch adds type information for SVE ACLE vector types,
by describing them as vectors, with a lower bound of 0, and
an upper bound described by a DWARF expression using the
AArch64 Vector Granule register (VG), which contains the
runtime multiple of 64bit granules in an SVE vector.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D86101
2020-08-27 10:56:42 +01:00
Christopher Tetreault 19e883fc59 [SVE] Remove calls to VectorType::getNumElements from clang
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D82582
2020-08-26 11:12:26 -07:00
Zequan Wu 9500a72091 Revert "[Coverage] Enable emitting gap area between macros"
This reverts commit a31c89c1b7.
2020-08-25 15:28:42 -07:00
Amy Huang b1009ee84f Reland "[DebugInfo] Move constructor homing case in shouldOmitDefinition."
For some reason the ctor homing case was before the template
specialization case, and could have returned false too early.
I moved the code out into a separate function to avoid this.

This reverts commit 05777ab941.
2020-08-25 12:36:11 -07:00
Jeremy Morse 121a49d839 [LiveDebugValues] Add switches for using instr-ref variable locations
This patch adds the -Xclang option
"-fexperimental-debug-variable-locations" and same LLVM CodeGen option,
to pick which variable location tracking solution to use.

Right now all the switch does is pick which LiveDebugValues
implementation to use, the normal VarLoc one or the instruction
referencing one in rGae6f78824031. Over time, the aim is to add fragments
of support in aid of the value-tracking RFC:

  http://lists.llvm.org/pipermail/llvm-dev/2020-February/139440.html

also controlled by this command line switch. That will slowly move
variable locations to be defined by an instruction calculating a value,
and a DBG_INSTR_REF instruction referring to that value. Thus, this is
going to grow into a "use the new kind of variable locations" switch,
rather than just "use the new LiveDebugValues implementation".

Differential Revision: https://reviews.llvm.org/D83048
2020-08-25 14:58:48 +01:00
Eric Christopher 05777ab941 Temporarily Revert "[DebugInfo] Move constructor homing case in shouldOmitDefinition."
as it's causing test failures.

This reverts commit 589ce5f705.
2020-08-24 21:51:31 -07:00
Amy Huang 589ce5f705 [DebugInfo] Move constructor homing case in shouldOmitDefinition.
For some reason the ctor homing case was before the template
specialization case, and could have returned false too early.
I moved the code out into a separate function to avoid this.

Also added a run line to the template specialization test. I guess
all the -debug-info-kind=limited tests should still pass with =constructor,
but it's probably unnecessary to test for all of those.

Differential Revision: https://reviews.llvm.org/D86491
2020-08-24 20:17:59 -07:00
Raphael Isemann 105151ca56 Reland "Correctly emit dwoIDs after ASTFileSignature refactoring (D81347)"
The orignal patch with the missing 'REQUIRES: asserts' as there is a debug-only
flag used in the test.

Original summary:

D81347 changes the ASTFileSignature to be an array of 20 uint8_t instead of 5
uint32_t. However, it didn't update the code in ObjectFilePCHContainerOperations
that creates the dwoID in the module from the ASTFileSignature
(`Buffer->Signature` being the array subclass that is now `std::array<uint8_t,
20>` instead of `std::array<uint32_t, 5>`).

```
  uint64_t Signature = [..] (uint64_t)Buffer->Signature[1] << 32 | Buffer->Signature[0]
```

This code works with the old ASTFileSignature (where two uint32_t are enough to
fill the uint64_t), but after the patch this only took two bytes from the
ASTFileSignature and only partly filled the Signature uint64_t.

This caused that the dwoID in the module ref and the dwoID in the actual module
no longer match (which in turns causes that LLDB keeps warning about the dwoID's
not matching when debugging -gmodules-compiled binaries).

This patch just unifies the logic for turning the ASTFileSignature into an
uint64_t which makes the dwoID match again (and should prevent issues like that
in the future).

Reviewed By: aprantl, dang

Differential Revision: https://reviews.llvm.org/D84013
2020-08-24 14:52:53 +02:00
Bevin Hansson 577f8b157a [Fixed Point] Add codegen for fixed-point shifts.
This patch adds codegen to Clang for fixed-point shift
operations.

Reviewed By: leonardchan

Differential Revision: https://reviews.llvm.org/D83294
2020-08-24 14:37:16 +02:00
Bevin Hansson 808ac54645 [Fixed Point] Use FixedPointBuilder to codegen fixed-point IR.
This changes the methods in CGExprScalar to use
FixedPointBuilder to generate IR for fixed-point
conversions and operations.

Since FixedPointBuilder emits padded operations slightly
differently than the original code, some tests change.

Reviewed By: leonardchan

Differential Revision: https://reviews.llvm.org/D86282
2020-08-24 14:37:07 +02:00
Raphael Isemann 2b3074c0d1 Revert "Reland "Correctly emit dwoIDs after ASTFileSignature refactoring (D81347)""
This reverts commit ada2e8ea67. Still breaking
on Fuchsia (and also Fedora) with exit code 1, so back to investigating.
2020-08-24 12:54:25 +02:00
Raphael Isemann ada2e8ea67 Reland "Correctly emit dwoIDs after ASTFileSignature refactoring (D81347)"
This relands D84013 but with a test that relies on less shell features to
hopefully make the test pass on Fuchsia (where the test from the previous patch
version strangely failed with a plain "Exit code 1").

Original summary:

D81347 changes the ASTFileSignature to be an array of 20 uint8_t instead of 5 uint32_t.
However, it didn't update the code in ObjectFilePCHContainerOperations that creates
the dwoID in the module from the ASTFileSignature (`Buffer->Signature` being the
array subclass that is now `std::array<uint8_t, 20>` instead of `std::array<uint32_t, 5>`).

```
  uint64_t Signature = [..] (uint64_t)Buffer->Signature[1] << 32 | Buffer->Signature[0]
```

This code works with the old ASTFileSignature  (where two uint32_t are enough to
fill the uint64_t), but after the patch this only took two bytes from the ASTFileSignature
and only partly filled the Signature uint64_t.

This caused that the dwoID in the module ref and the dwoID in the actual module no
longer match (which in turns causes that LLDB keeps warning about the dwoID's not
matching when debugging -gmodules-compiled binaries).

This patch just unifies the logic for turning the ASTFileSignature into an uint64_t which
makes the dwoID match again (and should prevent issues like that in the future).

Reviewed By: aprantl, dang

Differential Revision: https://reviews.llvm.org/D84013
2020-08-24 11:51:32 +02:00
Raphael Isemann c1dd5df425 Revert "Correctly emit dwoIDs after ASTFileSignature refactoring (D81347)"
This reverts commit a4c3ed42ba.

The test is curiously failing with a plain exit code 1 on Fuchsia.
2020-08-21 16:08:37 +02:00
Raphael Isemann a4c3ed42ba Correctly emit dwoIDs after ASTFileSignature refactoring (D81347)
D81347 changes the ASTFileSignature to be an array of 20 uint8_t instead of 5
uint32_t. However, it didn't update the code in ObjectFilePCHContainerOperations
that creates the dwoID in the module from the ASTFileSignature
(`Buffer->Signature` being the array subclass that is now `std::array<uint8_t,
20>` instead of `std::array<uint32_t, 5>`).

```
  uint64_t Signature = [..] (uint64_t)Buffer->Signature[1] << 32 | Buffer->Signature[0]
```

This code works with the old ASTFileSignature (where two uint32_t are enough to
fill the uint64_t), but after the patch this only took two bytes from the
ASTFileSignature and only partly filled the Signature uint64_t.

This caused that the dwoID in the module ref and the dwoID in the actual module
no longer match (which in turns causes that LLDB keeps warning about the dwoID's
not matching when debugging -gmodules-compiled binaries).

This patch just unifies the logic for turning the ASTFileSignature into an
uint64_t which makes the dwoID match again (and should prevent issues like that
in the future).

Reviewed By: aprantl, dang

Differential Revision: https://reviews.llvm.org/D84013
2020-08-21 15:05:02 +02:00
Bevin Hansson 1a995a0af3 [ADT] Move FixedPoint.h from Clang to LLVM.
This patch moves FixedPointSemantics and APFixedPoint
from Clang to LLVM ADT.

This will make it easier to use the fixed-point
classes in LLVM for constructing an IR builder for
fixed-point and for reusing the APFixedPoint class
for constant evaluation purposes.

RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-August/144025.html

Reviewed By: leonardchan, rjmccall

Differential Revision: https://reviews.llvm.org/D85312
2020-08-20 10:29:45 +02:00
Craig Topper 724f570ad2 [X86] Add support 'tune' in target attribute
This adds parsing and codegen support for tune in target attribute.

I've implemented this so that arch in the target attribute implicitly disables tune from the command line. I'm not sure what gcc does here. But since -march implies -mtune. I assume 'arch' in the target attribute implies tune in the target attribute.

Differential Revision: https://reviews.llvm.org/D86187
2020-08-19 15:58:19 -07:00
Aaron Puchert 916b750a8d [CodeGen] Use existing EmitLambdaVLACapture (NFC) 2020-08-19 15:20:05 +02:00
Sander de Smalen 0353848cc9 [Clang][SVE] NFC: Move info about ACLE types into separate function.
This function returns a struct `BuiltinVectorTypeInfo` that contains
the builtin vector's element type, element count and number of vectors
(used for vector tuples).

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D86100
2020-08-19 11:04:20 +01:00
Craig Topper 4cbceb74bb [X86] Add basic support for -mtune command line option in clang
Building on the backend support from D85165. This parses the command line option in the driver, passes it on to CC1 and adds a function attribute.

-Still need to support tune on the target attribute.
-Need to use "generic" as the tuning by default. But need to change generic in the backend first.
-Need to set tune if march is specified and mtune isn't.
-May need to disable getHostCPUName's ability to guess CPU name from features when it doesn't have a family/model match for mtune=native. That's what gcc appears to do.

Differential Revision: https://reviews.llvm.org/D85384
2020-08-18 15:13:19 -07:00
Zequan Wu 84fffa6728 [Coverage] Adjust skipped regions only if {Prev,Next}TokLoc is in the same file as regions' {start, end}Loc
Fix a bug if {Prev, Next}TokLoc is in different file from skipped regions' {start, end}Loc

Differential Revision: https://reviews.llvm.org/D86116
2020-08-18 13:26:19 -07:00
Eli Friedman 673dbe1b5e [clang codegen] Use IR "align" attribute for static array arguments.
Without the "align" attribute, marking the argument dereferenceable is
basically useless.  See also D80166.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46876 .

Differential Revision: https://reviews.llvm.org/D84992
2020-08-18 12:51:16 -07:00
Johannes Doerfert 95a25e4c32 [OpenMP][FIX] Do not use TBAA in type punning reduction GPU code PR46156
When we implement OpenMP GPU reductions we use type punning a lot during
the shuffle and reduce operations. This is not always compatible with
language rules on aliasing. So far we generated TBAA which later allowed
to remove some of the reduce code as accesses and initialization were
"known to not alias". With this patch we avoid TBAA in this step,
hopefully for all accesses that we need to.

Verified on the reproducer of PR46156 and QMCPack.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D86037
2020-08-16 14:38:31 -05:00
Gui Andrade 909a851dbf [CGAtomic] Mark atomic libcall functions `nounwind`
These functions won't ever unwind. This is useful for MemorySanitizer
as it simplifies handling __atomic_load in particular.

Differential Revision: https://reviews.llvm.org/D85573
2020-08-14 07:46:43 +00:00
Zequan Wu a31c89c1b7 [Coverage] Enable emitting gap area between macros
Differential Revision: https://reviews.llvm.org/D85176
2020-08-12 16:25:27 -07:00
Craig Topper 5c1fe4e20f [Target] Cache the command line derived feature map in TargetOptions.
We can use this to remove some calls to initFeatureMap from Sema
and CodeGen when a function doesn't have a target attribute.

This reduces compile time of the linux kernel where this map
is needed to diagnose some inline assembly constraints based
on whether sse, avx, or avx512 is enabled.

Differential Revision: https://reviews.llvm.org/D85807
2020-08-12 12:37:23 -07:00
Alexey Bataev fbd6d2c54e [OPENMP] Fix PR47063: crash when trying to get captured statetment.
Need to call getRawStmt() function instead, when trying to get inner
associated statement for the executable directive. Not all directives
use captured statements.
2020-08-12 12:05:58 -04:00
Alexey Bataev f4f3f678f1 [OPENMP]Fix PR37671: Privatize local(private) variables in untied tasks.
In untied tasks, need to allocate the space for local variales, declared
in task region, when the memory for task data is allocated. THe function
can be interrupted and we can exit from the function in untied task
switch. Need to keep the state of the local variables in this case.
Also, the compiler should not call cleanup when exiting in untied task
switch until the real exit out of the declaration scope is met during
 execution.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D84457
2020-08-12 11:28:19 -04:00
Alexey Bataev ddbd21d288 [OPENMP]Do not add TGT_OMP_TARGET_PARAM flag to non-captured mapped arguments.
If the arguments are mapped, but are actually not used in the target
region, the compiler still adds attribute TGT_OMP_TARGET_PARAM for such
arguments. It makes the libomptarget to add such parameters to the list
of arguments, passed to the kernel at the runtime, and may lead to
incorrect results/crashes during execution.

Differential Revision: https://reviews.llvm.org/D85755
2020-08-12 10:06:52 -04:00
Alexey Bataev 3651658bdd Revert "[OPENMP]Fix PR37671: Privatize local(private) variables in untied tasks."
This reverts commit ec9563c54e to
investigate compiler crash revelaed by the buildbots.
2020-08-12 09:50:32 -04:00
Alexey Bataev ec9563c54e [OPENMP]Fix PR37671: Privatize local(private) variables in untied tasks.
Summary:
In untied tasks, need to allocate the space for local variales, declared
in task region, when the memory for task data is allocated. THe function
can be interrupted and we can exit from the function in untied task
switch. Need to keep the state of the local variables in this case.
Also, the compiler should not call cleanup when exiting in untied task
switch until the real exit out of the declaration scope is met during
 execution.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D84457
2020-08-12 09:37:24 -04:00
Kai Nacke b3aece0531 [SystemZ/ZOS] Add binary format goff and operating system zos to the triple
Adds the binary format goff and the operating system zos to the triple
class. goff is selected as default binary format if zos is choosen as
operating system. No further functionality is added.

Reviewers: efriedma, tahonermann, hubert.reinterpertcast, MaskRay

Reviewed By: efriedma, tahonermann, hubert.reinterpertcast

Differential Revision: https://reviews.llvm.org/D82081
2020-08-11 05:26:26 -04:00
Wang, Pengfei 9512525947 [X86][FPEnv] Teach X86 mask compare intrinsics to respect strict FP semantics.
When we use mask compare intrinsics under strict FP option, the masked
elements shouldn't raise any exception. So, we cann't replace the
intrinsic with a full compare + "and" operation.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D85385
2020-08-11 10:28:41 +08:00
Johannes Doerfert fa5d22a045 [OpenMP][NFC] Reuse OMPIRBuilder `struct ident_t` handling in Clang
Replace the `ident_t` handling in Clang with the methods offered by the
OMPIRBuilder. This cuts down on the clang code as well as the
differences between the two, making further transitions easier. Tests
have changed but there should not be a real functional change. The most
interesting difference is probably that we stop generating local ident_t
allocations for now and just use globals. Given that this happens only
with debug info, the location part of the `ident_t` is probably bigger
than the test anyway. As the location part is already a global, we can
avoid the allocation, memcpy, and store in favor of a constant global
that is slightly bigger. This can be revisited if there are
complications.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D80735
2020-08-10 17:13:26 -05:00
Nick Desaulniers 4f2ad15db5 [Clang] implement -fno-eliminate-unused-debug-types
Fixes pr/11710.
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

Resubmit after breaking Windows and OSX builds.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D80242
2020-08-10 15:08:48 -07:00
Michael Liao c7b683c126 [PGO][CUDA][HIP] Skip generating profile on the device stub and wrong-side functions.
- Skip generating profile data on `__global__` function in the host
  compilation. It's a host-side stub function only and don't have
  profile instrumentation generated on the real function body. The extra
  profile data results in the malformed instrumentation profile data.
- Skip generating region mapping on functions in the wrong-side, i.e.,
  + For the device compilation, skip host-only functions; and,
  + For the host compilation, skip device-only functions (including
    `__global__` functions.)
- As the device-side profiling is not ready yet, only host-side profile
  code generation is checked.

Differential Revision: https://reviews.llvm.org/D85276
2020-08-10 11:01:46 -04:00
Xiangling Liao 6ef801aa6b [AIX] Static init frontend recovery and backend support
On the frontend side, this patch recovers AIX static init implementation to
use the linkage type and function names Clang chooses for sinit related function.

On the backend side, this patch sets correct linkage and function names on aliases
created for sinit/sterm functions.

Differential Revision: https://reviews.llvm.org/D84534
2020-08-10 10:10:49 -04:00
Nick Desaulniers abb9bf4bcf Revert "[Clang] implement -fno-eliminate-unused-debug-types"
This reverts commit e486921fd6.

Breaks windows builds and osx builds.
2020-08-07 16:11:41 -07:00
Nick Desaulniers e486921fd6 [Clang] implement -fno-eliminate-unused-debug-types
Fixes pr/11710.
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D80242
2020-08-07 14:13:48 -07:00
Alexey Bataev 4a7aedb843 [OPENMP]Simplify representation for atomic, critical, master and section
constrcut.

Several constructs may be represented wityout relying on CapturedStmt.
It saves memory and improves compilation speed.
2020-08-07 09:58:23 -04:00
Matt Arsenault 30eeb742f1 clang: Use byref for aggregate kernel arguments
Add address space to indirect abi info and use it for kernels.

Previously, indirect arguments assumed assumed a stack passed object
in the alloca address space using byval. A stack pointer is unsuitable
for kernel arguments, which are passed in a separate, constant buffer
with a different address space.

Start using the new byref for aggregate kernel arguments. Previously
these were emitted as raw struct arguments, and turned into loads in
the backend. These will lower identically, although with byref you now
have the option of applying an explicit alignment. In the future, a
reasonable implementation would use byref for all kernel arguments
(this would be a practical problem at the moment due to losing things
like noalias on pointer arguments).

This is mostly to avoid fighting the optimizer's treatment of
aggregate load/store. SROA and instcombine both turn aggregate loads
and stores into a long sequence of element loads and stores, rather
than the optimizable memcpy I would expect in this situation. Now an
explicit memcpy will be introduced up-front which is better understood
and helps eliminate the alloca in more situations.

This skips using byref in the case where HIP kernel pointer arguments
in structs are promoted to global pointers. At minimum an additional
patch is needed to allow coercion with indirect arguments. This also
skips using it for OpenCL due to the current workaround used to
support kernels calling kernels. Distinct function bodies would need
to be generated up front instead of emitting an illegal call.
2020-08-06 15:52:26 -04:00
Alexey Bataev 0af7835eae [OPENMP]Redesign of OMPExecutableDirective/OMPDeclarativeDirective representation.
Summary:
Introduced OMPChildren class to handle all associated clauses, statement
and child expressions/statements. It allows to represent some directives
more correctly (like flush, depobj etc. with pseudo clauses, ordered
depend directives, which are standalone, and target data directives).
Also, it will make easier to avoid using of CapturedStmt in directives,
if required (atomic, tile etc. directives).
Also, it simplifies serialization/deserialization of the
executable/declarative directives.
Reduces number of allocation operations for mapper declarations.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, jfb, cfe-commits, sstefan1, aaron.ballman, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83261
2020-08-06 12:25:19 -04:00