13789 Commits

Author SHA1 Message Date
Eric Christopher 05777ab941 Temporarily Revert "[DebugInfo] Move constructor homing case in shouldOmitDefinition."
as it's causing test failures.

This reverts commit 589ce5f705.
2020-08-24 21:51:31 -07:00
Amy Huang 589ce5f705 [DebugInfo] Move constructor homing case in shouldOmitDefinition.
For some reason the ctor homing case was before the template
specialization case, and could have returned false too early.
I moved the code out into a separate function to avoid this.

Also added a run line to the template specialization test. I guess
all the -debug-info-kind=limited tests should still pass with =constructor,
but it's probably unnecessary to test for all of those.

Differential Revision: https://reviews.llvm.org/D86491
2020-08-24 20:17:59 -07:00
Raphael Isemann 105151ca56 Reland "Correctly emit dwoIDs after ASTFileSignature refactoring (D81347)"
The original patch, now with the missing 'REQUIRES: asserts' added, as there is a
debug-only flag used in the test.

Original summary:

D81347 changes the ASTFileSignature to be an array of 20 uint8_t instead of 5
uint32_t. However, it didn't update the code in ObjectFilePCHContainerOperations
that creates the dwoID in the module from the ASTFileSignature
(`Buffer->Signature` being the array subclass that is now `std::array<uint8_t,
20>` instead of `std::array<uint32_t, 5>`).

```
  uint64_t Signature = [..] (uint64_t)Buffer->Signature[1] << 32 | Buffer->Signature[0]
```

This code works with the old ASTFileSignature (where two uint32_t are enough to
fill the uint64_t), but after the patch this only took two bytes from the
ASTFileSignature and only partly filled the Signature uint64_t.

As a result, the dwoID in the module ref and the dwoID in the actual module
no longer matched (which in turn made LLDB keep warning about the dwoIDs
not matching when debugging -gmodules-compiled binaries).

This patch just unifies the logic for turning the ASTFileSignature into a
uint64_t, which makes the dwoID match again (and should prevent issues like that
in the future).
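
The truncation described above can be reproduced outside Clang. The following is a minimal sketch, not the code from the patch: `OldSignature`, `NewSignature`, `legacyTruncate` and `truncatedValue` are names chosen for illustration, and the least-significant-byte-first packing order is an assumption.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Layout before D81347: five 32-bit words.
using OldSignature = std::array<std::uint32_t, 5>;
// Layout after D81347: twenty bytes.
using NewSignature = std::array<std::uint8_t, 20>;

// The expression quoted in the commit message, written generically.
// With 32-bit elements it fills all 64 bits; with 8-bit elements it
// only sets bits 0-7 and 32-39 of the result.
template <typename Sig>
std::uint64_t legacyTruncate(const Sig &S) {
  return (static_cast<std::uint64_t>(S[1]) << 32) | S[0];
}

// Hypothetical unified helper: pack the first eight bytes of the
// signature, least significant byte first.
std::uint64_t truncatedValue(const NewSignature &S) {
  std::uint64_t Value = 0;
  for (std::size_t I = 0; I < sizeof(Value); ++I)
    Value |= static_cast<std::uint64_t>(S[I]) << (I * 8);
  return Value;
}
```

Routing every consumer of the dwoID through one helper of this kind is what keeps the module ref and the module itself in agreement, whatever the element type of the signature is.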

Reviewed By: aprantl, dang

Differential Revision: https://reviews.llvm.org/D84013
2020-08-24 14:52:53 +02:00
Bevin Hansson 577f8b157a [Fixed Point] Add codegen for fixed-point shifts.
This patch adds codegen to Clang for fixed-point shift
operations.

Reviewed By: leonardchan

Differential Revision: https://reviews.llvm.org/D83294
2020-08-24 14:37:16 +02:00
Bevin Hansson 808ac54645 [Fixed Point] Use FixedPointBuilder to codegen fixed-point IR.
This changes the methods in CGExprScalar to use
FixedPointBuilder to generate IR for fixed-point
conversions and operations.

Since FixedPointBuilder emits padded operations slightly
differently than the original code, some tests change.

Reviewed By: leonardchan

Differential Revision: https://reviews.llvm.org/D86282
2020-08-24 14:37:07 +02:00
Raphael Isemann 2b3074c0d1 Revert "Reland "Correctly emit dwoIDs after ASTFileSignature refactoring (D81347)""
This reverts commit ada2e8ea67. Still breaking
on Fuchsia (and also Fedora) with exit code 1, so back to investigating.
2020-08-24 12:54:25 +02:00
Raphael Isemann ada2e8ea67 Reland "Correctly emit dwoIDs after ASTFileSignature refactoring (D81347)"
This relands D84013 but with a test that relies on fewer shell features to
hopefully make the test pass on Fuchsia (where the test from the previous patch
version strangely failed with a plain "Exit code 1").

Original summary:

D81347 changes the ASTFileSignature to be an array of 20 uint8_t instead of 5 uint32_t.
However, it didn't update the code in ObjectFilePCHContainerOperations that creates
the dwoID in the module from the ASTFileSignature (`Buffer->Signature` being the
array subclass that is now `std::array<uint8_t, 20>` instead of `std::array<uint32_t, 5>`).

```
  uint64_t Signature = [..] (uint64_t)Buffer->Signature[1] << 32 | Buffer->Signature[0]
```

This code works with the old ASTFileSignature (where two uint32_t are enough to
fill the uint64_t), but after the patch this only took two bytes from the ASTFileSignature
and only partly filled the Signature uint64_t.

As a result, the dwoID in the module ref and the dwoID in the actual module no
longer matched (which in turn made LLDB keep warning about the dwoIDs not
matching when debugging -gmodules-compiled binaries).

This patch just unifies the logic for turning the ASTFileSignature into a uint64_t, which
makes the dwoID match again (and should prevent issues like that in the future).

Reviewed By: aprantl, dang

Differential Revision: https://reviews.llvm.org/D84013
2020-08-24 11:51:32 +02:00
Raphael Isemann c1dd5df425 Revert "Correctly emit dwoIDs after ASTFileSignature refactoring (D81347)"
This reverts commit a4c3ed42ba.

The test is curiously failing with a plain exit code 1 on Fuchsia.
2020-08-21 16:08:37 +02:00
Raphael Isemann a4c3ed42ba Correctly emit dwoIDs after ASTFileSignature refactoring (D81347)
D81347 changes the ASTFileSignature to be an array of 20 uint8_t instead of 5
uint32_t. However, it didn't update the code in ObjectFilePCHContainerOperations
that creates the dwoID in the module from the ASTFileSignature
(`Buffer->Signature` being the array subclass that is now `std::array<uint8_t,
20>` instead of `std::array<uint32_t, 5>`).

```
  uint64_t Signature = [..] (uint64_t)Buffer->Signature[1] << 32 | Buffer->Signature[0]
```

This code works with the old ASTFileSignature (where two uint32_t are enough to
fill the uint64_t), but after the patch this only took two bytes from the
ASTFileSignature and only partly filled the Signature uint64_t.

As a result, the dwoID in the module ref and the dwoID in the actual module
no longer matched (which in turn made LLDB keep warning about the dwoIDs
not matching when debugging -gmodules-compiled binaries).

This patch just unifies the logic for turning the ASTFileSignature into a
uint64_t, which makes the dwoID match again (and should prevent issues like that
in the future).

Reviewed By: aprantl, dang

Differential Revision: https://reviews.llvm.org/D84013
2020-08-21 15:05:02 +02:00
Bevin Hansson 1a995a0af3 [ADT] Move FixedPoint.h from Clang to LLVM.
This patch moves FixedPointSemantics and APFixedPoint
from Clang to LLVM ADT.

This will make it easier to use the fixed-point
classes in LLVM for constructing an IR builder for
fixed-point and for reusing the APFixedPoint class
for constant evaluation purposes.

RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-August/144025.html

Reviewed By: leonardchan, rjmccall

Differential Revision: https://reviews.llvm.org/D85312
2020-08-20 10:29:45 +02:00
Craig Topper 724f570ad2 [X86] Add support 'tune' in target attribute
This adds parsing and codegen support for tune in target attribute.

I've implemented this so that arch in the target attribute implicitly disables tune from the command line. I'm not sure what gcc does here, but since -march implies -mtune, I assume 'arch' in the target attribute implies tune in the target attribute.
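
For orientation, a per-function tuning request might be written as in the sketch below. The function `sumTuned`, the macro `TUNE_SKYLAKE` and the CPU name `skylake` are illustrative choices, not taken from the patch; the macro expands to nothing on non-x86 targets so the snippet stays portable.

```cpp
// "skylake" is only an example CPU name.
#if defined(__x86_64__) || defined(__i386__)
#define TUNE_SKYLAKE __attribute__((target("tune=skylake")))
#else
#define TUNE_SKYLAKE // expands to nothing on non-x86 targets
#endif

// tune= only asks for scheduling decisions suited to the named CPU; it does
// not enable extra instructions, so the function still runs on the baseline
// architecture selected for the translation unit.
TUNE_SKYLAKE int sumTuned(const int *Data, int N) {
  int Sum = 0;
  for (int I = 0; I < N; ++I)
    Sum += Data[I];
  return Sum;
}
```

An `arch=` entry in the same attribute would additionally change the instruction set available to the function, which is why the commit treats it as overriding a command-line tune.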

Differential Revision: https://reviews.llvm.org/D86187
2020-08-19 15:58:19 -07:00
Aaron Puchert 916b750a8d [CodeGen] Use existing EmitLambdaVLACapture (NFC) 2020-08-19 15:20:05 +02:00
Sander de Smalen 0353848cc9 [Clang][SVE] NFC: Move info about ACLE types into separate function.
This function returns a struct `BuiltinVectorTypeInfo` that contains
the builtin vector's element type, element count and number of vectors
(used for vector tuples).

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D86100
2020-08-19 11:04:20 +01:00
Craig Topper 4cbceb74bb [X86] Add basic support for -mtune command line option in clang
Building on the backend support from D85165. This parses the command line option in the driver, passes it on to CC1 and adds a function attribute.

-Still need to support tune on the target attribute.
-Need to use "generic" as the tuning by default. But need to change generic in the backend first.
-Need to set tune if march is specified and mtune isn't.
-May need to disable getHostCPUName's ability to guess CPU name from features when it doesn't have a family/model match for mtune=native. That's what gcc appears to do.

Differential Revision: https://reviews.llvm.org/D85384
2020-08-18 15:13:19 -07:00
Zequan Wu 84fffa6728 [Coverage] Adjust skipped regions only if {Prev,Next}TokLoc is in the same file as regions' {start, end}Loc
Fix a bug that occurs if {Prev, Next}TokLoc is in a different file from the skipped regions' {start, end}Loc.

Differential Revision: https://reviews.llvm.org/D86116
2020-08-18 13:26:19 -07:00
Eli Friedman 673dbe1b5e [clang codegen] Use IR "align" attribute for static array arguments.
Without the "align" attribute, marking the argument dereferenceable is
basically useless.  See also D80166.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46876 .

Differential Revision: https://reviews.llvm.org/D84992
2020-08-18 12:51:16 -07:00
Johannes Doerfert 95a25e4c32 [OpenMP][FIX] Do not use TBAA in type punning reduction GPU code PR46156
When we implement OpenMP GPU reductions we use type punning a lot during
the shuffle and reduce operations. This is not always compatible with
language rules on aliasing. So far we generated TBAA, which later allowed
some of the reduce code to be removed because accesses and initialization
were "known to not alias". With this patch we avoid TBAA in this step,
hopefully for all accesses that we need to.

Verified on the reproducer of PR46156 and QMCPack.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D86037
2020-08-16 14:38:31 -05:00
Gui Andrade 909a851dbf [CGAtomic] Mark atomic libcall functions `nounwind`
These functions won't ever unwind. This is useful for MemorySanitizer
as it simplifies handling __atomic_load in particular.

Differential Revision: https://reviews.llvm.org/D85573
2020-08-14 07:46:43 +00:00
Zequan Wu a31c89c1b7 [Coverage] Enable emitting gap area between macros
Differential Revision: https://reviews.llvm.org/D85176
2020-08-12 16:25:27 -07:00
Craig Topper 5c1fe4e20f [Target] Cache the command line derived feature map in TargetOptions.
We can use this to remove some calls to initFeatureMap from Sema
and CodeGen when a function doesn't have a target attribute.

This reduces compile time of the linux kernel where this map
is needed to diagnose some inline assembly constraints based
on whether sse, avx, or avx512 is enabled.

Differential Revision: https://reviews.llvm.org/D85807
2020-08-12 12:37:23 -07:00
Alexey Bataev fbd6d2c54e [OPENMP] Fix PR47063: crash when trying to get captured statement.
Need to call getRawStmt() function instead, when trying to get inner
associated statement for the executable directive. Not all directives
use captured statements.
2020-08-12 12:05:58 -04:00
Alexey Bataev f4f3f678f1 [OPENMP]Fix PR37671: Privatize local(private) variables in untied tasks.
In untied tasks, we need to allocate the space for local variables declared
in the task region when the memory for task data is allocated. The function
can be interrupted, and we can exit from the function in an untied task
switch. We need to keep the state of the local variables in this case.
Also, the compiler should not call cleanup when exiting in an untied task
switch until the real exit out of the declaration scope is met during
execution.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D84457
2020-08-12 11:28:19 -04:00
Alexey Bataev ddbd21d288 [OPENMP]Do not add TGT_OMP_TARGET_PARAM flag to non-captured mapped arguments.
If the arguments are mapped, but are actually not used in the target
region, the compiler still adds attribute TGT_OMP_TARGET_PARAM for such
arguments. This makes libomptarget add such parameters to the list
of arguments passed to the kernel at runtime, which may lead to
incorrect results or crashes during execution.

Differential Revision: https://reviews.llvm.org/D85755
2020-08-12 10:06:52 -04:00
Alexey Bataev 3651658bdd Revert "[OPENMP]Fix PR37671: Privatize local(private) variables in untied tasks."
This reverts commit ec9563c54e to
investigate compiler crash revealed by the buildbots.
2020-08-12 09:50:32 -04:00
Alexey Bataev ec9563c54e [OPENMP]Fix PR37671: Privatize local(private) variables in untied tasks.
Summary:
In untied tasks, we need to allocate the space for local variables declared
in the task region when the memory for task data is allocated. The function
can be interrupted, and we can exit from the function in an untied task
switch. We need to keep the state of the local variables in this case.
Also, the compiler should not call cleanup when exiting in an untied task
switch until the real exit out of the declaration scope is met during
execution.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D84457
2020-08-12 09:37:24 -04:00
Kai Nacke b3aece0531 [SystemZ/ZOS] Add binary format goff and operating system zos to the triple
Adds the binary format goff and the operating system zos to the triple
class. goff is selected as the default binary format if zos is chosen as
the operating system. No further functionality is added.

Reviewers: efriedma, tahonermann, hubert.reinterpertcast, MaskRay

Reviewed By: efriedma, tahonermann, hubert.reinterpertcast

Differential Revision: https://reviews.llvm.org/D82081
2020-08-11 05:26:26 -04:00
Wang, Pengfei 9512525947 [X86][FPEnv] Teach X86 mask compare intrinsics to respect strict FP semantics.
When we use mask compare intrinsics under the strict FP option, the masked
elements shouldn't raise any exception. So we can't replace the
intrinsic with a full compare + "and" operation.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D85385
2020-08-11 10:28:41 +08:00
Johannes Doerfert fa5d22a045 [OpenMP][NFC] Reuse OMPIRBuilder `struct ident_t` handling in Clang
Replace the `ident_t` handling in Clang with the methods offered by the
OMPIRBuilder. This cuts down on the clang code as well as the
differences between the two, making further transitions easier. Tests
have changed but there should not be a real functional change. The most
interesting difference is probably that we stop generating local ident_t
allocations for now and just use globals. Given that this happens only
with debug info, the location part of the `ident_t` is probably bigger
than the test anyway. As the location part is already a global, we can
avoid the allocation, memcpy, and store in favor of a constant global
that is slightly bigger. This can be revisited if there are
complications.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D80735
2020-08-10 17:13:26 -05:00
Nick Desaulniers 4f2ad15db5 [Clang] implement -fno-eliminate-unused-debug-types
Fixes pr/11710.
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

Resubmit after breaking Windows and OSX builds.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D80242
2020-08-10 15:08:48 -07:00
Michael Liao c7b683c126 [PGO][CUDA][HIP] Skip generating profile on the device stub and wrong-side functions.
- Skip generating profile data on `__global__` functions in the host
  compilation. Such a function is a host-side stub only and doesn't have
  profile instrumentation generated on the real function body. The extra
  profile data results in malformed instrumentation profile data.
- Skip generating region mapping on functions in the wrong-side, i.e.,
  + For the device compilation, skip host-only functions; and,
  + For the host compilation, skip device-only functions (including
    `__global__` functions.)
- As the device-side profiling is not ready yet, only host-side profile
  code generation is checked.

Differential Revision: https://reviews.llvm.org/D85276
2020-08-10 11:01:46 -04:00
Xiangling Liao 6ef801aa6b [AIX] Static init frontend recovery and backend support
On the frontend side, this patch restores the AIX static init implementation to
use the linkage type and function names Clang chooses for sinit-related functions.

On the backend side, this patch sets correct linkage and function names on aliases
created for sinit/sterm functions.

Differential Revision: https://reviews.llvm.org/D84534
2020-08-10 10:10:49 -04:00
Nick Desaulniers abb9bf4bcf Revert "[Clang] implement -fno-eliminate-unused-debug-types"
This reverts commit e486921fd6.

Breaks windows builds and osx builds.
2020-08-07 16:11:41 -07:00
Nick Desaulniers e486921fd6 [Clang] implement -fno-eliminate-unused-debug-types
Fixes pr/11710.
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D80242
2020-08-07 14:13:48 -07:00
Alexey Bataev 4a7aedb843 [OPENMP]Simplify representation for atomic, critical, master and section
construct.

Several constructs may be represented without relying on CapturedStmt.
It saves memory and improves compilation speed.
2020-08-07 09:58:23 -04:00
Matt Arsenault 30eeb742f1 clang: Use byref for aggregate kernel arguments
Add address space to indirect abi info and use it for kernels.

Previously, indirect arguments assumed a stack-passed object
in the alloca address space using byval. A stack pointer is unsuitable
for kernel arguments, which are passed in a separate, constant buffer
with a different address space.

Start using the new byref for aggregate kernel arguments. Previously
these were emitted as raw struct arguments, and turned into loads in
the backend. These will lower identically, although with byref you now
have the option of applying an explicit alignment. In the future, a
reasonable implementation would use byref for all kernel arguments
(this would be a practical problem at the moment due to losing things
like noalias on pointer arguments).

This is mostly to avoid fighting the optimizer's treatment of
aggregate load/store. SROA and instcombine both turn aggregate loads
and stores into a long sequence of element loads and stores, rather
than the optimizable memcpy I would expect in this situation. Now an
explicit memcpy will be introduced up-front which is better understood
and helps eliminate the alloca in more situations.

This skips using byref in the case where HIP kernel pointer arguments
in structs are promoted to global pointers. At minimum an additional
patch is needed to allow coercion with indirect arguments. This also
skips using it for OpenCL due to the current workaround used to
support kernels calling kernels. Distinct function bodies would need
to be generated up front instead of emitting an illegal call.
2020-08-06 15:52:26 -04:00
Alexey Bataev 0af7835eae [OPENMP]Redesign of OMPExecutableDirective/OMPDeclarativeDirective representation.
Summary:
Introduced the OMPChildren class to handle all associated clauses, statements
and child expressions/statements. It allows some directives to be represented
more correctly (like flush, depobj etc. with pseudo clauses, ordered
depend directives, which are standalone, and target data directives).
Also, it will make it easier to avoid using CapturedStmt in directives,
if required (atomic, tile etc. directives).
Also, it simplifies serialization/deserialization of the
executable/declarative directives.
It also reduces the number of allocation operations for mapper declarations.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, jfb, cfe-commits, sstefan1, aaron.ballman, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83261
2020-08-06 12:25:19 -04:00
Anatoly Trosinenko 5a07490d76 [ABI][NFC] Fix the confusion of ByVal and ByRef argument names
The second argument of getNaturalAlignIndirect() was `bool ByRef`, but
the implementation was just delegating to getIndirect() with `ByRef`
passed unchanged to `bool ByVal` parameter of getIndirect().

Fix a couple of /*ByRef=*/ comments as well.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D85113
2020-08-06 15:20:18 +03:00
Stanislav Mekhanoshin 105608a4c2 [AMDGPU] Added missing gfx1031 cases to CGOpenMPRuntimeGPU.cpp 2020-08-05 12:39:03 -07:00
Erich Keane 2143a90b34 Fix _ExtInt(1) to be a i1 in memory.
The _ExtInt(1) in getTypeForMem was hitting the bool logic for expanding
to an 8 bit value.  The result was an assert, or store i1 %0, i8* %2, align 1
since the parameter IS an i1.  This patch changes the 'forMem' test to
exclude ext-int from the bool test.
2020-08-05 10:54:51 -07:00
Joel E. Denny 002d61db2b [OpenMP] Fix `present` for exit from `omp target data`
Without this patch, the following example fails but shouldn't
according to OpenMP TR8:

```
 #pragma omp target enter data map(alloc:i)
 #pragma omp target data map(present, alloc: i)
 {
   #pragma omp target exit data map(delete:i)
 } // fails presence check here
```

OpenMP TR8 sec. 2.22.7.1 "map Clause", p. 321, L23-26 states:

> If the map clause appears on a target, target data, target enter
> data or target exit data construct with a present map-type-modifier
> then on entry to the region if the corresponding list item does not
> appear in the device data environment an error occurs and the
> program terminates.

There is no corresponding statement about the exit from a region.
Thus, the `present` modifier should:

1. Check for presence upon entry into any region, including a `target
   exit data` region.  This behavior is already implemented correctly.

2. Not check for presence upon exit from any region, including
   a `target` or `target data` region.  Without this patch, this
   behavior is not implemented correctly, breaking the above example.

In the case of `target data`, this patch fixes the latter behavior by
removing the `present` modifier from the map types Clang generates for
the runtime call at the end of the region.

In the case of `target`, we have not found a valid OpenMP program for
which such a fix would matter.  It appears that, if a program can
guarantee that data is present at the beginning of a `target` region
so that there's no error there, that data is also guaranteed to be
present at the end.  This patch adds a comment to the runtime to
document this case.

Reviewed By: grokos, RaviNarayanaswamy, ABataev

Differential Revision: https://reviews.llvm.org/D84422
2020-08-05 10:03:31 -04:00
Yonghong Song 00602ee7ef BPF: simplify IR generation for __builtin_btf_type_id()
This patch simplifies IR generation for __builtin_btf_type_id().
For __builtin_btf_type_id(obj, flag), the IR builtin previously
looked like
   if (obj is a lvalue)
     llvm.bpf.btf.type.id(obj.ptr, 1, flag)  !type
   else
     llvm.bpf.btf.type.id(obj, 0, flag)  !type
The purpose of the 2nd argument is to differentiate
   __builtin_btf_type_id(obj, flag) where obj is a lvalue
vs.
   __builtin_btf_type_id(obj.ptr, flag)

Note that obj or obj.ptr is never used by the backend
and the `obj` argument is only used to derive the type.
This code sequence is subject to potential llvm CSE when
  - obj is the same, e.g., nullptr
  - flag is the same
  - metadata type is different, e.g., typedef of struct "s"
    and struct "s".
In the above, we don't want CSE since their metadata is different.

This patch changes the IR builtin to
   llvm.bpf.btf.type.id(seq_num, flag)  !type
and seq_num is always increasing. This will prevent potential
llvm CSE.
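
The effect of the sequence number can be illustrated with a toy model. This is only a sketch under strong assumptions: `Call` and `naiveCSE` are hypothetical, and the "CSE" here simply treats two calls as identical when their operands match, ignoring metadata, which is the situation the commit message is worried about.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy model of a call to the builtin. Only the operands take part in the
// identity of the call; the metadata is carried along but not compared.
struct Call {
  std::uint64_t FirstOperand; // obj pointer (old form) or seq_num (new form)
  std::uint64_t Flag;
  std::string TypeMetadata;   // stands in for the !type attachment
};

// Keep the first call for each distinct operand pair, as a naive CSE would.
std::vector<Call> naiveCSE(const std::vector<Call> &Calls) {
  std::set<std::pair<std::uint64_t, std::uint64_t>> Seen;
  std::vector<Call> Kept;
  for (const Call &C : Calls)
    if (Seen.emplace(C.FirstOperand, C.Flag).second)
      Kept.push_back(C);
  return Kept;
}
```

With the old form, two calls on a null `obj` with the same flag collapse into one even though their type metadata differs; with a fresh sequence number per call, the operand pairs never coincide and both calls survive.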

Also report an error if the type name is empty for
remote relocation since remote relocation needs non-empty
type name to do relocation against vmlinux.

Differential Revision: https://reviews.llvm.org/D85174
2020-08-04 16:29:42 -07:00
Thorsten Schuett e18c6ef6b4 [clang] improve diagnostics for misaligned and large atomics
"Listing the alignment and access size (== expected alignment) in the warning
seems like a good idea."

solves PR 46947

  struct Foo {
    struct Bar {
      void * a;
      void * b;
    };
    Bar bar;
  };

  struct ThirtyTwo {
    struct Large {
      void * a;
      void * b;
      void * c;
      void * d;
    };
    Large bar;
  };

  void braz(Foo *foo, ThirtyTwo *braz) {
    Foo::Bar bar;
    __atomic_load(&foo->bar, &bar, __ATOMIC_RELAXED);

    ThirtyTwo::Large foobar;
    __atomic_load(&braz->bar, &foobar, __ATOMIC_RELAXED);
  }

repro.cpp:21:3: warning: misaligned atomic operation may incur significant performance penalty; the expected (16 bytes) exceeds the actual alignment (8 bytes) [-Watomic-alignment]
  __atomic_load(&foo->bar, &bar, __ATOMIC_RELAXED);
  ^
repro.cpp:24:3: warning: misaligned atomic operation may incur significant performance penalty; the expected (32 bytes) exceeds the actual alignment (8 bytes) [-Watomic-alignment]
  __atomic_load(&braz->bar, &foobar, __ATOMIC_RELAXED);
  ^
repro.cpp:24:3: warning: large atomic operation may incur significant performance penalty; the access size (32 bytes) exceeds the max lock-free size (16  bytes) [-Watomic-alignment]
3 warnings generated.

Differential Revision: https://reviews.llvm.org/D85102
2020-08-04 11:10:29 -07:00
Yonghong Song 6d67506964 [clang][BPF] support type exist/size and enum exist/value relocations
This patch added the following additional compile-once
run-everywhere (CO-RE) relocations:
  - existence/size of typedef, struct/union or enum type
  - enum value and enum value existence

These additional relocations will make CO-RE bpf programs more
adaptive for potential kernel internal data structure changes.

For existence/size relocations, the following two code patterns
are supported:
  1. uint32_t __builtin_preserve_type_info(*(<type> *)0, flag);
  2. <type> var;
     uint32_t __builtin_preserve_field_info(var, flag);
flag = 0 for existence relocation and flag = 1 for size relocation.

For enum value existence and enum value relocations, the following code
pattern is supported:
  uint64_t __builtin_preserve_enum_value(*(<enum_type> *)<enum_value>,
                                         flag);
flag = 0 means existence relocation and flag = 1 means enum value
relocation. In the above, <enum_type> can be an enum type or
a typedef to enum type. The <enum_value> needs to be an enumerator
value from the same enum type. The return type is uint64_t to
permit potential 64bit enumerator values.

Differential Revision: https://reviews.llvm.org/D83242
2020-08-04 08:39:53 -07:00
Kazushi (Jam) Marukawa 045e79e77c [VE] Extend integer arguments and return values smaller than 64 bits
In order to follow NEC Aurora SX VE ABI correctly, change to sign/zero
extend integer arguments and return values smaller than 64 bits in clang.
Also update regression test.
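
As a rough illustration of what extending a sub-64-bit value means for the bits that end up in a 64-bit register, consider the sketch below. The helper names are made up for this example and it says nothing about how Clang encodes the extension in IR.

```cpp
#include <cstdint>

// Sign extension: the upper 32 bits are copies of the sign bit.
std::uint64_t signExtendTo64(std::int32_t V) {
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(V));
}

// Zero extension: the upper 32 bits are cleared.
std::uint64_t zeroExtendTo64(std::uint32_t V) {
  return static_cast<std::uint64_t>(V);
}
```

Signed arguments and return values take the first form and unsigned ones the second, so caller and callee agree on the full 64-bit contents.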

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D85071
2020-08-04 08:07:05 +09:00
Thomas Lively cb32792210 [WebAssembly] Implement prototype v128.load{32,64}_zero instructions
Specified in https://github.com/WebAssembly/simd/pull/237, these
instructions load the first vector lane from memory and zero the other
lanes. Since these instructions are not officially part of the SIMD
proposal, they are only available on an opt-in basis via LLVM
intrinsics and clang builtin functions. If these instructions are
merged to the proposal, this implementation will change so that the
instructions will be generated from normal IR. At that point the
intrinsics and builtin functions would be removed.
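
The behavior described above (load the first lane, zero the rest) can be modeled in portable scalar code. This is a sketch of the semantics only; `V128`, `load32Zero` and `load64Zero` are hypothetical names and are not the Clang builtins.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// A 128-bit vector modeled as 16 bytes, lowest lane first.
using V128 = std::array<std::uint8_t, 16>;

// Model of v128.load32_zero: the low 32 bits come from memory,
// the remaining 96 bits are zero.
V128 load32Zero(const void *Mem) {
  V128 Result{}; // value-initialized, so every byte starts at zero
  std::memcpy(Result.data(), Mem, 4);
  return Result;
}

// Model of v128.load64_zero: the low 64 bits come from memory,
// the remaining 64 bits are zero.
V128 load64Zero(const void *Mem) {
  V128 Result{};
  std::memcpy(Result.data(), Mem, 8);
  return Result;
}
```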

This PR also changes the opcodes for the experimental f32x4.qfm{a,s}
instructions because their opcodes conflicted with those of the
v128.load{32,64}_zero instructions. The new opcodes were chosen to
match those used in V8.

Differential Revision: https://reviews.llvm.org/D84820
2020-08-03 13:54:00 -07:00
Akira Hatanaka 41b1e97b12 [CodeGen][ObjC] Mark calls to objc_unsafeClaimAutoreleasedReturnValue as
notail on x86-64

This is needed because the epilogue code inserted before tail calls on
x86-64 breaks the handshake between the caller and callee.

Calls to objc_retainAutoreleasedReturnValue used to have the same
problem, which was fixed in https://reviews.llvm.org/D59656.

rdar://problem/66029552

Differential Revision: https://reviews.llvm.org/D84540
2020-08-03 13:25:25 -07:00
Saiyedul Islam 160ff83765 [OpenMP][AMDGCN] Support OpenMP offloading for AMDGCN architecture - Part 3
Provides AMDGCN and NVPTX specific specialization of getGPUWarpSize,
getGPUThreadID, and getGPUNumThreads methods. Adds tests for AMDGCN
codegen for these methods in generic and simd modes. Also changes the
precondition in InitTempAlloca to be slightly more permissive. Useful for
AMDGCN OpenMP codegen where allocas are created with a cast to an
address space.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D84260
2020-08-03 05:38:39 +00:00
Eli Friedman 8dfb5d767e [clang codegen][AArch64] Use llvm.aarch64.neon.fcvtzs/u where it's necessary
fptosi/fptoui have similar, but not identical, semantics.  In
particular, the behavior on overflow is different.
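
To make the difference concrete: the AArch64 FCVTZS instruction saturates on overflow and converts NaN to zero, whereas an out-of-range fptosi does not yield a defined value. A scalar model of the saturating behavior for float to 32-bit int might look like this; `fcvtzsModel` is a name invented for the sketch.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Scalar model of a saturating, round-toward-zero float -> int32 conversion.
std::int32_t fcvtzsModel(float X) {
  if (std::isnan(X))
    return 0; // NaN converts to zero
  if (X >= 2147483648.0f)
    return std::numeric_limits<std::int32_t>::max(); // saturate high
  if (X <= -2147483648.0f)
    return std::numeric_limits<std::int32_t>::min(); // saturate low
  return static_cast<std::int32_t>(X); // in range: truncates toward zero
}
```

A plain `static_cast<std::int32_t>(X)` on an out-of-range value is undefined behavior in C++, which mirrors why a bare fptosi cannot stand in for the intrinsic.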

Fixes https://bugs.llvm.org/show_bug.cgi?id=46844 for 64-bit.  (The
corresponding patch for 32-bit is more involved because the equivalent
intrinsics don't exist, as far as I can tell.)

Differential Revision: https://reviews.llvm.org/D84703
2020-07-30 15:41:54 -07:00
Richard Smith 1e7f026c3b PR46908: Emit undef destroying_delete_t as an aggregate RValue.
We previously used a non-aggregate RValue to represent the passed value,
which violated the assumptions of call arg lowering in some cases, in
particular on 32-bit Windows, where we'd end up producing an FCA store
with TBAA metadata, that the IR verifier would reject.
2020-07-30 14:50:01 -07:00
Johannes Doerfert ebad64dfe1 [OpenMP][FIX] Consistently use OpenMPIRBuilder if requested
When we use the OpenMPIRBuilder for the parallel region we need to also
use it to get the thread ID (among other things) in the body. This is
because CGOpenMPRuntime::getThreadID() and
CGOpenMPRuntime::emitUpdateLocation implicitly assume that if they are
called from within a parallel region there is a certain structure to the
code and certain members of the OMPRegionInfo are initialized. It might
make sense to initialize them even if we use the OpenMPIRBuilder but we
would preferably get rid of such state instead.

Bug reported by Anchu Rajendran Sudhakumari.

Depends on D82470.

Reviewed By: anchu-rajendran

Differential Revision: https://reviews.llvm.org/D82822
2020-07-30 10:19:40 -05:00
Johannes Doerfert 19756ef53a [OpenMP][IRBuilder] Support allocas in nested parallel regions
We need to keep track of the alloca insertion point (which we already
communicate via the callback to the user) as we place allocas as well.

Reviewed By: fghanim, SouraVX

Differential Revision: https://reviews.llvm.org/D82470
2020-07-30 10:19:39 -05:00
Alexey Bataev 622e46156d [OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region.
Need to map the base pointer for all directives, not only target
data-based ones.
The base pointer is mapped for array sections, array subscript, array
shaping and other array-like constructs with the base pointer. Also,
codegen for use_device_ptr clause was modified to correctly handle
mapping combination of array like constructs + use_device_ptr clause.
The data for use_device_ptr clause is emitted as the last records in the
data mapping array.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D84767
2020-07-30 11:18:33 -04:00
Alexey Bataev b69357c2f4 Revert "[OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region."
This reverts commit 142d0d3ed8 to
investigate undefined behavior revealed by buildbots.
2020-07-30 10:57:56 -04:00
Alexey Bataev 142d0d3ed8 [OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region.
Need to map the base pointer for all directives, not only target
data-based ones.
The base pointer is mapped for array sections, array subscript, array
shaping and other array-like constructs with the base pointer. Also,
codegen for use_device_ptr clause was modified to correctly handle
mapping combination of array like constructs + use_device_ptr clause.
The data for use_device_ptr clause is emitted as the last records in the
data mapping array.
It applies only for global pointers.

Differential Revision: https://reviews.llvm.org/D84767
2020-07-30 09:40:05 -04:00
Amy Huang f71deb43ab [DebugInfo] Fix to ctor homing to ignore classes with trivial ctors.
Previously ctor homing was omitting debug info for classes if they
have both trivial and nontrivial constructors, but we should only omit debug
info if the class doesn't have any trivial constructors.

retained types list.

bug: https://bugs.llvm.org/show_bug.cgi?id=46537

Differential Revision: https://reviews.llvm.org/D84870
2020-07-29 19:55:20 -07:00
Arthur Eubanks 71d0a2b8a3 [DFSan][NewPM] Port DataFlowSanitizer to NewPM
Reviewed By: ychen, morehouse

Differential Revision: https://reviews.llvm.org/D84707
2020-07-29 10:19:15 -07:00
Joel E. Denny 9f2f3b9de6 [OpenMP] Implement TR8 `present` motion modifier in Clang (1/2)
This patch implements Clang front end support for the OpenMP TR8
`present` motion modifier for `omp target update` directives.  The
next patch in this series implements OpenMP runtime support.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D84711
2020-07-29 12:18:45 -04:00
Alexey Bader 8d27be8dba [OpenCL] Add global_device and global_host address spaces
This patch introduces 2 new address spaces in OpenCL: global_device and global_host
which are a subset of a global address space, so the address space scheme will be
looking like:

```
generic->global->host
                          ->device
             ->private
             ->local
constant
```

Justification: USM allocations may be associated with both host and device memory. We
want to give users a way to tell the compiler the allocation type of a USM pointer for
optimization purposes. (Link to the Unified Shared Memory extension:
https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/USM/cl_intel_unified_shared_memory.asciidoc)

Before this patch, a USM pointer could only be in the opencl_global
address space, hence a device backend can't tell if a particular pointer
points to host or device memory. On FPGAs at least we can generate more
efficient hardware code if the user tells us where the pointer can point -
being able to distinguish between these types of pointers at compile time
allows us to instantiate simpler load-store units to perform memory
transactions.

Patch by Dmitry Sidorov.

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D82174
2020-07-29 17:24:53 +03:00
Thomas Lively 11bb7eef41 [WebAssembly] Remove intrinsics for SIMD widening ops
Instead, pattern match extends of extract_subvectors to generate
widening operations. Since extract_subvector is not a legal node, this
is implemented via a custom combine that recognizes extract_subvector
nodes before they are legalized. The combine produces custom ISD nodes
that are later pattern matched directly, just like the intrinsic was.

Also removes the clang builtins for these operations since the
instructions can now be generated from portable code sequences.

Differential Revision: https://reviews.llvm.org/D84556
2020-07-28 18:25:55 -07:00
Joel E. Denny 69fc33f0cd Revert "[OpenMP] Implement TR8 `present` motion modifier in Clang (1/2)"
This reverts commit 3c3faae497.

It breaks a number of bots.
2020-07-28 20:30:05 -04:00
Joel E. Denny 3c3faae497 [OpenMP] Implement TR8 `present` motion modifier in Clang (1/2)
This patch implements Clang front end support for the OpenMP TR8
`present` motion modifier for `omp target update` directives.  The
next patch in this series implements OpenMP runtime support.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D84711
2020-07-28 19:15:18 -04:00
Zahira Ammarguellat 80bd6ae13e On Windows build, making the /bigobj flag global, instead of passing it per file.
To avoid having this flag be passed in a per-file manner, we are instead
passing it globally.

This fixes this bug: https://bugs.llvm.org/show_bug.cgi?id=46733

Reviewed-by: aaron.ballman, beanz, meinersbur

Differential Revision: https://reviews.llvm.org/D84038
2020-07-28 18:04:36 -05:00
Richard Smith 740a164dec PR46377: Fix dependence calculation for function types and typedef
types.

We previously did not treat a function type as dependent if it had a
parameter pack with a non-dependent type -- such a function type depends
on the arity of the pack so is dependent even though none of the
parameter types is dependent. In order to properly handle this, we now
treat pack expansion types as always being dependent types (depending on
at least the pack arity), and always canonically being pack expansion
types, even in the unusual case when the pattern is not a dependent
type. This does mean that we can have canonical types that are pack
expansions that contain no unexpanded packs, which is unfortunate but
not inaccurate.

We also previously did not treat a typedef type as
instantiation-dependent if its canonical type was not
instantiation-dependent. That's wrong because instantiation-dependence
is a property of the type sugar, not of the type; an
instantiation-dependent type can have a non-instantiation-dependent
canonical type.
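
A minimal sketch of the pack-arity case from the first paragraph, assuming a C++11 or later compiler. The names `IntPtr`, `Callback`, `ParamCount` and `callbackArity` are hypothetical and chosen for illustration; they are not taken from the patch or its tests. The pattern `IntPtr<Ts>` always resolves to `int *`, yet the resulting function type still changes with the number of elements in `Ts`:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Alias template that ignores its argument: IntPtr<T> is always int *.
template <typename> using IntPtr = int *;

// Every expanded parameter is int *, but the number of parameters follows
// the arity of Ts, so the function type depends on the pack.
template <typename... Ts> using Callback = void (*)(IntPtr<Ts>...);

// Counts the parameters of a function pointer type.
template <typename> struct ParamCount;
template <typename... Ps> struct ParamCount<void (*)(Ps...)> {
  static constexpr std::size_t value = sizeof...(Ps);
};

// Arity of Callback<Ts...>, computed at compile time.
template <typename... Ts> constexpr std::size_t callbackArity() {
  return ParamCount<Callback<Ts...>>::value;
}

static_assert(std::is_same<Callback<char, long>, void (*)(int *, int *)>::value,
              "two pack elements give two int * parameters");
static_assert(callbackArity<char, long>() == 2, "arity follows the pack");
```

Two instantiations with different pack sizes yield different function types even though no parameter type mentions a template parameter after substitution.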
2020-07-28 13:23:13 -07:00
Zequan Wu b46176bbb0 Reland [Coverage] Add comment to skipped regions
Bug filed here: https://bugs.llvm.org/show_bug.cgi?id=45757.
Add comment to skipped regions so we don't track execution count for lines containing only comments.

Differential Revision: https://reviews.llvm.org/D83592
2020-07-28 13:20:57 -07:00
Richard Smith 6c18f7db73 For PR46800, implement the GCC __builtin_complex builtin.
glibc's implementation of the CMPLX macro uses it (with -fgnuc-version
set to 4.7 or later).
2020-07-22 13:43:10 -07:00
Hans Wennborg 238bbd48c5 Revert abd45154b "[Coverage] Add comment to skipped regions"
This caused assertions during Chromium builds. See comment on the code review.

> Bug filed here: https://bugs.llvm.org/show_bug.cgi?id=45757.
> Add comment to skipped regions so we don't track execution count for lines containing only comments.
>
> Differential Revision: https://reviews.llvm.org/D84208

This reverts commit abd45154bd and the
follow-up 87d7254733.
2020-07-22 17:09:20 +02:00
Joel E. Denny aa82c40f0a [OpenMP] Implement TR8 `present` map type modifier in Clang (1/2)
This patch implements Clang front end support for the OpenMP TR8
`present` map type modifier.  The next patch in this series implements
OpenMP runtime support.

This patch does not attempt to implement TR8 sec. 2.22.7.1 "map
Clause", p. 319, L14-16:

> If a map clause with a present map-type-modifier is present in a map
> clause, then the effect of the clause is ordered before all other
> map clauses that do not have the present modifier.

Compare to L10-11, which Clang does not appear to implement yet:

> For a given construct, the effect of a map clause with the to, from,
> or tofrom map-type is ordered before the effect of a map clause with
> the alloc, release, or delete map-type.

This patch also does not implement the `present` implicit-behavior for
`defaultmap` or the `present` motion-modifier for `target update`.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D83061
2020-07-22 10:15:32 -04:00
Sjoerd Meijer 5567c62afa [Matrix] Add LowerMatrixIntrinsics to the NPM
Pass LowerMatrixIntrinsics wasn't yet running under the new pass
manager, and this adds LowerMatrixIntrinsics to the pipeline (to the
same place as where it is running in the old PM).

Differential Revision: https://reviews.llvm.org/D84180
2020-07-22 09:47:53 +01:00
David Blaikie 36036aa70e Reapply "Rename/refactor isIntegerConstantExpression to getIntegerConstantExpression"
Reapply 49e5f603d4
which had been reverted in c94332919b.

Originally reverted because I hadn't updated it in quite a while when I
got around to committing it, so there were a bunch of missing changes to
new code since I'd written the patch.

Reviewers: aaron.ballman

Differential Revision: https://reviews.llvm.org/D76646
2020-07-21 20:57:12 -07:00
Zequan Wu abd45154bd [Coverage] Add comment to skipped regions
Bug filed here: https://bugs.llvm.org/show_bug.cgi?id=45757.
Add comment to skipped regions so we don't track execution count for lines containing only comments.

Differential Revision: https://reviews.llvm.org/D84208
2020-07-21 17:34:18 -07:00
Wang, Pengfei 18581fd2c4 [CFE] Add nomerge function attribute to inline assembly.
Sometimes we also want to avoid merging inline assembly. This patch adds
the nomerge function attribute to inline assembly.

Reviewed By: zequanwu

Differential Revision: https://reviews.llvm.org/D84225
2020-07-22 08:22:58 +08:00
Alexey Bataev 13bfe4b226 [OPENMP]Fix PR46012: declare target pointer cannot be accessed in target region.
Summary:
Need to avoid an optimization for base pointer mapping for target data
directives.

Reviewers: jdoerfert, ye-luo

Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D84182
2020-07-21 15:48:32 -04:00
Arthur Eubanks b13b858182 [NewPM] Support optnone under new pass manager
OptNoneInstrumentation is part of StandardInstrumentations. It skips
functions (or loops) that are marked optnone.

The feature of skipping optional passes for optnone functions under NPM
is gated on a -enable-npm-optnone flag. Currently it is by default
false. That is because we still need to mark all required passes to be
required. Otherwise optnone functions will start having incorrect
semantics.  After that is done in following changes, we can remove the
flag and always enable this.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D83519
2020-07-21 09:53:43 -07:00
Saiyedul Islam fc7d2908ab [OpenMP] Use common interface to access GPU Grid Values
Use common interface for accessing target specific GPU grid values in NVPTX
OpenMP codegen as proposed in https://reviews.llvm.org/D80917

Originally authored by Greg Rodgers (@gregrodgers).

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D83492
2020-07-21 05:25:46 +00:00
Logan Smith 8b6179f48c [NFC] Add missing 'override's 2020-07-20 14:39:36 -07:00
Joel E. Denny cbf64b5834 [OpenMP] Fix map clause for unused var: don't ignore it
For example, without this patch:

```
 $ cat test.c
 int main() {
   int x[3];
   #pragma omp target map(tofrom:x[0:3])
 #ifdef USE
   x[0] = 1
 #endif
   ;
   return 0;
 }
 $ clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -S -emit-llvm test.c
 $ grep '^@.offload_maptypes' test.ll
 $ echo $?
 1
 $ clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -S -emit-llvm test.c \
         -DUSE
 $ grep '^@.offload_maptypes' test.ll
 @.offload_maptypes = private unnamed_addr constant [1 x i64] [i64 35]
```

With this patch, both greps produce the same result.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D83922
2020-07-17 21:37:27 -04:00
Michele Scandale 53880b8cb9 [CMake] Make `intrinsics_gen` dependency unconditional.
The `intrinsics_gen` target exists in the CMake exports since r309389
(see LLVMConfig.cmake.in), hence projects can depend on `intrinsics_gen`
even if they are built separately from LLVM.

Reviewed By: MaskRay, JDevlieghere

Differential Revision: https://reviews.llvm.org/D83454
2020-07-17 16:43:17 -07:00
Xiangling Liao ec6ada6264 [AIX] report_fatal_error on `-fregister_global_dtors_with_atexit` for static init
On AIX, the semantic of global_dtors contains __sterm functions associated with C++
cleanup actions and user-declared __attribute__((destructor)) functions. We should
never merely register __sterm with atexit(), so currently
-fregister_global_dtors_with_atexit does not work well on AIX: It would cause
finalization actions to not occur when unloading shared libraries.  We need to figure
out a way to handle that when we start supporting user-declared
__attribute__((destructor)) functions.

Currently we report_fatal_error on this option temporarily.

Differential Revision: https://reviews.llvm.org/D83974
2020-07-17 16:14:49 -04:00
Saiyedul Islam c7562e77b3 [OpenMP][NFC] Generalize CGOpenMPRuntimeNVPTX as CGOpenMPRuntimeGPU
Refactors CGOpenMPRuntimeNVPTX as CGOpenMPRuntimeGPU to make it a
generalization for OpenMP GPU Codegen. Target specific specialized
methods for NVPTX are defined in class CGOpenMPRuntimeNVPTX. This
paves the way for a clean and maintainable extension to more GPU
targets for OpenMP Codegen.

For original author (git blame) list of CGOpenMPRuntimeGPU code,
look in history of CGOpenMPRuntimeNVPTX.cpp and .h, after this commit.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D83723
2020-07-17 14:38:04 +00:00
Eric Christopher 7bfaa40086 Temporarily Revert "[AssumeBundles] Use operand bundles to encode alignment assumptions"
due to the performance bugs filed in https://bugs.llvm.org/show_bug.cgi?id=46753.

An SROA change soon may obviate some of these problems.

This reverts commit 8d09f20798.
2020-07-16 11:54:04 -07:00
George Rokos fc47c0e0a6 [clang] Fix compilation warnings in OpenMP declare mapper codegen.
This patch fixes the compilation warnings that L is not a reference.
Thanks to Lingda Li for providing the patch.

Differential Revision: https://reviews.llvm.org/D83959
2020-07-16 11:04:12 -07:00
Xiangling Liao 69f3378ad6 [AIX]Generate debug info for static init related functions
Set the debug location for static init related functions(__dtor
and __finalize) so we can generate valid debug info on AIX by invoking
-g with clang or -debug-info-kind=limited with clang_cc1.

This also works for any other future targets who may use sinit and
sterm functions for static initialization, where a direct call to
dtor will be generated within finalize function body.

This patch also aims at validating that the debug info generated
is correct for AIX sinit related functions.

Differential Revision: https://reviews.llvm.org/D83702
2020-07-16 10:43:10 -04:00
George Rokos 537b16e9b8 [OpenMP 5.0] Codegen support to pass user-defined mapper functions to runtime
This patch implements the code generation to use OpenMP 5.0 declare mapper (a.k.a. user-defined mapper) constructs.
Patch written by Lingda Li.

Differential Revision: https://reviews.llvm.org/D67833
2020-07-15 18:11:43 -07:00
Akira Hatanaka ed6b578040 [CodeGen] Emit a call instruction instead of an invoke if the called
llvm function is marked nounwind

This fixes cases where an invoke is emitted, despite the called llvm
function being marked nounwind, because ConstructAttributeList failed to
add the attribute to the attribute list. llvm optimization passes turn
invokes into calls and optimize away the exception handling code, but
it's better to avoid emitting the code in the front-end if the called
function is known not to raise an exception.

Differential Revision: https://reviews.llvm.org/D83906
2020-07-15 14:47:45 -07:00
Alexey Bataev 41d0af0074 [OPENMP]Fix PR46593: Reduction initializer missing constructor call.
Summary:
If user-defined reductions with the initializer are used with classes,
the compiler misses the constructor call when trying to create a private
copy of the reduction variable.

Reviewers: jdoerfert

Subscribers: cfe-commits, yaxunl, guansong, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83334
2020-07-15 15:14:22 -04:00
Alexey Bataev 9dc327d1b7 [OPENMP]Fix PR46688: cast the type of the allocated variable to the initial one.
Summary:
If the original variable is marked for allocation in the different
address space using #pragma omp allocate, need to cast the allocated
variable to its original type with the original address space.
Otherwise, the compiler may crash trying to bitcast the type of the new
allocated variable to the original type in some cases, like passing this
variable as an argument in function calls.

Reviewers: jdoerfert

Subscribers: jholewinski, cfe-commits, yaxunl, guansong, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83696
2020-07-15 14:54:19 -04:00
Tim Northover 9697a9e2d3 Fix typo in identifier in assert. 2020-07-15 09:57:53 +01:00
Tim Northover 5165b2b5fd AArch64+ARM: make LLVM consider system registers volatile.
Some of the system registers readable on AArch64 and ARM platforms
return different values with each read (for example a timer counter),
these shouldn't be hoisted outside loops or otherwise interfered with,
but the normal @llvm.read_register intrinsic is only considered to read
memory.

This introduces a separate @llvm.read_volatile_register intrinsic and
maps all system-registers on ARM platforms to use it for the
__builtin_arm_rsr calls. Registers declared with asm("r9") or similar
are unaffected.
2020-07-15 09:47:36 +01:00
Tyker 8d09f20798 [AssumeBundles] Use operand bundles to encode alignment assumptions
Summary:
NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

Complementary to the assumption outliner prototype in D71692, this patch
shows how we could simplify the code emitted for an alignment
assumption. The generated code is smaller, less fragile, and it makes it
easier to recognize the additional use as an "assumption use".

As mentioned in D71692 and on the mailing list, we could adopt this
scheme, and similar schemes for other patterns, without adopting the
assumption outlining.

Reviewers: hfinkel, xbolva00, lebedev.ri, nikic, rjmccall, spatel, jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: thopre, yamauchi, kuter, fhahn, merge_guards_bot, hiraditya, bollu, rkruppe, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71739
2020-07-14 01:05:58 +02:00
Vedant Kumar 8c4a65b9b2 [ubsan] Check implicit casts in ObjC for-in statements
Check that the implicit cast from `id` used to construct the element
variable in an ObjC for-in statement is valid.

This check is included as part of a new `objc-cast` sanitizer, outside
of the main 'undefined' group, as (IIUC) the behavior it's checking for
is not technically UB.

The check can be extended to cover other kinds of invalid casts in ObjC.

Partially addresses: rdar://12903059, rdar://9542496

Differential Revision: https://reviews.llvm.org/D71491
2020-07-13 15:11:18 -07:00
Alexey Bataev 7075c056e9 [OPENMP]Fix compiler crash for target data directive without actual target codegen.
Summary:
Need to privatize addresses of the captured variables when trying to
emit the body of the target data directive in no target codegen mode.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83478
2020-07-13 10:52:24 -04:00
David Blaikie c94332919b Revert "Rename/refactor isIntegerConstantExpression to getIntegerConstantExpression"
Broke buildbots since I hadn't updated this patch in a while. Sorry for
the noise.

This reverts commit 49e5f603d4.
2020-07-12 20:29:19 -07:00
David Blaikie 49e5f603d4 Rename/refactor isIntegerConstantExpression to getIntegerConstantExpression
There is a version that just tests (also called
isIntegerConstantExpression), whereas this version is specifically used
when the value is of interest (a few call sites were actually refactored
to call the test-only version), so let's make the API look more like
it.
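
A minimal sketch of the naming distinction described above, assuming C++17 for `std::optional`. `Expr`, `isIntegerConstant` and `getIntegerConstant` are hypothetical stand-ins for illustration only; they are not Clang's types or signatures. The `is...` form only answers yes or no, while the `get...` form hands back the value when there is one:

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for an expression node; not a Clang type.
struct Expr {
  bool isConstant;
  long value;
};

// Test-only query: callers that do not need the value use this form.
inline bool isIntegerConstant(const Expr &e) { return e.isConstant; }

// Value-returning query: an empty optional means "not a constant".
inline std::optional<long> getIntegerConstant(const Expr &e) {
  if (!e.isConstant)
    return std::nullopt;
  return e.value;
}

// Example caller that needs the value and falls back to a default.
inline long valueOr(const Expr &e, long fallback) {
  return getIntegerConstant(e).value_or(fallback);
}
```

With this shape a caller cannot read a value without first checking that one exists, which is the usual argument for returning an optional instead of a bool plus an out-parameter.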

Reviewers: aaron.ballman

Differential Revision: https://reviews.llvm.org/D76646
2020-07-12 19:43:24 -07:00
Craig Topper b4dbb37f32 [X86] Rename X86_CPU_TYPE_COMPAT_ALIAS/X86_CPU_TYPE_COMPAT/X86_CPU_SUBTYPE_COMPAT macros. NFC
Remove _COMPAT. Drop the ARCHNAME. Remove the non-COMPAT versions
that are no longer needed.

We now only use these macros in places where we need compatibility
with libgcc/compiler-rt. So we don't need to call out _COMPAT
specifically.
2020-07-12 17:00:24 -07:00
Ten Tzen 66f1dcd872 [Windows SEH] Fix the frame-ptr of a nested-filter within a _finally
This change fixed a SEH bug (exposed by test58 & test61 in MSVC test xcpt4u.c);
when an Except-filter is located inside a finally, the frame-pointer generated today
via intrinsic @llvm.eh.recoverfp is the frame-pointer of the immediate
parent _finally, not the frame-ptr of outermost host function.

The fix is to retrieve the Establisher's frame-pointer that was previously saved in
parent's frame.
The prolog of a filter inside a _finally should be like code below:

%0 = call i8* @llvm.eh.recoverfp(i8* bitcast (@"?fin$0@0@main@@"), i8*%frame_pointer)
%1 = call i8* @llvm.localrecover(i8* bitcast (@"?fin$0@0@main@@"), i8*%0, i32 0)
%2 = bitcast i8* %1 to i8**
%3 = load i8*, i8** %2, align 8

Differential Revision: https://reviews.llvm.org/D77982
2020-07-12 01:37:56 -07:00
Johannes Doerfert c98699582a [OpenMP][NFC] Remove unused (always fixed) arguments
There are various runtime calls in the device runtime with unused, or
always fixed, arguments. This is bad for all sorts of reasons. Clean up
two before as we match them in OpenMPOpt now.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D83268
2020-07-11 00:51:51 -05:00
Yaxun (Sam) Liu 849d4405f5 [HIP] Fix rocm detection
Do not detect device library by default in rocm detector.
Only detect device library in Rocm and HIP toolchain.

Separate detection of HIP runtime and Rocm device library.

Detect rocm path by version file in host toolchains.

Also added detecting rocm version and printing rocm
installation path and version with -v.

Fixed include path and device library detection for
ROCm 3.5.

Added --hip-version option. Renamed --hip-device-lib-path
to --rocm-device-lib-path.

Fixed default value for -fhip-new-launch-api.

Added default -std option for HIP.

Differential Revision: https://reviews.llvm.org/D82930
2020-07-10 23:20:15 -04:00
Akira Hatanaka 3a5617c02e Fix build error 2020-07-10 17:40:37 -07:00
Akira Hatanaka e9bf0a710c [CodeGen] Store the return value of the target function call to the
thunk's return value slot directly when the return type is an aggregate
instead of doing so via a temporary

This fixes PR45997 (https://bugs.llvm.org/show_bug.cgi?id=45997), which
is caused by a bug that has existed since we started passing and
returning C++ structs with ObjC strong pointer members (see
https://reviews.llvm.org/D44908) or structs annotated with trivial_abi
directly.

rdar://problem/63740936

Differential Revision: https://reviews.llvm.org/D82513
2020-07-10 17:24:13 -07:00
Aaron Ballman 006c49d890 Change behavior with zero-sized static array extents
Clang previously diagnosed this code by default:
  void f(int a[static 0]);
saying that "static has no effect on zero-length arrays", which was
accurate.

However, static array extents require that the caller of the function
pass a nonnull pointer to an array of *at least* that number of
elements, but it can pass more (see C17 6.7.6.3p6). Given that we allow
zero-sized arrays as a GNU extension and that it's valid to pass more
elements than specified by the static array extent, we now support
zero-sized static array extents with the usual semantics because it can
be useful in cases like:

  void my_bzero(char p[static 0], int n);
  my_bzero(&c+1, 0); //ok
  my_bzero(t+k,n-k); //ok, pattern from actual code
2020-07-10 15:58:11 -04:00
Zequan Wu 1fbb719470 [LPM] Port CGProfilePass from NPM to LPM
Reviewers: hans, chandlerc!, asbirlea, nikic

Reviewed By: hans, nikic

Subscribers: steven_wu, dexonsmith, nikic, echristo, void, zhizhouy, cfe-commits, aeubanks, MaskRay, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D83013
2020-07-10 09:04:51 -07:00
Ulrich Weigand 4c5a93bd58 [ABI] Handle C++20 [[no_unique_address]] attribute
Many platform ABIs have special support for passing aggregates that
either just contain a single member of floating-point type, or else
a homogeneous set of members of the same floating-point type.

When making this determination, any extra "empty" members of the
aggregate type will typically be ignored.  However, in C++ (at least
in all prior versions), no data member would actually count as empty,
even if its type is an empty record -- it would still be considered
to take up at least one byte of space, and therefore make those ABI
special cases not apply.

This is now changing in C++20, which introduced the [[no_unique_address]]
attribute.  Members of empty record type, if they also carry this
attribute, now do *not* take up any space in the type, and therefore
the ABI special cases for single-element or homogeneous aggregates
should apply.

The C++ Itanium ABI has been updated accordingly, and GCC 10 has
added support for this new case.  This patch now adds support to
LLVM.  This is cross-platform; it affects all platforms that use
the single-element or homogeneous aggregate ABI special case and
implement this using any of the following common subroutines
in lib/CodeGen/TargetInfo.cpp:
  isEmptyField
  isEmptyRecord
  isSingleElementStruct
  isHomogeneousAggregate
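
A minimal sketch of the layout effect described above, assuming a C++20 compiler that honors `[[no_unique_address]]` (some ABIs ignore the attribute, in which case both structs have the same size). `Empty`, `Plain` and `Compact` are hypothetical names for illustration:

```cpp
#include <cassert>
#include <cstddef>

struct Empty {};

// Without the attribute the empty member still occupies storage, so the
// struct is larger than a single float.
struct Plain {
  Empty e;
  float f;
};

// With the attribute the implementation may overlap the empty member with
// f, leaving only the float as far as layout is concerned.
struct Compact {
  [[no_unique_address]] Empty e;
  float f;
};

constexpr std::size_t plainSize() { return sizeof(Plain); }
constexpr std::size_t compactSize() { return sizeof(Compact); }

static_assert(sizeof(Plain) > sizeof(float),
              "an ordinary empty member takes up space");
static_assert(sizeof(Compact) <= sizeof(Plain),
              "the attribute never makes the struct larger");
```

On an Itanium C++ ABI target one would expect `sizeof(Compact)` to equal `sizeof(float)`, which is the situation in which the single-element special case is meant to apply.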
2020-07-10 14:01:05 +02:00
Fangrui Song c025bdf25a Revert D83013 "[LPM] Port CGProfilePass from NPM to LPM"
This reverts commit c92a8c0a0f.

It breaks builds and has unaddressed review comments.
2020-07-09 13:34:04 -07:00
Zequan Wu c92a8c0a0f [LPM] Port CGProfilePass from NPM to LPM
Reviewers: hans, chandlerc!, asbirlea, nikic

Reviewed By: hans, nikic

Subscribers: steven_wu, dexonsmith, nikic, echristo, void, zhizhouy, cfe-commits, aeubanks, MaskRay, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D83013
2020-07-09 13:03:42 -07:00
cchen 2da9572a9b [OPENMP50] extend array section for stride (Parsing/Sema/AST)
Reviewers: ABataev, jdoerfert

Reviewed By: ABataev

Subscribers: yaxunl, guansong, arphaman, sstefan1, cfe-commits, sandoval, dreachem

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82800
2020-07-09 13:28:51 -05:00
Anatoly Trosinenko 67422e4294 [MSP430] Align the _Complex ABI with current msp430-gcc
Assembler output is checked against msp430-gcc 9.2.0.50 from TI.

Reviewed By: asl

Differential Revision: https://reviews.llvm.org/D82646
2020-07-09 18:28:48 +03:00
sstefan1 6aab27ba85 [OpenMPIRBuilder][Fix] Move llvm::omp::types to OpenMPIRBuilder.
Summary:
D82193 exposed a problem with global type definitions in
`OMPConstants.h`. This causes a race when running in thinLTO mode.
Types now live inside of OpenMPIRBuilder to prevent this from happening.

Reviewers: jdoerfert

Subscribers: yaxunl, hiraditya, guansong, dexonsmith, aaron.ballman, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D83176
2020-07-08 17:23:55 +02:00
Ulrich Weigand 80a1b95b8e [SystemZ ABI] Allow class types in GetSingleElementType
The SystemZ ABI specifies that aggregate types with just a single
member of floating-point type shall be passed as if they were just
a scalar of that type.  This applies to both struct and class types
(but not unions).

However, the current ABI support code in clang only checks this
case for struct types, which means that for class types, generated
code does not adhere to the platform ABI.

Fixed by accepting both struct and class types in the
SystemZABIInfo::GetSingleElementType routine.
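
A minimal sketch of why struct and class should be treated alike here, assuming C++11. The type names are hypothetical. In C++ the two keywords differ only in default access, so both wrappers below are class types with the same layout, while the union is a distinct kind of type:

```cpp
#include <cassert>
#include <type_traits>

// Single floating-point member declared with each class-key.
struct WrappedStruct {
  float value;
};
class WrappedClass {
public:
  float value;
};
union WrappedUnion {
  float value;
};

// True for both struct and class declarations, false for unions.
template <typename T> constexpr bool isStructOrClass() {
  return std::is_class<T>::value;
}

static_assert(isStructOrClass<WrappedStruct>(), "struct is a class type");
static_assert(isStructOrClass<WrappedClass>(), "class is a class type");
static_assert(!isStructOrClass<WrappedUnion>(), "union is not a class type");
static_assert(sizeof(WrappedStruct) == sizeof(WrappedClass),
              "the keyword does not change the layout");
```

How such a value is actually passed is decided by the target ABI code and cannot be observed portably from the language; the sketch only shows that the keyword alone carries no layout difference.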
2020-07-07 19:56:19 +02:00
Jennifer Yu 6cf0dac1ca Correctly generate invert xor value for Binary Atomics of int size > 64
When using __sync_nand_and_fetch with __int128, a problem is found that
the wrong value for the 'invert' value gets emitted to the xor in case
where the int size is greater than 64 bits.

This is because llvm::ConstantInt::get zero-extends the value for types
wider than 64 bits, so instead of the -1 that we require, it ends up
with 18446744073709551615.

This patch replaces the call to llvm::ConstantInt::get with the call
to llvm::Constant::getAllOnesValue which works for all integer types.
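
A minimal sketch of the arithmetic behind the bug, assuming a compiler that provides the `unsigned __int128` extension (GCC and Clang do). It uses plain integers rather than the LLVM API, and the function names are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Assumes the unsigned __int128 compiler extension is available.
using u128 = unsigned __int128;

// Widening a 64-bit all-ones pattern by zero extension leaves the upper
// 64 bits clear, giving 18446744073709551615 instead of a 128-bit -1.
inline u128 zeroExtendedMinusOne() {
  std::uint64_t narrow = ~std::uint64_t{0};
  return static_cast<u128>(narrow);
}

// Complementing zero at full width sets every bit of the 128-bit value.
inline u128 allOnes() { return ~u128{0}; }

// nand expressed as (a & b) ^ mask; correct only when mask is all ones.
inline u128 nandWith(u128 a, u128 b, u128 mask) { return (a & b) ^ mask; }
```

With the zero-extended mask, the upper half of the nand result is never inverted, which matches the wrong 'invert' value described above.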

Reviewers: jfp, erichkeane, rjmccall, hfinkel

Differential Revision: https://reviews.llvm.org/D82832
2020-07-07 10:20:14 -07:00
Wouter van Oortmerssen 16d83c395a [WebAssembly] Added 64-bit memory.grow/size/copy/fill
This covers both the existing memory functions as well as the new bulk memory proposal.
Added new test files since changes were also required in the inputs.

Also removes unused init/drop intrinsics rather than trying to make them work for 64-bit.

Differential Revision: https://reviews.llvm.org/D82821
2020-07-06 12:49:50 -07:00
Chuanqi Xu 8849831d55 [Coroutines] Warning if return type of coroutine_handle::address is not void*
Users can provide a version of coroutine_handle::address() whose return type
is not void* by specializing coroutine_handle<> for some promise_type.

In this case, the code may break compatibility with existing async C APIs
that accept a void* data parameter which is then passed back to the
user-provided callback.

Patch by ChuanqiXu

Differential Revision: https://reviews.llvm.org/D82442
2020-07-06 13:46:01 +08:00
Roman Lebedev 7ea46aee36 Revert "[AssumeBundles] Use operand bundles to encode alignment assumptions"
Assume bundle can have more than one entry with the same name,
but at least AlignmentFromAssumptionsPass::extractAlignmentInfo() uses
getOperandBundle("align"), which internally assumes that it isn't the
case, and happily crashes otherwise.

Minimal reduced reproducer: run `opt -alignment-from-assumptions` on

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%0 = type { i64, %1*, i8*, i64, %2, i32, %3*, i8* }
%1 = type opaque
%2 = type { i8, i8, i16 }
%3 = type { i32, i32, i32, i32 }

; Function Attrs: nounwind
define i32 @f(%0* noalias nocapture readonly %arg, %0* noalias %arg1) local_unnamed_addr #0 {
bb:
  call void @llvm.assume(i1 true) [ "align"(%0* %arg, i64 8), "align"(%0* %arg1, i64 8) ]
  ret i32 0
}

; Function Attrs: nounwind willreturn
declare void @llvm.assume(i1) #1

attributes #0 = { nounwind "reciprocal-estimates"="none" }
attributes #1 = { nounwind willreturn }


This is what we'd have with -mllvm -enable-knowledge-retention

This reverts commit c95ffadb24.
2020-07-04 23:49:23 +03:00
Bruno Ricci 473fbc90d1 [clang][NFC] Store a pointer to the ASTContext in ASTDumper and TextNodeDumper
In general there is no way to get to the ASTContext from most AST nodes
(Decls are one of the exceptions). This will be a problem when implementing
the rest of APValue::dump since we need the ASTContext to dump some kinds of
APValues.

The ASTContext* in ASTDumper and TextNodeDumper is not always non-null.
This is because we still want to be able to use the various dump() functions
in a debugger.

No functional changes intended.

Reverted in fcf4d5e449 since a few dump()
functions in lldb were missed.
2020-07-03 13:59:22 +01:00
Bruno Ricci fcf4d5e449 Revert "[clang][NFC] Store a pointer to the ASTContext in ASTDumper and TextNodeDumper"
This reverts commit aa7fd905e4.

I missed some dump() functions.
2020-07-02 19:40:09 +01:00
Bruno Ricci aa7fd905e4 [clang][NFC] Store a pointer to the ASTContext in ASTDumper and TextNodeDumper
In general there is no way to get to the ASTContext from most AST nodes
(Decls are one of the exceptions). This will be a problem when implementing
the rest of APValue::dump since we need the ASTContext to dump some kinds of
APValues.

The ASTContext* in ASTDumper and TextNodeDumper is not always
non-null. This is because we still want to be able to use the various
dump() functions in a debugger.

No functional changes intended.
2020-07-02 19:29:02 +01:00
Alexander Belyaev 2a36f29fce [clang] Re-add deleted forward declaration. 2020-07-02 08:57:48 +02:00
Valentin Clement 2ddba3082c [flang][openmp] Use common Directive and Clause enum from llvm/Frontend
Summary:
This patch removes the custom enumerations for OpenMP Directives and Clauses and replaces them
with the newly tablegen-generated ones from llvm/Frontend. This is a first patch, and more will follow to share the same
infrastructure where possible. The next patch should use the clause allowance defined in the tablegen file.

Reviewers: jdoerfert, DavidTruby, sscalpone, kiranchandramohan, ichoyjx

Reviewed By: DavidTruby, ichoyjx

Subscribers: jholewinski, cfe-commits, dblaikie, MaskRay, ymandel, ichoyjx, mgorny, yaxunl, guansong, jfb, sstefan1, aaron.ballman, llvm-commits

Tags: #llvm, #flang, #clang

Differential Revision: https://reviews.llvm.org/D82906
2020-07-01 20:58:11 -04:00
zoecarver e7c5da57a5 [CodeGen] Add public function to emit C++ destructor call.
Adds `CodeGen::getCXXDestructorImplicitParam`, to retrieve a C++ destructor's implicit parameter (after the "this" pointer) based on the ABI in the given CodeGenModule.

This will allow other frontends (Swift, for example) to easily emit calls to object destructors with correct ABI semantics and calling conventions.

This is needed for Swift C++ interop. Here's the corresponding Swift change: https://github.com/apple/swift/pull/32291

Differential Revision: https://reviews.llvm.org/D82392
2020-07-01 11:01:23 -07:00
Xun Li 565e37c770 [Coroutines] Fix code coverage for coroutine
Summary:
Previously, source-based coverage analysis did not work properly for coroutines.
This patch adds processing of the coroutine body and co_return in the coverage analysis, so that we can handle them properly.
For the coroutine body, we should only look at the actual function body and ignore the compiler-generated parts; for co_return, we need to terminate the region in the same way as for a return statement.
Added a test that confirms it now works properly. (Without this patch, the statement after the if statement is treated wrongly.)

Reviewers: lewissbaker, modocache, junparser

Reviewed By: modocache

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82928
2020-07-01 10:11:40 -07:00
Erich Keane 2831a317b6 Implement AVX ABI Warning/error
The x86-64 "avx" feature changes how >128 bit vector types are passed:
instead of being passed in separate 128 bit registers, they can be
passed in 256 bit registers.

"avx512f" does the same thing, except it switches from 256 bit registers
to 512 bit registers.

The result of both of these is an ABI incompatibility between functions
compiled with and without these features.

This patch implements a warning/error pair upon an attempt to call a
function that would run afoul of this. First, if a function is called
that would have its ABI changed, we issue a warning.

Second, if said call is made in a situation where the caller and callee
are known to have different calling conventions (such as the case of
'target'), we instead issue an error.

Differential Revision: https://reviews.llvm.org/D82562
2020-07-01 07:14:31 -07:00
Simon Pilgrim 36aaffbf56 Fix Wdocumentation warnings due to outdated parameter list. NFC. 2020-07-01 12:01:18 +01:00
Richard Smith 4eff2beefb [c++20] consteval functions don't get vtable slots.
For the Itanium C++ ABI, this implements the rule added in
https://github.com/itanium-cxx-abi/cxx-abi/pull/83

For the MS C++ ABI, this implements the direction that seemed most
plausible based on personal correspondence with MSVC developers, but is
subject to change as they decide their ABI rule.
2020-06-30 18:22:09 -07:00
Craig Topper 3537939cda [X86] Move frontend CPU feature initialization to a look up table based implementation. NFCI
This replaces the switch statement implementation in the clang's
X86.cpp with a lookup table in X86TargetParser.cpp.

I've used constexpr and copy of the FeatureBitset from
SubtargetFeature.h to store the features in a lookup table.
After the lookup the bitset is translated into strings for use
by the rest of the frontend code.

I had to modify the implementation of the FeatureBitset to avoid
bugs in gcc 5.5 constexpr handling. It seems to not like the
same array entry to be used on the left side and right hand side
of an assignment or &= or |=. I've also used uint32_t instead of
uint64_t and sized based on the X86::CPU_FEATURE_MAX.

I've initialized the features for different CPUs outside of the
table so that we can express inheritance in an ad hoc way. This
was one of the big limitations of the switch and we had resorted
to labels and gotos.
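
The lookup-table approach described above can be sketched roughly as follows.
This is a minimal illustration, assuming C++14 or later, and not the actual
X86TargetParser.cpp code: `FeatureBits`, `CpuInfo`, `featuresFor`, the feature
enum and the "cpu-v1".."cpu-v3" names are all hypothetical. It shows a
constexpr bitset built on uint32_t words, per-CPU feature sets defined outside
the table so that one can build on another, and the final translation of the
bitset into strings.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical feature list; a real table would be much larger.
enum Feature : unsigned { FEAT_SSE2, FEAT_AVX, FEAT_AVX2, FEAT_MAX };

// Small constexpr bitset stored in 32-bit words and sized from FEAT_MAX.
class FeatureBits {
  static constexpr unsigned NumWords = (FEAT_MAX + 31) / 32;
  uint32_t Words[NumWords] = {};

public:
  constexpr FeatureBits() = default;

  // Returns a copy with feature F added. The new word is computed from the
  // source object and written into the copy.
  constexpr FeatureBits with(unsigned F) const {
    FeatureBits R = *this;
    R.Words[F / 32] = Words[F / 32] | (uint32_t(1) << (F % 32));
    return R;
  }

  constexpr bool test(unsigned F) const {
    return ((Words[F / 32] >> (F % 32)) & 1u) != 0;
  }
};

// Feature sets are defined outside the table so that each CPU can "inherit"
// from an earlier one (all CPU names here are made up).
constexpr FeatureBits FeaturesV1 = FeatureBits().with(FEAT_SSE2);
constexpr FeatureBits FeaturesV2 = FeaturesV1.with(FEAT_AVX);
constexpr FeatureBits FeaturesV3 = FeaturesV2.with(FEAT_AVX2);

struct CpuInfo {
  const char *Name;
  FeatureBits Features;
};

constexpr CpuInfo CpuTable[] = {
    {"cpu-v1", FeaturesV1},
    {"cpu-v2", FeaturesV2},
    {"cpu-v3", FeaturesV3},
};

constexpr const char *FeatureNames[FEAT_MAX] = {"sse2", "avx", "avx2"};

// Looks up a CPU by name and translates its bitset into feature strings.
// Returns an empty vector for an unknown CPU.
std::vector<std::string> featuresFor(const std::string &Cpu) {
  std::vector<std::string> Out;
  for (const CpuInfo &Info : CpuTable) {
    if (Cpu != Info.Name)
      continue;
    for (unsigned F = 0; F != FEAT_MAX; ++F)
      if (Info.Features.test(F))
        Out.push_back(FeatureNames[F]);
  }
  return Out;
}
```

With this layout, adding a CPU is one constant plus one table row, which is
the kind of reuse that the switch statement could only express with labels
and gotos.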

Differential Revision: https://reviews.llvm.org/D82731
2020-06-30 12:04:58 -07:00
Francesco Petrogalli 67e4330fac [sve][acle] Implement some of the C intrinsics for brain float.
Summary:
The following intrinsics have been extended to support brain float types:

svbfloat16_t svclasta[_bf16](svbool_t pg, svbfloat16_t fallback, svbfloat16_t data)
bfloat16_t svclasta[_n_bf16](svbool_t pg, bfloat16_t fallback, svbfloat16_t data)
bfloat16_t svlasta[_bf16](svbool_t pg, svbfloat16_t op)

svbfloat16_t svclastb[_bf16](svbool_t pg, svbfloat16_t fallback, svbfloat16_t data)
bfloat16_t svclastb[_n_bf16](svbool_t pg, bfloat16_t fallback, svbfloat16_t data)
bfloat16_t svlastb[_bf16](svbool_t pg, svbfloat16_t op)

svbfloat16_t svdup[_n]_bf16(bfloat16_t op)
svbfloat16_t svdup[_n]_bf16_m(svbfloat16_t inactive, svbool_t pg, bfloat16_t op)
svbfloat16_t svdup[_n]_bf16_x(svbool_t pg, bfloat16_t op)
svbfloat16_t svdup[_n]_bf16_z(svbool_t pg, bfloat16_t op)

svbfloat16_t svdupq[_n]_bf16(bfloat16_t x0, bfloat16_t x1, bfloat16_t x2, bfloat16_t x3, bfloat16_t x4, bfloat16_t x5, bfloat16_t x6, bfloat16_t x7)
svbfloat16_t svdupq_lane[_bf16](svbfloat16_t data, uint64_t index)

svbfloat16_t svinsr[_n_bf16](svbfloat16_t op1, bfloat16_t op2)

Reviewers: sdesmalen, kmclaughlin, c-rhodes, ctetreau, efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D82345
2020-06-29 16:09:08 +00:00
Bevin Hansson fefa34faf5 [CodeGen] Use the common semantic for fixed-point codegen, not the result semantic.
Summary:
Using the result semantic is wrong in some cases, such as
unsigned fixed-point + signed integer. In this case, the
result semantic is unsigned and the common semantic is
signed.

Reviewers: leonardchan

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82662
2020-06-29 16:22:29 +02:00
Fady Ghanim 80e15b4574 [Clang][OpenMP][OMPBuilder] Moving OMP allocation and cache creation code to OMPBuilderCBHelpers
Summary:
Modified the OMPBuilderCBHelpers in the following ways:
- Moved location of class definition and deleted all constructors
- Moved OpenMP-specific address allocation of local variables
- Moved threadprivate variable creation for the current thread

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79676
2020-06-28 19:04:20 -04:00
Melanie Blower f4aaed3bf1 Reland D81869 "Modify FPFeatures to use delta not absolute settings"
This reverts commit defd43a5b3.
with correction to solve msan report

To solve https://bugs.llvm.org/show_bug.cgi?id=46166 where the
floating point settings in PCH files aren't compatible, rewrite
FPFeatures to use a delta in the settings rather than absolute settings.
With this patch, these floating point options can be benign.
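
As a rough illustration of "delta, not absolute" settings, the sketch below
stores only the bits that were explicitly changed, together with a mask, and
applies them on top of whatever base settings are in effect. The names
`FPBit`, `FPOverride`, `set` and `applyTo` are hypothetical and the bit layout
is invented; this is not Clang's actual FPFeatures representation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical FP option bits (not Clang's real field layout).
enum FPBit : uint32_t {
  FP_AllowContract = 1u << 0,
  FP_AllowReassoc = 1u << 1,
  FP_StrictExceptions = 1u << 2,
};

// A delta: only the bits set in Mask are overridden; every other bit is
// inherited from the base settings the delta is applied to.
struct FPOverride {
  uint32_t Mask = 0;
  uint32_t Values = 0;

  // Records an explicit change to one option.
  void set(FPBit Bit, bool On) {
    Mask |= Bit;
    Values = On ? (Values | Bit) : (Values & ~Bit);
  }

  // Combines the delta with a set of base settings.
  uint32_t applyTo(uint32_t Base) const {
    return (Base & ~Mask) | (Values & Mask);
  }
};
```

Because an empty delta leaves the base untouched, the same stored delta can be
combined with different base settings without conflict, whereas stored
absolute settings would have to match the base exactly.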

Reviewers: rjmccall

Differential Revision: https://reviews.llvm.org/D81869
2020-06-27 01:34:57 -07:00
Matt Arsenault 9e03bdebc1 AMDGPU: Add llvm.amdgcn.sqrt intrinsic
I spread the GlobalISel test into the regular one, which I've been
avoiding so far.
2020-06-26 15:07:07 -04:00
Melanie Blower defd43a5b3 Revert "Revert "Revert "Modify FPFeatures to use delta not absolute settings"""
This reverts commit 9518763d71.
Memory sanitizer fails in CGFPOptionsRAII::CGFPOptionsRAII dtor
2020-06-26 08:47:04 -07:00
Melanie Blower 9518763d71 Revert "Revert "Modify FPFeatures to use delta not absolute settings""
This reverts commit b55d723ed6.
Reapply Modify FPFeatures to use delta not absolute settings

To solve https://bugs.llvm.org/show_bug.cgi?id=46166 where the
floating point settings in PCH files aren't compatible, rewrite
FPFeatures to use a delta in the settings rather than absolute settings.
With this patch, these floating point options can be benign.

Reviewers: rjmccall

Differential Revision: https://reviews.llvm.org/D81869
2020-06-26 08:00:08 -07:00
Melanie Blower b55d723ed6 Revert "Modify FPFeatures to use delta not absolute settings"
This reverts commit 3a748cbf86.
I'm reverting this commit because I forgot to format the commit message
properly. Sorry for the thrash.
2020-06-26 07:52:57 -07:00
Melanie Blower 3a748cbf86 Modify FPFeatures to use delta not absolute settings 2020-06-26 07:41:09 -07:00
Francesco Petrogalli 7200fa38a9 [sve][acle] Add some C intrinsics for brain float types.
Summary:
The following intrinsics have been added:

svuint16_t svcnt[_bf16]_m(svuint16_t inactive, svbool_t pg, svbfloat16_t op)
svuint16_t svcnt[_bf16]_x(svbool_t pg, svbfloat16_t op)
svuint16_t svcnt[_bf16]_z(svbool_t pg, svbfloat16_t op)

svbfloat16_t svtbl[_bf16](svbfloat16_t data, svuint16_t indices)

svbfloat16_t svtbl2[_bf16](svbfloat16x2_t data, svuint16_t indices)

svbfloat16_t svtbx[_bf16](svbfloat16_t fallback, svbfloat16_t data, svuint16_t indices)

Reviewers: c-rhodes, kmclaughlin, efriedma, sdesmalen, ctetreau

Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D82429
2020-06-25 16:31:01 +00:00
Andrew Wock 15edd7aaa7 [FPEnv] PowerPC-specific builtin constrained FP enablement
This change enables PowerPC compiler builtins to generate constrained
floating point operations when clang is instructed to do so.

A couple of possibly unexpected backend divergences between constrained
floating point and regular behavior are highlighted under the test tag
FIXME-CHECK. This may be something for those on the PPC backend to look
at.

Patch by: Drew Wock <drew.wock@sas.com>

Differential Revision: https://reviews.llvm.org/D82020
2020-06-25 11:42:58 -04:00
Alexey Bataev 32ea3397be [OPENMP]Dynamic globalization for parallel target regions.
Summary:
Added support for dynamic memory allocation for globalized variables in
case execution of target regions in parallel is required.

Reviewers: jdoerfert

Subscribers: jholewinski, yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82324
2020-06-25 08:25:24 -04:00
Tyker c95ffadb24 [AssumeBundles] Use operand bundles to encode alignment assumptions
Summary:
NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

Complementary to the assumption outliner prototype in D71692, this patch
shows how we could simplify the code emitted for an alignment
assumption. The generated code is smaller, less fragile, and it makes it
easier to recognize the additional use as an "assumption use".

As mentioned in D71692 and on the mailing list, we could adopt this
scheme, and similar schemes for other patterns, without adopting the
assumption outlining.

Reviewers: hfinkel, xbolva00, lebedev.ri, nikic, rjmccall, spatel, jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: yamauchi, kuter, fhahn, merge_guards_bot, hiraditya, bollu, rkruppe, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71739
2020-06-25 12:59:44 +02:00
Nigel Perks dc3f8913d2 Fix crash on XCore on unused inline in EmitTargetMetadata
EmitTargetMetadata passed to emitTargetMD a null pointer as returned
from GetGlobalValue, for an unused inline function which has been
removed from the module at that point.

A FIXME in CodeGenModule.cpp commented that the calling code in
EmitTargetMetadata should be moved into the one target that needs it
(XCore). A review comment agreed. So the calling loop has been moved
into the XCore subclass. The check for null is done in that loop.

Differential Revision: https://reviews.llvm.org/D77068
2020-06-24 12:48:17 -07:00
Michael Liao ebc9e0f1f0 Fix coding style. NFC.
- Remove `else` after `return`.
2020-06-24 13:13:42 -04:00
Cullen Rhodes 05e10ee0ae [AArch64][SVE2] Add bfloat16 support to whilerw/whilewr intrinsics
Reviewed By: fpetrogalli

Differential Revision: https://reviews.llvm.org/D82399
2020-06-24 10:06:31 +00:00
Cullen Rhodes fd2c4b8999 [AArch64][SVE] Add bfloat16 support to svlen intrinsic
Reviewed By: fpetrogalli

Differential Revision: https://reviews.llvm.org/D82186
2020-06-24 10:05:51 +00:00
Kazushi (Jam) Marukawa 96d4ccf00c [VE] Clang toolchain for VE
Summary:
This patch enables compilation of C code for the VE target with Clang.

Differential Revision: https://reviews.llvm.org/D79411
2020-06-24 10:12:09 +02:00
Eli Friedman bf8b63ed29 [clang codegen] Fix alignment of "Address" for incomplete array pointer.
The code was assuming all incomplete types don't have meaningful
alignment, but incomplete arrays do have meaningful alignment.

Fixes https://bugs.llvm.org/show_bug.cgi?id=45710

Differential Revision: https://reviews.llvm.org/D79052
2020-06-23 17:16:17 -07:00
David Blaikie 4935419d77 Remove clang::Codegen::EHPadEndScope as unused
Unused since r255423 / D15140 / 4e52d6f811

Found indirectly by assessing -debug-info-kind=constructors and
observing the EHPadEndScope type was never emitted because the
constructor is never called. (all credit to Amy Huang for identifying
this issue)
2020-06-23 15:18:49 -07:00
Mikhail Maltsev 3f353a2e5a [BFloat] Add convert/copy intrinsic support
This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

Specifically it adds intrinsic support in clang and llvm for Arm and AArch64.

The bfloat type and its properties are specified in the Arm Architecture Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:
  - Alexandros Lamprineas
  - Luke Cheeseman
  - Mikhail Maltsev
  - Momchil Velikov
  - Luke Geeson

Differential Revision: https://reviews.llvm.org/D80928
2020-06-23 14:27:05 +00:00
Alexey Bataev cb90e6a7c0 [OPENMP50]Codegen for scan directives in parallel for simd regions.
Summary:
Added codegen for scan directives in parallel for simd regions.

Emits the code for the directive with inscan reductions.
Original code:
```
 #pragma omp parallel for simd reduction(inscan, op : ...)
for() {
  <input phase>;
  #pragma omp scan (in)exclusive(...)
  <scan phase>
}
```
is transformed to something like:
```
 #pragma omp parallel
{
size num_iters = <num_iters>;
<type> buffer[num_iters];
 #pragma omp for simd
for (i: 0..<num_iters>) {
  <input phase>;
  buffer[i] = red;
}
 #pragma omp barrier
for (int k = 0; k != ceil(log2(num_iters)); ++k)
for (size i = last_iter; i >= pow(2, k); --i)
  buffer[i] op= buffer[i-pow(2,k)];
 #pragma omp for simd
for (0..<num_iters>) {
  red = InclusiveScan ? buffer[i] : buffer[i-1];
  <scan phase>;
}
}
```
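
The pseudocode above can be modelled sequentially as in the following sketch.
It is only an illustration of the buffer-based scan, not the code clang emits.
Several things are assumptions: the name `scanSum`, the use of `+` as the
reduction operation, `inputs[i]` standing for the value that the input phase
produces in iteration i, and the choice of 0 (the identity for `+`) as the
exclusive result of the first iteration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns, for every iteration i, the value of `red` that the scan phase
// would observe.
std::vector<int> scanSum(const std::vector<int> &inputs, bool inclusive) {
  const std::size_t n = inputs.size();

  // First loop: the input phase stores each iteration's value in the buffer.
  std::vector<int> buffer(inputs);

  // Combine step: ceil(log2(n)) passes with strides 1, 2, 4, ...
  // Walking i downwards means buffer[i - stride] still holds the value from
  // the previous pass when it is read, so the update can be done in place.
  for (std::size_t stride = 1; stride < n; stride *= 2)
    for (std::size_t i = n - 1; i >= stride; --i)
      buffer[i] += buffer[i - stride];

  // Second loop: the scan phase reads the prefix result for its iteration.
  std::vector<int> red(n);
  for (std::size_t i = 0; i < n; ++i)
    red[i] = inclusive ? buffer[i] : (i == 0 ? 0 : buffer[i - 1]);
  return red;
}
```

For the inputs 1, 2, 3, 4, 5 the inclusive variant yields 1, 3, 6, 10, 15 and
the exclusive variant yields 0, 1, 3, 6, 10.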

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82115
2020-06-23 08:41:11 -04:00
Mikhail Maltsev 9c579540ff [ARM] BFloat MatMul Intrinsics&CodeGen
Summary:
This patch adds support for BFloat Matrix Multiplication Intrinsics
and Code Generation from __bf16 to AArch32. This includes IR intrinsics. Tests are
provided as needed.

This patch is part of a series implementing the Bfloat16 extension of
the Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type and its properties are specified in the Arm
Architecture Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:

 - Luke Geeson
 - Momchil Velikov
 - Mikhail Maltsev
 - Luke Cheeseman
 - Simon Tatham

Reviewers: stuij, t.p.northover, SjoerdMeijer, sdesmalen, fpetrogalli, LukeGeeson, simon_tatham, dmgreen, MarkMurrayARM

Reviewed By: MarkMurrayARM

Subscribers: MarkMurrayARM, danielkiss, kristof.beyls, hiraditya, cfe-commits, llvm-commits, chill, miyuki

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D81740
2020-06-23 12:06:37 +00:00
Sander de Smalen 121e585ec8 [AArch64][SVE] ACLE: Add bfloat16 to struct load/stores.
This patch contains:
- Support in LLVM CodeGen for bfloat16 types for ld2/3/4 and st2/3/4.
- New bfloat16 ACLE builtins for svld(2|3|4)[_vnum] and svst(2|3|4)[_vnum]

Reviewers: stuij, efriedma, c-rhodes, fpetrogalli

Reviewed By: fpetrogalli

Tags: #clang, #lldb, #llvm

Differential Revision: https://reviews.llvm.org/D82187
2020-06-23 12:12:35 +01:00
Craig Topper 0dfc8e1837 [X86] Remove encoding value from the X86_FEATURE and X86_FEATURE_COMPAT macro. NFCI
This was originally done so we could separate the compatibility
values and the llvm internal only features into separate entries
in the feature array. This was needed when we explicitly had to
convert the feature into the proper 32-bit chunk at every reference
and we didn't want things moving around.

Now everything is in an array and we have helper functions or macros
to convert encoding to index, so renumbering is no longer an
issue.
2020-06-22 11:46:21 -07:00
Mikhail Maltsev 3a4feb1d53 [ARM][BFloat] Implement bf16 get/set_lane without casts to i16 vectors
Currently, in order to extract an element from a bf16 vector, we cast
the vector to an i16 vector, perform the extraction, and cast the result to
bfloat. This behavior was copied from the old fp16 implementation.

The goal of this patch is to achieve optimal code generation for lane
copying intrinsics in a subsequent patch (LLVM fails to fold certain
combinations of bitcast, insertelement, extractelement and
shufflevector instructions leading to the generation of suboptimal code).

Differential Revision: https://reviews.llvm.org/D82206
2020-06-22 17:35:43 +00:00
Zhi Zhuang 37fb860301 Add support of __builtin_expect_with_probability
Add a new builtin-function __builtin_expect_with_probability and
intrinsic llvm.expect.with.probability.
The interface is __builtin_expect_with_probability(long expr, long
expected, double probability).
It is mainly the same as __builtin_expect except for one more argument
indicating the probability that the expression equals the expected value.
The probability must be a constant floating-point expression in the
range [0.0, 1.0] inclusive.
It is similar to the __builtin_expect_with_probability built-in function
in GCC.
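
A minimal usage sketch follows. The macro `EXPECT_WITH_PROBABILITY` and the
helper `digitValue` are hypothetical names introduced only for this example,
and the 0.9 probability is an arbitrary assumption about the input. The macro
falls back to the plain expression on compilers that do not report the
builtin, so the hint only affects optimization and never the result.

```cpp
#include <cassert>

// Compilers without __has_builtin are treated as lacking the builtin.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__builtin_expect_with_probability)
#define EXPECT_WITH_PROBABILITY(expr, expected, prob)                          \
  __builtin_expect_with_probability((expr), (expected), (prob))
#else
#define EXPECT_WITH_PROBABILITY(expr, expected, prob) (expr)
#endif

// Hypothetical helper: callers are assumed to pass a decimal digit about
// 90% of the time, so the digit branch is marked as the likely one.
int digitValue(char c) {
  if (EXPECT_WITH_PROBABILITY(c >= '0' && c <= '9', 1, 0.9))
    return c - '0';
  return -1;
}
```

The probability argument has to be a compile-time constant, which is why it is
written as a literal at the call site rather than passed in at run time.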

Differential Revision: https://reviews.llvm.org/D79830
2020-06-22 10:21:28 -07:00