Commit Graph

14592 Commits

Author SHA1 Message Date
Nico Weber d7ec48d71b [clang] accept -fsanitize-ignorelist= in addition to -fsanitize-blacklist=
Use that for internal names (including the default ignorelists of the
sanitizers).

Differential Revision: https://reviews.llvm.org/D101832
2021-05-04 10:24:00 -04:00
serge-sans-paille b83b23275b Introduce -Wreserved-identifier
Warn when a declaration uses an identifier that doesn't obey the reserved
identifier rule from C and/or C++.

Differential Revision: https://reviews.llvm.org/D93095
2021-05-04 11:19:01 +02:00
Alex Lorenz 2669abaecf [clang][CodeGen] Use llvm::stable_sort for multi version resolver options
The use of llvm::sort causes periodic failures on the bot with EXPENSIVE_CHECKS enabled,
as the regular sort pre-shuffles the array in the expensive checks mode, leading to a
non-deterministic test result which causes the CodeGenCXX/attr-cpuspecific-outoflinedefs.cpp
testcase to fail on the bot (http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-expensive/).
2021-05-03 20:07:00 -07:00
Valentin Clement 63f8226f25 [OpenMPIRBuilder] Add createOffloadMaptypes and createOffloadMapnames functions
Add function to create the offload_maptypes and the offload_mapnames globals. These two functions
are used in clang. They will be used in the Flang/MLIR lowering as well.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D101503
2021-05-03 15:42:32 -04:00
Giorgis Georgakoudis a27ca15dd0 [OpenMP] Fix non-determinism in clang task codegen
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D101739
2021-05-03 10:34:38 -07:00
Saurabh Jha 696becbd13 [Matrix] Remove bitcast when casting between matrices of the same size
In matrix type casts, we were doing bitcast when the matrices had the same size. This was incorrect and this patch fixes that.
Also added some new CodeGen tests for signed <-> usigned conversions

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D101754
2021-05-03 15:31:43 +01:00
Yaxun (Sam) Liu c58a6a6fb4 [HIP] Fix device lib selection
Choose optimized device lib bitcode by fp options
for performance.

Reviewed by: Artem Belevich, Fangrui Song

Differential Revision: https://reviews.llvm.org/D101654
2021-05-01 20:31:11 -04:00
Yaxun (Sam) Liu 0175999805 [AMDGPU] Add options -mamdgpu-ieee -mno-amdgpu-ieee
AMDGPU backend need to know whether floating point opcodes that support exception
flag gathering quiet and propagate signaling NaN inputs per IEEE754-2008, which is
conveyed by a function attribute "amdgpu-ieee". "amdgpu-ieee"="false" turns this off.
Without this function attribute backend assumes it is on for compute functions.

-mamdgpu-ieee and -mno-amdgpu-ieee are added to Clang to control this function attribute.
By default it is on. -mno-amdgpu-ieee requires -fno-honor-nans or equivalent.

Reviewed by: Matt Arsenault

Differential Revision: https://reviews.llvm.org/D77013
2021-05-01 09:02:55 -04:00
Nemanja Ivanovic c3da07d216 [PowerPC] Provide fastmath sqrt and div functions in altivec.h
This adds the long overdue implementations of these functions
that have been part of the ABI document and are now part of
the "Power Vector Intrinsic Programming Reference" (PVIPR).

The approach is to add new builtins and to emit code with
the fast flag regardless of whether fastmath was specified
on the command line.

Differential revision: https://reviews.llvm.org/D101209
2021-04-30 19:17:48 -05:00
Joel E. Denny 82e99f5035 [OpenMP] Fix second debug name from map clause
This patch fixes a bug from D89802.  For example, without it, Clang
generates x as the debug map name for both x and y in the following
example:

```
 #pragma omp target map(to: x, y)
 x = y = 1;
```

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D101564
2021-04-30 16:26:59 -04:00
Florian Hahn 6c31295493
[clang] Refactor mustprogress handling, add it to all loops in c++11+.
Currently Clang does not add mustprogress to inifinite loops with a
known constant condition, matching C11 behavior. The forward progress
guarantee in C++11 and later should allow us to add mustprogress to any
loop (http://eel.is/c++draft/intro.progress#1).

This allows us to simplify the code dealing with adding mustprogress a
bit.

Reviewed By: aaron.ballman, lebedev.ri

Differential Revision: https://reviews.llvm.org/D96418
2021-04-30 14:13:47 +01:00
Akira Hatanaka 2e1d9ebd46 [ObjC][ARC] Don't enter the cleanup scope if the initializer expression
isn't an ExprWithCleanups

This patch fixes a bug where a temporary ObjC pointer is released before
the end of the full expression.

This fixes PR50043.

rdar://77030453

Differential Revision: https://reviews.llvm.org/D101502
2021-04-29 16:04:30 -07:00
Chirag Khandelwal c204106188 [Clang][OpenMP] Frontend work for sections - D89671
This patch is child of D89671, contains the clang
implementation to use the OpenMP IRBuilder's section
construct.

Co-author: @anchu-rajendran

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D91054
2021-04-29 19:52:27 +05:30
Evgeny Leviant 6a0283d0d2 [NewPM] Add an option to dump pass structure
Patch adds -debug-pass-structure option to dump pass structure when
new pass manager is used.

Differential revision: https://reviews.llvm.org/D99599
2021-04-29 10:29:42 +03:00
Dan Liew 1bbbcff99d [NFC] Rename SanitizeAddressDtorKind codegen opt to not have `Kind` suffix.
This is post commit follow up based on discussions in
https://reviews.llvm.org/D101122.

Differential Revision: https://reviews.llvm.org/D101490

(cherry picked from commit f4c7e82d1b21e637c4e0c53125b126c407d8bdbf)
2021-04-28 18:37:16 -07:00
Ryan Santhirarajan 0395f9e70b [ARM] Neon Polynomial vadd Intrinsic fix
The Neon vadd intrinsics were added to the ARMSIMD intrinsic map,
however due to being defined under an AArch64 guard in arm_neon.td,
were not previously useable on ARM. This change rectifies that.

It is important to note that poly128 is not valid on ARM, thus it was
extracted out of the original arm_neon.td definition and separated
for the sake of AArch64.

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D100772
2021-04-28 11:59:40 -07:00
Nico Weber 0f1137ba79 [clang/Basic] Make TargetInfo.h not use DataLayout again
Reverts parts of https://reviews.llvm.org/D17183, but keeps the
resetDataLayout() API and adds an assert that checks that datalayout string and
user label prefix are in sync.

Approach 1 in https://reviews.llvm.org/D17183#2653279
Reduces number of TUs build for 'clang-format' from 689 to 575.

I also implemented approach 2 in D100764. If someone feels motivated
to make us use DataLayout more, it's easy to revert this change here
and go with D100764 instead. I don't plan on doing more work in this
area though, so I prefer going with the smaller, more self-consistent change.

Differential Revision: https://reviews.llvm.org/D100776
2021-04-27 22:26:10 -04:00
Yonghong Song a2a3ca8d97 BPF: emit debuginfo for Function of DeclRefExpr if requested
Commit e3d8ee35e4 ("reland "[DebugInfo] Support to emit debugInfo
for extern variables"") added support to emit debugInfo for
extern variables if requested by the target. Currently, only
BPF target enables this feature by default.

As BPF ecosystem grows, callback function started to get
support, e.g., recently bpf_for_each_map_elem() is introduced
(https://lwn.net/Articles/846504/) with a callback function as an
argument. In the future we may have something like below as
a demonstration of use case :
    extern int do_work(int);
    long bpf_helper(void *callback_fn, void *callback_ctx, ...);
    long prog_main() {
        struct { ... } ctx = { ... };
        return bpf_helper(&do_work, &ctx, ...);
    }
Basically bpf helper may have a callback function and the
callback function is defined in another file or in the kernel.
In this case, we would like to know the debuginfo types for
do_work(), so the verifier can proper verify the safety of
bpf_helper() call.

For the following example,
    extern int do_work(int);
    long bpf_helper(void *callback_fn);
    long prog() {
        return bpf_helper(&do_work);
    }

Currently, there is no debuginfo generated for extern function do_work().
In the IR, we have,
    ...
    define dso_local i64 @prog() local_unnamed_addr #0 !dbg !7 {
    entry:
      %call = tail call i64 @bpf_helper(i8* bitcast (i32 (i32)* @do_work to i8*)) #2, !dbg !11
      ret i64 %call, !dbg !12
    }
    ...
    declare dso_local i32 @do_work(i32) #1
    ...

This patch added support for the above callback function use case, and
the generated IR looks like below:
    ...
    declare !dbg !17 dso_local i32 @do_work(i32) #1
    ...
    !17 = !DISubprogram(name: "do_work", scope: !1, file: !1, line: 1, type: !18, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized, retainedNodes: !2)
    !18 = !DISubroutineType(types: !19)
    !19 = !{!20, !20}
    !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)

The TargetInfo.allowDebugInfoForExternalVar is renamed to
TargetInfo.allowDebugInfoForExternalRef as now it guards
both extern variable and extern function debuginfo generation.

Differential Revision: https://reviews.llvm.org/D100567
2021-04-26 16:53:25 -07:00
Alexey Bader 7818906ca1 [SYCL] Implement SYCL address space attributes handling
Default address space (applies when no explicit address space was
specified) maps to generic (4) address space.

Added SYCL named address spaces `sycl_global`, `sycl_local` and
`sycl_private` defined as sub-sets of the default address space.

Static variables without address space now reside in global address
space when compile for SPIR target, unless they have an explicit address
space qualifier in source code.

Differential Revision: https://reviews.llvm.org/D89909
2021-04-26 13:44:10 +03:00
Levy Hsu 8cf54c7ff5 [RISCV] [1/2] Add IR intrinsic for Zbe extension
RV32/64:
bcompress
bdecompress

RV64 ONLY:
bcompressw
bdecompressw

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D101143
2021-04-25 19:14:34 -07:00
Thomas Lively 502f54049d [WebAssembly] Finalize wasm_simd128.h intrinsics
Adds new intrinsics for instructions that are in the final SIMD spec but did not
previously have intrinsics. Also updates the names of existing intrinsics to
reflect the final names of the underlying instructions in the spec. Keeps the
old names as deprecated functions to ease the transition to the new names.

Differential Revision: https://reviews.llvm.org/D101112
2021-04-23 13:37:27 -07:00
Fangrui Song a92dbadffe [OpenMP] Fix -Wdeprecated-copy 2021-04-23 10:49:19 -07:00
Dávid Bolvanský 2cae7025c1 Reland "[Clang] Propagate guaranteed alignment for malloc and others"
This relands commit 6914a0ed2b. Crash in InstCombine was fixed.
2021-04-23 14:05:57 +02:00
Dávid Bolvanský 6914a0ed2b Revert "[Clang] Propagate guaranteed alignment for malloc and others"
This reverts commit c2297544c0. Some buildbots are broken.
2021-04-23 11:33:33 +02:00
Dávid Bolvanský c2297544c0 [Clang] Propagate guaranteed alignment for malloc and others
LLVM should be smarter about *known* malloc's alignment and this knowledge may enable other optimizations.

Originally started as LLVM patch - https://reviews.llvm.org/D100862 but this logic should be really in Clang.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D100879
2021-04-23 11:07:14 +02:00
Fangrui Song 2786e673c7 [IR][sanitizer] Add module flag "frame-pointer" and set it for cc1 -mframe-pointer={non-leaf,all}
The Linux kernel objtool diagnostic `call without frame pointer save/setup`
arise in multiple instrumentation passes (asan/tsan/gcov). With the mechanism
introduced in D100251, it's trivial to respect the command line
-m[no-]omit-leaf-frame-pointer/-f[no-]omit-frame-pointer, so let's do it.

Fix: https://github.com/ClangBuiltLinux/linux/issues/1236 (tsan)
Fix: https://github.com/ClangBuiltLinux/linux/issues/1238 (asan)

Also document the function attribute "frame-pointer" which is long overdue.

Differential Revision: https://reviews.llvm.org/D101016
2021-04-22 18:07:30 -07:00
Levy Hsu b49337bbb9 [RISCV] [1/2] Add IR intrinsic for Zbp extension
RV32/64:
    grev
    grevi
    gorc
    gorci
    shfl
    shfli
    unshfl
    unshfli

RV64 ONLY:
    grevw
    greviw
    gorcw
    gorciw
    shflw
    shfli     (For non-existing shfliw)
    unshfli   (For non-existing unshfliw)

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D100830
2021-04-22 16:34:51 -07:00
Fangrui Song ef5e7f90ea Temporarily revert the code part of D100981 "Delete le32/le64 targets"
This partially reverts commit 77ac823fd2.

Halide uses le32/le64 (https://github.com/halide/Halide/pull/5934).
Temporarily brings back the code part to give them some time for migration.
2021-04-22 10:18:44 -07:00
Hamza Mahfooz be2277fbf2
[Matrix] Support #pragma clang fp
From https://bugs.llvm.org/show_bug.cgi?id=49739:

Currently, `#pragma clang fp` are ignored for matrix types.

For the code below, the `contract` fast-math flag should be added to the generated call to `llvm.matrix.multiply` and `fadd`

```
typedef float fx2x2_t __attribute__((matrix_type(2, 2)));

void foo(fx2x2_t &A, fx2x2_t &C, fx2x2_t &B) {
  #pragma clang fp contract(fast)
  C = A*B + C;
}
```

Reviewed By: fhahn, mibintc

Differential Revision: https://reviews.llvm.org/D100834
2021-04-22 11:45:34 +01:00
Giorgis Georgakoudis a2dbfb6b72 [OpenMP] Simplify offloading parallel call codegen
This revision simplifies Clang codegen for parallel regions in OpenMP GPU target offloading and corresponding changes in libomptarget: SPMD/non-SPMD parallel calls are unified under a single `kmpc_parallel_51` runtime entry point for parallel regions (which will be commonized between target, host-side parallel regions), data sharing is internalized to the runtime. Tests have been auto-generated using `update_cc_test_checks.py`. Also, the revision contains changes to OpenMPOpt for remark creation on target offloading regions.

Reviewed By: jdoerfert, Meinersbur

Differential Revision: https://reviews.llvm.org/D95976
2021-04-21 18:46:07 -07:00
Fangrui Song 77ac823fd2 Delete le32/le64 targets
They are unused now.

Note: NaCl is still used and is currently expected to be needed until 2022-06
(https://blog.chromium.org/2020/08/changes-to-chrome-app-support-timeline.html).

Differential Revision: https://reviews.llvm.org/D100981
2021-04-21 18:44:12 -07:00
Fangrui Song 775a9483e5 [IR][sanitizer] Set nounwind on module ctor/dtor, additionally set uwtable if -fasynchronous-unwind-tables
On ELF targets, if a function has uwtable or personality, or does not have
nounwind (`needsUnwindTableEntry`), it marks that `.eh_frame` is needed in the module.

Then, a function gets `.eh_frame` if `needsUnwindTableEntry` or `-g[123]` is specified.
(i.e. If -g[123], every function gets `.eh_frame`.
This behavior is strange but that is the status quo on GCC and Clang.)

Let's take asan as an example. Other sanitizers are similar.
`asan.module_[cd]tor` has no attribute. `needsUnwindTableEntry` returns true,
so every function gets `.eh_frame` if `-g[123]` is specified.
This is the root cause that
`-fno-exceptions -fno-asynchronous-unwind-tables -g` produces .debug_frame
while
`-fno-exceptions -fno-asynchronous-unwind-tables -g -fsanitize=address` produces .eh_frame.

This patch

* sets the nounwind attribute on sanitizer module ctor/dtor.
* let Clang emit a module flag metadata "uwtable" for -fasynchronous-unwind-tables. If "uwtable" is set, sanitizer module ctor/dtor additionally get the uwtable attribute.

The "uwtable" mechanism is generic: synthesized functions not cloned/specialized
from existing ones should consider `Function::createWithDefaultAttr` instead of
`Function::create` if they want to get some default attributes which
have more of module semantics.

Other candidates: "frame-pointer" (https://github.com/ClangBuiltLinux/linux/issues/955
https://github.com/ClangBuiltLinux/linux/issues/1238), dso_local, etc.

Differential Revision: https://reviews.llvm.org/D100251
2021-04-21 15:58:20 -07:00
Alexey Bataev 079884225a [OPENMP]Fix PR49698: OpenMP declare mapper causes segmentation fault.
The implicitly generated mappings for allocation/deallocation in mappers
runtime should be mapped as implicit, also no need to clear member_of
flag to avoid ref counter increment. Also, the ref counter should not be
incremented for the very first element that comes from the mapper
function.

Differential Revision: https://reviews.llvm.org/D100673
2021-04-21 10:38:31 -07:00
Erich Keane 0ed613612c Ensure target-multiversioning emits deferred declarations
As reported in PR50025, sometimes we would end up not emitting functions
needed by inline multiversioned variants. This is because we typically
use the 'deferred decl' mechanism to emit these.  However, the variants
are emitted after that typically happens.  This fixes that by ensuring
we re-run deferred decls after this happens. Also, the multiversion
emission is done recursively to ensure that MV functions that require
other MV functions to be emitted get emitted.
2021-04-20 08:10:26 -07:00
Jan Svoboda 6a72ed239c [clang] NFC: Fix range-based for loop warnings related to decl lookup 2021-04-19 18:31:31 +02:00
Xun Li 5faba87938 Revert "[Coroutines] Set presplit attribute in Clang instead of CoroEarly pass"
This reverts commit fa6b54c44a.
The commited patch broke mlir tests. It seems that mlir tests depend on coroutine function properties set in CoroEarly pass.
2021-04-18 17:22:28 -07:00
Xun Li fa6b54c44a [Coroutines] Set presplit attribute in Clang instead of CoroEarly pass
Presplit coroutines cannot be inlined. During AlwaysInliner we check if a function is a presplit coroutine, if so we skip inlining.
The presplit coroutine attributes are set in CoroEarly pass.
However in O0 pipeline, AlwaysInliner runs before CoroEarly, so the attribute isn't set yet and will still inline the coroutine.
This causes Clang to crash: https://bugs.llvm.org/show_bug.cgi?id=49920

To fix this, we set the attributes in the Clang front-end instead of in CoroEarly pass.

Reviewed By: rjmccall, ChuanqiXu

Differential Revision: https://reviews.llvm.org/D100282
2021-04-18 15:41:09 -07:00
Xun Li c0211e8d7d Revert "[Coroutines] Move CoroEarly pass to before AlwaysInliner"
This reverts commit 2b50f5a434.
Forgot to update the description of the commit to sync with phabricator. Going to redo the commit.
2021-04-18 15:38:19 -07:00
Xun Li 2b50f5a434 [Coroutines] Move CoroEarly pass to before AlwaysInliner
Presplit coroutines cannot be inlined. During AlwaysInliner we check if a function is a presplit coroutine, if so we skip inlining.
The presplit coroutine attributes are set in CoroEarly pass.
However in O0 pipeline, AlwaysInliner runs before CoroEarly, so the attribute isn't set yet and will still inline the coroutine.
This causes Clang to crash: https://bugs.llvm.org/show_bug.cgi?id=49920

Differential Revision: https://reviews.llvm.org/D100282
2021-04-18 14:54:04 -07:00
Yaxun (Sam) Liu d5c0f00e21 [CUDA][HIP] Mark device var used by host only
Add device variables to llvm.compiler.used if they are
ODR-used by either host or device functions.

This is necessary to prevent them from being
eliminated by whole-program optimization
where the compiler has no way to know a device
variable is used by some host code.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D98814
2021-04-17 11:25:25 -04:00
Serge Guelton d6de1e1a71 Normalize interaction with boolean attributes
Such attributes can either be unset, or set to "true" or "false" (as string).
throughout the codebase, this led to inelegant checks ranging from

        if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")

to

        if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")

Introduce a getValueAsBool that normalize the check, with the following
behavior:

no attributes or attribute set to "false" => return false
attribute set to "true" => return true

Differential Revision: https://reviews.llvm.org/D99299
2021-04-17 08:17:33 +02:00
Thomas Lively 5c729750a6 [WebAssembly] Remove saturating fp-to-int target intrinsics
Use the target-independent @llvm.fptosi and @llvm.fptoui intrinsics instead.
This includes removing the instrinsics for i32x4.trunc_sat_zero_f64x2_{s,u},
which are now represented in IR as a saturating truncation to a v2i32 followed by
a concatenation with a zero vector.

Differential Revision: https://reviews.llvm.org/D100596
2021-04-16 12:11:20 -07:00
Alexey Bataev 10c7b9f64f [OPENMP]Fix PR49115: Incorrect results for scan directive.
For combined worksharing directives need to emit the temp arrays outside
of the parallel region and update them in the master thread only.

Differential Revision: https://reviews.llvm.org/D100187
2021-04-16 06:25:35 -07:00
Joshua Haberman 8344675908 Implemented [[clang::musttail]] attribute for guaranteed tail calls.
This is a Clang-only change and depends on the existing "musttail"
support already implemented in LLVM.

The [[clang::musttail]] attribute goes on a return statement, not
a function definition. There are several constraints that the user
must follow when using [[clang::musttail]], and these constraints
are verified by Sema.

Tail calls are supported on regular function calls, calls through a
function pointer, member function calls, and even pointer to member.

Future work would be to throw a warning if a users tries to pass
a pointer or reference to a local variable through a musttail call.

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D99517
2021-04-15 17:12:21 -07:00
Momchil Velikov f9d932e673 [clang][AArch64] Correctly align HFA arguments when passed on the stack
When we pass a AArch64 Homogeneous Floating-Point
Aggregate (HFA) argument with increased alignment
requirements, for example

    struct S {
      __attribute__ ((__aligned__(16))) double v[4];
    };

Clang uses `[4 x double]` for the parameter, which is passed
on the stack at alignment 8, whereas it should be at
alignment 16, following Rule C.4 in
AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules)

Currently we don't have a way to express in LLVM IR the
alignment requirements of the function arguments. The align
attribute is applicable to pointers only, and only for some
special ways of passing arguments (e..g byval). When
implementing AAPCS32/AAPCS64, clang resorts to dubious hacks
of coercing to types, which naturally have the needed
alignment. We don't have enough types to cover all the
cases, though.

This patch introduces a new use of the stackalign attribute
to control stack slot alignment, when and if an argument is
passed in memory.

The attribute align is left as an optimizer hint - it still
applies to pointer types only and pertains to the content of
the pointer, whereas the alignment of the pointer itself is
determined by the stackalign attribute.

For byval arguments, the stackalign attribute assumes the
role, previously perfomed by align, falling back to align if
stackalign` is absent.

On the clang side, when passing arguments using the "direct"
style (cf. `ABIArgInfo::Kind`), now we can optionally
specify an alignment, which is emitted as the new
`stackalign` attribute.

Patch by Momchil Velikov and Lucas Prates.

Differential Revision: https://reviews.llvm.org/D98794
2021-04-15 22:58:14 +01:00
Martin Storsjö 8e0f2e89ff [clang] [AArch64] Fix handling of HFAs passed to Windows variadic functions
The documentation says that for variadic functions, all composites
are treated similarly, no special handling of HFAs/HVAs, not even
for the fixed arguments of a variadic function.

Differential Revision: https://reviews.llvm.org/D100467
2021-04-15 22:21:27 +03:00
cchen e0c2125d1d [OpenMP] Added codegen for masked directive
Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D100514
2021-04-15 12:55:07 -05:00
Thomas Lively 6a18cc23ef [WebAssembly] Codegen for i64x2.extend_{low,high}_i32x4_{s,u}
Removes the builtins and intrinsics used to opt in to using these instructions
and replaces them with normal ISel patterns now that they are no longer
prototypes.

Differential Revision: https://reviews.llvm.org/D100402
2021-04-14 13:43:09 -07:00
Thomas Lively af7925b4dd [WebAssembly] Codegen for f64x2.convert_low_i32x4_{s,u}
Add a custom DAG combine and ISD opcode for detecting patterns like

  (uint_to_fp (extract_subvector ...))

before the extract_subvector is expanded to ensure that they will ultimately
lower to f64x2.convert_low_i32x4_{s,u} instructions. Since these instructions
are no longer prototypes and can now be produced via standard IR, this commit
also removes the target intrinsics and builtins that had been used to prototype
the instructions.

Differential Revision: https://reviews.llvm.org/D100425
2021-04-14 10:42:45 -07:00
Thomas Lively af7ab81ce3 [WebAssembly] Use standard intrinsics for f32x4 and f64x2 ops
Now that these instructions are no longer prototypes, we do not need to be
careful about keeping them opt-in and can use the standard LLVM infrastructure
for them. This commit removes the bespoke intrinsics we were using to represent
these operations in favor of the corresponding target-independent intrinsics.
The clang builtins are preserved because there is no standard way to easily
represent these operations in C/C++.

For consistency with the scalar codegen in the Wasm backend, the intrinsic used
to represent {f32x4,f64x2}.nearest is @llvm.nearbyint even though
@llvm.roundeven better captures the semantics of the underlying Wasm
instruction. Replacing our use of @llvm.nearbyint with use of @llvm.roundeven is
left to a potential future patch.

Differential Revision: https://reviews.llvm.org/D100411
2021-04-14 09:19:27 -07:00
Martin Storsjö 3637c5c8ec [clang] [AArch64] Fix Windows va_arg handling for larger structs
Aggregate types over 16 bytes are passed by reference.

Contrary to the x86_64 ABI, smaller structs with an odd (non power
of two) are padded and passed in registers.

Differential Revision: https://reviews.llvm.org/D100374
2021-04-14 14:51:53 +03:00
Liu, Chen3 1c4108ab66 [i386] Modify the alignment of __m128/__m256/__m512 vector type according i386 abi.
According to i386 System V ABI:

1. when __m256 are required to be passed on the stack, the stack pointer must be aligned on a 0 mod 32 byte boundary at the time of the call.
2. when __m512 are required to be passed on the stack, the stack pointer must be aligned on a 0 mod 64 byte boundary at the time of the call.

The current method of clang passing __m512 parameter are as follow:

1. when target supports avx512, passing it with 64 byte alignment;
2. when target supports avx, passing it with 32 byte alignment;
3. Otherwise, passing it with 16 byte alignment.

Passing __m256 parameter are as follow:

1. when target supports avx or avx512, passing it with 32 byte alignment;
2. Otherwise, passing it with 16 byte alignment.

This pach will passing __m128/__m256/__m512 following i386 System V ABI and
apply it to Linux only since other System V OS (e.g Darwin, PS4 and FreeBSD) don't
want to spend any effort dealing with the ramifications of ABI breaks at present.

Differential Revision: https://reviews.llvm.org/D78564
2021-04-14 16:44:54 +08:00
Ben Dunbobbin eae2d4b852 [Windows Itanium][PS4] handle dllimport/export w.r.t vtables/rtti
The existing Windows Itanium patches for dllimport/export
behaviour w.r.t vtables/rtti can't be adopted for PS4 due to
backwards compatibility reasons (see comments on
https://reviews.llvm.org/D90299).

This commit adds our PS4 scheme for this to Clang.

Differential Revision: https://reviews.llvm.org/D93203
2021-04-13 11:41:10 +01:00
Esme-Yi dff922f39b Reland [DebugInfo] Fix the mismatching between C++ language tags and Dwarf versions.""
This reverts commit c965e14a12.
2021-04-12 11:05:55 +00:00
Esme-Yi c965e14a12 Revert "[DebugInfo] Fix the mismatching between C++ language tags and Dwarf versions."
This reverts commit 62fa9b9388.
2021-04-12 10:36:46 +00:00
Esme-Yi 62fa9b9388 [DebugInfo] Fix the mismatching between C++ language tags and Dwarf versions.
Summary: The tags DW_LANG_C_plus_plus_14 and DW_LANG_C_plus_plus_11, introduced in Dwarf-5, are unexpected in previous versions. Fixing the mismathing doesn't have any drawbacks for any other debuggers, but helps dbx.

Reviewed By: aprantl, shchenz

Differential Revision: https://reviews.llvm.org/D99250
2021-04-12 07:42:54 +00:00
yifeng.dongyifeng 3a6a80b641 [Clang][Coroutine][DebugInfo] In c++ coroutine, clang will emit different debug info variables for parameters and move-parameters.
The first one is the real parameters of the coroutine function, the
other one just for copying parameters to the coroutine frame.

Considering the following c++ code:
```
struct coro {
  ...
};

coro foo(struct test & t) {
  ...
  co_await suspend_always();
    ...
    co_await suspend_always();
    ...
    co_await suspend_always();
}

int main(int argc, char *argv[]) {
  auto c = foo(...);
    c.handle.resume();
      ...
  }
```

Function foo is the standard coroutine function, and it has only
one parameter named t (ignoring this at first),
when we use the llvm code to compile this function, we can get the
following ir:

```
!2921 = distinct !DISubprogram(name: "foo", linkageName:
"_ZN6Object3fooE4test", scope: !2211, file: !45, li\
ne: 48, type: !2329, scopeLine: 48, flags: DIFlagPrototyped |
DIFlagAllCallsDescribed, spFlags: DISPFlagDefi\
nition | DISPFlagOptimized, unit: !44, declaration: !2328,
retainedNodes: !2922)
!2924 = !DILocalVariable(name: "t", arg: 2, scope: !2921, file: !45,
line: 48, type: !838)
...
!2926 = !DILocalVariable(name: "t", scope: !2921, type: !838, flags:
DIFlagArtificial)
```
We can find there are two `the same` DIVariable named t in the same
dwarf scope for foo.resume.
And when we try to use llvm-dwarfdump to dump the dwarf info of this
elf, we get the following output:

```
0x00006684:   DW_TAG_subprogram
                DW_AT_low_pc    (0x00000000004013a0)
                DW_AT_high_pc   (0x00000000004013a8)
                DW_AT_frame_base        (DW_OP_reg7 RSP)
                DW_AT_object_pointer    (0x0000669c)
                DW_AT_GNU_all_call_sites        (true)
                DW_AT_specification     (0x00005b5c "_ZN6Object3fooE4test")

0x000066a5:     DW_TAG_formal_parameter
                DW_AT_name    ("t")
                DW_AT_decl_file       ("/disk1/yifeng.dongyifeng/my_code/llvm/build/bin/coro-debug-1.cpp")
                DW_AT_decl_line       (48)
                DW_AT_type    (0x00004146 "test")

0x000066ba:     DW_TAG_variable
                  DW_AT_name    ("t")
                  DW_AT_type    (0x00004146 "test")
                  DW_AT_artificial      (true)
```
The elf also has two 't' in the same scope.
But unluckily, it might let the debugger
confused. And failed to print parameters for O0 or above.
This patch will make coroutine parameters and move
parameters use the same DIVar and try to fix the problems
that I mentioned before.

Test Plan: check-clang

Reviewed By: aprantl, jmorse

Differential Revision: https://reviews.llvm.org/D97533
2021-04-12 11:10:47 +08:00
Saurabh Jha 71ab6c98a0
[Matrix] Implement C-style explicit type conversions for matrix types.
This implements C-style type conversions for matrix types, as specified
in clang/docs/MatrixTypes.rst.

Fixes PR47141.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D99037
2021-04-10 11:48:41 +01:00
Roman Lebedev 6270b3a1ea
Temporairly revert "[CGCall] Annotate `this` argument with alignment"
As per @jyknight, "It seems like there's a bug with vtable thunks getting the wrong information."
See https://reviews.llvm.org/D99790#2680857, https://godbolt.org/z/MxhYMe1q7

This reverts commit 0aa0458f14.
2021-04-10 10:43:16 +03:00
Ben Shi 4f173c0c42 [clang][AVR] Support variable decorator '__flash'
Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D96853
2021-04-10 11:23:55 +08:00
cchen 1a43fd2769 [OpenMP51] Initial support for masked directive and filter clause
Adds basic parsing/sema/serialization support for the #pragma omp masked
directive.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D99995
2021-04-09 14:00:36 -05:00
Soumi Manna 5d7cb79416 RISCVABIInfo::classifyArgumentType: Fix static analyzer warnings with uninitialized variables warnings - NFCI
Differential Revision: https://reviews.llvm.org/D100172
2021-04-09 15:23:32 +01:00
Yaxun (Sam) Liu 25942d7c49 [AMDGPU] Allow relaxed/consume memory order for atomic inc/dec
Reviewed by: Jon Chesterfield

Differential Revision: https://reviews.llvm.org/D100144
2021-04-09 09:23:41 -04:00
David Blaikie eb8a28e2cf DebugInfo: Include inline namespaces in template specialization parameter names
This ensures these types have distinct names if they are distinct types
(eg: if one is an instantiation with a type in one inline namespace, and
another from a type with the same simple name, but in a different inline
namespace).
2021-04-08 17:37:55 -07:00
Xiangling Liao d508561798 [AIX] Support init priority attribute
Differential Revision: https://reviews.llvm.org/D99291
2021-04-08 15:40:09 -04:00
Dávid Bolvanský 2cb8c10342 Revert "Reduce the number of attributes attached to each function"
This reverts commit 053dc95839. It causes perf regressions - see discussion in D97116.
2021-04-08 17:28:57 +02:00
Saurabh Jha 7d8513b7f2 [clang] Move int <-> float scalar conversion to a separate function
As prelude to this patch https://reviews.llvm.org/D99037, we want to
move the int-float conversion
into a separate function that can be reused by matrix cast

Differential Revision: https://reviews.llvm.org/D100051
2021-04-07 12:34:16 -07:00
Roman Lebedev 0aa0458f14
[CGCall] Annotate `this` argument with alignment
As it is being noted in D99249, lack of alignment information on `this`
has been preventing LICM from happening.

For some time now, lack of alignment attribute does *not* imply
natural alignment, but an alignment of `1`.
Also, we used to treat dereferenceable as implying alignment,
but we no longer do, so it's a bugfix.

Differential Revision: https://reviews.llvm.org/D99790
2021-04-07 11:02:01 +03:00
Yaxun (Sam) Liu 61d065e21f Let clang atomic builtins fetch add/sub support floating point types
Recently atomicrmw started to support fadd/fsub:

https://reviews.llvm.org/D53965

However clang atomic builtins fetch add/sub still does not support
emitting atomicrmw fadd/fsub.

This patch adds that.

Reviewed by: John McCall, Artem Belevich, Matt Arsenault, JF Bastien,
James Y Knight, Louis Dionne, Olivier Giroux

Differential Revision: https://reviews.llvm.org/D71726
2021-04-06 15:44:00 -04:00
Simon Pilgrim 2901dc7575 Don't directly dereference getAs<> casts to avoid potential null dereferences. NFCI.
Replace with castAs<> which asserts the cast is valid.

Fixes a number of static analyzer warnings.
2021-04-06 12:24:19 +01:00
Yevgeny Rouban 39e3e3aa51 [NewPM] Redesign of PreserveCFG Checker
The reason for the NewPM redesign is described in the commit
  cba3e783389a: [NewPM] Disable PreservedCFGChecker ...

The checker introduces an internal custom CFG analysis that tracks
current up-to date CFG snapshot. The analysis is invalidated along
any other CFG related analysis (the key is CFGAnalyses). If the CFG
analysis is not invalidated at a functional pass exit then the checker
asserts that the CFG snapshot taken from this analysis is equals to
a snapshot of the current CFG.

Along the way:
- the function CFG::printDiff() is simplified by removing function
  name calculation. The name is printed by the caller;
- fixed CFG invalidated condition (see CFG::invalidate());
- StandardInstrumentations::registerCallbacks() gets additional
  optional parameter of type FunctionAnalysisManager*, which is
  needed by the checker to get the custom CFG analysis;
- several PM related tests updated to explicitly set
  -verify-cfg-preserved=1 as they need.

This patch is safe to land as the CFGChecker is left switched off
(the options -verify-cfg-preserved is false by default). It will be
switched on by a separate patch to minimize possible reverts.

Reviewed By: skatkov, kuhar

Differential Revision: https://reviews.llvm.org/D91327
2021-04-06 12:35:49 +07:00
Jennifer Yu 7078ef4722 [OPENMP51]Initial support for nocontext clause.
Added basic parsing/sema/serialization support for the 'nocontext' clause.

Differential Revision: https://reviews.llvm.org/D99848
2021-04-05 11:45:49 -07:00
Craig Topper b4f2e80600 [RISCV] Refactor conversion of B extensions to IR intrinsics a little to reduce clang binary size.
These all pass 1 type to getIntrinsic. So rather than assigning
IntrinsicTypes for each builtin which invokes the SmallVector
constructor, just select the intrinsic ID with a switch and
share a single assignment of IntrinsicTypes.
2021-04-02 23:49:44 -07:00
Jennifer Yu 0fe8af9468 Fix build bot problem with missing OMPC_novariants in switch. 2021-04-02 13:58:39 -07:00
Levy Hsu f78d932cf2 [RISCV] Add IR intrinsics for Zbc extension
Head files are included in a separate patch in case the name needs to be changed.

RV32 / 64:
clmul
clmulh
clmulr

Differential Revision: https://reviews.llvm.org/D99711
2021-04-02 12:09:13 -07:00
Levy Hsu 944adbf285 Recommit "[RISCV] Add IR intrinsic for Zbb extension"
Forgot to amend the Author.

Original commit message:

Header files are included in a separate patch in case the name needs to be changed.

RV32 / 64:
orc.b

Differential Revision: https://reviews.llvm.org/D99320
2021-04-02 11:50:19 -07:00
Craig Topper 1f0b309f24 Revert "[RISCV] Add IR intrinsic for Zbb extension"
This reverts commit 1808194590.

I forgot to change the author.
2021-04-02 11:47:02 -07:00
Craig Topper 1808194590 [RISCV] Add IR intrinsic for Zbb extension
Header files are included in a separate patch in case the name needs to be changed.

RV32 / 64:
orc.b
2021-04-02 11:23:57 -07:00
Levy Hsu b001d574d7 [RISCV] Add IR intrinsic for Zbr extension
Implementation for RISC-V Zbr extension intrinsic.

Header files are included in separate patch in case the name needs to be changed

RV32 / 64:
        crc32b
        crc32h
        crc32w
        crc32cb
        crc32ch
        crc32cw

RV64 Only:
        crc32d
        crc32cd

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D99009
2021-04-02 10:58:45 -07:00
Joseph Huber 69ca50bd7d [OpenMP] Pass mapping names to add components in a user defined mapper
Summary:
Currently the mapping names are not passed to the mapper components that set up
the array region. This means array mappings will not have their names availible
in the runtime. This patch fixes this by passing the argument name to the region
correctly. This means that the mapped variable's name will be the declared
mapper that placed it on the device.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D99681
2021-04-01 15:51:03 -04:00
Raphael Isemann 60854c328d Avoid calling ParseCommandLineOptions in BackendUtil if possible
Calling `ParseCommandLineOptions` should only be called from `main` as the
CommandLine setup code isn't thread-safe. As BackendUtil is part of the
generic Clang FrontendAction logic, a process which has several threads executing
Clang FrontendActions will randomly crash in the unsafe setup code.

This patch avoids calling the function unless either the debug-pass option or
limit-float-precision option is set. Without these two options set the
`ParseCommandLineOptions` call doesn't do anything beside parsing
the command line `clang` which doesn't set any options.

See also D99652 where LLDB received a workaround for this crash.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D99740
2021-04-01 19:41:16 +02:00
Alexey Bataev a28e835e94 [OPENMP]Fix PR48885: Crash in passing firstprivate args to tasks on Apple M1.
Need to bitcast the function pointer passed as a parameter to the real
type to avoid possible problem with calling conventions.

Differential Revision: https://reviews.llvm.org/D99521
2021-03-31 13:00:58 -07:00
Alexey Bataev 66da4f6fc9 [OPENMP]Fix PR48658: [OpenMP 5.0] Compiler crash when OpenMP atomic sync hints used.
No need to consider hint clause kind as the main atomic clause kind at the
codegen.

Differential Revision: https://reviews.llvm.org/D99611
2021-03-31 12:58:24 -07:00
Thomas Lively 45783d0e8a [WebAssembly] Implement i64x2 comparisons
Removes the prototype builtin and intrinsic for i64x2.eq and implements that
instruction as well as the other i64x2 comparison instructions in the final SIMD
spec. Unsigned comparisons were not included in the final spec, so they still
need to be scalarized via a custom lowering.

Differential Revision: https://reviews.llvm.org/D99623
2021-03-31 10:46:17 -07:00
Wei Mi d535a05ca1 [ThinLTO] During module importing, close one source module before open
another one for distributed mode.

Currently during module importing, ThinLTO opens all the source modules,
collect functions to be imported and append them to the destination module,
then leave all the modules open through out the lto backend pipeline. This
patch refactors it in the way that one source module will be closed before
another source module is opened. All the source modules will be closed after
importing phase is done. It will save some amount of memory when there are
many source modules to be imported.

Note that this patch only changes the distributed thinlto mode. For in
process thinlto mode, one source module is shared acorss different thinlto
backend threads so it is not changed in this patch.

Differential Revision: https://reviews.llvm.org/D99554
2021-03-30 14:37:29 -07:00
Mike Rice b7899ba0e8 [OPENMP51]Initial support for the dispatch directive.
Added basic parsing/sema/serialization support for dispatch directive.

Differential Revision: https://reviews.llvm.org/D99537
2021-03-30 14:12:53 -07:00
Alexey Bataev e2c7bf08cc [OPENMP]Fix PR48607: Crash during clang openmp codegen for firstprivate() of `float _Complex`.
Need to cast the argument for the debug wrapper function call to the
corresponding parameter type to avoid crash.

Differential Revision: https://reviews.llvm.org/D99617
2021-03-30 13:39:45 -07:00
Alexey Bataev 1696b8ae96 [OPENMP]Fix PR48740: OpenMP declare reduction in C does not require an initializer
If no initializer-clause is specified, the private variables will be
initialized following the rules for initialization of objects with static
storage duration.

Need to adjust the implementation to the current version of the
standard.

Differential Revision: https://reviews.llvm.org/D99539
2021-03-30 05:38:20 -07:00
Raphael Isemann 1cbba533ec [ObjC][CodeGen] Fix missing debug info in situations where an instance and class property have the same identifier
Since the introduction of class properties in Objective-C it is possible to declare a class and an instance
property with the same identifier in an interface/protocol.

Right now Clang just generates debug information for whatever property comes first in the source file.
The second property is ignored as it's filtered out by the set of already emitted properties (which is just
using the identifier of the property to check for equivalence).  I don't think generating debug info in this case
was never supported as the identifier filter is in place since 7123bca7fb
(which precedes the introduction of class properties).

This patch expands the filter to take in account identifier + whether the property is class/instance. This
ensures that both properties are emitted in this special situation.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D99512
2021-03-30 11:07:16 +02:00
Johannes Doerfert 03cc8a1ba0 [OpenMP][NFC] Move the `noinline` to the parallel entry point
The `noinline` for non-SPMD parallel functions is probably not necessary
but as long as we use it we should put it on the outermost parallel
function, which is the wrapper, not the actual outlined function.

Resolves PR49752

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D99506
2021-03-30 01:12:45 -05:00
Alexey Bataev 0411b23319 [OPENMP]Map data field with l-value reference types.
Added initial support dfor the mapping of the data members with l-value
reference types.

Differential Revision: https://reviews.llvm.org/D98812
2021-03-29 07:07:09 -07:00
Alexey Bataev f6f21dcd6c [OPENMP]Fix PR49636: Assertion `(!Entry.getAddress() || Entry.getAddress() == Addr) && "Resetting with the new address."' failed.
The original issue is caused by the fact that the variable is allocated
with incorrect type i1 instead of i8. This causes the bitcasting of the
declaration to i8 type and the bitcast expression does not match the
original variable.
To fix the problem, the UndefValue initializer and the original
variable should be emitted with type i8, not i1.

Differential Revision: https://reviews.llvm.org/D99297
2021-03-29 06:55:57 -07:00
Alexey Bataev dcf96178cb [OPENMP]Fix PR49052: Clang crashed when compiling target code with assert(0).
Need to insert a basic block during generation of the target region to
avoid crash for the GPU to be able always calling a cleanup action.
This cleanup action is required for the correct emission of the target
region for the GPU.

Differential Revision: https://reviews.llvm.org/D99445
2021-03-29 06:36:06 -07:00
Matt Arsenault 9a0c9402fa Reapply "OpaquePtr: Turn inalloca into a type attribute"
This reverts commit 07e46367ba.
2021-03-29 08:55:30 -04:00
Oliver Stannard 07e46367ba Revert "Reapply "OpaquePtr: Turn inalloca into a type attribute""
Reverting because test 'Bindings/Go/go.test' is failing on most
buildbots.

This reverts commit fc9df30991.
2021-03-29 11:32:22 +01:00
Matt Arsenault fc9df30991 Reapply "OpaquePtr: Turn inalloca into a type attribute"
This reverts commit 20d5c42e0e.
2021-03-28 13:35:21 -04:00
Nico Weber 20d5c42e0e Revert "OpaquePtr: Turn inalloca into a type attribute"
This reverts commit 4fefed6563.
Broke check-clang everywhere.
2021-03-28 13:02:52 -04:00
Matt Arsenault 4fefed6563 OpaquePtr: Turn inalloca into a type attribute
I think byval/sret and the others are close to being able to rip out
the code to support the missing type case. A lot of this code is
shared with inalloca, so catch this up to the others so that can
happen.
2021-03-28 11:12:23 -04:00
Xun Li f490a5969b [OpenMP][InstrProfiling] Fix a missing instr profiling counter
When emitting a function body there needs to be a instr profiling counter emitted. Otherwise instr profiling won't work for this function.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D98135
2021-03-25 13:52:36 -07:00
Xun Li c7a39c833a [Coroutine][Clang] Force emit lifetime intrinsics for Coroutines
tl;dr Correct implementation of Corouintes requires having lifetime intrinsics available.

Coroutine functions are functions that can be suspended and resumed latter. To do so, data that need to stay alive after suspension must be put on the heap (i.e. the coroutine frame).
The optimizer is responsible for analyzing each AllocaInst and figure out whether it should be put on the stack or the frame.
In most cases, for data that we are unable to accurately analyze lifetime, we can just conservatively put them on the heap.
Unfortunately, there exists a few cases where certain data MUST be put on the stack, not on the heap. Without lifetime intrinsics, we are unable to correctly analyze those data's lifetime.

To dig into more details, there exists cases where at certain code points, the current coroutine frame may have already been destroyed. Hence no frame access would be allowed beyond that point.
The following is a common code pattern called "Symmetric Transfer" in coroutine:
```
auto tmp = await_suspend();
__builtin_coro_resume(tmp.address());
return;
```
In the above code example, `await_suspend()` returns a new coroutine handle, which we will obtain the address and then resume that coroutine. This essentially "transfered" from the current coroutine to a different coroutine.
During the call to `await_suspend()`, the current coroutine may be destroyed, which should be fine because we are not accessing any data afterwards.
However when LLVM is emitting IR for the above code, it needs to emit an AllocaInst for `tmp`. It will then call the `address` function on tmp. `address` function is a member function of coroutine, and there is no way for the LLVM optimizer to know that it does not capture the `tmp` pointer. So when the optimizer looks at it, it has to conservatively assume that `tmp` may escape and hence put it on the heap. Furthermore, in some cases `address` call would be inlined, which will generate a bunch of store/load instructions that move the `tmp` pointer around. Those stores will also make the compiler to think that `tmp` might escape.
To summarize, it's really difficult for the mid-end to figure out that the `tmp` data is short-lived.
I made some attempt in D98638, but it appears to be way too complex and is basically doing the same thing as inserting lifetime intrinsics in coroutines.

Also, for reference, we already force emitting lifetime intrinsics in O0 for AlwaysInliner: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Passes/PassBuilder.cpp#L1893

Differential Revision: https://reviews.llvm.org/D99227
2021-03-25 13:46:20 -07:00
Yaxun (Sam) Liu cc9477166a [CUDA][HIP] add __builtin_get_device_side_mangled_name
Add builtin function __builtin_get_device_side_mangled_name
to get device side manged name for functions and global
variables, which can be used to get symbol address of kernels
or variables by mangled name in dynamically loaded
bundled code objects at run time.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D99301
2021-03-25 15:25:29 -04:00
Djordje Todorovic 8420a53324 [Debugify] Expose original debug info preservation check as CC1 option
In order to test the preservation of the original Debug Info metadata
in your projects, a front end option could be very useful, since users
usually report that a concrete entity (e.g. variable x, or function fn2())
is missing debug info. The [0] is an example of running the utility
on GDB Project.

This depends on: D82546 and D82545.

Differential Revision: https://reviews.llvm.org/D82547
2021-03-25 05:29:42 -07:00
Alexey Bataev 7654bb6303 [OPENMP]Fix PR48571: critical/master in outlined contexts cause crash.
If emit inlined region for master/critical directives, no need to clear
lambda/block context data, otherwise the variables cannot be found and
it causes a crash at compile time.

Differential Revision: https://reviews.llvm.org/D99280
2021-03-24 10:15:24 -07:00
Nemanja Ivanovic 4020932706 [PowerPC] Make altivec.h work with AIX which has no __int128
There are a number of functions in altivec.h that use
vector __int128 which isn't supported on AIX. Those functions
need to be guarded for targets that don't support the type.
Furthermore, the functions that produce quadword instructions
without using the type need a builtin. This patch adds the
macro guards to altivec.h using the __SIZEOF_INT128__ which
is only defined on targets that support the __int128 type.
2021-03-24 00:35:51 -05:00
Bruno Cardoso Lopes 431e3138a1 [CGAtomic] Lift stronger requirements on cmpxch and support acquire failure mode
- Fix `emitAtomicCmpXchgFailureSet` to support release/acquire (succ/fail) memory order.
- Remove stronger checks for cmpxch.

Effectively, this addresses http://wg21.link/p0418

Differential Revision: https://reviews.llvm.org/D98995
2021-03-23 16:45:37 -07:00
Bradley Smith 48f5a392cb [IR] Add vscale_range IR function attribute
This attribute represents the minimum and maximum values vscale can
take. For now this attribute is not hooked up to anything during
codegen, this will be added in the future when such codegen is
considered stable.

Additionally hook up the -msve-vector-bits=<x> clang option to emit this
attribute.

Differential Revision: https://reviews.llvm.org/D98030
2021-03-22 12:05:06 +00:00
Roman Lebedev be87321280
[clang][Codegen] EmitBranchOnBoolExpr(): emit prof branch counts even at -O0
This restores the original behaviour before i unadvertedly broke it in
e3a4701627 and clang/test/Profile/ caught it.
2021-03-21 23:24:27 +03:00
Roman Lebedev e3a4701627
[clang][CodeGen] Lower Likelihood attributes to @llvm.expect intrin instead of branch weights
08196e0b2e exposed LowerExpectIntrinsic's
internal implementation detail in the form of
LikelyBranchWeight/UnlikelyBranchWeight options to the outside.

While this isn't incorrect from the results viewpoint,
this is suboptimal from the layering viewpoint,
and causes confusion - should transforms also use those weights,
or should they use something else, D98898?

So go back to status quo by making LikelyBranchWeight/UnlikelyBranchWeight
internal again, and fixing all the code that used it directly,
which currently is only clang codegen, thankfully,
to emit proper @llvm.expect intrinsics instead.
2021-03-21 22:50:21 +03:00
Roman Lebedev 37d6be9052
Revert "[BranchProbability] move options for 'likely' and 'unlikely'"
Upon reviewing D98898 i've come to realization that these are
implementation detail of LowerExpectIntrinsicPass,
and they should not be exposed to outside of it.

This reverts commit ee8b53815d.
2021-03-21 22:50:21 +03:00
Sanjay Patel ee8b53815d [BranchProbability] move options for 'likely' and 'unlikely'
This makes the settings available for use in other passes by housing
them within the Support lib, but NFC otherwise.

See D98898 for the proposed usage in SimplifyCFG
(where this change was originally included).

Differential Revision: https://reviews.llvm.org/D98945
2021-03-20 14:46:46 -04:00
Benjamin Kramer 19d2c65ddd [CodeGen] Don't crash on for loops with cond variables and no increment
This looks like an oversight from a875721d8a, creating IR that refers
to `for.inc` even if it doesn't exist.

Differential Revision: https://reviews.llvm.org/D98980
2021-03-19 20:43:52 +01:00
Hongtao Yu fc1812a0ad [UniqueLinkageName] Use consistent checks when mangling symbo linkage name and debug linkage name.
C functions may be declared and defined in different prototypes like below. This patch unifies the checks for mangling names in symbol linkage name emission and debug linkage name emission so that the two names are consistent.

static int go(int);

static int go(a) int a;
{
  return a;
}

Test Plan:

Differential Revision: https://reviews.llvm.org/D98799
2021-03-18 22:11:16 -07:00
Thomas Lively f5764a8654 [WebAssembly] Finalize SIMD names and opcodes
Updates the names (e.g. widen => extend, saturate => sat) and opcodes of all
SIMD instructions to match the finalized SIMD spec. Deliberately does not change
the public interface in wasm_simd128.h yet; that will require more care.

Depends on D98466.

Differential Revision: https://reviews.llvm.org/D98676
2021-03-18 11:21:25 -07:00
Thomas Lively 2f2ae08da9 [WebAssembly] Remove experimental SIMD instructions
Removes the instruction definitions, intrinsics, and builtins for qfma/qfms,
signselect, and prefetch instructions, which were not included in the final
WebAssembly SIMD spec.

Depends on D98457.

Differential Revision: https://reviews.llvm.org/D98466
2021-03-18 11:21:24 -07:00
Alex Lorenz d672d5219a Revert "[CodeGenModule] Set dso_local for Mach-O GlobalValue"
This reverts commit 809a1e0ffd.

Mach-O doesn't support dso_local and this change broke XNU because of the use of dso_local.

Differential Revision: https://reviews.llvm.org/D98458
2021-03-17 17:27:41 -07:00
Richard Smith a875721d8a PR49585: Emit the jump destination for a for loop 'continue' from within the scope of the condition variable.
The condition variable is in scope in the loop increment, so we need to
emit the jump destination from wthin the scope of the condition
variable.

For GCC compatibility (and compatibility with real-world 'FOR_EACH'
macros), 'continue' is permitted in a statement expression within the
condition of a for loop, though, so there are two cases here:

* If the for loop has no condition variable, we can emit the jump
  destination before emitting the condition.

* If the for loop has a condition variable, we must defer emitting the
  jump destination until after emitting the variable. We diagnose a
  'continue' appearing in the initializer of the condition variable,
  because it would jump past the initializer into the scope of that
  variable.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D98816
2021-03-17 16:24:04 -07:00
Mike Rice 410f09af09 [OPENMP51]Initial support for the interop directive.
Added basic parsing/sema/serialization support for interop directive.
Support for the 'init' clause.

Differential Revision: https://reviews.llvm.org/D98558
2021-03-17 09:42:07 -07:00
Bradley Smith cf0da91ba5 [AArch64][SVE/NEON] Add support for FROUNDEVEN for both NEON and fixed length SVE
Previously NEON used a target specific intrinsic for frintn, given that
the FROUNDEVEN ISD node now exists, move over to that instead and add
codegen support for that node for both NEON and fixed length SVE.

Differential Revision: https://reviews.llvm.org/D98487
2021-03-17 11:41:22 +00:00
Vassil Vassilev 0cb7e7ca0c Make iteration over the DeclContext::lookup_result safe.
The idiom:
```
DeclContext::lookup_result R = DeclContext::lookup(Name);
for (auto *D : R) {...}
```

is not safe when in the loop body we trigger deserialization from an AST file.
The deserialization can insert new declarations in the StoredDeclsList whose
underlying type is a vector. When the vector decides to reallocate its storage
the pointer we hold becomes invalid.

This patch replaces a SmallVector with an singly-linked list. The current
approach stores a SmallVector<NamedDecl*, 4> which is around 8 pointers.
The linked list is 3, 5, or 7. We do better in terms of memory usage for small
cases (and worse in terms of locality -- the linked list entries won't be near
each other, but will be near their corresponding declarations, and we were going
to fetch those memory pages anyway). For larger cases: the vector uses a
doubling strategy for reallocation, so will generally be between half-full and
full. Let's say it's 75% full on average, so there's N * 4/3 + 4 pointers' worth
of space allocated currently and will be 2N pointers with the linked list. So we
break even when there are N=6 entries and slightly lose in terms of memory usage
after that. We suspect that's still a win on average.

Thanks to @rsmith!

Differential revision: https://reviews.llvm.org/D91524
2021-03-17 08:59:04 +00:00
Luke Drummond 440f6bdf34 [OpenCL][NFCI] Prefer CodeGenFunction::EmitRuntimeCall
`CodeGenFunction::EmitRuntimeCall` automatically sets the right calling
convention for the callee so we can avoid setting it ourselves.

As requested in https://reviews.llvm.org/D98411

Reviewed by: anastasia
Differential Revision: https://reviews.llvm.org/D98705
2021-03-16 16:22:19 +00:00
Amy Huang f5352dd9da Emit inline implementation of __builtin__wmemchr on MSVCRT platforms.
The MSVC runtime library doesn't have a definition for wmemchr,
so provide an inline implementation.

Differential Revision: https://reviews.llvm.org/D98472
2021-03-15 15:30:55 -07:00
Jonas Paulsson 9cfd301ec8 [SystemZ] Test for isinf and isfinite in testFPKind().
Recognize BI__builtin_isinf and BI__builtin_isfinite (and a few other opcodes
for finite) in testFPKind() and handle with TDC.

Review: Ulrich Weigand.

Differential Revision: https://reviews.llvm.org/D97901
2021-03-15 15:02:39 -06:00
Stelios Ioannou ab86edbc88 [AArch64] Implement __rndr, __rndrrs intrinsics
This patch implements the __rndr and __rndrrs intrinsics to provide access to the random
number instructions introduced in Armv8.5-A. They are only defined for the AArch64
execution state and are available when __ARM_FEATURE_RNG is defined.

These intrinsics store the random number in their pointer argument and return a status
code if the generation succeeded. The difference between __rndr __rndrrs, is that the latter
intrinsic reseeds the random number generator.

The instructions write the NZCV flags indicating the success of the operation that we can
then read with a CSET.

[1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics
[2] https://bugs.llvm.org/show_bug.cgi?id=47838

Differential Revision: https://reviews.llvm.org/D98264

Change-Id: I8f92e7bf5b450e5da3e59943b53482edf0df6efc
2021-03-15 17:51:48 +00:00
Luke Drummond fcfd3fda71 [OpenCL] Respect calling convention for builtin
`__translate_sampler_initializer` has a calling convention of
`spir_func`, but clang generated calls to it using the default CC.

Instruction Combining was lowering these mismatching calling conventions
to `store i1* undef` which itself was subsequently lowered to a trap
instruction by simplifyCFG resulting in runtime `SIGILL`

There are arguably two bugs here: but whether there's any wisdom in
converting an obviously invalid call into a runtime crash over aborting
with a sensible error message will require further discussion. So for
now it's enough to set the right calling convention on the runtime
helper.

Reviewed By: svenh, bader

Differential Revision: https://reviews.llvm.org/D98411
2021-03-15 17:26:51 +00:00
Thomas Preud'homme f60b35340f Stop traping on sNaN in __builtin_isinf
__builtin_isinf currently generates a floating-point compare operation
which triggers a trap when faced with a signaling NaN in StrictFP mode.
This commit uses integer operations instead to not generate any trap in
such a case.

Reviewed By: mibintc

Differential Revision: https://reviews.llvm.org/D97125
2021-03-15 15:38:08 +00:00
xling-Liao 0bf2da53c1 [NFC] Adjust SmallVector.h header to workaround XL build compiler issue
In order to prevent further building issues related to the usage of SmallVector
in other compilation unit, this patch adjusts the llvm.h header as a workaround
instead.

Besides, this patch reverts previous workarounds:

1. Revert "[NFC] Use llvm::SmallVector to workaround XL compiler problem on AIX"
This reverts commit 561fb7f60a.

2.Revert "[clang][cli] Fix build failure in CompilerInvocation"
This reverts commit 8dc70bdcd0.

Differential Revision: https://reviews.llvm.org/D98552
2021-03-12 21:41:36 -06:00
Amy Huang d7cd208f08 [DebugInfo] Add an attribute to force type info to be emitted for types that are required to be complete.
This was motivated by the fact that constructor type homing (debug info
optimization that we want to turn on by default) drops some libc++ types,
so an attribute would allow us to override constructor homing and emit
them anyway. I'm currently looking into the particular libc++ issue, but
even if we do fix that, this issue might come up elsewhere and it might be
nice to have this.

As I've implemented it now, the attribute isn't specific to the
constructor homing optimization and overrides all of the debug info
optimizations.

Open to discussion about naming, specifics on what the attribute should do, etc.

Differential Revision: https://reviews.llvm.org/D97411
2021-03-12 12:30:01 -08:00
Nikita Popov 42eb658f65 [OpaquePtrs] Remove some uses of type-less CreateGEP() (NFC)
This removes some (but not all) uses of type-less CreateGEP()
and CreateInBoundsGEP() APIs, which are incompatible with opaque
pointers.

There are a still a number of tricky uses left, as well as many
more variation APIs for CreateGEP.
2021-03-12 21:01:16 +01:00
Simon Pilgrim 731b3d7664 [clang] Use Constant::getAllOnesValue helper. NFCI.
Avoid -1ULL which MSVC tends to complain about
2021-03-12 15:16:36 +00:00
Sriraman Tallam cdb42a4cc4 Disable unique linkage suffixes ifor global vars until demanglers can be fixed.
D96109 added support for unique internal linkage names for both internal
linkage functions and global variables. There was a lot of discussion on how to
get the demangling right for functions but I completely missed the point that
demanglers do not support suffixes for global vars. For example:

$ c++filt _ZL3foo
foo
$ c++filt _ZL3foo.uniq.123
_ZL3foo.uniq.123

The demangling for functions works as expected.

I am not sure of the impact of this. I don't understand how debuggers and other
tools depend on the correctness of global variable demangling so I am
pre-emptively disabling it until we can get the demangling support added.

Importantly, uniquefying global variables is not needed right now as we do not
do profile attribution to global vars based on sampling. It was added for
completeness and so this feature is not exactly missed.

Differential Revision: https://reviews.llvm.org/D98392
2021-03-11 20:59:30 -08:00
David Blaikie bd2bdad19e void cast to suppress -Wunused-variable in non-asserts build 2021-03-11 17:51:31 -08:00
Florian Hahn c92ec0dd92
[Matrix] Add support for matrix-by-scalar division.
This patch extends the matrix spec to allow matrix-by-scalar division.

Originally support for `/` was left out to avoid ambiguity for the
matrix-matrix version of `/`, which could either be elementwise or
specified as matrix multiplication M1 * (1/M2).

For the matrix-scalar version, no ambiguity exists; `*` is also
an elementwise operation in that case. Matrix-by-scalar division
is commonly supported by systems including Matlab, Mathematica
or NumPy.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D97857
2021-03-11 22:21:23 +00:00
Joseph Huber 807466ef28 [OpenMP] Restore backwards compatibility for libomptarget
Summary:
The changes introduced in D87946 changed the API for libomptarget
functions. `__kmpc_push_target_tripcount` was a function in Clang 11.x
but was not given a backward-compatible interface. This change will
require people using Clang 13.x or 12.x to recompile their offloading
programs.

Reviewed By: jdoerfert cchen

Differential Revision: https://reviews.llvm.org/D98358
2021-03-11 09:52:11 -05:00
Nikita Popov 46354bac76 [OpaquePtrs] Remove some uses of type-less CreateLoad APIs (NFC)
Explicitly pass loaded type when creating loads, in preparation
for the deprecation of these APIs.

There are still a couple of uses left.
2021-03-11 14:40:57 +01:00
Nikita Popov 68e01339cc [CGBuilder] Remove type-less CreateAlignedLoad() APIs (NFC)
These are incompatible with opaque pointers. This is in preparation
of dropping this API on the IRBuilder side as well.

Instead explicitly pass the loaded type.
2021-03-11 10:41:23 +01:00
Olivier Goffart 5baea05601 [SEH] Fix capture of this in lambda functions
Commit 1b04bdc2f3 added support for
capturing the 'this' pointer in a SEH context (__finally or __except),
But the case in which the 'this' pointer is part of a lambda capture
was not handled properly

Differential Revision: https://reviews.llvm.org/D97687
2021-03-11 09:12:42 +01:00
Zakk Chen d6a0560bf2 [Clang][RISCV] Add custom TableGen backend for riscv-vector intrinsics.
Demonstrate how to generate vadd/vfadd intrinsic functions

1. add -gen-riscv-vector-builtins for clang builtins.
2. add -gen-riscv-vector-builtin-codegen for clang codegen.
3. add -gen-riscv-vector-header for riscv_vector.h. It also generates
ifdef directives with extension checking, base on D94403.
4. add -gen-riscv-vector-generic-header for riscv_vector_generic.h.
Generate overloading version Header for generic api.
https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#c11-generic-interface
5. update tblgen doc for riscv related options.

riscv_vector.td also defines some unused type transformers for vadd,
because I think it could demonstrate how tranfer type work and we need
them for the whole intrinsic functions implementation in the future.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>

Reviewed By: jrtc27, craig.topper, HsiangKai, Jim, Paul-C-Anagnostopoulos

Differential Revision: https://reviews.llvm.org/D95016
2021-03-10 18:43:43 -08:00
Arthur Eubanks c8227f06b3 [clang] Don't assert in EmitAggregateCopy on trivial_abi types
Fixes PR42961.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D97872
2021-03-10 10:16:06 -08:00
Jingu Kang 25951c5ab8 [AArch64] Add missing intrinsics for scalar FP rounding
Differential Revision: https://reviews.llvm.org/D98269
2021-03-10 13:22:29 +00:00
serge-sans-paille ea8e5b87ac [NFC] Remove duplicate isNoBuiltinFunc method
It's available both in CodeGenOptions and in LangOptions, and LangOptions
implementation is slightly better as it uses a StringRef instead of a char
pointer, so use it.

Differential Revision: https://reviews.llvm.org/D98175
2021-03-10 09:18:55 +01:00
Xiangling Liao 561fb7f60a [NFC] Use llvm::SmallVector to workaround XL compiler problem on AIX
LLVM is recommending to use SmallVector (that is, omitting the N), in the
absence of a well-motivated choice for the number of inlined elements N.

However, this doesn't work well with XL compiler on AIX since some header(s)
aren't properly picked up with it. We need to take a further look into the real
issue underneath and fix it in a later patch.

But currently we'd like to use this patch to unblock the build compiler issue
first.

Differential Revision: https://reviews.llvm.org/D98265
2021-03-09 13:03:52 -05:00
diggerlin 46d4d1fea4 [AIX] do not emit visibility attribute into IR when there is -mignore-xcoff-visibility
SUMMARY:

n the patch https://reviews.llvm.org/D87451 "add new option -mignore-xcoff-visibility"
we did as "The option -mignore-xcoff-visibility has no effect on visibility attribute when compile with -emit-llvm option to generated LLVM IR."

in these patch we let -mignore-xcoff-visibility effect on generating IR too. the new feature only work on AIX OS

Reviewer: Jason Liu,

Differential Revision: https://reviews.llvm.org/D89986
2021-03-09 10:38:00 -05:00
Akira Hatanaka dca5737945 Move ObjCARCUtil.h back to llvm/Analysis
Instead of adding the header to llvm/IR, just duplicate the marker
string in the auto upgrader.
2021-03-08 16:35:24 -08:00
Min-Yih Hsu 5eb7a5814a [cfe][M68k](7/8) Clang basic support
This is the first patch supporting M68k in Clang
 - Register M68k as a target
 - Target specific CodeGen support
 - Target specific attribute support

Authors: myhsu, m4yers, glaubitz

Differential Revision: https://reviews.llvm.org/D88393
2021-03-08 12:30:57 -08:00
Sriraman Tallam 78d0e91865 Refactor -funique-internal-linakge-names implementation.
The option -funique-internal-linkage-names was added in D73307 and D78243 as a
LLVM early pass to insert a unique suffix to internal linkage functions and
vars. The unique suffix was the hash of the module path. However, we found
that this can be done more cleanly in clang early and the fixes that need to
be done later can be completely avoided. The fixes in particular are trying
to modify the DW_AT_linkage_name and finding the right place to insert the
pass.

This patch ressurects the original implementation proposed in D73307 which
was reviewed and then ditched in favor of the pass based approach.

Differential Revision: https://reviews.llvm.org/D96109
2021-03-05 13:32:17 -08:00
Jingu Kang 9b302513f6 [AArch64] Add missing intrinsics for vrnd 2021-03-05 11:26:12 +00:00
Michael Kruse b119120673 [clang][OpenMP] Use OpenMPIRBuilder for workshare loops.
Initial support for using the OpenMPIRBuilder by clang to generate loops using the OpenMPIRBuilder. This initial support is intentionally limited to:
 * Only the worksharing-loop directive.
 * Recognizes only the nowait clause.
 * No loop nests with more than one loop.
 * Untested with templates, exceptions.
 * Semantic checking left to the existing infrastructure.

This patch introduces a new AST node, OMPCanonicalLoop, which becomes parent of any loop that has to adheres to the restrictions as specified by the OpenMP standard. These restrictions allow OMPCanonicalLoop to provide the following additional information that depends on base language semantics:
 * The distance function: How many loop iterations there will be before entering the loop nest.
 * The loop variable function: Conversion from a logical iteration number to the loop variable.

These allow the OpenMPIRBuilder to act solely using logical iteration numbers without needing to be concerned with iterator semantics between calling the distance function and determining what the value of the loop variable ought to be. Any OpenMP logical should be done by the OpenMPIRBuilder such that it can be reused MLIR OpenMP dialect and thus by flang.

The distance and loop variable function are implemented using lambdas (or more exactly: CapturedStmt because lambda implementation is more interviewed with the parser). It is up to the OpenMPIRBuilder how they are called which depends on what is done with the loop. By default, these are emitted as outlined functions but we might think about emitting them inline as the OpenMPRuntime does.

For compatibility with the current OpenMP implementation, even though not necessary for the OpenMPIRBuilder, OMPCanonicalLoop can still be nested within OMPLoopDirectives' CapturedStmt. Although OMPCanonicalLoop's are not currently generated when the OpenMPIRBuilder is not enabled, these can just be skipped when not using the OpenMPIRBuilder in case we don't want to make the AST dependent on the EnableOMPBuilder setting.

Loop nests with more than one loop require support by the OpenMPIRBuilder (D93268). A simple implementation of non-rectangular loop nests would add another lambda function that returns whether a loop iteration of the rectangular overapproximation is also within its non-rectangular subset.

Reviewed By: jdenny

Differential Revision: https://reviews.llvm.org/D94973
2021-03-04 22:52:59 -06:00
David Blaikie cedc53254a Fix clang for header move in LLVM/IR 2021-03-04 16:20:44 -08:00
Heejin Ahn 561abd83ff [WebAssembly] Disable uses of __clang_call_terminate
Background:

Wasm EH, while using Windows EH (catchpad/cleanuppad based) IR, uses
Itanium-based libraries and ABIs with some modifications.

`__clang_call_terminate` is a wrapper generated in Clang's Itanium C++
ABI implementation. It contains this code, in C-style pseudocode:
```
void __clang_call_terminate(void *exn) {
  __cxa_begin_catch(exn);
  std::terminate();
}
```
So this function is a wrapper to call `__cxa_begin_catch` on the
exception pointer before termination.

In Itanium ABI, this function is called when another exception is thrown
while processing an exception. The pointer for this second, violating
exception is passed as the argument of this `__clang_call_terminate`,
which calls `__cxa_begin_catch` with that pointer and calls
`std::terminate` to terminate the program.

The spec (https://libcxxabi.llvm.org/spec.html) for `__cxa_begin_catch`
says,
```
When the personality routine encounters a termination condition, it
will call __cxa_begin_catch() to mark the exception as handled and then
call terminate(), which shall not return to its caller.
```

In wasm EH's Clang implementation, this function is called from
cleanuppads that terminates the program, which we also call terminate
pads. Cleanuppads normally don't access the thrown exception and the
wasm backend converts them to `catch_all` blocks. But because we need
the exception pointer in this cleanuppad, we generate
`wasm.get.exception` intrinsic (which will eventually be lowered to
`catch` instruction) as we do in the catchpads. But because terminate
pads are cleanup pads and should run even when a foreign exception is
thrown, so what we have been doing is:
1. In `WebAssemblyLateEHPrepare::ensureSingleBBTermPads()`, we make sure
terminate pads are in this simple shape:
```
%exn = catch
call @__clang_call_terminate(%exn)
unreachable
```
2. In `WebAssemblyHandleEHTerminatePads` pass at the end of the
pipeline, we attach a `catch_all` to terminate pads, so they will be in
this form:
```
%exn = catch
call @__clang_call_terminate(%exn)
unreachable
catch_all
call @std::terminate()
unreachable
```
In `catch_all` part, we don't have the exception pointer, so we call
`std::terminate()` directly. The reason we ran HandleEHTerminatePads at
the end of the pipeline, separate from LateEHPrepare, was it was
convenient to assume there was only a single `catch` part per `try`
during CFGSort and CFGStackify.

---

Problem:

While it thinks terminate pads could have been possibly split or calls
to `__clang_call_terminate` could have been duplicated,
`WebAssemblyLateEHPrepare::ensureSingleBBTermPads()` assumes terminate
pads contain no more than calls to `__clang_call_terminate` and
`unreachable` instruction. I assumed that because in LLVM very limited
forms of transformations are done to catchpads and cleanuppads to
maintain the scoping structure. But it turned out to be incorrect;
passes can merge cleanuppads into one, including terminate pads, as long
as the new code has a correct scoping structure. One pass that does this
I observed was `SimplifyCFG`, but there can be more. After this
transformation, a single cleanuppad can contain any number of other
instructions with the call to `__clang_call_terminate` and can span many
BBs. It wouldn't be practical to duplicate all these BBs within the
cleanuppad to generate the equivalent `catch_all` blocks, only with
calls to `__clang_call_terminate` replaced by calls to `std::terminate`.

Unless we do more complicated transformation to split those calls to
`__clang_call_terminate` into a separate cleanuppad, it is tricky to
solve.

---

Solution (?):

This CL just disables the generation and use of `__clang_call_terminate`
and calls `std::terminate()` directly in its place.

The possible downside of this approach can be, because the Itanium ABI
intended to "mark" the violating exception handled, we don't do that
anymore. What `__cxa_begin_catch` actually does is increment the
exception's handler count and decrement the uncaught exception count,
which in my opinion do not matter much given that we are about to
terminate the program anyway. Also it does not affect info like stack
traces that can be possibly shown to developers.

And while we use a variant of Itanium EH ABI, we can make some
deviations if we choose to; we are already different in that in the
current version of the EH spec we don't support two-phase unwinding. We
can possibly consider a more complicated transformation later to
reenable this, but I don't think that has high priority.

Changes in this CL contains:
- In Clang, we don't generate a call to `wasm.get.exception()` intrinsic
  and `__clang_call_terminate` function in terminate pads anymore; we
  simply generate calls to `std::terminate()`, which is the default
  implementation of `CGCXXABI::emitTerminateForUnexpectedException`.
- Remove `WebAssembly::ensureSingleBBTermPads() function and
  `WebAssemblyHandleEHTerminatePads` pass, because terminate pads are
  already `catch_all` now (because they don't need the exception
  pointer) and we don't need these transformations anymore.
- Change tests to use `std::terminate` directly. Also removes tests that
  tested `LateEHPrepare::ensureSingleBBTermPads` and
  `HandleEHTerminatePads` pass.
- Drive-by fix: Add some function attributes to EH intrinsic
  declarations

Fixes https://github.com/emscripten-core/emscripten/issues/13582.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D97834
2021-03-04 14:26:35 -08:00
Reid Kleckner 1c2e7d200d [MS] Fix crash involving gnu stmt exprs and inalloca
Use a WeakTrackingVH to cope with the stmt emission logic that cleans up
unreachable blocks. This invalidates the reference to the deferred
replacement placeholder. Cope with it.

Fixes PR25102 (from 2015!)
2021-03-04 13:57:46 -08:00
Gui Andrade 10264a1b21 Introduce noundef attribute at call sites for stricter poison analysis
This change adds a new IR noundef attribute, which denotes when a function call argument or return val may never contain uninitialized bits.

In MemorySanitizer, this attribute enables optimizations which decrease instrumented code size by up to 17% (measured with an instrumented build of clang) . I'll introduce the change allowing msan to take advantage of this information in a separate patch.

Differential Revision: https://reviews.llvm.org/D81678
2021-03-04 12:15:12 -08:00
Zequan Wu 9783e20988 Revert "Revert "[Coverage] Emit gap region between statements if first statements contains terminate statements.""
Reland with update on test case ContinuousSyncmode/basic.c.

This reverts commit fe5c2c3ca6.
2021-03-04 11:52:43 -08:00
Akira Hatanaka 1900503595 [ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of
explicitly emitting retainRV or claimRV calls in the IR

This reapplies ed4718eccb, which was reverted
because it was causing a miscompile. The bug that was causing the miscompile
has been fixed in 75805dce5f.

Original commit message:

Background:

This fixes a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.

https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue

What this patch does to fix the problem:

- The front-end adds operand bundle "clang.arc.attachedcall" to calls,
  which indicates the call is implicitly followed by a marker
  instruction and an implicit retainRV/claimRV call that consumes the
  call result. In addition, it emits a call to
  @llvm.objc.clang.arc.noop.use, which consumes the call result, to
  prevent the middle-end passes from changing the return type of the
  called function. This is currently done only when the target is arm64
  and the optimization level is higher than -O0.

- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
  with the operand bundle in the IR and removes the inserted calls after
  processing the function.

- ARC contract pass emits retainRV/claimRV calls after the call with the
  operand bundle. It doesn't remove the operand bundle on the call since
  the backend needs it to emit the marker instruction. The retainRV and
  claimRV calls are emitted late in the pipeline to prevent optimization
  passes from transforming the IR in a way that makes it harder for the
  ARC middle-end passes to figure out the def-use relationship between
  the call and the retainRV/claimRV calls (which is the cause of
  PR31925).

- The function inliner removes an autoreleaseRV call in the callee if
  nothing in the callee prevents it from being paired up with the
  retainRV/claimRV call in the caller. It then inserts a release call if
  claimRV is attached to the call since autoreleaseRV+claimRV is
  equivalent to a release. If it cannot find an autoreleaseRV call, it
  tries to transfer the operand bundle to a function call in the callee.
  This is important since the ARC optimizer can remove the autoreleaseRV
  returning the callee result, which makes it impossible to pair it up
  with the retainRV/claimRV call in the caller. If that fails, it simply
  emits a retain call in the IR if retainRV is attached to the call and
  does nothing if claimRV is attached to it.

- SCCP refrains from replacing the return value of a call with a
  constant value if the call has the operand bundle. This ensures the
  call always has at least one user (the call to
  @llvm.objc.clang.arc.noop.use).

- This patch also fixes a bug in replaceUsesOfNonProtoConstant where
  multiple operand bundles of the same kind were being added to a call.

Future work:

- Use the operand bundle on x86-64.

- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
  calls with the operand bundles.

rdar://71443534

Differential Revision: https://reviews.llvm.org/D92808
2021-03-04 11:22:30 -08:00
Alexey Bataev 711179b581 [OPENMP]Fix PR48759: "fatal error" when compile with preprocessed file.
If the file in line directive does not exist on the system we need, to
use the original file to get its file id.

Differential Revision: https://reviews.llvm.org/D97945
2021-03-04 07:26:57 -08:00
Nico Weber fe5c2c3ca6 Revert "[Coverage] Emit gap region between statements if first statements contains terminate statements."
This reverts commit 2d7374a0c6.
Breaks ContinuousSyncMode/basic.c in check-profile on macOS.
2021-03-04 08:53:30 -05:00
Thomas Preud'homme b7aeece47c Revert "Stop traping on sNaN in __builtin_isinf"
This reverts commit 1b6eb56aa0 because the
invert logic for isfinite is incorrect.
2021-03-04 12:07:35 +00:00
Wang, Pengfei e7e67c930a Add Windows ehcont section support (/guard:ehcont).
Add option /guard:ehcont

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D96709
2021-03-04 11:47:29 +08:00
Soumi Manna eec7f8f7b1 [WebAssembly] Add missing default cases in switch statements
unsigned variable 'IntNo' has been declared but not been defined inside function
EmitWebAssemblyBuiltinExpr().

static code analysis tool complains about uninitialized variable "IntNo" since
this enters to default branch without setting any intrinsics and calls Function
*Callee = CGM.getIntrinsic(IntNo).

This patch fixes the problem by adding default cases in switch statements.
2021-03-03 13:15:23 -08:00
Zequan Wu 2d7374a0c6 [Coverage] Emit gap region between statements if first statements contains terminate statements.
Differential Revision: https://reviews.llvm.org/D97101
2021-03-03 11:25:49 -08:00
Hans Wennborg 0a5dd06718 Revert "[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR"
This caused miscompiles of Chromium tests for iOS due clobbering of live
registers. See discussion on the code review for details.

> Background:
>
> This fixes a longstanding problem where llvm breaks ARC's autorelease
> optimization (see the link below) by separating calls from the marker
> instructions or retainRV/claimRV calls. The backend changes are in
> https://reviews.llvm.org/D92569.
>
> https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
>
> What this patch does to fix the problem:
>
> - The front-end adds operand bundle "clang.arc.attachedcall" to calls,
>   which indicates the call is implicitly followed by a marker
>   instruction and an implicit retainRV/claimRV call that consumes the
>   call result. In addition, it emits a call to
>   @llvm.objc.clang.arc.noop.use, which consumes the call result, to
>   prevent the middle-end passes from changing the return type of the
>   called function. This is currently done only when the target is arm64
>   and the optimization level is higher than -O0.
>
> - ARC optimizer temporarily emits retainRV/claimRV calls after the calls
>   with the operand bundle in the IR and removes the inserted calls after
>   processing the function.
>
> - ARC contract pass emits retainRV/claimRV calls after the call with the
>   operand bundle. It doesn't remove the operand bundle on the call since
>   the backend needs it to emit the marker instruction. The retainRV and
>   claimRV calls are emitted late in the pipeline to prevent optimization
>   passes from transforming the IR in a way that makes it harder for the
>   ARC middle-end passes to figure out the def-use relationship between
>   the call and the retainRV/claimRV calls (which is the cause of
>   PR31925).
>
> - The function inliner removes an autoreleaseRV call in the callee if
>   nothing in the callee prevents it from being paired up with the
>   retainRV/claimRV call in the caller. It then inserts a release call if
>   claimRV is attached to the call since autoreleaseRV+claimRV is
>   equivalent to a release. If it cannot find an autoreleaseRV call, it
>   tries to transfer the operand bundle to a function call in the callee.
>   This is important since the ARC optimizer can remove the autoreleaseRV
>   returning the callee result, which makes it impossible to pair it up
>   with the retainRV/claimRV call in the caller. If that fails, it simply
>   emits a retain call in the IR if retainRV is attached to the call and
>   does nothing if claimRV is attached to it.
>
> - SCCP refrains from replacing the return value of a call with a
>   constant value if the call has the operand bundle. This ensures the
>   call always has at least one user (the call to
>   @llvm.objc.clang.arc.noop.use).
>
> - This patch also fixes a bug in replaceUsesOfNonProtoConstant where
>   multiple operand bundles of the same kind were being added to a call.
>
> Future work:
>
> - Use the operand bundle on x86-64.
>
> - Fix the auto upgrader to convert call+retainRV/claimRV pairs into
>   calls with the operand bundles.
>
> rdar://71443534
>
> Differential Revision: https://reviews.llvm.org/D92808

This reverts commit ed4718eccb.
2021-03-03 15:51:40 +01:00
Thomas Preud'homme 1b6eb56aa0 Stop traping on sNaN in __builtin_isinf
__builtin_isinf currently generates a floating-point compare operation
which triggers a trap when faced with a signaling NaN in StrictFP mode.
This commit uses integer operations instead to not generate any trap in
such a case.

Reviewed By: mibintc

Differential Revision: https://reviews.llvm.org/D97125
2021-03-02 15:54:56 +00:00
Alexey Bataev 0caf736d7e [OPENMP50]Mapping of the subcomponents with the 'default' mappers.
If the mapped structure has data members, which have 'default' mappers,
need to map these members individually using their 'default' mappers.

Differential Revision: https://reviews.llvm.org/D92195
2021-03-02 07:11:06 -08:00
Richard Smith 9e2579dbf4 Fix infinite recursion during IR emission if a constant-initialized lifetime-extended temporary object's initializer refers back to the same object.
`GetAddrOfGlobalTemporary` previously tried to emit the initializer of
a global temporary before updating the global temporary map. Emitting the
initializer could recurse back into `GetAddrOfGlobalTemporary` for the same
temporary, resulting in an infinite recursion.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D97733
2021-03-01 22:19:21 -08:00
Yuanfang Chen 5de2d189e6 [Diagnose] Unify MCContext and LLVMContext diagnosing
The situation with inline asm/MC error reporting is kind of messy at the
moment. The errors from MC layout are not reliably propagated and users
have to specify an inlineasm handler separately to get inlineasm
diagnose. The latter issue is not a correctness issue but could be improved.

* Kill LLVMContext inlineasm diagnose handler and migrate it to use
  DiagnoseInfo/DiagnoseHandler.
* Introduce `DiagnoseInfoSrcMgr` to diagnose SourceMgr backed errors. This
  covers use cases like inlineasm, MC, and any clients using SourceMgr.
* Move AsmPrinter::SrcMgrDiagInfo and its instance to MCContext. The next step
  is to combine MCContext::SrcMgr and MCContext::InlineSrcMgr because in all
  use cases, only one of them is used.
* If LLVMContext is available, let MCContext uses LLVMContext's diagnose
  handler; if LLVMContext is not available, MCContext uses its own default
  diagnose handler which just prints SMDiagnostic.
* Change a few clients(Clang, llc, lldb) to use the new way of reporting.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D97449
2021-03-01 15:58:37 -08:00
Yaxun (Sam) Liu 53d30381f5 Fix build failure due to dump()
Change-Id: I86b534223d63bf8bb8f49af5a64b300efbeba77b
2021-03-01 16:44:08 -05:00
Yaxun (Sam) Liu 5cf2a37f12 [HIP] Emit kernel symbol
Currently clang uses stub function to launch kernel. This is inconvenient
to interop with C++ programs since the stub function has different name
as kernel, which is required by ROCm debugger.

This patch emits a variable symbol which has the same name as the kernel
and uses it to register and launch the kernel. This allows C++ program to
launch a kernel by using the original kernel name.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D86376
2021-03-01 16:31:40 -05:00
Sean Fertile 3f40dbbbc7 [PowerPC][AIX] Enable passing vectors in variadic functions.
Differential Revision: https://reviews.llvm.org/D97474
2021-03-01 13:08:28 -05:00
Arthur Eubanks 040c1b49d7 Move EntryExitInstrumentation pass location
This seems to be more of a Clang thing rather than a generic LLVM thing,
so this moves it out of LLVM pipelines and as Clang extension hooks into
LLVM pipelines.

Move the post-inline EEInstrumentation out of the backend pipeline and
into a late pass, similar to other sanitizer passes. It doesn't fit
into the codegen pipeline.

Also fix up EntryExitInstrumentation not running at -O0 under the new
PM. PR49143

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D97608
2021-03-01 10:08:10 -08:00
Olivier Goffart 1b04bdc2f3 [SEH] capture 'this'
Simply make sure that the CodeGenFunction::CXXThisValue and CXXABIThisValue
are correctly initialized to the recovered value.
For lambda capture, we also need to make sure to fill the LambdaCaptureFields

Differential Revision: https://reviews.llvm.org/D97534
2021-03-01 11:57:35 +01:00
Fangrui Song 8afdacba9d Add GNU attribute 'retain'
For ELF targets, GCC 11 will set SHF_GNU_RETAIN on the section of a
`__attribute__((retain))` function/variable to prevent linker garbage
collection. (See AttrDocs.td for the linker support).

This patch adds `retain` functions/variables to the `llvm.used` list, which has
the desired linker GC semantics. Note: `retain` does not imply `used`,
so an unused function/variable can be dropped by Sema.

Before 'retain' was introduced, previous ELF solutions require inline asm or
linker tricks, e.g.  `asm volatile(".reloc 0, R_X86_64_NONE, target");`
(architecture dependent) or define a non-local symbol in the section and use
`ld -u`. There was no elegant source-level solution.

With D97448, `__attribute__((retain))` will set `SHF_GNU_RETAIN` on ELF targets.

Differential Revision: https://reviews.llvm.org/D97447
2021-02-26 16:37:50 -08:00
Fangrui Song 28cb620321 Change some addUsedGlobal to addUsedOrCompilerUsedGlobal
An global value in the `llvm.used` list does not have GC root semantics on ELF targets.
This will be changed in a subsequent backend patch.

Change some `llvm.used` in the ELF code path to use `llvm.compiler.used` to
prevent undesired GC root semantics.

Change one extern "C" alias (due to `__attribute__((used))` in extern "C") to use `llvm.compiler.used` on all targets.

GNU ld has a rule "`__start_/__stop_` references from a live input section retain the associated C identifier name sections",
which LLD may drop entirely (currently refined to exclude SHF_LINK_ORDER/SHF_GROUP) in a future release (the rule makes it clumsy to GC metadata sections; D96914 added a way to try the potential future behavior).
For `llvm.used` global values defined in a C identifier name section, keep using `llvm.used` so that
the future LLD change will not affect them.

rnk kindly categorized the changes:
```
ObjC/blocks: this wants GC root semantics, since ObjC mainly runs on Mac.
MS C++ ABI stuff: wants GC root semantics, no change
OpenMP: unsure, but GC root semantics probably don't hurt
CodeGenModule: affected in this patch to *not* use GC root semantics so that __attribute__((used)) behavior remains the same on ELF, plus two other minor use cases that don't want GC semantics
Coverage: Probably want GC root semantics
CGExpr.cpp: refers to LTO, wants GC root
CGDeclCXX.cpp: one is MS ABI specific, so yes GC root, one is some other C++ init functionality, which should form GC roots (C++ initializers can have side effects and must run)
CGDecl.cpp: Changed in this patch for __attribute__((used))
```

Differential Revision: https://reviews.llvm.org/D97446
2021-02-26 10:42:07 -08:00
Petr Hosek 8459b8ef39 [Driver] Rename -fprofile-{prefix-map,compilation-dir} to -fcoverage-{prefix-map,compilation-dir}
These flags affect coverage mapping (-fcoverage-mapping), not
-fprofile-[instr-]generate so it makes more sense to use the
-fcoverage-* prefix.

Differential Revision: https://reviews.llvm.org/D97434
2021-02-25 21:40:12 -08:00
James Y Knight 24539f1ef2 Add Alignment argument to IRBuilder CreateAtomicRMW and CreateAtomicCmpXchg.
And then push those change throughout LLVM.

Keep the old signature in Clang's CGBuilder for now -- that will be
updated in a follow-on patch (D97224).

The MLIR LLVM-IR dialect is not updated to support the new alignment
attribute, but preserves its existing behavior.

Differential Revision: https://reviews.llvm.org/D97223
2021-02-25 18:29:42 -05:00
Vitaly Buka 9a887f652c [clang,NFC] Fix typos in file headers 2021-02-25 12:47:02 -08:00
Akira Hatanaka ec4408ad69 [CodeGen] Call ConvertTypeForMem instead of ConvertType
This fixes a crash that occurs when the type passed to the method is
`_Bool`.

rdar://74493389
2021-02-25 12:11:18 -08:00
Dan Liew 5d64dd8e3c [Clang][ASan] Introduce `-fsanitize-address-destructor-kind=` driver & frontend option.
The new `-fsanitize-address-destructor-kind=` option allows control over how module
destructors are emitted by ASan.

The new option is consumed by both the driver and the frontend and is propagated into
codegen options by the frontend.

Both the legacy and new pass manager code have been updated to consume the new option
from the codegen options.

It would be nice if the new utility functions (`AsanDtorKindToString` and
`AsanDtorKindFromString`) could live in LLVM instead of Clang so they could be
consumed by other language frontends. Unfortunately that doesn't work because
the clang driver doesn't link against the LLVM instrumentation library.

rdar://71609176

Differential Revision: https://reviews.llvm.org/D96572
2021-02-25 12:02:21 -08:00
Jan Svoboda a25e4a6da3 [clang][cli] Store additional optimization remarks info
After a revision of D96274 changed `DiagnosticOptions` to not store all remark arguments **as-written**, it is no longer possible to reconstruct the arguments accurately from the class.

This is caused by the fact that for `-Rpass=regexp` and friends, `DiagnosticOptions` store only the group name `pass` and not `regexp`. This is the same representation used for the plain `-Rpass` argument.

Note that each argument must be generated exactly once in `CompilerInvocation::generateCC1CommandLine`, otherwise each subsequent call would produce more arguments than the previous one. Currently this works out because of the way `RoundTrip` splits the responsibilities for certain arguments based on what arguments were queried during parsing. However, this invariant breaks when we move to single round-trip for the whole `CompilerInvocation`.

This patch ensures that for one `-Rpass=regexp` argument, we don't generate two arguments (`-Rpass` from `DiagnosticOptions` and `-Rpass=regexp` from `CodeGenOptions`) by shifting the responsibility for handling both cases to `CodeGenOptions`. To distinguish between the cases correctly, additional information is stored in `CodeGenOptions`.

The `CodeGenOptions` parser of `-Rpass[=regexp]` arguments also looks at `-Rno-pass` and `-R[no-]everything`, which is necessary for generating the correct argument regardless of the ordering of `CodeGenOptions`/`DiagnosticOptions` parsing/generation.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D96847
2021-02-25 11:02:49 +01:00
Yaxun (Sam) Liu 47acdec1dd [CUDA][HIP] Support accessing static device variable in host code for -fgpu-rdc
For -fgpu-rdc mode, static device vars in different TU's may have the same name.
To support accessing file-scope static device variables in host code, we need to give them
a distinct name and external linkage. This can be done by postfixing each static device variable with
a distinct CUID (Compilation Unit ID) hash.

Since the static device variables have different name across compilation units, now we let
them have external linkage so that they can be looked up by the runtime.

Reviewed by: Artem Belevich, and Jon Chesterfield

Differential Revision: https://reviews.llvm.org/D85223
2021-02-24 18:23:45 -05:00
Vitaly Buka 8560c2d426 [ThinLTO, NewPM] Run OptimizerLastEPCallbacks from buildThinLTOPreLinkDefaultPipeline
-O1 and above do dont call real optimizer pipeline in ThinLTO PreLink.
Also clang can't add PostLink OptimizerLastEPCallbacks for in-process ThinLTO.
This results in missing sanitizer passes with ThinLTO.

Simple working solution is just call OptimizerLastEPCallbacks
at the end of buildThinLTOPreLinkDefaultPipeline.

Differential Revision: https://reviews.llvm.org/D96320
2021-02-23 22:14:41 -08:00
Dávid Bolvanský 053dc95839 Reduce the number of attributes attached to each function
Patch takes advantage of the implicit default behavior to reduce the number of attributes, which in turns reduces compilation time.

Reviewed By: serge-sans-paille

Differential Revision: https://reviews.llvm.org/D97116
2021-02-24 07:08:44 +01:00
Yaxun (Sam) Liu a3ce7f5cd2 [HIP] Fix managed variable linkage
Currently managed variables are emitted as undefined symbols, which
causes difficulty for diagnosing undefined symbols for non-managed
variables.

This patch transforms managed variables in device compilation so that
they can be emitted as normal variables.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D96195
2021-02-23 22:34:45 -05:00
Hsiangkai Wang 1a35a1b074 [RISCV] Add vadd with mask and without mask builtin.
Demonstrate how to add RISC-V V builtins and lower them to IR intrinsics for V extension.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>

Differential Revision: https://reviews.llvm.org/D93446
2021-02-24 07:57:31 +08:00
James Y Knight fe2dcd89ac DebugInfo: Emit "LocalToUnit" flag on local member function decls.
Previously, the definition was so-marked, but the declaration was
not. This resulted in LLVM's dwarf emission treating the function as
being external, and incorrectly emitting DW_AT_external.

Differential Revision: https://reviews.llvm.org/D96044
2021-02-22 17:55:25 -05:00
Melanie Blower e64fcdf8d5 [clang][patch] Inclusive language, modify filename SanitizerBlacklist.h to NoSanitizeList.h
This patch responds to a comment from @vitalybuka in D96203: suggestion to
do the change incrementally, and start by modifying this file name. I modified
the file name and made the other changes that follow from that rename.

Reviewers: vitalybuka, echristo, MaskRay, jansvoboda11, aaron.ballman

Differential Revision: https://reviews.llvm.org/D96974
2021-02-22 15:11:37 -05:00
Ryan Santhiraraja 2c25efcbd3 [AArch64] Adding SHA3 Intrinsics support
This patch adds the following SHA3 Intrinsics:
        vsha512hq_u64,
        vsha512h2q_u64,
        vsha512su0q_u64,
        vsha512su1q_u64
        veor3q_u8
        veor3q_u16
        veor3q_u32
        veor3q_u64
        veor3q_s8
        veor3q_s16
        veor3q_s32
        veor3q_s64
        vrax1q_u64
        vxarq_u64
        vbcaxq_u8
        vbcaxq_u16
        vbcaxq_u32
        vbcaxq_u64
        vbcaxq_s8
        vbcaxq_s16
        vbcaxq_s32
        vbcaxq_s64

    Note need to include +sha3 and +crypto when building from the front-end

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D96381
2021-02-22 12:09:20 +00:00
Dávid Bolvanský ee51c42e00 Reduce the number of attributes attached to each function
This takes advantage of the implicit default behavior to reduce the number of
attributes.
2021-02-20 06:57:47 +01:00
Petr Hosek 3275b18f89 [Coverage] Normalize compilation dir as well
This matches debug info behavior.

Differential Revision: https://reviews.llvm.org/D97001
2021-02-19 15:29:03 -08:00
Christopher Tetreault 55448ab540 [AArch64] Adding Neon Polynomial vadd Intrinsics
This patch adds the following intrinsics:
            vadd_p8
            vadd_p16
            vadd_p64
            vaddq_p8
            vaddq_p16
            vaddq_p64
            vaddq_p128

Reviewed By: t.p.northover, DavidSpickett, ctetreau

Differential Revision: https://reviews.llvm.org/D96825
2021-02-19 14:48:12 -08:00
Teresa Johnson 0923a60ea7 [clang] Emit type metadata on available_externally vtables for WPD
When WPD is enabled, via WholeProgramVTables, emit type metadata for
available_externally vtables. Additionally, add the vtables to the
llvm.compiler.used global so that they are not prematurely eliminated
(before *LTO analysis).

This is needed to avoid devirtualizing calls to a function overriding a
class defined in a header file but with a strong definition in a shared
library. Without type metadata on the available_externally vtables from
the header, the WPD analysis never sees what a derived class is
overriding. Even if the available_externally base class functions are
pure virtual, because shared library definitions are already treated
conservatively (committed patches D91583, D96721, and D96722) we will
not devirtualize, which would be unsafe since the library might contain
overrides that aren't visible to the LTO unit.

An example is std::error_category, which is overridden in LLVM
and causing failures after a self build with WPD enabled, because
libstdc++ contains hidden overrides of the virtual base class methods.

Differential Revision: https://reviews.llvm.org/D96919
2021-02-19 12:42:34 -08:00
Petr Hosek 5fbd1a333a [Coverage] Store compilation dir separately in coverage mapping
We currently always store absolute filenames in coverage mapping.  This
is problematic for several reasons. It poses a problem for distributed
compilation as source location might vary across machines.  We are also
duplicating the path prefix potentially wasting space.

This change modifies how we store filenames in coverage mapping. Rather
than absolute paths, it stores the compilation directory and file paths
as given to the compiler, either relative or absolute. Later when
reading the coverage mapping information, we recombine relative paths
with the working directory. This approach is similar to handling
ofDW_AT_comp_dir in DWARF.

Finally, we also provide a new option, -fprofile-compilation-dir akin
to -fdebug-compilation-dir which can be used to manually override the
compilation directory which is useful in distributed compilation cases.

Differential Revision: https://reviews.llvm.org/D95753
2021-02-18 14:34:39 -08:00
Sterling Augustine fc97a63db0 Move a second variable only used in an assert into the assert.
This prevents unused variable warnings when building without asserts.
2021-02-18 13:26:07 -08:00
Sterling Augustine 4544a63b77 Move variable only used in an assert into the assert.
This prevents unused variable warnings when building without asserts.
2021-02-18 13:04:58 -08:00
Petr Hosek fbf8b957fd Revert "[Coverage] Store compilation dir separately in coverage mapping"
This reverts commit 97ec8fa5bb since
the test is failing on some bots.
2021-02-18 12:50:24 -08:00
Pengxuan Zheng 0ec32f1326 Revert "[AArch64] Adding Neon Polynomial vadd Intrinsics"
Revert the patch due to buildbot failures.

This reverts commit d9645059c5.
2021-02-18 12:38:16 -08:00
Petr Hosek 97ec8fa5bb [Coverage] Store compilation dir separately in coverage mapping
We currently always store absolute filenames in coverage mapping.  This
is problematic for several reasons. It poses a problem for distributed
compilation as source location might vary across machines.  We are also
duplicating the path prefix potentially wasting space.

This change modifies how we store filenames in coverage mapping. Rather
than absolute paths, it stores the compilation directory and file paths
as given to the compiler, either relative or absolute. Later when
reading the coverage mapping information, we recombine relative paths
with the working directory. This approach is similar to handling
ofDW_AT_comp_dir in DWARF.

Finally, we also provide a new option, -fprofile-compilation-dir akin
to -fdebug-compilation-dir which can be used to manually override the
compilation directory which is useful in distributed compilation cases.

Differential Revision: https://reviews.llvm.org/D95753
2021-02-18 12:27:42 -08:00
Zequan Wu d83511dd26 [Coverage] Emit gap region after conditions when macro is present. 2021-02-18 11:41:04 -08:00
Pengxuan Zheng d9645059c5 [AArch64] Adding Neon Polynomial vadd Intrinsics
This patch adds the following intrinsics:
            vadd_p8
            vadd_p16
            vadd_p64
            vaddq_p8
            vaddq_p16
            vaddq_p64
            vaddq_p128

Reviewed By: t.p.northover, DavidSpickett

Differential Revision: https://reviews.llvm.org/D96825
2021-02-18 11:33:24 -08:00
Jonas Paulsson e57bd1ff4f [CFE, SystemZ] New target hook testFPKind() for checks of FP values.
The recent commit 00a6254 "Stop traping on sNaN in builtin_isnan" changed the
lowering in constrained FP mode of builtin_isnan from an FP comparison to
integer operations to avoid trapping.

SystemZ has a special instruction "Test Data Class" which is the preferred
way to do this check. This patch adds a new target hook "testFPKind()" that
lets SystemZ emit the s390_tdc intrinsic instead.

testFPKind() takes the BuiltinID as an argument and is expected to soon
handle more opcodes than just 'builtin_isnan'.

Review: Thomas Preud'homme, Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D96568
2021-02-18 12:36:46 -06:00
Jeroen Dobbelaere 46757ccb49 [clang] functions with the 'const' or 'pure' attribute must always return.
As described in
* https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-pure-function-attribute
* https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-const-function-attribute

An `__attribute__((pure))` function must always return, as well as an `__attribute__((const))` function.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D96960
2021-02-18 17:29:46 +01:00
Hsiangkai Wang 766ee1096f [Clang][RISCV] Define RISC-V V builtin types
Add the types for the RISC-V V extension builtins.

These types will be used by the RISC-V V intrinsics which require
types of the form <vscale x 1 x i64>(LMUL=1 element size=64) or
<vscale x 4 x i32>(LMUL=2 element size=32), etc. The vector_size
attribute does not work for us as it doesn't create a scalable
vector type. We want these types to be opaque and have no operators
defined for them. We want them to be sizeless. This makes them
similar to the ARM SVE builtin types. But we will have quite a bit
more types. This patch adds around 60. Later patches will add
another 230 or so types representing tuples of these types similar
to the x2/x3/x4 types in ARM SVE. But with extra complexity that
these types are combined with the LMUL concept that is unique to
RISCV.

For more background see this RFC
http://lists.llvm.org/pipermail/llvm-dev/2020-October/145850.html

Authored-by: Roger Ferrer Ibanez <roger.ferrer@bsc.es>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>

Differential Revision: https://reviews.llvm.org/D92715
2021-02-18 10:17:31 +08:00
Stanislav Mekhanoshin a8d9d50762 [AMDGPU] gfx90a support
Differential Revision: https://reviews.llvm.org/D96906
2021-02-17 16:01:32 -08:00
Igor Kudrin aa84289629 [DebugInfo] Keep the DWARF64 flag in the module metadata
This allows the option to affect the LTO output. Module::Max helps to
generate debug info for all modules in the same format.

Differential Revision: https://reviews.llvm.org/D96597
2021-02-17 17:03:34 +07:00
Alexey Bataev 60d71a286b [OPENMP50]Allow overlapping mapping in target constructs.
OpenMP 5.0 removed a lot of restriction for overlapped mapped items
comparing to OpenMP 4.5. Patch restricts the checks for overlapped data
mappings only for OpenMP 4.5 and less and reorders mapping of the
arguments so, that present and alloc mappings are processed first and
then all others.

Differential Revision: https://reviews.llvm.org/D86119
2021-02-16 14:42:08 -08:00
Michael Kruse 6c05005238 [OpenMP] Implement '#pragma omp tile', by Michael Kruse (@Meinersbur).
The tile directive is in OpenMP's Technical Report 8 and foreseeably will be part of the upcoming OpenMP 5.1 standard.

This implementation is based on an AST transformation providing a de-sugared loop nest. This makes it simple to forward the de-sugared transformation to loop associated directives taking the tiled loops. In contrast to other loop associated directives, the OMPTileDirective does not use CapturedStmts. Letting loop associated directives consume loops from different capture context would be difficult.

A significant amount of code generation logic is taking place in the Sema class. Eventually, I would prefer if these would move into the CodeGen component such that we could make use of the OpenMPIRBuilder, together with flang. Only expressions converting between the language's iteration variable and the logical iteration space need to take place in the semantic analyzer: Getting the of iterations (e.g. the overload resolution of `std::distance`) and converting the logical iteration number to the iteration variable (e.g. overload resolution of `iteration + .omp.iv`). In clang, only CXXForRangeStmt is also represented by its de-sugared components. However, OpenMP loop are not defined as syntatic sugar. Starting with an AST-based approach allows us to gradually move generated AST statements into CodeGen, instead all at once.

I would also like to refactor `checkOpenMPLoop` into its functionalities in a follow-up. In this patch it is used twice. Once for checking proper nesting and emitting diagnostics, and additionally for deriving the logical iteration space per-loop (instead of for the loop nest).

Differential Revision: https://reviews.llvm.org/D76342
2021-02-16 09:45:07 -08:00
serge-sans-paille 3c8bf29f14
Reduce the number of attributes attached to each function
This takes advantage of the implicit default behavior to reduce the number of
attributes, which in turns reduces compilation time. I've observed -3% in
instruction count when compiling sqlite3 amalgamation with -O0

Differential Revision: https://reviews.llvm.org/D96400
2021-02-16 16:19:54 +01:00
Marco Antognini e54811ff7e Restore diagnostic handler after CodeGenAction::ExecuteAction
Fix dangling pointer to local variable and address some typos.

Reviewed By: xur

Differential Revision: https://reviews.llvm.org/D96487
2021-02-15 10:33:00 +00:00
Wang, Pengfei 61da20575d [X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506)
This is a follow up of D92940.

We have successfully converted fadd/fmul _mm_reduce_* intrinsics to
llvm.reduction + reassoc flag. We can do the same approach for fmin/fmax
too, i.e. llvm.reduction + nnan flag.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D93179
2021-02-15 08:52:06 +08:00
Malhar 74ddacd30d [Clang] Ensure vector predication loop metadata is always emitted when pragma is specified.
This patch ensures that vector predication and vectorization width
pragmas work together correctly/as expected. Specifically, this patch
fixes the issue that when vectorization_width > 1, the vector
predication behaviour (this would matter if it has NOT been disabled
explicitly by a pragma) was getting ignored, which was incorrect.

The fix here removes the dependence of vector predication on the
vectorization width. The loop metadata corresponding to clang loop
pragma vectorize_predicate is always emitted, if the pragma is
specified, even if vectorization is disabled by vectorize_width(1)
or vectorize(disable) since the option is also used for interleaving
by the LoopVectorize pass.

Reviewed By: dmgreen, Meinersbur

Differential Revision: https://reviews.llvm.org/D94779
2021-02-13 17:35:54 -06:00
Artur Gainullin ff50b121e3 [SYCL] Ignore file-scope asm during device-side SYCL compilation.
Reviewed By: bader, eandrews

Differential Revision: https://reviews.llvm.org/D96538
2021-02-12 17:00:45 -08:00
Florian Hahn 6280bb4cd8 [clang] Remove redundant condition (NFC). 2021-02-12 20:14:24 +00:00
Florian Hahn 51bf4c0e6d [clang] Add -ffinite-loops & -fno-finite-loops options.
This patch adds 2 new options to control when Clang adds `mustprogress`:

  1. -ffinite-loops: assume all loops are finite; mustprogress is added
     to all loops, regardless of the selected language standard.
  2. -fno-finite-loops: assume no loop is finite; mustprogress is not
     added to any loop or function. We could add mustprogress to
     functions without loops, but we would have to detect that in Clang,
     which is probably not worth it.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D96419
2021-02-12 19:25:49 +00:00
Amy Huang 3fe465fb2c Revert "[DebugInfo] Add an attribute to force type info to be emitted for"
Didn't mean to commit this.

This reverts commit 1b5c2915a2.
2021-02-12 10:18:17 -08:00
Amy Huang 1b5c2915a2 [DebugInfo] Add an attribute to force type info to be emitted for
class types.

The goal is to provide a way to bypass constructor homing when emitting
class definitions and force class definitions in the debug info.

Not sure about the wording of the attribute, or whether it should be
specific to classes with constructors
2021-02-12 10:16:49 -08:00
Akira Hatanaka ed4718eccb [ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of
explicitly emitting retainRV or claimRV calls in the IR

Background:

This fixes a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.

https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue

What this patch does to fix the problem:

- The front-end adds operand bundle "clang.arc.attachedcall" to calls,
  which indicates the call is implicitly followed by a marker
  instruction and an implicit retainRV/claimRV call that consumes the
  call result. In addition, it emits a call to
  @llvm.objc.clang.arc.noop.use, which consumes the call result, to
  prevent the middle-end passes from changing the return type of the
  called function. This is currently done only when the target is arm64
  and the optimization level is higher than -O0.

- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
  with the operand bundle in the IR and removes the inserted calls after
  processing the function.

- ARC contract pass emits retainRV/claimRV calls after the call with the
  operand bundle. It doesn't remove the operand bundle on the call since
  the backend needs it to emit the marker instruction. The retainRV and
  claimRV calls are emitted late in the pipeline to prevent optimization
  passes from transforming the IR in a way that makes it harder for the
  ARC middle-end passes to figure out the def-use relationship between
  the call and the retainRV/claimRV calls (which is the cause of
  PR31925).

- The function inliner removes an autoreleaseRV call in the callee if
  nothing in the callee prevents it from being paired up with the
  retainRV/claimRV call in the caller. It then inserts a release call if
  claimRV is attached to the call since autoreleaseRV+claimRV is
  equivalent to a release. If it cannot find an autoreleaseRV call, it
  tries to transfer the operand bundle to a function call in the callee.
  This is important since the ARC optimizer can remove the autoreleaseRV
  returning the callee result, which makes it impossible to pair it up
  with the retainRV/claimRV call in the caller. If that fails, it simply
  emits a retain call in the IR if retainRV is attached to the call and
  does nothing if claimRV is attached to it.

- SCCP refrains from replacing the return value of a call with a
  constant value if the call has the operand bundle. This ensures the
  call always has at least one user (the call to
  @llvm.objc.clang.arc.noop.use).

- This patch also fixes a bug in replaceUsesOfNonProtoConstant where
  multiple operand bundles of the same kind were being added to a call.

Future work:

- Use the operand bundle on x86-64.

- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
  calls with the operand bundles.

rdar://71443534

Differential Revision: https://reviews.llvm.org/D92808
2021-02-12 09:51:57 -08:00
Yaxun (Sam) Liu 053e61d54e Relands "[HIP] Change default --gpu-max-threads-per-block value to 1024"
This reverts commit e384e94fbe.
2021-02-12 10:53:59 -05:00
Vitaly Buka 686b65f85f [Msan, NewPM] Reduce size of msan binaries
EarlyCSEPass called after msan redices code size by about 10%.
Similar optimization exists for legacy pass manager in
addGeneralOptsForMemorySanitizer.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D96406
2021-02-11 16:07:18 -08:00
Vitaly Buka f2f59d2a06 [NFC] Extract function which registers sanitizer passes
Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D96481
2021-02-11 15:29:48 -08:00
Pengxuan Zheng 61cca0f2e5 [AArch64] Adding Neon Sm3 & Sm4 Intrinsics
This adds SM3 and SM4 Intrinsics support for AArch64, specifically:
        vsm3ss1q_u32
        vsm3tt1aq_u32
        vsm3tt1bq_u32
        vsm3tt2aq_u32
        vsm3tt2bq_u32
        vsm3partw1q_u32
        vsm3partw2q_u32
        vsm4eq_u32
        vsm4ekeyq_u32

Reviewed By: labrinea

Differential Revision: https://reviews.llvm.org/D95655
2021-02-11 14:20:20 -08:00
Vitaly Buka b6051f52ac [Clang, NewPM] Add KMSan support
Depends on D96320.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D96328
2021-02-10 14:07:49 -08:00
Artem Belevich 2aa01ccec3 [CUDA, NVPTX] Allow targeting sm_86 GPUs.
The patch only plumbs through the option necessary for targeting sm_86 GPUs w/o
adding any new functionality.

Differential Revision: https://reviews.llvm.org/D95974
2021-02-09 11:01:10 -08:00
Nico Weber de1966e542 Revert "[ObjC][ARC] Use operand bundle 'clang.arc.rv' instead of explicitly"
This reverts commit 4a64d8fe39.
Makes clang crash when buildling trivial iOS programs, see comment
after https://reviews.llvm.org/D92808#2551401
2021-02-09 11:06:32 -05:00
Anastasia Stulova 79b222c39f [OpenCL] Fix types with signed prefix in arginfo metadata.
Signed prefix is removed and the single word spelling is
printed for the scalar types.

Tags: #clang

Differential Revision: https://reviews.llvm.org/D96161
2021-02-09 15:13:19 +00:00
Wang, Pengfei dd2460ed5d [X86] Always assign reassoc flag for intrinsics *reduce_add/mul_ps/pd.
Intrinsics *reduce_add/mul_ps/pd have assumption that the elements in
the vector are reassociable. So we need to always assign the reassoc
flag when we call _mm_reduce_* intrinsics.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D96231
2021-02-09 21:14:06 +08:00
Xiangling Liao 6b1e2fc893 [FE] Manipulate the first byte of guard variable type in both load and store operation
As Itanium ABI[http://itanium-cxx-abi.github.io/cxx-abi/abi.html#once-ctor]
points out:

"The size of the guard variable is 64 bits. The first byte (i.e. the byte at
the address of the full variable) shall contain the value 0 prior to
initialization of the associated variable, and 1 after initialization is complete."

Differential Revision: https://reviews.llvm.org/D95822
2021-02-08 11:14:34 -05:00
Anastasia Stulova ecc8ac3f08 [OpenCL] Fix pipe type printing in arg info metadata
Pipe element type spelling for arg info metadata
should follow the same behavior as normal type spelling.

We should only use the canonical type spelling in the
base type field.

This patch also removed duplication in type handling.

Tags: #clang

Differential Revision: https://reviews.llvm.org/D96151
2021-02-08 16:05:13 +00:00
Petr Hosek 9fd9b5a9c9 Don't emit coverage mapping for excluded functions
When a function or a file is excluded using -fprofile-list= option,
don't emit coverage mapping as doing so confuses users since those
functions would always have zero count. This also reduces the binary
size considerably in cases where only a few functions or files are
being instrumented.

Differential Revision: https://reviews.llvm.org/D96000
2021-02-05 13:03:57 -08:00
Yaxun (Sam) Liu b008ea304d [CUDA][HIP] Fix device variable linkage
For -fgpu-rdc, shadow variables should not be internalized, otherwise
they cannot be accessed by other TUs. This is necessary because
the shadow variable of external device variables are always
emitted as undefined symbols, which need to resolve to a global
symbols.

Managed variables need to be emitted as undefined symbols
in device compilations.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D95901
2021-02-05 15:11:12 -05:00
Thomas Preud'homme 00a62547da Stop traping on sNaN in __builtin_isnan
__builtin_isnan currently generates a floating-point compare operation
which triggers a trap when faced with a signaling NaN in StrictFP mode.
This commit uses integer operations instead to not generate any trap in
such a case.

Reviewed By: kpn

Differential Revision: https://reviews.llvm.org/D95948
2021-02-05 18:28:48 +00:00
Michael Liao 01bf529db2 Recommit of a2fdf9d4d7.
- The failures are all cc1-based tests due to the missing `-aux-triple` options,
which is always prepared by the driver in CUDA/HIP compilation.
- Add extra check on the missing aux-targetinfo to prevent crashing.

[hip][cuda] Enable extended lambda support on Windows.

- On Windows, extended lambda has extra issues due to the numbering
schemes are different between the host compilation (Microsoft C++ ABI)
and the device compilation (Itanium C++ ABI. Additional device side
lambda number is required per lambda for the host compilation to
correctly mangle the device-side lambda name.
- A hybrid numbering context `MSHIPNumberingContext` is introduced to
number a lambda for both host- and device-compilations.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D69322

This reverts commit 4874ff0241.
2021-02-05 11:27:30 -05:00
Akira Hatanaka 4a64d8fe39 [ObjC][ARC] Use operand bundle 'clang.arc.rv' instead of explicitly
emitting retainRV or claimRV calls in the IR

This reapplies 3fe3946d9a without the
changes made to lib/IR/AutoUpgrade.cpp, which was violating layering.

Original commit message:

Background:

This patch makes changes to the front-end and middle-end that are
needed to fix a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.

https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue

What this patch does to fix the problem:

- The front-end adds operand bundle "clang.arc.rv" to calls, which
  indicates the call is implicitly followed by a marker instruction and
  an implicit retainRV/claimRV call that consumes the call result. In
  addition, it emits a call to @llvm.objc.clang.arc.noop.use, which
  consumes the call result, to prevent the middle-end passes from changing
  the return type of the called function. This is currently done only when
  the target is arm64 and the optimization level is higher than -O0.

- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
  with the operand bundle in the IR and removes the inserted calls after
  processing the function.

- ARC contract pass emits retainRV/claimRV calls after the call with the
  operand bundle. It doesn't remove the operand bundle on the call since
  the backend needs it to emit the marker instruction. The retainRV and
  claimRV calls are emitted late in the pipeline to prevent optimization
  passes from transforming the IR in a way that makes it harder for the
  ARC middle-end passes to figure out the def-use relationship between
  the call and the retainRV/claimRV calls (which is the cause of
  PR31925).

- The function inliner removes an autoreleaseRV call in the callee if
  nothing in the callee prevents it from being paired up with the
  retainRV/claimRV call in the caller. It then inserts a release call if
  the call is annotated with claimRV since autoreleaseRV+claimRV is
  equivalent to a release. If it cannot find an autoreleaseRV call, it
  tries to transfer the operand bundle to a function call in the callee.
  This is important since ARC optimizer can remove the autoreleaseRV
  returning the callee result, which makes it impossible to pair it up
  with the retainRV/claimRV call in the caller. If that fails, it simply
  emits a retain call in the IR if the implicit call is a call to
  retainRV and does nothing if it's a call to claimRV.

Future work:

- Use the operand bundle on x86-64.

- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
  calls annotated with the operand bundles.

rdar://71443534

Differential Revision: https://reviews.llvm.org/D92808
2021-02-05 06:09:42 -08:00
Akira Hatanaka 2fbbb18c1d Revert "[ObjC][ARC] Use operand bundle 'clang.arc.rv' instead of explicitly"
This reverts commit 3fe3946d9a.

The commit violates layering by including a header from Analysis in
lib/IR/AutoUpgrade.cpp.
2021-02-05 06:00:05 -08:00
Akira Hatanaka 3fe3946d9a [ObjC][ARC] Use operand bundle 'clang.arc.rv' instead of explicitly
emitting retainRV or claimRV calls in the IR

Background:

This patch makes changes to the front-end and middle-end that are
needed to fix a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.

https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue

What this patch does to fix the problem:

- The front-end adds operand bundle "clang.arc.rv" to calls, which
  indicates the call is implicitly followed by a marker instruction and
  an implicit retainRV/claimRV call that consumes the call result. In
  addition, it emits a call to @llvm.objc.clang.arc.noop.use, which
  consumes the call result, to prevent the middle-end passes from changing
  the return type of the called function. This is currently done only when
  the target is arm64 and the optimization level is higher than -O0.

- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
  with the operand bundle in the IR and removes the inserted calls after
  processing the function.

- ARC contract pass emits retainRV/claimRV calls after the call with the
  operand bundle. It doesn't remove the operand bundle on the call since
  the backend needs it to emit the marker instruction. The retainRV and
  claimRV calls are emitted late in the pipeline to prevent optimization
  passes from transforming the IR in a way that makes it harder for the
  ARC middle-end passes to figure out the def-use relationship between
  the call and the retainRV/claimRV calls (which is the cause of
  PR31925).

- The function inliner removes an autoreleaseRV call in the callee if
  nothing in the callee prevents it from being paired up with the
  retainRV/claimRV call in the caller. It then inserts a release call if
  the call is annotated with claimRV since autoreleaseRV+claimRV is
  equivalent to a release. If it cannot find an autoreleaseRV call, it
  tries to transfer the operand bundle to a function call in the callee.
  This is important since ARC optimizer can remove the autoreleaseRV
  returning the callee result, which makes it impossible to pair it up
  with the retainRV/claimRV call in the caller. If that fails, it simply
  emits a retain call in the IR if the implicit call is a call to
  retainRV and does nothing if it's a call to claimRV.

Future work:

- Use the operand bundle on x86-64.

- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
  calls annotated with the operand bundles.

rdar://71443534

Differential Revision: https://reviews.llvm.org/D92808
2021-02-05 05:55:18 -08:00
Nico Weber 4874ff0241 Revert "[hip][cuda] Enable extended lambda support on Windows."
This reverts commit a2fdf9d4d7.
Slightly speculative, seeing several cuda tests fail on this
Windows bot: http://45.33.8.238/win/32620/step_7.txt
2021-02-04 07:10:46 -05:00
Michael Liao a2fdf9d4d7 [hip][cuda] Enable extended lambda support on Windows.
- On Windows, extended lambda has extra issues due to the numbering
  schemes are different between the host compilation (Microsoft C++ ABI)
  and the device compilation (Itanium C++ ABI. Additional device side
  lambda number is required per lambda for the host compilation to
  correctly mangle the device-side lambda name.
- A hybrid numbering context `MSHIPNumberingContext` is introduced to
  number a lambda for both host- and device-compilations.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D69322
2021-02-04 01:38:29 -05:00
Nico Weber b995314143 Revert "[InstrProfiling] Use !associated metadata for counters, data and values"
This reverts commit 97ba5cde52.
Still breaks tests: https://reviews.llvm.org/D76802#2540647
2021-02-03 19:14:34 -05:00
Zequan Wu 4dc08cc3aa [Coverage] Propogate counter to condition of conditional operator
Clang usually propagates counter mapping region for conditions of `if`, `while`,
`for`, etc from parent counter. We should do the same for condition of conditional operator.

Differential Revision: https://reviews.llvm.org/D95918
2021-02-03 13:33:22 -08:00
Yaxun (Sam) Liu 0b2af1a288 [NFC][CUDA] Refactor registering device variable
Extract registering device variable to CUDA runtime codegen function since it
will be called in multiple places.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D95558
2021-02-03 14:29:51 -05:00
Kevin P. Neal 81b69879c9 [FPEnv][X86] Platform builtins edition: clang should get from the AST the metadata for constrained FP builtins
Currently clang is not correctly retrieving from the AST the metadata for
constrained FP builtins. This patch fixes that for the X86 specific builtins.

Differential Revision: https://reviews.llvm.org/D94614
2021-02-03 11:49:17 -05:00
Petr Hosek 97ba5cde52 [InstrProfiling] Use !associated metadata for counters, data and values
C identifier name input sections such as __llvm_prf_* are GC roots so
they cannot be discarded. In LLD, the SHF_LINK_ORDER flag overrides the
C identifier name semantics.

The !associated metadata may be attached to a global object declaration
with a single argument that references another global object, and it
gets lowered to SHF_LINK_ORDER flag. When a function symbol is discarded
by the linker, setting up !associated metadata allows linker to discard
counters, data and values associated with that function symbol.

Note that !associated metadata is only supported by ELF, it does not have
any effect on non-ELF targets.

Differential Revision: https://reviews.llvm.org/D76802
2021-02-02 23:19:51 -08:00
Tom Weaver 4f1320b77d Revert "[InstrProfiling] Use !associated metadata for counters, data and values"
This reverts commit df3e39f60b.

introduced failing test instrprof-gc-sections.c
causing build bot to fail:
http://lab.llvm.org:8011/#/builders/53/builds/1184
2021-02-02 14:19:31 +00:00
Hans Wennborg 0479c53b6c [dllimport] Honor always_inline when deciding whether a dllimport function should be available for inlining (PR48925)
Normally, Clang will not make dllimport functions available for inlining
if they reference non-imported symbols, as this can lead to confusing
link errors. But if the function is marked always_inline, the user
presumably knows what they're doing and the attribute should be honored.

Differential revision: https://reviews.llvm.org/D95673
2021-02-02 10:28:32 +01:00
Petr Hosek df3e39f60b [InstrProfiling] Use !associated metadata for counters, data and values
C identifier name input sections such as __llvm_prf_* are GC roots so
they cannot be discarded. In LLD, the SHF_LINK_ORDER flag overrides the
C identifier name semantics.

The !associated metadata may be attached to a global object declaration
with a single argument that references another global object, and it
gets lowered to SHF_LINK_ORDER flag. When a function symbol is discarded
by the linker, setting up !associated metadata allows linker to discard
counters, data and values associated with that function symbol.

Note that !associated metadata is only supported by ELF, it does not have
any effect on non-ELF targets.

Differential Revision: https://reviews.llvm.org/D76802
2021-02-01 15:01:43 -08:00
xgupta 94fac81fcc [Branch-Rename] Fix some links
According to the [[ https://foundation.llvm.org/docs/branch-rename/ | status of branch rename ]], the master branch of the LLVM repository is removed on 28 Jan 2021.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D95766
2021-02-01 16:43:21 +05:30
Amy Huang d5f5deee9e Reland "[DebugInfo][CodeView] Use <lambda_n> as the display name for lambdas"
with fix to test case and stringrefs.

Currently (for codeview) lambdas have a string like `<lambda_0>` in
their mangled name, and don't have any display name. This change uses the
`<lambda_0>` as the display name, which helps distinguish between lambdas
in -gline-tables-only, since there are no linkage names there.
It also changes how we display lambda names; previously we used
`<unnamed-tag>`; now it will show `<lambda_0>`.

I added a function to the mangling context code to create this string;
for Itanium it just returns an empty string.

Bug: https://bugs.llvm.org/show_bug.cgi?id=48432

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D95187

This reverts 9b21d4b943
2021-01-28 18:44:48 -08:00
Amy Huang 9b21d4b943 Revert "[DebugInfo][CodeView] Use <lambda_n> as the display name for lambdas."
for test failures.

This reverts commit d73564c510.
2021-01-28 16:41:26 -08:00
Amy Huang d73564c510 [DebugInfo][CodeView] Use <lambda_n> as the display name for lambdas.
Currently (for codeview) lambdas have a string like `<lambda_0>` in
their mangled name, and don't have any display name. This change uses the
`<lambda_0>` as the display name, which helps distinguish between lambdas
in -gline-tables-only, since there are no linkage names there.
It also changes how we display lambda names; previously we used
`<unnamed-tag>`; now it will show `<lambda_0>`.

I added a function to the mangling context code to create this string;
for Itanium it just returns an empty string.

Bug: https://bugs.llvm.org/show_bug.cgi?id=48432

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D95187
2021-01-28 16:30:38 -08:00
Thomas Lively 4b68b64dcc [WebAssembly] Prototype i8x16 to i32x4 widening instructions
As proposed in https://github.com/WebAssembly/simd/pull/395 and matching the
opcodes used in V8:
https://chromium-review.googlesource.com/c/v8/v8/+/2617385/4/src/wasm/wasm-opcodes.h

Differential Revision: https://reviews.llvm.org/D95557
2021-01-28 10:59:32 -08:00
Freddy Ye 1edb76cc91 [X86] merge "={eax}" and "~{eax}" into "=&eax" for MSInlineASM
Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D94466
2021-01-27 22:54:17 +08:00
Petr Hosek bb9eb19829 Support for instrumenting only selected files or functions
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.

Differential Revision: https://reviews.llvm.org/D94820
2021-01-26 17:13:34 -08:00
Fangrui Song 34b60d8a56 Add -fbinutils-version= to gate ELF features on the specified binutils version
There are two use cases.

Assembler
We have accrued some code gated on MCAsmInfo::useIntegratedAssembler().  Some
features are supported by latest GNU as, but we have to use
MCAsmInfo::useIntegratedAs() because the newer versions have not been widely
adopted (e.g. SHF_LINK_ORDER 'o' and 'unique' linkage in 2.35, --compress-debug-sections= in 2.26).

Linker
We want to use features supported only by LLD or very new GNU ld, or don't want
to work around older GNU ld. We currently can't represent that "we don't care
about old GNU ld".  You can find such workarounds in a few other places, e.g.
Mips/MipsAsmprinter.cpp PowerPC/PPCTOCRegDeps.cpp X86/X86MCInstrLower.cpp
AArch64 TLS workaround for R_AARCH64_TLSLD_MOVW_DTPREL_* (PR ld/18276),
R_AARCH64_TLSLE_LDST8_TPREL_LO12 (https://bugs.llvm.org/show_bug.cgi?id=36727 https://sourceware.org/bugzilla/show_bug.cgi?id=22969)

Mixed SHF_LINK_ORDER and non-SHF_LINK_ORDER components (supported by LLD in D84001;
GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=16833 may take a while before available).
This feature allows to garbage collect some unused sections (e.g. fragmented .gcc_except_table).

This patch adds `-fbinutils-version=` to clang and `-binutils-version` to llc.
It changes one codegen place in SHF_MERGE to demonstrate its usage.
`-fbinutils-version=2.35` means the produced object file does not care about GNU
ld<2.35 compatibility. When `-fno-integrated-as` is specified, the produced
assembly can be consumed by GNU as>=2.35, but older versions may not work.

`-fbinutils-version=none` means that we can use all ELF features, regardless of
GNU as/ld support.

Both clang and llc need `parseBinutilsVersion`. Such command line parsing is
usually implemented in `llvm/lib/CodeGen/CommandFlags.cpp` (LLVMCodeGen),
however, ClangCodeGen does not depend on LLVMCodeGen. So I add
`parseBinutilsVersion` to `llvm/lib/Target/TargetMachine.cpp` (LLVMTarget).

Differential Revision: https://reviews.llvm.org/D85474
2021-01-26 12:28:23 -08:00
Petr Hosek 1e634f3952 Revert "Support for instrumenting only selected files or functions"
This reverts commit 4edf35f11a because
the test fails on Windows bots.
2021-01-26 12:25:28 -08:00
Fangrui Song 189f311130 CGDebugInfo CreatedLimitedType: Drop file/line for RecordType with invalid location
For Clang synthesized `__va_list_tag` (`CreateX86_64ABIBuiltinVaListDecl`),
its DW_AT_decl_file/DW_AT_decl_line are arbitrarily set from `CurLoc`.

In a stage 2 `-DCMAKE_BUILD_TYPE=Debug` clang build, I observe that
in driver.cpp, DW_AT_decl_file/DW_AT_decl_line may be set to an `#include` line
(the transitively included file uses va_arg (`__builtin_va_arg`)).
This seems arbitrary. Drop that.

Reviewed By: #debug-info, dblaikie

Differential Revision: https://reviews.llvm.org/D94735
2021-01-26 11:53:25 -08:00
Fangrui Song 31d375f178 CGDebugInfo: Drop Loc.isInvalid() special case from getLineNumber
`getLineNumber()` picks CurLoc if the parameter is invalid. This appears to
mainly work around missing SourceLocation information for some constructs, but
sometimes adds unintended locations.

* For `CodeGenObjC/debug-info-blocks.m`, `CurLoc` has been advanced to the closing brace. The debug line of `ImplicitVarParameter` is set to the line of `}` because this implicit parameter has an invalid `SourceLocation`. The debug line is a bit arbitrary - perhaps the location of `^{` is better.
* The file/line of Clang synthesized `__va_list_tag` is arbitrarily attached a `#include` line. D94735

Drop the special case to make getLineNumber less magic and add CurLoc fallback in its callers instead.

Tested with stage 2 -DCMAKE_BUILD_TYPE=Debug clang, byte identical.

Reviewed By: #debug-info, aprantl

Differential Revision: https://reviews.llvm.org/D94391
2021-01-26 11:44:41 -08:00
Petr Hosek 4edf35f11a Support for instrumenting only selected files or functions
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.

Differential Revision: https://reviews.llvm.org/D94820
2021-01-26 11:11:39 -08:00
Freddy Ye b3b0acdc6f [NFC] Refine some uninitialized used variables.
These warning are reported by static code analysis tool: Klocwork

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D95421
2021-01-26 16:51:05 +08:00
Johannes Doerfert bd756286d2 [OpenMP][FIX] Enforce a function boundary for a new data environment
Whenever we enter a new OpenMP data environment we want to enter a
function to simplify reasoning. Later we probably want to remove the
entire specialization wrt. the if clause and pass the result to the
runtime, for now this should fix PR48686.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D94315
2021-01-25 22:43:37 -06:00
Richard Smith 925ae8c790 Revert "[ObjC][ARC] Annotate calls with attributes instead of emitting retainRV"
This reverts commit 53176c1680, which
introduceed a layering violation. LLVM's IR library can't include
headers from Analysis.
2021-01-25 13:53:38 -08:00
Akira Hatanaka 53176c1680 [ObjC][ARC] Annotate calls with attributes instead of emitting retainRV
or claimRV calls in the IR

Background:

This patch makes changes to the front-end and middle-end that are
needed to fix a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.

https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue

What this patch does to fix the problem:

- The front-end annotates calls with attribute "clang.arc.rv"="retain"
  or "clang.arc.rv"="claim", which indicates the call is implicitly
  followed by a marker instruction and a retainRV/claimRV call that
  consumes the call result. This is currently done only when the target
  is arm64 and the optimization level is higher than -O0.

- ARC optimizer temporarily emits retainRV/claimRV calls after the
  annotated calls in the IR and removes the inserted calls after
  processing the function.

- ARC contract pass emits retainRV/claimRV calls after the annotated
  calls. It doesn't remove the attribute on the call since the backend
  needs it to emit the marker instruction. The retainRV/claimRV calls
  are emitted late in the pipeline to prevent optimization passes from
  transforming the IR in a way that makes it harder for the ARC
  middle-end passes to figure out the def-use relationship between the
  call and the retainRV/claimRV calls (which is the cause of PR31925).

- The function inliner removes the autoreleaseRV call in the callee that
  returns the result if nothing in the callee prevents it from being
  paired up with the calls annotated with "clang.arc.rv"="retain/claim"
  in the caller. If the call is annotated with "claim", a release call
  is inserted since autoreleaseRV+claimRV is equivalent to a release. If
  it cannot find an autoreleaseRV call, it tries to transfer the
  attributes to a function call in the callee. This is important since
  ARC optimizer can remove the autoreleaseRV call returning the callee
  result, which makes it impossible to pair it up with the retainRV or
  claimRV call in the caller. If that fails, it simply emits a retain
  call in the IR if the call is annotated with "retain" and does nothing
  if it's annotated with "claim".

- This patch teaches dead argument elimination pass not to change the
  return type of a function if any of the calls to the function are
  annotated with attribute "clang.arc.rv". This is necessary since the
  pass can incorrectly determine nothing in the IR uses the function
  return, which can happen since the front-end no longer explicitly
  emits retainRV/claimRV calls in the IR, and change its return type to
  'void'.

Future work:

- Use the attribute on x86-64.

- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
  calls annotated with the attributes.

rdar://71443534

Differential Revision: https://reviews.llvm.org/D92808
2021-01-25 11:57:08 -08:00
Keith Smiley c3324450b2 [clang] Add -fprofile-prefix-map
This flag allows you to re-write absolute paths in coverage data analogous to -fdebug-prefix-map. This flag is also implied by -ffile-prefix-map.
2021-01-25 10:14:04 -08:00
George Koehler 018984ae68 [PowerPC] Fix va_arg in C++, Objective-C on 32-bit ELF targets
In the PPC32 SVR4 ABI, a va_list has copies of registers from the function call.
va_arg looked in the wrong registers for (the pointer representation of) an
object in Objective-C, and for some types in C++. Fix va_arg to look in the
general-purpose registers, not the floating-point registers. Also fix va_arg
for some C++ types, like a member function pointer, that are aggregates for
the ABI.

Anthony Richardby found the problem in Objective-C. Eli Friedman suggested
part of this fix.

Fixes https://bugs.llvm.org/show_bug.cgi?id=47921

Reviewed By: efriedma, nemanjai

Differential Revision: https://reviews.llvm.org/D90329
2021-01-23 00:13:36 -05:00
Bjorn Pettersson ea2cfda386 [CGExpr] Use getCharWidth() more consistently in CCGExprConstant. NFC
Most of CGExprConstant.cpp is using the CharUnits abstraction
and is using getCharWidth() (directly of indirectly) when converting
between size of a char and size in bits. This patch is making that
abstraction more consistent by adding CharTy to the CodeGenTypeCache
(honoring getCharWidth() when mapping from char to LLVM IR types,
instead of using Int8Ty directly).

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D94979
2021-01-22 21:12:17 +01:00
Bjorn Pettersson 72f863fd37 [CodeGen] Use getCharWidth() more consistently in CGRecordLowering. NFC
When using getByteArrayType the requested size is calculated in
char units, but the type used for the array was hardcoded to the
Int8Ty. This patch is using getCharWIdth a bit more consistently
by using getIntNTy in combination with getCharWidth, instead
of explictly using getInt8Ty.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D94977
2021-01-22 21:12:17 +01:00
Yaxun (Sam) Liu 622eaa4a4c [HIP] Support __managed__ attribute
This patch implements codegen for __managed__ variable attribute for HIP.

Diagnostics will be added later.

Differential Revision: https://reviews.llvm.org/D94814
2021-01-22 11:43:58 -05:00
Akira Hatanaka 3d349ed7e1 [CodeGen][ObjC] Fix broken IR generated when there is a nil receiver
check

This patch fixes a bug in emitARCOperationAfterCall where it inserts the
fall-back call after a bitcast instruction and then replaces the
bitcast's operand with the result of the fall-back call. The generated
IR without this patch looks like this:

msgSend.call:                                     ; preds = %entry
  %call = call i8* bitcast (i8* (i8*, i8*, ...)* @objc_msgSend
  br label %msgSend.cont

msgSend.null-receiver:                            ; preds = %entry
  call void @llvm.objc.release(i8* %4)
  br label %msgSend.cont

msgSend.cont:
  %8 = phi i8* [ %call, %msgSend.call ], [ null, %msgSend.null-receiver ]
  %9 = bitcast i8* %10 to %0*
  %10 = call i8* @llvm.objc.retain(i8* %8)

Notice that `%9 = bitcast i8* %10` to %0* is taking operand %10 which is
defined after it.

To fix the bug, this patch modifies the insert point to point to the
bitcast instruction so that the fall-back call is inserted before the
bitcast. In addition, it teaches the function to look at phi
instructions that are generated when there is a check for a null
receiver and insert the retainRV/claimRV instruction right after the
call instead of inserting a fall-back call right after the phi
instruction.

rdar://73360225

Differential Revision: https://reviews.llvm.org/D95181
2021-01-21 17:38:46 -08:00
Jon Roelofs 1deee5cacb Fix crash when emitting NullReturn guards for functions returning BOOL
CodeGenModule::EmitNullConstant() creates constants with their "in memory"
type, not their "in vregs" type. The one place where this difference matters is
when the type is _Bool, as that is an i1 when in vregs and an i8 in memory.

Fixes: rdar://73361264
2021-01-21 14:29:36 -08:00
Joseph Huber e4eaf9d820 [OpenMP] Add support for mapping names in mapper API
Summary:
The custom mapper API did not previously support the mapping names added previously. This means they were not present if a user requested debugging information while using the mapper functions. This adds basic support for passing the mapped names to the runtime library.

Reviewers: jdoerfert

Differential Revision: https://reviews.llvm.org/D94806
2021-01-21 09:26:44 -05:00
Amy Huang a3d7cee7f9 [CodeView] Emit function types in -gline-tables-only.
This change adds function types to further differentiate between
FUNC_IDs in -gline-tables-only.

Size increase of object files in clang are
Before: 917990 kb
After:  999312 kb

Bug: https://bugs.llvm.org/show_bug.cgi?id=48432

Differential Revision: https://reviews.llvm.org/D95001
2021-01-20 12:47:35 -08:00
Thomas Lively 11802eced5 [WebAssembly] Prototype new f64x2 conversions
As proposed in https://github.com/WebAssembly/simd/pull/383.

Differential Revision: https://reviews.llvm.org/D95012
2021-01-20 11:28:06 -08:00
Hans Wennborg 8ba442bc21 Revert "Following up on PR48517, fix handling of template arguments that refer"
Combined with 'da98651 - Revert "DR2064:
decltype(E) is only a dependent', this change (5a391d3) caused verifier
errors when building Chromium. See https://crbug.com/1168494#c1 for a
reproducer.

Additionally it reverts changes that were dependent on this one, see
below.

> Following up on PR48517, fix handling of template arguments that refer
> to dependent declarations.
>
> Treat an id-expression that names a local variable in a templated
> function as being instantiation-dependent.
>
> This addresses a language defect whereby a reference to a dependent
> declaration can be formed without any construct being value-dependent.
> Fixing that through value-dependence turns out to be problematic, so
> instead this patch takes the approach (proposed on the core reflector)
> of allowing the use of pointers or references to (but not values of)
> dependent declarations inside value-dependent expressions, and instead
> treating template arguments as dependent if they evaluate to a constant
> involving such dependent declarations.
>
> This ends up affecting a bunch of OpenMP tests, due to OpenMP
> imprecisely handling instantiation-dependent constructs, bailing out
> early instead of processing dependent constructs to the extent possible
> when handling the template.
>
> Previously committed as 8c1f2d15b8, and
> reverted because a dependency commit was reverted.

This reverts commit 5a391d38ac.

It also restores clang/test/SemaCXX/coroutines.cpp to its state before
da986511fb.

Revert "[c++20] P1907R1: Support for generalized non-type template arguments of scalar type."

> Previously committed as 9e08e51a20, and
> reverted because a dependency commit was reverted. This incorporates the
> following follow-on commits that were also reverted:
>
> 7e84aa1b81 by Simon Pilgrim
> ed13d8c667 by me
> 95c7b6cadb by Sam McCall
> 430d5d8429 by Dave Zarzycki

This reverts commit 4b574008ae.

Revert "[msabi] Mangle a template argument referring to array-to-pointer decay"

> [msabi] Mangle a template argument referring to array-to-pointer decay
> applied to an array the same as the array itself.
>
> This follows MS ABI, and corrects a regression from the implementation
> of generalized non-type template parameters, where we "forgot" how to
> mangle this case.

This reverts commit 18e093faf7.
2021-01-20 15:55:35 +01:00
Alexey Bataev b272698de7 [OPENMP]Do not use OMP_MAP_TARGET_PARAM for data movement directives.
OMP_MAP_TARGET_PARAM flag is used to mark the data that shoud be passed
as arguments to the target kernels, nothing else. But the compiler still
marks the data with OMP_MAP_TARGET_PARAM flags even if the data is
passed to the data movement directives, like target data, target update
etc. This flag is just ignored for this directives and the compiler does
not need to emit it.

Reviewed By: cchen

Differential Revision: https://reviews.llvm.org/D91261
2021-01-19 12:41:15 -08:00
Shilei Tian 82e537a9d2 [Clang][OpenMP] Fixed an issue that clang crashed when compiling OpenMP program in device only mode without host IR
D94745 rewrites the `deviceRTLs` using OpenMP and compiles it by directly
calling the device compilation. `clang` crashes because entry in
`OffloadEntriesDeviceGlobalVar` is unintialized. Current design supposes the
device compilation can only be invoked after host compilation with the host IR
such that `clang` can initialize `OffloadEntriesDeviceGlobalVar` from host IR.
This avoids us using device compilation directly, especially when we only have
code wrapped into `declare target` which are all device code. The same issue
also exists for `OffloadEntriesInfoManager`.

In this patch, we simply initialized an entry if it is not in the maps. Not sure
we need an option to tell the device compiler that it is invoked standalone.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D94871
2021-01-19 14:18:42 -05:00
Richard Smith 4b574008ae [c++20] P1907R1: Support for generalized non-type template arguments of scalar type.
Previously committed as 9e08e51a20, and
reverted because a dependency commit was reverted. This incorporates the
following follow-on commits that were also reverted:

7e84aa1b81 by Simon Pilgrim
ed13d8c667 by me
95c7b6cadb by Sam McCall
430d5d8429 by Dave Zarzycki
2021-01-18 21:05:01 -08:00
Amy Huang 6227069bdc [DebugInfo][CodeView] Change in line tables only mode to emit type information
for function scopes, rather than using the qualified name.

In line-tables-only mode, we used to emit qualified names as the display name for functions when using CodeView.
This patch changes to emitting the parent scopes instead, with forward declarations for class types.
The total object file size ends up being slightly smaller than if we use the full qualified names.

Differential Revision: https://reviews.llvm.org/D94639
2021-01-15 09:28:27 -08:00
Qiu Chaofan 168be42083 [Clang] Mutate long-double math builtins into f128 under IEEE-quad
Under -mabi=ieeelongdouble on PowerPC, IEEE-quad floating point semantic
is used for long double. This patch mutates call to related builtins
into f128 version on PowerPC. And in theory, this should be applied to
other targets when their backend supports IEEE 128-bit style libcalls.

GCC already has these mutations except nansl, which is not available on
PowerPC along with other variants (nans, nansf).

Reviewed By: RKSimon, nemanjai

Differential Revision: https://reviews.llvm.org/D92080
2021-01-15 16:56:20 +08:00
Lucas Prates 2b1e25befe [AArch64] Adding ACLE intrinsics for the LS64 extension
This introduces the ARMv8.7-A LS64 extension's intrinsics for 64 bytes
atomic loads and stores: `__arm_ld64b`, `__arm_st64b`, `__arm_st64bv`,
and `__arm_st64bv0`. These are selected into the LS64 instructions
LD64B, ST64B, ST64BV and ST64BV0, respectively.

Based on patches written by Simon Tatham.

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D93232
2021-01-14 09:43:58 +00:00
Zequan Wu e53bbd9951 [IR] move nomerge attribute from function declaration/definition to callsites
Move nomerge attribute from function declaration/definition to callsites to
allow virtual function calls attach the attribute.

Differential Revision: https://reviews.llvm.org/D94537
2021-01-12 12:10:46 -08:00
David Truby e5f51fdd65 [clang][aarch64] Precondition isHomogeneousAggregate on isCXX14Aggregate
MSVC on WoA64 includes isCXX14Aggregate in its definition. This is de-facto
specification on that platform, so match msvc's behaviour.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=47611

Co-authored-by: Peter Waller <peter.waller@arm.com>

Differential Revision: https://reviews.llvm.org/D92751
2021-01-12 19:44:01 +00:00
Bevin Hansson c4944a6f53 [Fixed Point] Add codegen for conversion between fixed-point and floating point.
The patch adds the required methods to FixedPointBuilder
for converting between fixed-point and floating point,
and uses them from Clang.

This depends on D54749.

Reviewed By: leonardchan

Differential Revision: https://reviews.llvm.org/D86632
2021-01-12 13:53:01 +01:00
Fangrui Song b88c8f1aab CGDebugInfo: Delete unused parameters 2021-01-11 13:39:03 -08:00
Fangrui Song f4cec703ec Add an assert to CGDebugInfo::getTypeOrNull 2021-01-11 13:25:20 -08:00
Joe Ellis 8ea72b3887 [clang][AArch64][SVE] Avoid going through memory for coerced VLST return values
VLST return values are coerced to VLATs in the function epilog for
consistency with the VLAT ABI. Previously, this coercion was done
through memory. It is preferable to use the
llvm.experimental.vector.insert intrinsic to avoid going through memory
here.

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D94290
2021-01-11 12:10:59 +00:00
Fangrui Song b8d2842088 CGDebugInfo: Delete unneeded UnwrapTypeForDebugInfo
Tested with stage 2 -DCMAKE_BUILD_TYPE=Debug clang, byte identical.
2021-01-10 22:22:07 -08:00
Fangrui Song 6215c1b778 CGDebugInfo: Delete redundant test 2021-01-10 22:22:06 -08:00
Fangrui Song 02bc320545 CGDebugInfo: Delete unused DIFile* parameter 2021-01-10 15:03:40 -08:00
Fangrui Song abfe348e6b [test] Improve CodeGenCXX/difile_entry.cpp
The test added in D87147 did not actually test PR47391.
Use an absolute path to test the canonicalization.
2021-01-10 12:24:49 -08:00
Fangrui Song e2e82c9983 [CodeGenModule] Drop dso_local on function declarations for ELF -fno-pic -fno-direct-access-external-data
ELF -fno-pic sets dso_local on a function declaration to allow direct accesses
when taking its address (similar to a data symbol). The emitted code follows the
traditional GCC/Clang -fno-pic behavior: an absolute relocation is produced.

If the function is not defined in the executable, a canonical PLT entry will be
needed at link time. This is similar to a copy relocation and is incompatible
with (-Bsymbolic or --dynamic-list linked shared objects / protected symbols in
a shared object).

This patch gives -fno-pic code a way to avoid such a canonical PLT entry.

The FIXME was about a generalization for -fpie -mpie-copy-relocations (now -fpie
-fdirect-access-external-data). While we could set dso_local to avoid GOT when
taking the address of a function declaration (there is an ignorable difference
about R_386_PC32 vs R_386_PLT32 on i386), it likely does not provide any benefit
and can just cause trouble, so we don't make the generalization.
2021-01-09 16:31:56 -08:00
Fangrui Song 38a716c30f Make -fno-pic respect -fno-direct-access-external-data
D92633 added -f[no-]direct-access-external-data to supersede -m[no-]pie-copy-relocations.
(The option works for -fpie but is a no-op for -fno-pic and -fpic.)

This patch makes -fno-pic -fno-direct-access-external-data drop dso_local from
global variable declarations. This usually causes the backend to emit a GOT
indirection for external data access. With a GOT relocation, the subsequent
-no-pie link will not have copy relocation even if the data symbol turns out to
be defined by a shared object.

Differential Revision: https://reviews.llvm.org/D92714
2021-01-09 00:32:02 -08:00
Fangrui Song 1d3ebbf537 Add -f[no-]direct-access-external-data to supersede -mpie-copy-relocations
GCC r218397 "x86-64: Optimize access to globals in PIE with copy reloc" made
-fpie code emit R_X86_64_PC32 to reference external data symbols by default.
Clang adopted -mpie-copy-relocations D19996 as a flexible alternative.

The name -mpie-copy-relocations can be improved [1] and does not capture the
idea that this option can apply to -fno-pic and -fpic [2], so this patch
introduces -f[no-]direct-access-external-data and makes -mpie-copy-relocations
their aliases for compatibility.

[1]
For
```
extern int var;
int get() { return var; }
```
if var is defined in another translation unit in the link unit, there is no copy
relocation.

[2]
-fno-pic -fno-direct-access-external-data is useful to avoid copy relocations.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65888
If a shared object is linked with -Bsymbolic or --dynamic-list and exports a
data symbol, normally the data symbol cannot be accessed by -fno-pic code
(because by default an absolute relocation is produced which will lead to a copy
relocation). -fno-direct-access-external-data can prevent copy relocations.

-fpic -fdirect-access-external-data can avoid GOT indirection. This is like the
undefined counterpart of -fno-semantic-interposition. However, the user should
define var in another translation unit and link with -Bsymbolic or
--dynamic-list, otherwise the linker will error in a -shared link. Generally
the user has better tools for their goal but I want to mention that this
combination is valid.

On COFF, the behavior is like always -fdirect-access-external-data.
`__declspec(dllimport)` is needed to enable indirect access.

There is currently no plan to affect non-ELF behaviors or -fpic behaviors.

-fno-pic -fno-direct-access-external-data will be implemented in the subsequent patch.

GCC feature request https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112

Reviewed By: tmsriram

Differential Revision: https://reviews.llvm.org/D92633
2021-01-09 00:32:01 -08:00
Umesh Kalappa 33c8e16f66 PR47391: Canonicalize DIFiles
Like @aprantl suggested, modify to  use the canonicalized DIFile, if we
don't know the  loc info and filename for the compiler generated
functions for example static initialization functions.

Reviewed By: dblaikie, aprantl

Differential Revision: https://reviews.llvm.org/D87147
2021-01-08 22:11:16 -08:00
Arthur Eubanks 756dd70766 [NewPM] Run ObjC ARC passes
Match the legacy PM in running various ObjC ARC passes.

This requires making some module passes into function passes. These were
initially ported as module passes since they add function declarations
(e.g. https://reviews.llvm.org/D86178), but that's still up for debate
and other passes do so.

Reviewed By: ahatanak

Differential Revision: https://reviews.llvm.org/D93743
2021-01-08 15:47:11 -08:00
Hongtao Yu 0e23fd676c [Driver] Add DWARF64 flag: -gdwarf64
@ikudrin enabled support for dwarf64 in D87011.  Adding a clang flag so it can be used through that compilation pass.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D90507
2021-01-08 12:58:38 -08:00
Heejin Ahn 7be271537e [WebAssembly] Rename wasm_rethrow_in_catch intrinsic/builtin
`wasm_rethrow_in_catch` intrinsic and builtin are used in order to
rethrow an exception when the exception is caught but there is no
matching clause within the current `catch`. For example,
```
try {
  foo();
} catch (int n) {
  ...
}
```
If the caught exception does not correspond to C++ `int` type, it should
be rethrown. These intrinsic/builtin were renamed `rethrow_in_catch`
because at the time I thought there would be another intrinsic for C++'s
`throw` keyword, which rethrows an exception. It turned out that `throw`
keyword doesn't require wasm's `rethrow` instruction, so we rename
`rethrow_in_catch` to just `rethrow` here.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D94038
2021-01-08 06:55:04 -08:00
David Sherwood 38d18d9353 [SVE] Add support to vectorize_width loop pragma for scalable vectors
This patch adds support for two new variants of the vectorize_width
pragma:

1. vectorize_width(X[, fixed|scalable]) where an optional second
parameter is passed to the vectorize_width pragma, which indicates if
the user wishes to use fixed width or scalable vectorization. For
example the user can now write something like:

  #pragma clang loop vectorize_width(4, fixed)
or
  #pragma clang loop vectorize_width(4, scalable)

In the absence of a second parameter it is assumed the user wants
fixed width vectorization, in order to maintain compatibility with
existing code.
2. vectorize_width(fixed|scalable) where the width is left unspecified,
but the user hints what type of vectorization they prefer, either
fixed width or scalable.

I have implemented this by making use of the LLVM loop hint attribute:

  llvm.loop.vectorize.scalable.enable

Tests were added to

  clang/test/CodeGenCXX/pragma-loop.cpp

for both the 'fixed' and 'scalable' optional parameter.

See this thread for context: http://lists.llvm.org/pipermail/cfe-dev/2020-November/067262.html

Differential Revision: https://reviews.llvm.org/D89031
2021-01-08 11:37:27 +00:00
Wang, Pengfei c102b9697b [X86] Correct the comments about comparison intrinsics. NFCI. 2021-01-08 15:36:15 +08:00
Reid Kleckner ad55d5c3f3 Simplify vectorcall argument classification of HVAs, NFC
This reduces the number of `WinX86_64ABIInfo::classify` call sites from
3 to 1. The call sites were similar, but passed different values for
FreeSSERegs. Use variables instead of `if`s to manage that argument.
2021-01-07 11:14:18 -08:00
Pushpinder Singh 4909cb1a0f [OpenMP][AMDGPU] Use AMDGPU_KERNEL calling convention for entry function
AMDGPU backend requires entry functions/kernels to have AMDGPU_KERNEL
calling convention for proper linking.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D94060
2021-01-06 02:03:30 -05:00
Thomas Lively 497026c902 [WebAssembly] Prototype prefetch instructions
As proposed in https://github.com/WebAssembly/simd/pull/352 and using the
opcodes used in the V8 prototype:
https://chromium-review.googlesource.com/c/v8/v8/+/2543167. These instructions
are only usable via intrinsics and clang builtins to make them opt-in while they
are being benchmarked.

Differential Revision: https://reviews.llvm.org/D93883
2021-01-05 11:32:03 -08:00
Simon Pilgrim 55488bd3cd CGExpr - EmitMatrixSubscriptExpr - fix getAs<> null-dereference static analyzer warning. NFCI.
getAs<> can return null if the cast is invalid, which can lead to null pointer deferences. Use castAs<> instead which will assert that the cast is valid.
2021-01-05 17:08:11 +00:00
Alan Phipps 9f2967bcfe [Coverage] Add support for Branch Coverage in LLVM Source-Based Code Coverage
This is an enhancement to LLVM Source-Based Code Coverage in clang to track how
many times individual branch-generating conditions are taken (evaluate to TRUE)
and not taken (evaluate to FALSE).  Individual conditions may comprise larger
boolean expressions using boolean logical operators.  This functionality is
very similar to what is supported by GCOV except that it is very closely
anchored to the ASTs.

Differential Revision: https://reviews.llvm.org/D84467
2021-01-05 09:51:51 -06:00
Joe Ellis 3d5b18a3fd [clang][AArch64][SVE] Avoid going through memory for coerced VLST arguments
VLST arguments are coerced to VLATs at the function boundary for
consistency with the VLAT ABI. They are then bitcast back to VLSTs in
the function prolog. Previously, this conversion is done through memory.
With the introduction of the llvm.vector.{insert,extract} intrinsic, we
can avoid going through memory here.

Depends on D92761

Differential Revision: https://reviews.llvm.org/D92762
2021-01-05 15:18:21 +00:00
Thorsten Schütt 2fd11e0b1e Revert "[NFC, Refactor] Modernize StorageClass from Specifiers.h to a scoped enum (II)"
This reverts commit efc82c4ad2.
2021-01-04 23:17:45 +01:00
Thorsten Schütt efc82c4ad2 [NFC, Refactor] Modernize StorageClass from Specifiers.h to a scoped enum (II)
Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D93765
2021-01-04 22:58:26 +01:00
Hongtao Yu 4034f9273e Switching Clang UniqueInternalLinkageNamesPass scheduling to using the LLVM one with newpm.
As a follow-up to D93656, I'm switching the Clang UniqueInternalLinkageNamesPass scheduling to using the LLVM one with newpm.

Test Plan:

Reviewed By: aeubanks, tmsriram

Differential Revision: https://reviews.llvm.org/D94019
2021-01-04 12:04:46 -08:00
Jon Chesterfield 76bfbb74d3 [libomptarget][amdgpu] Call into deviceRTL instead of ockl
[libomptarget][amdgpu] Call into deviceRTL instead of ockl

Amdgpu codegen presently emits a call into ockl. The same functionality
is already present in the deviceRTL. Adds an amdgpu specific entry point
to avoid the dependency. This lets simple openmp code (specifically, that
which doesn't use libm) run without rocm device libraries installed.

Reviewed By: ronlieb

Differential Revision: https://reviews.llvm.org/D93356
2021-01-04 16:48:47 +00:00
Brandon Bergren 6cee9d0cf8 [PowerPC] Support powerpcle target in Clang [3/5]
Add powerpcle support to clang.

For FreeBSD, assume a freestanding environment for now, as we only need it in the first place to build loader, which runs in the OpenFirmware environment instead of the FreeBSD environment.

For Linux, recognize glibc and musl environments to match current usage in Void Linux PPC.

Adjust driver to match current binutils behavior regarding machine naming.

Adjust and expand tests.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D93919
2021-01-02 12:17:58 -06:00
Fangrui Song d1fd72343c Refactor how -fno-semantic-interposition sets dso_local on default visibility external linkage definitions
The idea is that the CC1 default for ELF should set dso_local on default
visibility external linkage definitions in the default -mrelocation-model pic
mode (-fpic/-fPIC) to match COFF/Mach-O and make output IR similar.

The refactoring is made available by 2820a2ca3a.

Currently only x86 supports local aliases. We move the decision to the driver.
There are three CC1 states:

* -fsemantic-interposition: make some linkages interposable and make default visibility external linkage definitions dso_preemptable.
* (default): selected if the target supports .Lfoo$local: make default visibility external linkage definitions dso_local
* -fhalf-no-semantic-interposition: if neither option is set or the target does not support .Lfoo$local: like -fno-semantic-interposition but local aliases are not used. So references can be interposed if not optimized out.

Add -fhalf-no-semantic-interposition to a few tests using the half-based semantic interposition behavior.
2020-12-31 13:59:45 -08:00
Fangrui Song 809a1e0ffd [CodeGenModule] Set dso_local for Mach-O GlobalValue
* static relocation model: always
* other relocation models: if isStrongDefinitionForLinker

This will make LLVM IR emitted for COFF/Mach-O and executable ELF similar.
2020-12-30 20:52:01 -08:00
Juneyoung Lee 9b29610228 Use unary CreateShuffleVector if possible
As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used
instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`.
Let's update them.

Actually, it would have been more natural if the patches were made in this order:
(1) let them use unary CreateShuffleVector first
(2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793)

The order is swapped, but in terms of correctness it is still fine.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D93923
2020-12-30 22:36:08 +09:00
Fangrui Song 2820a2ca3a Move -fno-semantic-interposition dso_local logic from TargetMachine to Clang CodeGenModule
This simplifies TargetMachine::shouldAssumeDSOLocal and and gives frontend the
decision to use dso_local. For LLVM synthesized functions/globals, they may lose
inferred dso_local but such optimizations are probably not very useful.

Note: the hasComdat() condition in canBenefitFromLocalAlias (D77429) may be dead now.
(llvm/CodeGen/X86/semantic-interposition-comdat.ll)
(Investigate whether we need test coverage when Fuchsia C++ ABI is clearer)
2020-12-29 23:37:55 -08:00
James Y Knight 4ddf140c00 Fix PR35902: incorrect alignment used for ubsan check.
UBSan was using the complete-object align rather than nv alignment
when checking the "this" pointer of a method.

Furthermore, CGF.CXXABIThisAlignment was also being set incorrectly,
due to an incorrectly negated test. The latter doesn't appear to have
had any impact, due to it not really being used anywhere.

Differential Revision: https://reviews.llvm.org/D93072
2020-12-28 18:11:17 -05:00
Thomas Lively 5e09e9979b [WebAssembly] Prototype extending pairwise add instructions
As proposed in https://github.com/WebAssembly/simd/pull/380. This commit makes
the new instructions available only via clang builtins and LLVM intrinsics to
make their use opt-in while they are still being evaluated for inclusion in the
SIMD proposal.

Depends on D93771.

Differential Revision: https://reviews.llvm.org/D93775
2020-12-28 14:11:14 -08:00
Akira Hatanaka 34405b41d6 [CodeGen][ObjC] Destroy callee-destroyed arguments in the caller
function when the receiver is nil

Callee-destroyed arguments to a method have to be destroyed in the
caller function when the receiver is nil as the method doesn't get
executed. This fixes PR48207.

rdar://71808391

Differential Revision: https://reviews.llvm.org/D93273
2020-12-28 11:52:27 -08:00
Alexandre Ganea 69132d12de [Clang] Reverse test to save on indentation. NFC. 2020-12-23 19:24:53 -05:00
Alan Phipps bbd758a791 Revert "This is a test commit"
This reverts commit b920adf3b4.
2020-12-23 13:04:37 -06:00
Alan Phipps b920adf3b4 This is a test commit 2020-12-23 12:57:27 -06:00
Arthur O'Dwyer 22cf54a7fb Replace `T(x)` with `reinterpret_cast<T>(x)` everywhere it means reinterpret_cast. NFC.
Differential Revision: https://reviews.llvm.org/D76572
2020-12-22 19:54:29 -05:00
Arthur Eubanks 2080232333 Revert "[c++20] P1907R1: Support for generalized non-type template arguments of scalar type."
This reverts commit 9e08e51a20.

This is part of 5 commits being reverted due to https://crbug.com/1161059. See bug for repro.
2020-12-22 10:18:08 -08:00
Richard Smith 9e08e51a20 [c++20] P1907R1: Support for generalized non-type template arguments of scalar type. 2020-12-18 01:08:41 -08:00
Rong Xu 3733463dbb [IR][PGO] Add hot func attribute and use hot/cold attribute in func section
Clang FE currently has hot/cold function attribute. But we only have
cold function attribute in LLVM IR.

This patch adds support of hot function attribute to LLVM IR.  This
attribute will be used in setting function section prefix/suffix.
Currently .hot and .unlikely suffix only are added in PGO (Sample PGO)
compilation (through isFunctionHotInCallGraph and
isFunctionColdInCallGraph).

This patch changes the behavior. The new behavior is:
(1) If the user annotates a function as hot or isFunctionHotInCallGraph
    is true, this function will be marked as hot. Otherwise,
(2) If the user annotates a function as cold or
    isFunctionColdInCallGraph is true, this function will be marked as
    cold.

The changes are:
(1) user annotated function attribute will used in setting function
    section prefix/suffix.
(2) hot attribute overwrites profile count based hotness.
(3) profile count based hotness overwrite user annotated cold attribute.

The intention for these changes is to provide the user a way to mark
certain function as hot in cases where training input is hard to cover
all the hot functions.

Differential Revision: https://reviews.llvm.org/D92493
2020-12-17 18:41:12 -08:00
Tom Stellard 3203143f13 CodeGen: Improve generated IR for __builtin_mul_overflow(uint, uint, int)
Add a special case for handling __builtin_mul_overflow with unsigned
inputs and a signed output to avoid emitting the __muloti4 library
call on x86_64.  __muloti4 is not implemented in libgcc, so avoiding
this call fixes compilation of some programs that call
__builtin_mul_overflow with these arguments.

For example, this fixes the build of cpio with clang, which includes code from
gnulib that calls __builtin_mul_overflow with these argument types.

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D84405
2020-12-17 14:30:31 -08:00
Baptiste Saleil c2892978e9 [PowerPC] Rename the vector pair intrinsics and builtins to replace the _mma_ prefix by _vsx_
On PPC, the vector pair instructions are independent from MMA.
This patch renames the vector pair LLVM intrinsics and Clang builtins to replace the _mma_ prefix by _vsx_ in their names.
We also move the vector pair type/intrinsic/builtin tests to their own files.

Differential Revision: https://reviews.llvm.org/D91974
2020-12-17 13:19:27 -05:00
Zequan Wu fb0f728805 [Clang] Make nomerge attribute a function attribute as well as a statement attribute.
Differential Revision: https://reviews.llvm.org/D92800
2020-12-17 07:45:38 -08:00
dfukalov 9ed8e0caab [NFC] Reduce include files dependency and AA header cleanup (part 2).
Continuing work started in https://reviews.llvm.org/D92489:

Removed a bunch of includes from "AliasAnalysis.h" and "LoopPassManager.h".

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D92852
2020-12-17 14:04:48 +03:00
Fangrui Song c70f36865e Use basic_string::find(char) instead of basic_string::find(const char *s, size_type pos=0)
Many (StringRef) cannot be detected by clang-tidy performance-faster-string-find.
2020-12-16 23:28:32 -08:00
Joe Ellis dad07baf12 [clang][AArch64][SVE] Avoid going through memory for VLAT <-> VLST casts
This change makes use of the llvm.vector.extract intrinsic to avoid
going through memory when performing bitcasts between vector-length
agnostic types and vector-length specific types.

Depends on D91362

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D92761
2020-12-16 12:24:32 +00:00
Johannes Doerfert b9c77542e2 [Clang][Attr] Introduce the `assume` function attribute
The `assume` attribute is a way to provide additional, arbitrary
information to the optimizer. For now, assumptions are restricted to
strings which will be accumulated for a function and emitted as comma
separated string function attribute. The key of the LLVM-IR function
attribute is `llvm.assume`. Similar to `llvm.assume` and
`__builtin_assume`, the `assume` attribute provides a user defined
assumption to the compiler.

A follow up patch will introduce an LLVM-core API to query the
assumptions attached to a function. We also expect to add more options,
e.g., expression arguments, to the `assume` attribute later on.

The `omp [begin] asssumes` pragma will leverage this attribute and
expose the functionality in the absence of OpenMP.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D91979
2020-12-15 16:51:34 -06:00
Fangrui Song 59decf8e9c [clang] Migrate deprecated DebugInfo::get to DILocation::get 2020-12-15 13:59:31 -08:00
Baptiste Saleil 57d83c3a90 [PowerPC] Enable paired vector type and intrinsics when MMA is disabled
This patch enables the Clang type __vector_pair and its associated LLVM
intrinsics even when MMA is disabled. With this patch, the type is now controlled
by the PPC paired-vector-memops option. The builtins and intrinsics will be
renamed to drop the mma prefix in another patch.

Differential Revision: https://reviews.llvm.org/D91819
2020-12-15 15:14:11 -06:00
Jan Svoboda f24e58df7d [clang][cli] Create accessors for exception models in LangOptions
This abstracts away the members that are being replaced in a follow-up patch.

Depends on D83979.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D93214
2020-12-15 10:15:58 +01:00
Rong Xu c36f31c4db [PGO] remove unintentional code in early commit
Remove unintentional code in
commit 54e03d [PGO] Verify BFI counts after loading profile data.
2020-12-14 18:41:49 -08:00
Rong Xu 54e03d03a7 [PGO] Verify BFI counts after loading profile data
This patch adds the functionality to compare BFI counts with real
profile
counts right after reading the profile. It will print remarks under
-Rpass-analysis=pgo, or the internal option -pass-remarks-analysis=pgo.

Differential Revision: https://reviews.llvm.org/D91813
2020-12-14 15:56:10 -08:00
Gulfem Savrun Yeniceri 7c0e3a77bc [clang][IR] Add support for leaf attribute
This patch adds support for leaf attribute as an optimization hint
in Clang/LLVM.

Differential Revision: https://reviews.llvm.org/D90275
2020-12-14 14:48:17 -08:00
Matt Arsenault ef4da3c2ba clang: Add byval on x86_intrcc parameter 0
This will allow removing the special case treatment of the parameter
and avoid depending on the pointer's element type.
2020-12-14 16:34:37 -05:00
Simon Pilgrim 4855a1004d [X86] Convert fadd/fmul _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506)
Followup to D87604, having confirmed on PR47506 that we can use the llvm codegen expansion for fadd/fmul as well.

Differential Revision: https://reviews.llvm.org/D92940
2020-12-13 15:37:35 +00:00
Alexey Bader a500a43587 [CodeGen][AMDGPU] Fix ICE for static initializer IR generation
Differential Revision: https://reviews.llvm.org/D92782
2020-12-12 23:26:54 +03:00
Melanie Blower 320af6b138 Create SPIRABIInfo to enable SPIR_FUNC calling convention.
Background: Call to library arithmetic functions for div is emitted by the
compiler and it set wrong “C” calling convention for calls to these functions,
whereas library functions are declared with `spir_function` calling convention.
InstCombine optimization replaces such calls with “unreachable” instruction.
It looks like clang lacks SPIRABIInfo class which should specify default
calling conventions for “system” function calls. SPIR supports only
SPIR_FUNC and SPIR_KERNEL calling convention.

Reviewers: Erich Keane, Anastasia

Differential Revision: https://reviews.llvm.org/D92721
2020-12-12 05:48:20 -08:00
Arthur Eubanks ff7e1da68f [NPM] Support -fmerge-functions
I tried to put it in the same place in the pipeline as the legacy PM.

Fixes PR48399.

Reviewed By: asbirlea, nikic

Differential Revision: https://reviews.llvm.org/D93002
2020-12-10 11:45:08 -08:00
Florian Hahn 9c4cddb53a
[Clang] Add vcmla and rotated variants for Arm ACLE.
This patch adds vcmla and the rotated variants as defined in
"Arm Neon Intrinsics Reference for ACLE Q3 2020" [1]

The *_lane_* are still missing, but they can be added separately.

This patch only adds the builtin mapping for AArch64.

[1] https://developer.arm.com/documentation/ihi0073/latest

Reviewed By: t.p.northover

Differential Revision: https://reviews.llvm.org/D92930
2020-12-10 16:54:08 +00:00
Fangrui Song f9c0d1b056 [Driver] Add -f[no-]legacy-pass-manager to supersede -f[no-]experimental-new-pass-manager
The new PM is considered stable and many downstream groups have adopted it (some
have adopted it for more than two years). Add -f[no-]legacy-pass-manager to reflect the
fact that it is no longer experimental and the legacy pass manager is something we strive to retire.

In the future, when the legacy PM eventually goes away,
-fno-experimental-new-pass-manager and -flegacy-pass-manager will be removed.

This patch also changes -f[no-]legacy-pass-manager to pass `-plugin-opt={new,legacy}-pass-manager` to the linker (supported by both ld.lld and LLVMgold.so) when -flto/-flto=thin is specified

Reviewed By: aeubanks, rsmith

Differential Revision: https://reviews.llvm.org/D92915
2020-12-09 16:57:36 -08:00
Reid Kleckner df282215d4 Don't setup inalloca for swiftcc on i686-windows-msvc
Swiftcall does it's own target-independent argument type classification,
since it is not designed to be ABI compatible with anything local on the
target that isn't LLVM-based. This means it never uses inalloca.
However, we have duplicate logic for checking for inalloca parameters
that runs before call argument setup. This logic needs to know ahead of
time if inalloca will be used later, and we can't move the
CGFunctionInfo calculation earlier.

This change gets the calling convention from either the
FunctionProtoType or ObjCMethodDecl, checks if it is swift, and if so
skips the stackbase setup.

Depends on D92883.

Differential Revision: https://reviews.llvm.org/D92944
2020-12-09 11:08:48 -08:00
Reid Kleckner d7098ff29c De-templatify EmitCallArgs argument type checking, NFCI
This template exists to abstract over FunctionPrototype and
ObjCMethodDecl, which have similar APIs for storing parameter types. In
place of a template, use a PointerUnion with two cases to handle this.
Hopefully this improves readability, since the type of the prototype is
easier to discover. This allows me to sink this code, which is mostly
assertions, out of the header file and into the cpp file. I can also
simplify the overloaded methods for computing isGenericMethod, and get
rid of the second EmitCallArgs overload.

Differential Revision: https://reviews.llvm.org/D92883
2020-12-09 11:08:00 -08:00
Yuanfang Chen 1821265db6 [Time-report] Add a flag -ftime-report={per-pass,per-pass-run} to control the pass timing aggregation
Currently, -ftime-report + new pass manager emits one line of report for each
pass run. This potentially causes huge output text especially with regular LTO
or large single file (Obeserved in private tests and was reported in D51276).
The behaviour of -ftime-report + legacy pass manager is
emitting one line of report for each pass object which has relatively reasonable
text output size. This patch adds a flag `-ftime-report=` to control time report
aggregation for new pass manager.

The flag is for new pass manager only. Using it with legacy pass manager gives
an error. It is a driver and cc1 flag. `per-pass` is the new default so
`-ftime-report` is aliased to `-ftime-report=per-pass`. Before this patch,
functionality-wise `-ftime-report` is aliased to `-ftime-report=per-pass-run`.

* Adds an boolean variable TimePassesHandler::PerRun to control per-pass vs per-pass-run.
* Adds a new clang CodeGen flag CodeGenOptions::TimePassesPerRun to work with the existing CodeGenOptions::TimePasses.
* Remove FrontendOptions::ShowTimers, its uses are replaced by the existing CodeGenOptions::TimePasses.
* Remove FrontendTimesIsEnabled (It was introduced in D45619 which was largely reverted.)

Differential Revision: https://reviews.llvm.org/D92436
2020-12-08 10:13:19 -08:00
Kevin P. Neal acd4950d4f [FPEnv] Correct constrained metadata in fp16-ops-strict.c
This test shows we're in some cases not getting strictfp information from
the AST. Correct that.

Differential Revision: https://reviews.llvm.org/D92596
2020-12-08 10:18:32 -05:00
Tim Northover c5978f42ec UBSAN: emit distinctive traps
Sometimes people get minimal crash reports after a UBSAN incident. This change
tags each trap with an integer representing the kind of failure encountered,
which can aid in tracking down the root cause of the problem.
2020-12-08 10:28:26 +00:00
Luís Marques 3af354e863 [Clang][CodeGen][RISCV] Fix hard float ABI for struct with empty struct and complex
Fixes bug 44904.

Differential Revision: https://reviews.llvm.org/D91278
2020-12-08 09:19:05 +00:00
Luís Marques fa8f5bfa4e [Clang][CodeGen][RISCV] Fix hard float ABI test cases with empty struct
The code seemed not to account for the field 1 offset.

Differential Revision: https://reviews.llvm.org/D91270
2020-12-08 09:19:05 +00:00
Vitaly Buka 6e614b0c7e [NFC][MSan] Round up OffsetPtr in PoisonMembers
getFieldOffset(layoutStartOffset)  is expected to point to the first trivial
field or the one which follows non-trivial. So it must be byte aligned already.
However this is not obvious without assumptions about callers.
This patch will avoid the need in such assumptions.

Depends on D92727.

Differential Revision: https://reviews.llvm.org/D92728
2020-12-07 19:57:49 -08:00
Vitaly Buka 3e1cb0db8a [CodeGen][MSan] Don't use offsets of zero-sized fields
Such fields will likely have offset zero making
__sanitizer_dtor_callback poisoning wrong regions.
E.g. it can poison base class member from derived class constructor.

Differential Revision: https://reviews.llvm.org/D92727
2020-12-07 13:37:40 -08:00
Jinsong Ji b49b8f096c [PowerPC][Clang] Remove QPX support
Clean up QPX code in clang missed in https://reviews.llvm.org/D83915

Reviewed By: #powerpc, steven.zhang

Differential Revision: https://reviews.llvm.org/D92329
2020-12-07 10:15:39 -05:00
Vitaly Buka 1f21f6d6a4 [NFC][CodeGen] Simplify SanitizeDtorMembers::Emit 2020-12-05 21:11:27 -08:00