Commit Graph

13591 Commits

Author SHA1 Message Date
Sander de Smalen 2d1baf606a [SveEmitter] Add builtins for svwhilerw/svwhilewr
This also adds the IsOverloadWhileRW flag which tells CGBuiltin to use
the result predicate type and the first pointer type as the
overloaded types for the LLVM IR intrinsic.

Reviewers: SjoerdMeijer, efriedma

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78238
2020-04-22 21:49:18 +01:00
Sander de Smalen 1559485e60 [SveEmitter] Add builtins for svwhile
This also adds the IsOverloadWhile flag which tells CGBuiltin to use
both the default type (predicate) and the type of the second operand
(scalar) as the overloaded types for the LLMV IR intrinsic.

Reviewers: SjoerdMeijer, efriedma, rovka

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77595
2020-04-22 21:47:47 +01:00
Sander de Smalen 662cbaf647 [SveEmitter] Add IsOverloadNone flag and builtins for svpfalse and svcnt[bhwd]_pat
Add the IsOverloadNone flag to tell CGBuiltin that it does not have
an overloaded type. This is used for e.g. svpfalse which does
not take any arguments and always returns a svbool_t.

This patch also adds builtins for svcntb_pat, svcnth_pat, svcntw_pat
and svcntd_pat, as those don't require custom codegen.

Reviewers: SjoerdMeijer, efriedma, rovka

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77596
2020-04-22 16:42:08 +01:00
Sander de Smalen 41d52662d5 [SveEmitter] Add support for _n form builtins
The ACLE has builtins that take a scalar value that is to be expanded
into a vector by the operation. While the ISA may have an instruction
that takes an immediate or a scalar to represent this, the LLVM IR
intrinsic may not, so Clang will have to splat the scalar value.

This patch also adds the _n forms for svabd, svadd, svdiv, svdivr,
svmax, svmin, svmul, svmulh, svub and svsubr.

Reviewers: SjoerdMeijer, efriedma, rovka

Reviewed By: SjoerdMeijer

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77594
2020-04-22 14:23:54 +01:00
Andrzej Warzynski 72f565899d [SveEmitter] Implement builtins for gathers/scatters
This patch adds builtins for:
  * regular, first-faulting and non-temporal gather loads
  * regular and non-temporal scatter stores

Differential Revision: https://reviews.llvm.org/D77735
2020-04-22 13:21:39 +01:00
Justin Hibbits 4ca2cad947 [PowerPC] Add clang -msvr4-struct-return for 32-bit ELF
Summary:

Change the default ABI to be compatible with GCC.  For 32-bit ELF
targets other than Linux, Clang now returns small structs in registers
r3/r4.  This affects FreeBSD, NetBSD, OpenBSD.  There is no change for
32-bit Linux, where Clang continues to return all structs in memory.

Add clang options -maix-struct-return (to return structs in memory) and
-msvr4-struct-return (to return structs in registers) to be compatible
with gcc.  These options are only for PPC32; reject them on PPC64 and
other targets.  The options are like -fpcc-struct-return and
-freg-struct-return for X86_32, and use similar code.

To actually return a struct in registers, coerce it to an integer of the
same size.  LLVM may optimize the code to remove unnecessary accesses to
memory, and will return i32 in r3 or i64 in r3:r4.

Fixes PR#40736

Patch by George Koehler!

Reviewed By: jhibbits, nemanjai
Differential Revision: https://reviews.llvm.org/D73290
2020-04-21 20:17:25 -05:00
Aaron Ballman 6a30894391 C++2a -> C++20 in some identifiers; NFC. 2020-04-21 15:37:19 -04:00
Craig Topper 68b2e507e4 [Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign.
Differential Revision: https://reviews.llvm.org/D78443
2020-04-20 21:31:44 -07:00
Sander de Smalen 06c980df46 [SveEmitter] Implement zeroing of false lanes
This implements zeroing of false lanes for binary operations,
where instead of merging into the first operand vector (_m)
a `select` is placed on the first input vector. This approach
easily translates to the use of the `zeroing movprfx` instruction.

This patch also adds builtins for svabd, svadd, svdiv, svdivr,
svmax, svmin, svmul, svmulh, svub and svsubr.

Reviewers: SjoerdMeijer, efriedma, rovka

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77593
2020-04-20 17:02:48 +01:00
Sander de Smalen 9986b3de26 [SveEmitter] Explicitly merge with zero/undef
Builtins that have the merge type MergeAnyExp or MergeZeroExp,
merge into a 'undef' or 'zero' vector respectively, which enables the
_x and _z behaviour for unary operations.

This patch also adds builtins for svabs and svneg.

Reviewers: SjoerdMeijer, efriedma, rovka

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77591
2020-04-20 16:26:20 +01:00
Sander de Smalen 515020c091 [SveEmitter] Add more immediate operand checks.
This patch adds a number of intrinsics that take immediates with
varying ranges based on the element size one of the operands.

    svext:   immediate ranging 0 to (2048/sizeinbits(elt) - 1)
    svasrd:  immediate ranging 1..sizeinbits(elt)
    svqshlu: immediate ranging 1..sizeinbits(elt)/2
    ftmad:   immediate ranging 0..(sizeinbits(elt) - 1)

Reviewers: efriedma, SjoerdMeijer, rovka, rengolin

Reviewed By: SjoerdMeijer

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76679
2020-04-20 14:41:58 +01:00
Christopher Tetreault c858debebc Remove asserting getters from base Type
Summary:
Remove asserting vector getters from Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: dexonsmith, sdesmalen, efriedma

Reviewed By: efriedma

Subscribers: cfe-commits, hiraditya, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D77278
2020-04-17 14:03:31 -07:00
Erich Keane 5f0903e9be Reland Implement _ExtInt as an extended int type specifier.
I fixed the LLDB issue, so re-applying the patch.

This reverts commit a4b88c0449.
2020-04-17 10:45:48 -07:00
Sterling Augustine a4b88c0449 Revert "Implement _ExtInt as an extended int type specifier."
This reverts commit 61ba1481e2.

I'm reverting this because it breaks the lldb build with
incomplete switch coverage warnings. I would fix it forward,
but am not familiar enough with lldb to determine the correct
fix.

lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:3958:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
  switch (qual_type->getTypeClass()) {
          ^
lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4633:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
  switch (qual_type->getTypeClass()) {
          ^
lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4889:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
  switch (qual_type->getTypeClass()) {
2020-04-17 10:29:40 -07:00
Benjamin Kramer b639091c02 Change users of CreateShuffleVector to pass the masks as int instead of Constants
No functionality change intended.
2020-04-17 16:34:29 +02:00
Erich Keane 61ba1481e2 Implement _ExtInt as an extended int type specifier.
Introduction/Motivation:
LLVM-IR supports integers of non-power-of-2 bitwidth, in the iN syntax.
Integers of non-power-of-two aren't particularly interesting or useful
on most hardware, so much so that no language in Clang has been
motivated to expose it before.

However, in the case of FPGA hardware normal integer types where the
full bitwidth isn't used, is extremely wasteful and has severe
performance/space concerns.  Because of this, Intel has introduced this
functionality in the High Level Synthesis compiler[0]
under the name "Arbitrary Precision Integer" (ap_int for short). This
has been extremely useful and effective for our users, permitting them
to optimize their storage and operation space on an architecture where
both can be extremely expensive.

We are proposing upstreaming a more palatable version of this to the
community, in the form of this proposal and accompanying patch.  We are
proposing the syntax _ExtInt(N).  We intend to propose this to the WG14
committee[1], and the underscore-capital seems like the active direction
for a WG14 paper's acceptance.  An alternative that Richard Smith
suggested on the initial review was __int(N), however we believe that
is much less acceptable by WG14.  We considered _Int, however _Int is
used as an identifier in libstdc++ and there is no good way to fall
back to an identifier (since _Int(5) is indistinguishable from an
unnamed initializer of a template type named _Int).

[0]https://www.intel.com/content/www/us/en/software/programmable/quartus-prime/hls-compiler.html)
[1]http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2472.pdf

Differential Revision: https://reviews.llvm.org/D73967
2020-04-17 07:10:57 -07:00
George Burgess IV 94908088a8 [CodeGen] fix inline builtin-related breakage from D78162
In cases where we have multiple decls of an inline builtin, we may need
to go hunting for the one with a definition when setting function
attributes.

An additional test-case was provided on
https://github.com/ClangBuiltLinux/linux/issues/979
2020-04-16 11:54:10 -07:00
Ehud Katz 03a9526fe5 [CGExprAgg] Fix infinite loop in `findPeephole`
Simplify the function using IgnoreParenNoopCasts.

Fix PR45476

Differential Revision: https://reviews.llvm.org/D78098
2020-04-16 13:26:23 +03:00
Benjamin Kramer 3ee1ec0b9d LangOptions cannot depend on ASTContext, make it not use ASTContext directly
Fixes a layering violation introduced in 2ba4e3a459.
2020-04-16 11:46:35 +02:00
Ayke van Laethem 215dc2e203
[AVR] Use the correct address space for non-prototyped function calls
Some function declarations like this:

    void foo();

do not have a type declaration, for that you'd use:

    void foo(void);

Clang internally bitcasts the variadic function declaration to a
function pointer, but doesn't use the correct address space on AVR. This
commit fixes that.

This fix is necessary to let Clang compile compiler-rt for AVR.

Differential Revision: https://reviews.llvm.org/D78125
2020-04-15 23:44:51 +02:00
Melanie Blower 2ba4e3a459 Move BinaryOperators.FPOptions to trailing storage
Reviewers: rjmccall

Differential Revision: https://reviews.llvm.org/D76384
2020-04-15 12:57:31 -07:00
Richard Smith bab6df86ae Rework how UuidAttr, CXXUuidofExpr, and GUID template arguments and constants are represented.
Summary:
Previously, we treated CXXUuidofExpr as quite a special case: it was the
only kind of expression that could be a canonical template argument, it
could be a constant lvalue base object, and so on. In addition, we
represented the UUID value as a string, whose source form we did not
preserve faithfully, and that we partially parsed in multiple different
places.

With this patch, we create an MSGuidDecl object to represent the
implicit object of type 'struct _GUID' created by a UuidAttr. Each
UuidAttr holds a pointer to its 'struct _GUID' and its original
(as-written) UUID string. A non-value-dependent CXXUuidofExpr behaves
like a DeclRefExpr denoting that MSGuidDecl object. We cache an APValue
representation of the GUID on the MSGuidDecl and use it from constant
evaluation where needed.

This allows removing a lot of the special-case logic to handle these
expressions. Unfortunately, many parts of Clang assume there are only
a couple of interesting kinds of ValueDecl, so the total amount of
special-case logic is not really reduced very much.

This fixes a few bugs and issues:
 * PR38490: we now support reading from GUID objects returned from
   __uuidof during constant evaluation.
 * Our Itanium mangling for a non-instantiation-dependent template
   argument involving __uuidof no longer depends on which CXXUuidofExpr
   template argument we happened to see first.
 * We now predeclare ::_GUID, and permit use of __uuidof without
   any header inclusion, better matching MSVC's behavior. We do not
   predefine ::__s_GUID, though; that seems like a step too far.
 * Our IR representation for GUID constants now uses the correct IR type
   wherever possible. We will still fall back to using the
      {i32, i16, i16, [8 x i8]}
   layout if a definition of struct _GUID is not available. This is not
   ideal: in principle the two layouts could have different padding.

Reviewers: rnk, jdoerfert

Subscribers: arphaman, cfe-commits, aeubanks

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78171
2020-04-15 12:20:42 -07:00
George Burgess IV 2dd17ff081 [CodeGen] only add nobuiltin to inline builtins if we'll emit them
There are some inline builtin definitions that we can't emit
(isTriviallyRecursive & callers go into why). Marking these
nobuiltin is only useful if we actually emit the body, so don't mark
these as such unless we _do_ plan on emitting that.

This suboptimality was encountered in Linux (see some discussion on
D71082, and https://github.com/ClangBuiltLinux/linux/issues/979).

Differential Revision: https://reviews.llvm.org/D78162
2020-04-15 11:05:22 -07:00
Benjamin Kramer 316b49d373 Pass shufflevector indices as int instead of unsigned.
No functionality change intended.
2020-04-15 15:52:49 +02:00
Benjamin Kramer 6f64daca8f Upgrade calls to CreateShuffleVector to use the preferred form of passing an array of ints
No functionality change intended.
2020-04-15 12:51:38 +02:00
George Burgess IV 91c8c74180 [CodeGen] clarify a comment; NFC
Prompted by discussion on https://reviews.llvm.org/D78148.
2020-04-14 14:33:01 -07:00
Joerg Sonnenberger 9d2d6e71f0 Emit Objective-C constructors as writable
They end up as .init_array sections and those need to be writable,
otherwise bad merging will happen.
2020-04-14 22:32:34 +02:00
Christopher Tetreault 670f2f694b [SVE] Remove calls to getBitWidth from clang
Reviewers: efriedma

Reviewed By: efriedma

Subscribers: tschuett, rkruppe, psnobl, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77903
2020-04-14 13:29:09 -07:00
Sander de Smalen c8a5b30bac [SveEmitter] Add range checks for immediates and predicate patterns.
Summary:
This patch adds a mechanism to easily add range checks for a builtin's
immediate operands. This patch is tested with the qdech intrinsic, which takes
both an enum for the predicate pattern, as well as an immediate for the
multiplier.

Reviewers: efriedma, SjoerdMeijer, rovka

Reviewed By: efriedma, SjoerdMeijer

Subscribers: mgorny, tschuett, mgrang, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76678
2020-04-14 16:49:32 +01:00
Sander de Smalen 17a68c61a9 [SveEmitter] Implement builtins for contiguous loads/stores
This adds builtins for all contiguous loads/stores, including
non-temporal, first-faulting and non-faulting.

Reviewers: efriedma, SjoerdMeijer

Reviewed By: SjoerdMeijer

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76238
2020-04-14 15:24:57 +01:00
Georgii Rymar 1647ff6e27 [ADT/STLExtras.h] - Add llvm::is_sorted wrapper and update callers.
It can be used to avoid passing the begin and end of a range.
This makes the code shorter and it is consistent with another
wrappers we already have.

Differential revision: https://reviews.llvm.org/D78016
2020-04-14 14:11:02 +03:00
Ayke van Laethem cfc002714a
[AVR] Support aliases in non-zero address space
This fixes code like the following on AVR:

void foo(void) {
}
void bar(void) __attribute__((alias("foo")));

Code like this is present in compiler-rt, which I'm trying to build.

Differential Revision: https://reviews.llvm.org/D76182
2020-04-14 00:42:19 +02:00
Christopher Tetreault f22fbe3a15 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: sdesmalen, efriedma, krememek

Reviewed By: sdesmalen, efriedma

Subscribers: dexonsmith, Charusso, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77257
2020-04-13 13:01:40 -07:00
Mehdi Amini ed03d9485e Revert "[TLI] Per-function fveclib for math library used for vectorization"
This reverts commit 60c642e74b.

This patch is making the TLI "closed" for a predefined set of VecLib
while at the moment it is extensible for anyone to customize when using
LLVM as a library.
Reverting while we figure out a way to re-land it without losing the
generality of the current API.

Differential Revision: https://reviews.llvm.org/D77925
2020-04-11 01:05:01 +00:00
Matt Morehouse bef187c750 Implement `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist` for clang
Summary:
This commit adds two command-line options to clang.
These options let the user decide which functions will receive SanitizerCoverage instrumentation.
This is most useful in the libFuzzer use case, where it enables targeted coverage-guided fuzzing.

Patch by Yannis Juglaret of DGA-MI, Rennes, France

libFuzzer tests its target against an evolving corpus, and relies on SanitizerCoverage instrumentation to collect the code coverage information that drives corpus evolution. Currently, libFuzzer collects such information for all functions of the target under test, and adds to the corpus every mutated sample that finds a new code coverage path in any function of the target. We propose instead to let the user specify which functions' code coverage information is relevant for building the upcoming fuzzing campaign's corpus. To this end, we add two new command line options for clang, enabling targeted coverage-guided fuzzing with libFuzzer. We see targeted coverage guided fuzzing as a simple way to leverage libFuzzer for big targets with thousands of functions or multiple dependencies. We publish this patch as work from DGA-MI of Rennes, France, with proper authorization from the hierarchy.

Targeted coverage-guided fuzzing can accelerate bug finding for two reasons. First, the compiler will avoid costly instrumentation for non-relevant functions, accelerating fuzzer execution for each call to any of these functions. Second, the built fuzzer will produce and use a more accurate corpus, because it will not keep the samples that find new coverage paths in non-relevant functions.

The two new command line options are `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist`. They accept files in the same format as the existing `-fsanitize-blacklist` option <https://clang.llvm.org/docs/SanitizerSpecialCaseList.html#format>. The new options influence SanitizerCoverage so that it will only instrument a subset of the functions in the target. We explain these options in detail in `clang/docs/SanitizerCoverage.rst`.

Consider now the woff2 fuzzing example from the libFuzzer tutorial <https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md>. We are aware that we cannot conclude much from this example because mutating compressed data is generally a bad idea, but let us use it anyway as an illustration for its simplicity. Let us use an empty blacklist together with one of the three following whitelists:

```
  # (a)
  src:*
  fun:*

  # (b)
  src:SRC/*
  fun:*

  # (c)
  src:SRC/src/woff2_dec.cc
  fun:*
```

Running the built fuzzers shows how many instrumentation points the compiler adds, the fuzzer will output //XXX PCs//. Whitelist (a) is the instrument-everything whitelist, it produces 11912 instrumentation points. Whitelist (b) focuses coverage to instrument woff2 source code only, ignoring the dependency code for brotli (de)compression; it produces 3984 instrumented instrumentation points. Whitelist (c) focuses coverage to only instrument functions in the main file that deals with WOFF2 to TTF conversion, resulting in 1056 instrumentation points.

For experimentation purposes, we ran each fuzzer approximately 100 times, single process, with the initial corpus provided in the tutorial. We let the fuzzer run until it either found the heap buffer overflow or went out of memory. On this simple example, whitelists (b) and (c) found the heap buffer overflow more reliably and 5x faster than whitelist (a). The average execution times when finding the heap buffer overflow were as follows: (a) 904 s, (b) 156 s, and (c) 176 s.

We explain these results by the fact that WOFF2 to TTF conversion calls the brotli decompression algorithm's functions, which are mostly irrelevant for finding bugs in WOFF2 font reconstruction but nevertheless instrumented and used by whitelist (a) to guide fuzzing. This results in longer execution time for these functions and a partially irrelevant corpus. Contrary to whitelist (a), whitelists (b) and (c) will execute brotli-related functions without instrumentation overhead, and ignore new code paths found in them. This results in faster bug finding for WOFF2 font reconstruction.

The results for whitelist (b) are similar to the ones for whitelist (c). Indeed, WOFF2 to TTF conversion calls functions that are mostly located in SRC/src/woff2_dec.cc. The 2892 extra instrumentation points allowed by whitelist (b) do not tamper with bug finding, even though they are mostly irrelevant, simply because most of these functions do not get called. We get a slightly faster average time for bug finding with whitelist (b), which might indicate that some of the extra instrumentation points are actually relevant, or might just be random noise.

Reviewers: kcc, morehouse, vitalybuka

Reviewed By: morehouse, vitalybuka

Subscribers: pratyai, vitalybuka, eternalsakura, xwlin222, dende, srhines, kubamracek, #sanitizers, lebedev.ri, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D63616
2020-04-10 10:44:03 -07:00
Kevin P. Neal 7f38812d5b [FPEnv][AArch64] Platform-specific builtin constrained FP enablement
When constrained floating point is enabled the AArch64-specific builtins don't use constrained intrinsics in some cases. Fix that.

Neon is part of this patch, so ARM is affected as well.

Differential Revision: https://reviews.llvm.org/D77074
2020-04-10 13:02:00 -04:00
Wenlei He 60c642e74b [TLI] Per-function fveclib for math library used for vectorization
Summary:
Encode `-fveclib` setting as per-function attribute so it can threaded through to LTO backends. Accordingly per-function TLI now reads
the attributes and select available vector function list based on that. Now we also populate function list for all supported vector
libraries for the shared per-module `TargetLibraryInfoImpl`, so each function can select its available vector list independently but without
duplicating the vector function lists. Inlining between incompatbile vectlib attributed is also prohibited now.

Subscribers: hiraditya, dexonsmith, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77632
2020-04-09 18:26:38 -07:00
Pratyai Mazumder ced398fdc8 [SanitizerCoverage] Add -fsanitize-coverage=inline-bool-flag
Reviewers: kcc, vitalybuka

Reviewed By: vitalybuka

Subscribers: cfe-commits, llvm-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77637
2020-04-09 02:40:55 -07:00
Serge Pavlov c7ff5b38f2 [FPEnv] Use single enum to represent rounding mode
Now compiler defines 5 sets of constants to represent rounding mode.
These are:

1. `llvm::APFloatBase::roundingMode`. It specifies all 5 rounding modes
defined by IEEE-754 and is used in `APFloat` implementation.

2. `clang::LangOptions::FPRoundingModeKind`. It specifies 4 of 5 IEEE-754
rounding modes and a special value for dynamic rounding mode. It is used
in clang frontend.

3. `llvm::fp::RoundingMode`. Defines the same values as
`clang::LangOptions::FPRoundingModeKind` but in different order. It is
used to specify rounding mode in in IR and functions that operate IR.

4. Rounding mode representation used by `FLT_ROUNDS` (C11, 5.2.4.2.2p7).
Besides constants for rounding mode it also uses a special value to
indicate error. It is convenient to use in intrinsic functions, as it
represents platform-independent representation for rounding mode. In this
role it is used in some pending patches.

5. Values like `FE_DOWNWARD` and other, which specify rounding mode in
library calls `fesetround` and `fegetround`. Often they represent bits
of some control register, so they are target-dependent. The same names
(not values) and a special name `FE_DYNAMIC` are used in
`#pragma STDC FENV_ROUND`.

The first 4 sets of constants are target independent and could have the
same numerical representation. It would simplify conversion between the
representations. Also now `clang::LangOptions::FPRoundingModeKind` and
`llvm::fp::RoundingMode` do not contain the value for IEEE-754 rounding
direction `roundTiesToAway`, although it is supported natively on
some targets.

This change defines all the rounding mode type via one `llvm::RoundingMode`,
which also contains rounding mode for IEEE rounding direction `roundTiesToAway`.

Differential Revision: https://reviews.llvm.org/D77379
2020-04-09 13:26:47 +07:00
Erich Keane 30588a7395 Make target features check work with ctor and dtor-
The problem was reported in PR45468, applying target features to an
always_inline constructor/destructor runs afoul of GlobalDecl
construction assert when checking for target-feature compatibility.

The core problem is fixed by using the version of the check that takes a
FunctionDecl rather than the GlobalDecl. However, while writing the
test, I discovered that source locations weren't properly set for this
check on ctors/dtors. This patch also fixes constructors and CALLED destructors.

Unfortunately, it doesn't seem too possible to get a meaningful source
location for a 'cleanup' destructor, so those are still 'frontend' level
errors unfortunately. A fixme was added to the test to cover that
situation.
2020-04-08 13:19:55 -07:00
Raul Tambre 878d96011a [clang][CodeGen] Handle throw expression in conditional operator constant folding
Summary:
We're smart and do constant folding when emitting conditional operators.
Thus we emit the live value as a lvalue. This doesn't work if the live value is a throw expression.
Handle this by emitting the throw and returning the dead value as the lvalue.

Fixes PR28184.

Reviewers: rsmith

Reviewed By: rsmith

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77502
2020-04-08 12:32:21 -07:00
Artem Belevich a9627b7ea7 [CUDA] Add partial support for recent CUDA versions.
Generate PTX using newer versions of PTX and allow using sm_80 with CUDA-11.
None of the new features of CUDA-10.2+ have been implemented yet, so using these
versions will still produce a warning.

Differential Revision: https://reviews.llvm.org/D77670
2020-04-08 11:19:44 -07:00
Bevin Hansson 313461f6d8 [CodeGen] Emit IR for compound assignment with fixed-point operands.
Reviewers: rjmccall, leonardchan

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73184
2020-04-08 14:33:04 +02:00
Bevin Hansson 39baaabf6d [CodeGen] Emit IR for fixed-point unary operators.
Reviewers: rjmccall, leonardchan

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73183
2020-04-08 14:33:04 +02:00
Bevin Hansson 0b9922e67a [CodeGen] Emit IR for fixed-point multiplication and division.
Reviewers: rjmccall, leonardchan

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73182
2020-04-08 14:33:04 +02:00
Alexey Bataev be99c61588 [OPENMP50]Codegen for iterator construct.
Implemented codegen for the iterator expression in the depend clauses.
Iterator construct is emitted the following way:
iterator(cnt1, cnt2, ...), in : <dep>

<TotalNumDeps> = <cnt1_size> * <cnt2_size> * ...;
kmp_depend_t deps[<TotalNumDeps>];
deps_counter = 0;
for (cnt1) {
  for (cnt2) {
    ...
    deps[deps_counter].base_addr = &<dep>;
    deps[deps_counter].size = sizeof(<dep>);
    deps[deps_counter].flags = in;
    deps_counter += 1;
    ...
  }
}

For depobj construct the codegen is very similar, but the memory is
allocated dynamically and added extra first item reserved for internal use.
2020-04-07 15:26:00 -04:00
Amy Huang bcf66084ed [DebugInfo] Fix for adding "returns cxx udt" option to functions in CodeView.
Summary:
This change adds DIFlagNonTrivial to forward declarations of
DICompositeType. It adds the flag to nontrivial types and types with
unknown triviality.

It fixes adding the "CxxReturnUdt" flag to functions inconsistently,
since it is added based on whether the return type is marked NonTrivial, and
that changes if the return type was a forward declaration.

continues the discussion at https://reviews.llvm.org/D75215

Bug: https://bugs.llvm.org/show_bug.cgi?id=44785

Reviewers: rnk, dblaikie, aprantl

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77436
2020-04-07 09:10:27 -07:00
Michael Liao c97be2c377 [hip] Remove `hip_pinned_shadow`.
Summary:
- Use `device_builtin_surface` and `device_builtin_texture` for
  surface/texture reference support. So far, both the host and device
  use the same reference type, which could be revised later when
  interface/implementation is stablized.

Reviewers: yaxunl

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77583
2020-04-07 09:51:49 -04:00
Florian Hahn 338be9c595 [Clang] Add llvm.loop.unroll.disable to loops with -fno-unroll-loops.
Currently Clang does not respect -fno-unroll-loops during LTO. During
D76916 it was suggested to respect -fno-unroll-loops on a TU basis.

This patch uses the existing llvm.loop.unroll.disable metadata to
disable loop unrolling explicitly for each loop in the TU if
unrolling is disabled. This should ensure that loops from TUs compiled
with -fno-unroll-loops are skipped by the unroller during LTO.

This also means that if a loop from a TU with -fno-unroll-loops
gets inlined into a TU without this option, the loop won't be
unrolled.

Due to the fact that some transforms might drop loop metadata, there
potentially are cases in which we still unroll loops from TUs with
-fno-unroll-loops. I think we should fix those issues rather than
introducing a function attribute to disable loop unrolling during LTO.
Improving the metadata handling will benefit other use cases, like
various loop pragmas, too. And it is an improvement to clang completely
ignoring -fno-unroll-loops during LTO.

If that direction looks good, we can use a similar approach to also
respect -fno-vectorize during LTO, at least for LoopVectorize.

In the future, this might also allow us to remove the UnrollLoops option
LLVM's PassManagerBuilder.

Reviewers: Meinersbur, hfinkel, dexonsmith, tejohnson

Reviewed By: Meinersbur, tejohnson

Differential Revision: https://reviews.llvm.org/D77058
2020-04-07 14:01:55 +01:00
Eli Friedman 68b03aee1a Remove SequentialType from the type heirarchy.
Now that we have scalable vectors, there's a distinction that isn't
getting captured in the original SequentialType: some vectors don't have
a known element count, so counting the number of elements doesn't make
sense.

In some cases, there's a better way to express the commonality using
other methods. If we're dealing with GEPs, there's GEP methods; if we're
dealing with a ConstantDataSequential, we can query its element type
directly.

In the relatively few remaining cases, I just decided to write out
the type checks. We're talking about relatively few places, and I think
the abstraction doesn't really carry its weight. (See thread "[RFC]
Refactor class hierarchy of VectorType in the IR" on llvmdev.)

Differential Revision: https://reviews.llvm.org/D75661
2020-04-06 17:03:49 -07:00
Erik Pilkington d33c7de8e1 [CodeGenObjC] Fix a crash when attempting to copy a zero-sized bit-field in a non-trivial C struct
Zero sized bit-fields aren't included in the CGRecordLayout, so we shouldn't be
calling EmitLValueForField for them. rdar://60695105

Differential revision: https://reviews.llvm.org/D76782
2020-04-06 16:04:13 -04:00
Reid Kleckner b36c19bc4f [AST] Remove DeclCXX.h dep on ASTContext.h
Saves only 36 includes of ASTContext.h and related headers.

There are two deps on ASTContext.h:
- C++ method overrides iterator types (TinyPtrVector)
- getting LangOptions

For #1, duplicate the iterator type, which is
TinyPtrVector<>::const_iterator.

For #2, add an out-of-line accessor to get the language options. Getting
the ASTContext from a Decl is already an out of line method that loops
over the parent DeclContexts, so if it is ever performance critical, the
proper fix is to pass the context (or LangOpts) into the predicate in
question.

Other changes are just header fixups.
2020-04-06 10:09:01 -07:00
Amy Huang 11a04a64aa [DebugInfo] Change to constructor homing debug info mode: skip literal types
Summary:
In constructor type homing mode sometimes complete debug info for constexpr
types was missing, because there was not a constructor emitted. This change
makes constructor type homing ignore constexpr types.

Reviewers: rnk, dblaikie

Subscribers: aprantl, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77432
2020-04-06 09:52:53 -07:00
Alexey Bataev 1c92448656 [OPENMP]Fix PR45439: `omp for collapse(2) ordered(2)` generates invalid
IR.

Fixed a crash because of the not quite correct casting of the value of
iterations.
2020-04-06 12:07:43 -04:00
Johannes Doerfert 419a559c5a [OpenMP][NFCI] Move OpenMP clause information to `lib/Frontend/OpenMP`
This is a cleanup and normalization patch that also enables reuse with
Flang later on. A follow up will clean up and move the directive ->
clauses mapping.

Reviewed By: fghanim

Differential Revision: https://reviews.llvm.org/D77112
2020-04-05 22:30:29 -05:00
David Blaikie e9644e6f4f DebugInfo: Fix default template parameter computation for dependent non-type template parameters
This addresses the immediate bug, though in theory we could still
produce a default parameter for the DWARF in this test case - but other
cases will be definitely unachievable (you could have a default
parameter that cannot be evaluated - so long as the user overrode it
with another value rather than relying on that default)
2020-04-05 16:31:30 -07:00
Eli Friedman 83fa811e5b [clang][opaque pointers] Fix up a bunch of "getType()->getElementType()"
In contexts where we know an LLVM type is a pointer, there's generally
some simpler way to get the pointee type.
2020-04-03 18:00:33 -07:00
Eli Friedman b11decc221 [clang codegen][opaque pointers] Remove use of deprecated constructor
(See also D76269.)
2020-04-03 18:00:33 -07:00
Kevin P. Neal 9f1c35d8b1 Revert "[PowerPC] Replace subtract-from-zero float in version with fneg in PowerPC special fma compiler builtins"
The new test case causes bot failures.

This reverts commit ba87430cad.
2020-04-03 15:47:19 -04:00
Andrew Wock ba87430cad [PowerPC] Replace subtract-from-zero float in version with fneg in PowerPC special fma compiler builtins
This patch adds a test for the PowerPC fma compiler builtins, some variations
of which negate inputs and outputs. The code to generate IR for these
builtins was untested before this patch.

Originally, the code used the outdated method of subtracting floating point
values from -0.0 as floating point negation. This patch remedies that.

Patch by: Drew Wock <drew.wock@sas.com>
Differential Revision: https://reviews.llvm.org/D76949
2020-04-03 14:59:33 -04:00
Michael Liao b952d799ca [cuda][hip] Fix `RegisterVar` function prototype.
Summary:
- `RegisterVar` has `void` return type and `size_t` in its variable size
  parameter in HIP or CUDA 9.0+.

Reviewers: tra, yaxunl

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77398
2020-04-03 12:57:09 -04:00
Lucas Prates e6cb4b659a [Clang][CodeGen] Fixing mismatch between memory layout and const expressions for oversized bitfields
Summary:
The construction of constants for structs/unions was conflicting the
expected memory layout for over-sized bit-fields. When building the
necessary bits for those fields, clang was ignoring the size information
computed for the struct/union memory layout and using the original data
from the AST's FieldDecl information. This caused an issue in big-endian
targets, where the field's contant was incorrectly misplaced due to
endian calculations.

This patch aims to separate the constant value from the necessary
padding bits, using the proper size information for each one of them.
With this, the layout of constants for over-sized bit-fields matches the
ABI requirements.

Reviewers: rsmith, eli.friedman, efriedma

Reviewed By: efriedma

Subscribers: efriedma, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77048
2020-04-02 11:55:20 +01:00
Daniel Kiss 7314aea5a4 [clang] Move branch-protection from CodeGenOptions to LangOptions
Summary:
Reason: the option has an effect on preprocessing.

Also see thread: http://lists.llvm.org/pipermail/cfe-dev/2020-March/065014.html

Reviewers: chill, efriedma

Reviewed By: efriedma

Subscribers: efriedma, danielkiss, cfe-commits, kristof.beyls

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77131
2020-04-02 10:31:52 +02:00
Johannes Doerfert 1858f4b50d Revert "[OpenMP][NFCI] Move OpenMP clause information to `lib/Frontend/OpenMP`"
This reverts commit c18d55998b.

Bots have reported uses that need changing, e.g.,
  clang-tools-extra/clang-tidy/openmp/UseDefaultNoneCheck.cp
as reported by
  http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/46591
2020-04-02 02:23:22 -05:00
Johannes Doerfert c18d55998b [OpenMP][NFCI] Move OpenMP clause information to `lib/Frontend/OpenMP`
This is a cleanup and normalization patch that also enables reuse with
Flang later on. A follow up will clean up and move the directive ->
clauses mapping.

Differential Revision: https://reviews.llvm.org/D77112
2020-04-02 01:39:07 -05:00
David Blaikie db92719c1d DebugInfo: Defaulted non-type template parameters of bool type
Caused an assertion due to mismatched bit widths - this seems like the
right API to use for a possibly width-varying equality test. Though
certainly open to some post-commit review feedback if there's a more
suitable way to do this comparison/test.
2020-04-01 13:21:13 -07:00
Arnold Schwaighofer 153dadf3a3 [clang] CodeGen: Make getOrEmitProtocol public for Swift
Summary:
Swift would like to use clang's apis to emit protocol declarations.

This commits adds the public API:

```
 emitObjCProtocolObject(CodeGenModule &CGM, const ObjCProtocolDecl *p);
```

rdar://60888524

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77077
2020-04-01 08:55:56 -07:00
Vlastimil Labsky 57fd86de87 [AVR] Fix function pointer address space
Summary:
Function pointers should be created with program address space.
This fixes function pointers on AVR.

Reviewers: dylanmckay

Reviewed By: dylanmckay

Subscribers: Jim, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77119
2020-04-01 21:08:37 +13:00
Ian Levesque bb3111cbaf [clang][xray] Add xray attributes to functions without decls too
Summary: This allows instrumenting things like global initializers

Reviewers: dberris, MaskRay, smeenai

Subscribers: cfe-commits, johnislarry

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77191
2020-04-01 00:02:39 -04:00
Alexey Bataev c2aa543237 [OPENMP50]Codegen for array shaping expression in map clauses.
Added codegen support for array shaping operations in map/to/from
clauses.
2020-03-31 19:06:49 -04:00
Alexey Bataev e094dd5adc [OPENMP50]Fix size calculation for array shaping expression in the
codegen.

Need to include the size of the pointee type when trying to calculate
the total size of the array shaping expression.
2020-03-31 18:45:21 -04:00
Eli Friedman 1ee6ec2bf3 Remove "mask" operand from shufflevector.
Instead, represent the mask as out-of-line data in the instruction. This
should be more efficient in the places that currently use
getShuffleVector(), and paves the way for further changes to add new
shuffles for scalable vectors.

This doesn't change the syntax in textual IR. And I don't currently plan
to change the bitcode encoding in this patch, although we'll probably
need to do something once we extend shufflevector for scalable types.

I expect that once this is finished, we can then replace the raw "mask"
with something more appropriate for scalable vectors.  Not sure exactly
what this looks like at the moment, but there are a few different ways
we could handle it.  Maybe we could try to describe specific shuffles.
Or maybe we could define it in terms of a function to convert a fixed-length
array into an appropriate scalable vector, using a "step", or something
like that.

Differential Revision: https://reviews.llvm.org/D72467
2020-03-31 13:08:59 -07:00
zhizhouy 94d912296d [NFC] Do not run CGProfilePass when not using integrated assembler
Summary:
CGProfilePass is run by default in certain new pass manager optimization pipeline. Assemblers other than llvm as (such as gnu as) cannot recognize the .cgprofile entries generated and emitted from this pass, causing build time error.

This patch adds new options in clang CodeGenOpts and PassBuilder options so that we can turn cgprofile off when not using integrated assembler.

Reviewers: Bigcheese, xur, george.burgess.iv, chandlerc, manojgupta

Reviewed By: manojgupta

Subscribers: manojgupta, void, hiraditya, dexonsmith, llvm-commits, tcwang, llozano

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D62627
2020-03-31 10:31:31 -07:00
Alexey Bataev a4f74f377b [OPENMP50]Do not imply lvalue as base expression in array shaping
expression.

We should not assume that the base expression in the array shaping
operation is an lvalue of some form, it may be an rvalue.
2020-03-30 17:07:08 -04:00
Alexey Bataev 7842e7ebbf [OPENMP50]Add codegen support for array shaping expression in depend
clauses.

Implemented codegen for array shaping operation in depend clauses. The
begin of the expression is the pointer itself, while the size of the
dependence data is the mukltiplacation of all dimensions in the array
shaping expression.
2020-03-30 13:37:21 -04:00
Michael Liao cb6389360b Fix GCC warning on enum class bitfield. NFC. 2020-03-28 10:20:34 -04:00
Yaxun (Sam) Liu 369e26ca9e [AMDGPU] Add __builtin_amdgcn_workgroup_size_x/y/z
The main purpose of introducing these builtins is to add a range
metadata [1, 1025) on the work group size loaded from dispatch
ptr, which cannot be done by source code.

Differential Revision: https://reviews.llvm.org/D76772
2020-03-28 01:03:20 -04:00
Alexey Bataev 0fca766458 [OPENMP50]Fix PR45117: Orphaned task reduction should be allowed.
Add support for orpahned task reductions.
2020-03-27 17:47:30 -04:00
Adrian Prantl 22d5bd0e3b Allow remapping Clang module include paths
in the debug info with -fdebug-prefix-map.

rdar://problem/55685132

This reapplies an earlier attempt to commit this without
modifications.

Differential Revision: https://reviews.llvm.org/D76385
2020-03-27 14:23:30 -07:00
Michael Liao 5be9b8cbe2 [cuda][hip] Add CUDA builtin surface/texture reference support.
Summary: - Re-commit after fix Sema checks on partial template specialization.

Reviewers: tra, rjmccall, yaxunl, a.sidorin

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76365
2020-03-27 17:18:49 -04:00
Artem Belevich fe8063e1a0 Revert "[cuda][hip] Add CUDA builtin surface/texture reference support."
This reverts commit 6a9ad5f3f4.
The patch breaks CUDA copmilation.

Differential Revision: https://reviews.llvm.org/D76365
2020-03-27 10:01:38 -07:00
Mikael Holmen 7d482e9213 Fix TBAA for unsigned fixed-point types
Summary:
Unsigned types can alias the corresponding signed types. I don't see
that this is explicitly mentioned in the Embedded-C specification, but
I think it should work the same as for the integer types.

Patch by: materi

Reviewers: ebevhan, leonardchan

Reviewed By: leonardchan

Subscribers: kosarev, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76856
2020-03-27 10:35:24 +01:00
Johannes Doerfert befb4be3a8 [OpenMP] `omp begin/end declare variant` - part 2, sema ("+CG")
This is the second part loosely extracted from D71179 and cleaned up.

This patch provides semantic analysis support for `omp begin/end declare
variant`, mostly as defined in OpenMP technical report 8 (TR8) [0].
The sema handling makes code generation obsolete as we generate "the
right" calls that can just be handled as usual. This handling also
applies to the existing, albeit problematic, `omp declare variant
support`. As a consequence a lot of unneeded code generation and
complexity is removed.

A major purpose of this patch is to provide proper `math.h`/`cmath`
support for OpenMP target offloading. See PR42061, PR42798, PR42799. The
current code was developed with this feature in mind, see [1].

The logic is as follows:

If we have seen a `#pragma omp begin declare variant match(<SELECTOR>)`
but not the corresponding `end declare variant`, and we find a function
definition we will:
  1) Create a function declaration for the definition we were about to generate.
  2) Create a function definition but with a mangled name (according to
     `<SELECTOR>`).
  3) Annotate the declaration with the `OMPDeclareVariantAttr`, the same
     one used already for `omp declare variant`, using and the mangled
     function definition as specialization for the context defined by
     `<SELECTOR>`.

When a call is created we inspect it. If the target has an
`OMPDeclareVariantAttr` attribute we try to specialize the call. To this
end, all variants are checked, the best applicable one is picked and a
new call to the specialization is created. The new call is used instead
of the original one to the base function. To keep the AST printing and
tooling possible we utilize the PseudoObjectExpr. The original call is
the syntactic expression, the specialized call is the semantic
expression.

[0] https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf
[1] https://reviews.llvm.org/D61399#change-496lQkg0mhRN

Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim, aaron.ballman

Subscribers: bollu, guansong, openmp-commits, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75779
2020-03-27 02:30:58 -05:00
Johannes Doerfert 095cecbe0d [OpenMP] `omp begin/end declare variant` - part 1, parsing
This is the first part extracted from D71179 and cleaned up.

This patch provides parsing support for `omp begin/end declare variant`,
as defined in OpenMP technical report 8 (TR8) [0].

A major purpose of this patch is to provide proper math.h/cmath support
for OpenMP target offloading. See PR42061, PR42798, PR42799. The current
code was developed with this feature in mind, see [1].

[0] https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf
[1] https://reviews.llvm.org/D61399#change-496lQkg0mhRN

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D74941
2020-03-27 02:30:58 -05:00
Sid Manning b0da094983 [Hexagon] Add support for Linux/Musl ABI (part 2)
A continuation of https://reviews.llvm.org/D72701.  This
adds support needed in clang.

Differential Revision: https://reviews.llvm.org/D75638
2020-03-26 17:19:46 -05:00
Michael Liao 6a9ad5f3f4 [cuda][hip] Add CUDA builtin surface/texture reference support.
Summary:
- Even though the bindless surface/texture interfaces are promoted,
  there are still code using surface/texture references. For example,
  [PR#26400](https://bugs.llvm.org/show_bug.cgi?id=26400) reports the
  compilation issue for code using `tex2D` with texture references. For
  better compatibility, this patch proposes the support of
  surface/texture references.
- Due to the absent documentation and magic headers, it's believed that
  `nvcc` does use builtins for texture support. From the limited NVVM
  documentation[^nvvm] and NVPTX backend texture/surface related
  tests[^test], it's believed that surface/texture references are
  supported by replacing their reference types, which are annotated with
  `device_builtin_surface_type`/`device_builtin_texture_type`, with the
  corresponding handle-like object types, `cudaSurfaceObject_t` or
  `cudaTextureObject_t`, in the device-side compilation. On the host
  side, that global handle variables are registered and will be
  established and updated later when corresponding binding/unbinding
  APIs are called[^bind]. Surface/texture references are most like
  device global variables but represented in different types on the host
  and device sides.
- In this patch, the following changes are proposed to support that
  behavior:
  + Refine `device_builtin_surface_type` and
    `device_builtin_texture_type` attributes to be applied on `Type`
    decl only to check whether a variable is of the surface/texture
    reference type.
  + Add hooks in code generation to replace that reference types with
    the correponding object types as well as all accesses to them. In
    particular, `nvvm.texsurf.handle.internal` should be used to load
    object handles from global reference variables[^texsurf] as well as
    metadata annotations.
  + Generate host-side registration with proper template argument
    parsing.

---
[^nvvm]: https://docs.nvidia.com/cuda/pdf/NVVM_IR_Specification.pdf
[^test]: https://raw.githubusercontent.com/llvm/llvm-project/master/llvm/test/CodeGen/NVPTX/tex-read-cuda.ll
[^bind]: See section 3.2.11.1.2 ``Texture reference API` in [CUDA C Programming Guide](https://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf).
[^texsurf]: According to NVVM IR, `nvvm.texsurf.handle` should be used.  But, the current backend doesn't have that supported. We may revise that later.

Reviewers: tra, rjmccall, yaxunl, a.sidorin

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76365
2020-03-26 14:44:52 -04:00
Eli Friedman c46a0c07a6 [clang codegen] Address review comment on comment in constWithPadding. 2020-03-25 10:58:03 -07:00
Adrian Prantl c025235e96 Revert "Allow remapping Clang module include paths"
to investigate why this commit broke a test in the LLDB testsuite.

This reverts commit dca920a904.
2020-03-24 17:57:34 -07:00
Adrian Prantl dca920a904 Allow remapping Clang module include paths
rdar://problem/55685132

Differential Revision: https://reviews.llvm.org/D76385
2020-03-24 17:14:27 -07:00
Eli Friedman 3f1defa6e2 [clang codegen] Clean up handling of vectors with trivial-auto-var-init.
The code was pretending to be doing something useful with vectors, but
really it was doing nothing: the element type of a vector is always a
scalar type, so constWithPadding would always just return the input constant.

Split off from D75661 so it can be reviewed separately.

While I'm here, also add testcase to show missing vector handling.

Differential Revision: https://reviews.llvm.org/D76528
2020-03-24 14:34:40 -07:00
Erik Pilkington de98cf92e3 [CodeGen] Add an alignment attribute to all sret parameters
This fixes a miscompile when the parameter is actually underaligned.
rdar://58316406

Differential revision: https://reviews.llvm.org/D74183
2020-03-24 15:31:57 -04:00
Momchil Velikov 080d046c91 [ARM][CMSE] Implement CMSE attributes
This patch adds CMSE attributes `cmse_nonsecure_call` and
`cmse_nonsecure_entry`.  As usual, specification is available here:
https://developer.arm.com/docs/ecm0359818/latest

Patch by Javed Absar, Bradley Smith, David Green, Momchil Velikov,
possibly others.

Differential Revision: https://reviews.llvm.org/D71129
2020-03-24 10:21:26 +00:00
Jun Ma d0f4af8f30 [Coroutines] Insert lifetime intrinsics even O0 is used
Differential Revision: https://reviews.llvm.org/D76119
2020-03-24 13:41:55 +08:00
Matt Arsenault 3f533006ba AMDGPU: Emit llvm.fshr for __builtin_amdgcn_alignbit
These are equivalent. The generic rotate builtins do not directly map
to the fshr intrinsic.
2020-03-23 16:51:25 -04:00
Johannes Doerfert 55eca2853e [OpenMP][NFC] Minimize memory usage and copying of `OMPTraitInfo`s
See rational here: https://reviews.llvm.org/D71830#1922656

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D76173
2020-03-23 14:23:46 -05:00
Alexey Bataev 63828a35da [OPENMP50]Bassic support for exclusive clause.
Added basic support (parsing/sema/serialization) for exclusive clause in
scan directives.
2020-03-23 13:12:52 -04:00
David Blaikie b5eafda8d3 Revert "EHScopeStack::Cleanup has virtual functions so the destructor should be too."
This type was already well designed - having a protected destructor, and
derived classes being final/public non-virtual destructors, the type
couldn't be destroyed polymorphically & accidentally cause slicing.

This reverts commit 736385c0b4.
2020-03-21 21:17:33 -07:00
Thomas Lively de6cd3e836 [WebAssembly] Add SIMD integer abs builtins
Summary:
Since the conditional operator cannot be used with vector conditions
in C, we need a builtin to be able to express this operation in C
source.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76538
2020-03-21 00:21:24 -07:00
Akira Hatanaka d35a454170 [CodeGen] Emit destructor calls to destruct non-trivial C struct objects
returned by function calls or loaded from volatile objects

rdar://problem/51867864

Differential Revision: https://reviews.llvm.org/D66094
2020-03-20 18:34:22 -07:00
Adrian Prantl ceae47143b Allow remapping the sysroot with -fdebug-prefix-map.
<rdar://problem/55685132>

Differential Revision: https://reviews.llvm.org/D76393
2020-03-20 16:27:50 -07:00
Adrian Prantl bde15de3ca Revert "Allow remapping the sysroot with -fdebug-prefix-map."
This reverts commit 6725c4836a.
2020-03-20 16:27:23 -07:00
Adrian Prantl 6725c4836a Allow remapping the sysroot with -fdebug-prefix-map.
<rdar://problem/55685132>

Differential Revision: https://reviews.llvm.org/D76393
2020-03-20 15:52:39 -07:00
Adrian Prantl 43580a5c5a Allow remapping Clang module skeleton CU references with -fdebug-prefix-map
Differential Revision: https://reviews.llvm.org/D76383
2020-03-20 15:15:56 -07:00
Adrian Prantl 97f490d87b Don't set the isOptimized flag in module skeleton DICompileUnits.
It's not used for anything.
2020-03-20 14:18:15 -07:00
Adrian Prantl 079c6ddaf5 Correctly initialize the DW_AT_comp_dir attribute of Clang module skeleton CUs
Before this patch a Clang module skeleton CU would have a
DW_AT_comp_dir pointing to the directory of the module map file, and
this information was not used by anyone. Even worse, LLDB actually
resolves relative DWO paths by appending it to DW_AT_comp_dir. This
patch sets it to the same directory that is used as the main CU's
compilation directory, which would make the LLDB code work.

Differential Revision: https://reviews.llvm.org/D76377
2020-03-20 14:18:14 -07:00
Alexey Bataev 06dea73307 [OPENMP50]Initial support for inclusive clause.
Added parsing/sema/serialization support for inclusive clause in scan
directive.
2020-03-20 14:20:38 -04:00
Reid Kleckner ce5173c0e1 Use FinishThunk to finish musttail thunks
FinishThunk, and the invariant of setting and then unsetting
CurCodeDecl, was added in 7f416cc426 (2015). The invariant didn't
exist when I added this musttail codepath in ab2090d107 (2014).
Recently in 28328c3771, I started using this codepath on non-Windows
platforms, and users reported problems during release testing (PR44987).

The issue was already present for users of EH on i686-windows-msvc, so I
added a test for that case as well.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D76444
2020-03-20 09:02:21 -07:00
Alexey Bataev fcba7c3534 [OPENMP50]Initial support for scan directive.
Addedi basic parsing/sema/serialization support for scan directive.
2020-03-20 07:58:15 -04:00
Shiva Chen fc3752665f [RISCV] Passing small data limitation value to RISCV backend
Passing small data limit to RISCVELFTargetObjectFile by module flag,
So the backend can set small data section threshold by the value.
The data will be put into the small data section if the data smaller than
the threshold.

Differential Revision: https://reviews.llvm.org/D57497
2020-03-20 11:03:51 +08:00
Thomas Lively a3f974f3c3 [WebAssembly] SIMD bitmask intrinsics and builtin functions
Summary:
These experimental new instructions are proposed in
https://github.com/WebAssembly/simd/pull/201.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76397
2020-03-19 17:15:37 -07:00
Djordje Todorovic d9b9621009 Reland D73534: [DebugInfo] Enable the debug entry values feature by default
The issue that was causing the build failures was fixed with the D76164.
2020-03-19 13:57:30 +01:00
Lucas Prates d4ad386ee1 [ARM] Fixing range checks for Neon's vqdmulhq_lane and vqrdmulhq_lane intrinsics
Summary:
The range checks performed for the vqrdmulh_lane and vqrdmulh_lane Neon
intrinsics were incorrectly using their return type as the base type for
the range check performed on their 'lane' argument.

This patch updates those intrisics to use the type of the proper reference
argument to perform the range checks.

Reviewers: jmolloy, t.p.northover, rsmith, olista01, dnsampaio

Reviewed By: dnsampaio

Subscribers: dnsampaio, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74766
2020-03-19 12:08:12 +00:00
Lucas Prates f56550cf7f [ARM] Enabling range checks on Neon intrinsics' lane arguments
Summary:
Range checks were not properly performed in the lane arguments of Neon
intrinsics implemented based on splat operations. Calls to those
intrinsics where translated to `__builtin__shufflevector` calls directly
by the pre-processor through the arm_neon.h macros, missing the chance
for the proper range checks.

This patch enables the range check by introducing an auxiliary splat
instruction in arm_neon.td, delaying the translation to shufflevector
calls to CGBuiltin.cpp in clang after the checks were performed.

Reviewers: jmolloy, t.p.northover, rsmith, olista01, ostannard

Reviewed By: ostannard

Subscribers: ostannard, dnsampaio, danielkiss, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74619
2020-03-19 12:07:23 +00:00
Lucas Prates 7bf23563f4 Revert "[ARM] Setting missing isLaneQ attribute on Neon Intrisics definitions"
This reverts commit 62ab15ffa3.

Multiple commits were unintentionally squashed into this one. Reverting
so each of them can be pushed properly.
2020-03-19 12:01:13 +00:00
Lucas Prates 62ab15ffa3 [ARM] Setting missing isLaneQ attribute on Neon Intrisics definitions
Summary:
Some of the `*_laneq` intrinsics defined in arm_neon.td were missing the
setting of the `isLaneQ` attribute. This patch sets the attribute on the
related definitions, as they will be required to properly perform range
checks on their lane arguments.

Reviewers: jmolloy, t.p.northover, rsmith, olista01, dnsampaio

Reviewed By: dnsampaio

Subscribers: dnsampaio, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74616
2020-03-19 11:52:41 +00:00
Richard Smith f18233dad4 Fix -fsanitize=array-bound to treat T[0] union members as flexible array
members regardless of whether they're the last member of the union.
2020-03-18 15:47:24 -07:00
Alexey Bataev f3c857fae2 [OPENMP50]Add basic codegen support for ancestor device modifier.
If the ancestor device modifier is used and the value of the device
clause is evaluated to 1, the ancestor device shall be used for the
execution.
Since the reverse offloading is not supported yet, the target construct
execution is always initiated from the host, not from the device. So, if
the ancestor modifier is specified, just execute target region on the
host.
2020-03-18 17:53:18 -04:00
Alexey Bataev 2f8894a5b8 [OPENMP50]Add support for extended device clause in target directives.
Added parsing/sema/serialization support for extended device clause in
executable target directives.
2020-03-18 15:02:37 -04:00
Michael Liao 4cf01ed75e [hip] Revise `GlobalDecl` constructors. NFC.
Summary:
- https://reviews.llvm.org/D68578 revises the `GlobalDecl` constructors
  to ensure all GPU kernels have `ReferenceKenelKind` initialized
  properly with an explicit constructor and static one. But, there are
  lots of places using the implicit constructor triggering the assertion
  on non-GPU kernels. That's found in compilation of many tests and
  workloads.
- Fixing all of them may change more code and, more importantly, all of
  them assumes the default kernel reference kind. This patch changes
  that constructor to tell `CUDAGlobalAttr` and construct `GlobalDecl`
  properly.

Reviewers: yaxunl

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76344
2020-03-18 09:33:39 -04:00
Alexey Bataev b09cce07c7 [OPENMP50]Codegen for detach clause.
Implemented codegen for detach clause in task directives.
2020-03-18 09:01:17 -04:00
Sander de Smalen c5b81466c2 Reland D75470 [SVE] Auto-generate builtins and header for svld1.
Reworked the patch to avoid sharing a header (SVETypeFlags.h) between
include/clang/Basic and utils/TableGen/SveEmitter.cpp. Now the patch
generates the enum/flags which is included in TargetBuiltins.h.

Also renamed one of the SveEmitter options to be in line with MVE.

Summary:

This is a first patch in a series for the SveEmitter to generate the arm_sve.h
header file and builtins.

I've tried my best to strip down this patch as best as I could, but there
are still a few changes that are not necessarily exercised by the load intrinsics
in this patch, mostly around the SVEType class which has some common logic to
represent types from a type and prototype string. I thought it didn't make
much sense to remove that from this patch and split it up.
2020-03-18 11:16:28 +00:00
Michael Liao a2920c4ea9 [codegen] Fix one more case where `getGlobalDecl` should be used. NFC.
- After https://reviews.llvm.org/D68578, the implicit conversion from
  `FunctionDecl` to `GlobalDecl` needs replacing with `getGlobalDecl`;
  otherwise, assertion is triggered.
2020-03-17 17:56:47 -04:00
Jon Chesterfield c45eaeabb7 [Clang] Undef attribute for global variables
Summary:
[Clang] Attribute to allow defining undef global variables

Initializing global variables is very cheap on hosted implementations. The
C semantics of zero initializing globals work very well there. It is not
necessarily cheap on freestanding implementations. Where there is no loader
available, code must be emitted near the start point to write the appropriate
values into memory.

At present, external variables can be declared in C++ and definitions provided
in assembly (or IR) to achive this effect. This patch provides an attribute in
order to remove this reason for writing assembly for performance sensitive
freestanding implementations.

A close analogue in tree is LDS memory for amdgcn, where the kernel is
responsible for initializing the memory after it starts executing on the gpu.
Uninitalized variables in LDS are observably cheaper than zero initialized.

Patch is loosely based on the cuda __shared__ and opencl __local variable
implementation which also produces undef global variables.

Reviewers: kcc, rjmccall, rsmith, glider, vitalybuka, pcc, eugenis, vlad.tsyrklevich, jdoerfert, gregrodgers, jfb, aaron.ballman

Reviewed By: rjmccall, aaron.ballman

Subscribers: Anastasia, aaron.ballman, davidb, Quuxplusone, dexonsmith, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74361
2020-03-17 21:22:23 +00:00
Alexey Bataev 0f0564bb9a [OPENMP50]Initial support for detach clause in task directive.
Added parsing/sema/serialization support for detach clause.
2020-03-17 09:19:03 -04:00
Kerry McLaughlin af64948e2a [SVE][Inline-Asm] Add constraints for SVE ACLE types
Summary:
Adds the constraints described below to ensure that we
can tie variables of SVE ACLE types to operands in inline-asm:
 - y: SVE registers Z0-Z7
 - Upl: One of the low eight SVE predicate registers (P0-P7)
 - Upa: Full range of SVE predicate registers (P0-P15)

Reviewers: sdesmalen, huntergr, rovka, cameron.mcinally, efriedma, rengolin

Reviewed By: efriedma

Subscribers: miyuki, tschuett, rkruppe, psnobl, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75690
2020-03-17 11:04:19 +00:00
Evgenii Stepanov 2a3723ef11 [memtag] Plug in stack safety analysis.
Summary:
Run StackSafetyAnalysis at the end of the IR pipeline and annotate
proven safe allocas with !stack-safe metadata. Do not instrument such
allocas in the AArch64StackTagging pass.

Reviewers: pcc, vitalybuka, ostannard

Reviewed By: vitalybuka

Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, gilang, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73513
2020-03-16 16:35:25 -07:00
Sander de Smalen 6ce537ccfc Revert "[SVE] Auto-generate builtins and header for svld1."
This reverts commit 8b409eabaf.

Reverting this patch for now because it breaks some buildbots.
2020-03-16 15:22:15 +00:00
Sander de Smalen 8b409eabaf [SVE] Auto-generate builtins and header for svld1.
This is a first patch in a series for the SveEmitter to generate the arm_sve.h
header file and builtins.

I've tried my best to strip down this patch as best as I could, but there
are still a few changes that are not necessarily exercised by the load intrinsics
in this patch, mostly around the SVEType class which has some common logic to
represent types from a type and prototype string. I thought it didn't make
much sense to remove that from this patch and split it up.

Reviewers: efriedma, rovka, SjoerdMeijer, rsandifo-arm, rengolin

Reviewed By: SjoerdMeijer

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75470
2020-03-16 10:52:37 +00:00
Jun Ma 53c2e10fb8 [Coroutines] Do not evaluate InitListExpr of a co_return
Differential Revision: https://reviews.llvm.org/D76118
2020-03-16 12:42:44 +08:00
Sander de Smalen 5087ace651 [Clang][SVE] Parse builtin type string for scalable vectors
This patch adds 'q' to mean 'scalable vector' in the builtin
type string, and for SVE will return the matching builtin
type as defined in the C/C++ language extensions for SVE.

This patch also adds some scaffolding to generate the arm_sve.h
header file, and some builtin definitions (+CodeGen) to be able
to implement some simple masked load intrinsics that use the
ACLE types, such as:

 svint8_t test_svld1_s8(svbool_t pg, const int8_t *base) {
   return svld1_s8(pg, base);
 }

Reviewers: efriedma, rjmccall, rovka, rsandifo-arm, rengolin

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75298
2020-03-15 14:34:52 +00:00
Alexey Bataev b3998a0edb [OPENMP]Fix PR45047: Do not copy firstprivates in tasks twice.
Avoid copying of the orignal variable if it is going to be marked as
firstprivate in task regions. For taskloops, still need to copy the
non-trvially copyable variables to correctly construct them upon task
creation.
2020-03-13 18:04:16 -04:00
Nico Weber f82b32a51e Revert "Reland "[DebugInfo] Enable the debug entry values feature by default""
This reverts commit 5aa5c943f7.
Causes clang to assert, see
https://bugs.chromium.org/p/chromium/issues/detail?id=1061533#c4
for a repro.
2020-03-13 15:37:44 -04:00
Adrian Prantl 842ea709e4 Debug Info: Store the SDK in the DICompileUnit.
This is another intermediate step for PR44213
(https://bugs.llvm.org/show_bug.cgi?id=44213).

This stores the SDK *name* in the debug info, to make it possible to
`-fdebug-prefix-map`-replace the sysroot with a recognizable string
and allowing the debugger to find a fitting SDK relative to itself,
not the machine the executable was compiled on.

rdar://problem/51645582
2020-03-13 11:21:30 -07:00
Alexey Bataev 172f1460ae [OPENMP]Reduce number of captured global vars.
Try to reduce the number of global vars captured in the OpenMP regions
by capturing them only the regions, which mark them as not-shared.
2020-03-13 10:47:54 -04:00
Yaxun (Sam) Liu 0ffb12ca67 [HIP] Mark kernels with uniform-work-group-size=true
Differential Revision: https://reviews.llvm.org/D76076
2020-03-13 06:56:56 -04:00
Huihui Zhang 118abf2017 [SVE] Update API ConstantVector::getSplat() to use ElementCount.
Summary:
Support ConstantInt::get() and Constant::getAllOnesValue() for scalable
vector type, this requires ConstantVector::getSplat() to take in 'ElementCount',
instead of 'unsigned' number of element count.

This change is needed for D73753.

Reviewers: sdesmalen, efriedma, apazos, spatel, huntergr, willlovett

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74386
2020-03-12 13:22:41 -07:00
Reid Kleckner 26d254f084 Sink more Attr.h inline methods, NFC
This has very little impact on build time, but is a mechanical pre-req
to removing the OpenMPClause.h include, which matters. Most of these
pretty print methods require Expr to be complete.
2020-03-12 11:54:31 -07:00
Simon Pilgrim adeb8c5428 Replace getAs with castAs to fix null dereference static analyzer warning.
Use castAs as we know the cast should succeed (and castAs will assert if it doesn't) and we're dereferencing it directly in the BuildRCBlockVarRecordLayout call.
2020-03-12 18:52:58 +00:00
Simon Pilgrim 336530be07 CGOpenMPRuntime::emitDeclareTargetVarDefinition - fix static analyzer null dereference warning. NFCI.
All paths test for or dereference the VD pointer, so just assert that its not null.
2020-03-12 18:52:57 +00:00
Reid Kleckner e08464fb45 Avoid including FileManager.h from SourceManager.h
Most clients of SourceManager.h need to do things like turning source
locations into file & line number pairs, but this doesn't require
bringing in FileManager.h and LLVM's FS headers.

The main code change here is to sink SM::createFileID into the cpp file.
I reason that this is not performance critical because it doesn't happen
on the diagnostic path, it happens along the paths of macro expansion
(could be hot) and new includes (less hot).

Saves some includes:
    309 -    /usr/local/google/home/rnk/llvm-project/clang/include/clang/Basic/FileManager.h
    272 -    /usr/local/google/home/rnk/llvm-project/clang/include/clang/Basic/FileSystemOptions.h
    271 -    /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/VirtualFileSystem.h
    267 -    /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/FileSystem.h
    266 -    /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/Chrono.h

Differential Revision: https://reviews.llvm.org/D75406
2020-03-11 13:53:12 -07:00
Reid Kleckner c915cb957d Avoid including Module.h from ExternalASTSource.h
Module.h takes 86ms to parse, mostly parsing the class itself. Avoid it
if possible. ASTContext.h depends on ExternalASTSource.h.

A few NFC changes were needed to make this possible:

- Move ASTSourceDescriptor to Module.h. This needs Module to be
  complete, and seems more related to modules and AST files than
  external AST sources.
- Move "import complete" bit from Module* pointer int pair to
  NextLocalImport pointer. Required because PointerIntPair<Module*,...>
  requires Module to be complete, and now it may not be.

Reviewed By: aaron.ballman, hans

Differential Revision: https://reviews.llvm.org/D75784
2020-03-11 13:37:41 -07:00
Jin Lin a0cacb6054 Fix conflict value for metadata "Objective-C Garbage Collection" in the mix of swift and Objective-C bitcode
Summary:
The change is to fix conflict value for metadata "Objective-C Garbage Collection" in the mix of swift and Objective-C bitcode.
The purpose is to provide the support of LTO for swift and Objective-C mixed project.

Reviewers: rjmccall, ahatanak, steven_wu

Reviewed By: rjmccall, steven_wu

Subscribers: manmanren, mehdi_amini, hiraditya, dexonsmith, llvm-commits, jinlin

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71219
2020-03-11 13:26:06 -07:00
Akira Hatanaka 37fa9d65ea [CodeGen][ObjC] Don't extend lifetime of ObjC pointers passed to calls
to __builtin_os_log_format if ARC isn't enabled

Fixes a bug introduced in this commit:
f4d791f833

rdar://problem/60301219
2020-03-10 22:10:32 -07:00
Erik Pilkington 75af694a6d [CodeGenObjC] Place property names in __objc_methname
This allows the property name to deduplicate with the accessor method name.
rdar://58927964
2020-03-10 14:31:00 -07:00
Akira Hatanaka 40568fec7e [CodeGen] Emit destructor calls to destruct compound literals
Fix a bug in IRGen where it wasn't destructing compound literals in C
that are ObjC pointer arrays or non-trivial structs. Also diagnose jumps
that enter or exit the lifetime of the compound literals.

rdar://problem/51867864

Differential Revision: https://reviews.llvm.org/D64464
2020-03-10 14:08:28 -07:00
Mikhail Maltsev 47edf5bafb [ARM,CDE] Generalize MVE intrinsics infrastructure to support CDE
Summary:
This patch generalizes the existing code to support CDE intrinsics
which will share some properties with existing MVE intrinsics
(some of the intrinsics will be polymorphic and accept/return values
of MVE vector types).
Specifically the patch:
* Adds new tablegen backends -gen-arm-cde-builtin-def,
  -gen-arm-cde-builtin-codegen, -gen-arm-cde-builtin-sema,
  -gen-arm-cde-builtin-aliases, -gen-arm-cde-builtin-header based on
  existing MVE backends.
* Renames the '__clang_arm_mve_alias' attribute into
  '__clang_arm_builtin_alias' (it will be used with CDE intrinsics as
  well as MVE intrinsics)
* Implements semantic checks for the coprocessor argument of the CDE
  intrinsics as well as the existing coprocessor intrinsics.
* Adds one CDE intrinsic __arm_cx1 to test the above changes

Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen

Reviewed By: simon_tatham

Subscribers: sdesmalen, mgorny, kristof.beyls, danielkiss, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75850
2020-03-10 14:03:16 +00:00
Djordje Todorovic 5aa5c943f7 Reland "[DebugInfo] Enable the debug entry values feature by default"
Differential Revision: https://reviews.llvm.org/D73534
2020-03-10 09:15:06 +01:00
Erik Pilkington 7fbf15a8f2 [CodeGenObjC] Privatize some ObjC metadata symbols
Nobody needs these symbols, so there isn't any benefit in including them. This
saves some code-size in Objective-C binaries. Partially reverts:
https://reviews.llvm.org/D61454. rdar://56579760

Differential revision: https://reviews.llvm.org/D75491
2020-03-09 15:40:24 -07:00
Alexey Bataev 6309334b95 [OPENMP50]Codegen for depobj dependency kind.
Implemented codegen for depobj modifier in depend clauses.
2020-03-09 17:46:06 -04:00
Yaxun (Sam) Liu 22c457a869 [HIP] Fix device stub name
HIP emits a device stub function for each kernel in host code.

The HIP debugger requires device stub function to have a different unmangled name as the kernel.

Currently the name of the device stub function is the mangled name with a postfix .stub. However,
this does not work with the HIP debugger since the unmangled name is the same as the kernel.

This patch adds prefix __device__stub__ to the unmangled name of the device stub before mangling,
therefore the device stub function has a valid mangled name which is different than the device kernel
name. The device side kernel name is kept unchanged. kernels with extern "C" also gets the prefix added
to the corresponding device stub function.

Differential Revision: https://reviews.llvm.org/D68578
2020-03-09 16:40:05 -04:00
Krzysztof Parzyszek d0ca1041ba [Hexagon] Refactor handling of circular load/store builtins, NFC 2020-03-09 14:40:08 -05:00
Erich Keane 7b66160828 Fix Target Multiversioning renaming.
The initial implementation only did 'first declaration renaming' when
a default version came after. This is insufficient in cases where a
default does not exist, so this patch makes sure that we do the renaming
in all cases.

This renaming is necessary because we emit the first declaration before
knowing that it IS a target multiversion function, which would change
its name. The second declaration (the one that caused the
multiversioning) then needs to make sure that the first one has its name
changed to be consistent with the resolver usage.
2020-03-09 08:29:18 -07:00
Djordje Todorovic c15c68abdc [CallSiteInfo] Enable the call site info only for -g + optimizations
Emit call site info only in the case of '-g' + 'O>0' level.

Differential Revision: https://reviews.llvm.org/D75175
2020-03-09 12:12:44 +01:00
Yaxun (Sam) Liu 29e1a16be8 [NFC] Let mangler accept GlobalDecl
Differential Revision: https://reviews.llvm.org/D75700
2020-03-07 23:51:41 -05:00
Matt Arsenault a4e71f01c0 Assume ieee behavior without denormal-fp-math attribute 2020-03-07 12:10:56 -05:00
Akira Hatanaka f4d791f833 [CodeGen][ObjC] Extend lifetime of ObjC pointers passed to calls to
__builtin_os_log_format

This is needed to keep all the objects, including temporaries returned
by function calls, written to the buffer alive until os_log_pack_send is
called.

rdar://problem/60105410
2020-03-06 16:46:50 -08:00
Thomas Lively d43fcd0c04 [WebAssembly] Add SIMD integer min/max builtins
Summary:
Although SIMD integer min/max operations can be expressed using the ?:
operator in C++, that operator is disallowed for vectors in C. As a
workaround, this change introduces new WebAssembly-specific builtin
functions that lower to the desired vector icmp/select sequences.

Reviewers: aheejin, dschuff, kripken

Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75770
2020-03-06 14:28:52 -08:00
Alexey Bataev 5dadf577d5 [OPENMP50]Add 'depobj' modifier in 'depend' clauses.
Added basic support (parsing/sema/serialization) for depobj dependency
kind in depend clauses.
2020-03-06 11:44:57 -05:00
Alexey Bataev 8d7b118875 [OPENMP50]Add codegen for update clause in depobj directive.
Added codegen for update clause in depobj. Reads the number of the
elements from the first element and updates flags for each element in
the loop.
```
omp_depend_t x;
kmp_depend_info *base = (kmp_depend_info *)x;
intptr_t num = x[-1].base_addr;
kmp_depend_info *end = x + num;
kmp_depend_info *el = base;
do {
  el.flags = new_flag;
  el = &el[1];
} while (el != end);
```
2020-03-05 14:31:07 -05:00
Alexey Bataev ea5b3ef593 [OPENMP50]Skip the first element when storing the list of dependencies
in depobj object.

The first element in the list of the dependencies is used for internal
purposes to store the number of the elements in the provided list.
The first element now is skipped and depobj object poits exactly to the
list of dependencies.
2020-03-05 14:26:07 -05:00
Adrian Prantl 314b9278f0 Revert "[CGBlocks] Improve line info in backtraces containing *_helper_block"
Block copy/destroy helpers are now linkonce_odr functions, meant to be uniqued, and thus attaching debug information from one translation unit (or even just from one instance of many inside one translation unit) would be misleading and wrong in the general case.

This effectively reverts commit 9c6b6826ce.

<rdar://problem/59137040>

Differential Revision: https://reviews.llvm.org/D75615
2020-03-05 09:58:42 -08:00
Alexey Bataev b27ff4d07d [OPENMP50]Codegen for 'destroy' clause in depobj directive.
If the destroy clause is appplied, the previously allocated memory for
the dependency object must be destroyed.
2020-03-04 16:30:34 -05:00
Alexey Bataev e46f0fee30 [OPENMP50]Codegen for 'depend' clause in depobj directive.
Added codegen for 'depend' clause in depobj directive. The depend clause
is emitted as kmp_depend_info <deps>[<number_of_items_in_clause> + 1]. The
first element in this array is reserved for storing the number of
elements in this array: <deps>[0].base_addr =
<number_of_items_in_clause>;

This extra element is required to implement 'update' and 'destroy'
clauses. It is required to know the size of array to destroy it
correctly and to update depency kind.
2020-03-04 15:01:53 -05:00
hsmahesha cac068600e [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code
Summary:
hip-pinned-shadow global var should remain in the final code object irrespective
of whether it is used or not within the code. Add it to used list, so that it
will not get eliminated when it is unused.

Reviewers: yaxunl, tra, hliao

Reviewed By: yaxunl

Subscribers: hliao, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75402
2020-03-04 10:54:26 +05:30
Alexey Bataev 375437ab92 [OPENMP50]Support 'destroy' clause on 'depobj' directives.
Added basic support (parsing/sema/serialization) for 'destroy' clause in
depobj directives.
2020-03-02 14:40:53 -05:00
Alexey Bataev c112e941a0 [OPENMP50]Add basic support for depobj construct.
Added basic parsing/sema/serialization support for depobj directive.
2020-03-02 13:10:32 -05:00
Simon Pilgrim dc8680eceb [CodeGenPGO] Fix shadow variable warning. NFC. 2020-03-02 15:06:34 +00:00
Simon Pilgrim 736385c0b4 EHScopeStack::Cleanup has virtual functions so the destructor should be too.
Fixes cppcheck warning.
2020-03-02 15:06:34 +00:00
Simon Pilgrim 842c5c7994 Fix shadow variable warning. NFC. 2020-03-02 11:41:20 +00:00
Awanish Pandey 7a42babeb8 Reland "[DebugInfo][clang][DWARF5]: Added support for debuginfo generation for defaulted parameters
in C++ templates."

This was reverted in 802b22b5c8 due to
missing .bc file and a chromium bot failure.
https://bugs.chromium.org/p/chromium/issues/detail?id=1057559#c1
This revision address both of them.

Summary:
This patch adds support for debuginfo generation for defaulted
parameters in clang and also extends corresponding DebugMetadata/IR to support this feature.

Reviewers: probinson, aprantl, dblaikie

Reviewed By: aprantl, dblaikie

Differential Revision: https://reviews.llvm.org/D73462
2020-03-02 16:45:48 +05:30
Hans Wennborg 802b22b5c8 Revert "[DebugInfo][clang][DWARF5]: Added support for debuginfo generation for defaulted parameters"
The Bitcode/DITemplateParameter-5.0.ll test is failing:

FAIL: LLVM :: Bitcode/DITemplateParameter-5.0.ll (5894 of 36324)
******************** TEST 'LLVM :: Bitcode/DITemplateParameter-5.0.ll' FAILED ********************
Script:
--
: 'RUN: at line 1';   /usr/local/google/home/thakis/src/llvm-project/out/gn/bin/llvm-dis -o - /usr/local/google/home/thakis/src/llvm-project/llvm/test/Bitcode/DITemplateParameter-5.0.ll.bc | /usr/local/google/home/thakis/src/llvm-project/out/gn/bin/FileCheck /usr/local/google/home/thakis/src/llvm-project/llvm/test/Bitcode/DITemplateParameter-5.0.ll
--
Exit Code: 2

Command Output (stderr):
--

It looks like the Bitcode/DITemplateParameter-5.0.ll.bc file was never checked in.

This reverts commit c2b437d53d.
2020-03-02 09:30:52 +01:00
Awanish Pandey c2b437d53d [DebugInfo][clang][DWARF5]: Added support for debuginfo generation for defaulted parameters
in C++ templates.

Summary:
This patch adds support for debuginfo generation for defaulted
parameters in clang and also extends corresponding DebugMetadata/IR to support this feature.

Reviewers: probinson, aprantl, dblaikie

Reviewed By: aprantl, dblaikie

Differential Revision: https://reviews.llvm.org/D73462
2020-03-02 12:33:05 +05:30
Simon Pilgrim 7e9747b50b [X86][F16C] Remove cvtph2ps intrinsics and use generic half2float conversion (PR37554)
This removes everything but int_x86_avx512_mask_vcvtph2ps_512 which provides the SAE variant, but even this can use the fpext generic if the rounding control is the default.

Differential Revision: https://reviews.llvm.org/D75162
2020-02-29 18:57:35 +00:00
Vedant Kumar dd1ea9de2e Reland: [Coverage] Revise format to reduce binary size
Try again with an up-to-date version of D69471 (99317124 was a stale
revision).

---

Revise the coverage mapping format to reduce binary size by:

1. Naming function records and marking them `linkonce_odr`, and
2. Compressing filenames.

This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB)
and speeds up end-to-end single-threaded report generation by 10%. For
reference the compressed name data in llc is 81MB (__llvm_prf_names).

Rationale for changes to the format:

- With the current format, most coverage function records are discarded.
  E.g., more than 97% of the records in llc are *duplicate* placeholders
  for functions visible-but-not-used in TUs. Placeholders *are* used to
  show under-covered functions, but duplicate placeholders waste space.

- We reached general consensus about giving (1) a try at the 2017 code
  coverage BoF [1]. The thinking was that using `linkonce_odr` to merge
  duplicates is simpler than alternatives like teaching build systems
  about a coverage-aware database/module/etc on the side.

- Revising the format is expensive due to the backwards compatibility
  requirement, so we might as well compress filenames while we're at it.
  This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB).

See CoverageMappingFormat.rst for the details on what exactly has
changed.

Fixes PR34533 [2], hopefully.

[1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html
[2] https://bugs.llvm.org/show_bug.cgi?id=34533

Differential Revision: https://reviews.llvm.org/D69471
2020-02-28 18:12:04 -08:00
Vedant Kumar 3388871714 Revert "[Coverage] Revise format to reduce binary size"
This reverts commit 99317124e1. This is
still busted on Windows:

http://lab.llvm.org:8011/builders/lld-x86_64-win7/builds/40873

The llvm-cov tests report 'error: Could not load coverage information'.
2020-02-28 18:03:15 -08:00
Vedant Kumar 99317124e1 [Coverage] Revise format to reduce binary size
Revise the coverage mapping format to reduce binary size by:

1. Naming function records and marking them `linkonce_odr`, and
2. Compressing filenames.

This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB)
and speeds up end-to-end single-threaded report generation by 10%. For
reference the compressed name data in llc is 81MB (__llvm_prf_names).

Rationale for changes to the format:

- With the current format, most coverage function records are discarded.
  E.g., more than 97% of the records in llc are *duplicate* placeholders
  for functions visible-but-not-used in TUs. Placeholders *are* used to
  show under-covered functions, but duplicate placeholders waste space.

- We reached general consensus about giving (1) a try at the 2017 code
  coverage BoF [1]. The thinking was that using `linkonce_odr` to merge
  duplicates is simpler than alternatives like teaching build systems
  about a coverage-aware database/module/etc on the side.

- Revising the format is expensive due to the backwards compatibility
  requirement, so we might as well compress filenames while we're at it.
  This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB).

See CoverageMappingFormat.rst for the details on what exactly has
changed.

Fixes PR34533 [2], hopefully.

[1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html
[2] https://bugs.llvm.org/show_bug.cgi?id=34533

Differential Revision: https://reviews.llvm.org/D69471
2020-02-28 17:33:25 -08:00
Vedant Kumar c54597b99d [ubsan] Add support for -fsanitize=nullability-* suppressions
rdar://59402904
2020-02-28 14:30:40 -08:00
cchen 6ee6fa28a7 [OpenMP5.0] Allow pointer arithmetic in motion/map clause, by Chi Chun
Chen

Summary:
Base declaration in pointer arithmetic expression is determined by
binary search with type information. Take "int *a, *b; *(a+*b)" as an
example, we determine the base by checking the type of LHS and RHS. In
this case the type of LHS is "int *", the type of RHS is "int",
therefore, we know that we need to visit LHS in order to find base
declaration.

Reviewers: ABataev, jdoerfert

Reviewed By: ABataev

Subscribers: guansong, cfe-commits, sandoval, dreachem

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75077
2020-02-28 15:07:32 -05:00
Reid Kleckner 4c2a6567bb Avoid ASTContext.h -> TargetInfo.h dep
This has been done before in 2008: ab13857072
But these things regress easily.
Move some things out of line.

Saves 316 includes + transitive stuff:
    316 -    ../clang/include/clang/Basic/TargetOptions.h
    316 -    ../clang/include/clang/Basic/TargetInfo.h
    316 -    ../clang/include/clang/Basic/TargetCXXABI.h
    316 -    ../clang/include/clang/Basic/OpenCLOptions.h
    316 -    ../clang/include/clang/Basic/OpenCLExtensions.def
    302 -    ../llvm/include/llvm/Target/TargetOptions.h
    302 -    ../llvm/include/llvm/Support/CodeGen.h
    302 -    ../llvm/include/llvm/MC/MCTargetOptions.h
    302 -    ../llvm/include/llvm/ADT/FloatingPointMode.h
    302 -    ../clang/include/clang/Basic/XRayInstr.h
    302 -    ../clang/include/clang/Basic/DebugInfoOptions.h
    302 -    ../clang/include/clang/Basic/CodeGenOptions.h
    302 -    ../clang/include/clang/Basic/CodeGenOptions.def
    257 -    ../llvm/include/llvm/Support/Regex.h
     79 -    ../llvm/include/llvm/ADT/SmallSet.h
     68 -    MSVCSTL/include/set
     66 -    ../llvm/include/llvm/ADT/SmallPtrSet.h
     62 -    ../llvm/include/llvm/ADT/StringSwitch.h
2020-02-27 14:35:00 -08:00
Reid Kleckner 86565c1309 Avoid SourceManager.h include in RawCommentList.h, add missing incs
SourceManager.h includes FileManager.h, which is expensive due to
dependencies on LLVM FS headers.

Remove dead BeforeThanCompare specialization.

Sink ASTContext::addComment to cpp file.

This reduces the time to compile a file that does nothing but include
ASTContext.h from ~3.4s to ~2.8s for me.

Saves these includes:
    219 -    ../clang/include/clang/Basic/SourceManager.h
    204 -    ../clang/include/clang/Basic/FileSystemOptions.h
    204 -    ../clang/include/clang/Basic/FileManager.h
    165 -    ../llvm/include/llvm/Support/VirtualFileSystem.h
    164 -    ../llvm/include/llvm/Support/SourceMgr.h
    164 -    ../llvm/include/llvm/Support/SMLoc.h
    161 -    ../llvm/include/llvm/Support/Path.h
    141 -    ../llvm/include/llvm/ADT/BitVector.h
    128 -    ../llvm/include/llvm/Support/MemoryBuffer.h
    124 -    ../llvm/include/llvm/Support/FileSystem.h
    124 -    ../llvm/include/llvm/Support/Chrono.h
    124 -    .../MSVCSTL/include/stack
    122 -    ../llvm/include/llvm-c/Types.h
    122 -    ../llvm/include/llvm/Support/NativeFormatting.h
    122 -    ../llvm/include/llvm/Support/FormatProviders.h
    122 -    ../llvm/include/llvm/Support/CBindingWrapping.h
    122 -    .../MSVCSTL/include/xtimec.h
    122 -    .../MSVCSTL/include/ratio
    122 -    .../MSVCSTL/include/chrono
    121 -    ../llvm/include/llvm/Support/FormatVariadicDetails.h
    118 -    ../llvm/include/llvm/Support/MD5.h
    109 -    .../MSVCSTL/include/deque
    105 -    ../llvm/include/llvm/Support/Host.h
    105 -    ../llvm/include/llvm/Support/Endian.h

Reviewed By: aaron.ballman, hans

Differential Revision: https://reviews.llvm.org/D75279
2020-02-27 13:49:40 -08:00
Dan Gohman 00072c08c7 [WebAssembly] Mangle the argc/argv `main` as `__wasm_argc_argv`.
WebAssembly enforces a rule that caller and callee signatures must
match. This means that the traditional technique of passing `main`
`argc` and `argv` even when it doesn't need them doesn't work.

Currently the backend renames `main` to `__original_main`, however this
doesn't interact well with LTO'ing libc, and the name isn't intuitive.
This patch allows us to transition to `__main_argc_argv` instead.

This implements the proposal in
https://github.com/WebAssembly/tool-conventions/pull/134
with a flag to disable it when targeting Emscripten, though this is
expected to be temporary, as discussed in the proposal comments.

Differential Revision: https://reviews.llvm.org/D70700
2020-02-27 07:55:36 -08:00
Roman Lebedev 3dd5a298bf
[clang] Annotating C++'s `operator new` with more attributes
Summary:
Right now we annotate C++'s `operator new` with `noalias` attribute,
which very much is healthy for optimizations.

However as per [[ http://eel.is/c++draft/basic.stc.dynamic.allocation | `[basic.stc.dynamic.allocation]` ]],
there are more promises on global `operator new`, namely:
* non-`std::nothrow_t` `operator new` *never* returns `nullptr`
* If `std::align_val_t align` parameter is taken, the pointer will also be `align`-aligned
* ~~global `operator new`-returned pointer is `__STDCPP_DEFAULT_NEW_ALIGNMENT__`-aligned ~~ It's more caveated than that.

Supplying this information may not cause immediate landslide effects
on any specific benchmarks, but it for sure will be healthy for optimizer
in the sense that the IR will better reflect the guarantees provided in the source code.

The caveat is `-fno-assume-sane-operator-new`, which currently prevents emitting `noalias`
attribute, and is automatically passed by Sanitizers ([[ https://bugs.llvm.org/show_bug.cgi?id=16386 | PR16386 ]]) - should it also cover these attributes?
The problem is that the flag is back-end-specific, as seen in `test/Modules/explicit-build-flags.cpp`.
But while it is okay to add `noalias` metadata in backend, we really should be adding at least
the alignment metadata to the AST, since that allows us to perform sema checks on it.

Reviewers: erichkeane, rjmccall, jdoerfert, eugenis, rsmith

Reviewed By: rsmith

Subscribers: xbolva00, jrtc27, atanasyan, nlopes, cfe-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D73380
2020-02-26 01:37:17 +03:00
Yaxun (Sam) Liu a57d9652a0 Make __builtin_amdgcn_dispatch_ptr dereferenceable and align at 4
Differential Revision: https://reviews.llvm.org/D75028
2020-02-25 13:58:20 -05:00
Rong Xu 11857d4994 [remark][diagnostics] [codegen] Fix PR44896
This patch fixes PR44896. For IR input files, option fdiscard-value-names
should be ignored as we need named values in loadModule().
Commit 60d3947922 sets this option after loadModule() where valued names
already created. This creates an inconsistent state in setNameImpl()
that leads to a seg fault.
This patch forces fdiscard-value-names to be false for IR input files.

This patch also emits a warning of "ignoring -fdiscard-value-names" if
option fdiscard-value-names is explictly enabled in the commandline for
IR input files.

Differential Revision: https://reviews.llvm.org/D74878
2020-02-25 08:15:17 -08:00
Bill Wendling 50cac24877 Support output constraints on "asm goto"
Summary:
Clang's "asm goto" feature didn't initially support outputs constraints. That
was the same behavior as gcc's implementation. The decision by gcc not to
support outputs was based on a restriction in their IR regarding terminators.
LLVM doesn't restrict terminators from returning values (e.g. 'invoke'), so
it made sense to support this feature.

Output values are valid only on the 'fallthrough' path. If an output value's used
on an indirect branch, then it's 'poisoned'.

In theory, outputs *could* be valid on the 'indirect' paths, but it's very
difficult to guarantee that the original semantics would be retained. E.g.
because indirect labels could be used as data, we wouldn't be able to split
critical edges in situations where two 'callbr' instructions have the same
indirect label, because the indirect branch's destination would no longer be
the same.

Reviewers: jyknight, nickdesaulniers, hfinkel

Reviewed By: jyknight, nickdesaulniers

Subscribers: MaskRay, rsmith, hiraditya, llvm-commits, cfe-commits, craig.topper, rnk

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D69876
2020-02-24 18:51:29 -08:00
Craig Topper 727328433a [X86] Add back fmaddsub intrinsics to work towards fixing the strict fp implementation
Previously we emitted an fmadd and a fmadd+fneg and combined them with a shufflevector. But this doesn't follow the correct exception behavior for unselected elements so the backend can't merge them into the fmaddsub/fmsubadd instructions.

This patch restores the the fmaddsub intrinsics so we don't have two arithmetic operations. We lose out on optimization opportunity in the non-strict FP case, but I don't think this is a big loss. If someone gives us a test case we can look into adding instcombine/dagcombine improvements. I'd rather not have the frontend do completely different things for strict and non-strict.

This still has problems because target specific intrinsics don't support strict semantics yet. We also still have all of the problems with masking. But we at least generate the right instruction in constrained mode now.

Differential Revision: https://reviews.llvm.org/D74268
2020-02-24 12:07:21 -08:00
Xiangling Liao 8bee52bdb5 [AIX][Frontend] C++ ABI customizations for AIX boilerplate
This PR enables "XL" C++ ABI in frontend AST to IR codegen. And it is driven by
static init work. The current kind in Clang by default is Generic Itanium, which
has different behavior on static init with IBM xlclang compiler on AIX.

Differential Revision: https://reviews.llvm.org/D74015
2020-02-24 10:26:51 -05:00
Johannes Doerfert 4b540fa8a1 [OpenMP][NFC] Remove leftover debug messages 2020-02-20 20:28:42 -06:00
Djordje Todorovic 2f215cf36a Revert "Reland "[DebugInfo] Enable the debug entry values feature by default""
This reverts commit rGfaff707db82d.
A failure found on an ARM 2-stage buildbot.
The investigation is needed.
2020-02-20 14:41:39 +01:00
Roman Lebedev 9ea5d17cc9
[Sema] Demote call-site-based 'alignment is a power of two' check for AllocAlignAttr into a warning
Summary:
As @rsmith notes in https://reviews.llvm.org/D73020#inline-672219
while that is certainly UB land, it may not be actually reachable at runtime, e.g.:
```
template<int N> void *make() {
  if ((N & (N-1)) == 0)
    return operator new(N, std::align_val_t(N));
  else
    return operator new(N);
}
void *p = make<7>();
```
and we shouldn't really error-out there.

That being said, i'm not really following the logic here.
Which ones of these cases should remain being an error?

Reviewers: rsmith, erichkeane

Reviewed By: erichkeane

Subscribers: cfe-commits, rsmith

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73996
2020-02-20 16:39:26 +03:00
Reid Kleckner 0edb212925 [MS] Mark vectorcall FP and vector args inreg
This has no effect on how LLVM passes the arguments, but it prevents
rewriteWithInAlloca from thinking that these parameters should be part
of the inalloca pack.

Follow-up to D72114

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D74452
2020-02-19 16:37:50 -08:00
Krzysztof Parzyszek b1d47467e2 [Hexagon] Change HVX vector predicate types from v512/1024i1 to v64/128i1
This commit removes the artificial types <512 x i1> and <1024 x i1>
from HVX intrinsics, and makes v512i1 and v1024i1 no longer legal on
Hexagon.

It may cause existing bitcode files to become invalid.

* Converting between vector predicates and vector registers must be
  done explicitly via vandvrt/vandqrt instructions (their intrinsics),
  i.e. (for 64-byte mode):
    %Q = call <64 x i1> @llvm.hexagon.V6.vandvrt(<16 x i32> %V, i32 -1)
    %V = call <16 x i32> @llvm.hexagon.V6.vandqrt(<64 x i1> %Q, i32 -1)

  The conversion intrinsics are:
    declare  <64 x i1> @llvm.hexagon.V6.vandvrt(<16 x i32>, i32)
    declare <128 x i1> @llvm.hexagon.V6.vandvrt.128B(<32 x i32>, i32)
    declare <16 x i32> @llvm.hexagon.V6.vandqrt(<64 x i1>, i32)
    declare <32 x i32> @llvm.hexagon.V6.vandqrt.128B(<128 x i1>, i32)
  They are all pure.

* Vector predicate values cannot be loaded/stored directly. This directly
  reflects the architecture restriction. Loading and storing or vector
  predicates must be done indirectly via vector registers and explicit
  conversions via vandvrt/vandqrt instructions.
2020-02-19 14:14:56 -06:00
Fady Ghanim ba3f863dfb [OpenMP][OMPIRBuilder] Introducing the `OMPBuilderCBHelpers` helper class
This patch introduces a new helper class `OMPBuilderCBHelpers`,
which will contain all reusable C/C++ language specific function-
alities required by the `OMPIRBuilder`.

Initially, this helper class contains the body and finalization
codegen functionalities implemented using callbacks which were
moved here for reusability among the different directives
implemented in the `OMPIRBuilder`, along with RAIIs for preserving
state prior to emitting outlined and/or inlined OpenMP regions.

In the future this helper class will also contain all the different
call backs required by OpenMP clauses/variable privatization.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D74562
2020-02-19 14:11:17 -06:00
Sander de Smalen 49b307e96d [AArch64][SVE] CodeGen of ACLE Builtin Types
Summary:
This patch adds codegen support for the ACLE builtin types added in:

  https://reviews.llvm.org/D62960

so that the ACLE builtin types are emitted as corresponding scalable
vector types in LLVM.

Reviewers: rsandifo-arm, rovka, rjmccall, efriedma

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, cfe-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74724
2020-02-19 12:10:47 +00:00
Djordje Todorovic faff707db8 Reland "[DebugInfo] Enable the debug entry values feature by default"
Differential Revision: https://reviews.llvm.org/D73534
2020-02-19 11:12:26 +01:00
Brian Gesiak 048239e46e [Coroutines][6/6] Clang schedules new passes
Summary:
Depends on https://reviews.llvm.org/D71902.

The last in a series of six patches that ports the LLVM coroutines
passes to the new pass manager infrastructure.

This patch has Clang schedule the new coroutines passes when the
`-fexperimental-new-pass-manager` option is used. With this and the
previous 5 patches, Clang is capable of building and successfully
running the test suite of large coroutines projects such as
https://github.com/lewissbaker/cppcoro with
`ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=On`.

Reviewers: GorNishanov, lewissbaker, chandlerc, junparser

Subscribers: EricWF, cfe-commits, llvm-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D71903
2020-02-19 01:03:28 -05:00
Djordje Todorovic 2bf44d11cb Revert "Reland "[DebugInfo] Enable the debug entry values feature by default""
This reverts commit rGa82d3e8a6e67.
2020-02-18 16:38:11 +01:00
Djordje Todorovic a82d3e8a6e Reland "[DebugInfo] Enable the debug entry values feature by default"
This patch enables the debug entry values feature.

  - Remove the (CC1) experimental -femit-debug-entry-values option
  - Enable it for x86, arm and aarch64 targets
  - Resolve the test failures
  - Leave the llc experimental option for targets that do not
    support the CallSiteInfo yet

Differential Revision: https://reviews.llvm.org/D73534
2020-02-18 14:41:08 +01:00
Simon Tatham c32af4447f [ARM,MVE] Add the vmovnbq,vmovntq intrinsic family.
Summary:
These are in some sense the inverse of vmovl[bt]q: they take a vector
of n wide elements and truncate each to half its width. So they only
write half a vector's worth of output data, and therefore they also
take an 'inactive' parameter to provide the other half of the data in
the output vector. So vmovnb overwrites the even lanes of 'inactive'
with the narrowed values from the main input, and vmovnt overwrites
the odd lanes.

LLVM had existing codegen which generates these MVE instructions in
response to IR that takes two vectors of wide elements, or two vectors
of narrow ones. But in this case, we have one vector of each. So my
clang codegen strategy is to narrow the input vector of wide elements
by simply reinterpreting it as the output type, and then we have two
narrow vectors and can represent the operation as a vector shuffle
that interleaves lanes from both of them.

Even so, not all the cases I needed ended up being selected as a
single MVE instruction, so I've added a couple more patterns that spot
combinations of the 'MVEvmovn' and 'ARMvrev32' SDNodes which can be
generated as a VMOVN instruction with operands swapped.

This commit adds the unpredicated forms only.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74337
2020-02-18 09:34:50 +00:00
Simon Tatham 5e97940cd2 [ARM,MVE] Add the vmovlbq,vmovltq intrinsic family.
Summary:
These intrinsics take a vector of 2n elements, and return a vector of
n wider elements obtained by sign- or zero-extending every other
element of the input vector. They're represented in IR as a
shufflevector that extracts the odd or even elements of the input,
followed by a sext or zext.

Existing LLVM codegen already matches this pattern and generates the
VMOVLB instruction (which widens the even-index input lanes). But no
existing isel rule was generating VMOVLT, so I've added some. However,
the new rules currently only work in little-endian MVE, because the
pattern they expect from isel lowering includes a bitconvert which
doesn't have the right semantics in big-endian.

The output of one existing codegen test is improved by those new
rules.

This commit adds the unpredicated forms only.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74336
2020-02-18 09:34:50 +00:00
Simon Tatham b6236e9479 [ARM,MVE] Add the vrev16q, vrev32q, vrev64q family.
Summary:
These intrinsics just reorder the lanes of a vector, so the natural IR
representation is as a shufflevector operation. Existing LLVM codegen
already recognizes those particular shufflevectors and generates the
MVE VREV instruction.

This commit adds the unpredicated forms only.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74334
2020-02-18 09:34:50 +00:00
Simon Tatham 90dc78bc62 [ARM,MVE] Add intrinsics for abs, neg and not operations.
Summary:
This commit adds the unpredicated intrinsics for the unary operations
vabsq (absolute value), vnegq (arithmetic negation), vmvnq (bitwise
complement), vqabsq and vqnegq (saturating versions of abs and neg for
signed integers, in the sense that they give INT_MAX if an input lane
is INT_MIN).

This is done entirely in clang: all of these operations have existing
isel patterns and existing tests for them on the LLVM side, so I've
just made clang emit the same IR that those patterns already match.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74331
2020-02-18 09:34:50 +00:00
Jim Lin 466f8843f5 [NFC] Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h,td}
2020-02-18 10:49:13 +08:00
Nicolai Hähnle bf197304a6 CGBuiltin: Remove uses of deprecated CreateCall overloads
Reviewers: t.p.northover

Subscribers: cfe-commits, llvm-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74673
2020-02-18 00:24:09 +01:00
Nikita Popov 3eaa53e805 Reapply "[IRBuilder] Virtualize IRBuilder"
Relative to the original commit, this fixes some warnings,
and is based on the deletion of the IRBuilder copy constructor
in D74693. The automatic copy constructor would no longer be
safe.

-----

Related llvm-dev thread:
http://lists.llvm.org/pipermail/llvm-dev/2020-February/138951.html

This patch moves the IRBuilder from templating over the constant
folder and inserter towards making both of these virtual.
There are a couple of motivations for this:

1. It's not possible to share code between use-sites that use
different IRBuilder folders/inserters (short of templating the code
and moving it into headers).
2. Methods currently defined on IRBuilderBase (which is not templated)
do not use the custom inserter, resulting in subtle bugs (e.g.
incorrect InstCombine worklist management). It would be possible to
move those into the templated IRBuilder, but...
3. The vast majority of the IRBuilder implementation has to live
in the header, because it depends on the template arguments.
4. We have many unnecessary dependencies on IRBuilder.h,
because it is not easy to forward-declare. (Significant parts of
the backend depend on it via TargetLowering.h, for example.)

This patch addresses the issue by making the following changes:

* IRBuilderDefaultInserter::InsertHelper becomes virtual.
  IRBuilderBase accepts a reference to it.
* IRBuilderFolder is introduced as a virtual base class. It is
 implemented by ConstantFolder (default), NoFolder and TargetFolder.
  IRBuilderBase has a reference to this as well.
* All the logic is moved from IRBuilder to IRBuilderBase. This means
  that methods can in the future replace their IRBuilder<> & uses
  (or other specific IRBuilder types) with IRBuilderBase & and thus
  be usable with different IRBuilders.
* The IRBuilder class is now a thin wrapper around IRBuilderBase.
  Essentially it only stores the folder and inserter and takes care
  of constructing the base builder.

What this patch doesn't do, but should be simple followups after this change:

* Fixing use of the inserter for creation methods originally defined
  on IRBuilderBase.
* Replacing IRBuilder<> uses in arguments with IRBuilderBase, where useful.
* Moving code from the IRBuilder header to the source file.

From the user perspective, these changes should be mostly transparent:
The only thing that consumers using a custom inserted may need to do is
inherit from IRBuilderDefaultInserter publicly and mark their InsertHelper
as public.

Differential Revision: https://reviews.llvm.org/D73835
2020-02-17 19:04:11 +01:00
Benjamin Kramer 5fc5c7db38 Strength reduce vectors into arrays. NFCI. 2020-02-17 15:37:35 +01:00
Yaxun (Sam) Liu fb44b9db95 [OpenCL][CUDA][HIP][SYCL] Add norecurse
norecurse function attr indicates the function is not called recursively
directly or indirectly.

Add norecurse to OpenCL functions, SYCL functions in device compilation
and CUDA/HIP kernels.

Although there is LLVM pass adding norecurse to functions, it only works
for whole-program compilation. Also FE adding norecurse can make that
pass run faster since functions with norecurse do not need to be checked
again.

Differential Revision: https://reviews.llvm.org/D73651
2020-02-16 20:41:00 -05:00
Nikita Popov 7c362b25d7 [IRBuilder] Fix unnecessary IRBuilder copies; NFC
Fix a few cases where an IRBuilder is passed to a helper function
by value, while a by reference pass was intended.
2020-02-16 17:57:18 +01:00
Nikita Popov af480e8c63 Revert "[IRBuilder] Virtualize IRBuilder"
This reverts commit 0765d3824d.
This reverts commit 1b04866a3d.

Relevant looking crashes observed on:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win
2020-02-16 17:01:10 +01:00
Nikita Popov 1b04866a3d [IRBuilder] Try to fix warnings
Try to fix -Wnon-virtual-dtor warnings that cause build failure
on clang-pcc64le-rhel.
2020-02-16 15:32:11 +01:00
Nikita Popov 0765d3824d [IRBuilder] Virtualize IRBuilder
Related llvm-dev thread:
http://lists.llvm.org/pipermail/llvm-dev/2020-February/138951.html

This patch moves the IRBuilder from templating over the constant
folder and inserter towards making both of these virtual.
There are a couple of motivations for this:

1. It's not possible to share code between use-sites that use
different IRBuilder folders/inserters (short of templating the code
and moving it into headers).
2. Methods currently defined on IRBuilderBase (which is not templated)
do not use the custom inserter, resulting in subtle bugs (e.g.
incorrect InstCombine worklist management). It would be possible to
move those into the templated IRBuilder, but...
3. The vast majority of the IRBuilder implementation has to live
in the header, because it depends on the template arguments.
4. We have many unnecessary dependencies on IRBuilder.h,
because it is not easy to forward-declare. (Significant parts of
the backend depend on it via TargetLowering.h, for example.)

This patch addresses the issue by making the following changes:

* IRBuilderDefaultInserter::InsertHelper becomes virtual.
  IRBuilderBase accepts a reference to it.
* IRBuilderFolder is introduced as a virtual base class. It is
 implemented by ConstantFolder (default), NoFolder and TargetFolder.
  IRBuilderBase has a reference to this as well.
* All the logic is moved from IRBuilder to IRBuilderBase. This means
  that methods can in the future replace their IRBuilder<> & uses
  (or other specific IRBuilder types) with IRBuilderBase & and thus
  be usable with different IRBuilders.
* The IRBuilder class is now a thin wrapper around IRBuilderBase.
  Essentially it only stores the folder and inserter and takes care
  of constructing the base builder.

What this patch doesn't do, but should be simple followups after this change:

* Fixing use of the inserter for creation methods originally defined
  on IRBuilderBase.
* Replacing IRBuilder<> uses in arguments with IRBuilderBase, where useful.
* Moving code from the IRBuilder header to the source file.

From the user perspective, these changes should be mostly transparent:
The only thing that consumers using a custom inserted may need to do is
inherit from IRBuilderDefaultInserter publicly and mark their InsertHelper
as public.

Differential Revision: https://reviews.llvm.org/D73835
2020-02-16 13:48:55 +01:00
Johannes Doerfert b86bf83c28 [FIX] Remove pointer in attribute to eliminate leaks (see D71830) 2020-02-15 18:09:54 -06:00
Fady Ghanim 7438059a90 [OpenMP][OMPIRBuilder] Add Directives (master and critical) to OMPBuilder.
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.

Also this patch modifies clang to use the new directives when  `-fopenmp-enable-irbuilder` commandline option is passed.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D72304
2020-02-15 01:15:45 -06:00
Johannes Doerfert 1228d42dda [OpenMP][Part 2] Use reusable OpenMP context/traits handling
This patch implements an almost complete handling of OpenMP
contexts/traits such that we can reuse most of the logic in Flang
through the OMPContext.{h,cpp} in llvm/Frontend/OpenMP.

All but construct SIMD specifiers, e.g., inbranch, and the device ISA
selector are define in `llvm/lib/Frontend/OpenMP/OMPKinds.def`. From
these definitions we generate the enum classes `TraitSet`,
`TraitSelector`, and `TraitProperty` as well as conversion and helper
functions in `llvm/lib/Frontend/OpenMP/OMPContext.{h,cpp}`.

The above enum classes are used in the parser, sema, and the AST
attribute. The latter is not a collection of multiple primitive variant
arguments that contain encodings via numbers and strings but instead a
tree that mirrors the `match` clause (see `struct OpenMPTraitInfo`).

The changes to the parser make it more forgiving when wrong syntax is
read and they also resulted in more specialized diagnostics. The tests
are updated and the core issues are detected as before. Here and
elsewhere this patch tries to be generic, thus we do not distinguish
what selector set, selector, or property is parsed except if they do
behave exceptionally, as for example `user={condition(EXPR)}` does.

The sema logic changed in two ways: First, the OMPDeclareVariantAttr
representation changed, as mentioned above, and the sema was adjusted to
work with the new `OpenMPTraitInfo`. Second, the matching and scoring
logic moved into `OMPContext.{h,cpp}`. It is implemented on a flat
representation of the `match` clause that is not tied to clang.
`OpenMPTraitInfo` provides a method to generate this flat structure (see
`struct VariantMatchInfo`) by computing integer score values and boolean
user conditions from the `clang::Expr` we keep for them.

The OpenMP context is now an explicit object (see `struct OMPContext`).
This is in anticipation of construct traits that need to be tracked. The
OpenMP context, as well as the `VariantMatchInfo`, are basically made up
of a set of active or respectively required traits, e.g., 'host', and an
ordered container of constructs which allows duplication. Matching and
scoring is kept as generic as possible to allow easy extension in the
future.

---

Test changes:

The messages checked in `OpenMP/declare_variant_messages.{c,cpp}` have
been auto generated to match the new warnings and notes of the parser.
The "subset" checks were reversed causing the wrong version to be
picked. The tests have been adjusted to correct this.
We do not print scores if the user did not provide one.
We print spaces to make lists in the `match` clause more legible.

Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim

Subscribers: merge_guards_bot, rampitec, mgorny, hiraditya, aheejin, fedor.sergeev, simoncook, bollu, guansong, dexonsmith, jfb, s.egerton, llvm-commits, cfe-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71830
2020-02-14 16:37:42 -06:00
Roger Ferrer Ibanez 2bef1c0e56 [OpenMP] Lower taskyield using OpenMP IR Builder
This is similar to D69828.

Special codegen for enclosing untied tasks is still done in clang.

Differential Revision: https://reviews.llvm.org/D70799
2020-02-14 11:35:17 +00:00
Roger Ferrer Ibanez a82f35e176 [OpenMP] Lower taskwait using OpenMP IR Builder
The code generation is exactly the same as it was.

But not that the special handling of untied tasks is still handled by
emitUntiedSwitch in clang.

Differential Revision: https://reviews.llvm.org/D69828
2020-02-14 09:53:02 +00:00
Fangrui Song 1d49eb00d9 [AsmPrinter] De-capitalize all AsmPrinter::Emit* but EmitInstruction
Similar to rL328848.
2020-02-13 17:06:24 -08:00
Alexey Bataev e0ca4792fa [OPENMP50]Add cancellation support in taskloop-based directives.
According to OpenMP 5.0, cancel and cancellation point constructs are
supported in taskloop directive. Added support for cancellation in
taskloop, master taskloop and parallel master taskloop.
2020-02-13 12:03:43 -05:00
Alexey Bataev 18789bfe3a [OPENMP50]Fix handling of clauses in parallel master taskloop directive.
We need to capture correctly the value of num_tasks clause and should
not try to emit the if clause at all in the task region.
2020-02-13 11:00:01 -05:00
Johannes Doerfert 70cac41a2b Reapply "[OpenMP][IRBuilder] Perform finalization (incl. outlining) late"
Reapply 8a56d64d76 with minor fixes.

The problem was that cancellation can cause new edges to the parallel
region exit block which is not outlined. The CodeExtractor will encode
the information which "exit" was taken as a return value. The fix is to
ensure we do not return any value from the outlined function, to prevent
control to value conversion we ensure a single exit block for the
outlined region.

This reverts commit 3aac953afa.
2020-02-12 22:29:07 -06:00
Johannes Doerfert 3aac953afa Revert "[OpenMP][IRBuilder] Perform finalization (incl. outlining) late"
This reverts commit 8a56d64d76.

Will be recommitted once the clang test problem is addressed.
2020-02-12 18:50:43 -06:00
Johannes Doerfert 8a56d64d76 [OpenMP][IRBuilder] Perform finalization (incl. outlining) late
In order to fix PR44560 and to prepare for loop transformations we now
finalize a function late, which will also do the outlining late. The
logic is as before but the actual outlining step happens now after the
function was fully constructed. Once we have loop transformations we
can apply them in the finalize step before the outlining.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D74372
2020-02-12 17:55:01 -06:00
Erik Pilkington e26c24b849 Revert "[IRGen] Emit lifetime intrinsics around temporary aggregate argument allocas"
This reverts commit fafc6e4fdf.

Should fix ppc stage2 failure: http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/23546

Conflicts:
	clang/lib/CodeGen/CGCall.cpp
2020-02-12 12:26:46 -08:00
Michael Liao f6a3ac150b Fix `-Wunused-variable` warning. NFC. 2020-02-12 12:45:14 -05:00
Djordje Todorovic 97ed706a96 Revert "[DebugInfo] Enable the debug entry values feature by default"
This reverts commit rG9f6ff07f8a39.

Found a test failure on clang-with-thin-lto-ubuntu buildbot.
2020-02-12 11:59:04 +01:00
jasonliu 55e2678fcd [clang] Add -fignore-exceptions
Summary:

This is trying to implement the functionality proposed in:
http://lists.llvm.org/pipermail/cfe-dev/2017-April/053417.html
An exception can throw, but no cleanup is going to happen.
A module compiled with exceptions on, can catch the exception throws
from module compiled with -fignore-exceptions.

The use cases for enabling this option are:
1. Performance analysis of EH instrumentation overhead
2. The ability to QA non EH functionality when EH functionality is not available.
3. User of EH enabled headers knows the calls won't throw in their program and
   wants the performance gain from ignoring EH construct.

The implementation tried to accomplish that by removing any landing pad code
 that might get generated.

Reviewed by: aaron.ballman

Differential Revision: https://reviews.llvm.org/D72644
2020-02-12 09:56:18 +00:00
Djordje Todorovic 9f6ff07f8a [DebugInfo] Enable the debug entry values feature by default
This patch enables the debug entry values feature.

  - Remove the (CC1) experimental -femit-debug-entry-values option
  - Enable it for x86, arm and aarch64 targets
  - Resolve the test failures
  - Leave the llc experimental option for targets that do not
    support the CallSiteInfo yet

Differential Revision: https://reviews.llvm.org/D73534
2020-02-12 10:25:14 +01:00
Reid Kleckner 2c6a3896ab Re-land "[MS] Overhaul how clang passes overaligned args on x86_32"
This brings back 2af74e27ed and reverts
eaabaf7e04.

The changes were correct, the code that was broken contained an ODR
violation that assumed that these types are passed equivalently:
  struct alignas(uint64_t) Wrapper { uint64_t P };
  void f(uint64_t p);
  void f(Wrapper p);

MSVC does not pass them the same way, and so clang-cl should not pass
them the same way either.
2020-02-11 16:49:28 -08:00
Ian Levesque 14f870366a [xray][clang] Always add xray-skip-entry/exit and xray-ignore-loops attrs
The function attributes xray-skip-entry, xray-skip-exit, and
xray-ignore-loops were only being applied if a function had an
xray-instrument attribute, but they should apply if xray is enabled
globally too.

Differential Revision: https://reviews.llvm.org/D73842
2020-02-11 14:00:41 -08:00
Alexey Bataev 2d4f80f78a [OPENMP50]Full handling of atomic_default_mem_order in requires
directive.

According to OpenMP 5.0, The atomic_default_mem_order clause specifies the default memory ordering behavior for atomic constructs that must be provided by an implementation. If the default memory ordering is specified as seq_cst, all atomic constructs on which memory-order-clause is not specified behave as if the seq_cst clause appears. If the default memory ordering is specified as relaxed, all atomic constructs on which memory-order-clause is not specified behave as if the relaxed clause appears.
If the default memory ordering is specified as acq_rel, atomic constructs on which memory-order-clause is not specified behave as if the release clause appears if the atomic write or atomic update operation is specified, as if the acquire clause appears if the atomic read operation is specified, and as if the acq_rel clause appears if the atomic captured update operation is specified.
2020-02-11 15:42:34 -05:00
Krzysztof Parzyszek 57148e0379 [Hexagon] Fix ABI info for returning HVX vectors 2020-02-11 12:38:54 -06:00
Justin Lebar 027eb71696 Use std::foo_t rather than std::foo in clang.
Summary: No functional change.

Reviewers: bkramer, MaskRay, martong, shafik

Subscribers: martong, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74414
2020-02-11 10:37:08 -08:00
Alexey Bataev 9a8defcc34 [OPENMP50]Add support for relaxed clause in atomic directive.
Added full support for relaxed clause.
2020-02-11 11:54:46 -05:00
Vedant Kumar 8b81ebfe7e [ubsan] Null-check and adjust TypeLoc before using it
Null-check and adjut a TypeLoc before casting it to a FunctionTypeLoc.
This fixes a crash in -fsanitize=nullability-return, and also makes the
location of the nonnull type available when the return type is adjusted.

rdar://59263039

Differential Revision: https://reviews.llvm.org/D74355
2020-02-10 14:10:06 -08:00
Alexey Bataev 9559834a5c [OPENMP50]Add support for 'release' clause.
Added full support for 'release' clause in flush|atomic directives.
2020-02-10 16:01:41 -05:00
Alexey Bataev 04a830f80a [OPENMP50]Support for acquire clause.
Added full support for acquire clause in flush|atomic directives.
2020-02-10 14:51:46 -05:00
Kadir Cetinkaya 5731b6672d
Revert "[OpenMP] Fix unused variable"
This breaks under asan, see http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/38597/steps/check-clang%20asan/logs/stdio

This reverts commit bb50454295.

Revert "[FIX] Ordering problem accidentally introduced with D72304"

This reverts commit 08c0a06d8f.

Revert "[OpenMP][OMPIRBuilder] Add Directives (master and critical) to OMPBuilder."

This reverts commit e8a436c5ea.
2020-02-10 16:34:59 +01:00
Michael Liao a067891389 [clang][codegen] Fix another lifetime emission on alloca on non-default address space.
- Lifetime intrinsics expect the pointer directly from alloca. Need
  extra handling for targets with alloca on non-default (or non-zero)
  address space.
2020-02-10 00:15:56 -05:00
serge_sans_paille e67cbac812 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with proper LiveIn
declaration, better option handling and more portable testing.

Differential Revision: https://reviews.llvm.org/D68720
2020-02-09 10:42:45 +01:00
serge-sans-paille 4546211600 Revert "Support -fstack-clash-protection for x86"
This reverts commit 0fd51a4554.

Failures:

http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/4354
2020-02-09 10:06:31 +01:00
serge_sans_paille 0fd51a4554 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with proper LiveIn
declaration, better option handling and more portable testing.

Differential Revision: https://reviews.llvm.org/D68720
2020-02-09 09:35:42 +01:00
fady e8a436c5ea [OpenMP][OMPIRBuilder] Add Directives (master and critical) to OMPBuilder.
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.

Also this patch modifies clang to use the new directives when  `-fopenmp-enable-irbuilder` commandline option is passed.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D72304
2020-02-08 18:55:48 -06:00
serge-sans-paille 658495e6ec Revert "Support -fstack-clash-protection for x86"
This reverts commit e229017732.

Failures:

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/2604
http://lab.llvm.org:8011/builders/llvm-clang-win-x-aarch64/builds/4308
2020-02-08 14:26:22 +01:00
serge_sans_paille e229017732 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with better option
handling and more portable testing

Differential Revision: https://reviews.llvm.org/D68720
2020-02-08 13:31:52 +01:00
Guillaume Chatelet d65bbf81f8 [clang] Add support for __builtin_memcpy_inline
Summary: This is a follow up on D61634 and the last step to implement http://lists.llvm.org/pipermail/llvm-dev/2019-April/131973.html

Reviewers: efriedma, courbet, tejohnson

Subscribers: hiraditya, cfe-commits, llvm-commits, jdoerfert, t.p.northover

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73543
2020-02-07 23:55:26 +01:00
Erik Pilkington fafc6e4fdf [IRGen] Emit lifetime intrinsics around temporary aggregate argument allocas
These temporaries are only used in the callee, and their memory can be reused
after the call is complete.

rdar://58552124

Differential revision: https://reviews.llvm.org/D74094
2020-02-07 14:39:31 -08:00
Alexey Bataev e8e05de08b [OPENMP50]Add codegen for acq_rel clause in atomic|flush directives.
Added codegen support for atomic|flush directives with acq_rel clause.
2020-02-07 15:05:09 -05:00
Nico Weber b03c3d8c62 Revert "Support -fstack-clash-protection for x86"
This reverts commit 4a1a0690ad.
Breaks tests on mac and win, see https://reviews.llvm.org/D68720
2020-02-07 14:49:38 -05:00
serge_sans_paille 4a1a0690ad Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with correct option
flags set.

Differential Revision: https://reviews.llvm.org/D68720
2020-02-07 19:54:39 +01:00
Alexey Bataev ea9166b5a8 [OPENMP50]Add parsing/sema for acq_rel clause.
Added basic support (representation + parsing/sema/(de)serialization)
for acq_rel clause in flush/atomic directives.
2020-02-07 09:21:10 -05:00
serge-sans-paille f6d98429fc Revert "Support -fstack-clash-protection for x86"
This reverts commit 39f50da2a3.

The -fstack-clash-protection is being passed to the linker too, which
is not intended.

Reverting and fixing that in a later commit.
2020-02-07 11:36:53 +01:00
Diogo Sampaio 9d869180c4 [ARM] Follow AACPS for preserving number of loads/stores of volatile bit-fields
Summary:
Following the AAPCS, every store to a volatile bit-field requires to generate one load of that field, even if all the bits are going to be replaced.
This patch allows the user to opt-in in following such rule, whenever the a.

AAPCS Release 2019Q1.1 (https://static.docs.arm.com/ihi0042/g/aapcs32.pdf)
section 8.1 Data Types, page 35, paragraph: Volatile bit-fields – preserving number and width of container accesses

```
When a volatile bit-field is written, and its container does not overlap with any non-bit-field member, its
container must be read exactly once and written exactly once using the access width appropriate to the
type of the container. The two accesses are not atomic.

```

Reviewers: lebedev.ri, ostannard, jfb, eli.friedman

Reviewed By: jfb

Subscribers: rsmith, rjmccall, dexonsmith, kristof.beyls, jfb, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D67399
2020-02-07 10:11:54 +00:00
serge_sans_paille 39f50da2a3 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

Differential Revision: https://reviews.llvm.org/D68720
2020-02-07 10:56:15 +01:00
Craig Topper 96400ae2a4 Recommit "[FPEnv][X86] Platform-specific builtin constrained FP enablement"
With REQUIRES: x86-register-target added to the tests.

Also remove some unneeded FIXMEs

But add a FIXME for bad IR generation for FMADDSUB/FMSUBADD with
constrained FP.

Original patch by Kevin P. Neal
2020-02-06 16:54:35 -08:00
Richard Smith 96c899449b C++ DR2026: static storage duration variables are not zeroed before
constant initialization.

Removing this zeroing regressed our code generation in a few cases, also
fixed here. We now compute whether a variable has constant destruction
even if it doesn't have a constant initializer, by trying to destroy a
default-initialized value, and skip emitting a trivial default
constructor for a variable even if it has non-trivial (but perhaps
constant) destruction.
2020-02-06 16:37:22 -08:00
Kevin P. Neal ad0e03fd4c Revert "[FPEnv][X86] Platform-specific builtin constrained FP enablement"
This reverts commit 208470dd5d.

Tests fail:
error: unable to create target: 'No available targets are compatible with triple "x86_64-apple-darwin"'

This happens on clang-hexagon-elf, clang-cmake-armv7-quick, and
clang-cmake-armv7-quick bots.

If anyone has any suggestions on why then I'm all ears.

Differential Revision: https://reviews.llvm.org/D73570

Revert "[FPEnv][X86] Speculative fix for failures introduced by eda495426."

This reverts commit 80e17e5fcc.

The speculative fix didn't solve the test failures on Hexagon, ARMv6, and
MSVC AArch64.
2020-02-06 19:17:14 -05:00
Kevin P. Neal 208470dd5d [FPEnv][X86] Platform-specific builtin constrained FP enablement
When constrained floating point is enabled the X86-specific builtins don't
use constrained intrinsics in some cases. Fix that.

Differential Revision: https://reviews.llvm.org/D73570
2020-02-06 14:20:44 -05:00
Vedant Kumar 65f0785fff [ubsan] Omit return value check when return block is unreachable
If the return block is unreachable, clang removes it in
CodeGenFunction::FinishFunction(). This removal can leave dangling
references to values defined in the return block if the return block has
successors, which it /would/ if UBSan's return value check is emitted.

In this case, as the UBSan check wouldn't be reachable, it's better to
simply not emit it.

rdar://59196131
2020-02-06 10:24:03 -08:00
Michael Liao 318d0ede57 Fix warning on unused variables. NFC. 2020-02-06 12:21:20 -05:00
shafik 428583dd22 [DebugInfo] Fix debug-info generation for block invocations so that we set the LinkageName
Currently when generating debug-info for a BlockDecl we are setting the Name to the mangled name and not setting the LinkageName.
This means we see the mangled name for block invcations ends up in DW_AT_Name and not in DW_AT_linkage_name.

This patch fixes this case so that we also set the LinkageName as well.

Differential Revision: https://reviews.llvm.org/D73282
2020-02-05 11:07:30 -08:00
Thomas Lively 8c3e6af71b [WebAssembly] Add experimental multivalue calling ABI
Summary:
For now, this ABI simply expands all possible aggregate arguments and
returns all possible aggregates directly. This ABI will change rapidly
as we prototype and benchmark a new ABI that takes advantage of
multivalue return and possibly other changes from the MVP ABI.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72972
2020-02-04 21:09:49 -08:00
Francis Visoiu Mistrih 7531a5039f [Remarks] Extend the RemarkStreamer to support other emitters
This extends the RemarkStreamer to allow for other emitters (e.g.
frontends, SIL, etc.) to emit remarks through a common interface.

See changes in llvm/docs/Remarks.rst for motivation and design choices.

Differential Revision: https://reviews.llvm.org/D73676
2020-02-04 17:16:02 -08:00
Kiran Chandramohan a969e051a5 [OpenMP] Add Flush directive to OpenMPIRBuilder
Add support for Flush in the OMPIRBuilder. This patch also adds changes
to clang to use the OMPIRBuilder when '-fopenmp-enable-irbuilder'
commandline option is used.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D70712
2020-02-04 22:48:02 +00:00
Matt Arsenault a3c814d234 Separately track input and output denormal mode
AMDGPU and x86 at least both have separate controls for whether
denormal results are flushed on output, and for whether denormals are
implicitly treated as 0 as an input. The current DAGCombiner use only
really cares about the input treatment of denormals.
2020-02-04 12:59:21 -05:00
Yonghong Song 9271cab270 [BPF] use base lvalue type for preserve_{struct,union}_access_index metadata
Linux commit
  1cf5b23988 (diff-289313b9fec99c6f0acfea19d9cfd949)
uses "#pragma clang attribute push (__attribute__((preserve_access_index)),
      apply_to = record)"
to apply CO-RE relocations to all records including the following pattern:
  #pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)
  typedef struct {
    int a;
  } __t;
  #pragma clang attribute pop
  int test(__t *arg) { return arg->a; }

The current approach to use struct type in the relocation record will
result in an anonymous struct, which make later type matching difficult
    in bpf loader. In fact, current BPF backend will fail the above program
with assertion:
  clang: ../lib/Target/BPF/BPFAbstractMemberAccess.cpp:796: ...
     Assertion `TypeName.size()' failed.

The patch use the base lvalue type for the "base" value to annotate
preservee_{struct,union}_access_index intrinsics. In the above example,
the type will be "__t" which preserved the type name.

Differential Revision: https://reviews.llvm.org/D73900
2020-02-04 09:28:30 -08:00
Jonas Paulsson 563e84790f [SystemZ] Support -msoft-float
This is needed when building the Linux kernel.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D72189
2020-02-04 10:32:45 -05:00
Fangrui Song dbc96b518b Revert "[CodeGenModule] Assume dso_local for -fpic -fno-semantic-interposition"
This reverts commit 789a46f2d7.

Accidentally committed.
2020-02-03 10:09:39 -08:00
Fangrui Song 789a46f2d7 [CodeGenModule] Assume dso_local for -fpic -fno-semantic-interposition
Summary:
Clang -fpic defaults to -fno-semantic-interposition (GCC -fpic defaults
to -fsemantic-interposition).
Users need to specify -fsemantic-interposition to get semantic
interposition behavior.

Semantic interposition is currently a best-effort feature. There may
still be some cases where it is not handled well.

Reviewers: peter.smith, rnk, serge-sans-paille, sfertile, jfb, jdoerfert

Subscribers: dschuff, jyknight, dylanmckay, nemanjai, jvesely, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, arphaman, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73865
2020-02-03 09:52:48 -08:00
Alexey Bataev a781521867 [OPENMP50]Codegen support for order(concurrent) clause.
Emit llvm parallel access metadata for the loops if they are marked as
order(concurrent).
2020-02-03 12:27:33 -05:00
Alexey Bataev cb8e69148d [OPENMP50]Basic parsing/sema analysis for order(concurrent) clause.
Added parsing/sema/serialization support for order(concurrent) clause in
loop|simd-based directives.
2020-02-03 10:31:02 -05:00
Johannes Doerfert 9dcfc7cd64 Revert "[OpenMP][OMPIRBuilder] Add Directives (master and critical) to OMPBuilder."
This reverts commit 1ca740387b.

The bots break [0], investigation is needed.

[0] http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/22899
2020-02-03 08:59:14 -06:00
Fady Ghanim 1ca740387b [OpenMP][OMPIRBuilder] Add Directives (master and critical) to OMPBuilder.
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.

Also this patch modifies clang to use the new directives when  `-fopenmp-enable-irbuilder` commandline option is passed.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D72304
2020-02-03 08:44:23 -06:00
Simon Tatham 961530fdc9 [ARM,MVE] Fix vreinterpretq in big-endian mode.
Summary:
In big-endian MVE, the simple vector load/store instructions (i.e.
both contiguous and non-widening) don't all store the bytes of a
register to memory in the same order: it matters whether you did a
VSTRB.8, VSTRH.16 or VSTRW.32. Put another way, the in-register
formats of different vector types relate to each other in a different
way from the in-memory formats.

So, if you want to 'bitcast' or 'reinterpret' one vector type as
another, you have to carefully specify which you mean: did you want to
reinterpret the //register// format of one type as that of the other,
or the //memory// format?

The ACLE `vreinterpretq` intrinsics are specified to reinterpret the
register format. But I had implemented them as LLVM IR bitcast, which
is specified for all types as a reinterpretation of the memory format.
So a `vreinterpretq` intrinsic, applied to values already in registers,
would code-generate incorrectly if compiled big-endian: instead of
emitting no code, it would emit a `vrev`.

To fix this, I've introduced a new IR intrinsic to perform a
register-format reinterpretation: `@llvm.arm.mve.vreinterpretq`. It's
implemented by a trivial isel pattern that expects the input in an
MQPR register, and just returns it unchanged.

In the clang codegen, I only emit this new intrinsic where it's
actually needed: I prefer a bitcast wherever it will have the right
effect, because LLVM understands bitcasts better. So we still generate
bitcasts in little-endian mode, and even in big-endian when you're
casting between two vector types with the same lane size.

For testing, I've moved all the codegen tests of vreinterpretq out
into their own file, so that they can have a different set of RUN
lines to check both big- and little-endian.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73786
2020-02-03 11:20:06 +00:00
Richard Smith 0130b6cb5a Don't assume a reference refers to at least sizeof(T) bytes.
When T is a class type, only nvsize(T) bytes need be accessible through
the reference. We had matching bugs in the application of the
dereferenceable attribute and in -fsanitize=undefined.
2020-01-31 19:08:17 -08:00
serge-sans-paille fd09f12f32 Implement -fsemantic-interposition
First attempt at implementing -fsemantic-interposition.

Rely on GlobalValue::isInterposable that already captures most of the expected
behavior.

Rely on a ModuleFlag to state whether we should respect SemanticInterposition or
not. The default remains no.

So this should be a no-op if -fsemantic-interposition isn't used, and if it is,
isInterposable being already used in most optimisation, they should honor it
properly.

Note that it only impacts architecture compiled with -fPIC and no pie.

Differential Revision: https://reviews.llvm.org/D72829
2020-01-31 14:02:33 +01:00
Pierre Habouzit 6eb969b7c5 [objc_direct] fix codegen for mismatched Decl/Impl return types
For non direct methods, the codegen uses the type of the Implementation.
Because Objective-C rules allow some differences between the Declaration
and Implementation return types, when the Implementation is in this
translation unit, the type of the Implementation should be preferred to
emit the Function over the Declaration.

Radar-Id: rdar://problem/58797748
Signed-off-by: Pierre Habouzit <phabouzit@apple.com>
Differential Revision: https://reviews.llvm.org/D73208
2020-01-30 18:17:45 -08:00
Reid Kleckner 01943a59f5 Move verification of Sema::MaximumAlignment to a .cpp file
Saves these transitive includes:
2020-01-30 13:37:52 -08:00
Alexey Bataev 4697874c28 [OPENMP50]Handle lastprivate conditionals passed as shared in inner
regions.

If the lastprivate conditional is passed as shared in inner region, we
shall check if it was ever changed and use this updated value after exit
from the inner region as an update value.
2020-01-30 11:35:23 -05:00
Jonas Devlieghere 509e21a1b9 [clang] Replace SmallStr.str().str() with std::string conversion operator.
Use the std::string conversion operator introduced in
d7049213d0.
2020-01-29 21:27:46 -08:00
Craig Topper a10cec02f7 [X86] Improve X86 cmpps/cmppd/cmpss/cmpsd intrinsics with strictfp
The constrained fcmp intrinsics don't allow the TRUE/FALSE predicates.
Using them will assert. To workaround this I'm emitting the old X86 specific intrinsics that were never removed from the backend when we switched to using fcmp in IR. We have no way to mark them as being strict, but that's true of all target specific intrinsics so doesn't seem like we need to solve that here.

I've also added support for selecting between signaling and quiet.

Still need to support SAE which will require using a target specific
intrinsic. Also need to fix masking to not use an AND instruction
after the compare.

Differential Revision: https://reviews.llvm.org/D72906
2020-01-29 15:52:11 -08:00
Sanne Wouda 2939fc13c8 [AArch64] Add IR intrinsics for sq(r)dmulh_lane(q)
Summary:
Currently, sqdmulh_lane and friends from the ACLE (implemented in arm_neon.h),
are represented in LLVM IR as a (by vector) sqdmulh and a vector of (repeated)
indices, like so:

   %shuffle = shufflevector <4 x i16> %v, <4 x i16> undef, <4 x i32> <i32 3, i32 3, i32 3, i32 3>
   %vqdmulh2.i = tail call <4 x i16> @llvm.aarch64.neon.sqdmulh.v4i16(<4 x i16> %a, <4 x i16> %shuffle)

When %v's values are known, the shufflevector is optimized away and we are no
longer able to select the lane variant of sqdmulh in the backend.

This defeats a (hand-coded) optimization that packs several constants into a
single vector and uses the lane intrinsics to reduce register pressure and
trade-off materialising several constants for a single vector load from the
constant pool, like so:

   int16x8_t v = {2,3,4,5,6,7,8,9};
   a = vqdmulh_laneq_s16(a, v, 0);
   b = vqdmulh_laneq_s16(b, v, 1);
   c = vqdmulh_laneq_s16(c, v, 2);
   d = vqdmulh_laneq_s16(d, v, 3);
   [...]

In one microbenchmark from libjpeg-turbo this accounts for a 2.5% to 4%
performance difference.

We could teach the compiler to recover the lane variants, but this would likely
require its own pass.  (Alternatively, "volatile" could be used on the constants
vector, but this is a bit ugly.)

This patch instead implements the following LLVM IR intrinsics for AArch64 to
maintain the original structure through IR optmization and into instruction
selection:
- sqdmulh_lane
- sqdmulh_laneq
- sqrdmulh_lane
- sqrdmulh_laneq.

These 'lane' variants need an additional register class.  The second argument
must be in the lower half of the 64-bit NEON register file, but only when
operating on i16 elements.

Note that the existing patterns for shufflevector and sqdmulh into sqdmulh_lane
(etc.) remain, so code that does not rely on NEON intrinsics to generate these
instructions is not affected.

This patch also changes clang to emit these IR intrinsics for the corresponding
NEON intrinsics (AArch64 only).

Reviewers: SjoerdMeijer, dmgreen, t.p.northover, rovka, rengolin, efriedma

Reviewed By: efriedma

Subscribers: kristof.beyls, hiraditya, jdoerfert, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71469
2020-01-29 13:25:23 +00:00
Benjamin Kramer bd31243a34 Fix more implicit conversions. Getting closer to having clang working with gcc 5 again 2020-01-29 02:57:59 +01:00
Francis Visoiu Mistrih b1a8189d7d [NFC] Fix comment typo 2020-01-28 15:23:28 -08:00
Benjamin Kramer adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Francis Visoiu Mistrih 4e799ada58 [CodeGen] Attach no-builtin attributes to function definitions with no Decl
When using -fno-builtin[-<name>], we don't attach the IR attributes to
function definitions with no Decl, like the ones created through
`CreateGlobalInitOrDestructFunction`.

This results in projects using -fno-builtin or -ffreestanding to start
seeing symbols like _memset_pattern16.

The fix changes the behavior to always add the attribute if LangOptions
requests it.

Differential Revision: https://reviews.llvm.org/D73495
2020-01-28 13:59:08 -08:00
Hans Wennborg eaabaf7e04 Revert "[MS] Overhaul how clang passes overaligned args on x86_32"
It broke some Chromium tests, so let's revert until it can be fixed; see
https://crbug.com/1046362

This reverts commit 2af74e27ed.
2020-01-28 22:25:07 +01:00
Aaron Ballman 5547919280 Fix a crash when casting _Complex and ignoring the results.
Performing a cast where the result is ignored caused Clang to crash when
performing codegen for the conversion:

  _Complex int a;
  void fn1() { (_Complex double) a; }

This patch addresses the crash by not trying to emit the scalar conversions,
causing it to be a noop. Fixes PR44624.
2020-01-28 13:05:56 -05:00
Alexey Bataev f117f2cc78 [OPENMP50]Check for lastprivate conditional updates in atomic
constructs.

Added analysis in atomic constrcuts to support checks for updates of
conditional lastprivate variables.
2020-01-28 11:40:31 -05:00
Wang, Pengfei 3239b5034e [FPEnv] Add pragma FP_CONTRACT support under strict FP.
Summary: Support pragma FP_CONTRACT under strict FP.

Reviewers: craig.topper, andrew.w.kaylor, uweigand, RKSimon, LiuChen3

Subscribers: hiraditya, jdoerfert, cfe-commits, llvm-commits, LuoYuanke

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72820
2020-01-28 20:43:43 +08:00
Alexey Bataev e6d2583e45 [OPENMP50]Track changes of lastprivate conditional in parallel-based
regions with reductions, lastprivates or linears clauses.

If the lastprivate conditional variable is updated in inner parallel
region with reduction, lastprivate or linear clause, the value must be
considred as a candidate for lastprivate conditional. Also, tracking in
inner parallel regions is not required.
2020-01-27 14:53:25 -05:00
Teresa Johnson 2f63d549f1 Restore "[LTO/WPD] Enable aggressive WPD under LTO option"
This restores 59733525d3 (D71913), along
with bot fix 19c76989bb.

The bot failure should be fixed by D73418, committed as
af954e441a.

I also added a fix for non-x86 bot failures by requiring x86 in new test
lld/test/ELF/lto/devirt_vcall_vis_public.ll.
2020-01-27 07:55:05 -08:00
Teresa Johnson af954e441a [WPD] Emit vcall_visibility metadata for MicrosoftCXXABI
Summary:
The MicrosoftCXXABI uses a separate mechanism for emitting vtable
type metadata, and thus didn't pick up the change from D71907
to emit the vcall_visibility metadata under -fwhole-program-vtables.

I believe this is the cause of a Windows bot failure when I committed
follow on change D71913 that required a revert. The failure occurred
in a CFI test that was expecting to not abort because it expected a
devirtualization to occur, and without the necessary vcall_visibility
metadata we would not get devirtualization.

Note in the equivalent code in CodeGenModule::EmitVTableTypeMetadata
(used by the ItaniumCXXABI), we also emit the vcall_visibility metadata
when Virtual Function Elimination is enabled. Since I am not as familiar
with the details of that optimization, I have marked that as a TODO and
am only inserting under -fwhole-program-vtables.

Reviewers: evgeny777

Subscribers: Prazek, ostannard, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73418
2020-01-27 06:22:24 -08:00
Guillaume Chatelet 07c9d53266 [Alignment][NFC] Use Align with CreateAlignedLoad
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73449
2020-01-27 10:58:36 +01:00
Reid Kleckner 8a81daaa8b [AST] Split parent map traversal logic into ParentMapContext.h
The only part of ASTContext.h that requires most AST types to be
complete is the parent map. Nothing in Clang proper uses the ParentMap,
so split it out into its own class. Make ASTContext own the
ParentMapContext so there is still a one-to-one relationship.

After this change, 562 fewer files depend on ASTTypeTraits.h, and 66
fewer depend on TypeLoc.h:
  $ diff -u deps-before.txt deps-after.txt | \
    grep '^[-+] ' | sort | uniq -c | sort -nr | less
      562 -    ../clang/include/clang/AST/ASTTypeTraits.h
      340 +    ../clang/include/clang/AST/ParentMapContext.h
       66 -    ../clang/include/clang/AST/TypeLocNodes.def
       66 -    ../clang/include/clang/AST/TypeLoc.h
       15 -    ../clang/include/clang/AST/TemplateBase.h
  ...
I computed deps-before.txt and deps-after.txt with `ninja -t deps`.

This removes a common and key dependency on TemplateBase.h and
TypeLoc.h.

This also has the effect of breaking the ParentMap RecursiveASTVisitor
instantiation into its own file, which roughly halves the compilation
time of ASTContext.cpp (29.75s -> 17.66s). The new file takes 13.8s to
compile.

I left behind forwarding methods for getParents(), but clients will need
to include a new header to make them work:
  #include "clang/AST/ParentMapContext.h"

I noticed that this parent map functionality is unfortunately duplicated
in ParentMap.h, which only works for Stmt nodes.

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D71313
2020-01-24 13:42:28 -08:00
David Zarzycki 0d61cd25a6
Verify that clang's max alignment is <= LLVM's max alignment
Reviewers: lebedev.ri

Reviewed By: lebedev.ri

Subscribers: cfe-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D73363
2020-01-24 12:37:05 -05:00
Guillaume Chatelet 805c157e8a [Alignment][NFC] Deprecate Align::None()
Summary:
This is a follow up on https://reviews.llvm.org/D71473#inline-647262.
There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case.

Reviewers: xbolva00, courbet, bollu

Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73099
2020-01-24 12:53:58 +01:00
Awanish Pandey c83602fdf5 Recommit "[DWARF5][clang]: Added support for DebugInfo generation for auto return type for C++ member functions."
Summary:
This was reverted in e45fcfc3aa due to
libcxx build failure. This revision addresses that case.

Original commit message:
    This patch will provide support for auto return type for the C++ member
    functions.

    This patch includes clang side implementation of this feature.

    Patch by: Awanish Pandey <Awanish.Pandey@amd.com>

    Reviewers: dblaikie, aprantl, shafik, alok, SouraVX, jini.susan.george
    Reviewed by: dblaikie

    Differential Revision: https://reviews.llvm.org/D70524
2020-01-24 14:50:17 +05:30
Pierre Habouzit 52311d0483 [objc_direct] do not add direct properties to the serialization array
If we do, then the property_list_t length is wrong
and class_getProperty gets very sad.

Signed-off-by: Pierre Habouzit <phabouzit@apple.com>
Radar-Id: rdar://problem/58804805
Differential Revision: https://reviews.llvm.org/D73219
2020-01-23 22:39:47 -08:00
Teresa Johnson 90e630a95e Revert "[LTO/WPD] Enable aggressive WPD under LTO option"
This reverts commit 59733525d3.

There is a windows sanitizer bot failure in one of the cfi tests
that I will need some time to figure out:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/57155/steps/stage%201%20check/logs/stdio
2020-01-23 17:29:24 -08:00
Fangrui Song 69bf40c45f [Driver][CodeGen] Support -fpatchable-function-entry=N,M and __attribute__((patchable_function_entry(N,M))) where M>0
Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D73072
2020-01-23 17:02:54 -08:00