Commit Graph

5512 Commits

Author SHA1 Message Date
Fangrui Song 3bb24bf257 Fix tests on Windows after D49466
It is tricky to use replace_path_prefix correctly on Windows which uses
backslashes as native path separators. Switch back to the old approach
(startswith is not ideal) to appease build bots for now.
2019-11-26 16:15:39 -08:00
Dan McGregor 6c92cdff72 Initial implementation of -fmacro-prefix-map and -ffile-prefix-map
GCC 8 implements -fmacro-prefix-map. Like -fdebug-prefix-map, it replaces a string prefix for the __FILE__ macro.
-ffile-prefix-map is the union of -fdebug-prefix-map and -fmacro-prefix-map

Reviewed By: rnk, Lekensteyn, maskray

Differential Revision: https://reviews.llvm.org/D49466
2019-11-26 15:17:49 -08:00
Tim Northover 78ad22e0cc Recommit ARM-NEON: make type modifiers orthogonal and allow multiple modifiers.
The modifier system used to mutate types on NEON intrinsic definitions had a
separate letter for all kinds of transformations that might be needed, and we
were quite quickly running out of letters to use. This patch converts to a much
smaller set of orthogonal modifiers that can be applied together to achieve the
desired effect.

When merging with downstream it is likely to cause a conflict with any local
modifications to the .td files. There is a new script in
utils/convert_arm_neon.py that was used to convert all .td definitions and I
would suggest running it on the last downstream version of those files before
this commit rather than resolving conflicts manually.

The original version broke vcreate_* because it became a macro and didn't
apply the normal integer promotion rules before bitcasting to a vector.
This adds a temporary.
2019-11-26 09:21:47 +00:00
Muhammad Omair Javaid c9ddb02659 Revert "As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there."
This reverts commit 8ff85ed905.

This commit introduced 9 new failures on lldb buildbot host at http://lab.llvm.org:8014/builders/lldb-aarch64-ubuntu

Following tests were failing:
    lldb-api :: functionalities/tail_call_frames/ambiguous_tail_call_seq1/TestAmbiguousTailCallSeq1.py
    lldb-api :: functionalities/tail_call_frames/ambiguous_tail_call_seq2/TestAmbiguousTailCallSeq2.py
    lldb-api :: functionalities/tail_call_frames/disambiguate_call_site/TestDisambiguateCallSite.py
    lldb-api :: functionalities/tail_call_frames/disambiguate_paths_to_common_sink/TestDisambiguatePathsToCommonSink.py
    lldb-api :: functionalities/tail_call_frames/disambiguate_tail_call_seq/TestDisambiguateTailCallSeq.py
    lldb-api :: functionalities/tail_call_frames/inlining_and_tail_calls/TestInliningAndTailCalls.py
    lldb-api :: functionalities/tail_call_frames/sbapi_support/TestTailCallFrameSBAPI.py
    lldb-api :: functionalities/tail_call_frames/thread_step_out_message/TestArtificialFrameStepOutMessage.py
    lldb-api :: functionalities/tail_call_frames/thread_step_out_or_return/TestSteppingOutWithArtificialFrames.py
    lldb-api :: functionalities/tail_call_frames/unambiguous_sequence/TestUnambiguousTailCalls.py

Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65410
2019-11-26 09:32:13 +05:00
Eric Christopher 8ff85ed905 As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there.
This change doesn't include any change to move from selection dag to fast isel
and that will come with other numbers that should help inform that decision.
There also haven't been any real debuggability studies with this pipeline yet,
this is just the initial start done so that people could see it and we could start
tweaking after.

Test updates: Outside of the newpm tests most of the updates are coming from either
optimization passes not run anymore (and without a compelling argument at the moment)
that were largely used for canonicalization in clang.

Original post:

http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html

Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65410
2019-11-25 17:16:46 -08:00
Peter Collingbourne 90b8bc003c IRGen: Call SetLLVMFunctionAttributes{,ForDefinition} on __cfi_check_fail.
This has the main effect of causing target-cpu and target-features to be set
on __cfi_check_fail, causing the function to become ABI-compatible with other
functions in the case where these attributes affect ABI (e.g. reserve-x18).

Technically we only need to call SetLLVMFunctionAttributes to get the target-*
attributes set, but since we're creating a definition we probably ought to
call the ForDefinition function as well.

Fixes PR44094.

Differential Revision: https://reviews.llvm.org/D70692
2019-11-25 15:16:43 -08:00
Hans Wennborg 21f26470e9 Revert 3f91705ca5 "ARM-NEON: make type modifiers orthogonal and allow multiple modifiers."
This broke the vcreate_u64 intrinsic. Example:

  $ cat /tmp/a.cc
  #include <arm_neon.h>

  void g() {
    auto v = vcreate_u64(0);
  }
  $ bin/clang -c /tmp/a.cc --target=arm-linux-androideabi16 -march=armv7-a
  /tmp/a.cc:4:12: error: C-style cast from scalar 'int' to vector 'uint64x1_t' (vector of 1 'uint64_t' value) of different size
    auto v = vcreate_u64(0);
             ^~~~~~~~~~~~~~
  /work/llvm.monorepo/build.release/lib/clang/10.0.0/include/arm_neon.h:4144:11: note: expanded from macro 'vcreate_u64'
    __ret = (uint64x1_t)(__p0); \
            ^~~~~~~~~~~~~~~~~~

Reverting until this can be investigated.

> The modifier system used to mutate types on NEON intrinsic definitions had a
> separate letter for all kinds of transformations that might be needed, and we
> were quite quickly running out of letters to use. This patch converts to a much
> smaller set of orthogonal modifiers that can be applied together to achieve the
> desired effect.
>
> When merging with downstream it is likely to cause a conflict with any local
> modifications to the .td files. There is a new script in
> utils/convert_arm_neon.py that was used to convert all .td definitions and I
> would suggest running it on the last downstream version of those files before
> this commit rather than resolving conflicts manually.
2019-11-25 16:27:53 +01:00
David Blaikie e956952ede DebugInfo: Flag Dwarf Version metadata for merging during LTO
When the Dwarf Version metadata was initially added (r184276) there was
no support for Module::Max - though the comment suggested that was the
desired behavior. The original behavior was Module::Warn which would
warn and then pick whichever version came first - which is pretty
arbitrary/luck-based if the consumer has some need for one version or
the other.

Now that the functionality's been added (r303590) this change updates
the implementation to match the desired goal.

The general logic here is - if you compile /some/ of your program with a
more recent DWARF version, you must have a consumer that can handle it,
so might as well use it for /everything/.

The only place where this might fall down is if you have a need to use
an old tool (supporting only the older DWARF version) for some subset of
your program. In which case now it'll all be the higher version. That
seems pretty narrow (& the inverse could happen too - you specifically
/need/ the higher DWARF version for some extra expressivity, etc, in
some part of the program)
2019-11-22 17:16:35 -08:00
Tim Northover 5cf58768cb Atomics: support min/max orthogonally
We seem to have been gradually growing support for atomic min/max operations
(exposing longstanding IR atomicrmw instructions). But until now there have
been gaps in the expected intrinsics. This adds support for the C11-style
intrinsics (i.e. taking _Atomic, rather than individually blessed by C11
standard), and the variants that return the new value instead of the original
one.

That way, people won't be misled by trying one form and it not working, and the
front-end is more friendly to people using _Atomic types, as we recommend.
2019-11-21 10:37:56 +00:00
Tim Northover 3f91705ca5 ARM-NEON: make type modifiers orthogonal and allow multiple modifiers.
The modifier system used to mutate types on NEON intrinsic definitions had a
separate letter for all kinds of transformations that might be needed, and we
were quite quickly running out of letters to use. This patch converts to a much
smaller set of orthogonal modifiers that can be applied together to achieve the
desired effect.

When merging with downstream it is likely to cause a conflict with any local
modifications to the .td files. There is a new script in
utils/convert_arm_neon.py that was used to convert all .td definitions and I
would suggest running it on the last downstream version of those files before
this commit rather than resolving conflicts manually.
2019-11-20 13:20:02 +00:00
Tim Northover b80e483c42 Update tests after change to llvm-cxxfilt's underscore stripping behaviour. 2019-11-20 13:10:55 +00:00
Djordje Todorovic ce1f95a6e0 Reland "[clang] Remove the DIFlagArgumentNotModified debug info flag"
It turns out that the ExprMutationAnalyzer can be very slow when AST
gets huge in some cases. The idea is to move this analysis to the LLVM
back-end level (more precisely, in the LiveDebugValues pass). The new
approach will remove the performance regression, simplify the
implementation and give us front-end independent implementation.

Differential Revision: https://reviews.llvm.org/D68206
2019-11-20 10:08:07 +01:00
Vedant Kumar 568db780bb [CGDebugInfo] Emit subprograms for decls when AT_tail_call is understood (reland with fixes)
Currently, clang emits subprograms for declared functions when the
target debugger or DWARF standard is known to support entry values
(DW_OP_entry_value & the GNU equivalent).

Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU
tail calls.

Pre-patch debug session with a cross-TU tail call:

```
  * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
    frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```

Post-patch (note that the tail-calling frame, "helper", is visible):

```
  * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
    frame #1: 0x0000000100000f80 main`helper [opt] [artificial]
    frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```

This was reverted in 5b9a072c because it attached declaration
subprograms to inlinable builtin calls, which interacted badly with the
MergeICmps pass. The fix is to not attach declarations to builtins.

rdar://46577651

Differential Revision: https://reviews.llvm.org/D69743
2019-11-19 12:49:27 -08:00
Matt Arsenault e531750c6c clang: Add -fconvergent-functions flag
The CUDA builtin library is apparently compiled in C++ mode, so the
assumption of convergent needs to be made in a typically non-SPMD
language. The functions in the library should still be assumed
convergent. Currently they are not, which is potentially incorrect and
this happens to work after the library is linked.
2019-11-19 23:20:15 +05:30
Simon Tatham 254b4f2500 [ARM,MVE] Add intrinsics for scalar shifts.
This fills in the small family of MVE intrinsics that have nothing to
do with vectors: they implement bit-shift operations on 32- or 64-bit
values held in one or two general-purpose registers. Most of these
shift operations saturate if shifting left, and round to nearest if
shifting right, although LSLL and ASRL behave like ordinary shifts.

When these instructions take a variable shift count in a register,
they pay attention to its sign, so that (for example) LSLL or UQRSHLL
will shift left if given a positive number but right if given a
negative one. That makes even LSLL and ASRL different enough from
standard LLVM IR shift semantics that I couldn't see any better
alternative than to simply model the whole family as a set of
MVE-specific IR intrinsics.

(The //immediate// forms of LSLL and ASRL, on the other hand, do
behave exactly like a standard IR shift of a 64-bit value. In fact,
those forms don't have ACLE intrinsics defined at all, because you can
just write an ordinary C shift operation if you want one of those.)

The 64-bit shifts have to be instruction-selected in C++, because they
deliver two output values. But the 32-bit ones are simple enough that
I could write a DAG isel pattern directly into each Instruction
record.

Reviewers: ostannard, MarkMurrayARM, dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D70319
2019-11-19 14:47:29 +00:00
Eric Christopher 30e7ee3c4b Temporarily Revert "Add support for options -frounding-math, ftrapping-math, -ffp-model=, and -ffp-exception-behavior="
and a follow-up NFC rearrangement as it's causing a crash on valid. Testcase is on the original review thread.

This reverts commits af57dbf12e and e6584b2b7b
2019-11-18 10:46:48 -08:00
Simon Tatham 4a4dd85e5a [ARM,MVE] Add intrinsics for vector comparisons.
This adds the `vcmp` family of ACLE MVE intrinsics: vector/vector,
vector/scalar, and the predicated forms of both. All are represented
using standard existing IR: vector/scalar comparisons are represented
by making a vector out of the scalar first, and predicated forms are
represented by taking the bitwise AND of the input predicate and the
output of the comparison. Existing LLVM-side tests demonstrate that
ISel will pattern-match all of that back down to single MVE VCMPs.

The idiom of handling a vector/scalar operation by generating IR to
expand the scalar into a second vector is going to be needed for a lot
of MVE intrinsics, so to make that easy, I've provided a helper
function that automatically works out the element count.

The comparison intrinsics are the first ones that have to //return// a
predicate, in the user-facing `mve_pred16_t` format. This means we
have to use the `arm_mve_pred_v2i` low-level intrinsic to convert it
back from the logical `<n x i1>` form used in IR. I've done that
explicitly in the code gen specification for the builtins, because it
happens much more rarely in the ACLE API than passing a Predicate as
input, so it didn't seem worth automating in MveEmitter.

Reviewers: ostannard, MarkMurrayARM, dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D70297
2019-11-18 10:39:30 +00:00
Momchil Velikov aa6d48fa70 Implement target(branch-protection) attribute for AArch64
This patch implements `__attribute__((target("branch-protection=...")))`
in a manner, compatible with the analogous GCC feature:

https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes

Differential Revision: https://reviews.llvm.org/D68711
2019-11-15 15:40:46 +00:00
Djordje Todorovic 41d6ad6efd Revert "[clang] Remove the DIFlagArgumentNotModified debug info flag"
This reverts commit rG1643734741d2 due to LLDB test failure.
2019-11-15 12:16:44 +01:00
Djordje Todorovic 1643734741 [clang] Remove the DIFlagArgumentNotModified debug info flag
It turns out that the ExprMutationAnalyzer can be very slow when AST
gets huge in some cases. The idea is to move this analysis to the LLVM
back-end level (more precisely, in the LiveDebugValues pass). The new
approach will remove the performance regression, simplify the
implementation and give us front-end independent implementation.

Differential Revision: https://reviews.llvm.org/D68206
2019-11-15 11:10:19 +01:00
Simon Tatham 9e37892773 [ARM,MVE] Add intrinsics for vector get/set lane.
This adds the `vgetq_lane` and `vsetq_lane` families, to copy between
a scalar and a specified lane of a vector.

One of the new `vgetq_lane` intrinsics returns a `float16_t`, which
causes a compile error if `%clang_cc1` doesn't get the option
`-fallow-half-arguments-and-returns`. The driver passes that option to
cc1 already, but I've had to edit all the explicit cc1 command lines
in the existing MVE intrinsics tests.

A couple of fixes are included for the code I wrote up front in
MveEmitter to support lane-index immediates (and which nothing has
tested until now): the type was wrong (`uint32_t` instead of `int`)
and the range was off by one.

I've also added a method of bypassing the default promotion to `i32`
that is done by the MveEmitter code generation: it's sensible to
promote short scalars like `i16` to `i32` if they're going to be
passed to custom IR intrinsics representing a machine instruction
operating on GPRs, but not if they're going to be passed to standard
IR operations like `insertelement` which expect the exact type.

Reviewers: ostannard, MarkMurrayARM, dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D70188
2019-11-15 09:53:58 +00:00
Simon Tatham 902e84556a [ARM,MVE] Add intrinsics for 'administrative' vector operations.
This batch of intrinsics includes lots of things that move vector data
around or change its type without really affecting its value very
much. It includes the `vreinterpretq` family (cast one vector type to
another); `vuninitializedq` (create a vector of a given type with
don't-care contents); and `vcreateq` (make a 128-bit vector out of two
`uint64_t` halves).

These are all implemented using completely standard IR that's already
tested in existing LLVM unit tests, so I've just written a clang test
to check the IR is correct, and left it at that.

I've also added some richer infrastructure to the MveEmitter Tablegen
backend, to make it specify the exact integer type of integer
arguments passed to IR construction functions, and wrap those
arguments in a `static_cast` in the autogenerated C++. That was
necessary to prevent an overloading ambiguity when passing the integer
literal `0` to `IRBuilder::CreateInsertElement`, because otherwise, it
could mean either a null pointer `llvm::Value *` or a zero `uint64_t`.

Reviewers: ostannard, MarkMurrayARM, dmgreen

Subscribers: kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D70133
2019-11-15 09:53:43 +00:00
Craig Topper 188d92b947 [X86] Don't treat mxcsr as a register name when parsing MS inline assembly.
No instruction takes mxcsr as a an operand so we should always
treat it as an identifier name.
2019-11-13 15:26:18 -08:00
Yonghong Song 1583158042 [BPF] fix clang test failure for bpf-attr-preserve-access-index-4.c
Depending on different cmake configures, clang may generate different
IR name for slot variables. Let us use the regex instead of hard
coding the name. I did the same for other bpf-attr-preserve-access-index
tests with such an approach, but somehow did not do for this one.
2019-11-13 09:40:57 -08:00
Yonghong Song 4e2ce228ae [BPF] Add preserve_access_index attribute for record definition
This is a resubmission for the previous reverted commit
9434360401 with the same subject. This commit fixed the
segfault issue and addressed additional review comments.

This patch introduced a new bpf specific attribute which can
be added to struct or union definition. For example,
  struct s { ... } __attribute__((preserve_access_index));
  union u { ... } __attribute__((preserve_access_index));
The goal is to simplify user codes for cases
where preserve access index happens for certain struct/union,
so user does not need to use clang __builtin_preserve_access_index
for every members.

The attribute has no effect if -g is not specified.

When the attribute is specified and -g is specified, any member
access defined by that structure or union, including array subscript
access and inner records, will be preserved through
  __builtin_preserve_{array,struct,union}_access_index()
IR intrinsics, which will enable relocation generation
in bpf backend.

The following is an example to illustrate the usage:
  -bash-4.4$ cat t.c
  #define __reloc__ __attribute__((preserve_access_index))
  struct s1 {
    int c;
  } __reloc__;

  struct s2 {
    union {
      struct s1 b[3];
    };
  } __reloc__;

  struct s3 {
    struct s2 a;
  } __reloc__;

  int test(struct s3 *arg) {
    return arg->a.b[2].c;
  }
  -bash-4.4$ clang -target bpf -g -S -O2 t.c

A relocation with access string "0:0:0:0:2:0" will be generated
representing access offset of arg->a.b[2].c.

forward declaration with attribute is also handled properly such
that the attribute is copied and populated in real record definition.

Differential Revision: https://reviews.llvm.org/D69759
2019-11-13 08:23:44 -08:00
Simon Tatham a12f588ebb [ARM,MVE] Add intrinsics for contiguous load/stores.
This patch adds the ACLE intrinsics for all the MVE load and store
instructions not already handled by D69791. These ones don't need new
IR intrinsics, because they can be implemented in terms of standard
LLVM IR constructions.

Some of the load and store instructions access less than 128 bits of
memory, sign/zero extending each value to a wider vector lane on load
or truncating it on store. These are represented in IR by a load of a
shorter vector followed by a zext/sext, and conversely, a trunc
followed by a short store. Existing ISel patterns already recognize
those combinations and turn them into the right MVE instructions.

The predicated forms of all these instructions are represented in the
same way, except that the ordinary load/store operation is replaced
with the existing intrinsics @llvm.masked.{load,store}. These are
currently only code-generated as predicated MVE load/store
instructions if you give LLVM the `-enable-arm-maskedldst` option; so
I've done that in the LLVM codegen test. When we make that the
default, that option can be removed.

In the Tablegen backend, I've had to add a handful of extra support
features:

* We need to be able to make clang::Address objects out of a
  pointer and an alignment (previously we only needed these when the
  user passed us an existing one).

* We can now specify vector types that aren't 128 bits wide (for use
  in those intermediate values in IR), the parametrized type system
  can make one starting from two existing vector types (using the lane
  count of one and the element type of the other).

* I've added support for code generation of pointer casts, and for
  specifying LLVM types as operands to IRBuilder operations (for zext
  and sext, though I think they'll come in useful again).

* Now not all IR construction operations need to be specified as
  Builder.CreateFoo; some don't involve a Builder at all, and one
  passes it as a parameter to a tiny static helper function in
  CGBuiltin.cpp.

Reviewers: ostannard, MarkMurrayARM, dmgreen

Subscribers: kristof.beyls, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D70088
2019-11-13 12:47:00 +00:00
Tim Northover 44e5879f0f AArch64: add arm64_32 support to Clang. 2019-11-12 12:45:18 +00:00
Yonghong Song 9434360401 Revert "[BPF] Add preserve_access_index attribute for record definition"
This reverts commit 4a5aa1a7bf.

There are some other test failures. Investigate them first.
2019-11-09 08:32:44 -08:00
Yonghong Song 4a5aa1a7bf [BPF] Add preserve_access_index attribute for record definition
This patch introduced a new bpf specific attribute which can
be added to struct or union definition. For example,
  struct s { ... } __attribute__((preserve_access_index));
  union u { ... } __attribute__((preserve_access_index));
The goal is to simplify user codes for cases
where preserve access index happens for certain struct/union,
so user does not need to use clang __builtin_preserve_access_index
for every members.

The attribute has no effect if -g is not specified.

When the attribute is specified and -g is specified, any member
access defined by that structure or union, including array subscript
access and inner records, will be preserved through
  __builtin_preserve_{array,struct,union}_access_index()
IR intrinsics, which will enable relocation generation
in bpf backend.

The following is an example to illustrate the usage:
  -bash-4.4$ cat t.c
  #define __reloc__ __attribute__((preserve_access_index))
  struct s1 {
    int c;
  } __reloc__;

  struct s2 {
    union {
      struct s1 b[3];
    };
  } __reloc__;

  struct s3 {
    struct s2 a;
  } __reloc__;

  int test(struct s3 *arg) {
    return arg->a.b[2].c;
  }
  -bash-4.4$ clang -target bpf -g -S -O2 t.c

A relocation with access string "0:0:0:0:2:0" will be generated
representing access offset of arg->a.b[2].c.

forward declaration with attribute is also handled properly such
that the attribute is copied and populated in real record definition.

Differential Revision: https://reviews.llvm.org/D69759
2019-11-09 08:17:12 -08:00
Jan Korous 590f279c45 [clang] Add VFS support for sanitizers' blacklists
Differential Revision: https://reviews.llvm.org/D69648
2019-11-08 10:58:50 -08:00
Nemanja Ivanovic e0407f5496 [PowerPC][Altivec] Fix offsets for vec_xl and vec_xst
As we currently have it implemented in altivec.h, the offsets for these two
intrinsics are element offsets. The documentation in the ABI (as well as the
implementation in both XL and GCC) states that these should be byte offsets.

Differential revision: https://reviews.llvm.org/D63636
2019-11-07 20:58:11 -06:00
Nemanja Ivanovic 070e4027b0 [PowerPC][Altivec] Emit correct builtin for single precision vec_all_ne
We currently emit a double precision comparison instruction for this, whereas we
need to emit the single precision version.

Differential revision: https://reviews.llvm.org/D64024
2019-11-07 20:40:32 -06:00
Melanie Blower af57dbf12e Add support for options -frounding-math, ftrapping-math, -ffp-model=, and -ffp-exception-behavior=
Add options to control floating point behavior: trapping and
    exception behavior, rounding, and control of optimizations that affect
    floating point calculations. More details in UsersManual.rst.

    Reviewers: rjmccall

    Differential Revision: https://reviews.llvm.org/D62731
2019-11-07 07:22:45 -08:00
Tim Northover 10e0d64337 CodeGen: set correct result for atomic compound expressions
Atomic compound expressions try to use atomicrmw if possible, but this
path doesn't set the Result variable, leaving it to crash in later code
if anything ever tries to use the result of the expression. This fixes
that issue by recalculating the new value based on the old one
atomically loaded.
2019-11-07 13:36:44 +00:00
Hans Wennborg 5b9a072c39 Revert a5c8ec4 "[CGDebugInfo] Emit subprograms for decls when AT_tail_call is understood"
This caused Chromium builds to fail with "inlinable function call in a function
with debug info must have a !dbg location" errors. See
https://bugs.chromium.org/p/chromium/issues/detail?id=1022296#c1 for a
reproducer.

> Currently, clang emits subprograms for declared functions when the
> target debugger or DWARF standard is known to support entry values
> (DW_OP_entry_value & the GNU equivalent).
>
> Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU
> tail calls.
>
> Pre-patch debug session with a cross-TU tail call:
>
> ```
>   * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
>     frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt]
> ```
>
> Post-patch (note that the tail-calling frame, "helper", is visible):
>
> ```
>   * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
>     frame #1: 0x0000000100000f80 main`helper [opt] [artificial]
>     frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt]
> ```
>
> rdar://46577651
>
> Differential Revision: https://reviews.llvm.org/D69743
2019-11-07 10:30:07 +01:00
Tim Northover 59f063b89c NeonEmitter: remove special 'a' type modifier.
'a' used to implement a splat in C++ code in NeonEmitter.cpp, but this
can be done directly from .td expansions now (and most ops already did).
So removing it simplifies the overall code.

https://reviews.llvm.org/D69716
2019-11-06 10:23:36 +00:00
Simon Tatham 6c3fee47a6 [ARM,MVE] Add intrinsics for gather/scatter load/stores.
This patch adds two new families of intrinsics, both of which are
memory accesses taking a vector of locations to load from / store to.

The vldrq_gather_base / vstrq_scatter_base intrinsics take a vector of
base addresses, and an immediate offset to be added consistently to
each one. vldrq_gather_offset / vstrq_scatter_offset take a scalar
base address, and a vector of offsets to add to it. The
'shifted_offset' variants also multiply each offset by the element
size type, so that the vector is effectively of array indices.

At the IR level, these operations are represented by a single set of
four IR intrinsics: {gather,scatter} × {base,offset}. The other
details (signed/unsigned, shift, and memory element size as opposed to
vector element size) are all specified by IR intrinsic polymorphism
and immediate operands, because that made the selection job easier
than making a huge family of similarly named intrinsics.

I considered using the standard IR representations such as
llvm.masked.gather, but they're not a good fit. In order to use
llvm.masked.gather to represent a gather_offset load with element size
smaller than a pointer, you'd have to expand the <8 x i16> vector of
offsets into an <8 x i16*> vector of pointers, which would be split up
during legalization, so you'd spend most of your time undoing the mess
it had made. Also, ISel support for llvm.masked.gather would be easy
enough in a trivial way (you can expand it into a gather-base load
with a zero immediate offset), but instruction-selecting lots of
fiddly idioms back into all the _other_ MVE load instructions would be
much more work. So I think dedicated IR intrinsics are the more
sensible approach, at least for the moment.

On the clang tablegen side, I've added two new features to the
Tablegen source accepted by MveEmitter: a 'CopyKind' type node for
defining a type that varies with the parameter type (it lets you ask
for an unsigned integer type of the same width as the parameter), and
an 'unsignedflag' value node for passing an immediate IR operand which
is 0 for a signed integer type or 1 for an unsigned one. That lets me
write each kind of intrinsic just once and get all its subtypes and
immediate arguments generated automatically.

Also I've tweaked the handling of pointer-typed values in the code
generation part of MveEmitter: they're generated as Address rather
than Value (i.e. including an alignment) so that they can be given to
the ordinary IR load and store operations, but I'd omitted the code to
convert them back to Value when they're going to be used as an
argument to an IR intrinsic.

On the MC side, I've enhanced MVEVectorVTInfo so that it can tell you
not only the full assembly-language suffix for a given vector type
(like 's32' or 'u16') but also the numeric-only one used by store
instructions (just '32' or '16').

Reviewers: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D69791
2019-11-06 09:01:42 +00:00
Bill Wendling ee10d934dd Fix typo so that '-O0' is correctly specified 2019-11-05 13:15:55 -08:00
Jonas Paulsson 9376714314 [Clang FE] Recognize -mnop-mcount CL option (SystemZ only).
Recognize -mnop-mcount from the command line and add a function attribute
"mnop-mcount"="true" when passed.

When this option is used, a nop is added instead of a call to fentry. This
is used when building the Linux Kernel.

If this option is passed for any other target than SystemZ, an error is
generated.

Review: Ulrich Weigand
https://reviews.llvm.org/D67763
2019-11-05 12:12:36 +01:00
Vedant Kumar a5c8ec4baa [CGDebugInfo] Emit subprograms for decls when AT_tail_call is understood
Currently, clang emits subprograms for declared functions when the
target debugger or DWARF standard is known to support entry values
(DW_OP_entry_value & the GNU equivalent).

Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU
tail calls.

Pre-patch debug session with a cross-TU tail call:

```
  * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
    frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```

Post-patch (note that the tail-calling frame, "helper", is visible):

```
  * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
    frame #1: 0x0000000100000f80 main`helper [opt] [artificial]
    frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```

rdar://46577651

Differential Revision: https://reviews.llvm.org/D69743
2019-11-04 15:14:24 -08:00
Amy Huang ab76cfdd20 Recommit "[CodeView] Add option to disable inline line tables."
This reverts commit 004ed2b0d1.
Original commit hash 6d03890384

Summary:
This adds a clang option to disable inline line tables. When it is used,
the inliner uses the call site as the location of the inlined function instead of
marking it as an inline location with the function location.

https://reviews.llvm.org/D67723
2019-11-04 09:15:26 -08:00
Pengfei Wang af3a7de20c [X86] add mayRaiseFPException flag and FPCW registers for X87 instructions
Summary:
This patch adds flag "mayRaiseFPException"  , FPCW and FPSW for X87 instructions which could raise
float exception.

Reviewers: pengfei, RKSimon, andrew.w.kaylor, uweigand, kpn, spatel, cameron.mcinally, craig.topper

Reviewed By: craig.topper

Subscribers: thakis, hiraditya, llvm-commits

Patch by LiuChen.

Differential Revision: https://reviews.llvm.org/D68854
2019-11-01 21:12:43 -07:00
David Blaikie 098d901bd1 DebugInfo: Let -gdwarf use the toolchain default DWARF version, instead of hardcoded/aliased to -gdwarf-4 2019-11-01 15:17:51 -07:00
Thomas Lively 935c84c3c2 [WebAssembly] Add experimental SIMD dot product instruction
Summary:
This instruction is not merged to the spec proposal, but we need it to
be implemented in the toolchain to experiment with it. It is available
only on an opt-in basis through a clang builtin.

Defined in https://github.com/WebAssembly/simd/pull/127.

Depends on D69696.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D69697
2019-11-01 10:45:48 -07:00
Thomas Lively a07019a275 [WebAssembly] SIMD integer min and max instructions
Summary:
Introduces a clang builtins and LLVM intrinsics representing integer
min/max instructions. These instructions have not been merged to the
SIMD spec proposal yet, so they are currently opt-in only via builtins
and not produced by general pattern matching. If these instructions
are accepted into the spec proposal the builtins and intrinsics will
be replaced with normal pattern matching.

Defined in https://github.com/WebAssembly/simd/pull/27.

Reviewers: aheejin

Reviewed By: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D69696
2019-10-31 20:22:11 -07:00
Matt Arsenault c6da9ec0e9 clang: Fix assert on void pointer arithmetic with address_space
This attempted to always use the default address space void pointer
type instead of preserving the source address space.
2019-10-31 20:07:23 -07:00
Yaxun (Sam) Liu bb1616ba47 [CodeGen] Fix invalid llvm.linker.options about pragma detect_mismatch
When a target does not support pragma detect_mismatch, an llvm.linker.options
metadata with an empty entry is created, which causes diagnostic in backend
since backend expects name/value pair in llvm.linker.options entries.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D69678
2019-10-31 22:27:35 -04:00
Amy Huang 004ed2b0d1 Revert "[CodeView] Add option to disable inline line tables."
because it breaks compiler-rt tests.

This reverts commit 6d03890384.
2019-10-30 17:31:12 -07:00
Amy Huang 6d03890384 [CodeView] Add option to disable inline line tables.
Summary:
This adds a clang option to disable inline line tables. When it is used,
the inliner uses the call site as the location of the inlined function instead of
marking it as an inline location with the function location.

See https://bugs.llvm.org/show_bug.cgi?id=42344

Reviewers: rnk

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D67723
2019-10-30 16:52:39 -07:00
Evandro Menezes 215da6606c [clang][llvm] Obsolete Exynos M1 and M2 2019-10-30 15:02:59 -05:00