5512 Commits

Author SHA1 Message Date
Guanzhong Chen 8a503e439d [WebAssembly] Make clang emit correct va_arg code for structs
Summary:
In the WebAssembly backend, when lowering variadic function calls, non-single
member aggregate type arguments are always passed by pointer.

However, when emitting va_arg code in clang, the arguments are instead read as
if they are passed directly. This results in the pointer being read as the
actual structure.
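
As an illustration only (not taken from the patch), the kind of code affected looks roughly like the sketch below: a variadic callee that reads a two-member struct with va_arg. The names `struct point` and `sum_points` are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical two-member aggregate; per the summary above, the WebAssembly
   backend passes such an argument to a variadic callee by pointer. */
struct point { int x; int y; };

/* Sums the members of `count` struct point arguments read via va_arg. */
int sum_points(int count, ...) {
    assert(count >= 0);
    va_list ap;
    va_start(ap, count);
    int total = 0;
    for (int i = 0; i < count; i++) {
        /* The read this fix is about: va_arg must yield the struct itself. */
        struct point p = va_arg(ap, struct point);
        total += p.x + p.y;
    }
    va_end(ap);
    return total;
}
```

With correct va_arg lowering, calling `sum_points` with the points {1, 2} and {3, 4} should give 10 on any target.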

Fixes https://github.com/emscripten-core/emscripten/issues/9042.

Reviewers: tlively, sbc100, kripken, aheejin, dschuff

Reviewed By: dschuff

Subscribers: dschuff, jgravelle-google, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D66168

llvm-svn: 368750
2019-08-13 21:41:11 +00:00
David Bolvansky 97c35c9f57 [NFC] Updated tests after r368724
llvm-svn: 368725
2019-08-13 17:19:16 +00:00
David Bolvansky c3012b2c26 [NFC] Updated tests after r368657
llvm-svn: 368658
2019-08-13 09:12:07 +00:00
Peter Collingbourne 0e497d1554 cfi-icall: Allow the jump table to be optionally made non-canonical.
The default behavior of Clang's indirect function call checker will replace
the address of each CFI-checked function in the output file's symbol table
with the address of a jump table entry which will pass CFI checks. We refer
to this as making the jump table `canonical`. This property allows code that
was not compiled with ``-fsanitize=cfi-icall`` to take a CFI-valid address
of a function, but it comes with a couple of caveats that are especially
relevant for users of cross-DSO CFI:

- There is a performance and code size overhead associated with each
  exported function, because each such function must have an associated
  jump table entry, which must be emitted even in the common case where the
  function is never address-taken anywhere in the program, and must be used
  even for direct calls between DSOs, in addition to the PLT overhead.

- There is no good way to take a CFI-valid address of a function written in
  assembly or a language not supported by Clang. The reason is that the code
  generator would need to insert a jump table in order to form a CFI-valid
  address for assembly functions, but there is no way in general for the
  code generator to determine the language of the function. This may be
  possible with LTO in the intra-DSO case, but in the cross-DSO case the only
  information available is the function declaration. One possible solution
  is to add a C wrapper for each assembly function, but these wrappers can
  present a significant maintenance burden for heavy users of assembly in
  addition to adding runtime overhead.

For these reasons, we provide the option of making the jump table non-canonical
with the flag ``-fno-sanitize-cfi-canonical-jump-tables``. When the jump
table is made non-canonical, symbol table entries point directly to the
function body. Any instances of a function's address being taken in C will
be replaced with a jump table address.

This scheme does have its own caveats, however. It does end up breaking
function address equality more aggressively than the default behavior,
especially in cross-DSO mode which normally preserves function address
equality entirely.

Furthermore, it is occasionally necessary for code not compiled with
``-fsanitize=cfi-icall`` to take a function address that is valid
for CFI. For example, this is necessary when a function's address
is taken by assembly code and then called by CFI-checking C code. The
``__attribute__((cfi_canonical_jump_table))`` attribute may be used to make
the jump table entry of a specific function canonical so that the external
code will end up taking an address for the function that will pass CFI checks.

Fixes PR41972.

Differential Revision: https://reviews.llvm.org/D65629

llvm-svn: 368495
2019-08-09 22:31:59 +00:00
Saleem Abdulrasool a5af238343 CodeGen: ensure 8-byte aligned String Swift CF ABI
CFStrings should be 8-byte aligned when built for the Swift CF runtime
ABI as the atomic CF info field must be properly aligned.  This is a
problem on 32-bit platforms which would give the structure 4-byte
alignment rather than 8-byte alignment.
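
A rough sketch of the alignment concern, assuming a made-up layout: `struct cf_string_like` and its fields are hypothetical stand-ins, not the real Swift CF structure. Forcing 8-byte alignment on the 64-bit field raises the alignment of the whole struct, including on targets whose default for that field would be 4.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CFString-like layout for illustration; not the real struct. */
struct cf_string_like {
    const void *isa;
    alignas(8) uint64_t info;  /* stand-in for the atomic CF info field */
    const char *data;
    uint32_t length;
};

/* Alignment of the whole struct; alignas(8) on a member forces at least 8. */
size_t cf_string_like_alignment(void) {
    return alignof(struct cf_string_like);
}

/* Nonzero if the pointer is 8-byte aligned. */
int is_8_byte_aligned(const void *p) {
    assert(p != NULL);
    return ((uintptr_t)p % 8) == 0;
}
```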

llvm-svn: 368471
2019-08-09 19:29:05 +00:00
Richard Sandiford eb485fbc71 Add SVE opaque built-in types
This patch adds the SVE built-in types defined by the Procedure Call
Standard for the Arm Architecture:

   https://developer.arm.com/docs/100986/0000

It handles the types in all relevant places that deal with built-in types.
At the moment, some of these places bail out with an error, including:

   (1) trying to generate LLVM IR for the types
   (2) trying to generate debug info for the types
   (3) trying to mangle the types using the Microsoft C++ ABI
   (4) trying to @encode the types in Objective C

(1) and (2) are fixed by follow-on patches but (unlike this patch)
they deal mostly with target-specific LLVM details, so seemed like
a logically separate change.  There is currently no spec for (3) and
(4), so reporting an error seems like the correct behaviour for now.

The intention is that the types will become sizeless types:

   http://lists.llvm.org/pipermail/cfe-dev/2019-June/062523.html

The main purpose of the sizeless type extension is to diagnose
impossible or dangerous uses of the types, such as any that would
require sizeof to have a meaningful defined value.

Until then, the patch sets the alignments of the types to the values
specified in the link above.  It also sets the sizes of the types to
zero, which is chosen to be consistently wrong and shouldn't affect
correctly-written code (i.e. code that would compile even with the
sizeless type extension).

The patch adds the common subset of functionality needed to test the
sizeless type extension on the one hand and to provide SVE intrinsic
functions on the other.  After this patch, the two pieces of work are
essentially independent.

The patch is based on one by Graham Hunter:

   https://reviews.llvm.org/D59245

Differential Revision: https://reviews.llvm.org/D62960

llvm-svn: 368413
2019-08-09 08:52:54 +00:00
Qiu Chaofan e9efaf3529 [PowerPC] [Clang] Port SSE3, SSSE3 and SSE4 intrinsics to PowerPC
Port existing headers which include x86 intrinsics implementation to
PowerPC platform (using Altivec), along with tests. Also, tests about
including these intrinsic headers are combined.

The headers are mainly developed by Steven Munroe, with contributions
from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu.

Reviewed By: Jinsong Ji

Differential Revision: https://reviews.llvm.org/D65630

llvm-svn: 368392
2019-08-09 03:39:55 +00:00
Bill Wendling 85f07cbb54 Add target requirements for those bots which don't handle x86.
llvm-svn: 368202
2019-08-07 19:36:48 +00:00
Bill Wendling ce29291fc3 Delay diagnosing asm constraints that require immediates until after inlining
Summary:
An inline asm call may result in an immediate input value after inlining.
Therefore, don't emit a diagnostic here if the input isn't an immediate.

Reviewers: joerg, eli.friedman, rsmith

Subscribers: asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, s.egerton, krytarowski, mgorny, riccibruno, eraman, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D60943

llvm-svn: 368104
2019-08-06 22:41:22 +00:00
Guanzhong Chen b3292a8469 [WebAssembly] Lower ASan constructor priority on Emscripten
Summary:
This change gives Emscripten the ability to use more than one constructor
priority that runs before ASan. By convention, constructor priorities 0-100
are reserved for use by the system. ASan on Emscripten now uses priority 50,
leaving plenty of room for use by Emscripten before and after ASan.
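
For readers unfamiliar with constructor priorities, here is a minimal sketch using the GCC/Clang `constructor` attribute. It assumes a toolchain that supports prioritized constructors; the function names are hypothetical, and the priorities 101 and 102 are chosen only because they lie outside the reserved 0-100 range.

```c
#include <assert.h>

static int init_order[2];
static int init_count;

/* Records the order in which the constructors below actually ran. */
static void record_init(int priority) {
    assert(init_count < 2);
    init_order[init_count++] = priority;
}

/* Declared first but runs second: a higher priority number runs later. */
__attribute__((constructor(102))) static void late_init(void) { record_init(102); }

/* Runs first of the two, before main(). */
__attribute__((constructor(101))) static void early_init(void) { record_init(101); }

/* Priority of the n-th constructor that ran. */
int nth_initialized(int n) {
    assert(n >= 0 && n < init_count);
    return init_order[n];
}
```

Lower numbers run earlier, which is why a system component at priority 50 leaves room for other system constructors on either side of it.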

This change is done in response to:
https://github.com/emscripten-core/emscripten/pull/9076#discussion_r310323723

Reviewers: kripken, tlively, aheejin

Reviewed By: tlively

Subscribers: cfe-commits, dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D65684

llvm-svn: 368101
2019-08-06 21:52:58 +00:00
Roger Ferrer Ibanez f686e56e7d Sidestep false positive due to a matching git repository name
I have failures in this test because the grep @b gets confused by the
clang version including a repository name like this

!1 = !{!"clang version 10.0.0 (git@build-machine:llvm/llvm-monorepo.git fe958c0e8c89ec663c8e551936778e2cbb460154)"}

I considered something like grep -w but my understanding of the manpages
was that that isn't super portable. So I think it is easier to make
clang not output that metadata, using -fno-ident.

Differential Revision: https://reviews.llvm.org/D65635

llvm-svn: 367826
2019-08-05 10:09:06 +00:00
Tim Northover a009a60a91 IR: print value numbers for unnamed function arguments
For consistency with normal instructions and clarity when reading IR,
it's best to print the %0, %1, ... names of function arguments in
definitions.

Also modifies the parser to accept IR in that form for obvious reasons.

llvm-svn: 367755
2019-08-03 14:28:34 +00:00
Yonghong Song d0ea05d5ef [BPF] annotate DIType metadata for builtin preserve_array_access_index()
Previously, debuginfo types are annotated to
IR builtin preserve_struct_access_index() and
preserve_union_access_index(), but not
preserve_array_access_index(). The debug info
is useful to identify the root type name which
later will be used for type comparison.

For user access without explicit type conversions,
the previous scheme works as we can ignore intermediate
compiler generated type conversions (e.g., from union types to
union members) and still generate correct access index string.

The issue comes with user explicit type conversions, e.g.,
converting an array to a structure like below:
  struct t { int a; char b[40]; };
  struct p { int c; int d; };
  struct t *var = ...;
  ... __builtin_preserve_access_index(&(((struct p *)&(var->b[0]))->d)) ...
Although the BPF backend can derive the type of &(var->b[0]),
explicit type annotation makes checking more consistent
and less error prone.

Another benefit is for multiple dimension array handling.
For example,
  struct p { int c; int d; } g[8][9][10];
  ... __builtin_preserve_access_index(&g[2][3][4].d) ...
It would be possible to calculate the number of "struct p"'s
before accessing its member "d" if array debug info is
available as it contains each dimension range.
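
The arithmetic for the g[8][9][10] case can be sketched in plain C as follows. The helper names are hypothetical, and the code only models the element count; it does not use the BPF builtin.

```c
#include <assert.h>
#include <stddef.h>

struct p { int c; int d; };
static struct p g[8][9][10];

/* Number of "struct p" elements that precede g[i][j][k], computed from the
   dimension ranges 8 x 9 x 10 (what array debug info would provide). */
size_t elements_before(size_t i, size_t j, size_t k) {
    assert(i < 8 && j < 9 && k < 10);
    return (i * 9 + j) * 10 + k;
}

/* The same quantity measured from the actual object layout. */
size_t elements_before_by_address(size_t i, size_t j, size_t k) {
    assert(i < 8 && j < 9 && k < 10);
    size_t bytes = (size_t)((const char *)&g[i][j][k] - (const char *)g);
    return bytes / sizeof(struct p);
}
```

For &g[2][3][4].d this gives (2 * 9 + 3) * 10 + 4 = 214 elements before the one whose member "d" is accessed.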

This patch annotates the IR builtin preserve_array_access_index()
with the proper debuginfo type. The unit test case and language reference
are updated as well.

Signed-off-by: Yonghong Song <yhs@fb.com>

Differential Revision: https://reviews.llvm.org/D65664

llvm-svn: 367724
2019-08-02 21:28:28 +00:00
Nico Weber 5c2d5f066f Rename two clang tests from .cc to .cpp.
clang/test/lit.cfg.py doesn't list .cc as test extension, so these
tests never ran.

Tweak one of the two tests to actually pass, now that it runs.
(The other one was already passing.)

llvm-svn: 367574
2019-08-01 15:06:57 +00:00
Michael J. Spencer 33703fb9f9 [clang][ARM] Fix msvc arm{64} builtins to use int on LP64 systems.
The `InterlockedX_{acq,nf,rel}` functions deal with 32 bits which is long on
MSVC, but int on most other systems.

This also checks that `ReadStatusRegister` and `WriteStatusRegister` have
the correct type on aarch64-darwin.

Differential Revision: https://reviews.llvm.org/D64164

llvm-svn: 367479
2019-07-31 20:42:28 +00:00
Sanjay Patel 435cdecdf7 [InstCombine] canonicalize fneg before fmul/fdiv
Reverse the canonicalization of fneg relative to fmul/fdiv. That makes it
easier to implement the transforms (and possibly other fneg transforms) in
1 place because we can always start the pattern match from fneg (either the
legacy binop or the new unop).
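
At the source level, the two forms being canonicalized correspond roughly to the sketch below. The function names are hypothetical, and the equivalence is claimed only for non-NaN inputs under the default rounding mode.

```c
#include <assert.h>
#include <math.h>

/* fneg applied after the multiply: -(a * b). */
double neg_after_mul(double a, double b) { return -(a * b); }

/* fneg hoisted before the multiply: (-a) * b. */
double neg_before_mul(double a, double b) { return (-a) * b; }

/* Compares value and sign bit; both inputs are expected to be non-NaN. */
int same_value_and_sign(double x, double y) {
    assert(!isnan(x) && !isnan(y));
    return x == y && !signbit(x) == !signbit(y);
}
```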

There's a secondary practical benefit seen in PR21914 and PR42681:
https://bugs.llvm.org/show_bug.cgi?id=21914
https://bugs.llvm.org/show_bug.cgi?id=42681
...hoisting fneg rather than sinking seems to play nicer with LICM in IR
(although this change may expose analysis holes in the other direction).

1. The instcombine test changes show the expected neutral IR diffs from
   reversing the order.

2. The reassociation tests show that we were missing an optimization
   opportunity to fold away fneg-of-fneg. My reading of IEEE-754 says
   that all of these transforms are allowed (regardless of binop/unop
   fneg version) because:

   "For all other operations [besides copy/abs/negate/copysign], this
   standard does not specify the sign bit of a NaN result."
   In all of these transforms, we always have some other binop
   (fadd/fsub/fmul/fdiv), so we are free to flip the sign bit of a
   potential intermediate NaN operand.
   (If that interpretation is wrong, then we must already have a bug in
   the existing transforms?)

3. The clang tests shouldn't exist as-is, but that's effectively a
   revert of rL367149 (the test broke with an extension of the
   pre-existing fneg canonicalization in rL367146).

Differential Revision: https://reviews.llvm.org/D65399

llvm-svn: 367447
2019-07-31 16:53:22 +00:00
Momchil Velikov a36d31478c [AArch64] Add support for Transactional Memory Extension (TME)
Re-commit r366322 after some fixes

TME is a future architecture technology, documented in

  https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools
  https://developer.arm.com/docs/ddi0601/a

More about the future architectures:

  https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture

This patch adds support for the TME instructions TSTART, TTEST, TCOMMIT, and
TCANCEL and the target feature/arch extension "tme".

It also implements TME builtin functions, defined in ACLE Q2 2019
(https://developer.arm.com/docs/101028/latest)

Differential Revision: https://reviews.llvm.org/D64416

Patch by Javed Absar and Momchil Velikov

llvm-svn: 367428
2019-07-31 12:52:17 +00:00
Sam Elliott 9e6b2e1605 [RISCV] Support 'f' Inline Assembly Constraint
Summary:
This adds the 'f' inline assembly constraint, as supported by GCC. An
'f'-constrained operand is passed in a floating point register. Exactly
which kind of floating-point register (32-bit or 64-bit) is decided
based on the operand type and the available standard extensions (-f and
-d, respectively).

This patch adds support in both the clang frontend, and LLVM itself.

Reviewers: asb, lewis-revill

Reviewed By: asb

Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D65500

llvm-svn: 367403
2019-07-31 09:45:55 +00:00
David Major 027bb52790 [COFF][ARM64] Reorder handling of aarch64 MSVC builtins
In `CodeGenFunction::EmitAArch64BuiltinExpr()`, bulk move all of the aarch64 MSVC-builtin cases to an earlier point in the function (the `// Handle non-overloaded intrinsics first` switch block) in order to avoid an unreachable in `GetNeonType()`. The NEON type-overloading logic is not appropriate for the Windows builtins.

Fixes https://llvm.org/pr42775

Differential Revision: https://reviews.llvm.org/D65403

llvm-svn: 367323
2019-07-30 15:32:49 +00:00
Hideto Ueno cc0a4cdc89 [FunctionAttrs] Annotate "willreturn" for intrinsics
Summary:
In D62801, a new function attribute, `willreturn`, was introduced. In short, a function with `willreturn` is guaranteed to return to the call site (a more precise definition is in LangRef).

In this patch, willreturn is annotated for LLVM intrinsics.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: jvesely, nhaehnle, sstefan1, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64904

llvm-svn: 367184
2019-07-28 06:09:56 +00:00
Petr Hosek 92a2e1bbb9 Revert "[ARM] Set default alignment to 64bits"
This reverts commit r367119.

This broke several bots:

http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/26891/steps/ninja%20check%201/logs/FAIL%3A%20Clang%3A%3Aexception-alignment.cpp
http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/245/consoleFull

llvm-svn: 367166
2019-07-27 01:59:23 +00:00
Leonard Chan 01ba91e6af [NewPM] Run avx*-builtins.c tests under the new pass manager only
This patch changes the following tests to run under the new pass manager only:

```
Clang :: CodeGen/avx512-reduceMinMaxIntrin.c (1 of 4)
Clang :: CodeGen/avx512vl-builtins.c (2 of 4)
Clang :: CodeGen/avx512vlbw-builtins.c (3 of 4)
Clang :: CodeGen/avx512f-builtins.c (4 of 4)
```

The new PM added extra bitcasts that weren't checked before. For
reduceMinMaxIntrin.c, the issue was mostly the alloca's being in a different
order. Other changes involved extra bitcasts, and differently ordered loads and
stores, but the logic should still be the same.

Differential revision: https://reviews.llvm.org/D65110

llvm-svn: 367157
2019-07-26 21:19:37 +00:00
Sanjay Patel c0fc24bb8e [CodeGen] fix test that broke with rL367146
This should be fixed properly to not depend on LLVM (so much).

llvm-svn: 367149
2019-07-26 20:36:57 +00:00
Simi Pallipurath 92363a3ada [ARM] Set default alignment to 64bits
The maximum alignment used by the ARM architecture
is 64 bits, not 128.

Assuming 128 could cause over-aligned memory
accesses for 128-bit NEON vectors, which
have unpredictable behaviour.

This fixes: https://bugs.llvm.org/show_bug.cgi?id=42668

Patch by: Diogo Sampaio(diogo.sampaio@arm.com)

Differential Revision: https://reviews.llvm.org/D65000

Change-Id: I5a62b766491f15dd51e4cfe6625929db897f67e3
llvm-svn: 367119
2019-07-26 15:05:19 +00:00
Leonard Chan 007f674c6a Reland the "[NewPM] Port Sancov" patch from rL365838. No functional
changes were made to the patch since then.

--------

[NewPM] Port Sancov

This patch contains a port of SanitizerCoverage to the new pass manager. This one's a bit hefty.

Changes:

- Split SanitizerCoverageModule into 2: SanitizerCoverage for passing over
  functions and ModuleSanitizerCoverage for passing over modules.
- ModuleSanitizerCoverage exists for adding 2 module level calls to initialization
  functions but only if there's a function that was instrumented by sancov.
- Added legacy and new PM wrapper classes that own instances of the 2 new classes.
- Update llvm tests and add clang tests.

llvm-svn: 367053
2019-07-25 20:53:15 +00:00
JF Bastien dbc0a5df8d Allow prefetching from non-zero address spaces
Summary:
This is useful for targets which have prefetch instructions for non-default address spaces.

<rdar://problem/42662136>

Subscribers: nemanjai, javed.absar, hiraditya, kbarton, jkorous, dexonsmith, cfe-commits, llvm-commits, RKSimon, hfinkel, t.p.northover, craig.topper, anemet

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D65254

llvm-svn: 367032
2019-07-25 16:11:57 +00:00
Sander de Smalen 2b290885d9 [SVE][Inline-Asm] Add support to specify SVE registers in the clobber list
Adds the SVE vector and predicate registers to the list of known registers.

Patch by Kerry McLaughlin.

Reviewers: erichkeane, sdesmalen, rengolin

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D64739

llvm-svn: 366878
2019-07-24 08:42:34 +00:00
Christudasan Devadasan 8c5e6fa657 Updated the signature for some stack related intrinsics (CLANG)
Modified the intrinsics
int_addressofreturnaddress,
int_frameaddress & int_sponentry.
This commit depends on the changes in rL366679

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D64563

llvm-svn: 366683
2019-07-22 12:50:30 +00:00
Yuanfang Chen ff22ec3d70 [Clang] Replace cc1 options '-mdisable-fp-elim' and '-momit-leaf-frame-pointer'
with '-mframe-pointer'

After D56351 and D64294, frame pointer handling is migrated to tri-state
(all, non-leaf, none) in clang driver and on the function attribute.
This patch makes the frame pointer handling cc1 option tri-state.

Reviewers: chandlerc, rnk, t.p.northover, MaskRay

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D56353

llvm-svn: 366645
2019-07-20 22:50:50 +00:00
Guanzhong Chen 5204f7611f [WebAssembly] Compute and export TLS block alignment
Summary:
Add immutable WASM global `__tls_align` which stores the alignment
requirements of the TLS segment.

Add `__builtin_wasm_tls_align()` intrinsic to get this alignment in Clang.

The expected usage has now changed to:

    __wasm_init_tls(memalign(__builtin_wasm_tls_align(),
                             __builtin_wasm_tls_size()));

Reviewers: tlively, aheejin, sbc100, sunfish, alexcrichton

Reviewed By: tlively

Subscribers: dschuff, jgravelle-google, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D65028

llvm-svn: 366624
2019-07-19 23:34:16 +00:00
Teresa Johnson 604f802fd3 [LTO] Always mark regular LTO units with EnableSplitLTOUnit=1
Summary:
Regular LTO modules do not need LTO Unit splitting, only ThinLTO does
(they must be consistently split into regular and Thin units for
optimizations such as whole program devirtualization and lower type
tests). In order to avoid spurious errors from LTO when combining with
split ThinLTO modules, always set this flag for regular LTO modules.

Reviewers: pcc

Subscribers: mehdi_amini, Prazek, inglorion, steven_wu, dexonsmith, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D65009

llvm-svn: 366623
2019-07-19 23:02:58 +00:00
Alex Bradbury e078967adf [RISCV] Hard float ABI support
The RISC-V hard float calling convention requires the frontend to:

* Detect cases where, once "flattened", a struct can be passed using
int+fp or fp+fp registers under the hard float ABI and coerce to the
appropriate type(s)
* Track usage of GPRs and FPRs in order to gate the above, and to
determine when signext/zeroext attributes must be added to integer
scalars

This patch attempts to do this in compliance with the documented ABI,
and uses ABIArgInfo::CoerceAndExpand in order to do this. @rjmccall, as
author of that code I've tagged you as reviewer for initial feedback on
my usage.

Note that a previous version of the ABI indicated that when passing an
int+fp struct using a GPR+FPR, the int would need to be sign or
zero-extended appropriately. GCC never did this and the ABI was changed,
which makes life easier as ABIArgInfo::CoerceAndExpand can't currently
handle sign/zero-extension attributes.

Re-landed after backing out 366450 due to missed hunks.

Differential Revision: https://reviews.llvm.org/D60456

llvm-svn: 366480
2019-07-18 18:29:59 +00:00
Guanzhong Chen 801fa8e6b9 [WebAssembly] Implement __builtin_wasm_tls_base intrinsic
Summary:
Add `__builtin_wasm_tls_base` so that LeakSanitizer can find the thread-local
block and scan through it for memory leaks.

Reviewers: tlively, aheejin, sbc100

Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D64900

llvm-svn: 366475
2019-07-18 17:53:22 +00:00
Alex Bradbury 9b732fe99b Revert "[RISCV] Hard float ABI support" r366450
The commit was missing a few hunks. Will fix and recommit.

llvm-svn: 366454
2019-07-18 16:13:17 +00:00
Alex Bradbury fc3aa2ab48 [RISCV] Hard float ABI support
The RISC-V hard float calling convention requires the frontend to:

* Detect cases where, once "flattened", a struct can be passed using
int+fp or fp+fp registers under the hard float ABI and coerce to the
appropriate type(s)
* Track usage of GPRs and FPRs in order to gate the above, and to
determine when signext/zeroext attributes must be added to integer
scalars

This patch attempts to do this in compliance with the documented ABI,
and uses ABIArgInfo::CoerceAndExpand in order to do this. @rjmccall, as
author of that code I've tagged you as reviewer for initial feedback on
my usage.

Note that a previous version of the ABI indicated that when passing an
int+fp struct using a GPR+FPR, the int would need to be sign or
zero-extended appropriately. GCC never did this and the ABI was changed,
which makes life easier as ABIArgInfo::CoerceAndExpand can't currently
handle sign/zero-extension attributes.

Differential Revision: https://reviews.llvm.org/D60456

llvm-svn: 366450
2019-07-18 15:33:41 +00:00
Qiu Chaofan 03aaef8e72 [PowerPC][Clang] Remove use of malloc in mm_malloc
Remove the dependency on malloc in the implementation of the mm_malloc function
in the PowerPC intrinsics, along with the alignment assumption about glibc.

Reviewed By: Hal Finkel

Differential Revision: https://reviews.llvm.org/D64850

llvm-svn: 366406
2019-07-18 06:20:12 +00:00
Sunil Srivastava 85d667fcb6 Renamed and changed the wording of warn_cconv_ignored
As discussed in D64780 the wording of this warning message is being
changed to say 'is not supported' instead of 'ignored', and the
diag ID itself is being changed to warn_cconv_not_supported.

llvm-svn: 366368
2019-07-17 20:41:26 +00:00
Momchil Velikov 0e2b74a2b0 Revert [AArch64] Add support for Transactional Memory Extension (TME)
This reverts r366322 (git commit 4b8da3a503)

llvm-svn: 366355
2019-07-17 17:43:32 +00:00
Momchil Velikov 4b8da3a503 [AArch64] Add support for Transactional Memory Extension (TME)
TME is a future architecture technology, documented in

https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools
https://developer.arm.com/docs/ddi0601/a

More about the future architectures:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture

This patch adds support for the TME instructions TSTART, TTEST, TCOMMIT, and
TCANCEL and the target feature/arch extension "tme".

It also implements TME builtin functions, defined in ACLE Q2 2019
(https://developer.arm.com/docs/101028/latest)

Patch by Javed Absar and Momchil Velikov

Differential Revision: https://reviews.llvm.org/D64416

llvm-svn: 366322
2019-07-17 13:23:27 +00:00
Guanzhong Chen 42bba4b852 [WebAssembly] Implement thread-local storage (local-exec model)
Summary:
Thread local variables are placed inside a `.tdata` segment. Their symbols are
offsets from the start of the segment. The address of a thread local variable
is computed as `__tls_base` + the offset from the start of the segment.

`.tdata` segment is a passive segment and `memory.init` is used once per thread
to initialize the thread local storage.

`__tls_base` is a wasm global. Since each thread has its own wasm instance,
it is effectively thread local. Currently, `__tls_base` must be initialized
at thread startup, and so cannot be used with dynamic libraries.

`__tls_base` is to be initialized with a new linker-synthesized function,
`__wasm_init_tls`, which takes as an argument a block of memory to use as the
storage for thread locals. It then initializes the block of memory and sets
`__tls_base`. As `__wasm_init_tls` will handle the memory initialization,
the memory does not have to be zeroed.

To help allocate memory for thread-local storage, a new compiler intrinsic
is introduced: `__builtin_wasm_tls_size()`. This intrinsic function returns
the size of the thread-local storage for the current function.

The expected usage is to run something like the following upon thread startup:

    __wasm_init_tls(malloc(__builtin_wasm_tls_size()));
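
The scheme can be modelled on a host machine roughly as below. This is a sketch under stated assumptions, not the linker-synthesized code: every `model_*` name, `struct tdata_image`, and the two example variables are hypothetical stand-ins for `.tdata`, `__tls_base`, `__wasm_init_tls` and `__builtin_wasm_tls_size()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the .tdata segment: initial values of two
   thread-local variables. */
struct tdata_image { int counter; int flags; };
static const struct tdata_image tdata_template = { 7, 42 };

/* Each thread-local symbol is an offset from the start of the segment. */
enum {
    COUNTER_OFFSET = offsetof(struct tdata_image, counter),
    FLAGS_OFFSET = offsetof(struct tdata_image, flags)
};

/* Plays the role of __builtin_wasm_tls_size(). */
size_t model_tls_size(void) { return sizeof tdata_template; }

/* Plays the role of __wasm_init_tls: copies the template into the block
   (as memory.init would) and returns it as this thread's base. */
void *model_init_tls(void *block) {
    assert(block != NULL);
    memcpy(block, &tdata_template, sizeof tdata_template);
    return block;
}

/* Address of a thread local = base + offset from the start of the segment. */
int *model_tls_address(void *tls_base, size_t offset) {
    assert(offset + sizeof(int) <= model_tls_size());
    return (int *)((unsigned char *)tls_base + offset);
}

/* Per-thread startup, mirroring the expected usage line above. */
void *model_thread_start(void) {
    return model_init_tls(malloc(model_tls_size()));
}
```

Two calls to `model_thread_start` yield independent blocks, so a write through one base is not visible through the other, which is the property the per-instance `__tls_base` global provides.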

Reviewers: tlively, aheejin, kripken, sbc100

Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D64537

llvm-svn: 366272
2019-07-16 22:00:45 +00:00
Yonghong Song 4754814c5a fix unnamed bitfield issue and add tests for __builtin_preserve_access_index intrinsic
The original commit is r366076. It is temporarily reverted (r366155)
due to test failure. This resubmit makes test more robust by accepting
regex instead of hardcoded names/references in several places.

This is a followup patch for https://reviews.llvm.org/D61809.
Handle unnamed bitfield properly and add more test cases.

Fixed the unnamed bitfield issue. The unnamed bitfield is ignored
by debug info, so we need to ignore such a struct/union member
when we try to get the member index in the debug info.
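
The index adjustment can be pictured with the sketch below. It is a simplified model, not the clang implementation: `struct member_desc` and `debug_info_index` are hypothetical, and an unnamed bitfield is represented by a NULL name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one struct/union member. */
struct member_desc {
    const char *name;  /* NULL for an unnamed bitfield */
};

/* Maps a declaration-order member index to its index among the members that
   debug info records, i.e. with unnamed bitfields skipped.
   Returns -1 if the member itself is an unnamed bitfield. */
int debug_info_index(const struct member_desc *members, size_t count,
                     size_t decl_index) {
    assert(decl_index < count);
    if (members[decl_index].name == NULL)
        return -1;
    int index = 0;
    for (size_t i = 0; i < decl_index; i++) {
        if (members[i].name != NULL)
            index++;
    }
    return index;
}
```

For `struct s { int a; int : 4; int b; }`, member `b` is third in declaration order but second among the members that debug info keeps.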

D61809 contains two test cases, but they are not enough: they do
not check the generated IR at a fine-grained level, and
there are no semantic-checking tests.
This patch added unit tests for both code gen and semantics checking for
the new intrinsic.

Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 366231
2019-07-16 17:24:33 +00:00
Kyrylo Tkachov eb72138340 [AArch64] Implement __jcvt intrinsic from Armv8.3-A
The jcvt intrinsic defined in ACLE [1] is available when __ARM_FEATURE_JCVT is defined.

This change introduces the AArch64 intrinsic, wires it up to the instruction and a new clang builtin function.
The __ARM_FEATURE_JCVT macro is now defined when an Armv8.3-A or higher target is used.
I've implemented the target detection logic in Clang so that this feature is enabled for architectures from armv8.3-a onwards (so -march=armv8.4-a also enables this, for example).

make check-all didn't show any new failures.

[1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics

Differential Revision: https://reviews.llvm.org/D64495

llvm-svn: 366197
2019-07-16 09:27:39 +00:00
Stephan Bergmann e215996a29 Finish "Adapt -fsanitize=function to SANITIZER_NON_UNIQUE_TYPEINFO"
i.e., recent 5745eccef54ddd3caca278d1d292a88b2281528b:

* Bump the function_type_mismatch handler version, as its signature has changed.

* The function_type_mismatch handler can return successfully now, so
  SanitizerKind::Function must be AlwaysRecoverable (like for
  SanitizerKind::Vptr).

* But the minimal runtime would still unconditionally treat a call to the
  function_type_mismatch handler as failure, so disallow -fsanitize=function in
  combination with -fsanitize-minimal-runtime (like it was already done for
  -fsanitize=vptr).

* Add tests.

Differential Revision: https://reviews.llvm.org/D61479

llvm-svn: 366186
2019-07-16 06:23:27 +00:00
Eric Christopher fdcbd5fa48 Temporarily Revert "fix unnamed bitfield issue and add tests for __builtin_preserve_access_index intrinsic"
The commit had tests that would only work with names in the IR.

This reverts commit r366076.

llvm-svn: 366155
2019-07-15 23:49:31 +00:00
Leonard Chan bb147aabc6 Revert "[NewPM] Port Sancov"
This reverts commit 5652f35817.

llvm-svn: 366153
2019-07-15 23:18:31 +00:00
Evgeniy Stepanov c5e7f56249 ARM MTE stack sanitizer.
Add "memtag" sanitizer that detects and mitigates stack memory issues
using armv8.5 Memory Tagging Extension.

It is similar in principle to HWASan, which is a software implementation
of the same idea, but there are enough differences to warrant a new
sanitizer type IMHO. It is also expected to have very different
performance properties.

The new sanitizer does not have a runtime library (it may grow one
later, along with a "debugging" mode). Similar to SafeStack and
StackProtector, the instrumentation pass (in a follow up change) will be
inserted in all cases, but will only affect functions marked with the
new sanitize_memtag attribute.

Reviewers: pcc, hctim, vitalybuka, ostannard

Subscribers: srhines, mehdi_amini, javed.absar, kristof.beyls, hiraditya, cryptoad, steven_wu, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D64169

llvm-svn: 366123
2019-07-15 20:02:23 +00:00
Yonghong Song e5086481b6 fix unnamed bitfield issue and add tests for __builtin_preserve_access_index intrinsic
This is a followup patch for https://reviews.llvm.org/D61809.
Handle unnamed bitfield properly and add more test cases.

Fixed the unnamed bitfield issue. The unnamed bitfield is ignored
by debug info, so we need to ignore such a struct/union member
when we try to get the member index in the debug info.

D61809 contains two test cases, but they are not enough: they do not
check the generated IR at a fine-grained level, and there are no
semantic checking tests.
This patch added unit tests for both code gen and semantics checking for
the new intrinsic.

Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 366076
2019-07-15 15:42:41 +00:00
Fangrui Song 6bd02a442c [PowerPC] Support -mabi=ieeelongdouble and -mabi=ibmlongdouble
gcc PowerPC supports 3 representations of long double:

* -mlong-double-64

  long double has the same representation as double but is mangled as `e`.
  In clang, this is the default on AIX, FreeBSD and Linux musl.

* -mlong-double-128

  2 possible 128-bit floating point representations:

  + -mabi=ibmlongdouble
    IBM extended double format. Mangled as `g`
    In clang, this is the default on Linux glibc.
  + -mabi=ieeelongdouble
    IEEE 754 quadruple-precision format. Mangled as `u9__ieee128` (`U10__float128` before gcc 8.2)
    This is currently unavailable.

This patch adds -mabi=ibmlongdouble and -mabi=ieeelongdouble, and thus
makes the IEEE 754 quadruple-precision long double available for
languages supported by clang.
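
As a rough illustration of how code could tell these representations apart, the sketch below classifies a mantissa width such as the one reported by LDBL_MANT_DIG. The enum and function names are hypothetical and not part of this patch; the widths (53, 64, 106, 113 bits) are the usual values for the formats listed above.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical names for the long double representations discussed above. */
enum long_double_format {
    LD_FMT_DOUBLE,        /* -mlong-double-64: same representation as double */
    LD_FMT_X87_EXTENDED,  /* 80-bit x87 format (x86 only, listed for contrast) */
    LD_FMT_IBM_EXTENDED,  /* -mabi=ibmlongdouble: IBM extended double */
    LD_FMT_IEEE_QUAD,     /* -mabi=ieeelongdouble: IEEE 754 quadruple precision */
    LD_FMT_UNKNOWN
};

/* Classify a format by its mantissa width in bits. */
enum long_double_format classify_long_double(int mant_dig) {
    assert(mant_dig > 0);
    switch (mant_dig) {
    case 53:  return LD_FMT_DOUBLE;
    case 64:  return LD_FMT_X87_EXTENDED;
    case 106: return LD_FMT_IBM_EXTENDED;
    case 113: return LD_FMT_IEEE_QUAD;
    default:  return LD_FMT_UNKNOWN;
    }
}

/* Format of long double for the current compilation. */
enum long_double_format current_long_double_format(void) {
    return classify_long_double(LDBL_MANT_DIG);
}
```

Calling current_long_double_format() in a translation unit built with each option is one way to check which representation was selected.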

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D64283

llvm-svn: 366044
2019-07-15 07:25:11 +00:00
Alexandros Lamprineas 24cacf9c56 [clang][Driver][ARM] Favor -mfpu over default CPU features
When processing the command line options march, mcpu and mfpu, we store
the implied target features in a vector. The change D62998 introduced a
temporary vector, where the processed features get accumulated. When
calling DecodeARMFeaturesFromCPU, which sets the default features for
the specified CPU, we certainly don't want to override the features
that have been explicitly specified on the command line. Therefore, the
default features should appear first in the final vector. This problem
became evident once I added the missing (unhandled) target features in
ARM::getExtensionFeatures.
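
The ordering argument can be sketched with a small model. This is not the driver code: feature_enabled is a hypothetical helper, and it assumes that entries are "+name" or "-name" strings where a later entry overrides an earlier one for the same feature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Report whether `name` ends up enabled in a feature vector whose entries
   look like "+vfp4" or "-vfp4". Later entries override earlier ones, so
   CPU defaults must come first for explicit options to win. */
bool feature_enabled(const char *const *features, size_t count,
                     const char *name) {
    assert(features != NULL && name != NULL);
    bool enabled = false;
    for (size_t i = 0; i < count; ++i) {
        const char *f = features[i];
        if ((f[0] == '+' || f[0] == '-') && strcmp(f + 1, name) == 0)
            enabled = (f[0] == '+');
    }
    return enabled;
}
```

With the defaults first, as in {"+vfp4", "+neon", "-vfp4"}, the explicit "-vfp4" takes effect. If the defaults were appended last, they would silently re-enable the feature.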

Differential Revision: https://reviews.llvm.org/D63936

llvm-svn: 366027
2019-07-14 18:32:42 +00:00
Ulrich Weigand b98bf60ef7 [SystemZ] Add support for new cpu architecture - arch13
This patch series adds support for the next-generation arch13
CPU architecture to the SystemZ backend.

This includes:
- Basic support for the new processor and its features.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining __VEC__ == 10303.

Note: No currently available Z system supports the arch13
architecture.  Once new systems become available, the
official system name will be added as a supported -march name.

llvm-svn: 365933
2019-07-12 18:14:51 +00:00
Fangrui Song c46d78d1b7 [X86][PowerPC] Support -mlong-double-128
This patch makes the driver option -mlong-double-128 available for X86
and PowerPC. The CC1 option -mlong-double-128 is available on all targets
for users to test on unsupported targets.

On PowerPC, -mlong-double-128 uses the IBM extended double format
because we don't support -mabi=ieeelongdouble yet (D64283).

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D64277

llvm-svn: 365866
2019-07-12 02:32:15 +00:00
Leonard Chan 5652f35817 [NewPM] Port Sancov
This patch contains a port of SanitizerCoverage to the new pass manager. This one's a bit hefty.

Changes:

- Split SanitizerCoverageModule into 2 classes: SanitizerCoverage for passing over
  functions and ModuleSanitizerCoverage for passing over modules.
- ModuleSanitizerCoverage exists for adding 2 module level calls to initialization
  functions but only if there's a function that was instrumented by sancov.
- Added legacy and new PM wrapper classes that own instances of the 2 new classes.
- Update llvm tests and add clang tests.

Differential Revision: https://reviews.llvm.org/D62888

llvm-svn: 365838
2019-07-11 22:35:40 +00:00
Benjamin Kramer 3b5e60b695 [CodeGen] NVPTX: Switch from atomic.load.add.f32 to atomicrmw fadd
llvm-svn: 365798
2019-07-11 17:44:11 +00:00
Vedant Kumar 31c4d2a40d [CGDebugInfo] Fix -femit-debug-entry-values crash on os_log_helpers
An os_log_helper FunctionDecl may not have a body. Ignore these for the
purposes of debug entry value emission.

Fixes an assertion failure seen in a stage2 build of clang:

Assertion failed: (FD->hasBody() && "Functions must have body here"),
function analyzeParametersModification

llvm-svn: 365716
2019-07-11 00:09:16 +00:00
Vitaly Buka e26398849d CodeGen, NFC: Add test to track emitStoresForConstant behavior
Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D64385

llvm-svn: 365706
2019-07-10 22:47:07 +00:00
Craig Topper caf6b71ab2 [X86] Change the IR sequence for _mm_storeh_pi and _mm_storel_pi to perform the store as a <2 x float> instead of i64.
This is similar to what we do for loadl_pi and loadh_pi.

llvm-svn: 365669
2019-07-10 17:11:29 +00:00
Craig Topper f9cb127ca9 [X86] Add guards to some of the x86 intrinsic tests to skip 64-bit mode only intrinsics when compiled for 32-bit mode.
All the command lines are for 64-bit mode, but sometimes I compile
the tests in 32-bit mode to see what assembly we get and we need
to skip these to do that.

llvm-svn: 365668
2019-07-10 17:11:23 +00:00
Diogo N. Sampaio 71cac61d01 [AArch64] Fix vector vuqadd intrinsics operands
Summary:
Change the vuqadd vector intrinsics to have the second argument as unsigned values, not signed,
according to https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics

Reviewers: LukeCheeseman, ostannard

Reviewed By: ostannard

Subscribers: javed.absar, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D64211

llvm-svn: 365609
2019-07-10 09:58:51 +00:00
Diogo N. Sampaio 3490aab63a [NFC][AArch64] Fix vector vqtb[lx][1-4]_s8 operand
Summary:
Change the vqtb[lx][1-4]_s8 intrinsics to have the last argument as a vector of unsigned values, not
signed, according to https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics

Reviewers: LukeCheeseman, DavidSpickett

Reviewed By: DavidSpickett

Subscribers: DavidSpickett, javed.absar, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D64243

llvm-svn: 365598
2019-07-10 08:16:49 +00:00
Reid Kleckner 4586a19da8 [MS] Treat ignored explicit calling conventions as an explicit __cdecl
The CCCR_Ignore action is only used for Microsoft calling conventions,
mainly because MSVC does not warn when a calling convention would be
ignored by the current target. This behavior is actually somewhat
important, since windows.h uses WINAPI (which expands to __stdcall)
widely. This distinction didn't matter much before the introduction of
__vectorcall to x64 and the ability to make that the default calling
convention with /Gv. Now, we can't just ignore __stdcall for x64, we
have to treat it as an explicit __cdecl annotation.

Fixes PR42531

llvm-svn: 365579
2019-07-09 23:17:43 +00:00
Aaron Ballman b1e511bf5a Ignore trailing NullStmts in StmtExprs for GCC compatibility.
Ignore trailing NullStmts in compound expressions when determining the result type and value. This is to match the GCC behavior which ignores semicolons at the end of compound expressions.
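
A minimal sketch of the construct in question, assuming a compiler that supports GNU statement expressions (this is an extension, not ISO C). The function name is hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Returns x + 1 through a GNU statement expression. The extra ';' after
   "y + 1;" is an empty statement; with the behavior described above it is
   ignored, so the value of the whole expression is still y + 1. */
int stmt_expr_value(int x) {
    assert(x < INT_MAX); /* avoid signed overflow in x + 1 */
    return ({ int y = x; y + 1;; });
}
```

Without this handling, the trailing empty statement would be taken as the last statement, and the statement expression would have type void.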

Patch by Dominic Ferreira.

llvm-svn: 365498
2019-07-09 15:02:07 +00:00
Yonghong Song 048493f882 [BPF] Preserve debuginfo array/union/struct type/access index
For background of BPF CO-RE project, please refer to
  http://vger.kernel.org/bpfconf2019.html
In summary, BPF CO-RE intends to compile bpf programs
adjustable on struct/union layout change so the same
program can run on multiple kernels with adjustment
before loading based on native kernel structures.

In order to do this, we need to keep track of GEP (getelementptr)
instruction base and result debuginfo types, so we
can adjust on the host based on kernel BTF info.
Capturing such information as an IR optimization is hard,
as various optimizations may have tweaked the GEP; also, because
a union is replaced by a structure in the IR, it is impossible to track
the field index for union member accesses.

Three intrinsic functions, preserve_{array,union,struct}_access_index,
are introduced.
  addr = preserve_array_access_index(base, index, dimension)
  addr = preserve_union_access_index(base, di_index)
  addr = preserve_struct_access_index(base, gep_index, di_index)
here,
  base: the base pointer for the array/union/struct access.
  index: the last access index for array, the same for IR/DebugInfo layout.
  dimension: the array dimension.
  gep_index: the access index based on IR layout.
  di_index: the access index based on user/debuginfo types.

If using these intrinsics blindly, i.e., transforming all GEPs
to these intrinsics and later on reducing them to GEPs, we have
seen up to 7% more instructions generated. To avoid such an overhead,
a clang builtin is proposed:
  base = __builtin_preserve_access_index(base)
such that user wraps to-be-relocated GEPs in this builtin
and preserve_*_access_index intrinsics only apply to
those GEPs. Such a buy-in will prevent performance degradation
if people do not use CO-RE, even for programs which use
bpf_probe_read().

For example, given the following program,
  $ cat test.c
  struct sk_buff {
     int i;
     int b1:1;
     int b2:2;
     union {
       struct {
         int o1;
         int o2;
       } o;
       struct {
         char flags;
         char dev_id;
       } dev;
       int netid;
     } u[10];
  };

  static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr)
      = (void *) 4;

  #define _(x) (__builtin_preserve_access_index(x))

  int bpf_prog(struct sk_buff *ctx) {
    char dev_id;
    bpf_probe_read(&dev_id, sizeof(char), _(&ctx->u[5].dev.dev_id));
    return dev_id;
  }
  $ clang -target bpf -O2 -g -emit-llvm -S -mllvm -print-before-all \
    test.c >& log

The generated IR looks like below:
  ...
  define dso_local i32 @bpf_prog(%struct.sk_buff*) #0 !dbg !15 {
    %2 = alloca %struct.sk_buff*, align 8
    %3 = alloca i8, align 1
    store %struct.sk_buff* %0, %struct.sk_buff** %2, align 8, !tbaa !45
    call void @llvm.dbg.declare(metadata %struct.sk_buff** %2, metadata !43, metadata !DIExpression()), !dbg !49
    call void @llvm.lifetime.start.p0i8(i64 1, i8* %3) #4, !dbg !50
    call void @llvm.dbg.declare(metadata i8* %3, metadata !44, metadata !DIExpression()), !dbg !51
    %4 = load i32 (i8*, i32, i8*)*, i32 (i8*, i32, i8*)** @bpf_probe_read, align 8, !dbg !52, !tbaa !45
    %5 = load %struct.sk_buff*, %struct.sk_buff** %2, align 8, !dbg !53, !tbaa !45
    %6 = call [10 x %union.anon]* @llvm.preserve.struct.access.index.p0a10s_union.anons.p0s_struct.sk_buffs(
         %struct.sk_buff* %5, i32 2, i32 3), !dbg !53, !llvm.preserve.access.index !19
    %7 = call %union.anon* @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(
         [10 x %union.anon]* %6, i32 1, i32 5), !dbg !53
    %8 = call %union.anon* @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(
         %union.anon* %7, i32 1), !dbg !53, !llvm.preserve.access.index !26
    %9 = bitcast %union.anon* %8 to %struct.anon.0*, !dbg !53
    %10 = call i8* @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(
         %struct.anon.0* %9, i32 1, i32 1), !dbg !53, !llvm.preserve.access.index !34
    %11 = call i32 %4(i8* %3, i32 1, i8* %10), !dbg !52
    %12 = load i8, i8* %3, align 1, !dbg !54, !tbaa !55
    %13 = sext i8 %12 to i32, !dbg !54
    call void @llvm.lifetime.end.p0i8(i64 1, i8* %3) #4, !dbg !56
    ret i32 %13, !dbg !57
  }

  !19 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "sk_buff", file: !3, line: 1, size: 704, elements: !20)
  !26 = distinct !DICompositeType(tag: DW_TAG_union_type, scope: !19, file: !3, line: 5, size: 64, elements: !27)
  !34 = distinct !DICompositeType(tag: DW_TAG_structure_type, scope: !26, file: !3, line: 10, size: 16, elements: !35)

Note that @llvm.preserve.{struct,union}.access.index calls have metadata llvm.preserve.access.index
attached to instructions to provide struct/union debuginfo type information.

For &ctx->u[5].dev.dev_id,
  . The "%6 = ..." represents struct member "u" with index 2 for IR layout and index 3 for DI layout.
  . The "%7 = ..." represents array subscript "5".
  . The "%8 = ..." represents union member "dev" with index 1 for DI layout.
  . The "%10 = ..." represents struct member "dev_id" with index 1 for both IR and DI layout.

Basically, by traversing the use-def chain recursively for the 3rd argument of bpf_probe_read() and
examining all preserve_*_access_index calls, the debuginfo struct/union/array access indices
can be recovered.

The intrinsics also contain enough information to regenerate code for IR layout.
For array and structure intrinsics, the proper GEP can be constructed.
For union intrinsics, replacing all uses of "addr" with "base" should be enough.
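
To make the access chain concrete, the sketch below rebuilds the byte offset of &ctx->u[5].dev.dev_id one step at a time, in the same order as the preserve_*_access_index calls. It is plain host C, not BPF code, and it does not use the new builtin. The tags u_elem, o_pair and dev_info and the function dev_id_offset are hypothetical names, added only so that offsetof and sizeof can refer to the pieces.

```c
#include <assert.h>
#include <stddef.h>

/* Same members as the sk_buff example; only the inner tags are new. */
struct o_pair { int o1; int o2; };
struct dev_info { char flags; char dev_id; };
union u_elem {
    struct o_pair o;
    struct dev_info dev;
    int netid;
};
struct sk_buff {
    int i;
    int b1 : 1;
    int b2 : 2;
    union u_elem u[10];
};

/* Byte offset of ctx->u[index].dev.dev_id, built one access step at a
   time from the layout of the structures on this host. */
size_t dev_id_offset(size_t index) {
    assert(index < 10);
    size_t off = offsetof(struct sk_buff, u);   /* struct member "u" */
    off += index * sizeof(union u_elem);        /* array subscript */
    off += offsetof(union u_elem, dev);         /* union member "dev" */
    off += offsetof(struct dev_info, dev_id);   /* struct member "dev_id" */
    return off;
}
```

If a different kernel lays out sk_buff differently, each step yields a different number while the chain of member accesses stays the same, which is the information the intrinsics are meant to preserve.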

Signed-off-by: Yonghong Song <yhs@fb.com>

Differential Revision: https://reviews.llvm.org/D61809

llvm-svn: 365438
2019-07-09 04:21:50 +00:00
Yonghong Song e085b40e9c Revert "[BPF] Preserve debuginfo array/union/struct type/access index"
This reverts commit r365435.

Forgot adding the Differential Revision link. Will add to the
commit message and resubmit.

llvm-svn: 365436
2019-07-09 04:15:12 +00:00
Yonghong Song f21eeafcd9 [BPF] Preserve debuginfo array/union/struct type/access index
For background of BPF CO-RE project, please refer to
  http://vger.kernel.org/bpfconf2019.html
In summary, BPF CO-RE intends to compile bpf programs
adjustable on struct/union layout change so the same
program can run on multiple kernels with adjustment
before loading based on native kernel structures.

In order to do this, we need to keep track of GEP (getelementptr)
instruction base and result debuginfo types, so we
can adjust on the host based on kernel BTF info.
Capturing such information as an IR optimization is hard,
as various optimizations may have tweaked the GEP; also, because
a union is replaced by a structure in the IR, it is impossible to track
the field index for union member accesses.

Three intrinsic functions, preserve_{array,union,struct}_access_index,
are introduced.
  addr = preserve_array_access_index(base, index, dimension)
  addr = preserve_union_access_index(base, di_index)
  addr = preserve_struct_access_index(base, gep_index, di_index)
here,
  base: the base pointer for the array/union/struct access.
  index: the last access index for array, the same for IR/DebugInfo layout.
  dimension: the array dimension.
  gep_index: the access index based on IR layout.
  di_index: the access index based on user/debuginfo types.

If using these intrinsics blindly, i.e., transforming all GEPs
to these intrinsics and later on reducing them to GEPs, we have
seen up to 7% more instructions generated. To avoid such an overhead,
a clang builtin is proposed:
  base = __builtin_preserve_access_index(base)
such that user wraps to-be-relocated GEPs in this builtin
and preserve_*_access_index intrinsics only apply to
those GEPs. Such a buy-in will prevent performance degradation
if people do not use CO-RE, even for programs which use
bpf_probe_read().

For example, given the following program,
  $ cat test.c
  struct sk_buff {
     int i;
     int b1:1;
     int b2:2;
     union {
       struct {
         int o1;
         int o2;
       } o;
       struct {
         char flags;
         char dev_id;
       } dev;
       int netid;
     } u[10];
  };

  static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr)
      = (void *) 4;

  #define _(x) (__builtin_preserve_access_index(x))

  int bpf_prog(struct sk_buff *ctx) {
    char dev_id;
    bpf_probe_read(&dev_id, sizeof(char), _(&ctx->u[5].dev.dev_id));
    return dev_id;
  }
  $ clang -target bpf -O2 -g -emit-llvm -S -mllvm -print-before-all \
    test.c >& log

The generated IR looks like below:
  ...
  define dso_local i32 @bpf_prog(%struct.sk_buff*) #0 !dbg !15 {
    %2 = alloca %struct.sk_buff*, align 8
    %3 = alloca i8, align 1
    store %struct.sk_buff* %0, %struct.sk_buff** %2, align 8, !tbaa !45
    call void @llvm.dbg.declare(metadata %struct.sk_buff** %2, metadata !43, metadata !DIExpression()), !dbg !49
    call void @llvm.lifetime.start.p0i8(i64 1, i8* %3) #4, !dbg !50
    call void @llvm.dbg.declare(metadata i8* %3, metadata !44, metadata !DIExpression()), !dbg !51
    %4 = load i32 (i8*, i32, i8*)*, i32 (i8*, i32, i8*)** @bpf_probe_read, align 8, !dbg !52, !tbaa !45
    %5 = load %struct.sk_buff*, %struct.sk_buff** %2, align 8, !dbg !53, !tbaa !45
    %6 = call [10 x %union.anon]* @llvm.preserve.struct.access.index.p0a10s_union.anons.p0s_struct.sk_buffs(
         %struct.sk_buff* %5, i32 2, i32 3), !dbg !53, !llvm.preserve.access.index !19
    %7 = call %union.anon* @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(
         [10 x %union.anon]* %6, i32 1, i32 5), !dbg !53
    %8 = call %union.anon* @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(
         %union.anon* %7, i32 1), !dbg !53, !llvm.preserve.access.index !26
    %9 = bitcast %union.anon* %8 to %struct.anon.0*, !dbg !53
    %10 = call i8* @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(
         %struct.anon.0* %9, i32 1, i32 1), !dbg !53, !llvm.preserve.access.index !34
    %11 = call i32 %4(i8* %3, i32 1, i8* %10), !dbg !52
    %12 = load i8, i8* %3, align 1, !dbg !54, !tbaa !55
    %13 = sext i8 %12 to i32, !dbg !54
    call void @llvm.lifetime.end.p0i8(i64 1, i8* %3) #4, !dbg !56
    ret i32 %13, !dbg !57
  }

  !19 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "sk_buff", file: !3, line: 1, size: 704, elements: !20)
  !26 = distinct !DICompositeType(tag: DW_TAG_union_type, scope: !19, file: !3, line: 5, size: 64, elements: !27)
  !34 = distinct !DICompositeType(tag: DW_TAG_structure_type, scope: !26, file: !3, line: 10, size: 16, elements: !35)

Note that @llvm.preserve.{struct,union}.access.index calls have metadata llvm.preserve.access.index
attached to instructions to provide struct/union debuginfo type information.

For &ctx->u[5].dev.dev_id,
  . The "%6 = ..." represents struct member "u" with index 2 for IR layout and index 3 for DI layout.
  . The "%7 = ..." represents array subscript "5".
  . The "%8 = ..." represents union member "dev" with index 1 for DI layout.
  . The "%10 = ..." represents struct member "dev_id" with index 1 for both IR and DI layout.

Basically, by traversing the use-def chain recursively for the 3rd argument of bpf_probe_read() and
examining all preserve_*_access_index calls, the debuginfo struct/union/array access indices
can be recovered.

The intrinsics also contain enough information to regenerate code for IR layout.
For array and structure intrinsics, the proper GEP can be constructed.
For union intrinsics, replacing all uses of "addr" with "base" should be enough.

Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 365435
2019-07-09 04:04:21 +00:00
Fangrui Song 11cb39c5fc [X86][PPC] Support -mlong-double-64
-mlong-double-64 is supported on some ports of gcc (i386, x86_64, and ppc{32,64}).
On many other targets, there will be an error:

    error: unrecognized command line option '-mlong-double-64'

This patch makes the driver option -mlong-double-64 available for x86
and ppc. The CC1 option -mlong-double-64 is available on all targets for
users to test on unsupported targets.

LongDoubleSize is added as a VALUE_LANGOPT so that the option can be
shared with -mlong-double-128 when we support it in clang.

Also, make powerpc*-linux-musl default to use 64-bit long double. It is
currently the only supported ABI on musl and is also how people
configure powerpc*-linux-musl-gcc.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D64067

llvm-svn: 365412
2019-07-09 00:27:43 +00:00
Alex Bradbury 77d4a8f9f7 [RISCV] Specify registers used for exception handling
Implements the handling of __builtin_eh_return_regno().

Differential Revision: https://reviews.llvm.org/D63417
Patch by Edward Jones.

llvm-svn: 365305
2019-07-08 09:38:06 +00:00
Diogo N. Sampaio 4ec445b813 [AArch64] Fix scalar vuqadd intrinsics operands
Summary:
Change the vuqadd scalar intrinsics to have the second argument as unsigned values, not signed,
according to https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics

So now the compiler correctly warns that an undefined negative float conversion is being done.

Reviewers: LukeCheeseman, john.brawn

Reviewed By: john.brawn

Subscribers: john.brawn, javed.absar, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D64242

llvm-svn: 365300
2019-07-08 08:47:47 +00:00
Diogo N. Sampaio 0464e07c8f [AArch64] Fix vsqadd scalar intrinsics operands
Summary:
Change the vsqadd scalar intrinsics to have the second argument as signed values, not unsigned,
according to https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics

The existing unsigned argument can cause faulty code as negative float to unsigned conversion is
undefined, which llvm/clang optimizes away.

Reviewers: LukeCheeseman, john.brawn

Reviewed By: john.brawn

Subscribers: john.brawn, javed.absar, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D64239

llvm-svn: 365298
2019-07-08 08:35:05 +00:00
Richard Smith 9e52c43090 Treat the range of representable values of floating-point types as [-inf, +inf] not as [-max, +max].
Summary:
Prior to r329065, we used [-max, max] as the range of representable
values because LLVM's `fptrunc` did not guarantee defined behavior when
truncating from a larger floating-point type to a smaller one. Now that
has been fixed, we can make clang follow normal IEEE 754 semantics in this
regard and take the larger range [-inf, +inf] as the range of representable
values.

In practice, this affects two parts of the frontend:
 * the constant evaluator no longer treats floating-point evaluations
   that result in +-inf as being undefined (because they no longer leave
   the range of representable values of the type)
 * UBSan no longer treats conversions to floating-point type that are
   outside the [-max, +max] range as being undefined

In passing, also remove the float-divide-by-zero sanitizer from
-fsanitize=undefined, on the basis that while it's undefined per C++
rules (and we disallow it in constant expressions for that reason), it
is defined by Clang / LLVM / IEEE 754.
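
The two cases can be illustrated with a short sketch, assuming an IEEE 754 target with the default rounding mode and without fast-math options. The function names are hypothetical.

```c
#include <assert.h>
#include <float.h>

/* Narrow a double to float. Under IEEE 754 semantics a finite value that
   is too large for float becomes an infinity, which lies inside the
   [-inf, +inf] range of representable values. */
float narrow_to_float(double value) {
    assert(value == value); /* reject NaN to keep the example simple */
    return (float)value;
}

/* Floating-point division. Dividing a nonzero finite value by zero
   yields an infinity under IEEE 754. */
double divide(double num, double den) {
    return num / den;
}
```

Under the semantics described above, neither narrow_to_float(1e300) nor divide(1.0, 0.0) is reported by -fsanitize=undefined; the first no longer counts as an out-of-range conversion, and the second check is no longer part of that group.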

Reviewers: rnk, BillyONeal

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D63793

llvm-svn: 365272
2019-07-06 21:05:52 +00:00
Fangrui Song 1f333562de [PowerPC] Support constraint code "ww"
Summary:
"ww" and "ws" are both constraint codes for VSX vector registers that
hold scalar double data. "ww" is preferred for float while "ws" is
preferred for double.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D64119

llvm-svn: 365106
2019-07-04 04:44:42 +00:00
Djordje Todorovic 0f65168566 [clang] Add DISubprogram and DIE for a func decl
Attach a unique DISubprogram to a function declaration that will be
used for call site debug info.

([7/13] Introduce the debug entry values.)

Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>

Differential Revision: https://reviews.llvm.org/D60714

llvm-svn: 364502
2019-06-27 06:44:44 +00:00
Aaron Puchert b207baeb28 [Clang] Remove unused -split-dwarf and obsolete -enable-split-dwarf
Summary:
The changes in D59673 made the choice redundant, since we can achieve
single-file split DWARF just by not setting an output file name.
Like llc we can also derive whether to enable Split DWARF from whether
-split-dwarf-file is set, so we don't need the flag at all anymore.

The test CodeGen/split-debug-filename.c distinguished between having set
or not set -enable-split-dwarf with -split-dwarf-file, but we can
probably just always emit the metadata into the IR.

The flag -split-dwarf wasn't used at all anymore.

Reviewers: dblaikie, echristo

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D63167

llvm-svn: 364479
2019-06-26 21:36:35 +00:00
Djordje Todorovic ed05d49aad [clang/DIVar] Emit the flag for params that have unmodified value
Emit the debug info flag that indicates that a parameter has unchanged
value throughout a function.

([5/13] Introduce the debug entry values.)

Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>

Differential Revision: https://reviews.llvm.org/D58035

llvm-svn: 364424
2019-06-26 13:32:02 +00:00
Simon Tatham e8de8ba6a6 [ARM] Support inline assembler constraints for MVE.
"To" selects an odd-numbered GPR, and "Te" an even one. There are some
8.1-M instructions that have one too few bits in their register fields
and require registers of particular parity, without necessarily using
a consecutive even/odd pair.

Also, the constraint letter "t" should select an MVE q-register, when
MVE is present. This didn't need any source changes, but some extra
tests have been added.

Reviewers: dmgreen, samparker, SjoerdMeijer

Subscribers: javed.absar, eraman, kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D60709

llvm-svn: 364331
2019-06-25 16:49:32 +00:00
Leonard Chan f948f6b862 [clang][NewPM] Remove exception handling before loading pgo sample profile data
This patch ensures that SimplifyCFGPass comes before SampleProfileLoaderPass
on PGO runs in the new PM and fixes clang/test/CodeGen/pgo-sample.c.

Differential Revision: https://reviews.llvm.org/D63626

llvm-svn: 364201
2019-06-24 16:44:27 +00:00
Richard Smith 1fa07ebd92 Fix TBAA representation for zero-sized fields and unnamed bit-fields.
Unnamed bit-fields should not be represented in the TBAA metadata
because they do not represent storage fields (they only affect layout).

Zero-sized fields should not be represented in the TBAA metadata
because by definition they have no associated storage (so we will never
emit a load or store through them), and they might not appear in
declaration order within the struct layout.

Fixes a verifier failure when emitting a TBAA-enabled load through a
class type containing a zero-sized field.
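
As a small illustration of "they only affect layout", the sketch below compares two structs that differ only by an unnamed zero-width bit-field. The struct and function names are hypothetical, and the size difference is what common ABIs produce, not something the C standard guarantees.

```c
#include <assert.h>
#include <stddef.h>

/* Two named bit-fields that can share one storage unit. */
struct packed_bits { unsigned a : 3; unsigned b : 3; };

/* The unnamed zero-width bit-field cannot be read or written; it only
   forces b to start in a new storage unit. */
struct split_bits { unsigned a : 3; unsigned : 0; unsigned b : 3; };

/* Extra bytes caused purely by the unnamed bit-field. */
size_t unnamed_bitfield_growth(void) {
    assert(sizeof(struct split_bits) >= sizeof(struct packed_bits));
    return sizeof(struct split_bits) - sizeof(struct packed_bits);
}
```

Since no load or store can name the unnamed member, leaving it out of the TBAA description loses no information about accesses.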

llvm-svn: 364140
2019-06-22 21:30:43 +00:00
Craig Topper ed78daf810 [X86] Don't use _MM_FROUND_CUR_DIRECTION in the intrinsics tests.
_MM_FROUND_CUR_DIRECTION is the behavior of the intrinsics that
don't take a rounding mode argument. So a better test
is using _MM_FROUND_NO_EXC with the SAE only intrinsics and
an explicit rounding mode with the intrinsics that support
embedded rounding mode.

llvm-svn: 364127
2019-06-22 07:21:48 +00:00
Leonard Chan f66309203e [clang][NewPM] Add -fno-experimental-new-pass-manager to tests
As per the discussion on D58375, we disable tests that have optimizations under
the new PM. This patch adds -fno-experimental-new-pass-manager to RUNS that:

- Already run with optimizations (-O1 or higher) that were missed in D58375.
- Explicitly test new PM behavior alongside some new PM RUNS, but are missing
  this flag if new PM is enabled by default.
- Specify -O without the number. Based on getOptimizationLevel(), it seems the
  default is 2, and the IR appears to be the same when changed to -O2, so
  update the test to explicitly say -O2 and provide `-fno-experimental-new-pass-manager`.

Differential Revision: https://reviews.llvm.org/D63156

llvm-svn: 364066
2019-06-21 16:03:06 +00:00
Reid Kleckner 3fd3de147b Fix passing structs and AVX vectors through sysv_abi
Do this the same way we did it for ms_abi in r324594.

Fixes PR36806.

llvm-svn: 363973
2019-06-20 20:07:20 +00:00
Leonard Chan 97dc622ab3 [clang][NewPM] Do not eliminate available_externally during `-O2 -flto` runs
This fixes CodeGen/available-externally-suppress.c when the new pass manager is
turned on by default. available_externally was not emitted during -O2 -flto
runs when it should still be retained for link time inlining purposes. This can
be fixed by checking that we aren't LTOPrelinking when adding the
EliminateAvailableExternallyPass.

Differential Revision: https://reviews.llvm.org/D63580

llvm-svn: 363971
2019-06-20 19:44:51 +00:00
Leonard Chan b206513e45 [clang][NewPM] Move EntryExitInstrumenterPass to the start of the pipeline
This fixes CodeGen/x86_64-instrument-functions.c when running under the new
pass manager. The pass should go before any other pass to prevent
`__cyg_profile_func_enter/exit()` from not being emitted by inlined functions.

Differential Revision: https://reviews.llvm.org/D63577

llvm-svn: 363969
2019-06-20 19:35:25 +00:00
Craig Topper 6d9fb68c53 [X86] Make _mm_mask_cvtps_ph, _mm_maskz_cvtps_ph, _mm256_mask_cvtps_ph, and _mm256_maskz_cvtps_ph aliases for their corresponding cvt_roundps_ph intrinsic.
These intrinsics should always take an immediate for the rounding mode.
The base instruction comes from before EVEX embedded rounding. The
user should always provide the immediate rather than us assuming
CUR_DIRECTION.

Make the 512-bit versions also explicit aliases instead of copy
pasting the code.

llvm-svn: 363961
2019-06-20 18:24:29 +00:00
Amy Huang 7fac5c8d94 Store a pointer to the return value in a static alloca and let the debugger use that
as the variable address for NRVO variables.

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D63361

llvm-svn: 363952
2019-06-20 17:15:21 +00:00
Leonard Chan e6d2c8dde6 [clang][NewPM] Fixing remaining -O0 tests that are broken under new PM
- CodeGen/flatten.c will fail under new PM because the new PM AlwaysInliner
  seems to intentionally inline functions but not call sites marked with
  alwaysinline (D23299)
- Tests that check remarks happen to check them for the inliner which is not
  turned on at O0. These tests just check that remarks work, but we can make
  separate tests for the new PM with -O1 so we can turn on the inliner and
  check the remarks with minimal changes.

Differential Revision: https://reviews.llvm.org/D62225

llvm-svn: 363846
2019-06-19 17:41:30 +00:00
Hans Wennborg d874c057bc Revert r363116 "[X86] [ABI] Fix i386 ABI "__m64" type bug"
This introduced MMX instructions in code that wasn't previously using
them, breaking programs using 64-bit vectors and x87 floating-point in
the same application. See discussion on the code review for more
details.

> According to System V i386 ABI: the __m64 type parameter and return
> value are passed by MMX registers. But current implementation treats
> __m64 as i64 which results in parameter passing by stack and returning
> by EDX and EAX.
>
> This patch fixes the bug (https://bugs.llvm.org/show_bug.cgi?id=41029)
> for Linux and NetBSD.
>
> Patch by Wei Xiao (wxiao3)
>
> Differential Revision: https://reviews.llvm.org/D59744

llvm-svn: 363790
2019-06-19 11:34:08 +00:00
Lewis Revill af22e071ca [RISCV] Mark TLS as supported
Inform Clang that TLS is implemented by LLVM for RISC-V

Differential Revision: https://reviews.llvm.org/D57055

llvm-svn: 363776
2019-06-19 08:53:46 +00:00
Mikhail Maltsev a45292cbfd [CodeGen][ARM] Fix FP16 vector coercion
Summary:
When a function argument or return type is a homogeneous aggregate
which contains an FP16 vector but the target does not support FP16
operations natively, the type must be converted into an array of
integer vectors by the front end (otherwise LLVM will handle FP16
vectors incorrectly by scalarizing them and promoting FP16 to float,
see https://reviews.llvm.org/D50507).

Currently the logic for checking whether or not a given homogeneous
aggregate contains FP16 vectors is incorrect: it only looks at the
type of the first vector.

This patch fixes the issue by adding a new method
ARMABIInfo::containsAnyFP16Vectors and using it. The traversal logic
of this method is largely the same as in
ABIInfo::isHomogeneousAggregate.
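
The difference between "look at the first vector" and "look at every member" can be sketched on a toy type tree. This is only an illustration: `struct type`, `enum kind` and both function names below are hypothetical and are not clang's data structures or the code added by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy type model (hypothetical, not clang's QualType). */
enum kind { KIND_FLOAT_VECTOR, KIND_FP16_VECTOR, KIND_AGGREGATE };

struct type {
    enum kind kind;
    const struct type *const *members; /* used only for KIND_AGGREGATE */
    size_t num_members;
};

/* Flawed check: follows only the first member down to a leaf. */
static bool first_vector_is_fp16(const struct type *t) {
    assert(t != NULL);
    while (t->kind == KIND_AGGREGATE) {
        if (t->num_members == 0)
            return false;
        t = t->members[0];
    }
    return t->kind == KIND_FP16_VECTOR;
}

/* Full traversal: true if any leaf anywhere in the aggregate is FP16. */
static bool contains_any_fp16_vector(const struct type *t) {
    assert(t != NULL);
    if (t->kind == KIND_FP16_VECTOR)
        return true;
    if (t->kind != KIND_AGGREGATE)
        return false;
    for (size_t i = 0; i < t->num_members; ++i)
        if (contains_any_fp16_vector(t->members[i]))
            return true;
    return false;
}
```

For an aggregate whose first member is a float vector and whose second member is an FP16 vector, the first function reports no FP16 content while the second finds it.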

Reviewers: eli.friedman, olista01, ostannard

Reviewed By: ostannard

Subscribers: ostannard, john.brawn, javed.absar, kristof.beyls, pbarrio, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D63437

llvm-svn: 363687
2019-06-18 14:34:27 +00:00
Francis Visoiu Mistrih 34667519dc [Remarks] Extend -fsave-optimization-record to specify the format
Use -fsave-optimization-record=<format> to specify a different format
than the default, which is YAML.

For now, only YAML is supported.

llvm-svn: 363573
2019-06-17 16:06:00 +00:00
Aaron Puchert e1dc495e63 [Clang] Harmonize Split DWARF options with llc
Summary:
With Split DWARF the resulting object file (then called skeleton CU)
contains the file name of another ("DWO") file with the debug info.
This can be a problem for remote compilation, as it will contain the
name of the file on the compilation server, not on the client.

To use Split DWARF with remote compilation, one needs to either

* make sure only relative paths are used, and mirror the build directory
  structure of the client on the server, or
* inject the desired file name on the client directly.

Since llc already supports the latter solution, we're just copying that
over. We allow setting the actual output filename separately from the
value of the DW_AT_[GNU_]dwo_name attribute in the skeleton CU.

Fixes PR40276.

Reviewers: dblaikie, echristo, tejohnson

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D59673

llvm-svn: 363496
2019-06-15 15:38:51 +00:00
Aaron Puchert 922759a63d [Clang] Rename -split-dwarf-file to -split-dwarf-output
Summary:
This is the first in a series of changes trying to align clang -cc1
flags for Split DWARF with those of llc. The unfortunate side effect of
having -split-dwarf-output for single file Split DWARF will disappear
again in a subsequent change.

The change is the result of a discussion in D59673.

Reviewers: dblaikie, echristo

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D63130

llvm-svn: 363494
2019-06-15 14:07:43 +00:00
Francis Visoiu Mistrih 5501dda247 [Remarks][NFC] Improve testing and documentation of -foptimization-record-passes
This adds:

* documentation to the user manual
* nicer error message
* test for the error case
* test for the gold plugin

llvm-svn: 363463
2019-06-14 21:38:57 +00:00
George Burgess IV 2c074bb39e [Targets] Move soft-float-abi filtering to `initFeatureMap`
ARM has a special target feature called soft-float-abi. This feature is
special, since we get it passed to us explicitly in the frontend, but
filter it out before it can land in any target feature strings in LLVM
IR.

__attribute__((target(""))) doesn't quite filter these features out
properly, so today, we get warnings about soft-float-abi being an
unknown feature from the backend.

This CL has us filter soft-float-abi out at a slightly different point,
so we don't end up passing these invalid features to the backend.

Differential Revision: https://reviews.llvm.org/D61750

llvm-svn: 363346
2019-06-14 00:35:17 +00:00
Leonard Chan 09f56b51ec [clang][NewPM] Fix broken -O0 test from missing assumptions
Add an AssumptionCache callback to the InlineFunctionInfo used for the
AlwaysInlinerPass to match codegen of the AlwaysInlinerLegacyPass to generate
llvm.assume. This fixes CodeGen/builtin-movdir.c when new PM is enabled by
default.

Differential Revision: https://reviews.llvm.org/D63170

llvm-svn: 363287
2019-06-13 18:18:40 +00:00
Leonard Chan 9f8ce3feb2 [clang][NewPM] Fix split debug test
This contains the part of D62225 which fixes CodeGen/split-debug-single-file.c
by not placing .dwo sections when using -enable-split-dwarf=split.

Differential Revision: https://reviews.llvm.org/D63168

llvm-svn: 363281
2019-06-13 17:40:03 +00:00
Leonard Chan 587497b87d [clang][NewPM] Fix broken -O0 test from the AlwaysInliner
This contains the part of D62225 which prevents insertion of lifetime
intrinsics when creating the AlwaysInliner. This fixes the following tests
when the new PM is enabled by default:

Clang :: CodeGen/aarch64-neon-across.c
Clang :: CodeGen/aarch64-neon-fcvt-intrinsics.c
Clang :: CodeGen/aarch64-neon-fma.c
Clang :: CodeGen/aarch64-neon-perm.c
Clang :: CodeGen/aarch64-neon-tbl.c
Clang :: CodeGen/aarch64-poly128.c
Clang :: CodeGen/aarch64-v8.2a-neon-intrinsics.c
Clang :: CodeGen/arm-neon-fma.c
Clang :: CodeGen/arm-neon-numeric-maxmin.c
Clang :: CodeGen/arm-neon-vcvtX.c
Clang :: CodeGen/avx-builtins.c
Clang :: CodeGen/builtins-ppc-p9vector.c
Clang :: CodeGen/builtins-ppc-vsx.c
Clang :: CodeGen/lifetime.c
Clang :: CodeGen/sse-builtins.c
Clang :: CodeGen/sse2-builtins.c

Differential Revision: https://reviews.llvm.org/D63153

llvm-svn: 363277
2019-06-13 16:45:29 +00:00
Zi Xuan Wu cc12f68fff [PowerPC] [Clang] Port SSE2 intrinsics to PowerPC
Port emmintrin.h, which includes the Intel SSE2 intrinsics implementation, to the PowerPC platform (using Altivec).

The new headers containing those implementations are located in a directory named ppc_wrappers,
which has higher priority when the platform is PowerPC on Linux. They were mainly developed by Steven Munroe,
with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu.

It's a follow-up patch of D62121.

Patched by: Qiu Chaofan <qiucf@cn.ibm.com>

Differential Revision: https://reviews.llvm.org/D62569

llvm-svn: 363122
2019-06-12 05:25:40 +00:00
Pengfei Wang fbfee60c32 [X86] [ABI] Fix i386 ABI "__m64" type bug
According to the System V i386 ABI, __m64 parameters and return
values are passed in MMX registers. But the current implementation
treats __m64 as i64, which results in parameters being passed on the
stack and values being returned in EDX and EAX.

This patch fixes the bug (https://bugs.llvm.org/show_bug.cgi?id=41029)
for Linux and NetBSD.

Patch by Wei Xiao (wxiao3)

Differential Revision: https://reviews.llvm.org/D59744

llvm-svn: 363116
2019-06-12 01:52:23 +00:00
Hubert Tong 11db920f74 [NFC][PowerPC] Header-dependent test requires "native"
Two recently added tests mention complications for cross-compile, but
they do not actually enforce native compilation. This patch makes them
require native compilation to avoid the complications they mention.

llvm-svn: 363070
2019-06-11 14:23:55 +00:00
Lewis Revill 22196f0f69 [RISCV][NFC] Add missing test files for D54091
llvm-svn: 363056
2019-06-11 12:49:15 +00:00
Pengfei Wang 244062eece [X86] Enable intrinsics that convert float and bf16 data to each other
Scalar version :
_mm_cvtsbh_ss , _mm_cvtness_sbh

Vector version:
_mm512_cvtpbh_ps , _mm256_cvtpbh_ps
_mm512_maskz_cvtpbh_ps , _mm256_maskz_cvtpbh_ps
_mm512_mask_cvtpbh_ps , _mm256_mask_cvtpbh_ps

Patch by Shengchen Kan (skan)

Differential Revision: https://reviews.llvm.org/D62363

llvm-svn: 363018
2019-06-11 01:17:28 +00:00
Simon Tatham 5d66f2b0af [ARM] Fix bugs introduced by the fp64/d32 rework.
Change D60691 caused some knock-on failures that weren't caught by the
existing tests. Firstly, selecting a CPU that should have had a
restricted FPU (e.g. `-mcpu=cortex-m4`, which should have 16 d-regs
and no double precision) could give the unrestricted version, because
`ARM::getFPUFeatures` returned a list of features including subtracted
ones (here `-fp64`,`-d32`), but `ARMTargetInfo::initFeatureMap` threw
away all the ones that didn't start with `+`. Secondly, the
preprocessor macros didn't reliably match the actual compilation
settings: for example, `-mfpu=softvfp` could still set `__ARM_FP` as
if hardware FP was available, because the list of features on the cc1
command line would include things like `+vfp4`,`-vfp4d16` and clang
didn't realise that one of those cancelled out the other.

I've fixed both of these issues by rewriting `ARM::getFPUFeatures` so
that it returns a list that enables every FP-related feature
compatible with the selected FPU and disables every feature not
compatible, which is more verbose but means clang doesn't have to
understand the dependency relationships between the backend features.
Meanwhile, `ARMTargetInfo::handleTargetFeatures` is testing for all
the various forms of the FP feature names, so that it won't miss cases
where it should have set `HW_FP` to feed into feature test macros.

That in turn caused an ordering problem when handling `-mcpu=foo+bar`
together with `-mfpu=something_that_turns_off_bar`. To fix that, I've
arranged that the `+bar` suffixes on the end of `-mcpu` and `-march`
cause feature names to be put into a separate vector which is
concatenated after the output of `getFPUFeatures`.
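
Why both the dropped `-` entries and the ordering matter can be sketched with a small model of an ordered feature list. The model assumes that entries are applied in order and that a later entry for the same name overrides an earlier one; the function names are hypothetical and this is not the code in `ARMTargetInfo` or `ARM::getFPUFeatures`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Applies "+name" / "-name" entries in order; the last entry that
 * mentions `name` decides whether it ends up enabled. */
static bool feature_enabled(const char *const *features, size_t n,
                            const char *name) {
    bool enabled = false;
    for (size_t i = 0; i < n; ++i) {
        const char *f = features[i];
        assert(f[0] == '+' || f[0] == '-');
        if (strcmp(f + 1, name) == 0)
            enabled = (f[0] == '+');
    }
    return enabled;
}

/* Lossy variant: entries that do not start with '+' are thrown away,
 * so a later "-name" can never switch a feature back off. */
static bool feature_enabled_plus_only(const char *const *features, size_t n,
                                      const char *name) {
    bool enabled = false;
    for (size_t i = 0; i < n; ++i) {
        const char *f = features[i];
        if (f[0] == '+' && strcmp(f + 1, name) == 0)
            enabled = true;
    }
    return enabled;
}
```

With the list `+fp64, +d32, -fp64, -d32`, the first function reports `fp64` as disabled and the lossy one reports it as enabled. Appending a CPU suffix such as `+bar` after the FPU-derived entries lets it win over an earlier `-bar`.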

Another side effect of all this is to fix a bug where `clang -target
armv8-eabi` by itself would fail to set `__ARM_FEATURE_FMA`, even
though `armv8` (aka Arm v8-A) implies FP-Armv8 which has FMA. That was
because `HW_FP` was being set to a value including only the `FPARMV8`
bit, but that feature test macro was testing only the `VFP4FPU` bit.
Now `HW_FP` ends up with all the bits set, so it gives the right
answer.

Changes to tests included in this patch:

* `arm-target-features.c`: I had to change basically all the expected
  results. (The Cortex-M4 test in there should function as a
  regression test for the accidental double-precision bug.)
* `arm-mfpu.c`, `armv8.1m.main.c`: switched to using `CHECK-DAG`
  everywhere so that those tests are no longer sensitive to the order
  of cc1 feature options on the command line.
* `arm-acle-6.5.c`: been updated to expect the right answer to that
  FMA test.
* `Preprocessor/arm-target-features.c`: added a regression test for
  the `mfpu=softvfp` issue.

Reviewers: SjoerdMeijer, dmgreen, ostannard, samparker, JamesNagurne

Reviewed By: ostannard

Subscribers: srhines, javed.absar, kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D62998

llvm-svn: 362791
2019-06-07 12:42:54 +00:00
Russell Gallop 4bcba163b1 [X86][test] Add test cases using immediates to builtins-x86.c
These builtins should work with immediate or variable shift operand for
gcc compatibility.

Differential Revision: https://reviews.llvm.org/D62850

llvm-svn: 362786
2019-06-07 09:51:44 +00:00
Pengfei Wang 3a29f7c99c [X86] Add ENQCMD instructions
For more details about these instructions, please refer to the latest
ISE document:
https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference.

Patch by Tianqing Wang (tianqing)

Differential Revision: https://reviews.llvm.org/D62282

llvm-svn: 362685
2019-06-06 08:28:42 +00:00
Tim Northover c46827c7ed LLVM IR: Generate new-style byval-with-Type from Clang
LLVM IR recently added a Type parameter to the byval Attribute, so that
when pointers become opaque and no longer have an element type the
information will still be present in IR.

For now the Type parameter is optional (which is why Clang didn't need
this change at the time), but it will become mandatory soon.

llvm-svn: 362652
2019-06-05 21:12:14 +00:00
Petr Hosek 516e6cc1dd [Clang] Disable new PM for tests that use optimization level -O1, -O2 and -O3
Tests that use -O1, -O2 and -O3 would often produce different results
with the new pass manager which makes these tests fail. Disable new PM
explicitly for these tests.

Differential Revision: https://reviews.llvm.org/D58375

llvm-svn: 362580
2019-06-05 03:17:11 +00:00
Eric Christopher 6d04fd15b5 Remove test/CodeGen/builtin-stackaddress.c as it duplicates
test/CodeGen/2004-02-13-BuiltinFrameReturnAddress.c.

Differential Revision: https://reviews.llvm.org/D62133

llvm-svn: 362462
2019-06-03 23:16:06 +00:00
Jennifer Yu b8fee677bf Re-check in clang support for GNU asm goto after fixing tests.
llvm-svn: 362410
2019-06-03 15:57:25 +00:00
Andrew Savonichev fa8cd7691a [OpenCL] Use long instead of long long in x86 builtins
Summary: According to the C99 standard, long long is at least 64 bits in
size. However, OpenCL C defines long long as a 128-bit signed
integer. This prevents the use of x86 builtins when compiling OpenCL C
code for x86 targets. The patch changes long long to long for OpenCL
only.

Patch by: Alexander Batashev <alexander.batashev@intel.com>

Reviewers: craig.topper, Ka-Ka, eandrews, erichkeane, Anastasia

Reviewed By: Ka-Ka, erichkeane, Anastasia

Subscribers: a.elovikov, yaxunl, Anastasia, cfe-commits, ivankara, etyurin, asavonic

Tags: #clang

Differential Revision: https://reviews.llvm.org/D62580

llvm-svn: 362391
2019-06-03 12:34:59 +00:00
Simon Tatham dc83a3c449 [ARM] Fix recent breakage of -mfpu=none.
The recent change D60691 introduced a bug in clang when handling
option combinations such as `-mcpu=cortex-m4 -mfpu=none`. Those
options together should select Cortex-M4 but disable all use of
hardware FP, but in fact, now hardware FP instructions can still be
generated in that mode.

The reason is that the handling of FPUVersion::NONE disables all
the same feature names it used to, of which the base one is `vfp2`.
But now there are further features below that, like `vfp2d16fp` and
(following D60694) `fpregs`, which also need to be turned off to
disable hardware FP completely.

Added a tiny test which double-checks that compiling a simple FP
function doesn't access the FP registers.

Reviewers: SjoerdMeijer, dmgreen

Reviewed By: dmgreen

Subscribers: lebedev.ri, javed.absar, kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D62729

llvm-svn: 362380
2019-06-03 11:02:53 +00:00
Pengfei Wang cc3629d545 [X86] Add VP2INTERSECT instructions
Support Intel AVX512 VP2INTERSECT instructions in clang.

Patch by Xiang Zhang (xiangzhangllvm)

Differential Revision: https://reviews.llvm.org/D62367

llvm-svn: 362196
2019-05-31 06:09:35 +00:00
Zi Xuan Wu fc3ed1ec50 re-commit r361928: [PowerPC] [Clang] Port SSE intrinsics to PowerPC
Port xmmintrin.h, which includes the Intel SSE intrinsics implementation, to the PowerPC platform (using Altivec).

The new headers containing those implementations are located in a directory named ppc_wrappers,
which has higher priority when the platform is PowerPC on Linux. They were mainly developed by Steven Munroe,
with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu.

Patched by: Qiu Chaofan <qiucf@cn.ibm.com>
Reviewed By: Jinsong Ji

Differential Revision: https://reviews.llvm.org/D62121

llvm-svn: 362190
2019-05-31 04:42:13 +00:00
Pengfei Wang 48387ec187 Revert "[X86] Fix i386 struct and union parameter alignment"
This reverts commit d61cb749f4 (SVN:
361934).

Reverting this change at James's suggestion. Please refer to:
https://reviews.llvm.org/D60748

llvm-svn: 362186
2019-05-31 01:50:07 +00:00
Tim Northover fcb00d4aec Reapply: LLVM IR: update Clang tests for byval being a typed attribute.
Since byval is now a typed attribute it gets sorted slightly differently by
LLVM when the order of attributes is being canonicalized. This updates the few
Clang tests that depend on the old order.

Clang patch is unchanged.

llvm-svn: 362129
2019-05-30 18:49:19 +00:00
Erich Keane d0f34fd198 Revert "clang support gnu asm goto."
This reverts commit 954ec09aed.

Reverting due to test failures as requested by Jennifer Yu.

Conflicts:
	clang/test/CodeGen/asm-goto.c

llvm-svn: 362106
2019-05-30 15:38:02 +00:00
Fangrui Song 54d3c3d436 Mark CodeGen/asm-goto.c as x86 specific after r362045
llvm-svn: 362059
2019-05-30 06:48:13 +00:00
Jennifer Yu 954ec09aed clang support gnu asm goto.
Syntax:
  asm [volatile] goto ( AssemblerTemplate
                      :
                      : InputOperands
                      : Clobbers
                      : GotoLabels)

https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html

The new LLVM IR instruction for inline asm goto is "callbr", instead of the "call" used for other inline asm.
For:
asm goto("testl %0, %0; jne %l1;" :: "r"(cond)::label_true, loop);
IR:
callbr void asm sideeffect "testl $0, $0; jne ${1:l};", "r,X,X,~{dirflag},~{fpsr},~{flags}"(i32 %0, i8* blockaddress(@foo, %label_true), i8* blockaddress(@foo, %loop)) #1
          to label %asm.fallthrough [label %label_true, label %loop], !srcloc !3

asm.fallthrough:                                

The compiler needs to generate:
1> a dummy constraint 'X' for each label.
2> a unique fallthrough label for each asm goto stmt "asm.fallthrough%number".


Diagnostics:
1>	duplicate asm operand names used across outputs, inputs and labels.
2>	goto out of scope.
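
A minimal C sketch of the syntax above, reusing the `testl`/`jne` template from this message with a single label. The function name `is_nonzero` is made up for illustration, and the inline asm is guarded so that non-x86 targets fall back to plain C.

```c
/* Returns 1 if cond is non-zero, 0 otherwise. On x86 the branch is
 * taken inside the inline asm; control either falls through the asm
 * statement or jumps to label_true. */
static int is_nonzero(int cond) {
#if defined(__x86_64__) || defined(__i386__)
    __asm__ goto("testl %0, %0; jne %l1;"
                 : /* no outputs */
                 : "r"(cond)
                 : "cc"
                 : label_true);
    return 0; /* fallthrough path */
label_true:
    return 1;
#else
    return cond != 0; /* portable fallback, no asm goto */
#endif
}
```

The label operand is numbered after the input operand, which is why the template refers to it as `%l1`.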

llvm-svn: 362045
2019-05-30 01:05:46 +00:00
Tim Northover 4b281755ae Revert "LLVM IR: update Clang tests for byval being a typed attribute."
The underlying LLVM change couldn't cope with llvm-link and broke LTO builds.

llvm-svn: 362028
2019-05-29 20:45:32 +00:00
Tim Northover 45e8cc6639 LLVM IR: update Clang tests for byval being a typed attribute.
Since byval is now a typed attribute it gets sorted slightly differently by
LLVM when the order of attributes is being canonicalized. This updates the few
Clang tests that depend on the old order.

llvm-svn: 362013
2019-05-29 19:13:29 +00:00
Simon Atanasyan c7f0b33fa5 [mips] Check argument for __builtin_msa_ctcmsa / __builtin_msa_cfcmsa
The `__builtin_msa_ctcmsa` and `__builtin_msa_cfcmsa` builtins are mapped
to the `ctcmsa` and `cfcmsa` instructions respectively. While MSA
control registers have indexes in 0..7 range, the instructions accept
register index in 0..31 range [1].

[1] MIPS Architecture for Programmers Volume IV-j:
    The MIPS64 SIMD Architecture Module
https://www.mips.com/?do-download=the-mips64-simd-architecture-module

llvm-svn: 361967
2019-05-29 14:59:32 +00:00
Pengfei Wang d61cb749f4 [X86] Fix i386 struct and union parameter alignment
According to i386 System V ABI 2.1: Structures and unions assume the
alignment of their most strictly aligned component. But current
implementation always takes them as 4-byte aligned which will result
in incorrect code, e.g:

 1 #include <immintrin.h>
 2 typedef union {
 3         int d[4];
 4         __m128 m;
 5 } M128;
 6 extern void foo(int, ...);
 7 void test(void)
 8 {
 9   M128 a;
10   foo(1, a);
11   foo(1, a.m);
12 }

The first call (line 10) takes the second arg as 4-byte aligned while
the second call (line 11) takes the second arg as 16-byte aligned.
The two calls are inconsistent: the alignment should be the same in
both.
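
The rule quoted from the ABI can be checked at the language level with a small sketch. To keep it portable it does not include `<immintrin.h>`; `vec128` is a hypothetical stand-in for `__m128`, assumed to be a 16-byte-aligned block of four floats. It shows only the alignment the compiler reports for the type, not how the argument is placed on the stack, which is what the patch changes.

```c
#include <stdalign.h>
#include <stddef.h>

/* Stand-in for __m128 (assumption): four floats, 16-byte aligned. */
typedef struct {
    alignas(16) float f[4];
} vec128;

typedef union {
    int d[4];
    vec128 m;
} M128;

/* The union takes the alignment of its most strictly aligned member. */
static size_t m128_alignment(void) {
    return alignof(M128);
}
```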

This patch fixes the bug by following the i386 System V ABI, and applies
the fix to Linux only, since other System V OSes (e.g. Darwin, PS4 and
FreeBSD) don't want to spend any effort dealing with the ramifications
of ABI breaks at present.

Patch by Wei Xiao (wxiao3)

Differential Revision: https://reviews.llvm.org/D60748

llvm-svn: 361934
2019-05-29 08:42:35 +00:00
Zi Xuan Wu 48061cd999 revert rC361928: [PowerPC] [Clang] Port SSE intrinsics to PowerPC
Reverted because the test fails on targets other than PowerPC.

llvm-svn: 361930
2019-05-29 07:09:54 +00:00
Zi Xuan Wu b3bcbb5b66 [PowerPC] [Clang] Port SSE intrinsics to PowerPC
Port xmmintrin.h, which includes the Intel SSE intrinsics implementation, to the PowerPC platform (using Altivec).

The new headers containing those implementations are located in a directory named ppc_wrappers,
which has higher priority when the platform is PowerPC on Linux. They were mainly developed by Steven Munroe,
with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu.

Patched by: Qiu Chaofan <qiucf@cn.ibm.com>
Reviewed By: Jinsong Ji

Differential Revision: https://reviews.llvm.org/D62121

llvm-svn: 361928
2019-05-29 05:17:03 +00:00
Adhemerval Zanella 1468991073 [clang] Handle lrint/llrint builtins
As for other floating-point rounding builtins that can be optimized
when built with -fno-math-errno, this patch adds support for lrint
and llrint.  It currently only optimizes for the AArch64 backend.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D62019

llvm-svn: 361878
2019-05-28 21:16:04 +00:00
Simon Tatham 760df47b77 [ARM] Replace fp-only-sp and d16 with fp64 and d32.
Those two subtarget features were awkward because their semantics are
reversed: each one indicates the _lack_ of support for something in
the architecture, rather than the presence. As a consequence, you
don't get the behavior you want if you combine two sets of feature
bits.

Each SubtargetFeature for an FP architecture version now comes in four
versions, one for each combination of those options. So you can still
say (for example) '+vfp2' in a feature string and it will mean what
it's always meant, but there's a new string '+vfp2d16sp' meaning the
version without those extra options.

A lot of this change is just mechanically replacing positive checks
for the old features with negative checks for the new ones. But one
more interesting change is that I've rearranged getFPUFeatures() so
that the main FPU feature is appended to the output list *before*
rather than after the features derived from the Restriction field, so
that -fp64 and -d32 can override defaults added by the main feature.

Reviewers: dmgreen, samparker, SjoerdMeijer

Subscribers: srhines, javed.absar, eraman, kristof.beyls, hiraditya, zzheng, Petar.Avramovic, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D60691

llvm-svn: 361845
2019-05-28 16:13:20 +00:00
Alina Sbirlea b4c756dc1c Mark tests as x86.
llvm-svn: 361674
2019-05-24 21:49:27 +00:00
Alina Sbirlea 21efe2afed [NewPassManager] Add tuning option: LoopUnrolling [clang-change]
Summary:
Use CodeGenOpts's setting for loop unrolling.
[to be coupled with D61618]

Reviewers: chandlerc

Subscribers: jlebar, dmgreen, cfe-commits, llvm-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D61620

llvm-svn: 361653
2019-05-24 17:40:52 +00:00
Alina Sbirlea f2e41dd6ed Use clang_cc1 instead of clang in CodeGen test.
llvm-svn: 361562
2019-05-23 22:07:37 +00:00
Alina Sbirlea 9925ef78ce Update breaking test.
llvm-svn: 361542
2019-05-23 19:51:16 +00:00
Alina Sbirlea 267ac925fb [NewPassManager] Add tuning option: SLPVectorization [clang-change]
Summary:
NewPassManager is not using CodeGenOpts values before this patch.
[to be coupled with D61616]

Reviewers: chandlerc

Subscribers: jlebar, cfe-commits, llvm-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D61617

llvm-svn: 361534
2019-05-23 18:51:02 +00:00
John Brawn 6c49f58a35 [ARM][AArch64] Fix incorrect handling of alignment in va_arg code generation
Overaligned and underaligned types (i.e. types where the alignment has been
increased or decreased using the aligned and packed attributes) weren't being
correctly handled in all cases, as the unadjusted alignment should be used.

This patch also adjusts getTypeUnadjustedAlign to correctly handle typedefs of
non-aggregate types, which it appears it never had to handle before.

Differential Revision: https://reviews.llvm.org/D62152

llvm-svn: 361372
2019-05-22 11:42:54 +00:00
Alexandre Ganea 047e65db77 [DebugInfo] Don't emit checksums when compiling a preprocessed CPP
Fixes PR41215

Differential Revision: https://reviews.llvm.org/D60283

llvm-svn: 361296
2019-05-21 19:40:28 +00:00
Craig Topper 31cc510980 [X86] Check the alignment argument for the masked.load/store for the _mm_mask_store_ss/sd and _mm_mask(z)_load_ss/sd intrinsics.
llvm-svn: 361187
2019-05-20 18:48:31 +00:00
Craig Topper af7a188453 [Intrinsics] Merge lround.i32 and lround.i64 into a single intrinsic with overloaded result type. Make result type for llvm.llround overloaded instead of fixing to i64
We shouldn't really make assumptions about possible sizes for long and long long. And longer term we should probably support vectorizing these intrinsics. By making the result types not fixed we can support vectors as well.

Differential Revision: https://reviews.llvm.org/D62026

llvm-svn: 361169
2019-05-20 16:27:09 +00:00
Ben Dunbobbin 1d16515fb4 [ELF] Implement Dependent Libraries Feature
This patch implements a limited form of autolinking primarily designed to allow
either the --dependent-library compiler option, or "comment lib" pragmas (
https://docs.microsoft.com/en-us/cpp/preprocessor/comment-c-cpp?view=vs-2017) in
C/C++ e.g. #pragma comment(lib, "foo"), to cause an ELF linker to automatically
add the specified library to the link when processing the input file generated
by the compiler.

Currently this extension is unique to LLVM and LLD. However, care has been taken
to design this feature so that it could be supported by other ELF linkers.

The design goals were to provide:

- A simple linking model for developers to reason about.
- The ability to override autolinking from the linker command line.
- Source code compatibility, where possible, with "comment lib" pragmas in other
  environments (MSVC in particular).

Dependent library support is implemented differently for ELF platforms than on
the other platforms. Primarily this difference is that on ELF we pass the
dependent library specifiers directly to the linker without manipulating them.
This is in contrast to other platforms where they are mapped to a specific
linker option by the compiler. This difference is a result of the greater
variety of ELF linkers and the fact that ELF linkers tend to handle libraries in
a more complicated fashion than on other platforms. This forces us to defer
handling the specifiers to the linker.

In order to achieve a level of source code compatibility with other platforms
we have restricted this feature to work with libraries that meet the following
"reasonable" requirements:

1. There are no competing defined symbols in a given set of libraries, or
   if they exist, the program owner doesn't care which is linked to their
   program.
2. There may be circular dependencies between libraries.

The binary representation is a mergeable string section (SHF_MERGE,
SHF_STRINGS), called .deplibs, with custom type SHT_LLVM_DEPENDENT_LIBRARIES
(0x6fff4c04). The compiler forms this section by concatenating the arguments of
the "comment lib" pragmas and --dependent-library options in the order they are
encountered. Partial (-r, -Ur) links are handled by concatenating .deplibs
sections with the normal mergeable string section rules. As an example, #pragma
comment(lib, "foo") would result in:

.section ".deplibs","MS",@llvm_dependent_libraries,1
         .asciz "foo"

For LTO, equivalent information to the contents of a .deplibs section can be
retrieved by LLD for bitcode input files.

LLD processes the dependent library specifiers in the following way:

1. Dependent libraries which are found from the specifiers in .deplibs sections
   of relocatable object files are added when the linker decides to include that
   file (which could itself be in a library) in the link. Dependent libraries
   behave as if they were appended to the command line after all other options. As
   a consequence the set of dependent libraries are searched last to resolve
   symbols.
2. It is an error if a file cannot be found for a given specifier.
3. Any command line options in effect at the end of the command line parsing apply
   to the dependent libraries, e.g. --whole-archive.
4. The linker tries to add a library or relocatable object file from each of the
   strings in a .deplibs section by; first, handling the string as if it was
   specified on the command line; second, by looking for the string in each of the
   library search paths in turn; third, by looking for a lib<string>.a or
   lib<string>.so (depending on the current mode of the linker) in each of the
   library search paths.
5. A new command line option --no-dependent-libraries tells LLD to ignore the
   dependent libraries.
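
The lookup order in step 4 can be sketched as a function that lists the candidate paths in the order they would be tried. This is an illustration only, not LLD's implementation: `deplib_candidates`, `DEPLIB_PATH_LEN` and the `link_static` flag (standing in for "the current mode of the linker") are hypothetical names.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum { DEPLIB_PATH_LEN = 256 };

/* Fills out[] with the paths to try for one .deplibs specifier, in
 * order, and returns how many were written (at most max_out). */
static size_t deplib_candidates(const char *spec,
                                const char *const *search_paths,
                                size_t num_paths, int link_static,
                                char out[][DEPLIB_PATH_LEN],
                                size_t max_out) {
    size_t n = 0;
    const char *ext = link_static ? "a" : "so";

    if (spec == NULL || strlen(spec) == 0)
        return 0;

    /* First: the string exactly as if given on the command line. */
    if (n < max_out)
        snprintf(out[n++], DEPLIB_PATH_LEN, "%s", spec);

    /* Second: the string inside each library search path, in turn. */
    for (size_t i = 0; i < num_paths && n < max_out; ++i)
        snprintf(out[n++], DEPLIB_PATH_LEN, "%s/%s", search_paths[i], spec);

    /* Third: lib<string>.a or lib<string>.so in each search path. */
    for (size_t i = 0; i < num_paths && n < max_out; ++i)
        snprintf(out[n++], DEPLIB_PATH_LEN, "%s/lib%s.%s",
                 search_paths[i], spec, ext);

    return n;
}
```

A real linker would stop at the first candidate that exists and report an error if none does (point 2 above); the sketch only produces the ordering.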

Rationale for the above points:

1. Adding the dependent libraries last makes the process simple to understand
   from a developer's perspective. All linkers are able to implement this scheme.
2. Error-ing for libraries that are not found seems like better behavior than
   failing the link during symbol resolution.
3. It seems useful for the user to be able to apply command line options which
   will affect all of the dependent libraries. There is a potential problem of
   surprise for developers, who might not realize that these options would apply
   to these "invisible" input files; however, despite the potential for surprise,
   this is easy for developers to reason about and gives developers the control
   that they may require.
4. This algorithm takes into account all of the different ways that ELF linkers
   find input files. The different search methods are tried by the linker in most
   obvious to least obvious order.
5. I considered adding finer grained control over which dependent libraries were
   ignored (e.g. MSVC has /nodefaultlib:<library>); however, I concluded that this
   is not necessary: if finer control is required developers can fall back to using
   the command line directly.

RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2019-March/131004.html.

Differential Revision: https://reviews.llvm.org/D60274

llvm-svn: 360984
2019-05-17 03:44:15 +00:00
Adhemerval Zanella 0d9dcd7bf0 [clang] Handle lround/llround builtins
As for other floating-point rounding builtins that can be optimized
when built with -fno-math-errno, this patch adds support for lround
and llround.  It currently only optimizes for the AArch64 backend.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D61392

llvm-svn: 360896
2019-05-16 13:43:25 +00:00
Karl-Johan Karlsson 0e525a4d6b [builtin] Fixed definitions of builtins that rely on the int/long long type being 32/64 bits
Summary:
The definitions of the builtins __builtin_bswap32, __builtin_bitreverse32, __builtin_rotateleft32 and __builtin_rotateright32 rely on the int type being 32 bits wide on the target.
The definitions of the builtins __builtin_bswap64, __builtin_bitreverse64, __builtin_rotateleft64, and __builtin_rotateright64 rely on the long long type being 64 bits wide.

On targets where this is not the case (e.g. AVR) clang will generate faulty code (wrong LLVM intrinsics).

This patch adds support for using 'Z' (the int32_t type) in Builtins.def. The builtins above are changed to be based on the int32_t type instead of the int type, and the int64_t type instead of the long long type.

The AVR backend (experimental) has a native int type that is only 16 bits wide. The supplied testcase will therefore fail when run on trunk, as clang will convert e.g. __builtin_bitreverse32 into llvm.bitreverse.i16 on AVR.
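
From the caller's side, the intent is that these builtins operate on exactly 32 or 64 bits regardless of how wide `int` is. A small sketch using fixed-width types; the wrapper names `swap32` and `swap64` are made up for illustration.

```c
#include <stdint.h>

/* Byte-swap a value that is exactly 32 bits wide, whatever the
 * target's native int width is. */
static uint32_t swap32(uint32_t x) {
    return __builtin_bswap32(x);
}

/* Same for exactly 64 bits, independent of the width of long long. */
static uint64_t swap64(uint64_t x) {
    return __builtin_bswap64(x);
}
```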

Reviewers: dylanmckay, spatel, rsmith, efriedma

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D61845

llvm-svn: 360863
2019-05-16 07:18:02 +00:00
Leonard Chan 048a97bca4 Fix bots by adding target triple to test.
llvm-svn: 360720
2019-05-14 22:37:34 +00:00
Leonard Chan 0cdd3b1d81 [NewPM] Port HWASan and Kernel HWASan
Port the hardware-assisted address sanitizer to the new PM, following the same guidelines as msan and tsan.

Changes:
- Separate HWAddressSanitizer into a pass class and a sanitizer class.
- Create new PM wrapper pass for the sanitizer class.
- Use the getOrInsert pattern for some module level initialization declarations.
- Also enable kernel-hwasan in new PM
- Update llvm tests and add clang test.

Differential Revision: https://reviews.llvm.org/D61709

llvm-svn: 360707
2019-05-14 21:17:21 +00:00
Hans Wennborg b0dbc9612f Revert r360637 "PR41817: Fix regression in r359260 that caused the MS compatibility"
> extension allowing a "static" declaration to follow an "extern"
> declaration to stop working.

It introduced asserts for some "static-following-extern" cases, breaking the
Chromium build. See the cfe-commits thread for reproducer.

llvm-svn: 360657
2019-05-14 10:11:33 +00:00
Richard Smith 3bde7bf3e0 PR41817: Fix regression in r359260 that caused the MS compatibility
extension allowing a "static" declaration to follow an "extern"
declaration to stop working.

llvm-svn: 360637
2019-05-14 00:27:16 +00:00
Teresa Johnson 962a6f35b5 [ThinLTO] Clang test changes for new CanAutoHide flag
llvm-svn: 360468
2019-05-10 20:38:31 +00:00
Reid Kleckner 6bf108d77a [COFF] Use COFF stubs for extern_weak functions
Summary:
A COFF stub indirects the reference to a symbol through memory. A
.refptr.$sym global variable pointer is created to refer to $sym.
Typically mingw uses these for external global variable declarations,
but we can use them for weak function declarations as well.

Updates the dso_local classification to add a special case for
extern_weak symbols on COFF in both clang and LLVM.

Fixes PR37598

Reviewers: smeenai, mstorsjo

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D61615

llvm-svn: 360207
2019-05-07 23:06:21 +00:00
Richard Smith b30657938c Improve function / variable disambiguation.
Keep looking for decl-specifiers after an unknown identifier. Don't
issue diagnostics about an error type specifier conflicting with later
type specifiers.

llvm-svn: 360117
2019-05-07 07:36:07 +00:00
Petr Hosek 5f2e10e9c3 [Clang][NewPM] Don't bail out if the target machine is empty
This matches the behavior of the old pass manager. There are some
targets that don't have a target machine at all (e.g. le32, spir) and
whose tests would therefore never run with the new pass manager.
Similarly, we would need to disable tests for targets that are disabled.

Differential Revision: https://reviews.llvm.org/D58374

llvm-svn: 360100
2019-05-06 23:24:17 +00:00
Martin Storsjo 7037a13679 [AArch64] Add __builtin_sponentry, for calling setjmp in MinGW
In MinGW, setjmp isn't expanded as a builtin in the compiler (like it
is for MSVC), but manually hooked up as calls to the right underlying
functions in headers. Using the actual CRT's real setjmp/longjmp
functions requires this intrinsic. (Currently this is worked around by
using MinGW specific reimplementations of setjmp/longjmp on aarch64.)

Differential Revision: https://reviews.llvm.org/D61592

llvm-svn: 360082
2019-05-06 21:19:07 +00:00
Fangrui Song 041c377a59 [X86] Move files to correct directories after D60552
llvm-svn: 360022
2019-05-06 09:24:36 +00:00
Luo, Yuanke 844f662932 Enable intrinsics of AVX512_BF16, which are supported for BFLOAT16 in Cooper Lake
Summary:
1. Enable infrastructure of AVX512_BF16, which is supported for BFLOAT16 in Cooper Lake;
2. Enable intrinsics for VCVTNE2PS2BF16, VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural Network Instructions supporting BFLOAT16 inputs and conversion instructions from IEEE single precision.
For more details about BF16 intrinsic, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference

Patch by LiuTianle

Reviewers: craig.topper, smaslov, LuoYuanke, wxiao3, annita.zhang, spatel, RKSimon

Reviewed By: craig.topper

Subscribers: mgorny, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D60552

llvm-svn: 360018
2019-05-06 08:25:11 +00:00
Mandeep Singh Grang 85a0f8fe6c [COFF, ARM64] Fix ABI implementation of struct returns
Summary:
Related llvm patch: D60348.
Patch co-authored by Sanjin Sijaric.

Reviewers: rnk, efriedma, TomTan, ssijaric, ostannard

Reviewed By: efriedma

Subscribers: dmajor, richard.townsend.arm, ostannard, javed.absar, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D60349

llvm-svn: 359932
2019-05-03 21:12:24 +00:00
Amy Huang 301a5bbd59 Change the metadata for heapallocsite calls when the type is cast.
llvm-svn: 359823
2019-05-02 20:07:35 +00:00
Tom Tan b7c6d95af5 [COFF, ARM64] Align global symbol by size for ARM64 MSVC ABI
According to the alignment section of the ARM64 ABI document linked below, MSVC
may increase the alignment of global data based on its total size. Clang doesn't
do this. Compiling the same symbol with different alignments in Clang and MSVC
can cause a link error, because some instruction encodings, like 64-bit LDR/STR
with immediate, require the target to be 8-byte aligned. The linker could then
combine a code stream containing such an LDR/STR instruction from MSVC with
4-byte aligned data from Clang in the final image, and the two cannot actually
be linked together
(see https://bugs.llvm.org/show_bug.cgi?id=41506 for more details).

https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=vs-2019#alignment

Differential Revision: https://reviews.llvm.org/D61225

llvm-svn: 359744
2019-05-02 00:38:14 +00:00
Fangrui Song 324ace4b5c Change llvm-{objdump,readobj} -long-option to --long-option or well-known short options in tests. NFC
llvm-svn: 359662
2019-05-01 09:30:45 +00:00
JF Bastien d39fbc7e20 Variable auto-init: don't initialize aggregate padding of all aggregates
Summary:
C guarantees that brace-init with fewer initializers than members in the
aggregate will initialize the rest of the aggregate as-if it were static
initialization. In turn static initialization guarantees that padding is
initialized to zero bits.

Quoth the Standard:

C17 6.7.9 Initialization ❡21

If there are fewer initializers in a brace-enclosed list than there are elements
or members of an aggregate, or fewer characters in a string literal used to
initialize an array of known size than there are elements in the array, the
remainder of the aggregate shall be initialized implicitly the same as objects
that have static storage duration.

C17 6.7.9 Initialization ❡10

If an object that has automatic storage duration is not initialized explicitly,
its value is indeterminate. If an object that has static or thread storage
duration is not initialized explicitly, then:

 * if it has pointer type, it is initialized to a null pointer;
 * if it has arithmetic type, it is initialized to (positive or unsigned) zero;
 * if it is an aggregate, every member is initialized (recursively) according to
   these rules, and any padding is initialized to zero bits;
 * if it is a union, the first named member is initialized (recursively)
   according to these rules, and any padding is initialized to zero bits;
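
The quoted rules can be sketched with a small C example. The struct layout and the names record and make_partial are assumptions made up for illustration. Only the named members are inspected here, since reading padding bytes portably is awkward.

```c
#include <stddef.h>

/* Hypothetical aggregate; on most ABIs there is padding after `tag`. */
struct record {
    char tag;
    int count;
    void *next;
    double weight;
};

/* Brace-initialize with fewer initializers than members: per the quoted
 * 6.7.9 paragraph 21, the remaining members are initialized as if the
 * object had static storage duration (zero / null pointer). */
struct record make_partial(void) {
    struct record r = { 'x' };
    return r;
}
```

Calling make_partial() yields tag == 'x', count == 0, next == NULL and weight == 0.0.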

<rdar://problem/50188861>

Reviewers: glider, pcc, kcc, rjmccall, erik.pilkington

Subscribers: jkorous, dexonsmith, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D61280

llvm-svn: 359628
2019-04-30 22:56:53 +00:00
Ahsan Saghir 3962d6da17 Add __builtin_dcbf support for PPC
Summary:
This patch adds support for __builtin_dcbf for PPC.

__builtin_dcbf copies the contents of a modified block from the data cache
to main memory and flushes the copy from the data cache.

Differential revision: https://reviews.llvm.org/D59843

llvm-svn: 359517
2019-04-29 23:25:33 +00:00
Qiu Chaofan 8eeb33497c [PowerPC][Clang] Add tests for PowerPC MMX intrinsics
Add the remaining test cases covering functions defined in mmintrin.h on PowerPC.

Reviewed By: Jinsong Ji

llvm-svn: 359393
2019-04-28 06:27:33 +00:00
Richard Smith 31cfb311c5 Reinstate r359059, reverted in r359361, with a fix to properly prevent
us emitting the operand of __builtin_constant_p if it has side-effects.

Original commit message:

Fix interactions between __builtin_constant_p and constexpr to match
current trunk GCC.

GCC permits information from outside the operand of
__builtin_constant_p (but in the same constant evaluation context) to be
used within that operand; clang now does so too. A few other minor
deviations from GCC's behavior showed up in my testing and are also
fixed (matching GCC):
  * Clang now supports nullptr_t as the argument type for
    __builtin_constant_p
  * Clang now returns true from __builtin_constant_p if called with a
    null pointer
  * Clang now returns true from __builtin_constant_p if called with an
    integer cast to pointer type

llvm-svn: 359367
2019-04-27 02:58:17 +00:00
Javed Absar 18b0c40bc5 [AArch64] Add support for MTE intrinsics
This provides intrinsics support for Memory Tagging Extension (MTE),
which was introduced with the Armv8.5-a architecture.
These intrinsics are available when __ARM_FEATURE_MEMORY_TAGGING is defined.
Each intrinsic is described in detail in the ACLE Q1 2019 documentation:
https://developer.arm.com/docs/101028/latest
Reviewed By: Tim Northover, David Spickett
Differential Revision: https://reviews.llvm.org/D60485

llvm-svn: 359348
2019-04-26 21:08:11 +00:00
Reid Kleckner 1be5369a0c Revert [COFF] Statically link certain runtime library functions
This reverts r359250 (git commit 4730604bd3)

The newly added test should use -cc1 and -emit-llvm and there are other
test failures that need fixing.

llvm-svn: 359251
2019-04-25 23:30:41 +00:00
Reid Kleckner 4730604bd3 [COFF] Statically link certain runtime library functions
Statically link certain runtime library functions for MSVC/GNU Windows
environments. This is consistent with MSVC behavior.

Fixes LNK4286 and LNK4217 warnings from link.exe when linking the static
CRT:
  LINK : warning LNK4286: symbol '__std_terminate' defined in 'libvcruntime.lib(ehhelpers.obj)' is imported by 'ASAN_NOINST_TEST_OBJECTS.asan_noinst_test.cc.x86_64-calls.o'
  LINK : warning LNK4286: symbol '__std_terminate' defined in 'libvcruntime.lib(ehhelpers.obj)' is imported by 'ASAN_NOINST_TEST_OBJECTS.asan_test_main.cc.x86_64-calls.o'
  LINK : warning LNK4217: symbol '_CxxThrowException' defined in 'libvcruntime.lib(throw.obj)' is imported by 'ASAN_NOINST_TEST_OBJECTS.gtest-all.cc.x86_64-calls.o' in function '"int `public: static class UnitTest::GetInstance * __cdecl testing::UnitTest::GetInstance(void)'::`1'::dtor$5" (?dtor$5@?0??GetInstance@UnitTest@testing@@SAPEAV12@XZ@4HA)'

Reviewers: mstorsjo, efriedma, TomTan, compnerd, smeenai, mgrang

Subscribers: abdulras, theraven, smeenai, pcc, mehdi_amini, javed.absar, inglorion, kristof.beyls, dexonsmith, cfe-commits

Differential Revision: https://reviews.llvm.org/D55229

llvm-svn: 359250
2019-04-25 23:04:20 +00:00
Artem Belevich 5fe85a003f [CUDA] Implemented _[bi]mma* builtins.
These builtins provide access to the new integer and
sub-integer variants of MMA (matrix multiply-accumulate) instructions
provided by CUDA-10.x on sm_75 (AKA Turing) GPUs.

Also added a feature for PTX 6.4. While Clang/LLVM does not generate
any PTX instructions that need it, we still need to pass it through to
ptxas in order to be able to compile code that uses the new 'mma'
instruction as inline assembly (e.g used by NVIDIA's CUTLASS library
https://github.com/NVIDIA/cutlass/blob/master/cutlass/arch/mma.h#L101)

Differential Revision: https://reviews.llvm.org/D60279

llvm-svn: 359248
2019-04-25 22:28:09 +00:00
Rong Xu 4059e143dc [PGO] Enable InstrProf lowering for Clang PGO instrumentation in the new pass manager
Currently InstrProf lowering is not enabled for Clang PGO instrumentation in
the new pass manager. The following command
"-fprofile-instr-generate -fexperimental-new-pass-manager ..." is broken.

This CL enables InstrProf lowering pass for Clang PGO instrumentation in the
new pass manager.

Differential Revision: https://reviews.llvm.org/D61138

llvm-svn: 359215
2019-04-25 17:52:43 +00:00
Teresa Johnson 867bc3951b [ThinLTO] Pass down opt level to LTO backend and handle -O0 LTO in new PM
Summary:
The opt level was not being passed down to the ThinLTO backend when
invoked via clang (for distributed ThinLTO).

This exposed an issue where the new PM was asserting if the Thin or
regular LTO backend pipelines were invoked with -O0 (not a new issue,
could be provoked by invoking in-process *LTO backends via linker using
new PM and -O0). Fix this similar to the old PM where -O0 only does the
necessary lowering of type metadata (WPD and LowerTypeTest passes) and
then quits, rather than asserting.

Reviewers: xur

Subscribers: mehdi_amini, inglorion, eraman, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits, pcc

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D61022

llvm-svn: 359025
2019-04-23 18:56:19 +00:00
Joel E. Denny 3234887fe2 [APSInt][OpenMP] Fix isNegative, etc. for unsigned types
Without this patch, APSInt inherits APInt::isNegative, which merely
checks the sign bit without regard to whether the type is actually
signed.  isNonNegative and isStrictlyPositive call isNegative and so
are also affected.

This patch adjusts APSInt to override isNegative, isNonNegative, and
isStrictlyPositive with implementations that consider whether the type
is signed.

A large set of Clang OpenMP tests are affected.  Without this patch,
these tests assume that `true` is not a valid argument for clauses
like `collapse`.  Indeed, `true` fails APInt::isStrictlyPositive but
not APSInt::isStrictlyPositive.  This patch adjusts those tests to
assume `true` should be accepted.
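
The difference can be sketched with a tiny C model. The struct small_int and every function below are hypothetical stand-ins, not the real APInt/APSInt classes. They only mimic the idea that a raw sign-bit test misclassifies a 1-bit unsigned `true`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a fixed-width integer with a signedness flag. */
struct small_int {
    uint64_t bits;      /* value held in the low `width` bits */
    unsigned width;     /* bit width, 1..64 */
    bool is_unsigned;
};

struct small_int make_int(uint64_t bits, unsigned width, bool is_unsigned) {
    struct small_int v = { bits, width, is_unsigned };
    return v;
}

static bool sign_bit_set(struct small_int v) {
    return ((v.bits >> (v.width - 1)) & 1u) != 0;
}

/* Sign-bit-only checks: signedness is ignored. */
bool is_negative_raw(struct small_int v) { return sign_bit_set(v); }
bool is_strictly_positive_raw(struct small_int v) {
    return !is_negative_raw(v) && v.bits != 0;
}

/* Signedness-aware checks: an unsigned value is never negative. */
bool is_negative(struct small_int v) {
    return !v.is_unsigned && sign_bit_set(v);
}
bool is_strictly_positive(struct small_int v) {
    return !is_negative(v) && v.bits != 0;
}
```

For make_int(1, 1, true), the model of an unsigned 1-bit `true`, the raw check reports a negative value while the signedness-aware check reports a strictly positive one.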

This patch also adds tests revealing various other similar fixes due
to APSInt::isNegative calls in Clang's ExprConstant.cpp and
SemaExpr.cpp: `++` and `--` overflow in `constexpr`, evaluated object
size based on `alloc_size`, `<<` and `>>` shift count validation, and
OpenMP array section validation.

Reviewed By: lebedev.ri, ABataev, hfinkel

Differential Revision: https://reviews.llvm.org/D59712

llvm-svn: 359012
2019-04-23 17:04:15 +00:00
Fangrui Song fb2783f680 [PowerPC] Fix test with -fno-discard-value-names after rC358949
For the clang driver, -DLLVM_ENABLE_ASSERTIONS=off builds default to discard value names.

llvm-svn: 358953
2019-04-23 07:39:23 +00:00
Qiu Chaofan 19828e399b [PowerPC] [Clang] Port MMX intrinsics and basic test cases to Power
Port mmintrin.h, which includes the x86 MMX intrinsics implementation, to the PowerPC platform (using Altivec).

To make the include process correct, PowerPC's toolchain class is overridden to insert a new headers directory (named ppc_wrappers) into the path. Basic test cases for several intrinsic functions are added.

The header is mainly developed by Steven Munroe, with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu.

Reviewed By: Jinsong Ji

Differential Revision: https://reviews.llvm.org/D59924

llvm-svn: 358949
2019-04-23 05:50:24 +00:00
Craig Topper a54a11e22a [X86] Improve avx512-kconstraints-att_inline_asm.c to not be easily defeated by deadcode elimination. Improve CHECK lines to check IR types used. NFC
I plan to use this as the basis for backend IR test cases. We currently crash hard for using 32 or 64 bit mask registers without avx512bw.

llvm-svn: 358435
2019-04-15 18:39:36 +00:00
Craig Topper 8e364c680f [X86] Restore the pavg intrinsics.
The pattern we replaced these with may be too hard to match as demonstrated by
PR41496 and PR41316.

This patch restores the intrinsics so that we can then focus on
optimizing them.

I've mostly reverted the original patch that removed them. Though I modified
the avx512 intrinsics to not have masking built in.

Differential Revision: https://reviews.llvm.org/D60674

llvm-svn: 358427
2019-04-15 17:17:35 +00:00
Amy Huang 0d0334fe1b Relanding r357928 with fixed debuginfo check.
[MS] Add metadata for __declspec(allocator)

Original summary:
Emit !heapallocsite in the metadata for calls to functions marked with
__declspec(allocator). Eventually this will be emitted as S_HEAPALLOCSITE debug
info in codeview.

Differential Revision: https://reviews.llvm.org/D60237

llvm-svn: 358307
2019-04-12 20:25:30 +00:00
Diogo N. Sampaio eb312ddfdf [Aarch64] Add v8.2-a half precision element extract intrinsics
Summary:
Implements the intrinsics defined in the ACLE to extract half-precision fp scalar elements from float16x4_t and float16x8_t vector types.
Namely:
vduph_lane_f16
vduph_laneq_f16

Reviewers: pablooliveira, olista01, LukeGeeson, DavidSpickett

Reviewed By: DavidSpickett

Subscribers: DavidSpickett, javed.absar, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D60272

llvm-svn: 358276
2019-04-12 10:43:48 +00:00
John McCall 103556279f Fix for different build configurations.
llvm-svn: 358125
2019-04-10 19:11:32 +00:00
John McCall 8b36ac818c Don't emit an unreachable return block.
Patch by Brad Moody.

llvm-svn: 358104
2019-04-10 17:03:09 +00:00
Alex Bradbury 91542e14c7 [RISCV] Unbreak test from r357989
There were some errors in the committed test checks, left in due to a git
stash apply mishap.

llvm-svn: 357993
2019-04-09 10:44:47 +00:00
Alex Bradbury fa3eb12010 [RISCV][NFC] Minor fixup for r357989
One of the tests in riscv64-lp64-lp64f-lp64d would have had a different
lowering for lp64f/lp64d as a float argument was missed.

llvm-svn: 357991
2019-04-09 10:25:05 +00:00
Alex Bradbury c0e8231cdd [RISCV][NFC] Refactor RISC-V ABI lowering tests in preparation for hard float patches
Split tests in to files representing the subset of RISC-V ABIs they should
have identical output for.

llvm-svn: 357989
2019-04-09 10:12:49 +00:00
Amy Huang 8a96fa23e6 Revert "[MS] Add metadata for __declspec(allocator)"
This reverts commit e7bd735bb0.
Reverting because of buildbot failure.

llvm-svn: 357952
2019-04-08 22:46:41 +00:00
Amy Huang e7bd735bb0 [MS] Add metadata for __declspec(allocator)
Summary:
Emit !heapallocsite in the metadata for calls to functions marked with
__declspec(allocator). Eventually this will be emitted as S_HEAPALLOCSITE debug
info in codeview.

Reviewers: rnk

Subscribers: jfb, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D60237

llvm-svn: 357928
2019-04-08 17:58:29 +00:00
Sanjay Patel b276dd195a [InstCombine] canonicalize select shuffles by commuting
In PR41304:
https://bugs.llvm.org/show_bug.cgi?id=41304
...we have a case where we want to fold a binop of select-shuffle (blended) values.

Rather than try to match commuted variants of the pattern, we can canonicalize the
shuffles and check for mask equality with commuted operands.

We don't produce arbitrary shuffle masks in instcombine, but select-shuffles are a
special case that the backend is required to handle because we already canonicalize
vector select to this shuffle form.

So there should be no codegen difference from this change. It's possible that this
improves CSE in IR though.

Differential Revision: https://reviews.llvm.org/D60016

llvm-svn: 357366
2019-03-31 15:01:30 +00:00
Kang Zhang e5ac385fb1 [PowerPC] Add the support for __builtin_setrnd() in clang
Summary:
PowerPC64/PowerPC64le supports the builtin function __builtin_setrnd to set the floating point rounding mode. This function will use the least significant two bits of integer argument to set the floating point rounding mode.
double __builtin_setrnd(int mode);
The effective values for mode are:
0 - round to nearest
1 - round to zero
2 - round to +infinity
3 - round to -infinity
Note that the mode argument is taken modulo 4, so if the int argument is greater than 3, only its least significant two bits are used. For example, __builtin_setrnd(102) is equivalent to __builtin_setrnd(2).
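
The masking rule can be sketched in portable C. The enum and the helper effective_mode are hypothetical names used only for illustration. The sketch computes which mode an argument would select and does not touch the floating-point control register.

```c
/* Hypothetical names for the four modes listed above. */
enum rounding_mode {
    ROUND_NEAREST = 0,
    ROUND_ZERO    = 1,
    ROUND_POS_INF = 2,
    ROUND_NEG_INF = 3
};

/* Only the least significant two bits of the argument are significant. */
enum rounding_mode effective_mode(int mode) {
    return (enum rounding_mode)(mode & 3);
}
```

Since 102 is 0b1100110, its low two bits are 0b10, so effective_mode(102) yields ROUND_POS_INF, the same as effective_mode(2).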

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D59403

llvm-svn: 357242
2019-03-29 09:11:52 +00:00
Reid Kleckner 73253bdefc [MS] Make __iso_volatile_* available on all targets
Future versions of MSVC make these intrinsics available on x86 & x64,
according to:
http://lists.llvm.org/pipermail/cfe-dev/2019-March/061711.html

The purpose of these builtins is to emit plain, non-atomic, volatile
stores when /volatile:ms (-cc1 -fms-volatile) is enabled.

llvm-svn: 357220
2019-03-28 22:59:09 +00:00
Craig Topper 88f4054f48 [X86] Add BSR/BSF/BSWAP intrinsics to ia32intrin.h to match gcc.
Summary:
These are all implemented by icc as well.

I made bit_scan_forward/reverse forward to __bsfd/__bsrd since we also have
__bsfq/__bsrq.

Note, when lzcnt is enabled the bsr intrinsics generate lzcnt+xor instead of bsr.
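
The lzcnt+xor form follows from a simple identity: for a non-zero 32-bit value, the index of the highest set bit equals 31 XOR the number of leading zeros. A minimal sketch, assuming a GCC- or Clang-compatible compiler with a 32-bit unsigned int (for __builtin_clz); the name bsr32 is hypothetical and is not the header's implementation:

```c
#include <stdint.h>

/* Hypothetical helper: index of the most significant set bit.
 * Precondition: x != 0 (__builtin_clz is undefined for 0). */
unsigned bsr32(uint32_t x) {
    /* The leading-zero count is in 0..31, so 31 - clz equals 31 ^ clz. */
    return 31u ^ (unsigned)__builtin_clz(x);
}
```

For example, 0x10 has 27 leading zeros, and 31 ^ 27 is 4, the index of its only set bit.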

Reviewers: RKSimon, spatel

Subscribers: cfe-commits, llvm-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D59682

llvm-svn: 356848
2019-03-24 00:56:52 +00:00
Evandro Menezes 36b31bbe8c [clang] Add support for Exynos M5 (NFC)
Add Exynos M5 test cases.

llvm-svn: 356794
2019-03-22 18:44:09 +00:00
Amara Emerson c10b24691a [AArch64] Split the neon.addp intrinsic into integer and fp variants.
This is the result of discussions on the list about how to deal with intrinsics
which require codegen to disambiguate them via only the integer/fp overloads.
It causes problems for GlobalISel as some of that information is lost during
translation, while with other operations like IR instructions the information is
encoded into the instruction opcode.

This patch changes clang to emit the new faddp intrinsic if the vector operands
to the builtin have FP element types. LLVM IR AutoUpgrade has been taught to
upgrade existing calls to aarch64.neon.addp with fp vector arguments, and
we remove the workarounds introduced for GlobalISel in r355865.

This is a more permanent solution to PR40968.

Differential Revision: https://reviews.llvm.org/D59655

llvm-svn: 356722
2019-03-21 22:31:37 +00:00
Craig Topper 7339e61b89 [X86] Correct the value of MaxAtomicInlineWidth for pre-586 cpus
Use the new cx8 feature flag that was added to the backend to represent support for cmpxchg8b. Use this flag to set the MaxAtomicInlineWidth.

This also assumes all the cmpxchg instructions are enabled for CK_Generic which is what cc1 defaults to when nothing is specified.

Differential Revision: https://reviews.llvm.org/D59566

llvm-svn: 356709
2019-03-21 20:36:08 +00:00
Craig Topper 1383340422 [X86] Add __popcntd and __popcntq to ia32intrin.h to match gcc and icc. Remove popcnt feature flag from _popcnt32/_popcnt64 and move to ia32intrin.h to match gcc
gcc and icc both implement popcntd and popcntq, which we did not. gcc doesn't seem to require a feature flag for the _popcnt32/_popcnt64 spelling and will use a libcall if it's not supported.

Differential Revision: https://reviews.llvm.org/D59567

llvm-svn: 356689
2019-03-21 17:43:53 +00:00
Erich Keane 505427cb2f Permit redeclarations of a builtin to specify calling convention.
After https://reviews.llvm.org/rL355317 we noticed that quite a decent
amount of code redeclares builtins (memcpy in particular, I believe
reduced from an MSVC header) with a calling convention specified.
This gets particularly troublesome when the user specifies a new
'default' calling convention on the command line.

When looking to add a diagnostic for this case, it was noticed that we
had 3 other diagnostics that differed only slightly.  This patch ALSO
unifies those under a 'select'.  Unfortunately, the order of words in
ONE of these diagnostics was reversed ("'thiscall' calling convention"
vs "calling convention 'thiscall'"), so this patch also standardizes on
the former.

Differential Revision: https://reviews.llvm.org/D59560

Change-Id: I79f99fe7c2301640755ffdd774b46eb44526bb22
llvm-svn: 356663
2019-03-21 13:30:56 +00:00
Craig Topper e0941cb326 [X86] Add __crc32b/__crc32w/__crc32d/__crc32q intrinsics to match gcc and icc.
gcc has these intrinsics in ia32intrin.h as well. And icc implements them
though they aren't documented in the Intel Intrinsics Guide.

Differential Revision: https://reviews.llvm.org/D59533

llvm-svn: 356609
2019-03-20 20:25:28 +00:00
Jordan Rupprecht 993a05fe1b Fix CodeGen/arm64-microsoft-status-reg.cpp test
Summary: This test is failing after r356499 (verified with `ninja check-clang-codegen`). Update the register selection used in the test from x0 to x8.

Reviewers: arsenm, MatzeB, efriedma

Reviewed By: efriedma

Subscribers: efriedma, wdng, javed.absar, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D59557

llvm-svn: 356517
2019-03-19 20:55:14 +00:00
Erik Pilkington 02d5fb1a6e Add a spelling of pass_object_size that uses __builtin_dynamic_object_size
The attribute pass_dynamic_object_size(n) behaves exactly like
pass_object_size(n), but instead of evaluating __builtin_object_size on calls,
it evaluates __builtin_dynamic_object_size, which has the potential to produce
runtime code when the object size can't be determined statically.

Differential revision: https://reviews.llvm.org/D58757

llvm-svn: 356515
2019-03-19 20:44:18 +00:00
Aaron Ballman 165435ffa0 Ensure that const variables declared at namespace scope correctly have external linkage when marked as dllexport and targeting the MSVC ABI.
Patch thanks to Zahira Ammarguellat.

llvm-svn: 356458
2019-03-19 14:53:52 +00:00
Heejin Ahn 802fe81df3 [WebAssembly] Change wasm.throw's first argument to an immediate
Summary:
`wasm.throw` builtin's first 'tag' argument should be an immediate index
into the event section.

Reviewers: dschuff, craig.topper

Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D59448

llvm-svn: 356436
2019-03-19 04:58:59 +00:00
Craig Topper 8b653d0308 [X86] Add gcc rotate intrinsics to ia32intrin.h
This is another attempt at what Erich Keane tried to do in r355322.

This adds rolb, rolw, rold, rolq and their ror equivalent as always_inline wrappers around __builtin_rotate* which will lower to funnel shift intrinsics in IR.

Additionally, when _MSC_VER is not defined we will define _rotl, _lrotl, _rotr, _lrotr as macros to one of the always_inline intrinsics mentioned above. Making sure that _lrotl/_lrotr use either 32 or 64 bit based on the size of long. These need to be macros because we have builtins with the same name for MS compatibility, but _MSC_VER isn't always defined when those builtins are enabled.

We also define _rotwl and _rotwr as macros aliasing to rolw/rorw just like gcc to complete the set. These don't need to be gated with _MSC_VER because these aren't MS builtins.
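
For readers unfamiliar with the operation these wrappers expose, here is a portable C sketch of 32-bit rotates. The names rotl32 and rotr32 are hypothetical. The header described above wraps __builtin_rotate* instead, so this only shows the semantics, not the actual implementation.

```c
#include <stdint.h>

/* Hypothetical helpers showing rotate semantics. The count is reduced
 * modulo 32 and the complementary shift is masked as well, so a count of
 * 0 never produces an out-of-range shift by 32. */
uint32_t rotl32(uint32_t x, unsigned n) {
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}

uint32_t rotr32(uint32_t x, unsigned n) {
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}
```

Rotating 0x80000001 left by one moves the top bit around to bit 0 and gives 3; rotating 3 right by one restores 0x80000001.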

I've added tests both for non-MS and -ms-extensions with and without _MSC_VER being defined.

Differential Revision: https://reviews.llvm.org/D59346

llvm-svn: 356423
2019-03-18 22:25:57 +00:00
Michael Liao 3c2aadbe67 [AMDGPU] Add the missing clang change of the experimental buffer fat pointer
llvm-svn: 356385
2019-03-18 18:11:37 +00:00
Matt Arsenault 541bccf4d9 Add testcase from bug 41079
llvm-svn: 356354
2019-03-17 23:16:31 +00:00
Heejin Ahn 7e66a50bb4 [WebAssembly] Use rethrow intrinsic in the rethrow block
Summary:
Because in wasm we merge all catch clauses into one big catchpad, in
case none of the types in catch handlers matches after we test against
each of them, we should unwind to the next EH enclosing scope. For this,
we should NOT use a call to `__cxa_rethrow` but rather a call to our own
rethrow intrinsic, because what we're trying to do here is just to
transfer the control flow into the next enclosing EH pad (or the
caller). Calls to `__cxa_rethrow` should only be used after a call to
`__cxa_begin_catch`.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D59353

llvm-svn: 356317
2019-03-16 05:39:12 +00:00
Eli Friedman 4af1c26502 [CodeGen] Consider tied operands when adjusting inline asm operands.
The constraint "0" in the following asm did not consider its
relationship with "=y" when trying to replace the type of the operands.

asm ("nop" : "=y"(Mu8_1 ) : "0"(Mu8_0 ));

Patch by Xiang Zhang.

Differential Revision: https://reviews.llvm.org/D56990

llvm-svn: 356196
2019-03-14 19:46:51 +00:00
Erik Pilkington 02886e5476 Revert "Add a new attribute, fortify_stdlib"
This reverts commit r353765. After talking with our c stdlib folks, we decided
to use the existing pass_object_size attribute to implement _FORTIFY_SOURCE
wrappers, like Bionic does (I didn't realize that pass_object_size could be used
for this purpose). Sorry for the flip/flop, and thanks to James Y. Knight for
pointing this out to me.

llvm-svn: 356103
2019-03-13 21:37:01 +00:00
Francis Visoiu Mistrih dd42236c6c Reland "[Remarks] Add -foptimization-record-passes to filter remark emission"
Currently we have -Rpass for filtering the remarks that are displayed as
diagnostics, but when using -fsave-optimization-record, there is no way
to filter the remarks while generating them.

This adds support for filtering remarks by passes using a regex.
Ex: `clang -fsave-optimization-record -foptimization-record-passes=inline`

will only emit the remarks coming from the pass `inline`.

This adds:

* `-fsave-optimization-record` to the driver
* `-opt-record-passes` to cc1
* `-lto-pass-remarks-filter` to the LTOCodeGenerator
* `--opt-remarks-passes` to lld
* `-pass-remarks-filter` to llc, opt, llvm-lto, llvm-lto2
* `-opt-remarks-passes` to gold-plugin

Differential Revision: https://reviews.llvm.org/D59268

Original llvm-svn: 355964

llvm-svn: 355984
2019-03-12 21:22:27 +00:00
Francis Visoiu Mistrih 1d6c47ad2b Revert "[Remarks] Add -foptimization-record-passes to filter remark emission"
This reverts commit 20fff32b7d.

llvm-svn: 355976
2019-03-12 20:54:18 +00:00
Francis Visoiu Mistrih 20fff32b7d [Remarks] Add -foptimization-record-passes to filter remark emission
Currently we have -Rpass for filtering the remarks that are displayed as
diagnostics, but when using -fsave-optimization-record, there is no way
to filter the remarks while generating them.

This adds support for filtering remarks by passes using a regex.
Ex: `clang -fsave-optimization-record -foptimization-record-passes=inline`

will only emit the remarks coming from the pass `inline`.

This adds:

* `-fsave-optimization-record` to the driver
* `-opt-record-passes` to cc1
* `-lto-pass-remarks-filter` to the LTOCodeGenerator
* `--opt-remarks-passes` to lld
* `-pass-remarks-filter` to llc, opt, llvm-lto, llvm-lto2
* `-opt-remarks-passes` to gold-plugin

Differential Revision: https://reviews.llvm.org/D59268

llvm-svn: 355964
2019-03-12 20:28:50 +00:00
Erich Keane 92146ce399 Re-fix _lrotl/_lrotr to always take Long, no matter the platform.
r355322 fixed this, however is being reverted due to concerns with
enabling it in other modes.

Change-Id: I6a939b7469b8fa196d5871a627eb2330dbd30f29
llvm-svn: 355698
2019-03-08 15:10:07 +00:00
Erich Keane 00a5b4a275 Revert "Enable _rotl, _lrotl, _rotr, _lrotr on all platforms."
This reverts commit 24400dafe16716f28cd0e7e5fa6e004c0e50686a.

llvm-svn: 355697
2019-03-08 15:10:05 +00:00