Commit Graph

5094 Commits

Author SHA1 Message Date
Manoj Gupta da08f6ac16 [clang]: Add support for "-fno-delete-null-pointer-checks"
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.

More details : https://lkml.org/lkml/2018/4/4/601

GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.

-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.

This feature is implemented in as the function attribute
"null-pointer-is-valid"="true".
This CL only adds the attribute on the function.
It also strips "nonnull" attributes from function arguments but
keeps the related warnings unchanged.

Corresponding LLVM change rL336613 already updated the
optimizations to not treat null pointer dereferencing
as undefined if the attribute is present.

Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv

Reviewed By: jyknight

Subscribers: drinkcat, xbolva00, cfe-commits

Differential Revision: https://reviews.llvm.org/D47894

llvm-svn: 337433
2018-07-19 00:44:52 +00:00
JF Bastien 7d60a0f118 Support implicit _Atomic struct load / store
Summary:
Using _Atomic to do implicit load / store is just a seq_cst atomic_load / atomic_store. Stores currently assert in Sema::ImpCastExprToType with 'can't implicitly cast lvalue to rvalue with this cast kind', but that's erroneous. The codegen is fine as the test shows.

While investigating I found that Richard had found the problem here: https://reviews.llvm.org/D46112#1113557

<rdar://problem/40347123>

Reviewers: dexonsmith

Subscribers: cfe-commits, efriedma, rsmith, aaron.ballman

Differential Revision: https://reviews.llvm.org/D49458

llvm-svn: 337410
2018-07-18 18:01:41 +00:00
Peter Collingbourne 14b468bab6 Re-land r337333, "Teach Clang to emit address-significance tables.",
which was reverted in r337336.

The problem that required a revert was fixed in r337338.

Also added a missing "REQUIRES: x86-registered-target" to one of
the tests.

Original commit message:
> Teach Clang to emit address-significance tables.
>
> By default, we emit an address-significance table on all ELF
> targets when the integrated assembler is enabled. The emission of an
> address-significance table can be controlled with the -faddrsig and
> -fno-addrsig flags.
>
> Differential Revision: https://reviews.llvm.org/D48155

llvm-svn: 337339
2018-07-18 00:27:07 +00:00
Peter Collingbourne 35c6996b68 Revert r337333, "Teach Clang to emit address-significance tables."
Causing multiple failures on sanitizer bots due to TLS symbol errors,
e.g.

/usr/bin/ld: __msan_origin_tls: TLS definition in /home/buildbots/ppc64be-clang-test/clang-ppc64be/stage1/lib/clang/7.0.0/lib/linux/libclang_rt.msan-powerpc64.a(msan.cc.o) section .tbss.__msan_origin_tls mismatches non-TLS reference in /tmp/lit_tmp_0a71tA/mallinfo-3ca75e.o

llvm-svn: 337336
2018-07-17 23:56:30 +00:00
Peter Collingbourne 27242c0402 Teach Clang to emit address-significance tables.
By default, we emit an address-significance table on all ELF
targets when the integrated assembler is enabled. The emission of an
address-significance table can be controlled with the -faddrsig and
-fno-addrsig flags.

Differential Revision: https://reviews.llvm.org/D48155

llvm-svn: 337333
2018-07-17 23:17:16 +00:00
Mandeep Singh Grang 0054f48b44 [COFF] Add more missing MSVC ARM64 intrinsics
Summary:
Added the following intrinsics:
_BitScanForward, _BitScanReverse, _BitScanForward64, _BitScanReverse64
_InterlockedAnd64, _InterlockedDecrement64, _InterlockedExchange64,
_InterlockedExchangeAdd64, _InterlockedExchangeSub64,
_InterlockedIncrement64, _InterlockedOr64, _InterlockedXor64.

Reviewers: compnerd, mstorsjo, rnk, javed.absar

Reviewed By: mstorsjo

Subscribers: kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D49445

llvm-svn: 337327
2018-07-17 22:03:24 +00:00
Joerg Sonnenberger b79e61f8b3 Always use __mcount on NetBSD. Some platforms don't provide _mcount.
llvm-svn: 337277
2018-07-17 13:13:34 +00:00
Roman Lebedev a67e0d0989 Harden/relax clang/test/CodeGen/opt-record-MIR.c test
Summary:
If the build path is short, `Line` field can end up fitting on the same line as `File`,
but the `{{.*}}` would consume it. Keeping in mind rL293149, i think we can fix it,
while keeping it working when there are and there are not any quotations.
At least this fixes this test for me.

Reviewers: anemet, aaron.ballman, hfinkel

Reviewed By: anemet

Subscribers: cfe-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D49348

llvm-svn: 337249
2018-07-17 07:12:08 +00:00
Krzysztof Parzyszek d66941fab9 [Hexagon] Fix hvx-length feature name in testcases
llvm-svn: 337049
2018-07-13 21:32:33 +00:00
JF Bastien 9aab85a6a0 CodeGen: specify alignment + inbounds for automatic variable initialization
Summary: Automatic variable initialization was generating default-aligned stores (which are deprecated) instead of using the known alignment from the alloca. Further, they didn't specify inbounds.

Subscribers: dexonsmith, cfe-commits

Differential Revision: https://reviews.llvm.org/D49209

llvm-svn: 337041
2018-07-13 20:33:23 +00:00
Craig Topper 2f7de23bea [X86] Change the rounding mode used when testing the sqrt_round intrinsics.
Using CUR_DIRECTION is not a realistic scenario. That is equivalent to the intrinsic without rounding.

llvm-svn: 337040
2018-07-13 20:16:38 +00:00
Krzysztof Parzyszek d5cd7ce743 [Hexagon] Fix testcases failing after r336933
llvm-svn: 336936
2018-07-12 19:44:39 +00:00
Akira Hatanaka e18c2d2ce3 os_log: When there are multiple privacy annotations in the format
string, choose the strictest one instead of the last.

Also fix an undefined behavior. Move the pointer update to a later point to
avoid adding StringRef::npos to the pointer.

rdar://problem/40706280

llvm-svn: 336863
2018-07-11 22:19:14 +00:00
Joel E. Denny 72c2783012 [FileCheck] Add -allow-deprecated-dag-overlap to failing clang tests
See https://reviews.llvm.org/D47106 for details.

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D47172

llvm-svn: 336844
2018-07-11 20:26:20 +00:00
Petr Pavlu a934f9da41 Fix setting of empty implicit-section-name attribute
Code in `CodeGenModule::SetFunctionAttributes()` could set an empty
attribute `implicit-section-name` on a function that is affected by
`#pragma clang text="section"`. This is incorrect because the attribute
should contain a valid section name. If the function additionally also
used `__attribute__((section("section")))` then this could result in
emitting the function in a section with an empty name.

The patch fixes the issue by removing the problematic code that sets
empty `implicit-section-name` from
`CodeGenModule::SetFunctionAttributes()` because it is sufficient to set
this attribute only from a similar code in `setNonAliasAttributes()`
when the function is emitted.

Differential Revision: https://reviews.llvm.org/D48916

llvm-svn: 336842
2018-07-11 20:17:54 +00:00
Craig Topper db5c9aa366 [X86] Also fix the test for _mm512_mullo_epi64 to test the intrinsic instead of a copy of the intrinsic implementation.
This had the same issue I just fixed in r336739. Apparently I copy pasted _mm512_mullo_epi64 when I added _mm512_mullox_epi64.

llvm-svn: 336740
2018-07-10 23:28:05 +00:00
Craig Topper 4ddb2f3a18 [X86] Fix the test for _mm512_mullox_epi64 to test the intrinsic instead of a copy of the intrinsic implementation.
llvm-svn: 336739
2018-07-10 23:13:01 +00:00
Bjorn Pettersson 404f414ee1 Patch to fix pragma metadata for do-while loops
Summary:
Make sure that loop metadata only is put on the backedge
when expanding a do-while loop.
Previously we added the loop metadata also on the branch
in the pre-header. That could confuse optimization passes
and result in the loop metadata being associated with the
wrong loop.

Fixes https://bugs.llvm.org/show_bug.cgi?id=38011

Committing on behalf of deepak2427 (Deepak Panickal)

Reviewers: #clang, ABataev, hfinkel, aaron.ballman, bjope

Reviewed By: bjope

Subscribers: bjope, rsmith, shenhan, zzheng, xbolva00, lebedev.ri, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D48721

llvm-svn: 336717
2018-07-10 19:55:02 +00:00
Matt Arsenault 1dae500c4e AMDGPU: Try to fix test again
llvm-svn: 336681
2018-07-10 14:47:31 +00:00
Matt Arsenault 3ae7d63c80 Update test for backend error message change
llvm-svn: 336676
2018-07-10 14:03:50 +00:00
Mikhail Dvoretckii d1bf9ef0c7 [X86] Lowering integer truncation intrinsics to native IR
This patch lowers the _mm[256|512]_cvtepi{64|32|16}_epi{32|16|8} intrinsics to
native IR in cases where the result's length is less than 128 bits.

The resulting IR for 256-bit inputs is folded into VPMOV instructions, while for
128-bit inputs the vpshufb (or, in the 64-to-32-bit case, vinsertps)
instructions are generated instead

Differential Revision: https://reviews.llvm.org/D48712

llvm-svn: 336643
2018-07-10 08:22:44 +00:00
Craig Topper 36ab775cc1 [X86] Use masked the masked scalar fma builtins to implement the default rounding version of the fma intrinsics.
The rounding mode is checked in CGBuiltin.cpp to generate the correct intrinsic call.

Making this switch switchs the masking to use the i8 bitcast to <8 x i1> and extract i1 version of the IR for the mask. Previously we ended up with a scalar 'and' plus an icmp.

llvm-svn: 336637
2018-07-10 04:38:29 +00:00
Akira Hatanaka 189359d1ff Fix parsing of privacy annotations in os_log format strings.
Privacy annotations shouldn't have to appear in the first
comma-delimited string in order to be recognized. Also, they should be
ignored if they are preceded or followed by non-whitespace characters.

rdar://problem/40706280

llvm-svn: 336629
2018-07-10 00:50:25 +00:00
Craig Topper 638426fc36 [X86] Add __builtin_ia32_selectss_128 and __builtin_ia32_selectsd_128 that is suitable for use in scalar mask intrinsics.
This will convert the i8 mask argument to <8 x i1> and extract an i1 and then emit a select instruction. This replaces the '(__U & 1)" and ternary operator used in some of intrinsics. The old sequence was lowered to a scalar and and compare. The new sequence uses an i1 vector that will interoperate better with other mask intrinsics.

This removes the need to handle div_ss/sd specially in CGBuiltin.cpp. A follow up patch will add the GCCBuiltin name back in llvm and remove the custom handling.

I made some adjustments to legacy move_ss/sd intrinsics which we reused here to do a simpler extract and insert instead of 2 extracts and two inserts or a shuffle.

llvm-svn: 336622
2018-07-10 00:37:25 +00:00
Stefan Pintilie c172604dba [Power9] [CLANG] Add __float128 support for trunc to double round to odd
Add support for this builtin:
double builtin_truncf128_round_to_odd(float128)

Differential Revision: https://reviews.llvm.org/D48482

llvm-svn: 336596
2018-07-09 20:09:52 +00:00
Craig Topper 74c10e3236 [Builtins][Attributes][X86] Tag all X86 builtins with their required vector width. Add a min_vector_width function attribute and tag all x86 instrinsics with it
This is part of an ongoing attempt at making 512 bit vectors illegal in the X86 backend type legalizer due to CPU frequency penalties associated with wide vectors on Skylake Server CPUs. We want the loop vectorizer to be able to emit IR containing wide vectors as intermediate operations in vectorized code and allow these wide vectors to be legalized to 256 bits by the X86 backend even though we are targetting a CPU that supports 512 bit vectors. This is similar to what happens with an AVX2 CPU, the vectorizer can emit wide vectors and the backend will split them. We want this splitting behavior, but still be able to use new Skylake instructions that work on 256-bit vectors and support things like masking and gather/scatter.

Of course if the user uses explicit vector code in their source code we need to not split those operations. Especially if they have used any of the 512-bit vector intrinsics from immintrin.h. And we need to make it so that merely using the intrinsics produces the expected code in order to be backwards compatible.

To support this goal, this patch adds a new IR function attribute "min-legal-vector-width" that can indicate the need for a minimum vector width to be legal in the backend. We need to ensure this attribute is set to the largest vector width needed by any intrinsics from immintrin.h that the function uses. The inliner will be reponsible for merging this attribute when a function is inlined. We may also need a way to limit inlining in the future as well, but we can discuss that in the future.

To make things more complicated, there are two different ways intrinsics are implemented in immintrin.h. Either as an always_inline function containing calls to builtins(can be target specific or target independent) or vector extension code. Or as a macro wrapper around a taget specific builtin. I believe I've removed all cases where the macro was around a target independent builtin.

To support the always_inline function case this patch adds attribute((min_vector_width(128))) that can be used to tag these functions with their vector width. All x86 intrinsic functions that operate on vectors have been tagged with this attribute.

To support the macro case, all x86 specific builtins have also been tagged with the vector width that they require. Use of any builtin with this property will implicitly increase the min_vector_width of the function that calls it. I've done this as a new property in the attribute string for the builtin rather than basing it on the type string so that we can opt into it on a per builtin basis and avoid any impact to target independent builtins.

There will be future work to support vectors passed as function arguments and supporting inline assembly. And whatever else we can find that isn't covered by this patch.

Special thanks to Chandler who suggested this direction and reviewed a preview version of this patch. And thanks to Eric Christopher who has had many conversations with me about this issue.

Differential Revision: https://reviews.llvm.org/D48617

llvm-svn: 336583
2018-07-09 19:00:16 +00:00
Stefan Pintilie 3dbde8a778 [Power9] Add __float128 builtins for Round To Odd
Add a number of builtins for __float128 Round To Odd.
This is the Clang portion of the builtins work.

Differential Revision: https://reviews.llvm.org/D47548

llvm-svn: 336579
2018-07-09 18:50:40 +00:00
Craig Topper 8a8d72794f [X86] Add new scalar fma intrinsics with rounding mode that use f32/f64 types.
This allows us to handle masking in a very similar way to the default rounding version that uses llvm.fma

llvm-svn: 336507
2018-07-08 01:10:47 +00:00
Craig Topper 3e720a302c [X86] Fix a few intrinsics that were ignoring their rounding mode argument and hardcoded _MM_FROUND_CUR_DIRECTION internally.
I believe these have been broken since their introduction into clang.

I've enhanced the tests for these intrinsics to using a real rounding mode and checking all the intrinsic arguments instead of just the name.

llvm-svn: 336498
2018-07-07 22:03:16 +00:00
Craig Topper 5cbeeedd27 [X86] Fix various type mismatches in intrinsic headers and intrinsic tests that cause extra bitcasts to be emitted in the IR.
Found via imprecise grepping of the -O0 IR. There could still be more bugs out there.

llvm-svn: 336487
2018-07-07 17:03:32 +00:00
Craig Topper f89f62a680 [X86] When creating a select for scalar masked sqrt and div builtins make sure we optimize the all ones mask case.
This case occurs in the intrinsic headers so we should avoid emitting the mask in those cases.

Factor the code into a helper function to make this easy.

llvm-svn: 336472
2018-07-06 22:46:52 +00:00
Craig Topper 10f20fc42b [X86] Add missing scalar fma intrinsics with rounding, but no mask.
We had the mask versions of the rounding intrinsics, but not one without masking.

Also change the rounding tests to not use the CUR_DIRECTION rounding mode.

llvm-svn: 336470
2018-07-06 22:08:43 +00:00
Craig Topper be4c2933a2 [X86] Implement _builtin_ia32_vfmaddss and _builtin_ia32_vfmaddsd with native IR using llvm.fma intrinsic.
This generates some extra zeroing currently, but we should be able to quickly address that with some isel patterns.

llvm-svn: 336417
2018-07-06 07:14:47 +00:00
Hans Wennborg 7525edc890 [ms] Fix mangling of string literals used to initialize arrays larger or smaller than the literal
A Chromium developer reported a bug which turned out to be a mangling
collision between these two literals:

  char s[] = "foo";
  char t[32] = "foo";

They may look the same, but for the initialization of t we will (under
some circumstances) use a literal that's extended with zeros, and
both the length and those zeros should be accounted for by the mangling.

This actually makes the mangling code simpler: where it previously had
special logic for null terminators, which are not part of the
StringLiteral, that is now covered by the general algorithm.

(The problem was reported at https://crbug.com/857442)

Differential Revision: https://reviews.llvm.org/D48928

llvm-svn: 336415
2018-07-06 06:54:16 +00:00
Craig Topper 284c5f342c [X86] Use shufflevector instead of a select with a constant mask for fmaddsub/fmsubadd IR emission.
Shufflevector is easier to generate and matches what the backend pattern matches without relying on constant selects being turned into shuffles.

While I was there I also made the IR regular expressions a little stricter to ensure operand order on the shuffle.

llvm-svn: 336388
2018-07-05 20:38:31 +00:00
Gabor Buella 9679eb6527 [X86] Fix some vector cmp builtins - TRUE/FALSE predicates
This patch removes on optimization used with the TRUE/FALSE
predicates, as was suggested in https://reviews.llvm.org/D45616
for r335339.
The optimization was buggy, since r335339 used it also
for *_mask builtins, without actually applying the mask -- the
mask argument was just ignored.

Reviewers: craig.topper, uriel.k, RKSimon, andrew.w.kaylor, spatel, scanon, efriedma

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D48715

llvm-svn: 336355
2018-07-05 14:26:56 +00:00
Gabor Buella 38a7ce76ae [X86] NFC - add more test cases for vector cmp intrinsics
Add test cases with each predicate using the following
intrinsics:
_mm_cmp_pd
_mm_cmp_ps
_mm256_cmp_pd
_mm256_cmp_ps
_mm_cmp_pd_mask
_mm_cmp_ps_mask
_mm256_cmp_pd_mask
_mm256_cmp_ps_mask
_mm512_cmp_pd_mask
_mm512_cmp_ps_mask
_mm_mask_cmp_pd_mask
_mm_mask_cmp_ps_mask
_mm256_mask_cmp_pd_mask
_mm256_mask_cmp_ps_mask
_mm512_mask_cmp_pd_mask
_mm512_mask_cmp_ps_mask

Some of these are marked with FIXME, as there is bug in lowering
e.g. _mm512_mask_cmp_ps_mask.

llvm-svn: 336346
2018-07-05 12:57:47 +00:00
Lei Huang 449252d2ad [Power9] Update fp128 as a valid homogenous aggregate base type
Update clang to treat fp128 as a valid base type for homogeneous aggregate
passing and returning.

Differential Revision: https://reviews.llvm.org/D48044

llvm-svn: 336308
2018-07-05 04:32:01 +00:00
Gabor Buella bf591794f4 NFC - Fix type in builtins-ppc-p9vector.c test
llvm-svn: 336264
2018-07-04 11:29:21 +00:00
Gabor Buella ab38eb29cf NFC - typo fix in test/CodeGen/avx512f-builtins.c
llvm-svn: 336243
2018-07-04 08:32:02 +00:00
Yaron Keren 14cd1997f1 Add expected fail triple x86_64-pc-windows-gnu to test as x86_64-w64-mingw32 is already there
llvm-svn: 336047
2018-06-30 11:18:44 +00:00
Craig Topper 0029470dde [X86] Correct the width of mask arguments in intrinsic headers and tests.
All of these found by grepping through IR from the builtin tests for extra trunc and zext/sext instructions that shouldn't have been there.

Some of these were real bugs where we lost bits from the user input:
_mm512_mask_broadcast_f32x8
_mm512_maskz_broadcast_f32x8
_mm512_mask_broadcast_i32x8
_mm512_maskz_broadcast_i32x8
_mm256_mask_cvtusepi16_storeu_epi8

llvm-svn: 336042
2018-06-30 06:05:17 +00:00
Craig Topper 0e9de769a0 [X86] Remove masking from the avx512 rotate builtins. Use a select builtin instead.
llvm-svn: 336036
2018-06-30 01:32:14 +00:00
Craig Topper 8bf793fb35 [X86] Remove masking from the avx512 packed sqrt builtins. Use select builtins instead.
llvm-svn: 335945
2018-06-29 05:43:33 +00:00
Francis Visoiu Mistrih 7b7b5eb60d [NEON] Remove empty test file from r335734
Fails on Green Dragon:
http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/50174/consoleFull

UNRESOLVED: Clang :: CodeGen/vld_dup.c (5546 of 38947)
******************** TEST 'Clang :: CodeGen/vld_dup.c' FAILED ********************
Test has no run line!

llvm-svn: 335750
2018-06-27 16:17:32 +00:00
Craig Topper 851f363691 [X86] Rename llvm.x86.avx512.mask.fpclass.p* to exclude 'mask.' from the name to match llvm.
llvm-svn: 335745
2018-06-27 15:57:57 +00:00
Ivan A. Kosarev a9f484ac4a [NEON] Support vldNq intrinsics in AArch32 (Clang part)
This patch reworks the support for dup NEON intrinsics as
described in D48439.

Differential Revision: https://reviews.llvm.org/D48440

llvm-svn: 335734
2018-06-27 13:58:43 +00:00
Teresa Johnson 25becb35e9 [ThinLTO] Add testing of summary index parsing to a couple CFI tests
Summary:
Changes to some clang side tests to go with the summary parsing patch.

Depends on D47905.

Reviewers: pcc, dexonsmith, mehdi_amini

Subscribers: inglorion, eraman, cfe-commits, steven_wu

Differential Revision: https://reviews.llvm.org/D47906

llvm-svn: 335618
2018-06-26 15:50:34 +00:00
Sam Clegg 6fd7d680b0 [WebAssembly] Add no-prototype attribute to prototype-less C functions
The WebAssembly backend in particular benefits from being
able to distinguish between varargs functions (...) and prototype-less
C functions.

Differential Revision: https://reviews.llvm.org/D48443

llvm-svn: 335510
2018-06-25 18:47:32 +00:00
Hans Wennborg 08c5a7b8fd [clang-cl] Don't emit dllexport inline functions etc. from pch files (PR37801)
With MSVC, PCH files are created along with an object file that needs to
be linked into the final library or executable. That object file
contains the code generated when building the headers. In particular, it
will include definitions of inline dllexport functions, and because they
are emitted in this object file, other files using the PCH do not need
to emit them. See the bug for an example.

This patch makes clang-cl match MSVC's behaviour in this regard, causing
significant compile-time savings when building dlls using precompiled
headers.

For example, in a 64-bit optimized shared library build of Chromium with
PCH, it reduces the binary size and compile time of
stroke_opacity_custom.obj from 9315564 bytes to 3659629 bytes and 14.6
to 6.63 s. The wall-clock time of building blink_core.dll goes from
38m41s to 22m33s. ("user" time goes from 1979m to 1142m).

Differential Revision: https://reviews.llvm.org/D48426

llvm-svn: 335466
2018-06-25 13:23:49 +00:00
Tobias Edler von Koch 7609cb83e6 Re-land "[LTO] Enable module summary emission by default for regular LTO"
Since we are now producing a summary also for regular LTO builds, we
need to run the NameAnonGlobals pass in those cases as well (the
summary cannot handle anonymous globals).

See https://reviews.llvm.org/D34156 for details on the original change.

This reverts commit 6c9ee4a4a438a8059aacc809b2dd57128fccd6b3.

llvm-svn: 335385
2018-06-22 20:23:21 +00:00
Gabor Buella 716863c820 [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR
Summary:
Lowering some vector comparision builtins to fcmp IR instructions.
This ignores the signaling behaviour specified in the predicate
argument of said builtins.

Affected AVX512 builtins:

__builtin_ia32_cmpps128_mask
__builtin_ia32_cmpps256_mask
__builtin_ia32_cmpps512_mask
__builtin_ia32_cmppd128_mask
__builtin_ia32_cmppd256_mask
__builtin_ia32_cmppd512_mask

Reviewers: craig.topper, uriel.k, RKSimon, andrew.w.kaylor, spatel, scanon, efriedma

Reviewed By: craig.topper, spatel, efriedma

Differential Revision: https://reviews.llvm.org/D45616

llvm-svn: 335339
2018-06-22 11:59:16 +00:00
Chandler Carruth 16e6bc23a1 [x86] Teach the builtin argument range check to allow invalid ranges in
dead code.

This is important for C++ templates that essentially compute the valid
input in a way that is constant and will cause all the invalid cases to
be dead code that is deleted. Code in the wild actually does this and
GCC also accepts these kinds of patterns so it is important to support
it.

To make this work, we provide a non-error path to diagnose these issues,
and use a default-error warning instead. This keeps the relatively
strict handling but prevents nastiness like SFINAE on these errors. It
also allows us to safely use the system to diagnose this only when it
occurs at runtime (in emitted code).

Entertainingly, this required fixing the syntax in various other ways
for the x86 test because we never bothered to diagnose that the returns
were invalid.

Since debugging these compile failures was super confusing, I've also
improved the diagnostic to actually say what the value was. Most of the
checks I've made ignore this to simplify maintenance, but I've checked
it in a few places to make sure the diagnsotic is working.

Depends on D48462. Without that, we might actually crash some part of
the compiler after bypassing the error here.

Thanks to Richard, Ben Kramer, and especially Craig Topper for all the
help here.

Differential Revision: https://reviews.llvm.org/D48464

llvm-svn: 335309
2018-06-21 23:46:09 +00:00
Craig Topper 342b095689 [X86] Update handling in CGBuiltin to be tolerant of out of range immediates.
D48464 contains changes that will loosen some of the range checks in SemaChecking to a DefaultError warning that can be disabled.

This patch adds explicit masking to avoid using the upper bits of immediates to gracefully handle the warning being disabled.

llvm-svn: 335308
2018-06-21 23:39:47 +00:00
Evgeniy Stepanov fb762b27f2 Ignore blacklist when generating __cfi_check_fail.
Summary: Fixes PR37898.

Reviewers: pcc, vlad.tsyrklevich

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D48454

llvm-svn: 335305
2018-06-21 23:22:37 +00:00
Tobias Edler von Koch e597a2cf81 Revert "[LTO] Enable module summary emission by default for regular LTO"
This is breaking a couple of buildbots. We need to run the
NameAnonGlobal pass for regular LTO now as well (since we're producing a
summary). I'll post a separate patch for review to make this happen and
then re-commit.

This reverts commit c0759b7b1f4a81ff9021b952aa38a222d5fa4dfd.

llvm-svn: 335291
2018-06-21 21:24:30 +00:00
Tobias Edler von Koch 9a8be606f3 [LTO] Enable module summary emission by default for regular LTO
Summary:
With D33921, we gained the ability to have module summaries in regular
LTO modules without triggering ThinLTO compilation. Module summaries in
regular LTO allow garbage collection (dead stripping) before LTO
compilation and thus open up additional optimization opportunities.

This patch enables summary emission in regular LTO for all targets
except ld64-based ones (which use the legacy LTO API).

Reviewers: pcc, tejohnson, mehdi_amini

Subscribers: inglorion, eraman, cfe-commits

Differential Revision: https://reviews.llvm.org/D34156

llvm-svn: 335284
2018-06-21 20:20:41 +00:00
Craig Topper 1763dbb278 [X86] Correct the inline assembly implementations of __movsb/w/d/q and __stosw/d/q to mark registers/memory as modified
The inline assembly for these didn't mark that edi, esi, ecx are modified by movs/stos instruction. It also didn't mark that memory is modified.

This issue was reported to llvm-dev last year http://lists.llvm.org/pipermail/cfe-dev/2017-November/055863.html but no bug was ever filed.

Differential Revision: https://reviews.llvm.org/D48448

llvm-svn: 335270
2018-06-21 18:56:30 +00:00
Anastasis Grammenos dfe8fe503c [DebugInfo] Inline for without DebugLocation
Summary:
This test is a strip down version of a function inside the
amalgamated sqlite source. When converted to IR clang produces
a phi instruction without debug location.

This patch fixes the above issue.

Differential Revision: https://reviews.llvm.org/D47720

llvm-svn: 335255
2018-06-21 16:53:48 +00:00
Craig Topper ddfe69cc99 [X86] Rewrite the add/mul/or/and reduction intrinsics to make better use of other intrinsics and remove undef shuffle indices.
Similar to what was done to max/min recently.

These already reduced the vector width to 256 and 128 bit as we go unlike the original max/min code.

Differential Revision: https://reviews.llvm.org/D48346

llvm-svn: 335253
2018-06-21 16:41:28 +00:00
Craig Topper 2da60bc231 [X86] Remove masking from the 512-bit floating point max/min builtins. Use select in IR instead.
llvm-svn: 335200
2018-06-21 05:01:01 +00:00
Sjoerd Meijer 5c4c998a54 [SPIR] Prevent SPIR targets from using half conversion intrinsics
The SPIR target currently allows for half precision floating point types to be
emitted using the LLVM intrinsic functions which convert half types to floats
and doubles. However, this is illegal in SPIR as the only intrinsic allowed by
SPIR is memcpy, as per section 3 of the SPIR specification. Currently this is
leading to an assert being hit in the Clang CodeGen when attempting to emit a
constant or literal _Float16 type in a comparison operation on a SPIR or SPIR64
target. This assert stems from the CodeGen attempting to emit a constant half
value as an integer because the backend has specified that it is using these
half conversion intrinsics (which represents half as i16). This patch prevents
SPIR targets from using these intrinsics by overloading the responsible target
info method, marks SPIR targets as having a legal half type and provides
additional regression testing for the _Float16 type on SPIR targets.

Patch by: Stephen McGroarty

Differential Revision: https://reviews.llvm.org/D48188

llvm-svn: 335111
2018-06-20 09:49:40 +00:00
Craig Topper 61495b3650 Recommit r335070 "[X86] Rewrite the max and min reduction intrinsics to make better use of other functions and to reduce width to 256 and 128 bits were possible.""
Test has been updated to reflect the IRGen.

llvm-svn: 335075
2018-06-19 21:00:30 +00:00
Craig Topper 79b13a0348 Revert r335070 "[X86] Rewrite the max and min reduction intrinsics to make better use of other functions and to reduce width to 256 and 128 bits were possible."
The test changes are failing the buildbot and its going to take me some time to fix it.

llvm-svn: 335072
2018-06-19 19:37:07 +00:00
Craig Topper 873afd0262 [X86] Rewrite the max and min reduction intrinsics to make better use of other functions and to reduce width to 256 and 128 bits were possible.
We only need to use 512 bit vectors all the way through v8i64 reductions since those max instructions are new to avx512f and only available in 512 bits until SKX.

For v16i32 and floating point we have legacy 128/256 bit instructions we can use.

I've tried to use other intrinsics to reduce the verbosity of the code and avoid having to mention all the shuffles. I've also removed all the -1 shuffle indices so the output sequence is fully specified and not left to backend optimization.

Differential Revision: https://reviews.llvm.org/D47401

llvm-svn: 335070
2018-06-19 19:13:54 +00:00
Tomasz Krupa f1792bb3d6 [X86] Lowering sqrt intrinsics to native IR
Reviewers: craig.topper, spatel, RKSimon, igorb, uriel.k

Reviewed By: craig.topper

Subscribers: tkrupa, cfe-commits

Differential Revision: https://reviews.llvm.org/D41168

llvm-svn: 334850
2018-06-15 18:05:59 +00:00
Luke Geeson da2b2e8c26 [AArch64] Reverted rC334696 with Clang VCVTA test fix
llvm-svn: 334820
2018-06-15 10:10:45 +00:00
Craig Topper b521dc3acf [X86] Add inline assembly versions of _InterlockedExchange_HLEAcquire/Release and _InterlockedCompareExchange_HLEAcquire/Release for MSVC compatibility.
Clang/LLVM doesn't have a way to pass an HLE hint through to the X86 backend to emit HLE prefixed instructions. So this is a good short term fix.

Differential Revision: https://reviews.llvm.org/D47672

llvm-svn: 334751
2018-06-14 18:43:52 +00:00
Tomasz Krupa 82aa42af49 [X86] Lowering Mask Scalar intrinsics to native IR (Clang part)
Summary: Lowering add, sub, mul, and div mask scalar intrinsic calls
to native IR.

Reviewers: craig.topper, RKSimon, spatel, sroland

Reviewed By: craig.topper

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D47979

llvm-svn: 334741
2018-06-14 17:36:23 +00:00
Luke Geeson bb399f8013 [AArch64] reverting rC334693 due to build failures
llvm-svn: 334696
2018-06-14 08:59:33 +00:00
Luke Geeson 010bbbf390 [AArch64] Added support for the vcvta_u16_f16 instrinsic for FP16 Armv8.2-A
llvm-svn: 334693
2018-06-14 08:28:56 +00:00
Mandeep Singh Grang 2d28383097 [COFF] Add ARM64 intrinsics: __yield, __wfe, __wfi, __sev, __sevl
Summary: These intrinsics result in hint instructions. They are provided here for MSVC ARM64 compatibility.

Reviewers: mstorsjo, compnerd, javed.absar

Reviewed By: mstorsjo

Subscribers: kristof.beyls, chrib, cfe-commits

Differential Revision: https://reviews.llvm.org/D48132

llvm-svn: 334639
2018-06-13 18:49:35 +00:00
Sanjay Patel 1d7ed94439 [CodeGen] make nan builtins pure rather than const (PR37778)
https://bugs.llvm.org/show_bug.cgi?id=37778
...shows a miscompile resulting from marking nan builtins as 'const'.

The nan libcalls/builtins take a pointer argument:
http://www.cplusplus.com/reference/cmath/nan-function/
...and the chars dereferenced by that arg are used to fill in the NaN constant payload bits.

"const" means that the pointer argument isn't dereferenced. That's translated to "readnone" in LLVM.
"pure" means that the pointer argument may be dereferenced. That's translated to "readonly" in LLVM.

This change prevents the IR optimizer from killing the lead-up to the nan call here:

double a() {
  char buf[4];
  buf[0] = buf[1] = buf[2] = '9';
  buf[3] = '\0';
  return __builtin_nan(buf);
}

...the optimizer isn't currently able to simplify this to a constant as we might hope, 
but this patch should solve the miscompile.

Differential Revision: https://reviews.llvm.org/D48134

llvm-svn: 334628
2018-06-13 17:54:52 +00:00
Craig Topper 2527c378c6 [X86] Remove masking from avx512vbmi2 concat and shift by immediate builtins. Use select builtins instead.
llvm-svn: 334577
2018-06-13 07:19:28 +00:00
Luke Geeson dc54b37414 [AArch64] Corrected FP16 Intrinsic range checks in Clang + added Sema tests
Summary:
This fixes the ranges for the vcvth family of FP16 intrinsics in the clang front end. Previously it was accepting incorrect ranges
-Changed builtin range checking in SemaChecking
-added tests SemaCheck changes - included in  their own file since no similar one exists
-modified existing tests to reflect new ranges

Reviewers: SjoerdMeijer, javed.absar

Reviewed By: SjoerdMeijer

Subscribers: kristof.beyls, cfe-commits

Differential Revision: https://reviews.llvm.org/D47592

llvm-svn: 334489
2018-06-12 09:54:27 +00:00
Mikhail Maltsev 5434047862 [Driver] Add aliases for -Qn/-Qy
This patch adds aliases for -Qn (-fno-ident) and -Qy (-fident) which
look less cryptic than -Qn/-Qy. The aliases are compatible with GCC.

Differential Revision: https://reviews.llvm.org/D48021

llvm-svn: 334414
2018-06-11 16:10:06 +00:00
Craig Topper 91bbe98757 [X86] Remove masking from dbpsadbw builtins, use select builtin instead.
llvm-svn: 334385
2018-06-11 06:18:29 +00:00
Craig Topper 3cce6a7ed9 [X86] Use target independent masked expandload and compressstore intrinsics to implement expandload/compressstore builtins.
Summary: We've had these target independent intrinsics for at least a year and a half. Looks like they do exactly what we need here and the backend already supports them.

Reviewers: RKSimon, delena, spatel, GBuella

Reviewed By: RKSimon

Subscribers: cfe-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D47693

llvm-svn: 334366
2018-06-10 17:27:05 +00:00
Ivan A. Kosarev 73c76c35a5 [NEON] Support VST1xN intrinsics in AArch32 mode (Clang part)
We currently support them only in AArch64. The NEON Reference,
however, says they are 'ARMv7, ARMv8' intrinsics.

Differential Revision: https://reviews.llvm.org/D47446

llvm-svn: 334362
2018-06-10 09:28:10 +00:00
Craig Topper 3614b41a8e [X86] Remove masking from the 512-bit packed floating point add/sub/mul/div builtins. Use select in IR instead.
llvm-svn: 334359
2018-06-10 06:01:42 +00:00
Craig Topper 03f4f04b91 [X86] Add builtins for vpermq/vpermpd instructions to enable target feature checking.
llvm-svn: 334311
2018-06-08 18:00:25 +00:00
Craig Topper 03de166ccd [X86] Add builtins for pshufd, pshuflw, and pshufhw to enable target feature and immediate range checking.
llvm-svn: 334265
2018-06-08 06:13:16 +00:00
Craig Topper 3428beeb2f [X86] Add subvector insert and extract builtins to enable target feature checking and immediate range checking.
Test changes are due to differences in how we generate undef elements now. We also changed the types used for extractf128_si256/insertf128_si256 to match the signature of the builtin that previously existed which this patch resurrects. This also matches gcc.

llvm-svn: 334261
2018-06-08 03:24:47 +00:00
Craig Topper acf5601961 [X86] Add builtins for vpermilps/pd instructions to enable target feature checking.
llvm-svn: 334256
2018-06-08 00:59:27 +00:00
Craig Topper 7d17d7278b [X86] Add builtins for blend with immediate control to enforce target feature requirements and check immediate range.
llvm-svn: 334249
2018-06-08 00:00:21 +00:00
Craig Topper 9392136414 [X86] Add builtins for shuff32x4/shuff64x2/shufi32x4/shuff64x2 to enable target feature checking and immediate range checking.
llvm-svn: 334244
2018-06-07 23:03:08 +00:00
Shoaib Meenai d8d1547387 [Frontend] Disallow non-MSVC exception models for windows-msvc targets
The windows-msvc target is used for MSVC ABI compatibility, including
the exceptions model. It doesn't make sense to pair a windows-msvc
target with a non-MSVC exception model. This would previously cause an
assertion failure; explicitly error out for it in the frontend instead.
This also allows us to reduce the matrix of target/exception models a
bit (see the modified tests), and we can possibly simplify some of the
personality code in a follow-up.

Differential Revision: https://reviews.llvm.org/D47853

llvm-svn: 334243
2018-06-07 22:54:54 +00:00
Reid Kleckner aa46ed9278 [MS] Re-add support for the ARM interlocked bittest intrinscs
Adds support for these intrinsics, which are ARM and ARM64 only:
  _interlockedbittestandreset_acq
  _interlockedbittestandreset_rel
  _interlockedbittestandreset_nf
  _interlockedbittestandset_acq
  _interlockedbittestandset_rel
  _interlockedbittestandset_nf

Refactor the bittest intrinsic handling to decompose each intrinsic into
its action, its width, and its atomicity.

llvm-svn: 334239
2018-06-07 21:39:04 +00:00
Craig Topper d3623155a2 [X86] Add back builtins for _mm_slli_si128/_mm_srli_si128 and similar intrinsics.
We still lower them to native shuffle IR, but we do it in CGBuiltin.cpp now. This allows us to check the target feature and ensure the immediate fits in 8 bits.

This also improves our -O0 codegen slightly because we're able to see the zeroinitializer in the shuffle. It looks like it got lost behind a store+load previously.

llvm-svn: 334208
2018-06-07 17:28:03 +00:00
Gabor Buella 1a83d06768 [CodeGen] Improve diagnostics related to target attributes
Summary:
When requirement imposed by __target__ attributes on functions
are not satisfied, prefer printing those requirements, which
are explicitly mentioned in the attributes.

This makes such messages more useful, e.g. printing avx512f instead of avx2
in the following scenario:

```
$ cat foo.c
static inline void __attribute__((__always_inline__, __target__("avx512f")))
x(void)
{
}

int main(void)
{
            x();
}
$ clang foo.c
foo.c:7:2: error: always_inline function 'x' requires target feature 'avx2', but would be inlined into function 'main' that is compiled without support for 'avx2'
        x();
    ^
1 error generated.
```

bugzilla: https://bugs.llvm.org/show_bug.cgi?id=37338

Reviewers: craig.topper, echristo, dblaikie

Reviewed By: craig.topper, echristo

Differential Revision: https://reviews.llvm.org/D46541

llvm-svn: 334174
2018-06-07 08:48:36 +00:00
Craig Topper b92c77d176 [X86] Add back _mask, _maskz, and _mask3 builtins for some 512-bit fmadd/fmsub/fmaddsub/fmsubadd builtins.
Summary:
We recently switch to using a selects in the intrinsics header files for FMA instructions. But the 512-bit versions support flavors with rounding mode which must be an Integer Constant Expression. This has forced those intrinsics to be implemented as macros. As it stands now the mask and mask3 intrinsics evaluate one of their macro arguments twice. If that argument itself is another intrinsic macro, we can end up over expanding macros. Or if its something we can CSE later it would show up multiple times when it shouldn't.

I tried adding __extension__ around the macro and making it an expression statement and declaring a local variable. But whatever name you choose for the local variable can never be used as the name of an input to the macro in user code. If that happens you would end up with the same name on the LHS and RHS of an assignment after expansion. We might be safe if we use __ in front of the variable names because those names are reserved and user code shouldn't use that, but I wasn't sure I wanted to make that claim.

The other option which I've chosen here, is to add back _mask, _maskz, and _mask3 flavors of the builtin which we will expand in CGBuiltin.cpp to replicate the argument as needed and insert any fneg needed on the third operand to make a subtract. The _maskz isn't truly necessary if we have an unmasked version or if we use the masked version with a -1 mask and wrap a select around it. But I've chosen to make things more uniform.

I separated out the scalar builtin handling to avoid too many things going on in EmitX86FMAExpr. It was different enough due to the extract and insert that the minor duplication of the CreateCall was probably worth it.

Reviewers: tkrupa, RKSimon, spatel, GBuella

Reviewed By: tkrupa

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D47724

llvm-svn: 334159
2018-06-07 02:46:02 +00:00
Evandro Menezes 0804523bd5 [PATCH 2/2] [test] Add support for Samsung Exynos M4 (NFC)
Add test cases for Exynos M4.

llvm-svn: 334116
2018-06-06 18:58:01 +00:00
Reid Kleckner 11c99ed05f [MS][ARM64]: Promote _setjmp to_setjmpex as there is no _setjmp in the ARM64 libvcruntime.lib
Factor out the common setjmp call emission code.

Based on a patch by Chris January

Differential Revision: https://reviews.llvm.org/D47784

llvm-svn: 334112
2018-06-06 18:39:47 +00:00
Reid Kleckner 368d52b7e0 Implement bittest intrinsics generically for non-x86 platforms
I tested these locally on an x86 machine by disabling the inline asm
codepath and confirming that it does the same bitflips as we do with the
inline asm.

Addresses code review feedback.

llvm-svn: 334059
2018-06-06 01:35:08 +00:00
Craig Topper f3914b74c1 [X86] Add builtins for vector element insert and extract for different 128 and 256 bit vector types. Use them to implement the extract and insert intrinsics.
Previously we were just using extended vector operations in the header file.

This unfortunately allowed non-constant indices to be used with the intrinsics. This is incompatible with gcc, icc, and MSVC. It also introduces a different performance characteristic because non-constant index gets lowered to a vector store and an element sized load.

By adding the builtins we can check for the index to be a constant and ensure its in range of the vector element count.

User code still has the option to use extended vector operations themselves if they need non-constant indexing.

llvm-svn: 334057
2018-06-06 00:24:55 +00:00
Reid Kleckner 1d9c249db5 Reimplement the bittest intrinsic family as builtins with inline asm
We need to implement _interlockedbittestandset as a builtin for
windows.h, so we might as well do the whole family. It reduces code
duplication anyway.

Fixes PR33188, a long standing bug in our bittest implementation
encountered by Chakra.

llvm-svn: 333978
2018-06-05 01:33:40 +00:00
Teresa Johnson e2e630b01b [ThinLTO] Add testing of new summary index format to a couple CFI tests
Summary:
Adds testing of combined index summary entries in disassembly format
to CFI tests that were already testing the bitcode format.

Depends on D46699.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, cfe-commits

Differential Revision: https://reviews.llvm.org/D46700

llvm-svn: 333966
2018-06-04 23:05:24 +00:00
Reid Kleckner 89fbd55145 Revert r333791 "Cap "voluntary" vector alignment at 16 for all Darwin platforms."
Adding __attribute__((aligned(32))) to __m256 breaks the implementation
of _mm256_loadu_ps on Windows. On Windows, alignment attributes have
higher precedence than packing attributes.

We also might want to carefully consider the consequences of changing
our vector typedefs, since many users copy them and invent their own
new, non-Intel specific vector type names.

llvm-svn: 333958
2018-06-04 21:39:20 +00:00
Craig Topper 95ed88a1a9 [X86] Avoid passing _mm_undefined* to builtin_shufflevector if we are able to pass the first input a second time.
This is more consistent with other usages of builtin_shufflevector. Later optimization passes or codegen will detect the duplicate vector and replace it with undef. Using _mm_undefined just puts a zeroinitializer that still needs to be optimized out later.

llvm-svn: 333944
2018-06-04 19:28:09 +00:00
Craig Topper 6fb26f93ef [X86] Replace __builtin_ia32_vbroadcastf128_pd256 and __builtin_ia32_vbroadcastf128_ps256 with an unaligned load intrinsics and a __builtin_shufflevector call.
llvm-svn: 333853
2018-06-03 19:42:59 +00:00
Ivan A. Kosarev 9c40c0ad0c [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)
We currently support them only in AArch64. The NEON Reference,
however, says they are 'ARMv7, ARMv8' intrinsics.

Differential Revision: https://reviews.llvm.org/D47121

llvm-svn: 333829
2018-06-02 17:42:59 +00:00
John McCall 280c656031 Cap "voluntary" vector alignment at 16 for all Darwin platforms.
This fixes two major problems:
- We were not capping vector alignment as desired on 32-bit ARM.
- We were using different alignments based on the AVX settings on
  Intel, so we did not have a consistent ABI.

This is an ABI break, but we think we can get away with it because
vectors tend to be used mostly in inline code (which is why not having
a consistent ABI has not proven disastrous on Intel).

Intel's AVX types are specified as having 32-byte / 64-byte alignment,
so align them explicitly instead of relying on the base ABI rule.
Note that this sort of attribute is stripped from template arguments
in template substitution, so there's a possibility that code templated
over vectors will produce inadequately-aligned objects.  The right
long-term solution for this is for alignment attributes to be
interpreted as true qualifiers and thus preserved in the canonical type.

llvm-svn: 333791
2018-06-01 21:34:26 +00:00
Dan Gohman 9f8ee03772 [WebAssembly] Update to the new names for the memory builtin functions.
The WebAssembly committee has decided on the names `memory.size` and
`memory.grow` for the memory intrinsics, so update the clang builtin
functions to follow those names, keeping both sets of old names in place
for compatibility.

llvm-svn: 333712
2018-06-01 00:05:51 +00:00
Peter Collingbourne 3aa30e8062 IRGen: Write .dwo files when -split-dwarf-file is used together with -fthinlto-index.
Differential Revision: https://reviews.llvm.org/D47597

llvm-svn: 333677
2018-05-31 18:25:59 +00:00
Craig Topper a6dd2faaea [X86] Make 512-bit unmasked load/store builtins more like their 128/256-bit equivalents.
Previously we were just passing -1 mask to the masked builtin. This changes it to the more generic way that the 128/256 bit use.

llvm-svn: 333626
2018-05-31 05:02:08 +00:00
Craig Topper c633867944 [X86] Remove __extension__ from macro intrinsics when its not needed.
I think this is a holdover from when we used to declare variables inside the macros. And then its been copy and pasted forward for years every time a new macro intrinsic gets added.

Interestingly this caused some tests for IRGen to be slightly more optimized. We now return a zeroinitializer directly instead of going through a store+load.

It also removed a bogus error message on another test.

llvm-svn: 333613
2018-05-31 00:51:20 +00:00
Eric Christopher 5b91350b4a Add fopen to the list of builtins that we check and whitelist.
llvm-svn: 333594
2018-05-30 21:11:45 +00:00
Craig Topper c5ec55e921 [X86] Simplify the implementation of _mm_sqrt_ss, _mm_rcp_ss, and _mm_rsqrt_ss.
We don't need the insertion back into the original vector at the end. The builtin already understands that.

This is different than _mm_sqrt_sd which takes two arguments and we do need to insert.

llvm-svn: 333572
2018-05-30 18:27:07 +00:00
Craig Topper dff5b311af [X86] Reduce the number of setzero intrinsics to just the set defined by the Intel Intrinsics Guide.
We had quite a few for different element sizes of integers sometimes with strange target features attached to them.

We only need a single version for each of _m128i, _m256i, and _m512i with the target feature that first introduced those types.

llvm-svn: 333568
2018-05-30 18:02:11 +00:00
Gabor Buella 70d8d51073 [X86] Lowering FMA intrinsics to native IR (Clang part)
This patch replaces all packed (and scalar without rounding
mode) fused intrinsics with fmadd/fmaddsub variations.
Then fmadd/fmaddsub are lowered to native IR.

Patch by tkrupa

Reviewers: craig.topper, sroland, spatel, RKSimon

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D47444

llvm-svn: 333555
2018-05-30 15:27:49 +00:00
Simon Tatham 89e31fa7fc Support __iso_volatile_load8 etc on aarch64-win32.
These intrinsics are used by MSVC's header files on AArch64 Windows as
well as AArch32, so we should support them for both targets. I've
factored them out of CodeGenFunction::EmitARMBuiltinExpr into separate
functions that EmitAArch64BuiltinExpr can call as well.

Reviewers: javed.absar, mstorsjo

Reviewed By: mstorsjo

Subscribers: kristof.beyls, cfe-commits

Differential Revision: https://reviews.llvm.org/D47476

llvm-svn: 333513
2018-05-30 07:54:05 +00:00
Daniel Cederman 8cc53aecad [Sparc] Add floating-point register names
Reviewers: jyknight

Reviewed By: jyknight

Subscribers: eraman, fedor.sergeev, jrtc27, cfe-commits

Differential Revision: https://reviews.llvm.org/D47137

llvm-svn: 333510
2018-05-30 06:02:18 +00:00
Craig Topper f6e79c6d3f [X86] Remove masking from the AVX512VNNI builtins. Use a select in IR instead.
llvm-svn: 333509
2018-05-30 05:26:04 +00:00
Craig Topper 3d9305f28b [X86] Fix the names of a bunch of icelake intrinsics.
Mostly this fixes the names of all the 128-bit intrinsics to start with _mm_ instead of _mm128_ as is the convention and what the Intel docs say.

This also fixes the name of the bitshuffle intrinsics to say epi64 for 128 and 256 bit versions.

llvm-svn: 333497
2018-05-30 03:38:15 +00:00
Craig Topper 68a272d501 [X86] Merge the 3 different flavors of masked vpermi2var/vpermt2var builtins to a single version without masking. Use select builtins with appropriate operand instead.
llvm-svn: 333387
2018-05-29 03:26:38 +00:00
Craig Topper f99532faee Revert r333347 "[X86] Rewrite the max and min reduction intrinsics to make better use of other functions and to reduce width to 256 and 128 bits were possible."
This wasn't supposed to be commited yet.

llvm-svn: 333349
2018-05-26 18:57:41 +00:00
Craig Topper 387b1423db [X86] Remove mask from avx512ifma builtins. Use a select instruction instead.
This reduces from 12 builtins to 6 since we no longer need a mask and maskz version.

llvm-svn: 333348
2018-05-26 18:55:26 +00:00
Craig Topper e091523c82 [X86] Rewrite the max and min reduction intrinsics to make better use of other functions and to reduce width to 256 and 128 bits were possible.
Summary:
We only need to use 512 bit vectors all the way through v8i64 reductions since those max instructions are new to avx512f and only available in 512 bits until SKX.

For v16i32 and floating point we have legacy 128/256 bit instructions we can use.

I've tried to use other intrinsics to reduce the verbosity of the code and avoid having to mention all the shuffles. I've also removed all the -1 shuffle indices so the output sequence is fully specified and not left to backend optimization.

Reviewers: RKSimon, spatel, GBuella

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D47401

llvm-svn: 333347
2018-05-26 18:55:24 +00:00
Paul Robinson 76178632a2 Revert "[DebugInfo] Don't bother with MD5 checksums of preprocessed files."
This reverts commit d734f2aa3f76fbf355ecd2bbe081d0c1f49867ab.
Also known as r333311.  A very small but nonzero number of bots fail.

llvm-svn: 333319
2018-05-25 22:35:59 +00:00
Paul Robinson 638d606f83 [DebugInfo] Don't bother with MD5 checksums of preprocessed files.
The checksum will not reflect the real source, so there's no clear
reason to include them in the debug info.  Also this was causing a
crash on the DWARF side.

Differential Revision: https://reviews.llvm.org/D47260

llvm-svn: 333311
2018-05-25 20:59:29 +00:00
Gabor Buella 078bb99a90 [x86] invpcid intrinsic
An intrinsic for an old instruction, as described in the Intel SDM.

Reviewers: craig.topper, rnk

Reviewed By: craig.topper, rnk

Differential Revision: https://reviews.llvm.org/D47142

llvm-svn: 333256
2018-05-25 06:34:42 +00:00
Gabor Buella 5219ed89be [X86] NFC Include immintrin.h in CodeGen tests
Following r333110:
"Move all Intel defined intrinsic includes into immintrin.h"

llvm-svn: 333160
2018-05-24 07:09:08 +00:00
Eric Christopher 85f0e505e7 Add Builtins.def support for fread and fwrite to ensure that -fno-builtin-
works with them and test accordingly.

llvm-svn: 333156
2018-05-24 06:09:28 +00:00
Eric Christopher c27ad9bbc8 Migrate libcalls-fno-builtin.c test from checking optimized assembly
to checking for attributes on the call site - and fix up builtin
functions that we were testing for but not ensuring wouldn't be
optimized by the backend.

Leave one set of asm tests to make sure that we're also communicating
builtin-ness to TLI.

llvm-svn: 333154
2018-05-24 06:00:50 +00:00
Richard Smith 3e268632cf Use zeroinitializer for (trailing zero portion of) large array initializers
more reliably.

This re-commits r333044 with a fix for PR37560.

llvm-svn: 333141
2018-05-23 23:41:38 +00:00
Hans Wennborg 156349fa10 Revert r333044 "Use zeroinitializer for (trailing zero portion of) large array initializers"
It caused asserts, see PR37560.

> Use zeroinitializer for (trailing zero portion of) large array initializers
> more reliably.
>
> Clang has two different ways it emits array constants (from InitListExprs and
> from APValues), and both had some ability to emit zeroinitializer, but neither
> was able to catch all cases where we could use zeroinitializer reliably. In
> particular, emitting from an APValue would fail to notice if all the explicit
> array elements happened to be zero. In addition, for large arrays where only an
> initial portion has an explicit initializer, we would emit the complete
> initializer (which could be huge) rather than emitting only the non-zero
> portion. With this change, when the element would have a suffix of more than 8
> zero elements, we emit the array constant as a packed struct of its initial
> portion followed by a zeroinitializer constant for the trailing zero portion.
>
> In passing, I found a bug where SemaInit would sometimes walk the entire array
> when checking an initializer that only covers the first few elements; that's
> fixed here to unblock testing of the rest.
>
> Differential Revision: https://reviews.llvm.org/D47166

llvm-svn: 333067
2018-05-23 08:24:01 +00:00
Craig Topper 39e0347e6a [X86] In the floating point max reduction intrinsics, negate infinity before feeding it to set1.
Previously we negated the whole vector after splatting infinity. But its better to negate the infinity before splatting. This generates IR with the negate already folded with the infinity constant.

llvm-svn: 333062
2018-05-23 05:51:52 +00:00
Craig Topper f2043b08b4 [X86] Remove mask argument from more builtins that are handled completely in CGBuiltin.cpp. Just wrap a select builtin around them in the header file instead.
llvm-svn: 333061
2018-05-23 04:51:54 +00:00
Richard Smith 9062bbf419 Use zeroinitializer for (trailing zero portion of) large array initializers
more reliably.

Clang has two different ways it emits array constants (from InitListExprs and
from APValues), and both had some ability to emit zeroinitializer, but neither
was able to catch all cases where we could use zeroinitializer reliably. In
particular, emitting from an APValue would fail to notice if all the explicit
array elements happened to be zero. In addition, for large arrays where only an
initial portion has an explicit initializer, we would emit the complete
initializer (which could be huge) rather than emitting only the non-zero
portion. With this change, when the element would have a suffix of more than 8
zero elements, we emit the array constant as a packed struct of its initial
portion followed by a zeroinitializer constant for the trailing zero portion.

In passing, I found a bug where SemaInit would sometimes walk the entire array
when checking an initializer that only covers the first few elements; that's
fixed here to unblock testing of the rest.

Differential Revision: https://reviews.llvm.org/D47166

llvm-svn: 333044
2018-05-23 00:09:29 +00:00
Sanjay Patel 74c7fb002f [CodeGen] use nsw negation for builtin abs
The clang builtins have the same semantics as the stdlib functions.
The stdlib functions are defined in section 7.20.6.1 of the C standard with:
"If the result cannot be represented, the behavior is undefined."

That lets us mark the negation with 'nsw' because "sub i32 0, INT_MIN" would
be UB/poison.

Differential Revision: https://reviews.llvm.org/D47202

llvm-svn: 333038
2018-05-22 23:02:13 +00:00
Peter Collingbourne 91d02844a3 Reland r332885, "CodeGen, Driver: Start using direct split dwarf emission in clang."
As well as two follow-on commits r332906, r332911 with a fix for
test clang/test/CodeGen/split-debug-filename.c.

llvm-svn: 333013
2018-05-22 18:52:37 +00:00
Sanjay Patel 1ff6b27940 [CodeGen] produce the LLVM canonical form of abs
We chose the 'slt' form as canonical in IR with:
rL332819
...so we should generate that form directly for efficiency.

llvm-svn: 332989
2018-05-22 15:36:50 +00:00
Sanjay Patel b6e5d4ead1 [CodeGen] add tests for abs builtins; NFC
llvm-svn: 332988
2018-05-22 15:11:59 +00:00
Brock Wyma 8557ec5d64 [CodeView] Enable debugging of captured variables within C++ lambdas
This change will help Visual Studio resolve forward references to C++ lambda
routines used by captured variables.

Differential Revision: https://reviews.llvm.org/D45438

llvm-svn: 332975
2018-05-22 12:41:19 +00:00
Amara Emerson f528bcc32a Revert "CodeGen, Driver: Start using direct split dwarf emission in clang."
This reverts commit r332885 as it broke several greendragon buildbots.

llvm-svn: 332973
2018-05-22 11:18:58 +00:00
Craig Topper 9efb77e25f [X86] Remove a builtin that should have been removed in r332882.
llvm-svn: 332909
2018-05-21 22:10:02 +00:00
Craig Topper 288bd2e5a0 [X86] Remove masking from pternlog llvm intrinsics and use a select instruction instead.
Because the intrinsics in the headers are implemented as macros, we can't just use a select builtin and pternlog builtin. This would require one of the macro arguments to be used twice. Depending on what was passed to the macro we could expand an expression twice leading to weird behavior. We could maybe declare our local variable in the macro, but that would need to worry about name collisions.

To avoid that just generate IR directly in CGBuiltin.cpp.

Differential Revision: https://reviews.llvm.org/D47125

llvm-svn: 332891
2018-05-21 20:58:23 +00:00
Richard Smith 3f1d6de4f7 Revert r332847; it caused us to miscompile certain forms of reference initialization.
llvm-svn: 332886
2018-05-21 20:36:58 +00:00
Peter Collingbourne 47bc01786d CodeGen, Driver: Start using direct split dwarf emission in clang.
Fixes PR37466.

Differential Revision: https://reviews.llvm.org/D47093

llvm-svn: 332885
2018-05-21 20:31:59 +00:00
Craig Topper 842171de36 [X86] Use __builtin_convertvector to implement some of the packed integer to packed float conversion intrinsics.
I believe this is safe assuming default default FP environment. The conversion might be inexact, but it can never overflow the FP type so this shouldn't be undefined behavior for the uitofp/sitofp instructions.

We already do something similar for scalar conversions.

Differential Revision: https://reviews.llvm.org/D46863

llvm-svn: 332882
2018-05-21 20:19:17 +00:00
Serge Pavlov 9f8068420a [CodeGen] Recognize more cases of zero initialization
If a variable has an initializer, codegen tries to build its value. If
the variable is large in size, building its value requires substantial
resources. It causes strange behavior from user viewpoint: compilation
of huge zero initialized arrays like:

    char data_1[2147483648u] = { 0 };

consumes enormous amount of time and memory.

With this change codegen tries to determine if variable initializer is
equivalent to zero initializer. In this case variable value is not
constructed.

This change fixes PR18978.

Differential Revision: https://reviews.llvm.org/D46241

llvm-svn: 332847
2018-05-21 16:09:54 +00:00
Craig Topper 55b4067350 [X86] Remove mask arguments from permvar builtins/intrinsics. Use a select in IR instead.
Someday maybe we'll use selects for all the builtins.

llvm-svn: 332825
2018-05-20 23:34:10 +00:00
Alexander Ivchenko 0fb8c877c4 This patch aims to match the changes introduced
in gcc by https://gcc.gnu.org/ml/gcc-cvs/2018-04/msg00534.html.
The -mibt feature flag is being removed, and the -fcf-protection
option now also defines a CET macro and causes errors when used
on non-X86 targets, while X86 targets no longer check for -mibt
and -mshstk to determine if -fcf-protection is supported. -mshstk
is now used only to determine availability of shadow stack intrinsics.

Comes with an LLVM patch (D46882).

Patch by mike.dvoretsky

Differential Revision: https://reviews.llvm.org/D46881

llvm-svn: 332704
2018-05-18 11:56:21 +00:00
Peter Smith 84a9c481f5 [AArch64] Correct inline assembly test case for S modifier [NFC]
The existing test for the AArch64 inline assembly constraint S uses the
A and L modifiers. These modifiers were implemented in the original
AArch64 backend but were not carried forward to the merged backend. The
A is associated with ADRP and does nothing, the L is associated with
:lo12: . Given that A and L are not supported by GCC and not supported
by the new implementation of constraint S in LLVM (see D46745) I've
altered the test to put :lo12: directly in the string so that A and L
are not needed.

Differential Revision: https://reviews.llvm.org/D46932

llvm-svn: 332606
2018-05-17 13:17:33 +00:00
Craig Topper 9d146bbaf7 [X86] Revert part of r332266: Use __builtin_convertvector to replace some of the avx512 truncate builtins.
The masking doesn't work right in the backend for the ones that produce byte or word elements without avx512bw.

llvm-svn: 332322
2018-05-15 03:17:52 +00:00
Craig Topper 25de41cfbc [X86] Use __builtin_convertvector to replace some of the avx512 truncate builtins.
As long as the destination type is a 256 or 128 bit vector with the same number of elements we can use __builtin_convertvector to directly generate trunc IR instruction which will be handled natively by the backend.

Differential Revision: https://reviews.llvm.org/D46742

llvm-svn: 332266
2018-05-14 17:50:40 +00:00
Craig Topper 8cb261e353 [X86] Use select instrution and fpextend in the implementation of _mm512_mask_cvtps_pd and _mm512_maskz_cvtps_pd.
llvm-svn: 332213
2018-05-14 04:57:46 +00:00
Craig Topper daaf105f86 [X86] Use __builtin_convertvector to implement _mm512_cvtps_pd.
If we're using default rounding mode we can let __builtin_convertvector to generate an fpextend. This matches 128 and 256 bit.

If we're using the version that takes an explicit rounding mode argument we would need to look at the immediate to see if its CUR_DIRECTION.

llvm-svn: 332210
2018-05-14 04:05:06 +00:00
Craig Topper 6fa91254e4 [X86] Emit better code for _mm_cvtu32_sd, _mm_cvtu64_sd, _mm_cvtu32_ss, and _mm_cvtu64_ss.
We can use direct C code for these that will use uitofp and insertelement instructions.

For the versions that take an explicit rounding mode we can't do this.

llvm-svn: 332203
2018-05-13 23:03:30 +00:00
Elena Demikhovsky d31327d505 Added atomic_fetch_min, max, umin, umax intrinsics to clang.
These intrinsics work exactly as all other atomic_fetch_* intrinsics and allow to create *atomicrmw* with ordering.
Updated the clang-extensions document.

Differential Revision: https://reviews.llvm.org/D46386

llvm-svn: 332193
2018-05-13 07:45:58 +00:00
Krzysztof Parzyszek 458506871a [Hexagon] Implement checking arguments of builtin calls
llvm-svn: 332105
2018-05-11 16:41:51 +00:00
Gabor Buella 9cd4f16601 [X86] Assume alignment of movdir64b dst argument
Reviewers: craig.topper

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D46683

llvm-svn: 332091
2018-05-11 14:22:04 +00:00
Gabor Buella 3a7571259e [X86] ptwrite intrinsic
Reviewers: craig.topper, RKSimon

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D46540

llvm-svn: 331962
2018-05-10 07:28:54 +00:00
Craig Topper 74ac0eda68 [X86] Change the implementation of scalar masked load/store intrinsics to not use a 512-bit intermediate vector.
This is unnecessary for AVX512VL supporting CPUs like SKX. We can just emit a 128-bit masked load/store here no matter what. The backend will widen it to 512-bits on KNL CPUs.

Fixes the frontend portion of PR37386. Need to fix the backend to optimize the new sequences well.

llvm-svn: 331958
2018-05-10 05:43:43 +00:00
Craig Topper 2b248849ae [Builtins] Improve the IR emitted for MSVC compatible rotr/rotl builtins to match what the middle and backends understand
Previously we emitted something like

rotl(x, n) {
  n &= bitwidth-1;
  return n != 0 ? ((x << n) | (x >> (bitwidth - n)) : x;
}

We use a select to avoid the undefined behavior on the (bitwidth - n) shift.

The middle and backend don't really recognize this as a rotate and end up emitting a cmov or control flow because of the select.

A better pattern is (x << (n & mask)) | (x << (-n & mask)) where mask is bitwidth - 1.

Fixes the main complaint in PR37387. There's still some work to be done if the user writes that sequence directly on a short or char where type promotion rules can prevent it from being recognized. The builtin is emitting direct IR with unpromoted types so that isn't a problem for it.

Differential Revision: https://reviews.llvm.org/D46656

llvm-svn: 331943
2018-05-10 00:05:13 +00:00
Manoj Gupta 4fbf84c173 [Clang] Implement function attribute no_stack_protector.
Summary:
This attribute tells clang to skip this function from stack protector
when -stack-protector option is passed.
GCC option for this is:
__attribute__((__optimize__("no-stack-protector"))) and the
equivalent clang syntax would be: __attribute__((no_stack_protector))

This is used in Linux kernel to selectively disable stack protector
in certain functions.

Reviewers: aaron.ballman, rsmith, rnk, probinson

Reviewed By: aaron.ballman

Subscribers: probinson, srhines, cfe-commits

Differential Revision: https://reviews.llvm.org/D46300

llvm-svn: 331925
2018-05-09 21:41:18 +00:00
Hans Wennborg ef2f6948be Revert r331843 "[DebugInfo] Generate debug information for labels."
It broke the Chromium build (see reply on the review).

> Generate DILabel metadata and call llvm.dbg.label after label
> statement to associate the metadata with the label.
>
> Differential Revision: https://reviews.llvm.org/D45045
>
> Patch by Hsiangkai Wang.

This doesn't revert the change to backend-unsupported-error.ll
that seems to correspond to an llvm-side change.

llvm-svn: 331861
2018-05-09 09:29:58 +00:00
JF Bastien 801fca259e _Atomic of empty struct shouldn't assert
Summary:

An _Atomic of an empty struct is pretty silly. In general we just widen empty
structs to hold a byte's worth of storage, and we represent size and alignment
as 0 internally and let LLVM figure out what to do. For _Atomic it's a bit
different: the memory model mandates concrete effects occur when atomic
operations occur, so in most cases actual instructions need to get emitted. It's
really not worth trying to optimize empty struct atomics by figuring out e.g.
that a fence would do, even though sane compilers should do optimize atomics.
Further, wg21.link/p0528 will fix C++20 atomics with padding bits so that
cmpxchg on them works, which means that we'll likely need to do the zero-init
song and dance for empty atomic structs anyways (and I think we shouldn't
special-case this behavior to C++20 because prior standards are just broken).

This patch therefore makes a minor change to r176658 "Promote atomic type sizes
up to a power of two": if the width of the atomic's value type is 0, just use 1
byte for width and leave alignment as-is (since it should never be zero, and
over-aligned zero-width structs are weird but fine).

This fixes an assertion:
   (NumBits >= MIN_INT_BITS && "bitwidth too small"), function get, file ../lib/IR/Type.cpp, line 241.

It seems like this has run into other assertions before (namely the unreachable
Kind check in ImpCastExprToType), but I haven't reproduced that issue with
tip-of-tree.

<rdar://problem/39678063>

Reviewers: arphaman, rjmccall

Subscribers: aheejin, cfe-commits

Differential Revision: https://reviews.llvm.org/D46613

llvm-svn: 331845
2018-05-09 03:51:12 +00:00
Shiva Chen 667fbe2cb0 [DebugInfo] Generate debug information for labels.
Generate DILabel metadata and call llvm.dbg.label after label
statement to associate the metadata with the label.

Differential Revision: https://reviews.llvm.org/D45045

Patch by Hsiangkai Wang.

llvm-svn: 331843
2018-05-09 02:41:56 +00:00
Craig Topper 45fc2c83e6 [X86] Use target feature defines in tests instead of defining our own flag on the command line. NFCI
llvm-svn: 331683
2018-05-07 21:47:13 +00:00
Teresa Johnson fedd39045f Add -target to address errors in test from r331592
The error turns out to be:
Assertion failed: (Target.isCompatibleDataLayout(getDataLayout()) && "Can't create a MachineFunction using a Module with a " "Target-incompatible DataLayout attached\n"), function init, file /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm/lib/CodeGen/MachineFunction.cpp, line 180.

Add -target to address this. Also re-enable the test I had temporarily
commented, and move it further down in case there is still a failure
(since it pipes stderr to FileCheck).

llvm-svn: 331597
2018-05-05 16:37:31 +00:00
Teresa Johnson 1237b3acc9 Skip part of test added in r331592 to help debug bot failures
Trying to debug why/where a few bots getting exit code 256 e.g.
http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/48471/testReport/Clang/CodeGen/thinlto_diagnostic_handler_remarks_with_hotness_ll/

and a few windows bots getting no output from that RUN line e.g.
http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/11865/steps/ninja%20check%201/logs/FAIL%3A%20Clang%3A%3Athinlto-diagnostic-handler-remarks-with-hotness.ll

llvm-svn: 331596
2018-05-05 15:54:57 +00:00
Teresa Johnson 259f8ddff5 Add required target to address bot failures from r331592
Failing on non-x86 bots, needs x86 target for code gen.

llvm-svn: 331593
2018-05-05 15:15:04 +00:00
Teresa Johnson 66744f8137 [ThinLTO] Support opt remarks options with distributed ThinLTO backends
Summary:
Passes down the necessary code ge options to the LTO Config to enable
-fdiagnostics-show-hotness and -fsave-optimization-record in the ThinLTO
backend for a distributed build.

Also, remove warning about not having PGO when the input is IR.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, cfe-commits

Differential Revision: https://reviews.llvm.org/D46464

llvm-svn: 331592
2018-05-05 14:37:29 +00:00
Chandler Carruth 9325c38fdb [gcov] Make the CLang side coverage test work with the new
instrumentation codegeneration strategy of using a data structure and
a loop. Required some finesse to get the critical things being tested to
surface in a nice way for FileCheck but I think this preserves the
original intent of the test.

llvm-svn: 331411
2018-05-02 22:57:20 +00:00
Shoaib Meenai c4cf3daad8 [ARM] Remove redundant #if in test. NFC
Both sides of this #if #include the same file. Drop the #if, leaving only the #include.

Patch by Matt Glazar.

Differential Revision: https://reviews.llvm.org/D45779

llvm-svn: 331305
2018-05-01 20:38:05 +00:00
Danil Malyshev cd3fd82da3 Update existed CodeGen TBAA tests
Reviewers: hfinkel, kosarev, rjmccall

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D44616

llvm-svn: 331292
2018-05-01 18:14:36 +00:00
Gabor Buella a51e0c2243 [X86] directstore and movdir64b intrinsics
Reviewers: spatel, craig.topper, RKSimon

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D45984

llvm-svn: 331249
2018-05-01 10:05:42 +00:00
Nirav Dave 6c0665e221 [MC] Change AsmParser to leverage Assembler during evaluation
Teach AsmParser to check with Assembler for when evaluating constant
expressions.  This improves the handing of preprocessor expressions
that must be resolved at parse time. This idiom can be found as
assembling-time assertion checks in source-level assemblers. Note that
this relies on the MCStreamer to keep sufficient tabs on Section /
Fragment information which the MCAsmStreamer does not. As a result the
textual output may fail where the equivalent object generation would
pass. This can most easily be resolved by folding the MCAsmStreamer
and MCObjectStreamer together which is planned for in a separate
patch.

Currently, this feature is only enabled for assembly input, keeping IR
compilation consistent between assembly and object generation.

Reviewers: echristo, rnk, probinson, espindola, peter.smith

Reviewed By: peter.smith

Subscribers: eraman, peter.smith, arichardson, jyknight, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D45164

llvm-svn: 331218
2018-04-30 19:22:40 +00:00
Sanjay Patel c81450e29b [Driver, CodeGen] rename options to disable an FP cast optimization
As suggested in the post-commit thread for rL331056, we should match these 
clang options with the established vocabulary of the corresponding sanitizer
option. Also, the use of 'strict' is well-known for these kinds of knobs, 
and we can improve the descriptive text in the docs.

So this intends to match the logic of D46135 but only change the words.
Matching LLVM commit to match this spelling of the attribute to follow shortly.

Differential Revision: https://reviews.llvm.org/D46236

llvm-svn: 331209
2018-04-30 18:19:03 +00:00
Sanjay Patel d175476566 [Driver, CodeGen] add options to enable/disable an FP cast optimization
As discussed in the post-commit thread for:
rL330437 ( http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180423/545906.html )

We need a way to opt-out of a float-to-int-to-float cast optimization because too much 
existing code relies on the platform-specific undefined result of those casts when the 
float-to-int overflows.

The LLVM changes associated with adding this function attribute are here:
rL330947
rL330950
rL330951

Also as suggested, I changed the LLVM doc to mention the specific sanitizer flag that 
catches this problem:
rL330958

Differential Revision: https://reviews.llvm.org/D46135

llvm-svn: 331041
2018-04-27 14:22:48 +00:00
Oliver Stannard 2fcee8bd52 [ARM,AArch64] Add intrinsics for dot product instructions
The ACLE spec which describes these intrinsics hasn't been published yet, but
this is based on the final draft which will be published soon, and these have
already been implemented by GCC.

Differential revision: https://reviews.llvm.org/D46109

llvm-svn: 331039
2018-04-27 14:03:32 +00:00
Chandler Carruth 16429acacb [x86] Revert r330322 (& r330323): Lowering x86 adds/addus/subs/subus intrinsics
The LLVM commit introduces a crash in LLVM's instruction selection.

I filed http://llvm.org/PR37260 with the test case.

llvm-svn: 330997
2018-04-26 21:46:01 +00:00
Craig Topper e95bde33df [X86] Add support for _mm512_mullox_epi64 and _mm512_mask_mullox_epi64 intrinsics to match icc.
On AVX512F targets we'll produce an emulated sequence using 3 pmuludqs with shifts and adds. On AVX512DQ we'll use vpmulld.

Fixes PR37140.

llvm-svn: 330923
2018-04-26 05:38:39 +00:00
Eli Friedman e54d0ff400 [TargetInfo] Sort target features before passing them to the backend
Passing the features in random order will lead to unpredictable results
when some of the features are related (like the architecture-version
features on ARM).

It might be possible to fix this particular case in the ARM target code,
to avoid adding overlapping target features. But we should probably be
sorting in any case: the behavior shouldn't depend on StringMap's
hashing algorithm.

Differential Revision: https://reviews.llvm.org/D46030

llvm-svn: 330861
2018-04-25 19:14:05 +00:00
Paul Semel 80daae2736 add check for long double for __builtin_dump_struct
llvm-svn: 330808
2018-04-25 10:09:20 +00:00
Craig Topper ce281a41b5 [X86] Remove '#ifdef __x86_64__' around mask_set1_epi64 intrinsics.
The unmasked versions already didn't have this restrction. I don't think gcc or icc limit these to 64-bit mode so we shouldn't either.

llvm-svn: 330681
2018-04-24 03:36:08 +00:00
Mikhail Maltsev 4a4e7a31ad [CodeGen] Reland r330442: Add an option to suppress output of llvm.ident
The test case in the original patch was overly contrained and
failed on PPC targets.

llvm-svn: 330575
2018-04-23 10:08:46 +00:00
Tim Northover 9dc1d0c74e [Atomics] warn about atomic accesses using libcalls
If an atomic variable is misaligned (and that suspicion is why Clang emits
libcalls at all) the runtime support library will have to use a lock to safely
access it, with potentially very bad performance consequences. There's a very
good chance this is unintentional so it makes sense to issue a warning.

Also give it a named group so people can promote it to an error, or disable it
if they really don't care.

llvm-svn: 330566
2018-04-23 08:16:24 +00:00
Gabor Buella eba6c42e66 [X86] WaitPKG intrinsics
Reviewers: craig.topper, zvi

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D45254

llvm-svn: 330463
2018-04-20 18:44:33 +00:00
Mikhail Maltsev 42b2a0e162 Revert r330442, CodeGen/no-ident-version.c is failing on PPC
llvm-svn: 330451
2018-04-20 17:14:39 +00:00
Mikhail Maltsev 6550c13912 [CodeGen] Add an option to suppress output of llvm.ident
Summary:
By default Clang outputs its version (including git commit hash, in
case of trunk builds) into object and assembly files. It might be
useful to have an option to disable this, especially for debugging
purposes.
This patch implements new command line flags -Qn and -Qy (the names
are chosen for compatibility with GCC). -Qn disables output of
the 'llvm.ident' metadata string and the 'producer' debug info. -Qy
(enabled by default) does the opposite.

Reviewers: faisalv, echristo, aprantl

Reviewed By: aprantl

Subscribers: aprantl, cfe-commits, JDevlieghere, rogfer01

Differential Revision: https://reviews.llvm.org/D45255

llvm-svn: 330442
2018-04-20 16:29:03 +00:00
Hans Wennborg a417362c28 Fix some tests that were failing on Windows
llvm-svn: 330441
2018-04-20 15:33:44 +00:00
Saleem Abdulrasool 3fe5b7a497 Implement proper support for `-falign-functions`
This implements support for the previously ignored flag
`-falign-functions`.  This allows the frontend to request alignment on
function definitions in the translation unit where they are not
explicitly requested in code.  This is compatible with the GCC behaviour
and the ICC behaviour.

The scalar value passed to `-falign-functions` aligns functions to a
power-of-two boundary.  If flag is used, the functions are aligned to
16-byte boundaries.  If the scalar is specified, it must be an integer
less than or equal to 4096.  If the value is not a power-of-two, the
driver will round it up to the nearest power of two.

llvm-svn: 330378
2018-04-19 23:14:57 +00:00
Ivan A. Kosarev 9b20c245ca [NEON] Define vfma_n_f32() and vfmaq_n_f32() intrinsics in AArch32 mode
Differential Revision: https://reviews.llvm.org/D45670

llvm-svn: 330336
2018-04-19 15:27:28 +00:00
Erich Keane b127a39404 Fix __attribute__((force_align_arg_pointer)) misalignment bug
The force_align_arg_pointer attribute was using a hardcoded 16-byte
alignment value which in combination with -mstack-alignment=32 (or
larger) would produce a misaligned stack which could result in crashes
when accessing stack buffers using aligned AVX load/store instructions.

Fix the issue by using the "stackrealign" function attribute instead
of using a hardcoded 16-byte alignment.

Patch By: Gramner

Differential Revision: https://reviews.llvm.org/D45812

llvm-svn: 330331
2018-04-19 14:27:05 +00:00
Alexander Ivchenko d96ddccdb4 Lowering x86 adds/addus/subs/subus intrinsics (clang)
This is the patch that lowers x86 intrinsics to native IR
in order to enable optimizations.

Patch by tkrupa

Differential Revision: https://reviews.llvm.org/D44786

llvm-svn: 330323
2018-04-19 12:15:11 +00:00
Artem Belevich 0ae8590354 [NVPTX, CUDA] Added support for m8n32k16 and m32n8k16 variants of wmma instructions.
The new instructions were added added for sm_70+ GPUs in CUDA-9.1.

Differential Revision: https://reviews.llvm.org/D45068

llvm-svn: 330296
2018-04-18 21:51:48 +00:00
Ivan A. Kosarev 1243ebdcdb Revert r330195 "[NEON] Define vget_high_f16() and vget_low_f16() intrinsics in AArch64 mode only".
Differential Revision: https://reviews.llvm.org/D45668

llvm-svn: 330248
2018-04-18 12:02:49 +00:00
Keith Wyss f437e35671 [XRay] Add clang builtin for xray typed events.
Summary:
A clang builtin for xray typed events. Differs from
__xray_customevent(...) by the presence of a type tag that is vended by
compiler-rt in typical usage. This allows xray handlers to expand logged
events with their type description and plugins to process traced events
based on type.

This change depends on D45633 for the intrinsic definition.

Reviewers: dberris, pelikan, rnk, eizan

Subscribers: cfe-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D45716

llvm-svn: 330220
2018-04-17 21:32:43 +00:00
Teresa Johnson 005aadaa0d Require shell for test
Attempt to fix windows bot which doesn't like the "(cd .." invocation
added in r330194:
http://lab.llvm.org:8011/builders/clang-with-thin-lto-windows/builds/8704/steps/stage%202%20check/logs/stdio

llvm-svn: 330212
2018-04-17 20:36:51 +00:00
Akira Hatanaka 617e26152d Add a command line option 'fregister_global_dtors_with_atexit' to
register destructor functions annotated with __attribute__((destructor))
using __cxa_atexit or atexit.

Register destructor functions annotated with __attribute__((destructor))
calling __cxa_atexit in a synthesized constructor function instead of
emitting references to the functions in a special section.

The primary reason for adding this option is that we are planning to
deprecate the __mod_term_funcs section on Darwin in the future. This
feature is enabled by default only on Darwin. Users who do not want this
can use command line option 'fno_register_global_dtors_with_atexit' to
disable it.

rdar://problem/33887655

Differential Revision: https://reviews.llvm.org/D45578

llvm-svn: 330199
2018-04-17 18:41:52 +00:00
Ivan A. Kosarev b3b87c3314 [NEON] Define vget_high_f16() and vget_low_f16() intrinsics in AArch64 mode only
Differential Revision: https://reviews.llvm.org/D45668

llvm-svn: 330195
2018-04-17 16:43:07 +00:00
Teresa Johnson 9e4321c12d [ThinLTO] Pass -save-temps to LTO backend for distributed ThinLTO builds
Summary:
The clang driver option -save-temps was not passed to the LTO config,
so when invoking the ThinLTO backends via clang during distributed
builds there was no way to get LTO to save temp files.

Getting this to work with ThinLTO distributed builds also required
changing the driver to avoid a separate compile step to emit unoptimized
bitcode when the input was already bitcode under -save-temps. Not only is
this unnecessary in general, it is problematic for ThinLTO backends since
the temporary bitcode file to the backend would not match the module path
in the combined index, leading to incorrect ThinLTO backend index-based
optimizations.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, cfe-commits

Differential Revision: https://reviews.llvm.org/D45217

llvm-svn: 330194
2018-04-17 16:39:25 +00:00
Aaron Ballman fe93546b11 Add modifiers for unsigned char and signed char field printing for __builtin_dump_struct.
Patch by Paul Semel.

llvm-svn: 330188
2018-04-17 14:00:06 +00:00
Aaron Ballman b6a7702297 Add checks for format specifiers used by __builtin_dump_struct and added a new specifier for null-terminated constant strings.
Patch by Paul Semel.

llvm-svn: 330185
2018-04-17 11:57:47 +00:00
Eli Friedman 642a5ee1c1 [ARM] Compute a target feature which corresponds to the ARM version.
Currently, the interaction between the triple, the CPU, and the
supported features is a mess: the driver edits the triple to indicate
the supported architecture version, and the LLVM backend uses this to
figure out what instructions are legal.  This makes it difficult to
understand what's happening, and makes it impossible to LTO together two
modules with different computed architectures.

Instead of relying on triple rewriting to get the correct target
features, we should add the right target features explicitly.

Differential Revision: https://reviews.llvm.org/D45240

llvm-svn: 330169
2018-04-16 23:52:58 +00:00
Andrey Konovalov 1ba9d9c6ca hwasan: add -fsanitize=kernel-hwaddress flag
This patch adds -fsanitize=kernel-hwaddress flag, that essentially enables
-hwasan-kernel=1 -hwasan-recover=1 -hwasan-match-all-tag=0xff.

Differential Revision: https://reviews.llvm.org/D45046

llvm-svn: 330044
2018-04-13 18:05:21 +00:00
Ivan A. Kosarev 9cdb2c75d9 [NEON] Support vrndns_f32 intrinsic
Differential Revision: https://reviews.llvm.org/D45515

llvm-svn: 330012
2018-04-13 12:46:02 +00:00
Gabor Buella b220dd2b6c [X86] Introduce cldemote intrinsic
Reviewers: craig.topper, zvi

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D45257

llvm-svn: 329993
2018-04-13 07:37:24 +00:00
Dean Michael Berris 488f7c2b67 [XRay][clang] Add flag to choose instrumentation bundles
Summary:
This change addresses http://llvm.org/PR36926 by allowing users to pick
which instrumentation bundles to use, when instrumenting with XRay. In
particular, the flag `-fxray-instrumentation-bundle=` has four valid
values:

- `all`: the default, emits all instrumentation kinds
- `none`: equivalent to -fnoxray-instrument
- `function`: emits the entry/exit instrumentation
- `custom`: emits the custom event instrumentation

These can be combined either as comma-separated values, or as
repeated flag values.

Reviewers: echristo, kpw, eizan, pelikan

Reviewed By: pelikan

Subscribers: mgorny, cfe-commits

Differential Revision: https://reviews.llvm.org/D44970

llvm-svn: 329985
2018-04-13 02:31:58 +00:00
Eli Friedman 01d349bab1 Remove -cc1 option "-backend-option".
It means the same thing as -mllvm; there isn't any reason to have two
options which do the same thing.

Differential Revision: https://reviews.llvm.org/D45109

llvm-svn: 329965
2018-04-12 22:21:36 +00:00
Gabor Buella e708a09e21 [X86] Introduce wbinvd intrinsic
A previously missing intrinsic for an old instruction.

Reviewers: craig.topper, echristo

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D45311

llvm-svn: 329937
2018-04-12 18:42:02 +00:00
Gabor Buella a052016ef2 [x86] wbnoinvd intrinsic
The WBNOINVD instruction writes back all modified
cache lines in the processor’s internal cache to main memory
but does not invalidate (flush) the internal caches.

Reviewers: craig.topper, zvi, ashlykov

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D43817

llvm-svn: 329848
2018-04-11 20:09:09 +00:00
Shoaib Meenai 34aa13169b [CodeGen] Handle __func__ inside __finally
When we enter a __finally block, the CGF's CurCodeDecl will be null
(because CodeGenFunction::StartFunction is given an empty GlobalDecl for
a __finally block), and so the dyn_cast here will result in an assertion
failure. Change it to dyn_cast_or_null to handle this case.

Differential Revision: https://reviews.llvm.org/D45523

llvm-svn: 329836
2018-04-11 18:17:35 +00:00
Artem Belevich 24e8a680e5 [NVPTX, CUDA] Improved feature constraints on NVPTX target builtins.
When NVPTX TARGET_BUILTIN specifies sm_XX or ptxYY as required feature,
consider those features available if we're compiling for GPU >= sm_XX or have
enabled PTX version >= ptxYY.

Differential Revision: https://reviews.llvm.org/D45061

llvm-svn: 329829
2018-04-11 17:51:19 +00:00
Ivan A. Kosarev 2f326d453f [NEON] Support vfma_n and vfms_n intrinsics
Differential Revision: https://reviews.llvm.org/D45483

llvm-svn: 329814
2018-04-11 14:43:11 +00:00
Craig Topper 2575454fe9 [X86] Replace 512-bit masked pmaddubsw and pmaddwd intrinsic with unmasked intrinsic and a select.
This makes it consistent with the 128/256-bit functions.

Someday maybe we'll have all the masking moved to selects.

llvm-svn: 329775
2018-04-11 04:55:10 +00:00
Aaron Ballman 0652534131 Introduce a new builtin, __builtin_dump_struct, that is useful for dumping structure contents at runtime in circumstances where debuggers may not be easily available (such as in kernel work).
Patch by Paul Semel.

llvm-svn: 329762
2018-04-10 21:58:13 +00:00
Craig Topper 298e1712d8 [X86] Add test case for llvm change r329734
This test ensures the popfd instruction in MS inline assembly can properly find a clobber name for the dirflag register. Previously the register was named 'DF', but it needs to be named 'dirflag' to match the name in the GCC register name list.

llvm-svn: 329738
2018-04-10 18:43:44 +00:00
Gabor Buella 58fe46d99f CodeGen tests - typo fixes NFC
llvm-svn: 329689
2018-04-10 11:20:05 +00:00
Vitaly Buka 69a2e18b4a asan: kernel: make no_sanitize("address") attribute work with -fsanitize=kernel-address
Summary:
Right now to disable -fsanitize=kernel-address instrumentation, one needs to use no_sanitize("kernel-address"). Make either no_sanitize("address") or no_sanitize("kernel-address")  disable both ASan and KASan instrumentation. Also remove redundant test.

Patch by Andrey Konovalov

Reviewers: eugenis, kcc, glider, dvyukov, vitalybuka

Reviewed By: eugenis, vitalybuka

Differential Revision: https://reviews.llvm.org/D44981

llvm-svn: 329612
2018-04-09 20:10:29 +00:00
Craig Topper 304edc1e75 [X86] Emit native IR for pmuldq/pmuludq builtins.
I believe all the pieces are now in place in the backend to make this work correctly. We can either mask the input to 32 bits for pmuludg or shl/ashr for pmuldq and use a regular mul instruction. The backend should combine this to PMULUDQ/PMULDQ and then SimplifyDemandedBits will remove the and/shifts.

Differential Revision: https://reviews.llvm.org/D45421

llvm-svn: 329605
2018-04-09 19:17:54 +00:00
Dean Michael Berris 20dc6ef746 [XRay][llvm+clang] Consolidate attribute list files
Summary:
This change consolidates the always/never lists that may be provided to
clang to externally control which functions should be XRay instrumented
by imbuing attributes. The files follow the same format as defined in
https://clang.llvm.org/docs/SanitizerSpecialCaseList.html for the
sanitizer blacklist.

We also deprecate the existing `-fxray-instrument-always=` and
`-fxray-instrument-never=` flags, in favour of `-fxray-attr-list=`.

This fixes http://llvm.org/PR34721.

Reviewers: echristo, vlad.tsyrklevich, eugenis

Reviewed By: vlad.tsyrklevich

Subscribers: llvm-commits, cfe-commits

Differential Revision: https://reviews.llvm.org/D45357

llvm-svn: 329543
2018-04-09 04:02:09 +00:00
Alexander Kornienko 2a8c18d991 Fix typos in clang
Found via codespell -q 3 -I ../clang-whitelist.txt
Where whitelist consists of:

  archtype
  cas
  classs
  checkk
  compres
  definit
  frome
  iff
  inteval
  ith
  lod
  methode
  nd
  optin
  ot
  pres
  statics
  te
  thru

Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few
files that have dubious fixes reverted.)

Differential revision: https://reviews.llvm.org/D44188

llvm-svn: 329399
2018-04-06 15:14:32 +00:00
Manoj Gupta 4b3eefa5e8 Disable -fmerge-all-constants as default.
Summary:
"-fmerge-all-constants" is a non-conforming optimization and should not
be the default. It is also causing miscompiles when building Linux
Kernel (https://lkml.org/lkml/2018/3/20/872).

Fixes PR18538.

Reviewers: rjmccall, rsmith, chandlerc

Reviewed By: rsmith, chandlerc

Subscribers: srhines, cfe-commits

Differential Revision: https://reviews.llvm.org/D45289

llvm-svn: 329300
2018-04-05 15:29:52 +00:00
Vlad Tsyrklevich e55aa03ad4 Add the -fsanitize=shadow-call-stack flag
Summary:
Add support for the -fsanitize=shadow-call-stack flag which causes clang
to add ShadowCallStack attribute to functions compiled with that flag
enabled.

Reviewers: pcc, kcc

Reviewed By: pcc, kcc

Subscribers: cryptoad, cfe-commits, kcc

Differential Revision: https://reviews.llvm.org/D44801

llvm-svn: 329122
2018-04-03 22:33:53 +00:00
Rafael Espindola b2c47fbf94 Set dso_local on cfi_slowpath.
llvm-svn: 328836
2018-03-29 22:08:01 +00:00
Manoj Gupta cb668d8512 [AArch64]: Add support for parsing rN registers.
Summary:
Allow rN registers to be simply parsed as correspoing xN registers.
The "register ... asm("rN")" is an command to the
compiler's register allocator, not an operand to any individual assembly
instruction. GCC documents this syntax as "...the name of the register
that should be used."

This is needed to support the changes in Linux kernel (see
https://lkml.org/lkml/2018/3/1/268 )

Note: This will add support only for the limited use case of
register ... asm("rN"). Any other uses that make rN leak into assembly
are not supported.

Reviewers: kristof.beyls, rengolin, peter.smith, t.p.northover

Reviewed By: peter.smith

Subscribers: javed.absar, eraman, cfe-commits, srhines

Differential Revision: https://reviews.llvm.org/D44815

llvm-svn: 328829
2018-03-29 21:11:15 +00:00
Rafael Espindola 54d44bf14c Mark __cfi_check as dso_local.
llvm-svn: 328825
2018-03-29 20:51:30 +00:00
Akira Hatanaka 673af7a688 Generalize NRVO to cover C structs.
This commit generalizes NRVO to cover C structs (both trivial and
non-trivial structs).

rdar://problem/33599681

Differential Revision: https://reviews.llvm.org/D44968

llvm-svn: 328809
2018-03-29 17:56:24 +00:00
Rafael Espindola 7e9b87648b Add a dllimport test.
Thanks to rnk for the suggestion.

llvm-svn: 328800
2018-03-29 16:35:52 +00:00
Krzysztof Parzyszek 790e422be9 [Hexagon] Aid bit-reverse load intrinsics lowering with bitcode
The conversion of operatios to bitcode helps to eliminate an additional
store in certain cases. We used to lower these load intrinsics in DAG to
DAG conversion by which time, the "Dead Store Elimination" pass is
already run. There is an associated LLVM patch.
    
Patch by Sumanth Gundapaneni.

llvm-svn: 328776
2018-03-29 13:54:31 +00:00
Krzysztof Parzyszek 1ef2a1f414 [Hexagon] Add support for "new" circular buffer intrinsics
These instructions have been around for a long time, but we
haven't supported intrinsics for them. The "new" vesrions use
the CSx register for the start of the buffer instead of the K
field in the Mx register.

There is a related llvm patch.

Patch by Brendon Cahoon.

llvm-svn: 328725
2018-03-28 19:40:57 +00:00
Matt Arsenault b130ea5605 AMDGPU: Update datalayout for stack alignment
llvm-svn: 328657
2018-03-27 19:26:51 +00:00
Krzysztof Parzyszek 0aead04325 Update test after r328635 in LLVM
llvm-svn: 328641
2018-03-27 17:17:39 +00:00
Pirama Arumuga Nainar fbfba29d74 [CodeGen] Mark fma as const for Android
Summary:
r318093 sets fma, fmaf, fmal as const for Gnu and MSVC.  Android also
does not set errno for these functions.  So mark these const for
Android.

Reviewers: spatel, efriedma, srhines, chh, enh

Subscribers: cfe-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D44852

llvm-svn: 328552
2018-03-26 17:03:34 +00:00
Abderrazek Zaafrani b5ac56fb81 [ARM] Add ARMv8.2-A FP16 vector intrinsic
Putting back the code in commit r327189 that was reverted in r322737. The code is being committed in three stages and this one is the last stage: 1) r327455 fp16 feature flags, 2) r327836 pass half type or i16 based on FullFP16, and 3) the code here which the front-end fp16 vector intrinsic for ARM.

Differential revision https://reviews.llvm.org/D43650

llvm-svn: 328277
2018-03-23 00:08:40 +00:00
Rafael Espindola 1193c370b4 Set dso_local on builtin functions.
The difference between CreateRuntimeFunction and CreateBuiltinFunction
is that CreateBuiltinFunction would not set dllimport or dso_local.

To keep the current semantics, just forward to CreateRuntimeFunction
with Local=true so it doesn't add dllimport.

llvm-svn: 328224
2018-03-22 18:03:13 +00:00
Jonas Devlieghere f070268701 [CodeGen] Emit DWARF "constructor" calling convention
Now that LLVM has support for emitting calling conventions in DWARF (see
r328191) have clang emit them.

Patch by: Adrien Guinet

Differential revision: https://reviews.llvm.org/D42351

llvm-svn: 328196
2018-03-22 13:53:30 +00:00
Artem Belevich 30512869ff [NVPTX] Make tensor shape part of WMMA intrinsic's name.
This is needed for the upcoming implementation of the
new 8x32x16 and 32x8x16 variants of WMMA instructions
introduced in CUDA 9.1.

Differential Revision: https://reviews.llvm.org/D44719

llvm-svn: 328158
2018-03-21 21:55:02 +00:00
Rafael Espindola e28ff4d43f Add CHECKs for a few declarations. NFC.
We were just missing test coverage for this.

llvm-svn: 328048
2018-03-20 21:54:14 +00:00
Rafael Espindola 0d40f12596 Set dso_local on string literals.
llvm-svn: 328040
2018-03-20 20:42:55 +00:00
Abderrazek Zaafrani 585051ae74 [AArch64] Add vmulxh_lane fp16 vector intrinsic
https://reviews.llvm.org/D44591

llvm-svn: 328038
2018-03-20 20:37:31 +00:00
Saleem Abdulrasool 29149d5cb7 Basic: support PreserveMost and PreserveAll on Windows ARM
Do not ignore these calling conventions on Windows ARM.  They are used
by the swift runtime for certain calls.

llvm-svn: 328007
2018-03-20 17:33:26 +00:00
Rafael Espindola dca06024e8 Set dso_local for CFConstantStringClassReference.
This one cannot use setGVProperties since it has special logic for
when it is dllimport or not.

llvm-svn: 327993
2018-03-20 15:48:00 +00:00
Oren Ben Simhon 220671a080 Adding nocf_check attribute for cf-protection fine tuning
The patch adds nocf_check target independent attribute for disabling checks that were enabled by cf-protection flag.
The attribute can be appertained to functions and function pointers.
Attribute name follows GCC's similar attribute name.

Differential Revision: https://reviews.llvm.org/D41880

llvm-svn: 327768
2018-03-17 13:31:35 +00:00
Reid Kleckner fb93154bf1 [MS] Don't escape MS C++ names with \01
It is not needed after LLVM r327734. Now it will be easier to copy-paste
IR symbol names from Clang.

llvm-svn: 327738
2018-03-16 20:36:49 +00:00
Rafael Espindola 3c8a39cfbb Set dso_local for NSConcreteStackBlock.
llvm-svn: 327544
2018-03-14 18:19:26 +00:00
Sjoerd Meijer 95da875898 This reverts "r327189 - [ARM] Add ARMv8.2-A FP16 vector intrinsic"
This is causing problems in testing, and PR36683 was raised.
Reverting it until we have sorted out how to pass f16 vectors.

llvm-svn: 327437
2018-03-13 19:38:56 +00:00
George Burgess IV 4deb75d2e8 [CodeGen] Eagerly emit lifetime.end markers for calls
In C, we'll wait until the end of the scope to clean up aggregate
temporaries used for returns from calls. This means in cases like:

{
  // Assuming that `Bar` is large enough to warrant indirect returns
  struct Bar b = {};
  b = foo(&b);
  b = foo(&b);
  b = foo(&b);
  b = foo(&b);
}

...We'll allocate space for 5 Bars on the stack (`b`, and 4
temporaries). This becomes painful in things like large switch
statements.

If cleaning up sooner is trivial, we should do it.

llvm-svn: 327229
2018-03-10 23:06:31 +00:00
Abderrazek Zaafrani 5bd68cf742 [ARM] Add ARMv8.2-A FP16 vector intrinsic
Add the fp16 neon vector intrinsic for ARM as described in the ARM ACLE document.

Reviews in https://reviews.llvm.org/D43650

llvm-svn: 327189
2018-03-09 23:39:34 +00:00
Peter Collingbourne a2f10056d1 Fix Clang test case.
llvm-svn: 327166
2018-03-09 19:37:28 +00:00
Saleem Abdulrasool 3e70132753 CodeGen: simplify and validate exception personalities
Simplify the dispatching for the personality routines.  This really had
no test coverage previously, so add test coverage for the various cases.
This turns out to be pretty complicated as the various languages and
models interact to change personalities around.

You really should feel bad for the compiler if you are using exceptions.
There is no reason for this type of cruelty.

llvm-svn: 327105
2018-03-09 07:06:42 +00:00
George Burgess IV 5701606165 Fix a typo from r326844; NFC
llvm-svn: 326845
2018-03-06 23:09:01 +00:00
George Burgess IV 7e03f350e8 [CodeGen] Don't emit lifetime.end without lifetime.start
EmitLifetimeStart returns a non-null `size` pointer if it actually
emits a lifetime.start. Later in this function, we use `tempSize`'s
nullness to determine whether or not we should emit a lifetime.end.

llvm-svn: 326844
2018-03-06 23:07:00 +00:00
Alexander Ivchenko 9d3b45301f [x86][CET] Introduce _get_ssp, _inc_ssp intrinsics
Summary:
The _get_ssp intrinsic can be used to retrieve the
shadow stack pointer, independent of the current arch -- in
contract with the rdsspd and the rdsspq intrinsics.
Also, this intrinsic returns zero on CPUs which don't
support CET. The rdssp[d|q] instruction is decoded as nop,
essentially just returning the input operand, which is zero.
Example result of compilation:

```
xorl    %eax, %eax
movl    %eax, %ecx
rdsspq  %rcx         # NOP when CET is not supported
movq    %rcx, %rax   # return zero
```

Reviewers: craig.topper

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D43814

llvm-svn: 326689
2018-03-05 11:30:28 +00:00
Manoj Gupta 886b4505f2 Do not generate calls to fentry with __attribute__((no_instrument_function))
Summary:
Currently only calls to mcount were suppressed with
no_instrument_function attribute.
Linux kernel requires that calls to fentry should also not be
generated.
This is an extended fix for PR PR33515.

Reviewers: hfinkel, rengolin, srhines, rnk, rsmith, rjmccall, hans

Reviewed By: rjmccall

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D43995

llvm-svn: 326639
2018-03-02 23:52:44 +00:00
Martin Storsjo 87c2ad29ee [RecordLayout] Only assert that fundamental type sizes are power of two on MSVC
Make types with sizes that aren't a power of two an error (that can
be disabled) in structs with ms_struct layout, except on mingw where
the situation is quite likely to occur and GCC handles it silently.

Differential Revision: https://reviews.llvm.org/D43908

llvm-svn: 326476
2018-03-01 20:22:57 +00:00
Martin Storsjo 96b01bcfb6 [RecordLayout] Don't align to non-power-of-2 sizes when using -mms-bitfields
When targeting GNU/MinGW for i386, the size of the "long double" data
type is 12 bytes (while it is 8 bytes in MSVC). When building
with -mms-bitfields to have struct layouts match MSVC, data types
are laid out in a struct with alignment according to their size.
However, this doesn't make sense for the long double type, since
it doesn't match MSVC at all, and aligning to a non-power-of-2
size triggers other asserts later.

This matches what GCC does, aligning a long double to 4 bytes
in structs on i386 even when -mms-bitfields is specified.

This fixes asserts when using the max_align_t data type when
building for MinGW/i386 with the -mms-bitfields flag.

Differential Revision: https://reviews.llvm.org/D43734

llvm-svn: 326173
2018-02-27 06:27:06 +00:00
Scott Linder a2fbcef8ee [DebugInfo] Support DWARF v5 source code embedding extension
In DWARF v5 the Line Number Program Header is extensible, allowing values with
new content types. This vendor extension to DWARF v5 allows source text to be
embedded directly in the line tables of the debug line section.

Add new flag (-g[no-]embed-source) to Driver and CC1 which indicates
that source should be passed through to LLVM during CodeGen.

Differential Revision: https://reviews.llvm.org/D42766

llvm-svn: 326102
2018-02-26 17:32:31 +00:00
Mandeep Singh Grang ac24bb53bb [RISCV] Enable __int128_t and __uint128_t through clang flag
Summary:
If the flag -fforce-enable-int128 is passed, it will enable support for __int128_t and __uint128_t types.
This flag can then be used to build compiler-rt for RISCV32.

Reviewers: asb, kito-cheng, apazos, efriedma

Reviewed By: asb, efriedma

Subscribers: shiva0217, efriedma, jfb, dschuff, sdardis, sbc100, jgravelle-google, aheejin, rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, cfe-commits

Differential Revision: https://reviews.llvm.org/D43105

llvm-svn: 326045
2018-02-25 03:58:23 +00:00
Craig Topper 21f66a3f6b [X86] Remove some masked cvt builtins that can be replaced with legacy sse/avx buiiltins and a select.
llvm-svn: 326039
2018-02-24 18:55:13 +00:00
Craig Topper 5dc6ca8e5b [X86] Remove __builtin_ia32_permvarsf256_mask and __builtin_ia32_permvarsi256_mask and use the avx2 unmasked versions and a select instead.
llvm-svn: 326022
2018-02-24 06:46:42 +00:00
Sriraman Tallam 80af005a48 Set Module Metadata "RtLibUseGOT" when fno-plt is used.
Differential Revision: https://reviews.llvm.org/D42217

llvm-svn: 325961
2018-02-23 21:27:33 +00:00
Rafael Espindola 922f2aa9b2 Bring r325915 back.
The tests that failed on a windows host have been fixed.

Original message:

Start setting dso_local for COFF.

With this there are still some GVs where we don't set dso_local
because setGVProperties is never called. I intend to fix that in
followup commits. This is just the bare minimum to teach
shouldAssumeDSOLocal what it should do for COFF.

llvm-svn: 325940
2018-02-23 19:30:48 +00:00
Rafael Espindola 9b1d63df37 Convert test to FileCheck. NFC.
llvm-svn: 325930
2018-02-23 18:18:01 +00:00
Rafael Espindola 43ce3a3a4d Revert "Start setting dso_local for COFF."
This reverts commit r325915.

It will take some time to fix the failures on a windows host.

llvm-svn: 325929
2018-02-23 18:09:29 +00:00
Paul Robinson fba2044e73 Revert "[Darwin] Add a test to check clang produces accelerator tables."
This reverts commit 7e24e5f8bff77b7e78da3bfcc68abf42457a66c9.
aka r325850.  Clang should not have end-to-end tests.

llvm-svn: 325920
2018-02-23 16:36:48 +00:00
Rafael Espindola 004d240b6a Start setting dso_local for COFF.
With this there are still some GVs where we don't set dso_local
because setGVProperties is never called. I intend to fix that in
followup commits. This is just the bare minimum to teach
shouldAssumeDSOLocal what it should do for COFF.

llvm-svn: 325915
2018-02-23 15:32:32 +00:00
Hans Wennborg d43f40df1c Support for the mno-stack-arg-probe flag
Adds support for this flag. There is also another piece for llvm
(separate review). More info:
https://bugs.llvm.org/show_bug.cgi?id=36221

By Ruslan Nikolaev!

Differential Revision: https://reviews.llvm.org/D43108

llvm-svn: 325901
2018-02-23 13:47:36 +00:00
Stefan Maksimovic c30034e574 [mips] Revert r325872
There are still outstanding issues with byVal arguments
that prevent this from being committed. Revert for now.

llvm-svn: 325899
2018-02-23 13:46:14 +00:00
Stefan Maksimovic 3cd76b1448 [mips] Reland r310704
Recommit this change which was previously reverted
for the 5.0.0 release since the failures identified
were dealt with in r325782.

llvm-svn: 325872
2018-02-23 08:37:48 +00:00
Davide Italiano 7b16df0a72 [Darwin] Add a test to check clang produces accelerator tables.
This test was previously in lldb, and was only checking that clang
was emitting the correct section. So, it belongs here and not
in the debugger.

llvm-svn: 325850
2018-02-23 01:25:03 +00:00
Ivan A. Kosarev 124a2187ad [CodeGen] Fix generation of TBAA tags for may-alias accesses
This patch fixes creating TBAA access descriptors for
may_alias-marked access types. Currently, for such types we
generate ordinary descriptors with char as its access type. The
patch changes this to produce proper may-alias descriptors.

Differential Revision: https://reviews.llvm.org/D42366

llvm-svn: 325575
2018-02-20 12:33:04 +00:00
Craig Topper 0a70c3c7af [X86] Remove mask from 512 bit pmulhrsw/pmulhw/pmulhuw builtins.
We now use a vselect node in IR around an unmasked builtin. This makes it consistent with the 128 and 256 bit versions.

llvm-svn: 325560
2018-02-20 07:28:18 +00:00
Ivan A. Kosarev e0ef348cb9 [CodeGen] Initialize large arrays by copying from a global
Currently, clang compiles explicit initializers for array
elements into series of store instructions. For large arrays of
built-in types this results in bloated output code and
significant amount of time spent on the instruction selection
phase. This patch fixes the issue by initializing such arrays
with global constants that store the binary image of the
initializer.

Differential Revision: https://reviews.llvm.org/D43181

llvm-svn: 325478
2018-02-19 09:49:11 +00:00
Dimitry Andric 2e3f23bbcc [X86] Add 'sahf' CPU feature to frontend
Summary:
Make clang accept `-msahf` (and `-mno-sahf`) flags to activate the
`+sahf` feature for the backend, for bug 36028 (Incorrect use of
pushf/popf enables/disables interrupts on amd64 kernels).  This was
originally submitted in bug 36037 by Jonathan Looney
<jonlooney@gmail.com>.

As described there, GCC also uses `-msahf` for this feature, and the
backend already recognizes the `+sahf` feature. All that is needed is to
teach clang to pass this on to the backend.

The mapping of feature support onto CPUs may not be complete; rather, it
was chosen to match LLVM's idea of which CPUs support this feature (see
lib/Target/X86/X86.td).

I also updated the affected test case (CodeGen/attr-target-x86.c) to
match the emitted output.

Reviewers: craig.topper, coby, efriedma, rsmith

Reviewed By: craig.topper

Subscribers: emaste, cfe-commits

Differential Revision: https://reviews.llvm.org/D43394

llvm-svn: 325446
2018-02-17 21:04:35 +00:00
Vitaly Buka 769134dac3 [ThinLTO] Allow indexing to request backend to ignore the module
Summary:
Gold plugin does not add pass to ThinLTO modules without useful symbols.
In this case ThinLTO can't create corresponding index file and some features, like CFI,
cannot be processes by backed correctly without index.
Given that we don't need the backed output we can request it to avoid
processing the module. This is implemented by this patch using new
"SkipModuleByDistributedBackend" flag.

Reviewers: pcc, tejohnson

Subscribers: mehdi_amini, inglorion, eraman, cfe-commits

Differential Revision: https://reviews.llvm.org/D42995

llvm-svn: 325411
2018-02-16 23:38:22 +00:00
Vitaly Buka c35ff824de [ThinLTO] Ignore object files with no ThinLTO modules if -fthinlto-index= is set
Summary:
ThinLTO compilation may decide not to split module and keep at as regular LTO.
In this can this module already processed during indexing and already a part of
merged object file. So here we can just skip it.

Reviewers: pcc, tejohnson

Reviewed By: tejohnson

Subscribers: mehdi_amini, inglorion, eraman, cfe-commits

Differential Revision: https://reviews.llvm.org/D42680

llvm-svn: 325410
2018-02-16 23:34:16 +00:00
Sjoerd Meijer e145c1d44f [ARM] Add tests for the vcvtr builtins
This adds Sema and Codegen tests for the vcvtr builtins
(because they were missing).

Differential Revision: https://reviews.llvm.org/D43372

llvm-svn: 325351
2018-02-16 16:01:08 +00:00
Yaxun Liu f8ad59d99d Clean up AMDGCN tests
Differential Revision: https://reviews.llvm.org/D43340

llvm-svn: 325279
2018-02-15 19:12:41 +00:00
Vitaly Buka 0465e2a87e Moved CHECK in test closer to source code
llvm-svn: 325184
2018-02-14 22:52:49 +00:00
Vitaly Buka 44396faabc [ThinLTO/CFI] Include TYPE_ID summaries into GLOBALVAL_SUMMARY_BLOCK
Summary:
TypeID summaries are used by CFI and need to be serialized by ThinLTO
indexing for later use by LTO Backend.

Reviewers: tejohnson, pcc

Subscribers: mehdi_amini, inglorion, eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D42611

llvm-svn: 325182
2018-02-14 22:41:15 +00:00
Erich Keane 293a0556f3 Implement function attribute artificial
Added support in clang for GCC function attribute 'artificial'. This attribute 
is used to control stepping behavior of debugger with respect to inline 
functions.

Patch By: Elizabeth Andrews (eandrews)

Differential Revision: https://reviews.llvm.org/D43259

llvm-svn: 325081
2018-02-14 00:14:07 +00:00
Yaxun Liu 651bd73c02 [AMDGPU] Change constant addr space to 4
Differential Revision: https://reviews.llvm.org/D43171

llvm-svn: 325031
2018-02-13 18:01:21 +00:00
Sander de Smalen 9084a3b118 [DebugInfo] Avoid name conflict of generated VLA expression variable.
Summary:
This patch also adds the 'DW_AT_artificial' flag to the generated variable.

Addresses the issues mentioned in http://llvm.org/PR30553.

Reviewers: CarlosAlbertoEnciso, probinson, aprantl

Reviewed By: aprantl

Subscribers: JDevlieghere, cfe-commits

Differential Revision: https://reviews.llvm.org/D43189

llvm-svn: 324988
2018-02-13 07:49:34 +00:00
Craig Topper ebb0838f74 [X86] Reverse the operand order of the implementation of the kunpack builtins.
The second operand needs to be in the lower bits of the concatenation. This matches llvm 5.0, gcc, and icc behavior.

Fixes PR36360.

llvm-svn: 324954
2018-02-12 22:38:52 +00:00
Abderrazek Zaafrani e7ed880761 [AArch64] Fixes for ARMv8.2-A FP16 scalar intrinsic - clang portion
https://reviews.llvm.org/D42993

llvm-svn: 324940
2018-02-12 21:26:06 +00:00
Erich Keane 93e58667ee Make attribute-target on a Definition-after-use update the LLVM attributes
As reported here: https://bugs.llvm.org/show_bug.cgi?id=36301
The issue is that the 'use' causes the plain declaration to emit
the attributes to LLVM-IR. However, if the definition added it
later, these would silently disappear.

This commit extracts that logic to its own function in CodeGenModule,
and has the attribute-applications done during 'definition' update
the attributes properly.

Differential Revision: https://reviews.llvm.org/D43095

llvm-svn: 324907
2018-02-12 17:01:41 +00:00
Momchil Velikov 25f6be5326 Re-commit r324490: [DebugInfo] Improvements to representation of enumeration types (PR36168)
Differential revision: https://reviews.llvm.org/D42736

llvm-svn: 324900
2018-02-12 16:12:52 +00:00
Filipe Cabecinhas 4ba5817b8b ASan+operator new[]: Add an option for more thorough operator new[] cookie poisoning
Summary:
Right now clang is skipping array cookie poisoning for any operator
new[] which is not part of the set of replaceable global allocation
functions.

This commit adds a flag to tell clang to poison all operator new[]
cookies.

A previous review was poisoning all array cookies unconditionally, but
there is an edge case which would stop working under ASan (a custom
operator new[] saves whatever pointer it returned, and then accesses
it).

This newer revision adds a command line argument to toggle this feature.

Original revision: https://reviews.llvm.org/D41301
Compiler-rt test revision with an explanation of the edge case: https://reviews.llvm.org/D41664

Reviewers: rjmccall, kcc, rsmith

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D43013

llvm-svn: 324884
2018-02-12 11:49:02 +00:00
Craig Topper a57d64e30f [X86] Change the signature of the AVX512 packed fp compare intrinsics to return vXi1 mask. Make bitcasts to scalar explicit in IR
Summary: This is the clang equivalent of r324827

Reviewers: zvi, delena, RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43143

llvm-svn: 324828
2018-02-10 23:34:27 +00:00
Simon Pilgrim 99af1e11b2 Add vector add/sub/mul/div by scalar tests (PR27085)
Ensure the scalar is correctly splatted to all lanes

llvm-svn: 324818
2018-02-10 17:55:23 +00:00
Matt Davis 2930d7662e [CodeGen] Use the zero initializer instead of storing an all zero representation.
Summary:
This change avoids the overhead of storing, and later crawling,
an initializer list of all zeros for arrays. When LLVM
visits this (llvm/IR/Constants.cpp) ConstantArray::getImpl()
it will scan the list looking for an array of all zero.

We can avoid the store, and short-cut the scan, by detecting
all zeros when clang builds-up the initialization representation.

This was brought to my attention when investigating PR36030


Reviewers: majnemer, rjmccall

Reviewed By: rjmccall

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D42549

llvm-svn: 324776
2018-02-09 22:10:09 +00:00
Matt Arsenault e7da136a74 AMDGPU: Update for datalayout change
llvm-svn: 324748
2018-02-09 16:58:41 +00:00
Craig Topper c0b2e982d9 [X86] Replace kortest intrinsics with native IR.
llvm-svn: 324647
2018-02-08 20:16:17 +00:00
Alexander Ivchenko 4b20b3c80c Fix for #31362 - ms_abi is implemented incorrectly for values >=16 bytes.
Summary:
This patch is a fix for following issue:
https://bugs.llvm.org/show_bug.cgi?id=31362 The problem was caused by front end
lowering C calling conventions without taking into account calling conventions
enforced by attribute. In this case win64cc was no correctly lowered on targets
other than Windows.

Reviewed By: rnk (Reid Kleckner)

Differential Revision: https://reviews.llvm.org/D43016

Author: belickim <mateusz.belicki@intel.com>
llvm-svn: 324594
2018-02-08 11:15:21 +00:00
Rafael Espindola 75e5736926 Don't try to use copy relocations with tls variables.
Should fix the lldb bot.

llvm-svn: 324539
2018-02-07 23:04:06 +00:00
Rafael Espindola 699f5d6bbc Recommit r324107 again.
The difference from the previous try is that we no longer directly
access function declarations from position independent executables. It
should work, but currently doesn't with some linkers.

It now includes a fix to not mark available_externally definitions as
dso_local.

Original message:

Start setting dso_local in clang.

This starts adding dso_local to clang.

The hope is to eventually have TargetMachine::shouldAssumeDsoLocal go
away. My objective for now is to move enough of it to clang to remove
the need for the TargetMachine one to handle PIE copy relocations and
-fno-plt. With that it should then be easy to implement a
-fno-copy-reloc in clang.

This patch just adds the cases where we assume a symbol to be local
based on the file being compiled for an executable or a shared
library.

llvm-svn: 324535
2018-02-07 22:15:33 +00:00
Momchil Velikov cd0ac25124 Revert [DebugInfo] Improvements to representation of enumeration types (PR36168)"
Revert due to breaking buildbots (LLDB tests)

llvm-svn: 324508
2018-02-07 19:57:04 +00:00
Rafael Espindola 880c3b24c5 Revert "Recommit r324107."
This reverts commit r324500.

The bots found two failures:

    ThreadSanitizer-x86_64 :: Linux/pie_no_aslr.cc
    ThreadSanitizer-x86_64 :: pie_test.cc

when using gold. The issue is a limitation in gold when building pie
binaries. I will investigate how to work around it.

llvm-svn: 324505
2018-02-07 19:44:15 +00:00
Rafael Espindola fa9874c33b Recommit r324107.
It now includes a fix to not mark available_externally definitions as
dso_local.

Original message:

Start setting dso_local in clang.

This starts adding dso_local to clang.

The hope is to eventually have TargetMachine::shouldAssumeDsoLocal go
away. My objective for now is to move enough of it to clang to remove
the need for the TargetMachine one to handle PIE copy relocations and
-fno-plt. With that it should then be easy to implement a
-fno-copy-reloc in clang.

This patch just adds the cases where we assume a symbol to be local
based on the file being compiled for an executable or a shared
library.

llvm-svn: 324500
2018-02-07 19:16:49 +00:00
Momchil Velikov d7e17c232f [DebugInfo] Improvements to representation of enumeration types (PR36168)
This patch:

* fixes an incorrect sign-extension of unsigned values, when emitting
  debug info metadata for enumerators
* the enumerators metadata is created with a flag, which determines
  interpretation of the value bits (signed or unsigned)
* the enumerations metadata contains the underlying integer type and a
  flag, indicating whether this is a C++ "fixed enum"

Differential Revision: https://reviews.llvm.org/D42736

llvm-svn: 324490
2018-02-07 16:52:02 +00:00
Saleem Abdulrasool fd4db5331e Support `#pragma comment(lib, "name")` in the frontend for ELF
This adds the frontend support required to support the use of the
comment pragma to enable auto linking on ELFish targets. This is a
generic ELF extension supported by LLVM. We need to change the handling
for the "dependentlib" in order to accommodate the previously discussed
encoding for the dependent library descriptor. Without the custom
handling of the PCK_Lib directive, the -l prefixed option would be
encoded into the resulting object (which is treated as a frontend
error).

llvm-svn: 324438
2018-02-07 01:46:46 +00:00
Sander de Smalen 891af03a55 Recommit rL323952: [DebugInfo] Enable debug information for C99 VLA types.
Fixed build issue when building with g++-4.8 (specialization after instantiation).

llvm-svn: 324173
2018-02-03 13:55:59 +00:00
Rafael Espindola 9f34b7b93b Revert "Start setting dso_local in clang."
This reverts commit r324107.

I will have to test it on OS X.

llvm-svn: 324108
2018-02-02 17:29:22 +00:00
Rafael Espindola 7e34a308ff Start setting dso_local in clang.
This starts adding dso_local to clang.

The hope is to eventually have TargetMachine::shouldAssumeDsoLocal go
away. My objective for now is to move enough of it to clang to remove
the need for the TargetMachine one to handle PIE copy relocations and
-fno-plt. With that it should then be easy to implement a
-fno-copy-reloc in clang.

This patch just adds the cases where we assume a symbol to be local
based on the file being compiled for an executable or a shared
library.

llvm-svn: 324107
2018-02-02 17:17:39 +00:00
Yaxun Liu f5f45e5e63 [AMDGPU] Switch to the new addr space mapping by default
This requires corresponding llvm change.

Differential Revision: https://reviews.llvm.org/D40956

llvm-svn: 324102
2018-02-02 16:08:24 +00:00
Erich Keane 24e6840b9e [CodeGen][va_args] Correct Vector Struct va-arg 'in_reg' code gen
When trying to track down a different bug, we discovered
that calling __builtin_va_arg on a vec3f type caused
the SROA pass to issue a warning that there was an illegal
access.

Further research showed that the vec3f type is
alloca'ed as size '12', but the _builtin_va_arg code
on x86_64 was always loading this out of registers as
{double, double}. Thus, the 2nd store into the vec3f
was storing in bytes 12-15!

This patch alters the original implementation which always
assumed {double, double} to use the actual coerced type
instead, so the LLVM-IR generated is a load/GEP/store of
a <2 x float> and a float, rather than a double and a double.

Tests were added for all combinations I could think of that
would fit in 2 FP registers, and all work exactly as expected.

Differential Revision: https://reviews.llvm.org/D42811

llvm-svn: 324098
2018-02-02 15:53:35 +00:00
Sander de Smalen 4e9a1264dd Reverting patch rL323952 due to build errors that I
haven't encountered in local builds.

llvm-svn: 323956
2018-02-01 12:27:13 +00:00
Sander de Smalen 17c4633e7f [DebugInfo] Enable debug information for C99 VLA types
Summary:
This patch enables debugging of C99 VLA types by generating more precise
LLVM Debug metadata, using the extended DISubrange 'count' field that
takes a DIVariable.
    
This should implement:
  Bug 30553: Debug info generated for arrays is not what GDB expects (not as good as GCC's)
https://bugs.llvm.org/show_bug.cgi?id=30553

Reviewers: echristo, aprantl, dexonsmith, clayborg, pcc, kristof.beyls, dblaikie

Reviewed By: aprantl

Subscribers: jholewinski, schweitz, davide, fhahn, JDevlieghere, cfe-commits

Differential Revision: https://reviews.llvm.org/D41698

llvm-svn: 323952
2018-02-01 11:25:10 +00:00
Akira Hatanaka fc681efde4 [CodeGen] Fix an assertion failure in CGRecordLowering.
This patch fixes a bug in CGRecordLowering::accumulateBitFields where it
unconditionally starts a new run and emits a storage field when it sees
a zero-sized bitfield, which causes an assertion in insertPadding to
fail when -fno-bitfield-type-align is used.

It shouldn't emit new storage if UseZeroLengthBitfieldAlignment and
UseBitFieldTypeAlignment are both false.

rdar://problem/36762205

llvm-svn: 323943
2018-02-01 03:04:15 +00:00
Alex Lorenz de07acb9a5 [PR32482] Fix bitfield layout for -mms-bitfield and pragma pack
The patch ensures that a new storage unit is created when the new bitfield's
size is wider than the available bits.

rdar://36343145

Differential Revision: https://reviews.llvm.org/D42660

llvm-svn: 323921
2018-01-31 21:59:02 +00:00
Daniel Neilson c8bdc8db73 Change memcpy/memove/memset to have dest and source alignment attributes.
Summary:
  This change is step three in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. Steps:

Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use getDestAlignment()
and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

Reviewers: rjmccall

Subscribers: jyknight, nemanjai, nhaehnle, javed.absar, sbc100, aheejin, kbarton, fedor.sergeev, cfe-commits

Differential Revision: https://reviews.llvm.org/D41677

llvm-svn: 323617
2018-01-28 17:27:45 +00:00
Ivan A. Kosarev 1860b520a2 [CodeGen] Decorate aggregate accesses with TBAA tags
Differential Revision: https://reviews.llvm.org/D41539

llvm-svn: 323421
2018-01-25 14:21:55 +00:00
Peter Collingbourne 9e31f0a389 IRGen: Emit an inline implementation of __builtin_wmemcmp on MSVCRT platforms.
The MSVC runtime library does not provide a definition of wmemcmp,
so we need an inline implementation.

Differential Revision: https://reviews.llvm.org/D42441

llvm-svn: 323362
2018-01-24 18:59:58 +00:00
Dan Gohman 4f637e0ccc [WebAssembly] Add mem.* builtin functions.
This corresponds to r323222 in LLVM. The new names are not yet
finalized, so use them at your own risk.

llvm-svn: 323224
2018-01-23 17:04:04 +00:00
Sjoerd Meijer ca8f4e7451 [ARM] Pass _Float16 as int or float
Pass and return _Float16 as if it were an int or float for ARM, but with the
top 16 bits unspecified, similarly like we already do for __fp16.

We will implement proper half-precision function argument lowering in the ARM
backend soon, but want to use this workaround in the mean time.

Differential Revision: https://reviews.llvm.org/D42318

llvm-svn: 323185
2018-01-23 10:13:49 +00:00
David Blaikie ac904d0e3a NewPM: Improve/fix GCOV - which needs to run early in the pass pipeline.
Using a new extension point in the new PM, register GCOV at the start of
the pipeline rather than the end.

llvm-svn: 323167
2018-01-23 01:25:24 +00:00
Volodymyr Sapsai 17ebdb239f Reland "[CodeGen] Fix crash when a function taking transparent union is redeclared."
When a function taking transparent union is declared as taking one of
union members earlier in the translation unit, clang would hit an
"Invalid cast" assertion during EmitFunctionProlog. This case
corresponds to function f1 in test/CodeGen/transparent-union-redecl.c.
We decided to cast i32 to union because after merging function
declarations function parameter type becomes int,
CGFunctionInfo::ArgInfo type matches with ABIArgInfo type, so we decide
it is a trivial case. But these types should also be castable to
parameter declaration type which is not the case here.

Now the fix is in converting from ABIArgInfo type to VarDecl type and using
argument demotion when necessary.

Additional tests in Sema/transparent-union.c capture current behavior and make
sure there are no regressions.

rdar://problem/34949329

Reviewers: rjmccall, rafael

Reviewed By: rjmccall

Subscribers: aemerson, cfe-commits, kristof.beyls, ahatanak

Differential Revision: https://reviews.llvm.org/D41311

llvm-svn: 323156
2018-01-22 22:29:24 +00:00
Craig Topper 8cdb94901d [X86] Add rdpid command line option and intrinsics.
Summary: This patch adds -mrdpid/-mno-rdpid and the rdpid intrinsic. The corresponding LLVM commit has already been made.

Reviewers: RKSimon, spatel, zvi, AndreiGrischenko

Reviewed By: RKSimon

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D42272

llvm-svn: 323047
2018-01-20 18:36:52 +00:00
Abderrazek Zaafrani ce8746d178 [AArch64] Add ARMv8.2-A FP16 scalar intrinsics
https://reviews.llvm.org/D41792

llvm-svn: 323006
2018-01-19 23:11:18 +00:00
Daniel Neilson 6e938effaa Change memcpy/memove/memset to have dest and source alignment attributes (Step 1).
Summary:
  Upstream LLVM is changing the the prototypes of the @llvm.memcpy/memmove/memset
intrinsics. This change updates the Clang tests for this change.

  The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.

 This change removes the alignment argument in favour of placing the alignment
attribute on the source and destination pointers of the memory intrinsic call.

 For example, code which used to read:
   call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
   call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)

 At this time the source and destination alignments must be the same (Step 1).
Step 2 of the change, to be landed shortly, will relax that contraint and allow
the source and destination to have different alignments.

llvm-svn: 322964
2018-01-19 17:12:54 +00:00
George Burgess IV 1913115204 [CodeGen] Fix a crash on mangling multiversioned functions
`multiVersionSortPriority` expects features to have no prefix. We
currently carry them around in the format "+${feature}".

llvm-svn: 322618
2018-01-17 04:46:04 +00:00
Erich Keane 0a6fde4895 Move target MV resolver to COMDAT
As reported here: https://bugs.llvm.org/show_bug.cgi?id=35921
The resolver functions should be in their own
COMDAT regions. This patch sets that up.

Differential Revision: https://reviews.llvm.org/D42110

llvm-svn: 322592
2018-01-16 19:49:52 +00:00
Alex Bradbury 78b2c686b8 [RISCV] Fix test failures on non-assert builds introduced in r322494
Thanks to Eli Friedman, who suggested the reason these tests failed on a few 
buildbots yet works fine locally is because non-assert builds don't emit value 
labels.

llvm-svn: 322514
2018-01-15 20:45:15 +00:00
Alex Bradbury 8cbdd4892f [RISCV] Implement RISCV ABI lowering
RISCVABIInfo is implemented in terms of XLen, supporting both RV32 and RV64. 
Unfortunately we need to count argument registers in the frontend in order to 
determine when to emit signext and zeroext attributes. Integer scalars are 
extended according to their type up to 32-bits and then sign-extended to XLen 
when passed in registers, but are anyext when passed on the stack. This patch 
only implements the base integer (soft float) ABIs.

For more information on the RISC-V ABI, see [the ABI 
doc](https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md), 
my [golden model](https://github.com/lowRISC/riscv-calling-conv-model), and 
the [LLVM RISC-V calling convention 
patch](https://reviews.llvm.org/D39898#2d1595b4) (specifically the comment 
documenting frontend expectations).

Differential Revision: https://reviews.llvm.org/D40023

llvm-svn: 322494
2018-01-15 17:54:52 +00:00
Craig Topper f517f1a516 [X86] Implement old kunpck intrinsics using vector ops on vXi1 instead of integer shift/and/or
Summary:
kunpck intrinsics were removed in favor of native IR a few months ago. The implementation lowers them as by operation on the integer types passed to the intrinsic and then just shifting, masking, and oring them together. A special X86 DAG combine was added to recognize this patter and turn it into a concat_vector operation.

I think it makes more sense to keep the IR implementation closer to vector operations on vXi1. Given that we expect these builtins to be used around other builtins that operate on k-registers which we try to represent in IR with vXi1. InstCombine should be able to get rid of the bitcasts between integers and vXi1 leaving only the vector operations.

Reviewers: RKSimon, spatel, zvi, jina.nahias

Reviewed By: RKSimon

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D42016

llvm-svn: 322461
2018-01-14 19:23:50 +00:00
Paul Robinson 212f3b91ee [DWARFv5] Have -gdwarf-5 generate MD5 checksums
Differential Revision: https://reviews.llvm.org/D42011

llvm-svn: 322413
2018-01-12 22:19:03 +00:00
David Blaikie 7a4f7f56e5 Wire up GCOV to the new pass manager
GCOV in the old pass manager also strips debug info (if debug info is
disabled/only produced for profiling anyway) after the GCOV pass runs.

I think the strip pass hasn't been ported to the new pass manager, so it
might take me a little while to wire that up.

llvm-svn: 322126
2018-01-09 22:03:47 +00:00
Alexander Kornienko bb3c2432fa Explicitly specify output file.
Otherwise the test fails when LLVM sources are on a read-only partition.

llvm-svn: 322082
2018-01-09 15:05:13 +00:00
Oren Ben Simhon 57cc1a5d77 Added Control Flow Protection Flag
Cf-protection is a target independent flag that instructs the back-end to instrument control flow mechanisms like: Branch, Return, etc.
For example in X86 this flag will be used to instrument Indirect Branch Tracking instructions.

Differential Revision: https://reviews.llvm.org/D40478

Change-Id: I5126e766c0e6b84118cae0ee8a20fe78cc373dea
llvm-svn: 322063
2018-01-09 08:53:59 +00:00
Craig Topper de91dff5d4 [X86] Replace cvt*2mask intrinsics with native IR using 'icmp slt X, zeroinitializer.
llvm-svn: 322038
2018-01-08 22:37:56 +00:00
Erich Keane 281d20b601 Implement Attribute Target MultiVersioning
GCC's attribute 'target', in addition to being an optimization hint,
also allows function multiversioning. We currently have the former
implemented, this is the latter's implementation.

This works by enabling functions with the same name/signature to coexist,
so that they can all be emitted. Multiversion state is stored in the
FunctionDecl itself, and SemaDecl manages the definitions.
Note that it ends up having to permit redefinition of functions so
that they can all be emitted. Additionally, all versions of the function
must be emitted, so this also manages that.

Note that this includes some additional rules that GCC does not, since
defining something as a MultiVersion function after a usage has been made illegal.

The only 'history rewriting' that happens is if a function is emitted before
it has been converted to a multiversion'ed function, at which point its name
needs to be changed.

Function templates and virtual functions are NOT yet supported (not supported
in GCC either).

Additionally, constructors/destructors are disallowed, but the former is 
planned.

llvm-svn: 322028
2018-01-08 21:34:17 +00:00
Sean Eveson 9e867b68fb Fix test added in r321992 failing on some buildbots (again), test requires x86.
llvm-svn: 322000
2018-01-08 15:46:18 +00:00
Ivan A. Kosarev ed4f330174 [CodeGen] Fix TBAA info for accesses to members of base classes
Resolves:
Bug 35724 - regression (r315984): fatal error: error in backend:
Broken function found (Did not see access type in access path!)
https://bugs.llvm.org/show_bug.cgi?id=35724

Differential Revision: https://reviews.llvm.org/D41547

llvm-svn: 321999
2018-01-08 15:36:06 +00:00
Sean Eveson 31db713615 Fix test added in r321992 failing on some buildbots.
llvm-svn: 321995
2018-01-08 14:43:28 +00:00
Sean Eveson 5110d4f5c0 [Driver] Add flag enabling the function stack size section that was added in r319430
Adds the -fstack-size-section flag to enable the .stack_sizes section. The flag defaults to on for the PS4 triple.

Differential Revision: https://reviews.llvm.org/D40712

llvm-svn: 321992
2018-01-08 13:42:26 +00:00
Benjamin Kramer dfecbe9ad8 Add support for a limited subset of TS 18661-3 math builtins.
These just overloads for _Float128. They're supported by GCC 7 and used
by glibc. APFloat support is already there so just add the overloads.

__builtin_copysignf128
__builtin_fabsf128
__builtin_huge_valf128
__builtin_inff128
__builtin_nanf128
__builtin_nansf128

This is the same support that GCC has, according to the documentation,
but limited to _Float128.

llvm-svn: 321948
2018-01-06 21:49:54 +00:00
Vedant Kumar bbafd50756 [CGBuiltin] Handle unsigned mul overflow properly (PR35750)
r320902 fixed the IRGen for some types of checked multiplications. It
did not handle unsigned overflow correctly in the case where the signed
operand is negative (PR35750).

Eli pointed out that on overflow, the result must be equal to the unique
value that is equivalent to the mathematically-correct result modulo two
raised to the k power, where k is the number of bits in the result type.

This patch fixes the specialized IRGen from r320902 accordingly.

Testing: Apart from check-clang, I modified the test harness from
r320902 to validate the results of all multiplications -- not just the
ones which don't overflow:

  https://gist.github.com/vedantk/3eb9c88f82e5c32f2e590555b4af5081

llvm.org/PR35750, rdar://34963321

Differential Revision: https://reviews.llvm.org/D41717

llvm-svn: 321771
2018-01-03 23:11:32 +00:00
Filipe Cabecinhas 6f83fa9934 Revert "ASan+operator new[]: Fix operator new[] cookie poisoning"
This reverts r321645.

I missed a compiler-rt test that needs updating.

llvm-svn: 321647
2018-01-02 13:46:12 +00:00
Filipe Cabecinhas 016860cf2f ASan+operator new[]: Fix operator new[] cookie poisoning
Summary:
The C++ Itanium ABI says:
No cookie is required if the new operator being used is ::operator new[](size_t, void*).

We should only avoid poisoning the cookie if we're calling this
operator, not others. This is dealt with before the call to
InitializeArrayCookie.

Reviewers: rjmccall, kcc, rsmith

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D41301

llvm-svn: 321645
2018-01-02 13:21:50 +00:00
Coby Tayree a09663a5c1 [x86][icelake][vbmi2]
added vbmi2 feature recognition
added intrinsics support for vbmi2 instructions
_mm[128,256,512]_mask[z]_compress_epi[16,32]
_mm[128,256,512]_mask_compressstoreu_epi[16,32]
_mm[128,256,512]_mask[z]_expand_epi[16,32]
_mm[128,256,512]_mask[z]_expandloadu_epi[16,32]
_mm[128,256,512]_mask[z]_sh[l,r]di_epi[16,32,64]
_mm[128,256,512]_mask_sh[l,r]dv_epi[16,32,64]
matching a similar work on the backend (D40206)
Differential Revision: https://reviews.llvm.org/D41557

llvm-svn: 321487
2017-12-27 11:25:07 +00:00
Coby Tayree 3d9c88cfec [x86][icelake][vnni]
added vnni feature recognition
added intrinsics support for VNNI instructions
_mm256_mask_dpbusd_epi32
_mm256_maskz_dpbusd_epi32
_mm256_dpbusd_epi32
_mm256_mask_dpbusds_epi32
_mm256_maskz_dpbusds_epi32
_mm256_dpbusds_epi32
_mm256_mask_dpwssd_epi32
_mm256_maskz_dpwssd_epi32
_mm256_dpwssd_epi32
_mm256_mask_dpwssds_epi32
_mm256_maskz_dpwssds_epi32
_mm256_dpwssds_epi32
_mm128_mask_dpbusd_epi32
_mm128_maskz_dpbusd_epi32
_mm128_dpbusd_epi32
_mm128_mask_dpbusds_epi32
_mm128_maskz_dpbusds_epi32
_mm128_dpbusds_epi32
_mm128_mask_dpwssd_epi32
_mm128_maskz_dpwssd_epi32
_mm128_dpwssd_epi32
_mm128_mask_dpwssds_epi32
_mm128_maskz_dpwssds_epi32
_mm128_dpwssds_epi32
_mm512_mask_dpbusd_epi32
_mm512_maskz_dpbusd_epi32
_mm512_dpbusd_epi32
_mm512_mask_dpbusds_epi32
_mm512_maskz_dpbusds_epi32
_mm512_dpbusds_epi32
_mm512_mask_dpwssd_epi32
_mm512_maskz_dpwssd_epi32
_mm512_dpwssd_epi32
_mm512_mask_dpwssds_epi32
_mm512_maskz_dpwssds_epi32
_mm512_dpwssds_epi32
matching a similar work on the backend (D40208)
Differential Revision: https://reviews.llvm.org/D41558

llvm-svn: 321484
2017-12-27 10:37:51 +00:00
Coby Tayree 2268576fa0 [x86][icelake][bitalg]
added bitalg feature recognition
added intrinsics support for bitalg instructions
_mm512_popcnt_epi16
_mm512_mask_popcnt_epi16
_mm512_maskz_popcnt_epi16
_mm512_popcnt_epi8
_mm512_mask_popcnt_epi8
_mm512_maskz_popcnt_epi8
_mm512_mask_bitshuffle_epi64_mask
_mm512_bitshuffle_epi64_mask
_mm256_popcnt_epi16
_mm256_mask_popcnt_epi16
_mm256_maskz_popcnt_epi16
_mm128_popcnt_epi16
_mm128_mask_popcnt_epi16
_mm128_maskz_popcnt_epi16
_mm256_popcnt_epi8
_mm256_mask_popcnt_epi8
_mm256_maskz_popcnt_epi8
_mm128_popcnt_epi8
_mm128_mask_popcnt_epi8
_mm128_maskz_popcnt_epi8
_mm256_mask_bitshuffle_epi32_mask
_mm256_bitshuffle_epi32_mask
_mm128_mask_bitshuffle_epi16_mask
_mm128_bitshuffle_epi16_mask
matching a similar work on the backend (D40222)
Differential Revision: https://reviews.llvm.org/D41564

llvm-svn: 321483
2017-12-27 10:01:00 +00:00
Coby Tayree cf96c876c6 [x86][icelake][vpclmulqdq]
added vpclmulqdq feature recognition
added intrinsics support for vpclmulqdq instructions
  _mm256_clmulepi64_epi128
  _mm512_clmulepi64_epi128
matching a similar work on the backend (D40101)
Differential Revision: https://reviews.llvm.org/D41573

llvm-svn: 321480
2017-12-27 09:00:31 +00:00
Coby Tayree f4811ebc39 [x86][icelake][gfni]
added gfni feature recognition
added intrinsics support for gfni instructions
  _mm_gf2p8affineinv_epi64_epi8
  _mm_mask_gf2p8affineinv_epi64_epi8
  _mm_maskz_gf2p8affineinv_epi64_epi8
  _mm256_gf2p8affineinv_epi64_epi8
  _mm256_mask_gf2p8affineinv_epi64_epi8
  _mm256_maskz_gf2p8affineinv_epi64_epi8
  _mm512_gf2p8affineinv_epi64_epi8
  _mm512_mask_gf2p8affineinv_epi64_epi8
  _mm512_maskz_gf2p8affineinv_epi64_epi8
  _mm_gf2p8affine_epi64_epi8
  _mm_mask_gf2p8affine_epi64_epi8
  _mm_maskz_gf2p8affine_epi64_epi8
  _mm256_gf2p8affine_epi64_epi8
  _mm256_mask_gf2p8affine_epi64_epi8
  _mm256_maskz_gf2p8affine_epi64_epi8
  _mm512_gf2p8affine_epi64_epi8
  _mm512_mask_gf2p8affine_epi64_epi8
  _mm512_maskz_gf2p8affine_epi64_epi8
  _mm_gf2p8mul_epi8
  _mm_mask_gf2p8mul_epi8
  _mm_maskz_gf2p8mul_epi8
  _mm256_gf2p8mul_epi8
  _mm256_mask_gf2p8mul_epi8
  _mm256_maskz_gf2p8mul_epi8
  _mm512_gf2p8mul_epi8
  _mm512_mask_gf2p8mul_epi8
  _mm512_maskz_gf2p8mul_epi8
matching a similar work on the backend (D40373)
Differential Revision: https://reviews.llvm.org/D41582

llvm-svn: 321477
2017-12-27 08:37:47 +00:00
Coby Tayree a1e5f0c339 [x86][icelake][vaes]
added vaes feature recognition
added intrinsics support for vaes instructions, matching a similar work on the backend (D40078)
  _mm256_aesenc_epi128
  _mm512_aesenc_epi128
  _mm256_aesenclast_epi128
  _mm512_aesenclast_epi128
  _mm256_aesdec_epi128
  _mm512_aesdec_epi128
  _mm256_aesdeclast_epi128
  _mm512_aesdeclast_epi128

llvm-svn: 321474
2017-12-27 08:16:54 +00:00
Ivan A. Kosarev 57493e2919 [CodeGen] Represent array members in new-format TBAA type descriptors
Now that in the new TBAA format we allow access types to be of
any object types, including aggregate ones, it becomes critical
to specify types of all sub-objects such aggregates comprise as
their members. In order to meet this requirement, this patch
enables generation of field descriptors for members of array
types.

Differential Revision: https://reviews.llvm.org/D41399

llvm-svn: 321352
2017-12-22 09:57:24 +00:00
Ivan A. Kosarev d50b847ac8 [CodeGen] Support generation of TBAA info in the new format
Now that the MDBuilder helpers generating TBAA type and access
descriptors in the new format are in place, we can teach clang to
use them when requested.

Differential Revision: https://reviews.llvm.org/D41394

llvm-svn: 321351
2017-12-22 09:54:23 +00:00
Volodymyr Sapsai 22b00ec42e Revert "[CodeGen] Fix crash when a function taking transparent union is redeclared."
This reverts commit r321296. It caused performance regressions
FAIL: imp.execution_time
FAIL: 2007-01-04-KNR-Args.execution_time
FAIL: sse_expandfft.execution_time
FAIL: sse_stepfft.execution_time

llvm-svn: 321306
2017-12-21 20:52:59 +00:00
Abderrazek Zaafrani abb890b7be [AArch64] Enable fp16 data type for the Builtin for AArch64 only.
Differential Revision: https:://reviews.llvm.org/D41360

llvm-svn: 321301
2017-12-21 20:10:03 +00:00
Volodymyr Sapsai 614f3702d9 [CodeGen] Fix crash when a function taking transparent union is redeclared.
When a function taking transparent union is declared as taking one of
union members earlier in the translation unit, clang would hit an
"Invalid cast" assertion during EmitFunctionProlog. This case
corresponds to function f1 in test/CodeGen/transparent-union-redecl.c.
We decided to cast i32 to union because after merging function
declarations function parameter type becomes int,
CGFunctionInfo::ArgInfo type matches with ABIArgInfo type, so we decide
it is a trivial case. But these types should also be castable to
parameter declaration type which is not the case here.

The fix is in checking for the trivial case if ABIArgInfo type matches with
parameter declaration type. It exposed inconsistency that we check
hasScalarEvaluationKind for different types in EmitParmDecl and
EmitFunctionProlog, and comment says they should match.

Additional tests in Sema/transparent-union.c capture current behavior and make
sure there are no regressions.

rdar://problem/34949329

Reviewers: rjmccall, rafael

Reviewed By: rjmccall

Subscribers: aemerson, cfe-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D41311

llvm-svn: 321296
2017-12-21 19:42:37 +00:00
Abderrazek Zaafrani f58a132eef [AARch64] Add ARMv8.2-A FP16 vector intrinsics
Putting back the code that was reverted few weeks ago.

Differential Revision: https://reviews.llvm.org/D34161

llvm-svn: 321294
2017-12-21 19:20:01 +00:00
Vedant Kumar 09b5bfdd85 [ubsan] Diagnose noreturn functions which return
Diagnose 'unreachable' UB when a noreturn function returns.

  1. Insert a check at the end of functions marked noreturn.

  2. A decl may be marked noreturn in the caller TU, but not marked in
     the TU where it's defined. To diagnose this scenario, strip away the
     noreturn attribute on the callee and insert check after calls to it.

Testing: check-clang, check-ubsan, check-ubsan-minimal, D40700

rdar://33660464

Differential Revision: https://reviews.llvm.org/D40698

llvm-svn: 321231
2017-12-21 00:10:25 +00:00
Florian Hahn b1c9dbdd7d [Complex] Don't use __div?c3 when building with fast-math.
Summary: Plant an inline version of "((ac+bd)/(cc+dd)) + i((bc-ad)/(cc+dd))" instead.

Patch by Paul Walker.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D40299

llvm-svn: 321183
2017-12-20 15:50:52 +00:00
Martin Bohme 06997a767e [X86] Use {{.*}} instead of hardcoded %1 in knot test.
This makes the test more resilient and consistent with the other tests
introduced in r320919.

llvm-svn: 320971
2017-12-18 11:29:21 +00:00
Sanjay Patel cb8c009801 [Driver, CodeGen] pass through and apply -fassociative-math
There are 2 parts to getting the -fassociative-math command-line flag translated to LLVM FMF:

1. In the driver/frontend, we accept the flag and its 'no' inverse and deal with the 
   interactions with other flags like -ffast-math -fno-signed-zeros -fno-trapping-math. 
   This was mostly already done - we just need to translate the flag as a codegen option. 
   The test file is complicated because there are many potential combinations of flags here.
   Note that we are matching gcc's behavior that requires 'nsz' and no-trapping-math.

2. In codegen, we map the codegen option to FMF in the IR builder. This is simple code and 
   corresponding test.

For the motivating example from PR27372:

float foo(float a, float x) { return ((a + x) - x); }

$ ./clang -O2 27372.c -S -o - -ffast-math  -fno-associative-math -emit-llvm  | egrep 'fadd|fsub'
  %add = fadd nnan ninf nsz arcp contract float %0, %1
  %sub = fsub nnan ninf nsz arcp contract float %add, %2

So 'reassoc' is off as expected (and so is the new 'afn' but that's a different patch). 
This case now works as expected end-to-end although the underlying logic is still wrong:

$ ./clang  -O2 27372.c -S -o - -ffast-math  -fno-associative-math | grep xmm
	addss	%xmm1, %xmm0
	subss	%xmm1, %xmm0

We're not done because the case where 'reassoc' is set is ignored by optimizer passes. Example:

$ ./clang  -O2 27372.c -S -o - -fassociative-math -fno-signed-zeros -fno-trapping-math -emit-llvm  | grep fadd
  %add = fadd reassoc float %0, %1

$ ./clang -O2  27372.c -S -o - -fassociative-math -fno-signed-zeros -fno-trapping-math | grep xmm
	addss	%xmm1, %xmm0
	subss	%xmm1, %xmm0

Differential Revision: https://reviews.llvm.org/D39812

llvm-svn: 320920
2017-12-16 16:11:17 +00:00
Craig Topper 5028ace602 [X86] Implement kand/kandn/kor/kxor/kxnor/knot intrinsics using native IR.
llvm-svn: 320919
2017-12-16 08:26:22 +00:00