llvm-project

Commit Graph

Author	SHA1	Message	Date
JF Bastien	4734fa1428	Improve incompatible triple error When complaining that the triple is incompatible with all targets, print out the triple not just a generic error about triples not matching. llvm-svn: 340510	2018-08-23 03:55:24 +00:00
Elizabeth Andrews	6593df241a	Currently clang does not emit unused static constants. GCC emits these constants by default when there is no optimization. GCC's option -fno-keep-static-consts can be used to not emit unused static constants. In Clang, since default behavior does not keep unused static constants, -fkeep-static-consts can be used to emit these if required. This could be useful for producing identification strings like SVN identifiers inside the object file even though the string isn't used by the program. Differential Revision: https://reviews.llvm.org/D40925 llvm-svn: 340439	2018-08-22 19:05:19 +00:00
Nico Weber	14a577bfd1	Eliminate instances of `EmitScalarExpr(E->getArg(n))` in EmitX86BuiltinExpr(). EmitX86BuiltinExpr() emits all args into Ops at the beginning, so don't do that work again. This changes behavior: If e.g. ++a was passed as an arg, we incremented a twice previously. This change fixes that bug. https://reviews.llvm.org/D50979 llvm-svn: 340348	2018-08-21 22:19:55 +00:00
Martin Storsjo	d39d53b0d1	[CodeGen] Implicitly set stackrealign on the main function, if custom stack alignment is used If using a custom stack alignment, one is expected to make sure that all callers provide such alignment, or realign the stack in all entry points (and callbacks). Despite this, the compiler can assume that the main function will need realignment in these cases, since the startup routines calling the main function most probably won't provide the custom alignment. This matches what GCC does in similar cases; if compiling with -mincoming-stack-boundary=X -mpreferred-stack-boundary=X, GCC normally assumes such alignment on entry to a function, but specifically for the main function still does realignment. Differential Revision: https://reviews.llvm.org/D51026 llvm-svn: 340334	2018-08-21 20:41:17 +00:00
Heejin Ahn	f0fe359bc3	[WebAssembly] Revert type of wake count in atomic.wake to i32 Summary: We decided to revert this from i64 to i32 in Nov 28 CG meeting. Fixes PR38632. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, jfb, cfe-commits Differential Revision: https://reviews.llvm.org/D51013 llvm-svn: 340235	2018-08-20 23:49:34 +00:00
David Blaikie	658645241b	DebugInfo: Add the ability to disable DWARF name tables entirely This changes the current default behavior (from emitting pubnames by default, to not emitting them by default) & moves to matching GCC's behavior* with one significant difference: -gno(-gnu)-pubnames disables pubnames even in the presence of -gsplit-dwarf (though -gsplit-dwarf still by default enables -ggnu-pubnames). This allows users to disable pubnames (& the new DWARF5 accelerated access tables) when they might not be worth the size overhead. * GCC's behavior is that -ggnu-pubnames and -gpubnames override each other, and that -gno-gnu-pubnames and -gno-pubnames act as synonyms and disable either kind of pubnames if they come last. (eg: -gpubnames -gno-gnu-pubnames causes no pubnames (neither gnu or standard) to be emitted) llvm-svn: 340206	2018-08-20 20:14:08 +00:00
Sanjay Patel	b197667eba	[CodeGen] add test file that should have been included with r340141 llvm-svn: 340142	2018-08-19 17:32:56 +00:00
Ivan A. Kosarev	1b7851212b	[NEON] Define fp16 vld and vst intrinsics conditionally This patch fixes definitions of vld and vst NEON intrinsics so that we only define them if half-precision arithmetic is supported on the target platform, as prescribed in ACLE 2.0. Differential Revision: https://reviews.llvm.org/D49075 llvm-svn: 340140	2018-08-19 16:30:57 +00:00
Sanjay Patel	a09ae4b8a6	revert r340137: [CodeGen] add rotate builtins At least a couple of bots (gcc host compiler on PPC only?) are showing the compiler dying while trying to compile. llvm-svn: 340138	2018-08-19 15:31:42 +00:00
Sanjay Patel	446529b0d9	[CodeGen] add/fix rotate builtins that map to LLVM funnel shift (retry) This is a retry of rL340135 (reverted at rL340136 because of gcc host compiler crashing) with 2 changes: 1. Move the code into a helper to reduce code duplication (and hopefully work-around the crash). 2. The original commit had a formatting bug in the docs (missing an underscore). Original commit message: This exposes the LLVM funnel shift intrinsics as more familiar bit rotation functions in clang (when both halves of a funnel shift are the same value, it's a rotate). We're free to name these as we want because we're not copying gcc, but if there's some other existing art (eg, the microsoft ops that are modified in this patch) that we want to replicate, we can change the names. The funnel shift intrinsics were added here: https://reviews.llvm.org/D49242 With improved codegen in: https://reviews.llvm.org/rL337966 https://reviews.llvm.org/rL339359 And basic IR optimization added in: https://reviews.llvm.org/rL338218 https://reviews.llvm.org/rL340022 ...so these are expected to produce asm output that's equal or better to the multi-instruction alternatives using primitive C/IR ops. In the motivating loop example from PR37387: https://bugs.llvm.org/show_bug.cgi?id=37387#c7 ...we get the expected 'rolq' x86 instructions if we substitute the rotate builtin into the source. Differential Revision: https://reviews.llvm.org/D50924 llvm-svn: 340137	2018-08-19 14:44:47 +00:00
Sanjay Patel	39b4dd2da7	revert r340135: [CodeGen] add rotate builtins At least a couple of bots (PPC only?) are showing the compiler dying while trying to compile: http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/11065/steps/build%20stage%201/logs/stdio http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/18267/steps/build%20stage%201/logs/stdio llvm-svn: 340136	2018-08-19 13:48:06 +00:00
Sanjay Patel	9116f0438c	[CodeGen] add rotate builtins This exposes the LLVM funnel shift intrinsics as more familiar bit rotation functions in clang (when both halves of a funnel shift are the same value, it's a rotate). We're free to name these as we want because we're not copying gcc, but if there's some other existing art (eg, the microsoft ops that are modified in this patch) that we want to replicate, we can change the names. The funnel shift intrinsics were added here: D49242 With improved codegen in: rL337966 rL339359 And basic IR optimization added in: rL338218 rL340022 ...so these are expected to produce asm output that's equal or better to the multi-instruction alternatives using primitive C/IR ops. In the motivating loop example from PR37387: https://bugs.llvm.org/show_bug.cgi?id=37387#c7 ...we get the expected 'rolq' x86 instructions if we substitute the rotate builtin into the source. Differential Revision: https://reviews.llvm.org/D50924 llvm-svn: 340135	2018-08-19 13:12:40 +00:00
Nico Weber	b2c53d3393	Make __shiftleft128 / __shiftright128 real compiler built-ins. r337619 added __shiftleft128 / __shiftright128 as functions in intrin.h. Microsoft's STL plans on using these functions, and they're using intrin0.h which just has declarations of built-ins to not pull in the huge intrin.h header in the standard library headers. That requires that these functions are real built-ins. https://reviews.llvm.org/D50907 llvm-svn: 340048	2018-08-17 17:19:06 +00:00
Luke Cheeseman	0ac44c18b7	[AArch64] - return address signing - Add a command line options -msign-return-address to enable return address signing - Armv8.3a added instructions to sign the return address to help mitigate against ROP attacks - This patch adds command line options to generate function attributes that signal to the back whether return address signing instructions should be added Differential revision: https://reviews.llvm.org/D49793 llvm-svn: 340019	2018-08-17 12:55:05 +00:00
Roman Lebedev	9cb37a2a15	[NFC] Some small test updates for Implicit Conversion sanitizer. Split off from D50250. llvm-svn: 339995	2018-08-17 07:33:25 +00:00
David Blaikie	9982bb8739	Disable pubnames in NVPTX debug info using metadata llvm-svn: 339968	2018-08-16 23:56:32 +00:00
Vedant Kumar	61cdea81e7	Relax a CHECK line to allow for dso_local Fixes a bot failure: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/11806 llvm-svn: 339964	2018-08-16 23:19:50 +00:00
Vedant Kumar	ee6c233ae0	[InstrProf] Use atomic profile counter updates for TSan Thread sanitizer instrumentation fails to skip all loads and stores to profile counters. This can happen if profile counter updates are merged: %.sink = phi i64* ... %pgocount5 = load i64, i64* %.sink %27 = add i64 %pgocount5, 1 %28 = bitcast i64* %.sink to i8* call void @__tsan_write8(i8* %28) store i64 %27, i64* %.sink To suppress TSan diagnostics about racy counter updates, make the counter updates atomic when TSan is enabled. If there's general interest in this mode it can be surfaced as a clang/swift driver option. Testing: check-{llvm,clang,profile} rdar://40477803 Differential Revision: https://reviews.llvm.org/D50867 llvm-svn: 339955	2018-08-16 22:24:47 +00:00
Craig Topper	0609d1e211	[X86] Remove masking from the 512-bit padds and psubs builtins. Use select builtin instead. llvm-svn: 339843	2018-08-16 06:20:29 +00:00
Momchil Velikov	9e5e045b60	Use .cpp extension for certain tests instead of .cc The tests `CodeGen/aapcs[64]-align.cc` are not run since files with a `.cc` suffix aren't recognisze as tests. This patch renames the above two files to `.cpp`. Differential Revision: https://reviews.llvm.org/D46013 Comitting as obvious. llvm-svn: 339766	2018-08-15 12:22:08 +00:00
Craig Topper	2a87314e75	[InlineAsm] Update the min-legal-vector-width function attribute based on inputs and outputs to inline assembly Summary: Another piece of my ongoing to work for prefer-vector-width. min-legal-vector-width will eventually be used by the X86 backend to know whether it needs to make 512 bits type legal when prefer-vector-width=256. If the user used inline assembly that passed in/out a 512-bit register, we need to make sure 512 bits are considered legal. Otherwise we'll get an assert failure when we try to wire up the inline assembly to the rest of the code. This patch just checks the LLVM IR types to see if they are vectors and then updates the attribute based on their total width. I'm not sure if this is the best way to do this or if there's any subtlety I might have missed. So if anyone has other opinions on how to do this I'm open to suggestions. Reviewers: chandlerc, rsmith, rnk Reviewed By: rnk Subscribers: eraman, cfe-commits Differential Revision: https://reviews.llvm.org/D50678 llvm-svn: 339721	2018-08-14 20:21:05 +00:00
Tomasz Krupa	e8cf972d86	[X86] Lowering addus/subus intrinsics to native IR Summary: This is the patch that lowers x86 intrinsics to native IR in order to enable optimizations. Reviewers: craig.topper, spatel, RKSimon Reviewed By: craig.topper Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D46892 llvm-svn: 339651	2018-08-14 08:01:38 +00:00
Akira Hatanaka	936240c77a	[CodeGen] Before returning a copy/dispose helper function, bitcast it to a void pointer type. This fixes a bug introduced in r339438. llvm-svn: 339633	2018-08-14 00:15:42 +00:00
Akira Hatanaka	9978da3615	[CodeGen] Merge equivalent block copy/helper functions. Clang generates copy and dispose helper functions for each block literal on the stack. Often these functions are equivalent for different blocks. This commit makes changes to merge equivalent copy and dispose helper functions and reduce code size. To enable merging equivalent copy/dispose functions, the captured object infomation is encoded into the helper function name. This allows IRGen to check whether an equivalent helper function has already been emitted and reuse the function instead of generating a new helper function whenever a block is defined. In addition, the helper functions are marked as linkonce_odr to enable merging helper functions that have the same name across translation units and marked as unnamed_addr to enable the linker's deduplication pass to merge functions that have different names but the same content. rdar://problem/42640608 Differential Revision: https://reviews.llvm.org/D50152 llvm-svn: 339438	2018-08-10 15:09:24 +00:00
Hans Wennborg	a912e3e6be	clang-cl: Support /guard:cf,nochecks This extension emits the guard cf table without inserting the instrumentation. Currently that's what clang-cl does with /guard:cf anyway, but this allows a user to request that explicitly. Differential Revision: https://reviews.llvm.org/D50513 llvm-svn: 339420	2018-08-10 09:49:21 +00:00
David Chisnall	c5a458cc53	Correctly initialise global blocks on Windows. Summary: Windows does not allow globals to be initialised to point to globals in another DLL. Exported globals may be referenced only from code. Work around this by creating an initialiser that runs in early library initialisation and sets the isa pointer. Reviewers: rjmccall Reviewed By: rjmccall Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D50436 llvm-svn: 339317	2018-08-09 08:02:42 +00:00
Petr Hosek	eb46c95c3e	[CMake] Use normalized Windows target triples Changes the default Windows target triple returned by GetHostTriple.cmake from the old environment names (which we wanted to move away from) to newer, normalized ones. This also requires updating all tests to use the new systems names in constraints. Differential Revision: https://reviews.llvm.org/D47381 llvm-svn: 339307	2018-08-09 02:16:18 +00:00
Petr Hosek	7b27454477	[ADT] Normalize empty triple components LLVM triple normalization is handling "unknown" and empty components differently; for example given "x86_64-unknown-linux-gnu" and "x86_64-linux-gnu" which should be equivalent, triple normalization returns "x86_64-unknown-linux-gnu" and "x86_64--linux-gnu". autoconf's config.sub returns "x86_64-unknown-linux-gnu" for both "x86_64-linux-gnu" and "x86_64-unknown-linux-gnu". This changes the triple normalization to behave the same way, replacing empty triple components with "unknown". This addresses PR37129. Differential Revision: https://reviews.llvm.org/D50219 llvm-svn: 339294	2018-08-08 22:23:57 +00:00
Craig Topper	0a4f6be443	[Builtins] Implement __builtin_clrsb to be compatible with gcc gcc defines an intrinsic called __builtin_clrsb which counts the number of extra sign bits on a number. This is equivalent to counting the number of leading zeros on a positive number or the number of leading ones on a negative number and subtracting one from the result. Since we can't count leading ones we need to invert negative numbers to count zeros. This patch will cause the builtin to be expanded inline while gcc uses a call to a function like clrsbdi2 that is implemented in libgcc. But this is similar to what we already do for popcnt. And I don't think compiler-rt supports clrsbdi2. Differential Revision: https://reviews.llvm.org/D50168 llvm-svn: 339282	2018-08-08 19:55:52 +00:00
Hsiangkai Wang	ea1b0e0960	Revert "[DebugInfo] Generate debug information for labels. (Fix PR37395)" Build failed in http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/27258 In lib/CodeGen/LiveDebugVariables.cpp:589, it uses std::prev(MBBI) to get DebugValue's SlotIndex. however, the previous instruction may be also a debug instruction. llvm-svn: 338992	2018-08-06 07:07:18 +00:00
Hsiangkai Wang	3bec3abf38	[DebugInfo] Generate debug information for labels. (Fix PR37395) Generate DILabel metadata and call llvm.dbg.label after label statement to associate the metadata with the label. After fixing PR37395. Differential Revision: https://reviews.llvm.org/D45045 llvm-svn: 338989	2018-08-06 05:58:59 +00:00
David Bolvansky	eb3aad9935	Fix tests for changed opt remarks format Summary: Optimization remark format is slightly changed by LLVM patch D49412. Two tests are fixed with expected messages changed. Frankly speaking I have not tested this change yet. I will test when manage to setup the project. Reviewers: xbolva00 Reviewed By: xbolva00 Subscribers: mehdi_amini, eraman, steven_wu, dexonsmith Differential Revision: https://reviews.llvm.org/D50241 llvm-svn: 338971	2018-08-05 14:53:34 +00:00
Richard Smith	06f71b5bd8	[constexpr] Support for constant evaluation of __builtin_memcpy and __builtin_memmove (in non-type-punning cases). This is intended to permit libc++ to make std::copy etc constexpr without sacrificing the optimization that uses memcpy on trivially-copyable types. __builtin_strcpy and __builtin_wcscpy are not handled by this change. They'd be straightforward to add, but we haven't encountered a need for them just yet. This reinstates r338455, reverted in r338602, with a fix to avoid trying to constant-evaluate a memcpy call if either pointer operand has an invalid designator. llvm-svn: 338941	2018-08-04 00:57:17 +00:00
Graham Yiu	668e894ccb	Fix asm label testcase flaw - Testcase attempts to (not) grep 'g0' in output to ensure asm symbol is properly renamed, but g0 is too generic and can be part of the module's path in LLVM IR output. - Changed to grep '@g0', which is what the proper global symbol name would be if not using asm. llvm-svn: 338895	2018-08-03 14:36:44 +00:00
Heejin Ahn	00aa81b4df	[WebAssembly] Support for atomic.wait / atomic.wake builtins Summary: Add support for atomic.wait / atomic.wake builtins based on the Wasm thread proposal. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits Differential Revision: https://reviews.llvm.org/D49396 llvm-svn: 338771	2018-08-02 21:44:40 +00:00
Hans Wennborg	6bd4f924e7	Revert r338455 "[constexpr] Support for constant evaluation of __builtin_memcpy and __builtin_memmove (in non-type-punning cases)." It caused asserts during Chromium builds, see reply on the cfe-commits thread. > This is intended to permit libc++ to make std::copy etc constexpr > without sacrificing the optimization that uses memcpy on > trivially-copyable types. > > __builtin_strcpy and __builtin_wcscpy are not handled by this change. > They'd be straightforward to add, but we haven't encountered a need for > them just yet. llvm-svn: 338602	2018-08-01 17:51:23 +00:00
Richard Smith	96beffba15	[constexpr] Support for constant evaluation of __builtin_memcpy and __builtin_memmove (in non-type-punning cases). This is intended to permit libc++ to make std::copy etc constexpr without sacrificing the optimization that uses memcpy on trivially-copyable types. __builtin_strcpy and __builtin_wcscpy are not handled by this change. They'd be straightforward to add, but we haven't encountered a need for them just yet. llvm-svn: 338455	2018-07-31 23:35:09 +00:00
Mandeep Singh Grang	1f6fb8d063	[COFF, ARM64] Enable SEH for ARM64 Windows Reviewers: rnk, mstorsjo, ssijaric, haripul, TomTan Reviewed By: rnk Subscribers: kristof.beyls, chrib, cfe-commits Differential Revision: https://reviews.llvm.org/D50029 llvm-svn: 338405	2018-07-31 17:42:05 +00:00
Roman Lebedev	b69ba22773	[clang][ubsan] Implicit Conversion Sanitizer - integer truncation - clang part Summary: C and C++ are interesting languages. They are statically typed, but weakly. The implicit conversions are allowed. This is nice, allows to write code while balancing between getting drowned in everything being convertible, and nothing being convertible. As usual, this comes with a price: ``` unsigned char store = 0; bool consume(unsigned int val); void test(unsigned long val) { if (consume(val)) { // the 'val' is `unsigned long`, but `consume()` takes `unsigned int`. // If their bit widths are different on this platform, the implicit // truncation happens. And if that `unsigned long` had a value bigger // than UINT_MAX, then you may or may not have a bug. // Similarly, integer addition happens on `int`s, so `store` will // be promoted to an `int`, the sum calculated (0+768=768), // and the result demoted to `unsigned char`, and stored to `store`. // In this case, the `store` will still be 0. Again, not always intended. store = store + 768; // before addition, 'store' was promoted to int. } // But yes, sometimes this is intentional. // You can either make the conversion explicit (void)consume((unsigned int)val); // or mask the value so no bits will be implicitly lost. (void)consume((~((unsigned int)0)) & val); } ``` Yes, there is a `-Wconversion`` diagnostic group, but first, it is kinda noisy, since it warns on everything (unlike sanitizers, warning on an actual issues), and second, there are cases where it does not warn. So a Sanitizer is needed. I don't have any motivational numbers, but i know i had this kind of problem 10-20 times, and it was never easy to track down. The logic to detect whether an truncation has happened is pretty simple if you think about it - https://godbolt.org/g/NEzXbb - basically, just extend (using the new, not original!, signedness) the 'truncated' value back to it's original width, and equality-compare it with the original value. The most non-trivial thing here is the logic to detect whether this `ImplicitCastExpr` AST node is actually an implicit conversion, //or// part of an explicit cast. Because the explicit casts are modeled as an outer `ExplicitCastExpr` with some `ImplicitCastExpr`'s as direct children. https://godbolt.org/g/eE1GkJ Nowadays, we can just use the new `part_of_explicit_cast` flag, which is set on all the implicitly-added `ImplicitCastExpr`'s of an `ExplicitCastExpr`. So if that flag is not set, then it is an actual implicit conversion. As you may have noted, this isn't just named `-fsanitize=implicit-integer-truncation`. There are potentially some more implicit conversions to be warned about. Namely, implicit conversions that result in sign change; implicit conversion between different floating point types, or between fp and an integer, when again, that conversion is lossy. One thing i know isn't handled is bitfields. This is a clang part. The compiler-rt part is D48959. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=21530 \| PR21530 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=37552 \| PR37552 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=35409 \| PR35409 ]]. Partially fixes [[ https://bugs.llvm.org/show_bug.cgi?id=9821 \| PR9821 ]]. Fixes https://github.com/google/sanitizers/issues/940. (other than sign-changing implicit conversions) Reviewers: rjmccall, rsmith, samsonov, pcc, vsk, eugenis, efriedma, kcc, erichkeane Reviewed By: rsmith, vsk, erichkeane Subscribers: erichkeane, klimek, #sanitizers, aaron.ballman, RKSimon, dtzWill, filcab, danielaustin, ygribov, dvyukov, milianw, mclow.lists, cfe-commits, regehr Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D48958 llvm-svn: 338288	2018-07-30 18:58:30 +00:00
Momchil Velikov	20208cc046	[ARM, AArch64]: Use unadjusted alignment when passing composites as arguments The "Procedure Call Procedure Call Standard for the ARM® Architecture" (https://static.docs.arm.com/ihi0042/f/IHI0042F_aapcs.pdf), specifies that composite types are passed according to their "natural alignment", i.e. the alignment before alignment adjustment on the entire composite is applied. The same applies for AArch64 ABI. Clang, however, used the adjusted alignment. GCC already implements the ABI correctly. With this patch Clang becomes compatible with GCC and passes such arguments in accordance with AAPCS. Differential Revision: https://reviews.llvm.org/D46013 llvm-svn: 338279	2018-07-30 17:48:23 +00:00
Stefan Maksimovic	6e50be1e97	[mips64][clang] Adjust tests to account for changes in r338239 llvm-svn: 338246	2018-07-30 12:27:40 +00:00
Mandeep Singh Grang	2a153101bf	[COFF, ARM64] Decide when to mark struct returns as SRet Summary: Refer the MS ARM64 ABI Convention for the behavior for struct returns: https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions#return-values Reviewers: mstorsjo, compnerd, rnk, javed.absar, yinma, efriedma Reviewed By: rnk, efriedma Subscribers: haripul, TomTan, yinma, efriedma, kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D49464 llvm-svn: 338050	2018-07-26 18:07:59 +00:00
JF Bastien	6508929da9	CodeGen: use non-zero memset when possible for automatic variables Summary: Right now automatic variables are either initialized with bzero followed by a few stores, or memcpy'd from a synthesized global. We end up encountering a fair amount of code where memcpy of non-zero byte patterns would be better than memcpy from a global because it touches less memory and generates a smaller binary. The optimizer could reason about this, but it's not really worth it when clang already knows. This code could definitely be more clever but I'm not sure it's worth it. In particular we could track a histogram of bytes seen and figure out (as we do with bzero) if a memset could be followed by a handful of stores. Similarly, we could tune the heuristics for GlobalSize, but using the same as for bzero seems conservatively OK for now. <rdar://problem/42563091> Reviewers: dexonsmith Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D49771 llvm-svn: 337887	2018-07-25 04:29:03 +00:00
Shiva Chen	0ed11a9792	Revert "[DebugInfo] Generate debug information for labels. (Fix PR37395)" This reverts commit 4288dd3bf082482e02c8a044c611c18168cb0180. llvm-svn: 337803	2018-07-24 02:57:11 +00:00
Shiva Chen	c50fbb9da7	[DebugInfo] Generate debug information for labels. (Fix PR37395) Generate DILabel metadata and call llvm.dbg.label after label statement to associate the metadata with the label. After fixing PR37395. Differential Revision: https://reviews.llvm.org/D45045 Patch by Hsiangkai Wang. llvm-svn: 337800	2018-07-24 02:23:59 +00:00
Ivan A. Kosarev	8264bb8d34	[NEON] Fix support for vrndi_f32(), vrndiq_f32() and vrndns_f32() intrinsics This patch adds support for vrndi_f32() and vrndiq_f32() intrinsics in AArch32 mode and for vrndns_f32() intrinsic in AArch64 mode. Differential Revision: https://reviews.llvm.org/D48829 llvm-svn: 337690	2018-07-23 13:26:37 +00:00
Erich Keane	3efe00206f	Implement cpu_dispatch/cpu_specific Multiversioning As documented here: https://software.intel.com/en-us/node/682969 and https://software.intel.com/en-us/node/523346. cpu_dispatch multiversioning is an ICC feature that provides for function multiversioning. This feature is implemented with two attributes: First, cpu_specific, which specifies the individual function versions. Second, cpu_dispatch, which specifies the location of the resolver function and the list of resolvable functions. This is valuable since it provides a mechanism where the resolver's TU can be specified in one location, and the individual implementions each in their own translation units. The goal of this patch is to be source-compatible with ICC, so this implementation diverges from the ICC implementation in a few ways: 1- Linux x86/64 only: This implementation uses ifuncs in order to properly dispatch functions. This is is a valuable performance benefit over the ICC implementation. A future patch will be provided to enable this feature on Windows, but it will obviously more closely fit ICC's implementation. 2- CPU Identification functions: ICC uses a set of custom functions to identify the feature list of the host processor. This patch uses the cpu_supports functionality in order to better align with 'target' multiversioning. 1- cpu_dispatch function def/decl: ICC's cpu_dispatch requires that the function marked cpu_dispatch be an empty definition. This patch supports that as well, however declarations are also permitted, since the linker will solve the issue of multiple emissions. Differential Revision: https://reviews.llvm.org/D47474 llvm-svn: 337552	2018-07-20 14:13:28 +00:00
Richard Smith	4c6568869e	Fix typo causing assert in self-host. llvm-svn: 337508	2018-07-19 23:24:41 +00:00
Richard Smith	83497d9ead	When we choose to use zeroinitializer for a trailing portion of an array constant, don't convert the rest into a packed struct. If an array constant has a large non-zero portion and a large zero portion, we want to emit the first part as an array and the rest as a zeroinitializer if possible. This fixes a memory usage regression from r333141 when compiling PHP. llvm-svn: 337498	2018-07-19 21:38:56 +00:00
Nemanja Ivanovic	1ac56bd33f	[PowerPC] Handle __builtin_xxpermdi the same way as GCC does The codegen for this builtin was initially implemented to match GCC. However, due to interest from users GCC changed behaviour to account for the big endian bias of the instruction and correct it. This patch brings the handling inline with GCC. Fixes https://bugs.llvm.org/show_bug.cgi?id=38192 Differential Revision: https://reviews.llvm.org/D49424 llvm-svn: 337449	2018-07-19 12:44:15 +00:00
Manoj Gupta	da08f6ac16	[clang]: Add support for "-fno-delete-null-pointer-checks" Summary: Support for this option is needed for building Linux kernel. This is a very frequently requested feature by kernel developers. More details : https://lkml.org/lkml/2018/4/4/601 GCC option description for -fdelete-null-pointer-checks: This Assume that programs cannot safely dereference null pointers, and that no code or data element resides at address zero. -fno-delete-null-pointer-checks is the inverse of this implying that null pointer dereferencing is not undefined. This feature is implemented in as the function attribute "null-pointer-is-valid"="true". This CL only adds the attribute on the function. It also strips "nonnull" attributes from function arguments but keeps the related warnings unchanged. Corresponding LLVM change rL336613 already updated the optimizations to not treat null pointer dereferencing as undefined if the attribute is present. Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv Reviewed By: jyknight Subscribers: drinkcat, xbolva00, cfe-commits Differential Revision: https://reviews.llvm.org/D47894 llvm-svn: 337433	2018-07-19 00:44:52 +00:00
JF Bastien	7d60a0f118	Support implicit _Atomic struct load / store Summary: Using _Atomic to do implicit load / store is just a seq_cst atomic_load / atomic_store. Stores currently assert in Sema::ImpCastExprToType with 'can't implicitly cast lvalue to rvalue with this cast kind', but that's erroneous. The codegen is fine as the test shows. While investigating I found that Richard had found the problem here: https://reviews.llvm.org/D46112#1113557 <rdar://problem/40347123> Reviewers: dexonsmith Subscribers: cfe-commits, efriedma, rsmith, aaron.ballman Differential Revision: https://reviews.llvm.org/D49458 llvm-svn: 337410	2018-07-18 18:01:41 +00:00
Peter Collingbourne	14b468bab6	Re-land r337333, "Teach Clang to emit address-significance tables.", which was reverted in r337336. The problem that required a revert was fixed in r337338. Also added a missing "REQUIRES: x86-registered-target" to one of the tests. Original commit message: > Teach Clang to emit address-significance tables. > > By default, we emit an address-significance table on all ELF > targets when the integrated assembler is enabled. The emission of an > address-significance table can be controlled with the -faddrsig and > -fno-addrsig flags. > > Differential Revision: https://reviews.llvm.org/D48155 llvm-svn: 337339	2018-07-18 00:27:07 +00:00
Peter Collingbourne	35c6996b68	Revert r337333, "Teach Clang to emit address-significance tables." Causing multiple failures on sanitizer bots due to TLS symbol errors, e.g. /usr/bin/ld: __msan_origin_tls: TLS definition in /home/buildbots/ppc64be-clang-test/clang-ppc64be/stage1/lib/clang/7.0.0/lib/linux/libclang_rt.msan-powerpc64.a(msan.cc.o) section .tbss.__msan_origin_tls mismatches non-TLS reference in /tmp/lit_tmp_0a71tA/mallinfo-3ca75e.o llvm-svn: 337336	2018-07-17 23:56:30 +00:00
Peter Collingbourne	27242c0402	Teach Clang to emit address-significance tables. By default, we emit an address-significance table on all ELF targets when the integrated assembler is enabled. The emission of an address-significance table can be controlled with the -faddrsig and -fno-addrsig flags. Differential Revision: https://reviews.llvm.org/D48155 llvm-svn: 337333	2018-07-17 23:17:16 +00:00
Mandeep Singh Grang	0054f48b44	[COFF] Add more missing MSVC ARM64 intrinsics Summary: Added the following intrinsics: _BitScanForward, _BitScanReverse, _BitScanForward64, _BitScanReverse64 _InterlockedAnd64, _InterlockedDecrement64, _InterlockedExchange64, _InterlockedExchangeAdd64, _InterlockedExchangeSub64, _InterlockedIncrement64, _InterlockedOr64, _InterlockedXor64. Reviewers: compnerd, mstorsjo, rnk, javed.absar Reviewed By: mstorsjo Subscribers: kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D49445 llvm-svn: 337327	2018-07-17 22:03:24 +00:00
Joerg Sonnenberger	b79e61f8b3	Always use __mcount on NetBSD. Some platforms don't provide _mcount. llvm-svn: 337277	2018-07-17 13:13:34 +00:00
Roman Lebedev	a67e0d0989	Harden/relax clang/test/CodeGen/opt-record-MIR.c test Summary: If the build path is short, `Line` field can end up fitting on the same line as `File`, but the `{{.*}}` would consume it. Keeping in mind rL293149, i think we can fix it, while keeping it working when there are and there are not any quotations. At least this fixes this test for me. Reviewers: anemet, aaron.ballman, hfinkel Reviewed By: anemet Subscribers: cfe-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D49348 llvm-svn: 337249	2018-07-17 07:12:08 +00:00
Krzysztof Parzyszek	d66941fab9	[Hexagon] Fix hvx-length feature name in testcases llvm-svn: 337049	2018-07-13 21:32:33 +00:00
JF Bastien	9aab85a6a0	CodeGen: specify alignment + inbounds for automatic variable initialization Summary: Automatic variable initialization was generating default-aligned stores (which are deprecated) instead of using the known alignment from the alloca. Further, they didn't specify inbounds. Subscribers: dexonsmith, cfe-commits Differential Revision: https://reviews.llvm.org/D49209 llvm-svn: 337041	2018-07-13 20:33:23 +00:00
Craig Topper	2f7de23bea	[X86] Change the rounding mode used when testing the sqrt_round intrinsics. Using CUR_DIRECTION is not a realistic scenario. That is equivalent to the intrinsic without rounding. llvm-svn: 337040	2018-07-13 20:16:38 +00:00
Krzysztof Parzyszek	d5cd7ce743	[Hexagon] Fix testcases failing after r336933 llvm-svn: 336936	2018-07-12 19:44:39 +00:00
Akira Hatanaka	e18c2d2ce3	os_log: When there are multiple privacy annotations in the format string, choose the strictest one instead of the last. Also fix an undefined behavior. Move the pointer update to a later point to avoid adding StringRef::npos to the pointer. rdar://problem/40706280 llvm-svn: 336863	2018-07-11 22:19:14 +00:00
Joel E. Denny	72c2783012	[FileCheck] Add -allow-deprecated-dag-overlap to failing clang tests See https://reviews.llvm.org/D47106 for details. Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D47172 llvm-svn: 336844	2018-07-11 20:26:20 +00:00
Petr Pavlu	a934f9da41	Fix setting of empty implicit-section-name attribute Code in `CodeGenModule::SetFunctionAttributes()` could set an empty attribute `implicit-section-name` on a function that is affected by `#pragma clang text="section"`. This is incorrect because the attribute should contain a valid section name. If the function additionally also used `__attribute__((section("section")))` then this could result in emitting the function in a section with an empty name. The patch fixes the issue by removing the problematic code that sets empty `implicit-section-name` from `CodeGenModule::SetFunctionAttributes()` because it is sufficient to set this attribute only from a similar code in `setNonAliasAttributes()` when the function is emitted. Differential Revision: https://reviews.llvm.org/D48916 llvm-svn: 336842	2018-07-11 20:17:54 +00:00
Craig Topper	db5c9aa366	[X86] Also fix the test for _mm512_mullo_epi64 to test the intrinsic instead of a copy of the intrinsic implementation. This had the same issue I just fixed in r336739. Apparently I copy pasted _mm512_mullo_epi64 when I added _mm512_mullox_epi64. llvm-svn: 336740	2018-07-10 23:28:05 +00:00
Craig Topper	4ddb2f3a18	[X86] Fix the test for _mm512_mullox_epi64 to test the intrinsic instead of a copy of the intrinsic implementation. llvm-svn: 336739	2018-07-10 23:13:01 +00:00
Bjorn Pettersson	404f414ee1	Patch to fix pragma metadata for do-while loops Summary: Make sure that loop metadata only is put on the backedge when expanding a do-while loop. Previously we added the loop metadata also on the branch in the pre-header. That could confuse optimization passes and result in the loop metadata being associated with the wrong loop. Fixes https://bugs.llvm.org/show_bug.cgi?id=38011 Committing on behalf of deepak2427 (Deepak Panickal) Reviewers: #clang, ABataev, hfinkel, aaron.ballman, bjope Reviewed By: bjope Subscribers: bjope, rsmith, shenhan, zzheng, xbolva00, lebedev.ri, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D48721 llvm-svn: 336717	2018-07-10 19:55:02 +00:00
Matt Arsenault	1dae500c4e	AMDGPU: Try to fix test again llvm-svn: 336681	2018-07-10 14:47:31 +00:00
Matt Arsenault	3ae7d63c80	Update test for backend error message change llvm-svn: 336676	2018-07-10 14:03:50 +00:00
Mikhail Dvoretckii	d1bf9ef0c7	[X86] Lowering integer truncation intrinsics to native IR This patch lowers the _mm[256\|512]_cvtepi{64\|32\|16}_epi{32\|16\|8} intrinsics to native IR in cases where the result's length is less than 128 bits. The resulting IR for 256-bit inputs is folded into VPMOV instructions, while for 128-bit inputs the vpshufb (or, in the 64-to-32-bit case, vinsertps) instructions are generated instead Differential Revision: https://reviews.llvm.org/D48712 llvm-svn: 336643	2018-07-10 08:22:44 +00:00
Craig Topper	36ab775cc1	[X86] Use masked the masked scalar fma builtins to implement the default rounding version of the fma intrinsics. The rounding mode is checked in CGBuiltin.cpp to generate the correct intrinsic call. Making this switch switchs the masking to use the i8 bitcast to <8 x i1> and extract i1 version of the IR for the mask. Previously we ended up with a scalar 'and' plus an icmp. llvm-svn: 336637	2018-07-10 04:38:29 +00:00
Akira Hatanaka	189359d1ff	Fix parsing of privacy annotations in os_log format strings. Privacy annotations shouldn't have to appear in the first comma-delimited string in order to be recognized. Also, they should be ignored if they are preceded or followed by non-whitespace characters. rdar://problem/40706280 llvm-svn: 336629	2018-07-10 00:50:25 +00:00
Craig Topper	638426fc36	[X86] Add __builtin_ia32_selectss_128 and __builtin_ia32_selectsd_128 that is suitable for use in scalar mask intrinsics. This will convert the i8 mask argument to <8 x i1> and extract an i1 and then emit a select instruction. This replaces the '(__U & 1)" and ternary operator used in some of intrinsics. The old sequence was lowered to a scalar and and compare. The new sequence uses an i1 vector that will interoperate better with other mask intrinsics. This removes the need to handle div_ss/sd specially in CGBuiltin.cpp. A follow up patch will add the GCCBuiltin name back in llvm and remove the custom handling. I made some adjustments to legacy move_ss/sd intrinsics which we reused here to do a simpler extract and insert instead of 2 extracts and two inserts or a shuffle. llvm-svn: 336622	2018-07-10 00:37:25 +00:00
Stefan Pintilie	c172604dba	[Power9] [CLANG] Add __float128 support for trunc to double round to odd Add support for this builtin: double builtin_truncf128_round_to_odd(float128) Differential Revision: https://reviews.llvm.org/D48482 llvm-svn: 336596	2018-07-09 20:09:52 +00:00
Craig Topper	74c10e3236	[Builtins][Attributes][X86] Tag all X86 builtins with their required vector width. Add a min_vector_width function attribute and tag all x86 instrinsics with it This is part of an ongoing attempt at making 512 bit vectors illegal in the X86 backend type legalizer due to CPU frequency penalties associated with wide vectors on Skylake Server CPUs. We want the loop vectorizer to be able to emit IR containing wide vectors as intermediate operations in vectorized code and allow these wide vectors to be legalized to 256 bits by the X86 backend even though we are targetting a CPU that supports 512 bit vectors. This is similar to what happens with an AVX2 CPU, the vectorizer can emit wide vectors and the backend will split them. We want this splitting behavior, but still be able to use new Skylake instructions that work on 256-bit vectors and support things like masking and gather/scatter. Of course if the user uses explicit vector code in their source code we need to not split those operations. Especially if they have used any of the 512-bit vector intrinsics from immintrin.h. And we need to make it so that merely using the intrinsics produces the expected code in order to be backwards compatible. To support this goal, this patch adds a new IR function attribute "min-legal-vector-width" that can indicate the need for a minimum vector width to be legal in the backend. We need to ensure this attribute is set to the largest vector width needed by any intrinsics from immintrin.h that the function uses. The inliner will be reponsible for merging this attribute when a function is inlined. We may also need a way to limit inlining in the future as well, but we can discuss that in the future. To make things more complicated, there are two different ways intrinsics are implemented in immintrin.h. Either as an always_inline function containing calls to builtins(can be target specific or target independent) or vector extension code. Or as a macro wrapper around a taget specific builtin. I believe I've removed all cases where the macro was around a target independent builtin. To support the always_inline function case this patch adds attribute((min_vector_width(128))) that can be used to tag these functions with their vector width. All x86 intrinsic functions that operate on vectors have been tagged with this attribute. To support the macro case, all x86 specific builtins have also been tagged with the vector width that they require. Use of any builtin with this property will implicitly increase the min_vector_width of the function that calls it. I've done this as a new property in the attribute string for the builtin rather than basing it on the type string so that we can opt into it on a per builtin basis and avoid any impact to target independent builtins. There will be future work to support vectors passed as function arguments and supporting inline assembly. And whatever else we can find that isn't covered by this patch. Special thanks to Chandler who suggested this direction and reviewed a preview version of this patch. And thanks to Eric Christopher who has had many conversations with me about this issue. Differential Revision: https://reviews.llvm.org/D48617 llvm-svn: 336583	2018-07-09 19:00:16 +00:00
Stefan Pintilie	3dbde8a778	[Power9] Add __float128 builtins for Round To Odd Add a number of builtins for __float128 Round To Odd. This is the Clang portion of the builtins work. Differential Revision: https://reviews.llvm.org/D47548 llvm-svn: 336579	2018-07-09 18:50:40 +00:00
Craig Topper	8a8d72794f	[X86] Add new scalar fma intrinsics with rounding mode that use f32/f64 types. This allows us to handle masking in a very similar way to the default rounding version that uses llvm.fma llvm-svn: 336507	2018-07-08 01:10:47 +00:00
Craig Topper	3e720a302c	[X86] Fix a few intrinsics that were ignoring their rounding mode argument and hardcoded _MM_FROUND_CUR_DIRECTION internally. I believe these have been broken since their introduction into clang. I've enhanced the tests for these intrinsics to using a real rounding mode and checking all the intrinsic arguments instead of just the name. llvm-svn: 336498	2018-07-07 22:03:16 +00:00
Craig Topper	5cbeeedd27	[X86] Fix various type mismatches in intrinsic headers and intrinsic tests that cause extra bitcasts to be emitted in the IR. Found via imprecise grepping of the -O0 IR. There could still be more bugs out there. llvm-svn: 336487	2018-07-07 17:03:32 +00:00
Craig Topper	f89f62a680	[X86] When creating a select for scalar masked sqrt and div builtins make sure we optimize the all ones mask case. This case occurs in the intrinsic headers so we should avoid emitting the mask in those cases. Factor the code into a helper function to make this easy. llvm-svn: 336472	2018-07-06 22:46:52 +00:00
Craig Topper	10f20fc42b	[X86] Add missing scalar fma intrinsics with rounding, but no mask. We had the mask versions of the rounding intrinsics, but not one without masking. Also change the rounding tests to not use the CUR_DIRECTION rounding mode. llvm-svn: 336470	2018-07-06 22:08:43 +00:00
Craig Topper	be4c2933a2	[X86] Implement _builtin_ia32_vfmaddss and _builtin_ia32_vfmaddsd with native IR using llvm.fma intrinsic. This generates some extra zeroing currently, but we should be able to quickly address that with some isel patterns. llvm-svn: 336417	2018-07-06 07:14:47 +00:00
Hans Wennborg	7525edc890	[ms] Fix mangling of string literals used to initialize arrays larger or smaller than the literal A Chromium developer reported a bug which turned out to be a mangling collision between these two literals: char s[] = "foo"; char t[32] = "foo"; They may look the same, but for the initialization of t we will (under some circumstances) use a literal that's extended with zeros, and both the length and those zeros should be accounted for by the mangling. This actually makes the mangling code simpler: where it previously had special logic for null terminators, which are not part of the StringLiteral, that is now covered by the general algorithm. (The problem was reported at https://crbug.com/857442) Differential Revision: https://reviews.llvm.org/D48928 llvm-svn: 336415	2018-07-06 06:54:16 +00:00
Craig Topper	284c5f342c	[X86] Use shufflevector instead of a select with a constant mask for fmaddsub/fmsubadd IR emission. Shufflevector is easier to generate and matches what the backend pattern matches without relying on constant selects being turned into shuffles. While I was there I also made the IR regular expressions a little stricter to ensure operand order on the shuffle. llvm-svn: 336388	2018-07-05 20:38:31 +00:00
Gabor Buella	9679eb6527	[X86] Fix some vector cmp builtins - TRUE/FALSE predicates This patch removes on optimization used with the TRUE/FALSE predicates, as was suggested in https://reviews.llvm.org/D45616 for r335339. The optimization was buggy, since r335339 used it also for *_mask builtins, without actually applying the mask -- the mask argument was just ignored. Reviewers: craig.topper, uriel.k, RKSimon, andrew.w.kaylor, spatel, scanon, efriedma Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D48715 llvm-svn: 336355	2018-07-05 14:26:56 +00:00
Gabor Buella	38a7ce76ae	[X86] NFC - add more test cases for vector cmp intrinsics Add test cases with each predicate using the following intrinsics: _mm_cmp_pd _mm_cmp_ps _mm256_cmp_pd _mm256_cmp_ps _mm_cmp_pd_mask _mm_cmp_ps_mask _mm256_cmp_pd_mask _mm256_cmp_ps_mask _mm512_cmp_pd_mask _mm512_cmp_ps_mask _mm_mask_cmp_pd_mask _mm_mask_cmp_ps_mask _mm256_mask_cmp_pd_mask _mm256_mask_cmp_ps_mask _mm512_mask_cmp_pd_mask _mm512_mask_cmp_ps_mask Some of these are marked with FIXME, as there is bug in lowering e.g. _mm512_mask_cmp_ps_mask. llvm-svn: 336346	2018-07-05 12:57:47 +00:00
Lei Huang	449252d2ad	[Power9] Update fp128 as a valid homogenous aggregate base type Update clang to treat fp128 as a valid base type for homogeneous aggregate passing and returning. Differential Revision: https://reviews.llvm.org/D48044 llvm-svn: 336308	2018-07-05 04:32:01 +00:00
Gabor Buella	bf591794f4	NFC - Fix type in builtins-ppc-p9vector.c test llvm-svn: 336264	2018-07-04 11:29:21 +00:00
Gabor Buella	ab38eb29cf	NFC - typo fix in test/CodeGen/avx512f-builtins.c llvm-svn: 336243	2018-07-04 08:32:02 +00:00
Yaron Keren	14cd1997f1	Add expected fail triple x86_64-pc-windows-gnu to test as x86_64-w64-mingw32 is already there llvm-svn: 336047	2018-06-30 11:18:44 +00:00
Craig Topper	0029470dde	[X86] Correct the width of mask arguments in intrinsic headers and tests. All of these found by grepping through IR from the builtin tests for extra trunc and zext/sext instructions that shouldn't have been there. Some of these were real bugs where we lost bits from the user input: _mm512_mask_broadcast_f32x8 _mm512_maskz_broadcast_f32x8 _mm512_mask_broadcast_i32x8 _mm512_maskz_broadcast_i32x8 _mm256_mask_cvtusepi16_storeu_epi8 llvm-svn: 336042	2018-06-30 06:05:17 +00:00
Craig Topper	0e9de769a0	[X86] Remove masking from the avx512 rotate builtins. Use a select builtin instead. llvm-svn: 336036	2018-06-30 01:32:14 +00:00
Craig Topper	8bf793fb35	[X86] Remove masking from the avx512 packed sqrt builtins. Use select builtins instead. llvm-svn: 335945	2018-06-29 05:43:33 +00:00
Francis Visoiu Mistrih	7b7b5eb60d	[NEON] Remove empty test file from r335734 Fails on Green Dragon: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/50174/consoleFull UNRESOLVED: Clang :: CodeGen/vld_dup.c (5546 of 38947) ****************** TEST 'Clang :: CodeGen/vld_dup.c' FAILED ****************** Test has no run line! llvm-svn: 335750	2018-06-27 16:17:32 +00:00
Craig Topper	851f363691	[X86] Rename llvm.x86.avx512.mask.fpclass.p* to exclude 'mask.' from the name to match llvm. llvm-svn: 335745	2018-06-27 15:57:57 +00:00
Ivan A. Kosarev	a9f484ac4a	[NEON] Support vldNq intrinsics in AArch32 (Clang part) This patch reworks the support for dup NEON intrinsics as described in D48439. Differential Revision: https://reviews.llvm.org/D48440 llvm-svn: 335734	2018-06-27 13:58:43 +00:00
Teresa Johnson	25becb35e9	[ThinLTO] Add testing of summary index parsing to a couple CFI tests Summary: Changes to some clang side tests to go with the summary parsing patch. Depends on D47905. Reviewers: pcc, dexonsmith, mehdi_amini Subscribers: inglorion, eraman, cfe-commits, steven_wu Differential Revision: https://reviews.llvm.org/D47906 llvm-svn: 335618	2018-06-26 15:50:34 +00:00
Sam Clegg	6fd7d680b0	[WebAssembly] Add no-prototype attribute to prototype-less C functions The WebAssembly backend in particular benefits from being able to distinguish between varargs functions (...) and prototype-less C functions. Differential Revision: https://reviews.llvm.org/D48443 llvm-svn: 335510	2018-06-25 18:47:32 +00:00
Hans Wennborg	08c5a7b8fd	[clang-cl] Don't emit dllexport inline functions etc. from pch files (PR37801) With MSVC, PCH files are created along with an object file that needs to be linked into the final library or executable. That object file contains the code generated when building the headers. In particular, it will include definitions of inline dllexport functions, and because they are emitted in this object file, other files using the PCH do not need to emit them. See the bug for an example. This patch makes clang-cl match MSVC's behaviour in this regard, causing significant compile-time savings when building dlls using precompiled headers. For example, in a 64-bit optimized shared library build of Chromium with PCH, it reduces the binary size and compile time of stroke_opacity_custom.obj from 9315564 bytes to 3659629 bytes and 14.6 to 6.63 s. The wall-clock time of building blink_core.dll goes from 38m41s to 22m33s. ("user" time goes from 1979m to 1142m). Differential Revision: https://reviews.llvm.org/D48426 llvm-svn: 335466	2018-06-25 13:23:49 +00:00
Tobias Edler von Koch	7609cb83e6	Re-land "[LTO] Enable module summary emission by default for regular LTO" Since we are now producing a summary also for regular LTO builds, we need to run the NameAnonGlobals pass in those cases as well (the summary cannot handle anonymous globals). See https://reviews.llvm.org/D34156 for details on the original change. This reverts commit 6c9ee4a4a438a8059aacc809b2dd57128fccd6b3. llvm-svn: 335385	2018-06-22 20:23:21 +00:00
Gabor Buella	716863c820	[X86] Lower _mm[256\|512]_cmp[.]_mask intrinsics to native llvm IR Summary: Lowering some vector comparision builtins to fcmp IR instructions. This ignores the signaling behaviour specified in the predicate argument of said builtins. Affected AVX512 builtins: __builtin_ia32_cmpps128_mask __builtin_ia32_cmpps256_mask __builtin_ia32_cmpps512_mask __builtin_ia32_cmppd128_mask __builtin_ia32_cmppd256_mask __builtin_ia32_cmppd512_mask Reviewers: craig.topper, uriel.k, RKSimon, andrew.w.kaylor, spatel, scanon, efriedma Reviewed By: craig.topper, spatel, efriedma Differential Revision: https://reviews.llvm.org/D45616 llvm-svn: 335339	2018-06-22 11:59:16 +00:00
Chandler Carruth	16e6bc23a1	[x86] Teach the builtin argument range check to allow invalid ranges in dead code. This is important for C++ templates that essentially compute the valid input in a way that is constant and will cause all the invalid cases to be dead code that is deleted. Code in the wild actually does this and GCC also accepts these kinds of patterns so it is important to support it. To make this work, we provide a non-error path to diagnose these issues, and use a default-error warning instead. This keeps the relatively strict handling but prevents nastiness like SFINAE on these errors. It also allows us to safely use the system to diagnose this only when it occurs at runtime (in emitted code). Entertainingly, this required fixing the syntax in various other ways for the x86 test because we never bothered to diagnose that the returns were invalid. Since debugging these compile failures was super confusing, I've also improved the diagnostic to actually say what the value was. Most of the checks I've made ignore this to simplify maintenance, but I've checked it in a few places to make sure the diagnsotic is working. Depends on D48462. Without that, we might actually crash some part of the compiler after bypassing the error here. Thanks to Richard, Ben Kramer, and especially Craig Topper for all the help here. Differential Revision: https://reviews.llvm.org/D48464 llvm-svn: 335309	2018-06-21 23:46:09 +00:00
Craig Topper	342b095689	[X86] Update handling in CGBuiltin to be tolerant of out of range immediates. D48464 contains changes that will loosen some of the range checks in SemaChecking to a DefaultError warning that can be disabled. This patch adds explicit masking to avoid using the upper bits of immediates to gracefully handle the warning being disabled. llvm-svn: 335308	2018-06-21 23:39:47 +00:00
Evgeniy Stepanov	fb762b27f2	Ignore blacklist when generating __cfi_check_fail. Summary: Fixes PR37898. Reviewers: pcc, vlad.tsyrklevich Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D48454 llvm-svn: 335305	2018-06-21 23:22:37 +00:00
Tobias Edler von Koch	e597a2cf81	Revert "[LTO] Enable module summary emission by default for regular LTO" This is breaking a couple of buildbots. We need to run the NameAnonGlobal pass for regular LTO now as well (since we're producing a summary). I'll post a separate patch for review to make this happen and then re-commit. This reverts commit c0759b7b1f4a81ff9021b952aa38a222d5fa4dfd. llvm-svn: 335291	2018-06-21 21:24:30 +00:00
Tobias Edler von Koch	9a8be606f3	[LTO] Enable module summary emission by default for regular LTO Summary: With D33921, we gained the ability to have module summaries in regular LTO modules without triggering ThinLTO compilation. Module summaries in regular LTO allow garbage collection (dead stripping) before LTO compilation and thus open up additional optimization opportunities. This patch enables summary emission in regular LTO for all targets except ld64-based ones (which use the legacy LTO API). Reviewers: pcc, tejohnson, mehdi_amini Subscribers: inglorion, eraman, cfe-commits Differential Revision: https://reviews.llvm.org/D34156 llvm-svn: 335284	2018-06-21 20:20:41 +00:00
Craig Topper	1763dbb278	[X86] Correct the inline assembly implementations of __movsb/w/d/q and __stosw/d/q to mark registers/memory as modified The inline assembly for these didn't mark that edi, esi, ecx are modified by movs/stos instruction. It also didn't mark that memory is modified. This issue was reported to llvm-dev last year http://lists.llvm.org/pipermail/cfe-dev/2017-November/055863.html but no bug was ever filed. Differential Revision: https://reviews.llvm.org/D48448 llvm-svn: 335270	2018-06-21 18:56:30 +00:00
Anastasis Grammenos	dfe8fe503c	[DebugInfo] Inline for without DebugLocation Summary: This test is a strip down version of a function inside the amalgamated sqlite source. When converted to IR clang produces a phi instruction without debug location. This patch fixes the above issue. Differential Revision: https://reviews.llvm.org/D47720 llvm-svn: 335255	2018-06-21 16:53:48 +00:00
Craig Topper	ddfe69cc99	[X86] Rewrite the add/mul/or/and reduction intrinsics to make better use of other intrinsics and remove undef shuffle indices. Similar to what was done to max/min recently. These already reduced the vector width to 256 and 128 bit as we go unlike the original max/min code. Differential Revision: https://reviews.llvm.org/D48346 llvm-svn: 335253	2018-06-21 16:41:28 +00:00
Craig Topper	2da60bc231	[X86] Remove masking from the 512-bit floating point max/min builtins. Use select in IR instead. llvm-svn: 335200	2018-06-21 05:01:01 +00:00
Sjoerd Meijer	5c4c998a54	[SPIR] Prevent SPIR targets from using half conversion intrinsics The SPIR target currently allows for half precision floating point types to be emitted using the LLVM intrinsic functions which convert half types to floats and doubles. However, this is illegal in SPIR as the only intrinsic allowed by SPIR is memcpy, as per section 3 of the SPIR specification. Currently this is leading to an assert being hit in the Clang CodeGen when attempting to emit a constant or literal _Float16 type in a comparison operation on a SPIR or SPIR64 target. This assert stems from the CodeGen attempting to emit a constant half value as an integer because the backend has specified that it is using these half conversion intrinsics (which represents half as i16). This patch prevents SPIR targets from using these intrinsics by overloading the responsible target info method, marks SPIR targets as having a legal half type and provides additional regression testing for the _Float16 type on SPIR targets. Patch by: Stephen McGroarty Differential Revision: https://reviews.llvm.org/D48188 llvm-svn: 335111	2018-06-20 09:49:40 +00:00
Craig Topper	61495b3650	Recommit r335070 "[X86] Rewrite the max and min reduction intrinsics to make better use of other functions and to reduce width to 256 and 128 bits were possible."" Test has been updated to reflect the IRGen. llvm-svn: 335075	2018-06-19 21:00:30 +00:00
Craig Topper	79b13a0348	Revert r335070 "[X86] Rewrite the max and min reduction intrinsics to make better use of other functions and to reduce width to 256 and 128 bits were possible." The test changes are failing the buildbot and its going to take me some time to fix it. llvm-svn: 335072	2018-06-19 19:37:07 +00:00
Craig Topper	873afd0262	[X86] Rewrite the max and min reduction intrinsics to make better use of other functions and to reduce width to 256 and 128 bits were possible. We only need to use 512 bit vectors all the way through v8i64 reductions since those max instructions are new to avx512f and only available in 512 bits until SKX. For v16i32 and floating point we have legacy 128/256 bit instructions we can use. I've tried to use other intrinsics to reduce the verbosity of the code and avoid having to mention all the shuffles. I've also removed all the -1 shuffle indices so the output sequence is fully specified and not left to backend optimization. Differential Revision: https://reviews.llvm.org/D47401 llvm-svn: 335070	2018-06-19 19:13:54 +00:00
Tomasz Krupa	f1792bb3d6	[X86] Lowering sqrt intrinsics to native IR Reviewers: craig.topper, spatel, RKSimon, igorb, uriel.k Reviewed By: craig.topper Subscribers: tkrupa, cfe-commits Differential Revision: https://reviews.llvm.org/D41168 llvm-svn: 334850	2018-06-15 18:05:59 +00:00
Luke Geeson	da2b2e8c26	[AArch64] Reverted rC334696 with Clang VCVTA test fix llvm-svn: 334820	2018-06-15 10:10:45 +00:00
Craig Topper	b521dc3acf	[X86] Add inline assembly versions of _InterlockedExchange_HLEAcquire/Release and _InterlockedCompareExchange_HLEAcquire/Release for MSVC compatibility. Clang/LLVM doesn't have a way to pass an HLE hint through to the X86 backend to emit HLE prefixed instructions. So this is a good short term fix. Differential Revision: https://reviews.llvm.org/D47672 llvm-svn: 334751	2018-06-14 18:43:52 +00:00
Tomasz Krupa	82aa42af49	[X86] Lowering Mask Scalar intrinsics to native IR (Clang part) Summary: Lowering add, sub, mul, and div mask scalar intrinsic calls to native IR. Reviewers: craig.topper, RKSimon, spatel, sroland Reviewed By: craig.topper Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D47979 llvm-svn: 334741	2018-06-14 17:36:23 +00:00
Luke Geeson	bb399f8013	[AArch64] reverting rC334693 due to build failures llvm-svn: 334696	2018-06-14 08:59:33 +00:00
Luke Geeson	010bbbf390	[AArch64] Added support for the vcvta_u16_f16 instrinsic for FP16 Armv8.2-A llvm-svn: 334693	2018-06-14 08:28:56 +00:00
Mandeep Singh Grang	2d28383097	[COFF] Add ARM64 intrinsics: __yield, __wfe, __wfi, __sev, __sevl Summary: These intrinsics result in hint instructions. They are provided here for MSVC ARM64 compatibility. Reviewers: mstorsjo, compnerd, javed.absar Reviewed By: mstorsjo Subscribers: kristof.beyls, chrib, cfe-commits Differential Revision: https://reviews.llvm.org/D48132 llvm-svn: 334639	2018-06-13 18:49:35 +00:00
Sanjay Patel	1d7ed94439	[CodeGen] make nan builtins pure rather than const (PR37778) https://bugs.llvm.org/show_bug.cgi?id=37778 ...shows a miscompile resulting from marking nan builtins as 'const'. The nan libcalls/builtins take a pointer argument: http://www.cplusplus.com/reference/cmath/nan-function/ ...and the chars dereferenced by that arg are used to fill in the NaN constant payload bits. "const" means that the pointer argument isn't dereferenced. That's translated to "readnone" in LLVM. "pure" means that the pointer argument may be dereferenced. That's translated to "readonly" in LLVM. This change prevents the IR optimizer from killing the lead-up to the nan call here: double a() { char buf[4]; buf[0] = buf[1] = buf[2] = '9'; buf[3] = '\0'; return __builtin_nan(buf); } ...the optimizer isn't currently able to simplify this to a constant as we might hope, but this patch should solve the miscompile. Differential Revision: https://reviews.llvm.org/D48134 llvm-svn: 334628	2018-06-13 17:54:52 +00:00
Craig Topper	2527c378c6	[X86] Remove masking from avx512vbmi2 concat and shift by immediate builtins. Use select builtins instead. llvm-svn: 334577	2018-06-13 07:19:28 +00:00
Luke Geeson	dc54b37414	[AArch64] Corrected FP16 Intrinsic range checks in Clang + added Sema tests Summary: This fixes the ranges for the vcvth family of FP16 intrinsics in the clang front end. Previously it was accepting incorrect ranges -Changed builtin range checking in SemaChecking -added tests SemaCheck changes - included in their own file since no similar one exists -modified existing tests to reflect new ranges Reviewers: SjoerdMeijer, javed.absar Reviewed By: SjoerdMeijer Subscribers: kristof.beyls, cfe-commits Differential Revision: https://reviews.llvm.org/D47592 llvm-svn: 334489	2018-06-12 09:54:27 +00:00
Mikhail Maltsev	5434047862	[Driver] Add aliases for -Qn/-Qy This patch adds aliases for -Qn (-fno-ident) and -Qy (-fident) which look less cryptic than -Qn/-Qy. The aliases are compatible with GCC. Differential Revision: https://reviews.llvm.org/D48021 llvm-svn: 334414	2018-06-11 16:10:06 +00:00
Craig Topper	91bbe98757	[X86] Remove masking from dbpsadbw builtins, use select builtin instead. llvm-svn: 334385	2018-06-11 06:18:29 +00:00
Craig Topper	3cce6a7ed9	[X86] Use target independent masked expandload and compressstore intrinsics to implement expandload/compressstore builtins. Summary: We've had these target independent intrinsics for at least a year and a half. Looks like they do exactly what we need here and the backend already supports them. Reviewers: RKSimon, delena, spatel, GBuella Reviewed By: RKSimon Subscribers: cfe-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D47693 llvm-svn: 334366	2018-06-10 17:27:05 +00:00
Ivan A. Kosarev	73c76c35a5	[NEON] Support VST1xN intrinsics in AArch32 mode (Clang part) We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47446 llvm-svn: 334362	2018-06-10 09:28:10 +00:00
Craig Topper	3614b41a8e	[X86] Remove masking from the 512-bit packed floating point add/sub/mul/div builtins. Use select in IR instead. llvm-svn: 334359	2018-06-10 06:01:42 +00:00
Craig Topper	03f4f04b91	[X86] Add builtins for vpermq/vpermpd instructions to enable target feature checking. llvm-svn: 334311	2018-06-08 18:00:25 +00:00
Craig Topper	03de166ccd	[X86] Add builtins for pshufd, pshuflw, and pshufhw to enable target feature and immediate range checking. llvm-svn: 334265	2018-06-08 06:13:16 +00:00
Craig Topper	3428beeb2f	[X86] Add subvector insert and extract builtins to enable target feature checking and immediate range checking. Test changes are due to differences in how we generate undef elements now. We also changed the types used for extractf128_si256/insertf128_si256 to match the signature of the builtin that previously existed which this patch resurrects. This also matches gcc. llvm-svn: 334261	2018-06-08 03:24:47 +00:00
Craig Topper	acf5601961	[X86] Add builtins for vpermilps/pd instructions to enable target feature checking. llvm-svn: 334256	2018-06-08 00:59:27 +00:00
Craig Topper	7d17d7278b	[X86] Add builtins for blend with immediate control to enforce target feature requirements and check immediate range. llvm-svn: 334249	2018-06-08 00:00:21 +00:00
Craig Topper	9392136414	[X86] Add builtins for shuff32x4/shuff64x2/shufi32x4/shuff64x2 to enable target feature checking and immediate range checking. llvm-svn: 334244	2018-06-07 23:03:08 +00:00
Shoaib Meenai	d8d1547387	[Frontend] Disallow non-MSVC exception models for windows-msvc targets The windows-msvc target is used for MSVC ABI compatibility, including the exceptions model. It doesn't make sense to pair a windows-msvc target with a non-MSVC exception model. This would previously cause an assertion failure; explicitly error out for it in the frontend instead. This also allows us to reduce the matrix of target/exception models a bit (see the modified tests), and we can possibly simplify some of the personality code in a follow-up. Differential Revision: https://reviews.llvm.org/D47853 llvm-svn: 334243	2018-06-07 22:54:54 +00:00
Reid Kleckner	aa46ed9278	[MS] Re-add support for the ARM interlocked bittest intrinscs Adds support for these intrinsics, which are ARM and ARM64 only: _interlockedbittestandreset_acq _interlockedbittestandreset_rel _interlockedbittestandreset_nf _interlockedbittestandset_acq _interlockedbittestandset_rel _interlockedbittestandset_nf Refactor the bittest intrinsic handling to decompose each intrinsic into its action, its width, and its atomicity. llvm-svn: 334239	2018-06-07 21:39:04 +00:00
Craig Topper	d3623155a2	[X86] Add back builtins for _mm_slli_si128/_mm_srli_si128 and similar intrinsics. We still lower them to native shuffle IR, but we do it in CGBuiltin.cpp now. This allows us to check the target feature and ensure the immediate fits in 8 bits. This also improves our -O0 codegen slightly because we're able to see the zeroinitializer in the shuffle. It looks like it got lost behind a store+load previously. llvm-svn: 334208	2018-06-07 17:28:03 +00:00
Gabor Buella	1a83d06768	[CodeGen] Improve diagnostics related to target attributes Summary: When requirement imposed by __target__ attributes on functions are not satisfied, prefer printing those requirements, which are explicitly mentioned in the attributes. This makes such messages more useful, e.g. printing avx512f instead of avx2 in the following scenario: ``` $ cat foo.c static inline void __attribute__((__always_inline__, __target__("avx512f"))) x(void) { } int main(void) { x(); } $ clang foo.c foo.c:7:2: error: always_inline function 'x' requires target feature 'avx2', but would be inlined into function 'main' that is compiled without support for 'avx2' x(); ^ 1 error generated. ``` bugzilla: https://bugs.llvm.org/show_bug.cgi?id=37338 Reviewers: craig.topper, echristo, dblaikie Reviewed By: craig.topper, echristo Differential Revision: https://reviews.llvm.org/D46541 llvm-svn: 334174	2018-06-07 08:48:36 +00:00
Craig Topper	b92c77d176	[X86] Add back _mask, _maskz, and _mask3 builtins for some 512-bit fmadd/fmsub/fmaddsub/fmsubadd builtins. Summary: We recently switch to using a selects in the intrinsics header files for FMA instructions. But the 512-bit versions support flavors with rounding mode which must be an Integer Constant Expression. This has forced those intrinsics to be implemented as macros. As it stands now the mask and mask3 intrinsics evaluate one of their macro arguments twice. If that argument itself is another intrinsic macro, we can end up over expanding macros. Or if its something we can CSE later it would show up multiple times when it shouldn't. I tried adding __extension__ around the macro and making it an expression statement and declaring a local variable. But whatever name you choose for the local variable can never be used as the name of an input to the macro in user code. If that happens you would end up with the same name on the LHS and RHS of an assignment after expansion. We might be safe if we use __ in front of the variable names because those names are reserved and user code shouldn't use that, but I wasn't sure I wanted to make that claim. The other option which I've chosen here, is to add back _mask, _maskz, and _mask3 flavors of the builtin which we will expand in CGBuiltin.cpp to replicate the argument as needed and insert any fneg needed on the third operand to make a subtract. The _maskz isn't truly necessary if we have an unmasked version or if we use the masked version with a -1 mask and wrap a select around it. But I've chosen to make things more uniform. I separated out the scalar builtin handling to avoid too many things going on in EmitX86FMAExpr. It was different enough due to the extract and insert that the minor duplication of the CreateCall was probably worth it. Reviewers: tkrupa, RKSimon, spatel, GBuella Reviewed By: tkrupa Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D47724 llvm-svn: 334159	2018-06-07 02:46:02 +00:00
Evandro Menezes	0804523bd5	[PATCH 2/2] [test] Add support for Samsung Exynos M4 (NFC) Add test cases for Exynos M4. llvm-svn: 334116	2018-06-06 18:58:01 +00:00
Reid Kleckner	11c99ed05f	[MS][ARM64]: Promote _setjmp to_setjmpex as there is no _setjmp in the ARM64 libvcruntime.lib Factor out the common setjmp call emission code. Based on a patch by Chris January Differential Revision: https://reviews.llvm.org/D47784 llvm-svn: 334112	2018-06-06 18:39:47 +00:00
Reid Kleckner	368d52b7e0	Implement bittest intrinsics generically for non-x86 platforms I tested these locally on an x86 machine by disabling the inline asm codepath and confirming that it does the same bitflips as we do with the inline asm. Addresses code review feedback. llvm-svn: 334059	2018-06-06 01:35:08 +00:00
Craig Topper	f3914b74c1	[X86] Add builtins for vector element insert and extract for different 128 and 256 bit vector types. Use them to implement the extract and insert intrinsics. Previously we were just using extended vector operations in the header file. This unfortunately allowed non-constant indices to be used with the intrinsics. This is incompatible with gcc, icc, and MSVC. It also introduces a different performance characteristic because non-constant index gets lowered to a vector store and an element sized load. By adding the builtins we can check for the index to be a constant and ensure its in range of the vector element count. User code still has the option to use extended vector operations themselves if they need non-constant indexing. llvm-svn: 334057	2018-06-06 00:24:55 +00:00
Reid Kleckner	1d9c249db5	Reimplement the bittest intrinsic family as builtins with inline asm We need to implement _interlockedbittestandset as a builtin for windows.h, so we might as well do the whole family. It reduces code duplication anyway. Fixes PR33188, a long standing bug in our bittest implementation encountered by Chakra. llvm-svn: 333978	2018-06-05 01:33:40 +00:00
Teresa Johnson	e2e630b01b	[ThinLTO] Add testing of new summary index format to a couple CFI tests Summary: Adds testing of combined index summary entries in disassembly format to CFI tests that were already testing the bitcode format. Depends on D46699. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, cfe-commits Differential Revision: https://reviews.llvm.org/D46700 llvm-svn: 333966	2018-06-04 23:05:24 +00:00
Reid Kleckner	89fbd55145	Revert r333791 "Cap "voluntary" vector alignment at 16 for all Darwin platforms." Adding __attribute__((aligned(32))) to __m256 breaks the implementation of _mm256_loadu_ps on Windows. On Windows, alignment attributes have higher precedence than packing attributes. We also might want to carefully consider the consequences of changing our vector typedefs, since many users copy them and invent their own new, non-Intel specific vector type names. llvm-svn: 333958	2018-06-04 21:39:20 +00:00
Craig Topper	95ed88a1a9	[X86] Avoid passing _mm_undefined* to builtin_shufflevector if we are able to pass the first input a second time. This is more consistent with other usages of builtin_shufflevector. Later optimization passes or codegen will detect the duplicate vector and replace it with undef. Using _mm_undefined just puts a zeroinitializer that still needs to be optimized out later. llvm-svn: 333944	2018-06-04 19:28:09 +00:00
Craig Topper	6fb26f93ef	[X86] Replace __builtin_ia32_vbroadcastf128_pd256 and __builtin_ia32_vbroadcastf128_ps256 with an unaligned load intrinsics and a __builtin_shufflevector call. llvm-svn: 333853	2018-06-03 19:42:59 +00:00
Ivan A. Kosarev	9c40c0ad0c	[NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part) We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47121 llvm-svn: 333829	2018-06-02 17:42:59 +00:00
John McCall	280c656031	Cap "voluntary" vector alignment at 16 for all Darwin platforms. This fixes two major problems: - We were not capping vector alignment as desired on 32-bit ARM. - We were using different alignments based on the AVX settings on Intel, so we did not have a consistent ABI. This is an ABI break, but we think we can get away with it because vectors tend to be used mostly in inline code (which is why not having a consistent ABI has not proven disastrous on Intel). Intel's AVX types are specified as having 32-byte / 64-byte alignment, so align them explicitly instead of relying on the base ABI rule. Note that this sort of attribute is stripped from template arguments in template substitution, so there's a possibility that code templated over vectors will produce inadequately-aligned objects. The right long-term solution for this is for alignment attributes to be interpreted as true qualifiers and thus preserved in the canonical type. llvm-svn: 333791	2018-06-01 21:34:26 +00:00
Dan Gohman	9f8ee03772	[WebAssembly] Update to the new names for the memory builtin functions. The WebAssembly committee has decided on the names `memory.size` and `memory.grow` for the memory intrinsics, so update the clang builtin functions to follow those names, keeping both sets of old names in place for compatibility. llvm-svn: 333712	2018-06-01 00:05:51 +00:00
Peter Collingbourne	3aa30e8062	IRGen: Write .dwo files when -split-dwarf-file is used together with -fthinlto-index. Differential Revision: https://reviews.llvm.org/D47597 llvm-svn: 333677	2018-05-31 18:25:59 +00:00
Craig Topper	a6dd2faaea	[X86] Make 512-bit unmasked load/store builtins more like their 128/256-bit equivalents. Previously we were just passing -1 mask to the masked builtin. This changes it to the more generic way that the 128/256 bit use. llvm-svn: 333626	2018-05-31 05:02:08 +00:00
Craig Topper	c633867944	[X86] Remove __extension__ from macro intrinsics when its not needed. I think this is a holdover from when we used to declare variables inside the macros. And then its been copy and pasted forward for years every time a new macro intrinsic gets added. Interestingly this caused some tests for IRGen to be slightly more optimized. We now return a zeroinitializer directly instead of going through a store+load. It also removed a bogus error message on another test. llvm-svn: 333613	2018-05-31 00:51:20 +00:00
Eric Christopher	5b91350b4a	Add fopen to the list of builtins that we check and whitelist. llvm-svn: 333594	2018-05-30 21:11:45 +00:00
Craig Topper	c5ec55e921	[X86] Simplify the implementation of _mm_sqrt_ss, _mm_rcp_ss, and _mm_rsqrt_ss. We don't need the insertion back into the original vector at the end. The builtin already understands that. This is different than _mm_sqrt_sd which takes two arguments and we do need to insert. llvm-svn: 333572	2018-05-30 18:27:07 +00:00
Craig Topper	dff5b311af	[X86] Reduce the number of setzero intrinsics to just the set defined by the Intel Intrinsics Guide. We had quite a few for different element sizes of integers sometimes with strange target features attached to them. We only need a single version for each of _m128i, _m256i, and _m512i with the target feature that first introduced those types. llvm-svn: 333568	2018-05-30 18:02:11 +00:00
Gabor Buella	70d8d51073	[X86] Lowering FMA intrinsics to native IR (Clang part) This patch replaces all packed (and scalar without rounding mode) fused intrinsics with fmadd/fmaddsub variations. Then fmadd/fmaddsub are lowered to native IR. Patch by tkrupa Reviewers: craig.topper, sroland, spatel, RKSimon Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D47444 llvm-svn: 333555	2018-05-30 15:27:49 +00:00
Simon Tatham	89e31fa7fc	Support __iso_volatile_load8 etc on aarch64-win32. These intrinsics are used by MSVC's header files on AArch64 Windows as well as AArch32, so we should support them for both targets. I've factored them out of CodeGenFunction::EmitARMBuiltinExpr into separate functions that EmitAArch64BuiltinExpr can call as well. Reviewers: javed.absar, mstorsjo Reviewed By: mstorsjo Subscribers: kristof.beyls, cfe-commits Differential Revision: https://reviews.llvm.org/D47476 llvm-svn: 333513	2018-05-30 07:54:05 +00:00
Daniel Cederman	8cc53aecad	[Sparc] Add floating-point register names Reviewers: jyknight Reviewed By: jyknight Subscribers: eraman, fedor.sergeev, jrtc27, cfe-commits Differential Revision: https://reviews.llvm.org/D47137 llvm-svn: 333510	2018-05-30 06:02:18 +00:00
Craig Topper	f6e79c6d3f	[X86] Remove masking from the AVX512VNNI builtins. Use a select in IR instead. llvm-svn: 333509	2018-05-30 05:26:04 +00:00
Craig Topper	3d9305f28b	[X86] Fix the names of a bunch of icelake intrinsics. Mostly this fixes the names of all the 128-bit intrinsics to start with _mm_ instead of _mm128_ as is the convention and what the Intel docs say. This also fixes the name of the bitshuffle intrinsics to say epi64 for 128 and 256 bit versions. llvm-svn: 333497	2018-05-30 03:38:15 +00:00
Craig Topper	68a272d501	[X86] Merge the 3 different flavors of masked vpermi2var/vpermt2var builtins to a single version without masking. Use select builtins with appropriate operand instead. llvm-svn: 333387	2018-05-29 03:26:38 +00:00
Craig Topper	f99532faee	Revert r333347 "[X86] Rewrite the max and min reduction intrinsics to make better use of other functions and to reduce width to 256 and 128 bits were possible." This wasn't supposed to be commited yet. llvm-svn: 333349	2018-05-26 18:57:41 +00:00
Craig Topper	387b1423db	[X86] Remove mask from avx512ifma builtins. Use a select instruction instead. This reduces from 12 builtins to 6 since we no longer need a mask and maskz version. llvm-svn: 333348	2018-05-26 18:55:26 +00:00
Craig Topper	e091523c82	[X86] Rewrite the max and min reduction intrinsics to make better use of other functions and to reduce width to 256 and 128 bits were possible. Summary: We only need to use 512 bit vectors all the way through v8i64 reductions since those max instructions are new to avx512f and only available in 512 bits until SKX. For v16i32 and floating point we have legacy 128/256 bit instructions we can use. I've tried to use other intrinsics to reduce the verbosity of the code and avoid having to mention all the shuffles. I've also removed all the -1 shuffle indices so the output sequence is fully specified and not left to backend optimization. Reviewers: RKSimon, spatel, GBuella Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D47401 llvm-svn: 333347	2018-05-26 18:55:24 +00:00
Paul Robinson	76178632a2	Revert "[DebugInfo] Don't bother with MD5 checksums of preprocessed files." This reverts commit d734f2aa3f76fbf355ecd2bbe081d0c1f49867ab. Also known as r333311. A very small but nonzero number of bots fail. llvm-svn: 333319	2018-05-25 22:35:59 +00:00
Paul Robinson	638d606f83	[DebugInfo] Don't bother with MD5 checksums of preprocessed files. The checksum will not reflect the real source, so there's no clear reason to include them in the debug info. Also this was causing a crash on the DWARF side. Differential Revision: https://reviews.llvm.org/D47260 llvm-svn: 333311	2018-05-25 20:59:29 +00:00
Gabor Buella	078bb99a90	[x86] invpcid intrinsic An intrinsic for an old instruction, as described in the Intel SDM. Reviewers: craig.topper, rnk Reviewed By: craig.topper, rnk Differential Revision: https://reviews.llvm.org/D47142 llvm-svn: 333256	2018-05-25 06:34:42 +00:00
Gabor Buella	5219ed89be	[X86] NFC Include immintrin.h in CodeGen tests Following r333110: "Move all Intel defined intrinsic includes into immintrin.h" llvm-svn: 333160	2018-05-24 07:09:08 +00:00
Eric Christopher	85f0e505e7	Add Builtins.def support for fread and fwrite to ensure that -fno-builtin- works with them and test accordingly. llvm-svn: 333156	2018-05-24 06:09:28 +00:00
Eric Christopher	c27ad9bbc8	Migrate libcalls-fno-builtin.c test from checking optimized assembly to checking for attributes on the call site - and fix up builtin functions that we were testing for but not ensuring wouldn't be optimized by the backend. Leave one set of asm tests to make sure that we're also communicating builtin-ness to TLI. llvm-svn: 333154	2018-05-24 06:00:50 +00:00
Richard Smith	3e268632cf	Use zeroinitializer for (trailing zero portion of) large array initializers more reliably. This re-commits r333044 with a fix for PR37560. llvm-svn: 333141	2018-05-23 23:41:38 +00:00
Hans Wennborg	156349fa10	Revert r333044 "Use zeroinitializer for (trailing zero portion of) large array initializers" It caused asserts, see PR37560. > Use zeroinitializer for (trailing zero portion of) large array initializers > more reliably. > > Clang has two different ways it emits array constants (from InitListExprs and > from APValues), and both had some ability to emit zeroinitializer, but neither > was able to catch all cases where we could use zeroinitializer reliably. In > particular, emitting from an APValue would fail to notice if all the explicit > array elements happened to be zero. In addition, for large arrays where only an > initial portion has an explicit initializer, we would emit the complete > initializer (which could be huge) rather than emitting only the non-zero > portion. With this change, when the element would have a suffix of more than 8 > zero elements, we emit the array constant as a packed struct of its initial > portion followed by a zeroinitializer constant for the trailing zero portion. > > In passing, I found a bug where SemaInit would sometimes walk the entire array > when checking an initializer that only covers the first few elements; that's > fixed here to unblock testing of the rest. > > Differential Revision: https://reviews.llvm.org/D47166 llvm-svn: 333067	2018-05-23 08:24:01 +00:00
Craig Topper	39e0347e6a	[X86] In the floating point max reduction intrinsics, negate infinity before feeding it to set1. Previously we negated the whole vector after splatting infinity. But its better to negate the infinity before splatting. This generates IR with the negate already folded with the infinity constant. llvm-svn: 333062	2018-05-23 05:51:52 +00:00
Craig Topper	f2043b08b4	[X86] Remove mask argument from more builtins that are handled completely in CGBuiltin.cpp. Just wrap a select builtin around them in the header file instead. llvm-svn: 333061	2018-05-23 04:51:54 +00:00
Richard Smith	9062bbf419	Use zeroinitializer for (trailing zero portion of) large array initializers more reliably. Clang has two different ways it emits array constants (from InitListExprs and from APValues), and both had some ability to emit zeroinitializer, but neither was able to catch all cases where we could use zeroinitializer reliably. In particular, emitting from an APValue would fail to notice if all the explicit array elements happened to be zero. In addition, for large arrays where only an initial portion has an explicit initializer, we would emit the complete initializer (which could be huge) rather than emitting only the non-zero portion. With this change, when the element would have a suffix of more than 8 zero elements, we emit the array constant as a packed struct of its initial portion followed by a zeroinitializer constant for the trailing zero portion. In passing, I found a bug where SemaInit would sometimes walk the entire array when checking an initializer that only covers the first few elements; that's fixed here to unblock testing of the rest. Differential Revision: https://reviews.llvm.org/D47166 llvm-svn: 333044	2018-05-23 00:09:29 +00:00
Sanjay Patel	74c7fb002f	[CodeGen] use nsw negation for builtin abs The clang builtins have the same semantics as the stdlib functions. The stdlib functions are defined in section 7.20.6.1 of the C standard with: "If the result cannot be represented, the behavior is undefined." That lets us mark the negation with 'nsw' because "sub i32 0, INT_MIN" would be UB/poison. Differential Revision: https://reviews.llvm.org/D47202 llvm-svn: 333038	2018-05-22 23:02:13 +00:00
Peter Collingbourne	91d02844a3	Reland r332885, "CodeGen, Driver: Start using direct split dwarf emission in clang." As well as two follow-on commits r332906, r332911 with a fix for test clang/test/CodeGen/split-debug-filename.c. llvm-svn: 333013	2018-05-22 18:52:37 +00:00
Sanjay Patel	1ff6b27940	[CodeGen] produce the LLVM canonical form of abs We chose the 'slt' form as canonical in IR with: rL332819 ...so we should generate that form directly for efficiency. llvm-svn: 332989	2018-05-22 15:36:50 +00:00
Sanjay Patel	b6e5d4ead1	[CodeGen] add tests for abs builtins; NFC llvm-svn: 332988	2018-05-22 15:11:59 +00:00
Brock Wyma	8557ec5d64	[CodeView] Enable debugging of captured variables within C++ lambdas This change will help Visual Studio resolve forward references to C++ lambda routines used by captured variables. Differential Revision: https://reviews.llvm.org/D45438 llvm-svn: 332975	2018-05-22 12:41:19 +00:00
Amara Emerson	f528bcc32a	Revert "CodeGen, Driver: Start using direct split dwarf emission in clang." This reverts commit r332885 as it broke several greendragon buildbots. llvm-svn: 332973	2018-05-22 11:18:58 +00:00
Craig Topper	9efb77e25f	[X86] Remove a builtin that should have been removed in r332882. llvm-svn: 332909	2018-05-21 22:10:02 +00:00
Craig Topper	288bd2e5a0	[X86] Remove masking from pternlog llvm intrinsics and use a select instruction instead. Because the intrinsics in the headers are implemented as macros, we can't just use a select builtin and pternlog builtin. This would require one of the macro arguments to be used twice. Depending on what was passed to the macro we could expand an expression twice leading to weird behavior. We could maybe declare our local variable in the macro, but that would need to worry about name collisions. To avoid that just generate IR directly in CGBuiltin.cpp. Differential Revision: https://reviews.llvm.org/D47125 llvm-svn: 332891	2018-05-21 20:58:23 +00:00
Richard Smith	3f1d6de4f7	Revert r332847; it caused us to miscompile certain forms of reference initialization. llvm-svn: 332886	2018-05-21 20:36:58 +00:00
Peter Collingbourne	47bc01786d	CodeGen, Driver: Start using direct split dwarf emission in clang. Fixes PR37466. Differential Revision: https://reviews.llvm.org/D47093 llvm-svn: 332885	2018-05-21 20:31:59 +00:00
Craig Topper	842171de36	[X86] Use __builtin_convertvector to implement some of the packed integer to packed float conversion intrinsics. I believe this is safe assuming default default FP environment. The conversion might be inexact, but it can never overflow the FP type so this shouldn't be undefined behavior for the uitofp/sitofp instructions. We already do something similar for scalar conversions. Differential Revision: https://reviews.llvm.org/D46863 llvm-svn: 332882	2018-05-21 20:19:17 +00:00
Serge Pavlov	9f8068420a	[CodeGen] Recognize more cases of zero initialization If a variable has an initializer, codegen tries to build its value. If the variable is large in size, building its value requires substantial resources. It causes strange behavior from user viewpoint: compilation of huge zero initialized arrays like: char data_1[2147483648u] = { 0 }; consumes enormous amount of time and memory. With this change codegen tries to determine if variable initializer is equivalent to zero initializer. In this case variable value is not constructed. This change fixes PR18978. Differential Revision: https://reviews.llvm.org/D46241 llvm-svn: 332847	2018-05-21 16:09:54 +00:00
Craig Topper	55b4067350	[X86] Remove mask arguments from permvar builtins/intrinsics. Use a select in IR instead. Someday maybe we'll use selects for all the builtins. llvm-svn: 332825	2018-05-20 23:34:10 +00:00
Alexander Ivchenko	0fb8c877c4	This patch aims to match the changes introduced in gcc by https://gcc.gnu.org/ml/gcc-cvs/2018-04/msg00534.html. The -mibt feature flag is being removed, and the -fcf-protection option now also defines a CET macro and causes errors when used on non-X86 targets, while X86 targets no longer check for -mibt and -mshstk to determine if -fcf-protection is supported. -mshstk is now used only to determine availability of shadow stack intrinsics. Comes with an LLVM patch (D46882). Patch by mike.dvoretsky Differential Revision: https://reviews.llvm.org/D46881 llvm-svn: 332704	2018-05-18 11:56:21 +00:00
Peter Smith	84a9c481f5	[AArch64] Correct inline assembly test case for S modifier [NFC] The existing test for the AArch64 inline assembly constraint S uses the A and L modifiers. These modifiers were implemented in the original AArch64 backend but were not carried forward to the merged backend. The A is associated with ADRP and does nothing, the L is associated with :lo12: . Given that A and L are not supported by GCC and not supported by the new implementation of constraint S in LLVM (see D46745) I've altered the test to put :lo12: directly in the string so that A and L are not needed. Differential Revision: https://reviews.llvm.org/D46932 llvm-svn: 332606	2018-05-17 13:17:33 +00:00
Craig Topper	9d146bbaf7	[X86] Revert part of r332266: Use __builtin_convertvector to replace some of the avx512 truncate builtins. The masking doesn't work right in the backend for the ones that produce byte or word elements without avx512bw. llvm-svn: 332322	2018-05-15 03:17:52 +00:00
Craig Topper	25de41cfbc	[X86] Use __builtin_convertvector to replace some of the avx512 truncate builtins. As long as the destination type is a 256 or 128 bit vector with the same number of elements we can use __builtin_convertvector to directly generate trunc IR instruction which will be handled natively by the backend. Differential Revision: https://reviews.llvm.org/D46742 llvm-svn: 332266	2018-05-14 17:50:40 +00:00
Craig Topper	8cb261e353	[X86] Use select instrution and fpextend in the implementation of _mm512_mask_cvtps_pd and _mm512_maskz_cvtps_pd. llvm-svn: 332213	2018-05-14 04:57:46 +00:00
Craig Topper	daaf105f86	[X86] Use __builtin_convertvector to implement _mm512_cvtps_pd. If we're using default rounding mode we can let __builtin_convertvector to generate an fpextend. This matches 128 and 256 bit. If we're using the version that takes an explicit rounding mode argument we would need to look at the immediate to see if its CUR_DIRECTION. llvm-svn: 332210	2018-05-14 04:05:06 +00:00
Craig Topper	6fa91254e4	[X86] Emit better code for _mm_cvtu32_sd, _mm_cvtu64_sd, _mm_cvtu32_ss, and _mm_cvtu64_ss. We can use direct C code for these that will use uitofp and insertelement instructions. For the versions that take an explicit rounding mode we can't do this. llvm-svn: 332203	2018-05-13 23:03:30 +00:00
Elena Demikhovsky	d31327d505	Added atomic_fetch_min, max, umin, umax intrinsics to clang. These intrinsics work exactly as all other atomic_fetch_* intrinsics and allow to create atomicrmw with ordering. Updated the clang-extensions document. Differential Revision: https://reviews.llvm.org/D46386 llvm-svn: 332193	2018-05-13 07:45:58 +00:00
Krzysztof Parzyszek	458506871a	[Hexagon] Implement checking arguments of builtin calls llvm-svn: 332105	2018-05-11 16:41:51 +00:00
Gabor Buella	9cd4f16601	[X86] Assume alignment of movdir64b dst argument Reviewers: craig.topper Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D46683 llvm-svn: 332091	2018-05-11 14:22:04 +00:00
Gabor Buella	3a7571259e	[X86] ptwrite intrinsic Reviewers: craig.topper, RKSimon Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D46540 llvm-svn: 331962	2018-05-10 07:28:54 +00:00
Craig Topper	74ac0eda68	[X86] Change the implementation of scalar masked load/store intrinsics to not use a 512-bit intermediate vector. This is unnecessary for AVX512VL supporting CPUs like SKX. We can just emit a 128-bit masked load/store here no matter what. The backend will widen it to 512-bits on KNL CPUs. Fixes the frontend portion of PR37386. Need to fix the backend to optimize the new sequences well. llvm-svn: 331958	2018-05-10 05:43:43 +00:00
Craig Topper	2b248849ae	[Builtins] Improve the IR emitted for MSVC compatible rotr/rotl builtins to match what the middle and backends understand Previously we emitted something like rotl(x, n) { n &= bitwidth-1; return n != 0 ? ((x << n) \| (x >> (bitwidth - n)) : x; } We use a select to avoid the undefined behavior on the (bitwidth - n) shift. The middle and backend don't really recognize this as a rotate and end up emitting a cmov or control flow because of the select. A better pattern is (x << (n & mask)) \| (x << (-n & mask)) where mask is bitwidth - 1. Fixes the main complaint in PR37387. There's still some work to be done if the user writes that sequence directly on a short or char where type promotion rules can prevent it from being recognized. The builtin is emitting direct IR with unpromoted types so that isn't a problem for it. Differential Revision: https://reviews.llvm.org/D46656 llvm-svn: 331943	2018-05-10 00:05:13 +00:00
Manoj Gupta	4fbf84c173	[Clang] Implement function attribute no_stack_protector. Summary: This attribute tells clang to skip this function from stack protector when -stack-protector option is passed. GCC option for this is: __attribute__((__optimize__("no-stack-protector"))) and the equivalent clang syntax would be: __attribute__((no_stack_protector)) This is used in Linux kernel to selectively disable stack protector in certain functions. Reviewers: aaron.ballman, rsmith, rnk, probinson Reviewed By: aaron.ballman Subscribers: probinson, srhines, cfe-commits Differential Revision: https://reviews.llvm.org/D46300 llvm-svn: 331925	2018-05-09 21:41:18 +00:00
Hans Wennborg	ef2f6948be	Revert r331843 "[DebugInfo] Generate debug information for labels." It broke the Chromium build (see reply on the review). > Generate DILabel metadata and call llvm.dbg.label after label > statement to associate the metadata with the label. > > Differential Revision: https://reviews.llvm.org/D45045 > > Patch by Hsiangkai Wang. This doesn't revert the change to backend-unsupported-error.ll that seems to correspond to an llvm-side change. llvm-svn: 331861	2018-05-09 09:29:58 +00:00
JF Bastien	801fca259e	_Atomic of empty struct shouldn't assert Summary: An _Atomic of an empty struct is pretty silly. In general we just widen empty structs to hold a byte's worth of storage, and we represent size and alignment as 0 internally and let LLVM figure out what to do. For _Atomic it's a bit different: the memory model mandates concrete effects occur when atomic operations occur, so in most cases actual instructions need to get emitted. It's really not worth trying to optimize empty struct atomics by figuring out e.g. that a fence would do, even though sane compilers should do optimize atomics. Further, wg21.link/p0528 will fix C++20 atomics with padding bits so that cmpxchg on them works, which means that we'll likely need to do the zero-init song and dance for empty atomic structs anyways (and I think we shouldn't special-case this behavior to C++20 because prior standards are just broken). This patch therefore makes a minor change to r176658 "Promote atomic type sizes up to a power of two": if the width of the atomic's value type is 0, just use 1 byte for width and leave alignment as-is (since it should never be zero, and over-aligned zero-width structs are weird but fine). This fixes an assertion: (NumBits >= MIN_INT_BITS && "bitwidth too small"), function get, file ../lib/IR/Type.cpp, line 241. It seems like this has run into other assertions before (namely the unreachable Kind check in ImpCastExprToType), but I haven't reproduced that issue with tip-of-tree. <rdar://problem/39678063> Reviewers: arphaman, rjmccall Subscribers: aheejin, cfe-commits Differential Revision: https://reviews.llvm.org/D46613 llvm-svn: 331845	2018-05-09 03:51:12 +00:00
Shiva Chen	667fbe2cb0	[DebugInfo] Generate debug information for labels. Generate DILabel metadata and call llvm.dbg.label after label statement to associate the metadata with the label. Differential Revision: https://reviews.llvm.org/D45045 Patch by Hsiangkai Wang. llvm-svn: 331843	2018-05-09 02:41:56 +00:00
Craig Topper	45fc2c83e6	[X86] Use target feature defines in tests instead of defining our own flag on the command line. NFCI llvm-svn: 331683	2018-05-07 21:47:13 +00:00
Teresa Johnson	fedd39045f	Add -target to address errors in test from r331592 The error turns out to be: Assertion failed: (Target.isCompatibleDataLayout(getDataLayout()) && "Can't create a MachineFunction using a Module with a " "Target-incompatible DataLayout attached\n"), function init, file /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm/lib/CodeGen/MachineFunction.cpp, line 180. Add -target to address this. Also re-enable the test I had temporarily commented, and move it further down in case there is still a failure (since it pipes stderr to FileCheck). llvm-svn: 331597	2018-05-05 16:37:31 +00:00
Teresa Johnson	1237b3acc9	Skip part of test added in r331592 to help debug bot failures Trying to debug why/where a few bots getting exit code 256 e.g. http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/48471/testReport/Clang/CodeGen/thinlto_diagnostic_handler_remarks_with_hotness_ll/ and a few windows bots getting no output from that RUN line e.g. http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/11865/steps/ninja%20check%201/logs/FAIL%3A%20Clang%3A%3Athinlto-diagnostic-handler-remarks-with-hotness.ll llvm-svn: 331596	2018-05-05 15:54:57 +00:00
Teresa Johnson	259f8ddff5	Add required target to address bot failures from r331592 Failing on non-x86 bots, needs x86 target for code gen. llvm-svn: 331593	2018-05-05 15:15:04 +00:00
Teresa Johnson	66744f8137	[ThinLTO] Support opt remarks options with distributed ThinLTO backends Summary: Passes down the necessary code ge options to the LTO Config to enable -fdiagnostics-show-hotness and -fsave-optimization-record in the ThinLTO backend for a distributed build. Also, remove warning about not having PGO when the input is IR. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, cfe-commits Differential Revision: https://reviews.llvm.org/D46464 llvm-svn: 331592	2018-05-05 14:37:29 +00:00
Chandler Carruth	9325c38fdb	[gcov] Make the CLang side coverage test work with the new instrumentation codegeneration strategy of using a data structure and a loop. Required some finesse to get the critical things being tested to surface in a nice way for FileCheck but I think this preserves the original intent of the test. llvm-svn: 331411	2018-05-02 22:57:20 +00:00
Shoaib Meenai	c4cf3daad8	[ARM] Remove redundant #if in test. NFC Both sides of this #if #include the same file. Drop the #if, leaving only the #include. Patch by Matt Glazar. Differential Revision: https://reviews.llvm.org/D45779 llvm-svn: 331305	2018-05-01 20:38:05 +00:00
Danil Malyshev	cd3fd82da3	Update existed CodeGen TBAA tests Reviewers: hfinkel, kosarev, rjmccall Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D44616 llvm-svn: 331292	2018-05-01 18:14:36 +00:00
Gabor Buella	a51e0c2243	[X86] directstore and movdir64b intrinsics Reviewers: spatel, craig.topper, RKSimon Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45984 llvm-svn: 331249	2018-05-01 10:05:42 +00:00
Nirav Dave	6c0665e221	[MC] Change AsmParser to leverage Assembler during evaluation Teach AsmParser to check with Assembler for when evaluating constant expressions. This improves the handing of preprocessor expressions that must be resolved at parse time. This idiom can be found as assembling-time assertion checks in source-level assemblers. Note that this relies on the MCStreamer to keep sufficient tabs on Section / Fragment information which the MCAsmStreamer does not. As a result the textual output may fail where the equivalent object generation would pass. This can most easily be resolved by folding the MCAsmStreamer and MCObjectStreamer together which is planned for in a separate patch. Currently, this feature is only enabled for assembly input, keeping IR compilation consistent between assembly and object generation. Reviewers: echristo, rnk, probinson, espindola, peter.smith Reviewed By: peter.smith Subscribers: eraman, peter.smith, arichardson, jyknight, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45164 llvm-svn: 331218	2018-04-30 19:22:40 +00:00
Sanjay Patel	c81450e29b	[Driver, CodeGen] rename options to disable an FP cast optimization As suggested in the post-commit thread for rL331056, we should match these clang options with the established vocabulary of the corresponding sanitizer option. Also, the use of 'strict' is well-known for these kinds of knobs, and we can improve the descriptive text in the docs. So this intends to match the logic of D46135 but only change the words. Matching LLVM commit to match this spelling of the attribute to follow shortly. Differential Revision: https://reviews.llvm.org/D46236 llvm-svn: 331209	2018-04-30 18:19:03 +00:00
Sanjay Patel	d175476566	[Driver, CodeGen] add options to enable/disable an FP cast optimization As discussed in the post-commit thread for: rL330437 ( http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180423/545906.html ) We need a way to opt-out of a float-to-int-to-float cast optimization because too much existing code relies on the platform-specific undefined result of those casts when the float-to-int overflows. The LLVM changes associated with adding this function attribute are here: rL330947 rL330950 rL330951 Also as suggested, I changed the LLVM doc to mention the specific sanitizer flag that catches this problem: rL330958 Differential Revision: https://reviews.llvm.org/D46135 llvm-svn: 331041	2018-04-27 14:22:48 +00:00
Oliver Stannard	2fcee8bd52	[ARM,AArch64] Add intrinsics for dot product instructions The ACLE spec which describes these intrinsics hasn't been published yet, but this is based on the final draft which will be published soon, and these have already been implemented by GCC. Differential revision: https://reviews.llvm.org/D46109 llvm-svn: 331039	2018-04-27 14:03:32 +00:00
Chandler Carruth	16429acacb	[x86] Revert r330322 (& r330323): Lowering x86 adds/addus/subs/subus intrinsics The LLVM commit introduces a crash in LLVM's instruction selection. I filed http://llvm.org/PR37260 with the test case. llvm-svn: 330997	2018-04-26 21:46:01 +00:00
Craig Topper	e95bde33df	[X86] Add support for _mm512_mullox_epi64 and _mm512_mask_mullox_epi64 intrinsics to match icc. On AVX512F targets we'll produce an emulated sequence using 3 pmuludqs with shifts and adds. On AVX512DQ we'll use vpmulld. Fixes PR37140. llvm-svn: 330923	2018-04-26 05:38:39 +00:00
Eli Friedman	e54d0ff400	[TargetInfo] Sort target features before passing them to the backend Passing the features in random order will lead to unpredictable results when some of the features are related (like the architecture-version features on ARM). It might be possible to fix this particular case in the ARM target code, to avoid adding overlapping target features. But we should probably be sorting in any case: the behavior shouldn't depend on StringMap's hashing algorithm. Differential Revision: https://reviews.llvm.org/D46030 llvm-svn: 330861	2018-04-25 19:14:05 +00:00
Paul Semel	80daae2736	add check for long double for __builtin_dump_struct llvm-svn: 330808	2018-04-25 10:09:20 +00:00
Craig Topper	ce281a41b5	[X86] Remove '#ifdef __x86_64__' around mask_set1_epi64 intrinsics. The unmasked versions already didn't have this restrction. I don't think gcc or icc limit these to 64-bit mode so we shouldn't either. llvm-svn: 330681	2018-04-24 03:36:08 +00:00
Mikhail Maltsev	4a4e7a31ad	[CodeGen] Reland r330442: Add an option to suppress output of llvm.ident The test case in the original patch was overly contrained and failed on PPC targets. llvm-svn: 330575	2018-04-23 10:08:46 +00:00
Tim Northover	9dc1d0c74e	[Atomics] warn about atomic accesses using libcalls If an atomic variable is misaligned (and that suspicion is why Clang emits libcalls at all) the runtime support library will have to use a lock to safely access it, with potentially very bad performance consequences. There's a very good chance this is unintentional so it makes sense to issue a warning. Also give it a named group so people can promote it to an error, or disable it if they really don't care. llvm-svn: 330566	2018-04-23 08:16:24 +00:00
Gabor Buella	eba6c42e66	[X86] WaitPKG intrinsics Reviewers: craig.topper, zvi Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45254 llvm-svn: 330463	2018-04-20 18:44:33 +00:00
Mikhail Maltsev	42b2a0e162	Revert r330442, CodeGen/no-ident-version.c is failing on PPC llvm-svn: 330451	2018-04-20 17:14:39 +00:00
Mikhail Maltsev	6550c13912	[CodeGen] Add an option to suppress output of llvm.ident Summary: By default Clang outputs its version (including git commit hash, in case of trunk builds) into object and assembly files. It might be useful to have an option to disable this, especially for debugging purposes. This patch implements new command line flags -Qn and -Qy (the names are chosen for compatibility with GCC). -Qn disables output of the 'llvm.ident' metadata string and the 'producer' debug info. -Qy (enabled by default) does the opposite. Reviewers: faisalv, echristo, aprantl Reviewed By: aprantl Subscribers: aprantl, cfe-commits, JDevlieghere, rogfer01 Differential Revision: https://reviews.llvm.org/D45255 llvm-svn: 330442	2018-04-20 16:29:03 +00:00
Hans Wennborg	a417362c28	Fix some tests that were failing on Windows llvm-svn: 330441	2018-04-20 15:33:44 +00:00
Saleem Abdulrasool	3fe5b7a497	Implement proper support for `-falign-functions` This implements support for the previously ignored flag `-falign-functions`. This allows the frontend to request alignment on function definitions in the translation unit where they are not explicitly requested in code. This is compatible with the GCC behaviour and the ICC behaviour. The scalar value passed to `-falign-functions` aligns functions to a power-of-two boundary. If flag is used, the functions are aligned to 16-byte boundaries. If the scalar is specified, it must be an integer less than or equal to 4096. If the value is not a power-of-two, the driver will round it up to the nearest power of two. llvm-svn: 330378	2018-04-19 23:14:57 +00:00
Ivan A. Kosarev	9b20c245ca	[NEON] Define vfma_n_f32() and vfmaq_n_f32() intrinsics in AArch32 mode Differential Revision: https://reviews.llvm.org/D45670 llvm-svn: 330336	2018-04-19 15:27:28 +00:00
Erich Keane	b127a39404	Fix __attribute__((force_align_arg_pointer)) misalignment bug The force_align_arg_pointer attribute was using a hardcoded 16-byte alignment value which in combination with -mstack-alignment=32 (or larger) would produce a misaligned stack which could result in crashes when accessing stack buffers using aligned AVX load/store instructions. Fix the issue by using the "stackrealign" function attribute instead of using a hardcoded 16-byte alignment. Patch By: Gramner Differential Revision: https://reviews.llvm.org/D45812 llvm-svn: 330331	2018-04-19 14:27:05 +00:00
Alexander Ivchenko	d96ddccdb4	Lowering x86 adds/addus/subs/subus intrinsics (clang) This is the patch that lowers x86 intrinsics to native IR in order to enable optimizations. Patch by tkrupa Differential Revision: https://reviews.llvm.org/D44786 llvm-svn: 330323	2018-04-19 12:15:11 +00:00
Artem Belevich	0ae8590354	[NVPTX, CUDA] Added support for m8n32k16 and m32n8k16 variants of wmma instructions. The new instructions were added added for sm_70+ GPUs in CUDA-9.1. Differential Revision: https://reviews.llvm.org/D45068 llvm-svn: 330296	2018-04-18 21:51:48 +00:00
Ivan A. Kosarev	1243ebdcdb	Revert r330195 "[NEON] Define vget_high_f16() and vget_low_f16() intrinsics in AArch64 mode only". Differential Revision: https://reviews.llvm.org/D45668 llvm-svn: 330248	2018-04-18 12:02:49 +00:00
Keith Wyss	f437e35671	[XRay] Add clang builtin for xray typed events. Summary: A clang builtin for xray typed events. Differs from __xray_customevent(...) by the presence of a type tag that is vended by compiler-rt in typical usage. This allows xray handlers to expand logged events with their type description and plugins to process traced events based on type. This change depends on D45633 for the intrinsic definition. Reviewers: dberris, pelikan, rnk, eizan Subscribers: cfe-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D45716 llvm-svn: 330220	2018-04-17 21:32:43 +00:00
Teresa Johnson	005aadaa0d	Require shell for test Attempt to fix windows bot which doesn't like the "(cd .." invocation added in r330194: http://lab.llvm.org:8011/builders/clang-with-thin-lto-windows/builds/8704/steps/stage%202%20check/logs/stdio llvm-svn: 330212	2018-04-17 20:36:51 +00:00
Akira Hatanaka	617e26152d	Add a command line option 'fregister_global_dtors_with_atexit' to register destructor functions annotated with __attribute__((destructor)) using __cxa_atexit or atexit. Register destructor functions annotated with __attribute__((destructor)) calling __cxa_atexit in a synthesized constructor function instead of emitting references to the functions in a special section. The primary reason for adding this option is that we are planning to deprecate the __mod_term_funcs section on Darwin in the future. This feature is enabled by default only on Darwin. Users who do not want this can use command line option 'fno_register_global_dtors_with_atexit' to disable it. rdar://problem/33887655 Differential Revision: https://reviews.llvm.org/D45578 llvm-svn: 330199	2018-04-17 18:41:52 +00:00
Ivan A. Kosarev	b3b87c3314	[NEON] Define vget_high_f16() and vget_low_f16() intrinsics in AArch64 mode only Differential Revision: https://reviews.llvm.org/D45668 llvm-svn: 330195	2018-04-17 16:43:07 +00:00
Teresa Johnson	9e4321c12d	[ThinLTO] Pass -save-temps to LTO backend for distributed ThinLTO builds Summary: The clang driver option -save-temps was not passed to the LTO config, so when invoking the ThinLTO backends via clang during distributed builds there was no way to get LTO to save temp files. Getting this to work with ThinLTO distributed builds also required changing the driver to avoid a separate compile step to emit unoptimized bitcode when the input was already bitcode under -save-temps. Not only is this unnecessary in general, it is problematic for ThinLTO backends since the temporary bitcode file to the backend would not match the module path in the combined index, leading to incorrect ThinLTO backend index-based optimizations. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, cfe-commits Differential Revision: https://reviews.llvm.org/D45217 llvm-svn: 330194	2018-04-17 16:39:25 +00:00
Aaron Ballman	fe93546b11	Add modifiers for unsigned char and signed char field printing for __builtin_dump_struct. Patch by Paul Semel. llvm-svn: 330188	2018-04-17 14:00:06 +00:00
Aaron Ballman	b6a7702297	Add checks for format specifiers used by __builtin_dump_struct and added a new specifier for null-terminated constant strings. Patch by Paul Semel. llvm-svn: 330185	2018-04-17 11:57:47 +00:00
Eli Friedman	642a5ee1c1	[ARM] Compute a target feature which corresponds to the ARM version. Currently, the interaction between the triple, the CPU, and the supported features is a mess: the driver edits the triple to indicate the supported architecture version, and the LLVM backend uses this to figure out what instructions are legal. This makes it difficult to understand what's happening, and makes it impossible to LTO together two modules with different computed architectures. Instead of relying on triple rewriting to get the correct target features, we should add the right target features explicitly. Differential Revision: https://reviews.llvm.org/D45240 llvm-svn: 330169	2018-04-16 23:52:58 +00:00
Andrey Konovalov	1ba9d9c6ca	hwasan: add -fsanitize=kernel-hwaddress flag This patch adds -fsanitize=kernel-hwaddress flag, that essentially enables -hwasan-kernel=1 -hwasan-recover=1 -hwasan-match-all-tag=0xff. Differential Revision: https://reviews.llvm.org/D45046 llvm-svn: 330044	2018-04-13 18:05:21 +00:00
Ivan A. Kosarev	9cdb2c75d9	[NEON] Support vrndns_f32 intrinsic Differential Revision: https://reviews.llvm.org/D45515 llvm-svn: 330012	2018-04-13 12:46:02 +00:00
Gabor Buella	b220dd2b6c	[X86] Introduce cldemote intrinsic Reviewers: craig.topper, zvi Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45257 llvm-svn: 329993	2018-04-13 07:37:24 +00:00
Dean Michael Berris	488f7c2b67	[XRay][clang] Add flag to choose instrumentation bundles Summary: This change addresses http://llvm.org/PR36926 by allowing users to pick which instrumentation bundles to use, when instrumenting with XRay. In particular, the flag `-fxray-instrumentation-bundle=` has four valid values: - `all`: the default, emits all instrumentation kinds - `none`: equivalent to -fnoxray-instrument - `function`: emits the entry/exit instrumentation - `custom`: emits the custom event instrumentation These can be combined either as comma-separated values, or as repeated flag values. Reviewers: echristo, kpw, eizan, pelikan Reviewed By: pelikan Subscribers: mgorny, cfe-commits Differential Revision: https://reviews.llvm.org/D44970 llvm-svn: 329985	2018-04-13 02:31:58 +00:00
Eli Friedman	01d349bab1	Remove -cc1 option "-backend-option". It means the same thing as -mllvm; there isn't any reason to have two options which do the same thing. Differential Revision: https://reviews.llvm.org/D45109 llvm-svn: 329965	2018-04-12 22:21:36 +00:00
Gabor Buella	e708a09e21	[X86] Introduce wbinvd intrinsic A previously missing intrinsic for an old instruction. Reviewers: craig.topper, echristo Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45311 llvm-svn: 329937	2018-04-12 18:42:02 +00:00
Gabor Buella	a052016ef2	[x86] wbnoinvd intrinsic The WBNOINVD instruction writes back all modified cache lines in the processor’s internal cache to main memory but does not invalidate (flush) the internal caches. Reviewers: craig.topper, zvi, ashlykov Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D43817 llvm-svn: 329848	2018-04-11 20:09:09 +00:00
Shoaib Meenai	34aa13169b	[CodeGen] Handle __func__ inside __finally When we enter a __finally block, the CGF's CurCodeDecl will be null (because CodeGenFunction::StartFunction is given an empty GlobalDecl for a __finally block), and so the dyn_cast here will result in an assertion failure. Change it to dyn_cast_or_null to handle this case. Differential Revision: https://reviews.llvm.org/D45523 llvm-svn: 329836	2018-04-11 18:17:35 +00:00
Artem Belevich	24e8a680e5	[NVPTX, CUDA] Improved feature constraints on NVPTX target builtins. When NVPTX TARGET_BUILTIN specifies sm_XX or ptxYY as required feature, consider those features available if we're compiling for GPU >= sm_XX or have enabled PTX version >= ptxYY. Differential Revision: https://reviews.llvm.org/D45061 llvm-svn: 329829	2018-04-11 17:51:19 +00:00
Ivan A. Kosarev	2f326d453f	[NEON] Support vfma_n and vfms_n intrinsics Differential Revision: https://reviews.llvm.org/D45483 llvm-svn: 329814	2018-04-11 14:43:11 +00:00
Craig Topper	2575454fe9	[X86] Replace 512-bit masked pmaddubsw and pmaddwd intrinsic with unmasked intrinsic and a select. This makes it consistent with the 128/256-bit functions. Someday maybe we'll have all the masking moved to selects. llvm-svn: 329775	2018-04-11 04:55:10 +00:00
Aaron Ballman	0652534131	Introduce a new builtin, __builtin_dump_struct, that is useful for dumping structure contents at runtime in circumstances where debuggers may not be easily available (such as in kernel work). Patch by Paul Semel. llvm-svn: 329762	2018-04-10 21:58:13 +00:00
Craig Topper	298e1712d8	[X86] Add test case for llvm change r329734 This test ensures the popfd instruction in MS inline assembly can properly find a clobber name for the dirflag register. Previously the register was named 'DF', but it needs to be named 'dirflag' to match the name in the GCC register name list. llvm-svn: 329738	2018-04-10 18:43:44 +00:00
Gabor Buella	58fe46d99f	CodeGen tests - typo fixes NFC llvm-svn: 329689	2018-04-10 11:20:05 +00:00
Vitaly Buka	69a2e18b4a	asan: kernel: make no_sanitize("address") attribute work with -fsanitize=kernel-address Summary: Right now to disable -fsanitize=kernel-address instrumentation, one needs to use no_sanitize("kernel-address"). Make either no_sanitize("address") or no_sanitize("kernel-address") disable both ASan and KASan instrumentation. Also remove redundant test. Patch by Andrey Konovalov Reviewers: eugenis, kcc, glider, dvyukov, vitalybuka Reviewed By: eugenis, vitalybuka Differential Revision: https://reviews.llvm.org/D44981 llvm-svn: 329612	2018-04-09 20:10:29 +00:00
Craig Topper	304edc1e75	[X86] Emit native IR for pmuldq/pmuludq builtins. I believe all the pieces are now in place in the backend to make this work correctly. We can either mask the input to 32 bits for pmuludg or shl/ashr for pmuldq and use a regular mul instruction. The backend should combine this to PMULUDQ/PMULDQ and then SimplifyDemandedBits will remove the and/shifts. Differential Revision: https://reviews.llvm.org/D45421 llvm-svn: 329605	2018-04-09 19:17:54 +00:00
Dean Michael Berris	20dc6ef746	[XRay][llvm+clang] Consolidate attribute list files Summary: This change consolidates the always/never lists that may be provided to clang to externally control which functions should be XRay instrumented by imbuing attributes. The files follow the same format as defined in https://clang.llvm.org/docs/SanitizerSpecialCaseList.html for the sanitizer blacklist. We also deprecate the existing `-fxray-instrument-always=` and `-fxray-instrument-never=` flags, in favour of `-fxray-attr-list=`. This fixes http://llvm.org/PR34721. Reviewers: echristo, vlad.tsyrklevich, eugenis Reviewed By: vlad.tsyrklevich Subscribers: llvm-commits, cfe-commits Differential Revision: https://reviews.llvm.org/D45357 llvm-svn: 329543	2018-04-09 04:02:09 +00:00
Alexander Kornienko	2a8c18d991	Fix typos in clang Found via codespell -q 3 -I ../clang-whitelist.txt Where whitelist consists of: archtype cas classs checkk compres definit frome iff inteval ith lod methode nd optin ot pres statics te thru Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few files that have dubious fixes reverted.) Differential revision: https://reviews.llvm.org/D44188 llvm-svn: 329399	2018-04-06 15:14:32 +00:00
Manoj Gupta	4b3eefa5e8	Disable -fmerge-all-constants as default. Summary: "-fmerge-all-constants" is a non-conforming optimization and should not be the default. It is also causing miscompiles when building Linux Kernel (https://lkml.org/lkml/2018/3/20/872). Fixes PR18538. Reviewers: rjmccall, rsmith, chandlerc Reviewed By: rsmith, chandlerc Subscribers: srhines, cfe-commits Differential Revision: https://reviews.llvm.org/D45289 llvm-svn: 329300	2018-04-05 15:29:52 +00:00
Vlad Tsyrklevich	e55aa03ad4	Add the -fsanitize=shadow-call-stack flag Summary: Add support for the -fsanitize=shadow-call-stack flag which causes clang to add ShadowCallStack attribute to functions compiled with that flag enabled. Reviewers: pcc, kcc Reviewed By: pcc, kcc Subscribers: cryptoad, cfe-commits, kcc Differential Revision: https://reviews.llvm.org/D44801 llvm-svn: 329122	2018-04-03 22:33:53 +00:00
Rafael Espindola	b2c47fbf94	Set dso_local on cfi_slowpath. llvm-svn: 328836	2018-03-29 22:08:01 +00:00
Manoj Gupta	cb668d8512	[AArch64]: Add support for parsing rN registers. Summary: Allow rN registers to be simply parsed as correspoing xN registers. The "register ... asm("rN")" is an command to the compiler's register allocator, not an operand to any individual assembly instruction. GCC documents this syntax as "...the name of the register that should be used." This is needed to support the changes in Linux kernel (see https://lkml.org/lkml/2018/3/1/268 ) Note: This will add support only for the limited use case of register ... asm("rN"). Any other uses that make rN leak into assembly are not supported. Reviewers: kristof.beyls, rengolin, peter.smith, t.p.northover Reviewed By: peter.smith Subscribers: javed.absar, eraman, cfe-commits, srhines Differential Revision: https://reviews.llvm.org/D44815 llvm-svn: 328829	2018-03-29 21:11:15 +00:00
Rafael Espindola	54d44bf14c	Mark __cfi_check as dso_local. llvm-svn: 328825	2018-03-29 20:51:30 +00:00
Akira Hatanaka	673af7a688	Generalize NRVO to cover C structs. This commit generalizes NRVO to cover C structs (both trivial and non-trivial structs). rdar://problem/33599681 Differential Revision: https://reviews.llvm.org/D44968 llvm-svn: 328809	2018-03-29 17:56:24 +00:00
Rafael Espindola	7e9b87648b	Add a dllimport test. Thanks to rnk for the suggestion. llvm-svn: 328800	2018-03-29 16:35:52 +00:00
Krzysztof Parzyszek	790e422be9	[Hexagon] Aid bit-reverse load intrinsics lowering with bitcode The conversion of operatios to bitcode helps to eliminate an additional store in certain cases. We used to lower these load intrinsics in DAG to DAG conversion by which time, the "Dead Store Elimination" pass is already run. There is an associated LLVM patch. Patch by Sumanth Gundapaneni. llvm-svn: 328776	2018-03-29 13:54:31 +00:00
Krzysztof Parzyszek	1ef2a1f414	[Hexagon] Add support for "new" circular buffer intrinsics These instructions have been around for a long time, but we haven't supported intrinsics for them. The "new" vesrions use the CSx register for the start of the buffer instead of the K field in the Mx register. There is a related llvm patch. Patch by Brendon Cahoon. llvm-svn: 328725	2018-03-28 19:40:57 +00:00
Matt Arsenault	b130ea5605	AMDGPU: Update datalayout for stack alignment llvm-svn: 328657	2018-03-27 19:26:51 +00:00
Krzysztof Parzyszek	0aead04325	Update test after r328635 in LLVM llvm-svn: 328641	2018-03-27 17:17:39 +00:00
Pirama Arumuga Nainar	fbfba29d74	[CodeGen] Mark fma as const for Android Summary: r318093 sets fma, fmaf, fmal as const for Gnu and MSVC. Android also does not set errno for these functions. So mark these const for Android. Reviewers: spatel, efriedma, srhines, chh, enh Subscribers: cfe-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D44852 llvm-svn: 328552	2018-03-26 17:03:34 +00:00
Abderrazek Zaafrani	b5ac56fb81	[ARM] Add ARMv8.2-A FP16 vector intrinsic Putting back the code in commit r327189 that was reverted in r322737. The code is being committed in three stages and this one is the last stage: 1) r327455 fp16 feature flags, 2) r327836 pass half type or i16 based on FullFP16, and 3) the code here which the front-end fp16 vector intrinsic for ARM. Differential revision https://reviews.llvm.org/D43650 llvm-svn: 328277	2018-03-23 00:08:40 +00:00
Rafael Espindola	1193c370b4	Set dso_local on builtin functions. The difference between CreateRuntimeFunction and CreateBuiltinFunction is that CreateBuiltinFunction would not set dllimport or dso_local. To keep the current semantics, just forward to CreateRuntimeFunction with Local=true so it doesn't add dllimport. llvm-svn: 328224	2018-03-22 18:03:13 +00:00
Jonas Devlieghere	f070268701	[CodeGen] Emit DWARF "constructor" calling convention Now that LLVM has support for emitting calling conventions in DWARF (see r328191) have clang emit them. Patch by: Adrien Guinet Differential revision: https://reviews.llvm.org/D42351 llvm-svn: 328196	2018-03-22 13:53:30 +00:00
Artem Belevich	30512869ff	[NVPTX] Make tensor shape part of WMMA intrinsic's name. This is needed for the upcoming implementation of the new 8x32x16 and 32x8x16 variants of WMMA instructions introduced in CUDA 9.1. Differential Revision: https://reviews.llvm.org/D44719 llvm-svn: 328158	2018-03-21 21:55:02 +00:00
Rafael Espindola	e28ff4d43f	Add CHECKs for a few declarations. NFC. We were just missing test coverage for this. llvm-svn: 328048	2018-03-20 21:54:14 +00:00
Rafael Espindola	0d40f12596	Set dso_local on string literals. llvm-svn: 328040	2018-03-20 20:42:55 +00:00
Abderrazek Zaafrani	585051ae74	[AArch64] Add vmulxh_lane fp16 vector intrinsic https://reviews.llvm.org/D44591 llvm-svn: 328038	2018-03-20 20:37:31 +00:00
Saleem Abdulrasool	29149d5cb7	Basic: support PreserveMost and PreserveAll on Windows ARM Do not ignore these calling conventions on Windows ARM. They are used by the swift runtime for certain calls. llvm-svn: 328007	2018-03-20 17:33:26 +00:00
Rafael Espindola	dca06024e8	Set dso_local for CFConstantStringClassReference. This one cannot use setGVProperties since it has special logic for when it is dllimport or not. llvm-svn: 327993	2018-03-20 15:48:00 +00:00
Oren Ben Simhon	220671a080	Adding nocf_check attribute for cf-protection fine tuning The patch adds nocf_check target independent attribute for disabling checks that were enabled by cf-protection flag. The attribute can be appertained to functions and function pointers. Attribute name follows GCC's similar attribute name. Differential Revision: https://reviews.llvm.org/D41880 llvm-svn: 327768	2018-03-17 13:31:35 +00:00
Reid Kleckner	fb93154bf1	[MS] Don't escape MS C++ names with \01 It is not needed after LLVM r327734. Now it will be easier to copy-paste IR symbol names from Clang. llvm-svn: 327738	2018-03-16 20:36:49 +00:00
Rafael Espindola	3c8a39cfbb	Set dso_local for NSConcreteStackBlock. llvm-svn: 327544	2018-03-14 18:19:26 +00:00
Sjoerd Meijer	95da875898	This reverts "r327189 - [ARM] Add ARMv8.2-A FP16 vector intrinsic" This is causing problems in testing, and PR36683 was raised. Reverting it until we have sorted out how to pass f16 vectors. llvm-svn: 327437	2018-03-13 19:38:56 +00:00
George Burgess IV	4deb75d2e8	[CodeGen] Eagerly emit lifetime.end markers for calls In C, we'll wait until the end of the scope to clean up aggregate temporaries used for returns from calls. This means in cases like: { // Assuming that `Bar` is large enough to warrant indirect returns struct Bar b = {}; b = foo(&b); b = foo(&b); b = foo(&b); b = foo(&b); } ...We'll allocate space for 5 Bars on the stack (`b`, and 4 temporaries). This becomes painful in things like large switch statements. If cleaning up sooner is trivial, we should do it. llvm-svn: 327229	2018-03-10 23:06:31 +00:00
Abderrazek Zaafrani	5bd68cf742	[ARM] Add ARMv8.2-A FP16 vector intrinsic Add the fp16 neon vector intrinsic for ARM as described in the ARM ACLE document. Reviews in https://reviews.llvm.org/D43650 llvm-svn: 327189	2018-03-09 23:39:34 +00:00
Peter Collingbourne	a2f10056d1	Fix Clang test case. llvm-svn: 327166	2018-03-09 19:37:28 +00:00
Saleem Abdulrasool	3e70132753	CodeGen: simplify and validate exception personalities Simplify the dispatching for the personality routines. This really had no test coverage previously, so add test coverage for the various cases. This turns out to be pretty complicated as the various languages and models interact to change personalities around. You really should feel bad for the compiler if you are using exceptions. There is no reason for this type of cruelty. llvm-svn: 327105	2018-03-09 07:06:42 +00:00
George Burgess IV	5701606165	Fix a typo from r326844; NFC llvm-svn: 326845	2018-03-06 23:09:01 +00:00
George Burgess IV	7e03f350e8	[CodeGen] Don't emit lifetime.end without lifetime.start EmitLifetimeStart returns a non-null `size` pointer if it actually emits a lifetime.start. Later in this function, we use `tempSize`'s nullness to determine whether or not we should emit a lifetime.end. llvm-svn: 326844	2018-03-06 23:07:00 +00:00
Alexander Ivchenko	9d3b45301f	[x86][CET] Introduce _get_ssp, _inc_ssp intrinsics Summary: The _get_ssp intrinsic can be used to retrieve the shadow stack pointer, independent of the current arch -- in contract with the rdsspd and the rdsspq intrinsics. Also, this intrinsic returns zero on CPUs which don't support CET. The rdssp[d\|q] instruction is decoded as nop, essentially just returning the input operand, which is zero. Example result of compilation: ``` xorl %eax, %eax movl %eax, %ecx rdsspq %rcx # NOP when CET is not supported movq %rcx, %rax # return zero ``` Reviewers: craig.topper Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D43814 llvm-svn: 326689	2018-03-05 11:30:28 +00:00
Manoj Gupta	886b4505f2	Do not generate calls to fentry with __attribute__((no_instrument_function)) Summary: Currently only calls to mcount were suppressed with no_instrument_function attribute. Linux kernel requires that calls to fentry should also not be generated. This is an extended fix for PR PR33515. Reviewers: hfinkel, rengolin, srhines, rnk, rsmith, rjmccall, hans Reviewed By: rjmccall Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D43995 llvm-svn: 326639	2018-03-02 23:52:44 +00:00
Martin Storsjo	87c2ad29ee	[RecordLayout] Only assert that fundamental type sizes are power of two on MSVC Make types with sizes that aren't a power of two an error (that can be disabled) in structs with ms_struct layout, except on mingw where the situation is quite likely to occur and GCC handles it silently. Differential Revision: https://reviews.llvm.org/D43908 llvm-svn: 326476	2018-03-01 20:22:57 +00:00
Martin Storsjo	96b01bcfb6	[RecordLayout] Don't align to non-power-of-2 sizes when using -mms-bitfields When targeting GNU/MinGW for i386, the size of the "long double" data type is 12 bytes (while it is 8 bytes in MSVC). When building with -mms-bitfields to have struct layouts match MSVC, data types are laid out in a struct with alignment according to their size. However, this doesn't make sense for the long double type, since it doesn't match MSVC at all, and aligning to a non-power-of-2 size triggers other asserts later. This matches what GCC does, aligning a long double to 4 bytes in structs on i386 even when -mms-bitfields is specified. This fixes asserts when using the max_align_t data type when building for MinGW/i386 with the -mms-bitfields flag. Differential Revision: https://reviews.llvm.org/D43734 llvm-svn: 326173	2018-02-27 06:27:06 +00:00

... 4 5 6 7 8 ...

5094 Commits