llvm-project

Commit Graph

Author	SHA1	Message	Date
Alexander Shaposhnikov	42b5ef0269	[llvm-objcopy] Add support for static libraries This diff adds support for handling static libraries to llvm-objcopy and llvm-strip. Test plan: make check-all Differential revision: https://reviews.llvm.org/D48413 llvm-svn: 336455	2018-07-06 17:51:03 +00:00
Sanjay Patel	5739587735	[InstCombine] add more tests for potentially poisonous shifts; NFC llvm-svn: 336454	2018-07-06 17:44:57 +00:00
Nico Weber	038dbf3c24	Revert 336426 (and follow-ups 428, 440), it very likely caused PR38084. llvm-svn: 336453	2018-07-06 17:37:24 +00:00
Vedant Kumar	ba0c876597	[Debugify] Allow unsigned values narrower than their variables Suppress the diagnostic for mis-sized dbg.values when a value operand is narrower than the unsigned variable it describes. Assume that a debugger would implicitly zero-extend these values. llvm-svn: 336452	2018-07-06 17:32:40 +00:00
Vedant Kumar	6379a62250	[Local] replaceAllDbgUsesWith: Update debug values before RAUW The replaceAllDbgUsesWith utility helps passes preserve debug info when replacing one value with another. This improves upon the existing insertReplacementDbgValues API by: - Updating debug intrinsics in-place, while preventing use-before-def of the replacement value. - Falling back to salvageDebugInfo when a replacement can't be made. - Moving the responsibility for rewriting llvm.dbg.* DIExpressions into common utility code. Along with the API change, this teaches replaceAllDbgUsesWith how to create DIExpressions for three basic integer and pointer conversions: - The no-op conversion. Applies when the values have the same width, or have bit-for-bit compatible pointer representations. - Truncation. Applies when the new value is wider than the old one. - Zero/sign extension. Applies when the new value is narrower than the old one. Testing: - check-llvm, check-clang, a stage2 `-g -O3` build of clang, regression/unit testing. - This resolves a number of mis-sized dbg.value diagnostics from Debugify. Differential Revision: https://reviews.llvm.org/D48676 llvm-svn: 336451	2018-07-06 17:32:39 +00:00
Sanjay Patel	a212b0bc18	[InstCombine] add more tests with poison and undef; NFC As discussed in D48987 and D48893, there are many different ways to go wrong depending on the binop (and as shown here we already do go wrong in some cases). llvm-svn: 336450	2018-07-06 17:24:32 +00:00
Tom Stellard	ec4feae1b6	AMDGPU: Fix UBSan error caused by r335942 Summary: Fixes PR38071. Reviewers: arsenm, dstenb Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48979 llvm-svn: 336448	2018-07-06 17:16:17 +00:00
Sanjay Patel	e85a300a77	[Constants] extend getBinOpIdentity(); NFC The enhanced version will be used in D48893 and related patches and an almost identical (fadd is different) version is proposed in D28907, so adding this as a preliminary step. llvm-svn: 336444	2018-07-06 15:18:58 +00:00
Sanjay Patel	e6dda2fee7	[Constant] add undef element query for vector constants; NFC This is likely to be used in D48987 and similar patches, so adding it as an NFC preliminary step. llvm-svn: 336442	2018-07-06 14:52:36 +00:00
Sjoerd Meijer	b3e06faa28	[ARM] ParallelDSP: added statistics, NFC. Added statistics for the number of SMLAD instructions created, and also renamed the pass name to -arm-parallel-dsp. Differential Revision: https://reviews.llvm.org/D48971 llvm-svn: 336441	2018-07-06 14:47:09 +00:00
Diogo N. Sampaio	81e9dd1ed7	Commit rL336426 caused buildbot failures http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/50537/testReport/junit/LLVM/CodeGen_AArch64/FoldRedundantShiftedMasking_ll/ This removes the comments of the function label causing this error. llvm-svn: 336440	2018-07-06 14:41:09 +00:00
Benjamin Kramer	3687ac52a9	[LoopSink] Make the enforcement of determinism deterministic. LoopBlockNumber is a DenseMap<BasicBlock, int>, comparing the result of find() will compare a pair<BasicBlock, int>. That's of course depending on pointer ordering which varies from run to run. Reverse iteration doesn't find this because we're copying to a vector first. This bug has been there since 2016 but only recently showed up on clang selfhost with FDO and ThinLTO, which is also why I didn't manage to get a reasonable test case for this. Add an assert that would've caught this. llvm-svn: 336439	2018-07-06 14:20:58 +00:00
Andrea Di Biagio	bb25e27f58	[llvm-mca] A write latency cannot be a negative value. NFC llvm-svn: 336437	2018-07-06 13:46:10 +00:00
Sjoerd Meijer	35bd8f5d1e	[AArch64] Armv8.4-A: TLB support This adds: - outer shareable TLB Maintenance instructions, and - TLB range maintenance instructions. llvm-svn: 336434	2018-07-06 13:00:16 +00:00
Jonas Devlieghere	7f19d0160b	[dsymutil] Emit label at the begin of a CU When emitting a CU, store the MCSymbol pointing to the beginning of the CU. We'll need this information later when emitting the .debug_names section (DWARF5 accelerator table). llvm-svn: 336433	2018-07-06 12:49:54 +00:00
Sjoerd Meijer	a3dad801b7	Recommit: [AArch64] Armv8.4-A: Flag manipulation instructions Now with the asm operand definition included. llvm-svn: 336432	2018-07-06 12:32:33 +00:00
Diogo N. Sampaio	17be994942	Added missing semicolon llvm-svn: 336428	2018-07-06 10:09:04 +00:00
Diogo N. Sampaio	742bf1a255	[SelectionDAG] https://reviews.llvm.org/D48278 D48278 Allow to reduce redundant shift masks. For example: x1 = x & 0xAB00 x2 = (x >> 8) & 0xAB can be reduced to: x1 = x & 0xAB00 x2 = x1 >> 8 It only allows folding when the masks and shift values are constants. llvm-svn: 336426	2018-07-06 09:42:25 +00:00
Sjoerd Meijer	8203177e5e	Revert [AArch64] Armv8.4-A: Flag manipulation instructions It's causing build errors. llvm-svn: 336422	2018-07-06 08:39:43 +00:00
Sjoerd Meijer	6f5f6d5b2e	[AArch64] Armv8.4-A: Flag manipulation instructions These instructions are added to AArch64 only. Differential Revision: https://reviews.llvm.org/D48926 llvm-svn: 336421	2018-07-06 08:12:20 +00:00
Andrea Di Biagio	61c52af9d9	[llvm-mca] improve the instruction issue logic implemented by the Scheduler. This patch modifies the Scheduler heuristic used to select the next instruction to issue to the pipelines. The motivating example is test X86/BtVer2/add-sequence.s, for which llvm-mca wrongly reported an estimated IPC of 1.50. According to perf, the actual IPC for that test should have been ~2.00. It turns out that an IPC of 2.00 for test add-sequence.s cannot possibly be predicted by a Scheduler that only prioritizes instructions based on their "age". A similar issue also affected test X86/BtVer2/dependent-pmuld-paddd.s, for which llvm-mca wrongly estimated an IPC of 0.84 instead of an IPC of 1.00. Instructions in the ReadyQueue are now ranked based on two factors: - The "age" of an instruction. - The number of unique users of writes associated with an instruction. The new logic still prioritizes older instructions over younger instructions to minimize the pressure on the reorder buffer. However, the number of users of an instruction now also affects the overall rank. This potentially increases the ability of the Scheduler to extract instruction level parallelism. This patch fixes the problem with the wrong IPC reported for test add-sequence.s and test dependent-pmuld-paddd.s. llvm-svn: 336420	2018-07-06 08:08:30 +00:00
Tim Northover	7ee46ed992	CallGraphSCCPass: iterate over all functions. Previously we only iterated over functions reachable from the set of external functions in the module. But since some of the passes under this (notably the always-inliner and coroutine lowerer) are required for correctness, they need to run over everything. This just adds an extra layer of iteration over the CallGraph to keep track of which functions we've already visited and get the next batch of SCCs. Should fix PR38029. llvm-svn: 336419	2018-07-06 08:04:47 +00:00
Sjoerd Meijer	2a57b357a3	[AArch64][ARM] Armv8.4-A: Trace synchronization barrier instruction This adds the Armv8.4-A Trace synchronization barrier (TSB) instruction. Differential Revision: https://reviews.llvm.org/D48918 llvm-svn: 336418	2018-07-06 08:03:12 +00:00
Craig Topper	c60e1807b3	[X86] Remove FMA4 scalar intrinsics. Use llvm.fma intrinsic instead. The intrinsics can be implemented with a f32/f64 llvm.fma intrinsic and an insert into a zero vector. There are a couple regressions here due to SelectionDAG not being able to pull an fneg through an extract_vector_elt. I'm not super worried about this though as InstCombine should be able to do it before we get to SelectionDAG. llvm-svn: 336416	2018-07-06 07:14:41 +00:00
Sam McCall	8ca99100ba	[Support] Make support types more easily printable. Summary: Error's new operator<< is the first way to print an error without consuming it. formatv() can now print objects with an operator<< that works with raw_ostream. Reviewers: bkramer Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D48966 llvm-svn: 336412	2018-07-06 05:45:45 +00:00
Dave Lee	390abe4a75	Reapply: "objdump: Support newer ObjC image info flags" Summary: Add support for two additional ObjC image info flags: `IS_SIMULATED` and `HAS_CATEGORY_CLASS_PROPERTIES`. `IS_SIMULATED` indicates a Mach-O binary built for iOS simulator. `HAS_CATEGORY_CLASS_PROPERTIES` indicates a Mach-O binary built by a compiler that supports class properties in categories. Reviewers: enderby, compnerd Reviewed By: compnerd Subscribers: keith, llvm-commits Differential Revision: https://reviews.llvm.org/D48568 llvm-svn: 336411	2018-07-06 05:11:35 +00:00
Max Kazantsev	20da7e467a	Revert "[InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done" llvm-svn: 336410	2018-07-06 04:04:13 +00:00
Craig Topper	7b35585ff1	[X86] Remove all of the avx512 masked packed fma intrinsics. Use llvm.fma or unmasked 512-bit intrinsics with rounding mode. This upgrades all of the intrinsics to use fneg instructions to convert fma into fmsub/fnmsub/fnmadd/fmsubadd. And uses a select instruction for masking. This matches how clang uses the intrinsics these days. llvm-svn: 336409	2018-07-06 03:42:09 +00:00
Craig Topper	4ea8949697	[X86] Cleanup some of the avx512 masked fma tests to prepare for removing and autoupgrading. -Split cases that call 2 intrinsics in the same case. -Remove testing mask3 and maskz intrinsics with an all ones mask. These won't be interesting after the upgrade. -Restore test cases for some intrinsics that are marked for deletion, but haven't been deleted yet. llvm-svn: 336408	2018-07-06 03:42:06 +00:00
Zachary Turner	457cc34e48	[llvm-pdbutil] Dump more info about globals. We add an option to dump the entire global / public symbol record stream. Previously we would dump globals or publics, but not both. And when we did dump them, we would always dump them in the order they were referenced by the corresponding hash streams, not in the order they were serialized in. This patch adds a lower level mode that just dumps the whole stream in serialization order. Additionally, when dumping global-extras, we now dump the hash bitmap as well as the record offset instead of dumping all zeros for the offsets. llvm-svn: 336407	2018-07-06 02:59:25 +00:00
Stefan Pintilie	b351f09c9e	[Power9] Add __float128 library call for frem Power 9 does not have a hardware instruction for frem but we can call fmodf128. Differential Revision: https://reviews.llvm.org/D48552 llvm-svn: 336406	2018-07-06 02:47:02 +00:00
Zachary Turner	1f200adfa7	[PDB] Sort globals symbols by name in GSI hash buckets. It seems like the debugger first computes a symbol's bucket, and then does a binary search of entries in the bucket using the symbol's name in order to find it. If the bucket entries are not in sorted order, this obviously won't work. After this patch a couple of simple test cases show that we generate an exactly identical GSI hash stream, which is very nice. llvm-svn: 336405	2018-07-06 02:33:58 +00:00
Easwaran Raman	4832b9ea6b	[x86]Add a test case to show missed vfnmadd generation. llvm-svn: 336404	2018-07-06 00:31:33 +00:00
Dave Lee	e6de96410b	Revert "objdump: Support newer ObjC image info flags" This reverts commit 8c4cc472e7a67bd3b2b20cc4cf32d31af29bc7e9. llvm-svn: 336402	2018-07-06 00:13:21 +00:00
Mandeep Singh Grang	083f4d7da4	[OpenEmbedded] Add OpenEmbedded vendor Summary: The lib paths are not correctly picked up for OpenEmbedded sysroots (like arm-oe-linux-gnueabi). I fix this in a follow-up clang patch. But in order to add the correct libs I need to detect if the vendor is oe. For this reason, it is first necessary to teach llvm to detect oe vendor, which is what this patch does. Reviewers: chandlerc, compnerd, rengolin, javed.absar Reviewed By: compnerd Subscribers: kristof.beyls, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D48861 llvm-svn: 336401	2018-07-05 23:41:17 +00:00
Maksim Panchenko	89e4abe7b7	[X86][Disassembler] Fix LOCK prefix disassembler support Summary: If LOCK prefix is not the first prefix in an instruction, LLVM disassembler silently drops the prefix. The fix is to select a proper instruction with a builtin LOCK prefix if one exists. Reviewers: craig.topper Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49001 llvm-svn: 336400	2018-07-05 23:32:42 +00:00
Dave Lee	9e412ec8f2	objdump: Support newer ObjC image info flags Summary: Add support for two additional ObjC image info flags: `IS_SIMULATED` and `HAS_CATEGORY_CLASS_PROPERTIES`. `IS_SIMULATED` indicates a Mach-O binary built for iOS simulator. `HAS_CATEGORY_CLASS_PROPERTIES` indicates a Mach-O binary built by a compiler that supports class properties in categories. Reviewers: enderby, compnerd Reviewed By: compnerd Subscribers: keith, llvm-commits Differential Revision: https://reviews.llvm.org/D48568 llvm-svn: 336399	2018-07-05 23:32:15 +00:00
Michael Zolotukhin	a5f2c52a1e	Revert r332168: "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading."" There were a couple of issues reported (PR38047, PR37929) - I'll reland the patch when I figure out and fix the root cause. llvm-svn: 336393	2018-07-05 22:10:31 +00:00
Heejin Ahn	80d9f1708f	[WebAssembly] Add missing _S opcodes of atomic stores to InstPrinter Summary: This was missing in D48839 (rL336145). Reviewers: aardappel Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D48992 llvm-svn: 336390	2018-07-05 21:27:09 +00:00
Heejin Ahn	e69ba6e6d5	[ORC] Add BitReader/BitWriter to target_link_libraries Summary: CompileOnDemandLayer.cpp uses function in these libraries, and builds with `-DSHARED_LIB=ON` fail without this. Reviewers: lhames Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D48995 llvm-svn: 336389	2018-07-05 21:23:15 +00:00
Sander de Smalen	e2c10f8f47	This is a recommit of r336322, previously reverted in r336324 due to a deficiency in TableGen that has been addressed in r336334. [AArch64][SVE] Asm: Support for predicated FP rounding instructions. This patch also adds instructions for predicated FP square-root and reciprocal exponent. The added instructions are: - FRINTI Round to integral value (current FPCR rounding mode) - FRINTX Round to integral value (current FPCR rounding mode, signalling inexact) - FRINTA Round to integral value (to nearest, with ties away from zero) - FRINTN Round to integral value (to nearest, with ties to even) - FRINTZ Round to integral value (toward zero) - FRINTM Round to integral value (toward minus Infinity) - FRINTP Round to integral value (toward plus Infinity) - FSQRT Floating-point square root - FRECPX Floating-point reciprocal exponent llvm-svn: 336387	2018-07-05 20:21:21 +00:00
Lang Hames	7bd8970743	[ORC] In CompileOnDemandLayer2, clone modules on to different contexts by writing them to a buffer and re-loading them. Also introduces a multithreaded variant of SimpleCompiler (MultiThreadedSimpleCompiler) for compiling IR concurrently on multiple threads. These changes are required to JIT IR on multiple threads correctly. No test case yet. I will be looking at how to modify LLI / LLJIT to test multithreaded JIT support soon. llvm-svn: 336385	2018-07-05 19:01:27 +00:00
Diogo N. Sampaio	734cfd11fe	Testing commit permission llvm-svn: 336384	2018-07-05 18:49:32 +00:00
Craig Topper	88d361e976	[X86] Remove the last of the 'x86.fma.' intrinsics and autoupgrade them to 'llvm.fma'. Add upgrade tests for all. Still need to remove the AVX512 masked versions. llvm-svn: 336383	2018-07-05 18:43:58 +00:00
Craig Topper	4fe321d1ce	[X86] Add SHUF128 to target shuffle decoding. Differential Revision: https://reviews.llvm.org/D48954 llvm-svn: 336376	2018-07-05 17:10:17 +00:00
Matt Arsenault	24ce89b717	Fix asserts in AMDGCN fmed3 folding by handling more cases of NaN Better NaN handling for AMDGCN fmed3. All operands are checked for NaN now. The checks were moved before the canonicalization to provide a better mapping from fclamp. Changed the behaviour of fmed3(x,y,NaN) to return max(x,y) instead of min(x,y) in light of this. Updated tests as a result and added some new cases to cover the fix. Patch by Alan Baker llvm-svn: 336375	2018-07-05 17:05:36 +00:00
Matt Arsenault	2d47310071	AMDGPU: Don't use spir_kernel in a test Also use verify-machineinstrs. llvm-svn: 336374	2018-07-05 17:01:29 +00:00
Matt Arsenault	29f303799b	AMDGPU/GlobalISel: Implement custom kernel arg lowering Avoid using allocateKernArg / AssignFn. We do not want any of the type splitting properties of normal calling convention lowering. For now at least this exists alongside the IR argument lowering pass. This is necessary to handle struct padding correctly while some arguments are still skipped by the IR argument lowering pass. llvm-svn: 336373	2018-07-05 17:01:20 +00:00
Simon Pilgrim	8c3765dc6b	[CostModel][X86] Add UDIV/UREM by pow2 costs Normally InstCombine would have simplified these to SRL/AND instructions but we may still see these during SLP vectorization etc. llvm-svn: 336371	2018-07-05 16:56:28 +00:00
Paul Semel	91c9d4251c	[llvm-objdump] Removed archive-headers-disas test This test is failing because of the disas part. For the moment, I will just remove it. I will add it again tomorrow with a proper fix. llvm-svn: 336370	2018-07-05 16:49:46 +00:00
Andrea Di Biagio	fa2d16f4ab	[llvm-mca] Fix RegisterFile debug prints. NFC llvm-svn: 336367	2018-07-05 16:13:49 +00:00
Paul Semel	63e4008718	[llvm-objcopy] Fix timezone dependent tests llvm-svn: 336363	2018-07-05 15:24:11 +00:00
Lei Huang	5612b90694	[Power9] Add lib calls for float128 operations with no equivalent PPC instructions Map the following instructions to the proper float128 lib calls: pow[i], exp[2], log[2|10], sin, cos, fmin, fmax Differential Revision: https://reviews.llvm.org/D48544 llvm-svn: 336361	2018-07-05 15:21:37 +00:00
Simon Pilgrim	efe84b9d12	[X86][SSE] Add srem x, (1 << c) combine tests Now that D45806 has landed we can start trying to avoid scalarizing srem by constant - these tests demonstrate some example cases. llvm-svn: 336360	2018-07-05 15:15:47 +00:00
Paul Semel	0dc92f6a74	[llvm-objdump] Add --archive-headers (-a) option llvm-svn: 336357	2018-07-05 14:43:29 +00:00
Clement Courbet	f9a0bb330d	[llvm-exegesis] Add uop computation for more X87 instruction classes. Summary: This allows measuring comparisons (UCOM_FpIr32,UCOM_Fpr32,...), conditional moves (CMOVBE_Fp32,...) Reviewers: gchatelet Subscribers: tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D48713 llvm-svn: 336352	2018-07-05 13:54:51 +00:00
Simon Pilgrim	9e987e0917	Fix comment typo. NFCI. llvm-svn: 336351	2018-07-05 13:51:35 +00:00
Sanjay Patel	bce899ff59	[AArch64, PowerPC, x86] add tests for signbit bit hacks; NFC llvm-svn: 336348	2018-07-05 13:16:46 +00:00
Simon Pilgrim	dafd828c97	[SLPVectorizer] Begin abstracting InstructionsState alternate matching away from opcodes. NFCI. This is an early step towards matching Instructions by attributes other than the opcode. This will be necessary for cast/call alternates which share the same opcode but have different types/intrinsicIDs etc. - which we could vectorize as long as we split them using the alternate mechanism. Differential Revision: https://reviews.llvm.org/D48945 llvm-svn: 336344	2018-07-05 12:30:44 +00:00
Clement Courbet	2c278cdd98	[llvm-exegesis][NFC]clang-format llvm-svn: 336343	2018-07-05 12:26:12 +00:00
Ryan Taylor	5f04458a61	[AMDGPU] Add VALU to V_INTERP Instructions Wait states are not properly being inserted after buffer_store for v_interp instructions. Add VALU to V_INTERP instructions so that the GCNHazardRecognizer can check and insert the appropriate wait states when needed. Differential Revision: https://reviews.llvm.org/D48772 Change-Id: Id540c9b074fc69b5c1de6b182276aa089c74aa64 llvm-svn: 336339	2018-07-05 12:02:07 +00:00
Chandler Carruth	f26e7d1fe9	[ADT] Switch to indirect even the trivial case through an object pointer that has required alignment. This avoids issues that keep coming up with function pointers being less aligned. I'm pretty annoyed that we can't take advantage of function alignment even on platforms where they are aligned, but build modes and other things make taking advantage of it somewhere between hard and impossible. The best case scenario would still embed various build modes into the ABI causing really hard to debug issues if you compiled one object file differently from another. =/ This should at least bring the bots back that were having trouble with this. llvm-svn: 336337	2018-07-05 11:56:34 +00:00
Krasimir Georgiev	004f9d400f	Partially revert r336268 in address-offsets.ll Summary: There the typos are intentional, explicitly introduced to disable these cases in r280285. Reviewers: bkramer Reviewed By: bkramer Subscribers: dschuff, sbc100, jgravelle-google, aheejin, llvm-commits Differential Revision: https://reviews.llvm.org/D48962 llvm-svn: 336336	2018-07-05 11:30:15 +00:00
Sander de Smalen	13f9425e3a	[TableGen] Increase the number of supported decoder fix-ups. The vast number of added instructions for SVE causes TableGen to fail with an assertion: Assertion `Delta < 65536U && "disassembler decoding table too large!"' This patch increases the number of supported decoder fix-ups. Reviewers: dmgreen, stoklund, petpav01 Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D48937 llvm-svn: 336334	2018-07-05 10:39:15 +00:00
Simon Pilgrim	97e6afec2a	[X86][SSE] Add extra v16i16 shl x,c -> pmullw test We want to compare shifts with repeated vs non-repeated v8i16 shuffle masks (for PBLENDW ymm) llvm-svn: 336333	2018-07-05 09:54:53 +00:00
Simon Pilgrim	6dc45e6ca0	Try to fix -Wimplicit-fallthrough warning. NFCI. llvm-svn: 336331	2018-07-05 09:48:01 +00:00
Aleksandar Beserminji	3239ba8c0e	[mips] Fix atomic operations at O0, v3 Similar to PR/25526, fast-regalloc introduces spills at the end of basic blocks. When this occurs in between an ll and sc, the stores can cause the atomic sequence to fail. This patch fixes the issue by introducing more pseudos to represent atomic operations and moving their lowering to after the expansion of postRA pseudos. This version addresses issues with the initial implementation and covers all atomic operations. This resolves PR/32020. Thanks to James Cowgill for reporting the issue! Patch By: Simon Dardis Differential Revision: https://reviews.llvm.org/D31287 llvm-svn: 336328	2018-07-05 09:27:05 +00:00
Ivan A. Kosarev	466037900c	[NEON] Fix combining of vldx_dup intrinsics with updating of base addresses Resolves: Unsupported ARM Neon intrinsics in Target-specific DAG combine function for VLDDUP https://bugs.llvm.org/show_bug.cgi?id=38031 Related diff: D48439 Differential Revision: https://reviews.llvm.org/D48920 llvm-svn: 336325	2018-07-05 08:59:49 +00:00
Sander de Smalen	097ab704c9	Reverting r336322 for now, as it causes an assert failure in TableGen, for which there is already a patch in Phabricator (D48937) that needs to be committed first. llvm-svn: 336324	2018-07-05 08:52:03 +00:00
Mikael Holmen	8505f34b29	Partial revert of "NFC - Various typo fixes in tests" This partially reverts r336268 since it causes buildbot failures. Added FIXME at the places where the CHECKs are misspelled. llvm-svn: 336323	2018-07-05 08:42:16 +00:00
Sander de Smalen	ef44226c4f	[AArch64][SVE] Asm: Support for predicated FP rounding instructions. This patch also adds instructions for predicated FP square-root and reciprocal exponent. The added instructions are: - FRINTI Round to integral value (current FPCR rounding mode) - FRINTX Round to integral value (current FPCR rounding mode, signalling inexact) - FRINTA Round to integral value (to nearest, with ties away from zero) - FRINTN Round to integral value (to nearest, with ties to even) - FRINTZ Round to integral value (toward zero) - FRINTM Round to integral value (toward minus Infinity) - FRINTP Round to integral value (toward plus Infinity) - FSQRT Floating-point square root - FRECPX Floating-point reciprocal exponent llvm-svn: 336322	2018-07-05 08:38:30 +00:00
Sjoerd Meijer	27be58b307	[ARM] ParallelDSP: only support i16 loads for now We were miscompiling i8 loads, so reject them as unsupported narrow operations for now. Differential Revision: https://reviews.llvm.org/D48944 llvm-svn: 336319	2018-07-05 08:21:40 +00:00
Sander de Smalen	592718f906	[AArch64][SVE] Asm: Support for signed/unsigned MIN/MAX/ABD This patch implements the following varieties: - Unpredicated signed max, e.g. smax z0.h, z1.h, #-128 - Unpredicated signed min, e.g. smin z0.h, z1.h, #-128 - Unpredicated unsigned max, e.g. umax z0.h, z1.h, #255 - Unpredicated unsigned min, e.g. umin z0.h, z1.h, #255 - Predicated signed max, e.g. smax z0.h, p0/m, z0.h, z1.h - Predicated signed min, e.g. smin z0.h, p0/m, z0.h, z1.h - Predicated signed abd, e.g. sabd z0.h, p0/m, z0.h, z1.h - Predicated unsigned max, e.g. umax z0.h, p0/m, z0.h, z1.h - Predicated unsigned min, e.g. umin z0.h, p0/m, z0.h, z1.h - Predicated unsigned abd, e.g. uabd z0.h, p0/m, z0.h, z1.h llvm-svn: 336317	2018-07-05 07:54:10 +00:00
Lei Huang	66e22c21c3	[Power9] Optimize codegen for conversions of int to float128 Optimize code sequences for integer conversion to fp128 when the integer is a result of: * float->int * float->long * double->int * double->long Differential Revision: https://reviews.llvm.org/D48429 llvm-svn: 336316	2018-07-05 07:46:01 +00:00
Craig Topper	350c5f1881	[X86] Remove X86 specific scalar FMA intrinsics and upgrade to target independent FMA and extractelement/insertelement. llvm-svn: 336315	2018-07-05 06:52:55 +00:00
Lei Huang	9d70afbb31	[Power9][NFC] add back-end tests for passing homogeneous fp128 aggregates by value Tests to verify that we are passing fp128 via VSX registers as per ABI. These are related to clang commit rL336308. Differential Revision: https://reviews.llvm.org/D48310 llvm-svn: 336314	2018-07-05 06:51:38 +00:00
Lei Huang	5bab646c26	[Power9] Add tests for passing float128 in VSX reg for non-homogenous aggregates Add missing testcase for rL336310 llvm-svn: 336313	2018-07-05 06:29:28 +00:00
Serge Pavlov	892bd81a62	[demangler] Avoid alignment warning The alignment specified by a constant for the field `BumpPointerAllocator::InitialBuffer` exceeded the alignment guaranteed by `malloc` and `new` on Windows. This change sets the alignment value to that of `long double`, which is defined by the platform in use. It fixes https://bugs.llvm.org/show_bug.cgi?id=37944. Differential Revision: https://reviews.llvm.org/D48889 llvm-svn: 336311	2018-07-05 06:22:39 +00:00
Lei Huang	a855e17f09	[Power9] Ensure float128 in non-homogenous aggregates are passed via VSX reg Non-homogenous aggregates are passed in consecutive GPRs, in GPRs and in memory, or in memory. This patch ensures that float128 members of non-homogenous aggregates are passed via VSX registers. This is done via custom lowering a bitcast of a build_pair(i64,i64) to float128 to a new PPCISD node, BUILD_FP128. Differential Revision: https://reviews.llvm.org/D48308 llvm-svn: 336310	2018-07-05 06:21:37 +00:00
Lei Huang	d17c39ccaa	[Power9]Legalize and emit code for quad-precision convert from single-precision Legalize and emit code for quad-precision floating point operation conversion of single-precision value to quad-precision. Differential Revision: https://reviews.llvm.org/D47569 llvm-svn: 336307	2018-07-05 04:18:37 +00:00
Lei Huang	a26f3be454	[Power9] Implement float128 parameter passing and return values This patch enables parameter passing and return by value for float128 types. Passing aggregate/union which contain float128 members will be submitted in subsequent patches. Differential Revision: https://reviews.llvm.org/D47552 llvm-svn: 336306	2018-07-05 04:10:15 +00:00
Craig Topper	2db909cfae	[X86] Remove some isel patterns for X86ISD::SELECTS that specifically looked for the v1i1 mask to have come from a scalar_to_vector from GR8. We have patterns for SELECTS that top at v1i1 and we have a pattern for (v1i1 (scalar_to_vector GR8)). The patterns being removed here do the same thing as the two other patterns combined so there is no need for them. llvm-svn: 336305	2018-07-05 03:01:29 +00:00
Craig Topper	95eb88abfe	[X86] Add support for combining FMSUB/FNMADD/FNMSUB ISD nodes with an fneg input. Previously we could only negate the FMADD opcodes. This used to be mostly ok when we lowered FMA intrinsics during lowering. But with the move to llvm.fma from target specific intrinsics, we can combine (fneg (fma)) to (fmsub) earlier. So if we start with (fneg (fma (fneg))) we would get stuck at (fmsub (fneg)). This patch fixes that so we can also combine things like (fmsub (fneg)). llvm-svn: 336304	2018-07-05 02:52:56 +00:00
Craig Topper	e4b9257b69	[X86] Remove some of the packed FMA3 intrinsics since we no longer use them in clang. There's a regression in here due to inability to combine fneg inputs of X86ISD::FMSUB/FNMSUB/FNMADD nodes. More removals to come, but I wanted to stop and fix the regression that showed up in this first. llvm-svn: 336303	2018-07-05 02:52:54 +00:00
Lei Huang	6270ab6ce4	[Power9]Legalize and emit code for round & convert quad-precision values Legalize and emit code for round & convert float128 to double precision and single precision. Differential Revision: https://reviews.llvm.org/D46997 llvm-svn: 336299	2018-07-04 21:59:16 +00:00
Aaron Ballman	ea9c3f25a7	Silence an MSVC C4189 warning about a local variable being initialized but not used; NFC. llvm-svn: 336298	2018-07-04 21:22:28 +00:00
Vladimir Stefanovic	87b60a0e00	[mips] Warn when crc, ginv, virt flags are used with too old revision CRC and GINV ASE require revision 6, Virtualization requires revision 5. Print a warning when revision is older than required. Differential Revision: https://reviews.llvm.org/D48843 llvm-svn: 336296	2018-07-04 19:26:31 +00:00
Stefan Pintilie	cb4f0c5c07	[PowerPC] Replace the Post RA List Scheduler with the Machine Scheduler We want to run the Machine Scheduler instead of the List Scheduler after RA. Checked with a performance run on a Power 9 machine with SPEC 2006 and while some benchmarks improved and others degraded the geomean was slightly improved with the Machine Scheduler. Differential Revision: https://reviews.llvm.org/D45265 llvm-svn: 336295	2018-07-04 18:54:25 +00:00
Jakub Kuderski	bea19a9493	[Dominators] Add DomTreeUpdater constructor from DT* and PDT* Summary: Previously, if a function accepts an optional DT pointer, ``` void Foo (.., DominatorTree * DT = nullptr) { ... if(DT) DomTreeUpdater(DT, ...).insertEdge(A, B); if(DT){ DomTreeUpdater DTU(DT, ...); ... // Construct the update vector and applyUpdates } ... if(DT){ DomTreeUpdater DTU(DT, ...); ... // Construct the update vector and applyUpdates } } ``` After this patch, it can be simplified as ``` void Foo (.., DominatorTree * DT = nullptr) { DomTreeUpdater DTU(DT, ...); ... DTU.insertEdge(A, B); if(DT){ ... // Construct the update vector and applyUpdates } ... if(DT){ ... // Construct the update vector and applyUpdates } } ``` Patch by Chijun Sima <simachijun@gmail.com>. Reviewers: kuhar, brzycki, dmgreen Reviewed By: kuhar Author: NutshellySima Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D48923 llvm-svn: 336294	2018-07-04 18:37:15 +00:00
Sanjay Patel	9c2e7ceb1a	[InstCombine] allow narrowing of min/max/abs We have bailout hacks based on min/max in various places in instcombine that shouldn't be necessary. The affected test was added for: D48930 ...which is a consequence of the improvement in: D48584 (https://reviews.llvm.org/rL336172) I'm assuming the visitTrunc bailout in this patch was added specifically to avoid a change from SimplifyDemandedBits, so I'm just moving that below the EvaluateInDifferentType optimization. A narrow min/max is still a min/max. llvm-svn: 336293	2018-07-04 17:44:04 +00:00
Roman Lebedev	0dd27042c6	[X86][BtVer2][MCA][NFC] Add CMPEQ dependency-breaking one-idioms tests Summary: As per `Agner's Microarchitecture doc (21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions)`, these, like zero-idioms, are dependency-breaking, although they produce ones and still consume resources. FIXME: as discussed in D48877, llvm-mca handling is broken for these. Reviewers: andreadb Reviewed By: andreadb Subscribers: gbedwell, RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D48876 llvm-svn: 336292	2018-07-04 17:32:44 +00:00
Simon Pilgrim	ae1c4dcc6e	Fix some irregular whitespace/indentation. NFCI. llvm-svn: 336291	2018-07-04 17:24:05 +00:00
Sanjay Patel	ad32d22e13	[InstCombine] add value names to test; NFC That makes it easier to mix and match lines into other tests. llvm-svn: 336289	2018-07-04 16:56:35 +00:00
Volodymyr Turanskyy	17c0c4e742	[ARM] [Assembler] Support negative immediates: cover few missing cases Support for negative immediates was implemented in https://reviews.llvm.org/rL298380, however a few instruction options were missing. This change adds negative immediates support and respective tests for the following: ADD ADDS ADDS.W AND.W ANDS BIC.W BICS BICS.W SUB SUBS SUBS.W Differential Revision: https://reviews.llvm.org/D48649 llvm-svn: 336286	2018-07-04 16:11:15 +00:00
Yvan Roux	eaececf5e0	[MachineOutliner] Fix typo in getOutliningCandidateInfo function name getOutlininingCandidateInfo -> getOutliningCandidateInfo Differential Revision: https://reviews.llvm.org/D48867 llvm-svn: 336285	2018-07-04 15:37:08 +00:00
Paul Semel	d2af4d6f1b	[llvm-objdump] Add --file-headers (-f) option llvm-svn: 336284	2018-07-04 15:25:03 +00:00
Simon Pilgrim	0c3b421b1a	[X86][SSE] Add v16i16 shl x,c -> pmullw test llvm-svn: 336277	2018-07-04 14:20:58 +00:00
Andrew Ng	089303d8ff	[ThinLTO] Update ThinLTO cache file atimes when on Windows ThinLTO cache file access times are used for expiration based pruning and since Vista, file access times are not updated by Windows by default: https://blogs.technet.microsoft.com/filecab/2006/11/07/disabling-last-access-time-in-windows-vista-to-improve-ntfs-performance This means on Windows, cache files are currently being pruned from creation time. This change manually updates cache files that are accessed by ThinLTO, when on Windows. Patch by Owen Reynolds. Differential Revision: https://reviews.llvm.org/D47266 llvm-svn: 336276	2018-07-04 14:17:10 +00:00
Sander de Smalen	1e4dc2e97d	[AArch64][SVE] Asm: Support for reversed subtract (SUBR) instruction. This patch adds both a vector and an immediate form, e.g. - Vector form: subr z0.h, p0/m, z0.h, z1.h subtract active elements of z0 from z1, and store the result in z0. - Immediate form: subr z0.h, z0.h, #255 subtract elements of z0 from #255, and store the result in z0. llvm-svn: 336274	2018-07-04 14:05:33 +00:00
Simon Pilgrim	6a99ce27f6	[X86][SSE] Add SSE2 target to some shift tests Show the difference in behaviour cf SSE41 (no PMULLD, PBLENDW etc.) Raised by D48936 llvm-svn: 336271	2018-07-04 13:58:13 +00:00
Gabor Buella	da4a966e1c	NFC - Various typo fixes in tests llvm-svn: 336268	2018-07-04 13:28:39 +00:00
Sander de Smalen	ab2b0530d9	[AArch64][SVE] Asm: Support for instructions to set/read FFR. Includes instructions to read the First-Faulting Register (FFR): - RDFFR (unpredicated) rdffr p0.b - RDFFR (predicated) rdffr p0.b, p0/z - RDFFRS (predicated, sets condition flags) rdffrs p0.b, p0/z Includes instructions to set/write the FFR: - SETFFR (no arguments, sets the FFR to all true) setffr - WRFFR (unpredicated) wrffr p0.b llvm-svn: 336267	2018-07-04 12:58:46 +00:00
Clement Courbet	e945fad250	[llvm-exegesis] Remove dead comment. llvm-svn: 336266	2018-07-04 12:31:00 +00:00
Sander de Smalen	80283b2af4	[AArch64][SVE] Asm: Support for FP conversion instructions. The variants added are: - fcvt (FP convert precision) - scvtf (signed int -> FP) - ucvtf (unsigned int -> FP) - fcvtzs (FP -> signed int (round to zero)) - fcvtzu (FP -> unsigned int (round to zero)) For example: fcvt z0.h, p0/m, z0.s (single- to half-precision FP) scvtf z0.h, p0/m, z0.s (32-bit int to half-precision FP) ucvtf z0.h, p0/m, z0.s (32-bit unsigned int to half-precision FP) fcvtzs z0.s, p0/m, z0.h (half-precision FP to 32-bit int) fcvtzu z0.s, p0/m, z0.h (half-precision FP to 32-bit unsigned int) llvm-svn: 336265	2018-07-04 12:13:17 +00:00
Max Kazantsev	bee9ca6568	[NFC] Add test that shows that InstCombine can do better llvm-svn: 336258	2018-07-04 10:32:02 +00:00
Anastasis Grammenos	204726b345	[DebugInfo][LoopVectorize] Preserve DL in generated phi instruction When creating `phi` instructions to resume at the scalar part of the loop, copy the DebugLoc from the original phi over to the new one. Differential Revision: https://reviews.llvm.org/D48769 llvm-svn: 336256	2018-07-04 10:16:55 +00:00
Anastasis Grammenos	509d79789f	[DebugInfo][InstCombine] Preserve DI after combining zext When zext is EvaluatedInDifferentType, InstCombine drops the dbg.value intrinsic. This patch tries to preserve said DI, by inserting the zext's old DI in the resulting instruction. (Only for integer type for now) Differential Revision: https://reviews.llvm.org/D48331 llvm-svn: 336254	2018-07-04 09:55:46 +00:00
Simon Pilgrim	c3e1617bf9	[X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values (REAPPLIED) We were only doing this for basic blends, despite shuffle lowering now being good enough to handle more complex blends. This means that the two v8i16 splat shifts are performed in parallel instead of serially as the general shift case. Reapplied with a fixed (extra null tests) version of rL336113 after reversion in rL336189 - extra test case added at rL336247. llvm-svn: 336250	2018-07-04 09:12:48 +00:00
Simon Pilgrim	61fdf3b33c	[X86][SSE] Add reduced crash test case for r336113 - [X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values The patch was reverted at r336189 due to crashes llvm-svn: 336247	2018-07-04 08:55:23 +00:00
Sander de Smalen	e31e6d46dd	[AArch64][SVE] Asm: Support for SVE condition code aliases SVE overloads the AArch64 PSTATE condition flags and introduces a set of condition code aliases for the assembler. The details are described in section 2.2 of the architecture reference manual supplement for SVE. In short: SVE alias => AArch64 name -------------------------- NONE => EQ ANY => NE NLAST => HS LAST => LO FIRST => MI NFRST => PL PMORE => HI PLAST => LS TCONT => GE TSTOP => LT Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48869 llvm-svn: 336245	2018-07-04 08:50:49 +00:00
Max Kazantsev	e8e01143ec	[ImplicitNullChecks] Check for rewrite of register used in 'test' instruction The following code pattern: mov %rax, %rcx test %rax, %rax %rax = .... je throw_npe mov(%rcx), %r9 mov(%rax), %r10 gets transformed into the following incorrect code after implicit null check pass: mov %rax, %rcx %rax = .... faulting_load_op("movl (%rax), %r10", throw_npe) mov(%rcx), %r9 For implicit null check pass, if the register that is checked for null value (i.e., the register used in the 'test' instruction) is written into before the conditional jump, we should avoid doing the optimization. Patch by Surya Kumari Jangala! Differential Revision: https://reviews.llvm.org/D48627 Reviewed By: skatkov llvm-svn: 336241	2018-07-04 08:01:26 +00:00
Fangrui Song	5e4ca9fc9f	[Support] Remove SaveOr which is no longer used llvm-svn: 336237	2018-07-03 23:31:19 +00:00
Jacques Pienaar	0b1e1a4cd4	[lanai] Handle atomic load of i8 like regular load. Loads and stores less than 64-bits are already atomic, this adds support for a special case thereof. This needs to be expanded. llvm-svn: 336236	2018-07-03 22:57:51 +00:00
Fangrui Song	78ab286aa0	[X86][AsmParser] Fix inconsistent declaration parameter name in r336218 llvm-svn: 336232	2018-07-03 21:40:03 +00:00
Benjamin Kramer	2f0dd1405e	[NVPTX] Expand v2f16 INSERT_VECTOR_ELT Vectorization can create them. llvm-svn: 336227	2018-07-03 20:40:04 +00:00
Craig Topper	e317533dcf	[X86] Remove repeated 'the' from multiple comments that have been copy and pasted. NFC llvm-svn: 336226	2018-07-03 20:39:55 +00:00
Roman Lebedev	93357f52c6	[X86] Add tests for low/high bit clearing with different attributes. D48768 may turn some of these into shifts. Reviewers: spatel Reviewed By: spatel Subscribers: spatel, RKSimon, llvm-commits, craig.topper Differential Revision: https://reviews.llvm.org/D48767 llvm-svn: 336224	2018-07-03 19:12:37 +00:00
Fangrui Song	68169343a5	[ARM] Fix inconsistent declaration parameter name in r336195 llvm-svn: 336223	2018-07-03 19:12:27 +00:00
Fangrui Song	bc5c7f2ef0	[AArch64] Make function parameter names in declarations match those of definitions llvm-svn: 336222	2018-07-03 19:07:53 +00:00
Sanjay Patel	181aa26eb8	[InstCombine] add tests for shuffle+binop with constant op1; NFC This adds coverage for a planned enhancement for ConstantExpr::getBinOpIdentity() noted in D48830. llvm-svn: 336220	2018-07-03 18:43:46 +00:00
Craig Topper	adc51ae425	[X86][AsmParser] Rework the in/out (%dx) hack one more time. This patch adds a new token type specifically for (%dx). We will now always create this token when we parse (%dx). After all operands have been parsed, if the mnemonic is in/out we'll morph this token to a regular register token. Otherwise we keep it as the special DX token which won't match any instructions. This removes the need for passing Mnemonic through the parsing functions. It also seems closer to gas where, when it's used on the wrong instruction, it just gets diagnosed as an invalid operand rather than a bad memory address. llvm-svn: 336218	2018-07-03 18:07:30 +00:00
Craig Topper	bc598f0d61	[X86][AsmParser] Don't consider %eip as a valid register outside of 32-bit mode. This might make the error message added in r335668 unneeded, but I'm not sure yet. The check for RIP is technically unnecessary since RIP is in GR64, but that fact is kind of surprising so be explicit. llvm-svn: 336217	2018-07-03 17:40:51 +00:00
Vladimir Stefanovic	beb9d9799f	Fix typo in lib/Support/Path.cpp to test commit access llvm-svn: 336216	2018-07-03 17:26:43 +00:00
Sanjay Patel	8307bc407b	[Constants] add identity constants for fadd/fmul As the test diffs show, the current users of getBinOpIdentity() are InstCombine and Reassociate. SLP vectorizer is a candidate for using this functionality too (D28907). The InstCombine shuffle improvements are part of the planned enhancements noted in D48830. InstCombine actually has several other uses of getBinOpIdentity() via SimplifyUsingDistributiveLaws(), but we don't call that for any FP ops. Fixing that might be another part of removing the custom reassociation in InstCombine that is only done for fadd+fmul. llvm-svn: 336215	2018-07-03 17:12:59 +00:00
Sanjay Patel	2c38b7fd8b	[Reassociate] add tests for binop with identity constant; NFC llvm-svn: 336214	2018-07-03 16:44:18 +00:00
Sanjay Patel	5b4a003088	[Reassociate] regenerate checks; NFC llvm-svn: 336211	2018-07-03 16:01:41 +00:00
Sander de Smalen	128fdfa23f	[AArch64][SVE] Asm: Support for FP Complex ADD/MLA. The variants added in this patch are: - Predicated Complex floating point ADD with rotate, e.g. fcadd z0.h, p0/m, z0.h, z1.h, #90 - Predicated Complex floating point MLA with rotate, e.g. fcmla z0.h, p0/m, z1.h, z2.h, #180 - Unpredicated Complex floating point MLA with rotate (indexed operand), e.g. fcmla z0.h, z1.h, z2.h[0], #180 Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48824 llvm-svn: 336210	2018-07-03 16:01:27 +00:00
Amara Emerson	d912ffaba5	[AArch64][GlobalISel] Fix fallbacks introduced in r336120 due to unselectable stores. r336120 resulted in falling back to SelectionDAG more often due to the G_STORE MMOs not matching the vreg size. This fixes that by explicitly any-extending the value. llvm-svn: 336209	2018-07-03 15:59:26 +00:00
Sanjay Patel	5a6ba018d7	[Reassociate] add test for missing FP constant analysis; NFC llvm-svn: 336208	2018-07-03 15:56:04 +00:00
Teresa Johnson	fc801f5f18	Rename lazy initialization functions to reflect behavior (NFC) Suggested in review for D48698. llvm-svn: 336207	2018-07-03 15:52:57 +00:00
Sander de Smalen	8cd1f53334	[AArch64][SVE] Asm: Support for FMUL (indexed) Unpredicated FP-multiply of SVE vector with a vector-element given by vector[index], for example: fmul z0.s, z1.s, z2.s[0] which performs an unpredicated FP-multiply of all 32-bit elements in 'z1' with the first element from 'z2'. This patch adds restricted register classes for SVE vectors: ZPR_3b (only z0..z7 are allowed) - for indexed vector of 16/32-bit elements. ZPR_4b (only z0..z15 are allowed) - for indexed vector of 64-bit elements. Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48823 llvm-svn: 336205	2018-07-03 15:31:04 +00:00
Sander de Smalen	cbd224941f	[AArch64][SVE] Asm: Support for predicated unary operations. The patch includes support for the following instructions: ABS z0.h, p0/m, z0.h NEG z0.h, p0/m, z0.h (S\|U)XTB z0.h, p0/m, z0.h (S\|U)XTB z0.s, p0/m, z0.s (S\|U)XTB z0.d, p0/m, z0.d (S\|U)XTH z0.s, p0/m, z0.s (S\|U)XTH z0.d, p0/m, z0.d (S\|U)XTW z0.d, p0/m, z0.d llvm-svn: 336204	2018-07-03 14:57:48 +00:00
Simon Pilgrim	74cc4cfa94	[DAGCombiner] visitSDIV - Permit MIN_SIGNED_VALUE in pow2 vector codegen Now that D45806 has landed, we can re-enable support for MIN_SIGNED_VALUE in the sdiv by pow2-constant code llvm-svn: 336198	2018-07-03 14:11:32 +00:00
Sanjay Patel	3074b9e53f	[InstCombine] fold shuffle-with-binop and common value This is the last significant change suggested in PR37806: https://bugs.llvm.org/show_bug.cgi?id=37806#c5 ...though there are several follow-ups noted in the code comments in this patch to complete this transform. It's possible that a binop feeding a select-shuffle has been eliminated by earlier transforms (or the code was just written like this in the 1st place), so we'll fail to match the patterns that have 2 binops from: D48401, D48678, D48662, D48485. In that case, we can try to materialize identity constants for the remaining binop to fill in the "ghost" lanes of the vector (where we just want to pass through the original values of the source operand). I added comments to ConstantExpr::getBinOpIdentity() to show planned follow-ups. For now, we only handle the 5 commutative integer binops (add/mul/and/or/xor). Differential Revision: https://reviews.llvm.org/D48830 llvm-svn: 336196	2018-07-03 13:44:22 +00:00
Sam Parker	ffc1681620	[ARM][NFC] Refactor sequential access for DSP With a view to support parallel operations that have their results stored to memory, refactor the consecutive access helper out so it could support store instructions. Differential Revision: https://reviews.llvm.org/D48872 llvm-svn: 336195	2018-07-03 12:44:16 +00:00
Bjorn Pettersson	aa02580935	[IR] Strip trailing whitespace. NFC llvm-svn: 336194	2018-07-03 12:39:52 +00:00
Sjoerd Meijer	173b7f0ec7	[AArch64] Armv8.4-A: system registers This adds the following system registers: - RAS registers, - MPAM registers, - Activity monitor registers, - Trace Extension registers, - Timing insensitivity of data processing instructions, - Enhanced Support for Nested Virtualization. Differential Revision: https://reviews.llvm.org/D48871 llvm-svn: 336193	2018-07-03 12:09:20 +00:00
Hans Wennborg	2ac1205162	build_llvm_package.bat: Re-try the build steps The build on Windows has been extra flaky recently; retrying helps. llvm-svn: 336192	2018-07-03 11:30:01 +00:00
Bjorn Pettersson	8dd6cf711f	[DebugInfo] Corrections for salvageDebugInfo Summary: When salvaging a dbg.declare/dbg.addr we should not add DW_OP_stack_value to the DIExpression (see test/Transforms/InstCombine/salvage-dbg-declare.ll). Consider this example %vla = alloca i32, i64 2 call void @llvm.dbg.declare(metadata i32* %vla, metadata !1, metadata !DIExpression()) Instcombine will turn it into %vla1 = alloca [2 x i32] %vla1.sub = getelementptr inbounds [2 x i32], [2 x i32]* %vla1, i64 0, i64 0 call void @llvm.dbg.declare(metadata [2 x i32]* %vla1.sub, metadata !19, metadata !DIExpression()) If the GEP can be eliminated, then the dbg.declare will be salvaged and we should get %vla1 = alloca [2 x i32] call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression()) The problem was that salvageDebugInfo did not recognize dbg.declare as being indirect (%vla1 points to the value, it does not hold the value), so we incorrectly got call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression(DW_OP_stack_value)) I also made sure that llvm::salvageDebugInfo and DIExpression::prependOpcodes do not add DW_OP_stack_value to the DIExpression in case no new operands are added to the DIExpression. That way we avoid unnecessarily turning a register location expression into an implicit location expression in some situations (see test11 in test/Transforms/LICM/sinking.ll). Reviewers: aprantl, vsk Reviewed By: aprantl, vsk Subscribers: JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D48837 llvm-svn: 336191	2018-07-03 11:29:00 +00:00
Benjamin Kramer	fd171f2f89	Revert "[X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values" This reverts commit r336113. It causes crashes. llvm-svn: 336189	2018-07-03 11:15:17 +00:00
John Brawn	b371ccc661	[llvm-exegesis] Adjust AArch64 unit test The signature of setRegToConstant changed in r336171, so adjust the AArch64 unit test in a similar way to how the X86 unit test was changed in that commit. llvm-svn: 336188	2018-07-03 10:52:20 +00:00
John Brawn	c4ed60042f	[llvm-exegesis] Add an AArch64 target The target does just enough to be able to run llvm-exegesis in latency mode for at least some opcodes. Differential Revision: https://reviews.llvm.org/D48780 llvm-svn: 336187	2018-07-03 10:10:29 +00:00
Sander de Smalen	7fc8543208	[AArch64][SVE] Asm: Support for saturating ADD/SUB instructions. The variants added are: signed Saturating ADD/SUB (immediate) e.g. sqadd z0.h, z0.h, #42 unsigned Saturating ADD/SUB (immediate) e.g. uqadd z0.h, z0.h, #42 signed Saturating ADD/SUB (vectors) e.g. sqadd z0.h, z0.h, z1.h unsigned Saturating ADD/SUB (vectors) e.g. uqadd z0.h, z0.h, z1.h llvm-svn: 336186	2018-07-03 09:48:22 +00:00
Petar Jovanovic	226e6117ae	[MIPS GlobalISel] Lower arguments using stack Lower more than 4 arguments using stack. This patch targets MIPS32. It supports only functions with arguments of type i32. Patch by Petar Avramovic. Differential Revision: https://reviews.llvm.org/D47934 llvm-svn: 336185	2018-07-03 09:31:48 +00:00
Chandler Carruth	3897ded691	[PM/LoopUnswitch] Fix PR37651 by correctly invalidating SCEV when unswitching loops. Original patch trying to address this was sent in D47624, but that didn't quite handle things correctly. There are two key principles used to select whether and how to invalidate SCEV-cached information about loops: 1) We must invalidate any info SCEV has cached before unswitching as we may change (or destroy) the loop structure by the act of unswitching, and make it hard to recover everything we want to invalidate within SCEV. 2) We need to invalidate all of the loops whose CFGs are mutated by the unswitching. Notably, this isn't the entire loop nest, this is every loop contained by the outermost loop reached by an exit block relevant to the unswitch. And we need to do this even when doing trivial unswitching. I've added more focused tests that directly check that SCEV starts off with imprecise information and after unswitching (and simplifying instructions) re-querying SCEV will produce precise information. These tests also specifically work to check that an outer loop's information becomes precise. However, the testing here is still a bit imperfect. Crafting test cases that reliably fail to be analyzed by SCEV before unswitching and succeed afterward proved ... very, very hard. It took me several hours and careful work to build these, and I'm not optimistic about necessarily coming up with more to cover more elaborate possibilities. Fortunately, the code pattern we are testing here in the pass is really straightforward and reliable. Thanks to Max Kazantsev for the initial work on this as well as the review, and to Hal Finkel for helping me talk through approaches to test this stuff even if it didn't come to much. Differential Revision: https://reviews.llvm.org/D47624 llvm-svn: 336183	2018-07-03 09:13:27 +00:00
Sander de Smalen	8fcc3f5feb	[AArch64][SVE] Asm: Support for vector element FP compare. Contains the following variants: - Compare with (elements from) other vector instructions: fcmeq, fcmgt, fcmge, fcmne, fcmuo. aliases: fcmle, fcmlt. e.g. fcmle p0.h, p0/z, z0.h, z1.h => fcmge p0.h, p0/z, z1.h, z0.h - Compare absolute values with (absolute values from) other vector. instructions: facge, facgt. aliases: facle, faclt. e.g. facle p0.h, p0/z, z0.h, z1.h => facge p0.h, p0/z, z1.h, z0.h - Compare vector elements with #0.0 instructions: fcmeq, fcmgt, fcmge, fcmle, fcmlt, fcmne. e.g. fcmle p0.h, p0/z, z0.h, #0.0 llvm-svn: 336182	2018-07-03 09:07:23 +00:00
Chandler Carruth	7f3feec6bb	[ADT] Disable the single callback optimization on Windows. It appears that the function pointer we use there isn't reliably 4-byte aligned. I have no idea why or how we could correct this, so for now we just regress the Windows performance some. Someone with access to Windows could try working on a fix. At the very least we could use a double indirection rather than a table, but maybe there is some way to fully restore this optimization. I don't want to play too much with this when I don't have access to the platform and this at least should restore the last bots. llvm-svn: 336178	2018-07-03 08:19:10 +00:00
Shiva Chen	a0a52bf195	[DebugInfo] Fix PR37395. DbgLabelInst has no address as its operands. Differential Revision: https://reviews.llvm.org/D46738 Patch by Hsiangkai Wang. llvm-svn: 336176	2018-07-03 07:56:04 +00:00
Chandler Carruth	9e0108d90c	[Support] This sanity check in the test only works with certain versions of libstdc++, not just certain versions of GCC. The original macros broke when using Clang + libstdc++4.9 sadly. Sadly, testing for versions of libstdc++ has been extremely problematic in the past, so I'm just narrowing this down to Windows and when using libc++ as that seems at least very unlikely to keep build bots broken. llvm-svn: 336174	2018-07-03 07:51:01 +00:00
Max Kazantsev	3097b76e8c	[InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done This patch changes the order of transforms in InstCombineCompares to avoid performing range-based transforms, which produce complex bit arithmetic, before simpler things (like folding with constants) are done. See PR37636 for the motivating example. Differential Revision: https://reviews.llvm.org/D48584 Reviewed By: spatel, lebedev.ri llvm-svn: 336172	2018-07-03 06:23:57 +00:00
Clement Courbet	e785169fce	[llvm-exegesis] ExegesisX86Target::setRegToConstant() should depend on the subtarget features. Summary: This fixes PR38008. Reviewers: gchatelet, RKSimon Subscribers: tschuett, craig.topper, llvm-commits Differential Revision: https://reviews.llvm.org/D48820 llvm-svn: 336171	2018-07-03 06:17:05 +00:00
Chandler Carruth	83e5f81d26	[ADT] Try to work around a crash in MSVC. Putting `sizeof(T) <= 16` into the parameter of a `std::conditional` causes every version of MSVC I've tried to crash: https://godbolt.org/g/eqVULL Really frustrating, but an extra layer of indirection through an instantiated type gives a working way to access this computed constant. llvm-svn: 336170	2018-07-03 05:46:20 +00:00
Craig Topper	6121699b11	[X86] Add avx512vl command line to break-false-dep.ll llvm-svn: 336169	2018-07-03 04:43:49 +00:00
Chandler Carruth	dc62c17dd6	[ADT] Switch another place to `llvm::is_trivially_move_constructible`. I missed this the first time around, sorry. llvm-svn: 336166	2018-07-03 04:07:26 +00:00
Jakub Kuderski	5e3ab7a940	Reapply "[Dominators] Add the DomTreeUpdater class" Summary: This patch is the first in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html \| RFC - A new dominator tree updater for LLVM ]]. This patch introduces the DomTreeUpdater class, which provides a cleaner API to perform updates on available dominator trees (none, only DomTree, only PostDomTree, both) using different update strategies (eagerly or lazily) to simplify the updating process. —Prior to the patch— - Directly calling update functions of DominatorTree updates the data structure eagerly while DeferredDominance does updates lazily. - DeferredDominance class cannot be used when a PostDominatorTree also needs to be updated. - Functions receiving DT/DDT need to branch a lot which is currently necessary. - Functions using both DomTree and PostDomTree need to call the update function separately on both trees. - People need to construct an additional DeferredDominance class to use functions only receiving DDT. —After the patch— Patch by Chijun Sima <simachijun@gmail.com>. Reviewers: kuhar, brzycki, dmgreen, grosser, davide Reviewed By: kuhar, brzycki Author: NutshellySima Subscribers: vsk, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D48383 llvm-svn: 336163	2018-07-03 02:06:23 +00:00
Erik Pilkington	988a16af92	Revert r336159, r336157. Some bots failed on qualified std::max_align_t, and others on unqualified max_align_t. I'll take another stab at this tomorrow. Any ideas for fixing this would be appreciated! http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/23071/steps/build_Lld/logs/stdio http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/11185/steps/build-stage1-compiler/logs/stdio llvm-svn: 336162	2018-07-03 01:30:53 +00:00
Chandler Carruth	f814ad8932	[Support] Fix llvm::unique_function when building with GCC 4.9 by introducing llvm::trivially_{copy,move}_constructible type traits. This uses a completely portable implementation of these traits provided by Richard Smith. You can see it on compiler explorer in all its glory: https://godbolt.org/g/QEDZjW I have transcribed it, clang-formatted it, added some comments, and made the tests fit into a unittest file. I have also switched llvm::unique_function over to use these new, much more portable traits. =D Hopefully this will fix the build bot breakage from my prior commit. llvm-svn: 336161	2018-07-03 01:18:21 +00:00
Teresa Johnson	f8182f1aef	[ThinLTO] Fix printing of aliases for distributed backend indexes Summary: When we import an alias (which will import a copy of the aliasee), but aren't going to import the aliasee directly, the distributed backend index will not contain the aliasee summary. Handle this in the summary assembly printer by printing "null" as the aliasee. Reviewers: davidxl, dexonsmith Subscribers: mehdi_amini, inglorion, eraman, steven_wu, llvm-commits Differential Revision: https://reviews.llvm.org/D48699 llvm-svn: 336160	2018-07-03 01:11:43 +00:00
Erik Pilkington	0409a8ade7	Some buildbots were choking on std::max_align_t, try using the global alias. llvm-svn: 336159	2018-07-03 00:48:27 +00:00
Erik Pilkington	d26ace3955	[demangler] Fix a MSVC alignment warning. This should fix llvm.org/PR37944 llvm-svn: 336157	2018-07-03 00:23:18 +00:00
Chandler Carruth	aa60b3fd87	[ADT] Add llvm::unique_function which is like std::function but supporting move-only closures. Most of the core optimizations for std::function are here plus a potentially novel one that detects trivially movable and destroyable functors and implements those with fewer indirections. This is especially useful as we start trying to add concurrency primitives as those often end up with move-only types (futures, promises, etc) and wanting them to work through lambdas. As further work, we could add better support for things like const-qualified operator()s to support more algorithms, and r-value ref qualified operator()s to model call-once. None of that is here though. We can also provide our own llvm::function that has some of the optimizations used in this class, but with copy semantics instead of move semantics. This is motivated by increasing usage of things like executors and the task queue where it is useful to embed move-only types like a std::promise within a type erased function. That isn't possible without this version of a type erased function. Differential Revision: https://reviews.llvm.org/D48349 llvm-svn: 336156	2018-07-02 23:57:29 +00:00
Teresa Johnson	50615c72b4	Remove absolute path in test My test change in r336148 accidentally included an absolute path, clean that up to fix bot failures. llvm-svn: 336151	2018-07-02 23:02:07 +00:00
Lang Hames	adae9bfa24	[ORC] Verify modules when running LLLazyJIT in LLI, and deal with fallout. The verifier identified several modules that were broken due to incorrect linkage on declarations. To fix this, CompileOnDemandLayer2::extractFunction has been updated to change decls to external linkage. llvm-svn: 336150	2018-07-02 22:30:18 +00:00
Teresa Johnson	8fc766681d	[ThinLTO] Fix printing of module paths for distributed backend indexes Summary: In the individual index files emitted for distributed ThinLTO backends, the module path ids are not contiguous. Assign slots to module paths in order to handle this better and also to get contiguous numbering in the summary assembly. Reviewers: davidxl, dexonsmith Subscribers: mehdi_amini, inglorion, eraman, llvm-commits, steven_wu Differential Revision: https://reviews.llvm.org/D48698 llvm-svn: 336148	2018-07-02 22:09:23 +00:00
Heejin Ahn	402b490843	[WebAssembly] Support for atomic stores Summary: Add support for atomic store instructions. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D48839 llvm-svn: 336145	2018-07-02 21:22:59 +00:00
Vadzim Dambrouski	fd10286e04	[ARM] Fix PR37382: Don't optimize mul.with.overflow on thumbv6m. Reviewers: efriedma, rogfer01, javed.absar Reviewed By: efriedma, rogfer01 Subscribers: kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D48846 llvm-svn: 336144	2018-07-02 21:05:26 +00:00
Andrea Di Biagio	9b3cb081f3	[llvm-mca] Clear the content of map VariantDescriptors in InstrBuilder before we start analyzing a new CodeBlock. NFCI. Different CodeBlocks don't overlap. The same MCInst cannot appear in more than one code block because all blocks are instantiated before the simulation is run. We should always clear the content of map VariantDescriptors before every simulation, since VariantDescriptors cannot possibly store useful information for the next blocks. It is also "safer" to clear its content because `MCInst*` is used as the key type for map VariantDescriptors. llvm-svn: 336142	2018-07-02 20:39:57 +00:00
Tim Shen	c7cef4bcc4	[SCEV] Strengthen StrengthenNoWrapFlags (reapply r334428). Summary: Comment on Transforms/LoopVersioning/incorrect-phi.ll: With the change SCEV is able to prove that the loop doesn't wrap-self (due to zext i16 to i64), disabling the entire loop versioning pass. Removed the zext and just use i64. Reviewers: sanjoy Subscribers: jlebar, hiraditya, javed.absar, bixia, llvm-commits Differential Revision: https://reviews.llvm.org/D48409 llvm-svn: 336140	2018-07-02 20:01:54 +00:00
Dan Gohman	b01d87622b	[WebAssembly] Fix fast-isel optimization of branch conditions. LLVM doesn't guarantee anything about the high bits of a register holding an i1 value at the IR level, so don't translate LLVM IR i1 values directly into WebAssembly conditional branch operands. WebAssembly's conditional branches do demand all 32 bits be valid. Fixes PR38019. llvm-svn: 336138	2018-07-02 19:45:57 +00:00
Krzysztof Parzyszek	fd97494984	[X86] Add phony registers for high halves of regs with low halves Add registers still missing after r328016 (D43353): - for bits 15-8 of SI, DI, BP, SP (H), and R8-R15 (BH), - for bits 31-16 of R8-R15 (*WH). Thanks to Craig Topper for pointing it out. llvm-svn: 336134	2018-07-02 19:05:09 +00:00
Alina Sbirlea	0e15501fa7	Replace "Replacable" with "Replaceable". [NFC] llvm-svn: 336133	2018-07-02 18:53:40 +00:00
Fangrui Song	f50ad6c311	Replace unused output filenames with /dev/null in tests Similar to rLLD336129 llvm-svn: 336131	2018-07-02 18:16:44 +00:00
Farhana Aleen	3b416db19b	[SLP] Recognize min/max pattern using instructions producing same values. Summary: It is common to have the following min/max pattern during the intermediate stages of SLP since we only optimize at the end. This patch tries to catch such patterns and allow more vectorization. %1 = extractelement <2 x i32> %a, i32 0 %2 = extractelement <2 x i32> %a, i32 1 %cond = icmp sgt i32 %1, %2 %3 = extractelement <2 x i32> %a, i32 0 %4 = extractelement <2 x i32> %a, i32 1 %select = select i1 %cond, i32 %3, i32 %4 Author: FarhanaAleen Reviewed By: ABataev, RKSimon, spatel Differential Revision: https://reviews.llvm.org/D47608 llvm-svn: 336130	2018-07-02 17:55:31 +00:00
Sanjay Patel	b999d74132	[InstCombine] reverse canonicalization of add --> or to allow more shuffle folding This extends D48485 to allow another pair of binops (add/or) to be combined either with or without a leading shuffle: or X, C --> add X, C (when X and C have no common bits set) Here, we need value tracking to determine that the 'or' can be reversed into an 'add', and we've added general infrastructure to allow extending to other opcodes or moving to where other passes could use that functionality. Differential Revision: https://reviews.llvm.org/D48662 llvm-svn: 336128	2018-07-02 17:42:29 +00:00
Francis Visoiu Mistrih	4d5b1073ba	[MC] Error on a .zerofill directive in a non-virtual section On darwin, all virtual sections have zerofill type, and having a .zerofill directive in a non-virtual section is not allowed. Instead of asserting, show a nicer error. In order to use the equivalent of .zerofill in a non-virtual section, the usage of .zero of .space is required. This patch replaces the assert with an error. Differential Revision: https://reviews.llvm.org/D48517 llvm-svn: 336127	2018-07-02 17:29:43 +00:00
Dave Lee	d4f77a523b	nm: Add -no-weak flag for hiding weak symbols Summary: This adds a new -no-weak flag to nm to hide weak symbols in its output. This also adds a -W alias for this which is analogous to -U. Patch by Keith Smiley Reviewers: kastiglione, enderby, compnerd Reviewed By: kastiglione Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D48751 llvm-svn: 336126	2018-07-02 17:24:37 +00:00
Simon Pilgrim	35f196c179	[SLPVectorizer][X86] Begin adding alternate tests for call operators Alternate opcode handling only supports binary operators, these tests demonstrate a missed opportunity to vectorize ceil/floor calls llvm-svn: 336125	2018-07-02 17:23:45 +00:00
Vedant Kumar	9b6c096fb5	Tighten up a test for -check-debugify, NFC Use an -implicit-check-not to make sure an error which should not occur in fact does not occur before the first CHECK line. Suggested by Paul Robinson in post-commit feedback for r335897. llvm-svn: 336123	2018-07-02 17:08:36 +00:00
Simon Pilgrim	ac193d4b5c	[CostModel][X86] Add cost tests for fp rounding intrinsics Add cost tests for fp ceil, floor, nearbyint, rint and trunc. llvm-svn: 336122	2018-07-02 17:07:01 +00:00
Craig Topper	56440b9745	[X86] Don't use aligned load/store instructions for fp128 if the load/store isn't aligned. Similarily, don't fold fp128 loads into SSE instructions if the load isn't aligned. Unless we're targeting an AMD CPU that doesn't check alignment on arithmetic instructions. Should fix PR38001 llvm-svn: 336121	2018-07-02 17:01:54 +00:00
Amara Emerson	846f2436e8	[AArch64][GlobalISel] Any-extend vararg parameters to stack slot size on Darwin. We currently don't any-extend vararg parameters before storing them to the stack locations on Darwin. However, SelectionDAG however does this, and so user code is in the wild which inadvertently relies on this extension. This can manifest in cases where the value stored is (int)0, but the actual parameter is interpreted by va_arg as a pointer, and so not extending to 64 bits causes the callee to load additional undefined bits. llvm-svn: 336120	2018-07-02 16:39:09 +00:00
Jakub Kuderski	198f3b16dc	Revert "[Dominators] Add the DomTreeUpdater class" Temporary revert because of a failing test on some buildbots. This reverts commit r336114. llvm-svn: 336117	2018-07-02 16:10:49 +00:00
Sam Clegg	7fecdef5b2	[WebAssembly] Convert remaining tests from elf to wasm output format Differential Revision: https://reviews.llvm.org/D48748 llvm-svn: 336116	2018-07-02 16:03:49 +00:00
Sjoerd Meijer	b0004b834b	Follow up of r335953 - [ARM][AArch64] Armv8.4-A Enablement Imply dotprod for armv8.4-a, because it is mandatory from v8.4. llvm-svn: 336115	2018-07-02 15:38:37 +00:00
Jakub Kuderski	e813a9b380	[Dominators] Add the DomTreeUpdater class Summary: This patch is the first in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html \| RFC - A new dominator tree updater for LLVM ]]. This patch introduces the DomTreeUpdater class, which provides a cleaner API to perform updates on available dominator trees (none, only DomTree, only PostDomTree, both) using different update strategies (eagerly or lazily) to simplify the updating process. —Prior to the patch— - Directly calling update functions of DominatorTree updates the data structure eagerly while DeferredDominance does updates lazily. - DeferredDominance class cannot be used when a PostDominatorTree also needs to be updated. - Functions receiving DT/DDT need to branch a lot which is currently necessary. - Functions using both DomTree and PostDomTree need to call the update function separately on both trees. - People need to construct an additional DeferredDominance class to use functions only receiving DDT. —After the patch— Patch by Chijun Sima <simachijun@gmail.com>. Reviewers: kuhar, brzycki, dmgreen, grosser, davide Reviewed By: kuhar, brzycki Subscribers: vsk, mgorny, llvm-commits Author: NutshellySima Differential Revision: https://reviews.llvm.org/D48383 llvm-svn: 336114	2018-07-02 15:37:41 +00:00
Simon Pilgrim	2bc8e079f2	[X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values We were only doing this for basic blends, despite shuffle lowering now being good enough to handle more complex blends. This means that the two v8i16 splat shifts are performed in parallel instead of serially as the general shift case. llvm-svn: 336113	2018-07-02 15:14:07 +00:00
Simon Pilgrim	a6be2437e7	[X86][SSE] Add v8i16 shift test for 2 shift values that doesn't match basic blend We have special case support for 2 shift values for basic blends, but irregular shift patterns end up using the generic lowering, despite shuffle lowering being good enough to handle more complex blends. llvm-svn: 336112	2018-07-02 14:53:41 +00:00
Sanjay Patel	284ba0c18f	[ValueTracking] allow undef elements when matching vector abs llvm-svn: 336111	2018-07-02 14:43:40 +00:00
Yaron Keren	d414c6c131	Disable failing test on x86_64-pc-windows-gnu, see PR38006. llvm-svn: 336110	2018-07-02 14:39:32 +00:00
David Stenberg	23bba56fce	[CodeGen] Make block removal order deterministic in CodeGenPrepare Summary: Replace use of a SmallPtrSet with a SmallSetVector to make the worklist iteration order deterministic. This is done as the order the blocks are removed may affect whether or not PHI nodes in successor blocks are removed. For example, consider the following case where %bb1 and %bb2 are removed: bb1: br i1 undef, label %bb3, label %bb4 bb2: br i1 undef, label %bb4, label %bb3 bb3: pv1 = phi type [ undef, %bb1 ], [ undef, %bb2], [ v0, %other ] br label %bb4 bb4: pv2 = phi type [ undef, %bb1 ], [ undef, %bb2 ], [ pv1, %bb3 ], [ v0, %other ] If %bb2 is removed before %bb1, the incoming values from %bb1 and %bb2 to pv1 will be removed before %bb1 is removed as a predecessor to %bb4. The pv1 node will thus be optimized out (to v0) at the time %bb1 is removed as a predecessor to %bb4, leaving the blocks as following when the incoming value from %bb1 has been removed: bb3: ; pv1 optimized out, incoming value to pv2 is v0 br label %bb4 bb4: pv2 = phi type [ v0, %bb3 ], [ v0, %other ] The pv2 PHI node will be optimized away by removePredecessor() as all incoming values are identical. In case %bb2 is removed after %bb1, pv1 will not be optimized out at the time %bb2 is removed as a predecessor to %bb4, leaving the blocks as following when the incoming value from %bb2 to pv2 has been removed: bb3: pv1 = phi type [ undef, %bb2 ], [ v0, %other ] br label %bb4 bb4: pv2 = phi type [ pv1, %bb3 ], [ v0, %other ] The pv2 PHI node will thus not be removed in this case, ultimately leading to the following output bb3: ; pv1 optimized out, incoming value to pv2 is v0 br label %bb4 bb4: pv2 = phi type [ v0, %bb3 ], [ v0, %other ] I have not looked into changing DeleteDeadBlock() so that the redundant PHI nodes are removed. I have not added a test case, as I was not able to create a particularly small and (not messy) reproducer. This is likely due to SmallPtrSet behaving deterministically when in small mode. Reviewers: void, dexonsmith, spatel, skatkov, fhahn, bkramer, nhaehnle Reviewed By: fhahn Subscribers: mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D48369 llvm-svn: 336109	2018-07-02 14:23:48 +00:00
Alex Bradbury	07ef10ccb6	[X86] Fix test/MC/AsmParser/exprs-invalid.s after rL336104 This was my mistake for only running test/MC/X86 and test/CodeGen/X86. Arguably .word should be removed from this test, as it is not supported universally. llvm-svn: 336107	2018-07-02 14:13:27 +00:00
John Brawn	346856dc6c	[llvm-exegesis] Change how the native architecture is determined Currently the llvm-exegesis native architecture is determined by comparing the llvm native architecture with X86, so to add a new target would mean adding a new check. Change this to building up a list of the targets llvm-exegesis supports then using that, as this means that when adding a new target you just add the target to the list of supported targets. Differential Revision: https://reviews.llvm.org/D48778 llvm-svn: 336105	2018-07-02 13:53:46 +00:00
Alex Bradbury	c48908781d	[X86] Use addAliasForDirective to support the .word directive (reland) The X86 asm parser currently has custom parsing logic for .word. Rather than use this custom logic, we can just use addAliasForDirective to enable the reuse of AsmParser::parseDirectiveValue. See also similar changes to Sparc (rL333078), AArch64 (rL333077), and Hexagon (rL332607) backends. Differential Revision: https://reviews.llvm.org/D47004 This is a fixed reland of rL336100. This should have been caught in pre-commit testing so apologies for the noise. llvm-svn: 336104	2018-07-02 13:49:52 +00:00
Alex Bradbury	c000e4dcb5	Revert r336100 This was a bad change. .word == 2byte on x86. llvm-svn: 336103	2018-07-02 13:43:45 +00:00
Simon Pilgrim	d5fb50e3bf	[SLPVectorizer] Remove nullptr early-outs from Instruction::ShuffleVector getEntryCost This code is only used by alternate opcodes so the InstructionsState has already confirmed that every Value is an Instruction, plus we use cast<Instruction> which will assert on failure. llvm-svn: 336102	2018-07-02 13:41:29 +00:00
Sanjay Patel	951f617e16	[InstCombine] adjust shuffle tests with IR flags; NFC Due to current limitations in constant analysis, we need flags on add or mul to show propagation for the potential transform suggested in these tests (no other binops currently report identity constants). llvm-svn: 336101	2018-07-02 13:40:54 +00:00
Alex Bradbury	42485ec9ca	[X86] Use addAliasForDirective to support the .word directive The X86 asm parser currently has custom parsing logic for .word. Rather than use this custom logic, we can just use addAliasForDirective to enable the reuse of AsmParser::parseDirectiveValue. See also similar changes to Sparc (rL333078), AArch64 (rL333077), and Hexagon (rL332607) backends. Differential Revision: https://reviews.llvm.org/D47004 llvm-svn: 336100	2018-07-02 13:37:15 +00:00
John Brawn	8fc5ec78d5	[llvm-exegesis] Delegate the decision of cycle counter name to the target Currently the cycle counter is taken from the subtarget schedule model, which isn't any use if the subtarget doesn't have one. Delegate the decision to the target benchmark runner, as it may know better what to do in that case, with the default being the current behaviour. Differential Revision: https://reviews.llvm.org/D48779 llvm-svn: 336099	2018-07-02 13:14:49 +00:00
Florian Hahn	4ebba909a2	Recommit r328307: [IPSCCP] Use constant range information for comparisons of parameters. This version contains a fix to add values for which the state in ParamState change to the worklist if the state in ValueState did not change. To avoid adding the same value multiple times, mergeInValue returns true, if it added the value to the worklist. The value is added to the worklist depending on its state in ValueState. Original message: For comparisons with parameters, we can use the ParamState lattice elements which also provide constant range information. This improves the code for PR33253 further and gets us closer to use ValueLatticeElement for all values. Also, as we are using the range information in the solver directly, we do not need tryToReplaceWithConstantRange afterwards anymore. Reviewers: dberlin, mssimpso, davide, efriedma Reviewed By: mssimpso Differential Revision: https://reviews.llvm.org/D43762 llvm-svn: 336098	2018-07-02 12:44:04 +00:00
Sanjay Patel	d980084597	[InstCombine] add tests for shuffle-binop; NFC This is another pattern mentioned in PR37806. llvm-svn: 336096	2018-07-02 12:30:46 +00:00
Simon Pilgrim	265793d52a	[SLPVectorizer] Fix alternate opcode + shuffle cost function to correct handle SK_Select patterns. We were always using the opcodes of the first 2 scalars for the costs of the alternate opcode + shuffle. This made sense when we used SK_Alternate and opcodes were guaranteed to be alternating, but this fails for the more general SK_Select case. This fix exposes an issue demonstrated by the fmul_fdiv_v4f32_const test - the SLM model has v4f32 fdiv costs which are more than twice those of the f32 scalar cost, meaning that the cost model determines that the vectorization is not performant. Unfortunately it completely ignores the fact that the fdiv by a constant will be changed into a fmul by InstCombine for a much lower cost vectorization. But at least we're seeing this now... llvm-svn: 336095	2018-07-02 11:28:01 +00:00
Simon Pilgrim	409bd5f487	[SLPVectorizer] Only Alternate opcodes use ShuffleVector cases for getEntryCost/vectorizeTree. NFCI. Add assertions - we're already assuming this in how we use the AltOpcode and treat everything as BinaryOperators. llvm-svn: 336092	2018-07-02 10:54:19 +00:00
Sander de Smalen	8d4c01a702	[AArch64][SVE] Asm: Support for (SQ)INCP/DECP (scalar, vector) Increments/decrements the result with the number of active bits from the predicate. The inc/dec variants added are: - incp x0, p0.h (scalar) - incp z0.h, p0 (vector) The unsigned saturating inc/dec variants added are: - uqincp x0, p0.h (scalar) - uqincp w0, p0.h (scalar, 32bit) - uqincp z0.h, p0 (vector) The signed saturating inc/dec variants added are: - sqincp x0, p0.h (scalar) - sqincp x0, p0.h, w0 (scalar, 32bit) - sqincp z0.h, p0 (vector) llvm-svn: 336091	2018-07-02 10:08:36 +00:00
Sander de Smalen	c504101781	[AArch64][SVE] Asm: Support for (saturating) vector INC/DEC instructions. Increment/decrement vector by multiple of predicate constraint element count. The variants added by this patch are: - INCH, INCW, INC and (saturating): - SQINCH, SQINCW, SQINCD - UQINCH, UQINCW, UQINCW - SQDECH, SQINCW, SQINCD - UQDECH, UQINCW, UQINCW For example: incw z0.s, all, mul #4 llvm-svn: 336090	2018-07-02 09:31:11 +00:00
Simon Pilgrim	e389434a8a	[X86][BtVer2] Added Jaguar FPU Pipe0/1 uop counters to permit basic llvm-exegesis uop testing We don't have PMCs to cover many of the Jaguar resources but we can at least monitor the FPU issue pipes which give an indication of the fpu uop count, just not the execution resources. llvm-svn: 336089	2018-07-02 09:15:01 +00:00
Petar Jovanovic	3af2c992dc	[Mips][FastISel] Do not duplicate condition while lowering branches This change fixes the issue that arises when we duplicate condition from the predecessor block. If the condition's arguments are not considered alive across the blocks, fast regalloc gets confused and starts generating reloads from the slots that have never been spilled to. This change also leads to smaller code given that, unlike on architectures with condition codes, on Mips we can branch directly on register value, thus we gain nothing by duplication. Patch by Dragan Mladjenovic. Differential Revision: https://reviews.llvm.org/D48642 llvm-svn: 336084	2018-07-02 08:56:57 +00:00
Sander de Smalen	8eea4f1c7d	[AArch64][SVE] Asm: Support for vector element compares (immediate). Compare vector elements with a signed/unsigned immediate, e.g. cmpgt p0.s, p0/z, z0.s, #-16 cmphi p0.s, p0/z, z0.s, #127 llvm-svn: 336081	2018-07-02 08:20:59 +00:00
Sander de Smalen	0325e304b9	Reapply r334980 and r334983. These patches were previously reverted as they led to buildbot time-outs caused by large switch statement in printAliasInstr when using UBSan and O3. The issue has been addressed with a workaround (r335525). llvm-svn: 336079	2018-07-02 07:34:52 +00:00
Max Kazantsev	66da390506	[NFC] Test that shows unprofitability of instcombine with bit ranges llvm-svn: 336078	2018-07-02 06:55:00 +00:00
Craig Topper	e06dabd3ca	[X86] Put some cases in switch statements back on one line to be more compact and make it easier to see the similarities. NFC It looks like someone ran clang-format over this entire file which reformatted these switches into a multiline form. But I think the single line form is more useful here. llvm-svn: 336077	2018-07-02 06:42:42 +00:00
Clement Courbet	a53349251c	[llvm-exegesis][NFC] Cleanup useless braces. llvm-svn: 336076	2018-07-02 06:39:55 +00:00
Craig Topper	0661f67296	[X86] Remove FMA3Info DenseMap. Break into sorted tables that we can binary search. I separated out the rounding and broadcast groups into their own tables because it made the ordering in the main table easier. Further splitting of the tables might make it possible to directly index using bits from the TSFlags, but its probably not worth it right now. llvm-svn: 336075	2018-07-02 06:23:39 +00:00
QingShan Zhang	3b2aa2b4b4	[PowerPC] Don't make it as pre-inc candidate if displacement isn't 4's multiple for i64 pre-inc load/store For the below case, pre-inc prep think it's a good candidate to use pre-inc for the bucket, but 64bit integer load/store update (pre-inc) instruction on Power requires the displacement field should be DS-form (4's multiple). Since it can't satisfy the constraint, we have to do some fix ups later. As below, the original load/stores could be well-form, it makes things worse. unsigned long long result = 0; unsigned long long foo(char p, unsigned long long n) { for (unsigned long long i = 0; i < n; i++) { unsigned long long x1 = (unsigned long long )(p - 50000 + i); unsigned long long x2 = (unsigned long long )(p - 61024 + i); unsigned long long x3 = (unsigned long long )(p - 62048 + i); unsigned long long x4 = (unsigned long long )(p - 64096 + i); result = x1 * x2 * x3 * x4; } return result; } Patch by jedilyn(Kewen Lin). Differential Revision: https://reviews.llvm.org/D48813 --This line, and those below, will be ignored-- M lib/Target/PowerPC/PPCLoopPreIncPrep.cpp A test/CodeGen/PowerPC/preincprep-i64-check.ll llvm-svn: 336074	2018-07-02 05:46:09 +00:00
Piotr Padlewski	5b3db45e8f	Implement strip.invariant.group Summary: This patch introduce new intrinsic - strip.invariant.group that was described in the RFC: Devirtualization v2 Reviewers: rsmith, hfinkel, nlopes, sanjoy, amharc, kuhar Subscribers: arsenm, nhaehnle, JDevlieghere, hiraditya, xbolva00, llvm-commits Differential Revision: https://reviews.llvm.org/D47103 Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com> llvm-svn: 336073	2018-07-02 04:49:30 +00:00
Eric Christopher	53054141a7	Add an entry for rodata constant merge sections to the default section flags in the ELF assembler. This matches the defaults given in the rest of MC. Fixes PR37997 where we couldn't assemble our own assembly output without warnings. llvm-svn: 336072	2018-07-02 00:16:39 +00:00
Craig Topper	df99cdb95b	[X86] Fix a few test names in avx512-intrinsics-fast-isel.ll to match their clang intrinsic names. I thought I fixed these yesterday, but I guess I missed a few. llvm-svn: 336071	2018-07-01 23:49:06 +00:00
Craig Topper	c004aa6c5f	[X86] Remove the places that return nullptr from X86InstrInfo::commuteInstructionImpl. findCommutedOpIndices does the pre-checking for whether commuting is possible. There should be no reason left to fail in commuteInstructionImpl. There was a missing pre-check that I've added there and changed the check to an assert in commuteInstructionImpl. llvm-svn: 336070	2018-07-01 23:27:41 +00:00
Simon Pilgrim	3dafb553d9	[SLPVectorizer] Call InstructionsState.isOpcodeOrAlt with Instruction instead of an opcode. NFCI. llvm-svn: 336069	2018-07-01 20:22:46 +00:00
Simon Pilgrim	ef9c97c343	[SLPVectorizer] Replace sameOpcodeOrAlt with InstructionsState.isOpcodeOrAlt helper. NFCI. This is a basic step towards matching more general instructions types than just opcodes. llvm-svn: 336068	2018-07-01 20:07:30 +00:00
Craig Topper	4d8ec92fb0	[X86][Disassembler] Remove TYPE_BNDR from translateImmediate. I've check the disassembler tables and this shouldn't be reachable. Which is good since if it was reachable there should have been a 'return' after the addOperand line. llvm-svn: 336066	2018-07-01 17:50:29 +00:00
Sanjay Patel	279a1a39ad	[InstCombine] add abs tests with undef elts; NFC llvm-svn: 336065	2018-07-01 17:14:37 +00:00
Sanjay Patel	a9fdb9fd37	[PatternMatch] allow undef elements in vectors with m_Neg This is similar to the m_Not change from D44076. llvm-svn: 336064	2018-07-01 13:42:57 +00:00
Simon Pilgrim	77d2067677	[SLPVectorizer] Use InstructionsState Op/Alt opcodes directly. NFCI. llvm-svn: 336063	2018-07-01 13:41:58 +00:00
David Green	963401d2be	[UnrollAndJam] New Unroll and Jam pass This is a simple implementation of the unroll-and-jam classical loop optimisation. The basic idea is that we take an outer loop of the form: for i.. ForeBlocks(i) for j.. SubLoopBlocks(i, j) AftBlocks(i) Instead of doing normal inner or outer unrolling, we unroll as follows: for i... i+=2 ForeBlocks(i) ForeBlocks(i+1) for j.. SubLoopBlocks(i, j) SubLoopBlocks(i+1, j) AftBlocks(i) AftBlocks(i+1) Remainder Loop So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now jammed loops. To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j). Differential Revision: https://reviews.llvm.org/D41953 llvm-svn: 336062	2018-07-01 12:47:30 +00:00
Paul Semel	8dabda70af	Revert "[llvm-readobj] Fix printing format" There is a problem with the formatting on windows build. I need to investigate on this. llvm-svn: 336061	2018-07-01 11:54:09 +00:00
Simon Pilgrim	84f77ecba9	[SLPVectorizer][X86] Add some alternate tests for cast operators Alternate opcode handling only supports binary operators, these tests demonstrate missed opportunities to vectorize some sitofp/uitofp and fptosi/fptoui style casts as well as some (successful) float bits manipulations llvm-svn: 336060	2018-07-01 11:29:46 +00:00
Eugene Leviant	6e4134459b	[Evaluator] Improve evaluation of call instruction Recommit of r335324 after buildbot failure fix llvm-svn: 336059	2018-07-01 11:02:07 +00:00
Paul Semel	49997adc88	[llvm-readobj] Fix printing format We were printing every character, even those that weren't printable. It doesn't really make sense for this option. The string content was sticked to its address, added two spaces in between. Differential Revision: https://reviews.llvm.org/D48271 llvm-svn: 336058	2018-07-01 09:51:59 +00:00
Craig Topper	a2d30b3134	[X86] Remove unnecessary include. NFC Leftover from when the pass contained a DenseMap before it switched to binary search. llvm-svn: 336057	2018-07-01 05:54:22 +00:00
Craig Topper	4e78213ae4	[X86] Move the memory unfolding table creation into its own class and make it a ManagedStatic. Also move the static folding tables, their search functions and the new class into new cpp/h files. The unfolding table is effectively static data. It's just a different ordering and a subset of the static folding tables. By putting it in a separate ManagedStatic we ensure we only have one copy instead of one per X86InstrInfo object. This way also makes it only get initialized when really needed. llvm-svn: 336056	2018-07-01 05:47:49 +00:00
Craig Topper	84199deb17	[X86] Move the X86InstrFMA3Info class into the cpp file. Expose only a getFMA3Group free function. NFCI The class only exists to hold a DenseMap and is only created as a ManagedStatic. It used to expose a single static method that outside code was expected to use. This patch moves that static function out of the class and moves it implementation into the cpp file. It can now access the ManagedStatic directly by name without the need for the other static method that accessed the ManagedStatic. llvm-svn: 336055	2018-06-30 22:38:42 +00:00
Craig Topper	731740744f	[X86] Remove the AsmName from the HAX,HDX,HCX,HBX,HSI,HDI,HBP,HSP,HIP artificial registers so they can't be parsed by the assembly parser. There are no instructions that use them so they weren't causing any bad matches. But they weren't being diagnosed as "invalid register name" if they were used and would instead trigger some form of invalid operand. llvm-svn: 336054	2018-06-30 22:38:41 +00:00
Craig Topper	1b7b9b8596	[X86] Use MVT::i8 for scalar shift amounts since that is what they ultimately need to legalize to. I believe all of these are constants so legalizing them should be pretty trivial, but this saves a step. In one case it looks like we may have been creating a shift amount larger than the shift input itself. llvm-svn: 336052	2018-06-30 18:30:31 +00:00
Craig Topper	5f28d50d27	[X86] When combining load to BZHI, make sure we create the shift instruction with an i8 type. This combine runs pretty late and causes us to introduce a shift after the op legalization phase has run. We need to be sure we create the shift with the proper type for the shift amount. If we don't do this, we will still re-legalize the operation properly, but we won't get a chance to fully optimize the truncate that gets inserted. So this patch adds the necessary truncate when the shift is created. I've also narrowed the subtract that gets created to always be an i32 type. The truncate would have trigered SimplifyDemandedBits to optimize it anyway. But using a more appropriate VT here is free and saves an optimization step. llvm-svn: 336051	2018-06-30 17:49:42 +00:00
Sanjay Patel	16a42ca274	[InstCombine] add tests for negate vector with undef elts; NFC llvm-svn: 336050	2018-06-30 14:11:46 +00:00
Simon Pilgrim	fdca80ddc9	Fix Wdocumentation compiler warning. NFCI. llvm-svn: 336049	2018-06-30 12:24:23 +00:00
Simon Pilgrim	fae337704e	[DAGCombiner] Handle correctly non-splat power of 2 -1 divisor (PR37119) The combine added in commit 329525 overlooked the case where one, but not all, of the divisor elements is -1, -1 is the only power of two value for which the sdiv expansion recipe breaks. Thanks to @zvi for the original patch. Differential Revision: https://reviews.llvm.org/D45806 llvm-svn: 336048	2018-06-30 12:22:55 +00:00
Craig Topper	50a10ba6e0	[X86] Update some avx512 fast-isel tests to match their real clang IRgen. Especially of note was the test_mm_mask_set1_epi64 and other set1 tests that were truncating the element to be broadcasted to i8 and broadcasting that instead of a whole 64 bit value. Some of the others were just correcting mask sizes on parameters due to bugs in the clang test case they were generated from that have now been fixed. Some were converting i8 to <4 x i1>/<2 x i1> by truncating to i4/i2 and then bitcasting. But the clang codegen is bitcast to <8 x i1>, then extract to <4 x i1>/<2 x i1>. This is likely to incur less trouble from the integer type legalizer in the backend. llvm-svn: 336045	2018-06-30 07:25:29 +00:00
Craig Topper	db1d7f2b16	[X86] Change some chec-prefixes from X32 to X86 to match the FileCheck command line. I think this test changed and these test cases were created around the same time and missed the change. llvm-svn: 336044	2018-06-30 06:45:10 +00:00
Craig Topper	8f6ace5bcd	[X86] Remove test cases from avx512vl-intrinsics-fast-isel.ll for intrinsics that don't really exist in clang. llvm-svn: 336043	2018-06-30 06:45:09 +00:00
Tom Stellard	eebbfc2809	AMDGPU/GlobalISel: Make IMPLICIT_DEF of all sizes < 512 legal. Summary: We could split sizes that are not power of two into smaller sized G_IMPLICIT_DEF instructions, but this ends up generating G_MERGE_VALUES instructions which we then have to handle in the instruction selector. Since G_IMPLICIT_DEF is really a no-op it's easier just to keep everything that can fit into a register legal. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48777 llvm-svn: 336041	2018-06-30 04:09:44 +00:00
Jessica Paquette	8bda1881ca	[MachineOutliner] Add support for target-default outlining. This adds functionality to the outliner that allows targets to specify certain functions that should be outlined from by default. If a target supports default outlining, then it specifies that in its TargetOptions. In the case that it does, and the user hasn't specified that they never want to outline, the outliner will be added to the pass pipeline and will run on those default functions. This is a preliminary patch for turning the outliner on by default under -Oz for AArch64. https://reviews.llvm.org/D48776 llvm-svn: 336040	2018-06-30 03:56:03 +00:00
Craig Topper	59f2f38fe0	[X86] Remove masking from avx512 rotate intrinsics. Use select in IR instead. llvm-svn: 336035	2018-06-30 01:32:04 +00:00
Chandler Carruth	7c557f804d	[instsimplify] Move the instsimplify pass to use more obvious file names and diretory. Also cleans up all the associated naming to be consistent and removes the public access to the pass ID which was unused in LLVM. Also runs clang-format over parts that changed, which generally cleans up a bunch of formatting. This is in preparation for doing some internal cleanups to the pass. Differential Revision: https://reviews.llvm.org/D47352 llvm-svn: 336028	2018-06-29 23:36:03 +00:00
Zachary Turner	68e1919d14	[CodeView] Correctly compute the name of S_PROCREF symbols. We have a function which switches on the type of a symbol record to return a hardcoded offset into the record that contains the symbol name. Not all symbols have names to begin with, and for those records we return -1 for the offset. Names are used for various things. Importantly for this particular bug, a hash of the record name is used as a key for certain hash tables which are serialied into the PDB file. One of these hash tables is for the global symbol stream, which is basically a collection of S_PROCREF symbols which contain the name of the symbol, a module, and an address offset. However, for S_PROCREF symbols, the function to return the offset of the name was returning -1: basically it wasn't implemented. As a result of this, all global symbols were hashing to the same value, essentially it was as if every single global symbol's name was the empty string. This manifests in the VS debugger when you try to call a function (global or member, doesn't matter) through the immediate window and the debugger simply reports an error because it can't find the function. This makes perfect sense, because it is hashing the name for real, looking in the global symbol hash table, and there is only 1 entry there which corresponds to a symbol whose name is the empty string. Fixing this fixes the MSVC debugger in this case. llvm-svn: 336024	2018-06-29 22:19:02 +00:00
Heejin Ahn	5cc0e25324	[WebAssembly] Update comments for non-splat pow2 vector test case Summary: After rL335727, (sdiv X, 1) is treated as a special case, so we can safely transform 'sdiv's in non-splat pow vectors into 'shr's even when some of its entries are '1'. The test expectations have been already fixed in rL335771, but the comments were out of date. Also changed the filename from `vector_sdiv.ll` to `vector-sdiv.ll` to be consistent with other test file names. Reviewers: RKSimon Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D48692 llvm-svn: 336018	2018-06-29 21:27:20 +00:00
Heejin Ahn	a86152d0a7	[WebAssembly] Comment out a switch block in ISelDAGToDAG Summary: Fixes PR37977. Reviewers: RKSimon Subscribers: dschuff, sbc100, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D48737 llvm-svn: 336017	2018-06-29 21:19:22 +00:00
Alina Sbirlea	da1e80feb7	[MemorySSA] Add APIs to MemoryPhis to delete incoming blocks/values, and an updater API to remove blocks. Summary: MemoryPhis now have APIs analogous to BB Phis to remove an incoming value/block. The MemorySSAUpdater uses the above APIs when updating MemorySSA given a set of dead blocks about to be deleted. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D48396 llvm-svn: 336015	2018-06-29 20:46:16 +00:00
Alex Shlyapnikov	788764ca12	[HWASan] Do not retag allocas before return from the function. Summary: Retagging allocas before returning from the function might help detecting use after return bugs, but it does not work at all in real life, when instrumented and non-instrumented code is intermixed. Consider the following code: F_non_instrumented() { T x; F1_instrumented(&x); ... } { F_instrumented(); F_non_instrumented(); } - F_instrumented call leaves the stack below the current sp tagged randomly for UAR detection - F_non_instrumented allocates its own vars on that tagged stack, not generating any tags, that is the address of x has tag 0, but the shadow memory still contains tags left behind by F_instrumented on the previous step - F1_instrumented verifies &x before using it and traps on tag mismatch, 0 vs whatever tag was set by F_instrumented Reviewers: eugenis Subscribers: srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D48664 llvm-svn: 336011	2018-06-29 20:20:17 +00:00
Vedant Kumar	69ee62cef8	[LLVMContext] Detecting leaked instructions with metadata When instructions with metadata are accidentally leaked, the result is a difficult-to-find memory corruption in ~LLVMContextImpl that leads to random crashes. Patch by Arvīds Kokins! llvm-svn: 336010	2018-06-29 20:13:13 +00:00

166472 Commits