llvm-project

Commit Graph

Author	SHA1	Message	Date
Jessica Paquette	0aa9b453c4	[GlobalISel][AArch64] Legalize/select G_(S/Z/ANY)_EXT for v8s8s This adds legalization for G_SEXT, G_ZEXT, and G_ANYEXT for v8s8s. We were falling back on G_ZEXT in arm64-vabs.ll before, preventing us from selecting the @llvm.aarch64.neon.sabd.v8i8 intrinsic. This adds legalizer support for those 3, which gives us selection via the importer. Update the relevant tests (legalize-ext.mir, select-int-ext.mir) and add a GISel line to arm64-vabs.ll. Differential Revision: https://reviews.llvm.org/D60881 llvm-svn: 358715	2019-04-18 21:15:48 +00:00
Jessica Paquette	3b5119c684	[GlobalISel][AArch64] Legalize v8s8 loads Add legalizer support for loads of v8s8 and update legalize-load-store.mir. Differential Revision: https://reviews.llvm.org/D60877 llvm-svn: 358714	2019-04-18 21:13:58 +00:00
Nico Weber	a0ac65c98f	llvm-undname: Fix two more asserts-on-invalid, found by oss-fuzz llvm-svn: 358708	2019-04-18 19:52:32 +00:00
Nico Weber	502cf4bd19	llvm-undname: Fix two asserts-on-invalid llvm-svn: 358707	2019-04-18 19:30:21 +00:00
Philip Reames	137995d8da	[GuardWidening] Wire up a NPM version of the LoopGuardWidening pass llvm-svn: 358704	2019-04-18 19:17:14 +00:00
Quentin Colombet	ea3364bf85	[BlockExtractor] Extend the file format to support the grouping of basic blocks Prior to this patch, each basic block listed in the extract-blocks-file would be extracted to a different function. This patch adds support for a comma-separated list of basic blocks to form a group. When the region formed by a group is not extractable, e.g., not single entry, all the blocks of that group are left untouched. Let us see this new format in action (comments are not part of the file format): ;; funcName bbName[,bbName...] foo bb1 ;; Extract bb1 in its own function foo bb2,bb3 ;; Extract bb2,bb3 in their own function bar bb1,bb4 ;; Extract bb1,bb4 in their own function bar bb2 ;; Extract bb2 in its own function Assuming all regions are extractable, this will create one function and thus one call per region. Differential Revision: https://reviews.llvm.org/D60746 llvm-svn: 358701	2019-04-18 18:28:30 +00:00
Roland Froese	a5dd08cac2	[PowerPC] Add some PPC vec cost tests to prep for D60160 NFC llvm-svn: 358699	2019-04-18 18:12:09 +00:00
Simon Pilgrim	4171a91e92	[X86] combineVectorTruncationWithPACKUS - remove split/concatenation of mask combineVectorTruncationWithPACKUS is currently splitting the upper bit masking into 128-bit subregs and then concatenating them back together. This was originally done to avoid regressions that caused existing subregs to be concatenated to the larger type just for the AND masking before being extracted again. This was fixed by @spatel (notably rL303997 and rL347356). This also lets SimplifyDemandedBits do some further improvements before it hits the recursive depth limit. My only annoyance with this is that we were broadcasting some xmm masks but we seem to have lost them by moving to ymm - but that's a known issue as the logic in lowerBuildVectorAsBroadcast isn't great. Differential Revision: https://reviews.llvm.org/D60375#inline-539623 llvm-svn: 358692	2019-04-18 17:23:09 +00:00
Philip Reames	adf288c5d9	[LoopPred] Fix a blatantly obvious bug in r358684 The bug is that I didn't check whether the operands of the invariant_loads were themselves invariant. I don't know how this got missed in the patch and review. I even had an unreduced test case locally, and I remember handling this case, but I must have lost it in one of the rebases. Oops. llvm-svn: 358688	2019-04-18 17:01:19 +00:00
Sanjay Patel	51fa60bcbb	[x86] add tests for improved insertelement to index 0 (PR41512); NFC Patch proposal in D60852. llvm-svn: 358687	2019-04-18 16:58:50 +00:00
Philip Reames	92a7177e6b	[LoopPredication] Allow predication of loop invariant computations (within the loop) The purpose of this patch is to eliminate a pass ordering dependence between LoopPredication and LICM. To understand the purpose, consider the following snippet of code inside some loop 'L' with IV 'i' A = _a.length; guard (i < A) a = _a[i] B = _b.length; guard (i < B); b = _b[i]; ... Z = _z.length; guard (i < Z) z = _z[i] accum += a + b + ... + z; Today, we need LICM to hoist the length loads, LoopPredication to make the guards loop invariant, and TrivialUnswitch to eliminate the loop invariant guard to establish must execute for the next length load. Today, if we can't prove speculation safety, we'd have to iterate these three passes 26 times to reduce this example down to the minimal form. Using the fact that the array lengths are known to be invariant, we can short circuit this iteration. By forming the loop invariant form of all the guards at once, we remove the need for LoopPredication from the iterative cycle. At the moment, we'd still have to iterate LICM and TrivialUnswitch; we'll leave that part for later. As a secondary benefit, this allows LoopPred to expose peeling opportunities in a much more obvious manner. See the udiv test changes as an example. If the udiv was not hoistable (i.e. we couldn't prove speculation safety) this would be an example where peeling becomes obviously profitable whereas it wasn't before. A couple of subtleties in the implementation: - SCEV's isSafeToExpand guarantees speculation safety (i.e. lets us expand at a new point). It is not a precondition for expansion if we know the SCEV corresponds to a Value which dominates the requested expansion point. - SCEV's isLoopInvariant returns true for expressions which compute the same value across all iterations executed, regardless of where the original Value is located. (i.e. it can be in the loop) This implies we have a speculation burden to prove before expanding them outside loops. - invariant_loads and AA->pointsToConstantMemory are two cases that SCEV currently does not handle, but that meet the SCEV definition of invariance. I plan to sink this part into SCEV once this has baked for a bit. Differential Revision: https://reviews.llvm.org/D60093 llvm-svn: 358684	2019-04-18 16:33:17 +00:00
Nicolai Haehnle	523f90a2ba	[SDA] Bug fix: Use IPD outside the loop as divergence bound Summary: The immediate post dominator of the loop header may be part of the divergent loop. Since this /was/ the divergence propagation bound the SDA would not detect joins of divergent paths outside the loop. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: mmasten, arsenm, jvesely, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59042 llvm-svn: 358681	2019-04-18 16:17:35 +00:00
Pavel Labath	7429d86f36	MinidumpYAML: Add support for ModuleList stream Summary: This patch adds support for yaml (de)serialization of the minidump ModuleList stream. It's a fairly straightforward application of the existing patterns to the ModuleList structures defined in previous patches. One thing which may be interesting to call out explicitly is the addition of "new" allocation functions to the helper BlobAllocator class. The reason for this was that there was an emerging pattern of a need to allocate space for entities which do not have a suitable lifetime for use with the existing allocation functions. A typical example of that was the "size" of various lists, which is only available as a temporary returned by the .size() method of some container. For these cases, one can use the new set of allocation functions, which will take a temporary object, and store it in an allocator-managed buffer until it is written to disk. Reviewers: amccarth, jhenderson, clayborg, zturner Subscribers: lldb-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60405 llvm-svn: 358672	2019-04-18 14:57:31 +00:00
Jordan Rupprecht	2b32902a88	[llvm-objcopy] Add -B mips llvm-svn: 358667	2019-04-18 14:22:37 +00:00
George Rimar	a630b34057	[yaml2elf/obj2yaml] - Allow normal parsing/dumping of the .rela.dyn section .rela.dyn is a section that has sh_info normally set to zero. And Info is an optional field in the description of the relocation section in YAML. But currently, yaml2obj would fail to produce the object when Info is not explicitly listed. The patch fixes the issue. Differential revision: https://reviews.llvm.org/D60820 llvm-svn: 358656	2019-04-18 11:02:07 +00:00
Simon Pilgrim	8f87e53462	[X86][SSE] Lower ICMP EQ(AND(X,C),C) -> SRA(SHL(X,LOG2(C)),BW-1) iff C is power-of-2. This replaces the MOVMSK combine introduced at D52121/rL342326 (movmsk (setne (and X, (1 << C)), 0)) -> (movmsk (X << C)) with the more general icmp lowering so it can pick up more cases through bitcasts - notably vXi8 cases which use vXi16 shifts+masks, this patch can remove the mask and use pcmpgtb(0,x) for the sra. Differential Revision: https://reviews.llvm.org/D60625 llvm-svn: 358651	2019-04-18 09:58:59 +00:00
James Henderson	66a9d0f8c6	[llvm-objcopy][llvm-strip] Add switch to allow removing referenced sections llvm-objcopy currently emits an error if a section to be removed is referenced by another section. This is a reasonable thing to do, but is different to GNU objcopy. We should allow users who know what they are doing to have a way to produce the invalid ELF. This change adds a new switch --allow-broken-links to both llvm-strip and llvm-objcopy to do precisely that. The corresponding sh_link field is then set to 0 instead of an error being emitted. I cannot use llvm-readelf/readobj to test the link fields because they emit an error if any sections, like the .dynsym, cannot be properly loaded. Reviewed by: rupprecht, grimar Differential Revision: https://reviews.llvm.org/D60324 llvm-svn: 358649	2019-04-18 09:13:30 +00:00
Kang Zhang	009a21d2fd	[PowerPC] Fix wrong ElemSize when calling isConsecutiveLS() Summary: This issue is from Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=41177 When the two operands for BUILD_VECTOR are the same, we get an assertion error. llvm::SDValue combineBVOfConsecutiveLoads(llvm::SDNode*, llvm::SelectionDAG&): Assertion `!(InputsAreConsecutiveLoads && InputsAreReverseConsecutive) && "The loads cannot be both consecutive and reverse consecutive."' failed. This error is caused by the wrong ElemSize when calling isConsecutiveLS(). We should use `getScalarType().getStoreSize();` to get the ElemSize instead of `getScalarSizeInBits() / 8`. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D60811 llvm-svn: 358644	2019-04-18 07:24:15 +00:00
Tim Renouf	7c55c8d8c3	[AMDGPU] Avoid DAG combining assert with fneg(fadd(A,0)) fneg combining attempts to turn it into fadd(fneg(A), fneg(0)), but creating the new fadd folds to just fneg(A). When A has multiple uses, this confuses it and you get an assert. Fixed. Differential Revision: https://reviews.llvm.org/D60633 Change-Id: I0ddc9b7286abe78edc0cd8d734fdeb05ff09821c llvm-svn: 358640	2019-04-18 05:27:01 +00:00
Sanjay Patel	fb363a778f	[x86] try to widen 'shl' as part of LEA formation The test file has pairs of tests that are logically equivalent: https://rise4fun.com/Alive/2zQ %t4 = and i8 %t1, 8 %t5 = zext i8 %t4 to i16 %sh = shl i16 %t5, 2 %t6 = add i16 %sh, %t0 => %t4 = and i8 %t1, 8 %sh2 = shl i8 %t4, 2 %z5 = zext i8 %sh2 to i16 %t6 = add i16 %z5, %t0 ...so if we can fold the shift op into LEA in the 1st pattern, then we should be able to do the same in the 2nd pattern (unnecessary 'movzbl' is a separate bug I think). We don't want to do this any sooner though because that would conflict with generic transforms that try to narrow the width of the shift. Differential Revision: https://reviews.llvm.org/D60789 llvm-svn: 358622	2019-04-17 22:38:51 +00:00
Amara Emerson	daf6e66ac5	[GlobalISel] Add legalization support for non-power-2 loads and stores Legalize things like i24 load/store by splitting them into smaller power of 2 operations. This matches how SelectionDAG handles these operations. Differential Revision: https://reviews.llvm.org/D59971 llvm-svn: 358613	2019-04-17 21:30:07 +00:00
Kit Barton	3cdf87940f	Add basic loop fusion pass. This patch adds a basic loop fusion pass. It will fuse loops that conform to the following 4 conditions: 1. Adjacent (no code between them) 2. Control flow equivalent (if one loop executes, the other loop executes) 3. Identical bounds (both loops iterate the same number of iterations) 4. No negative distance dependencies between the loop bodies. The pass does not make any changes to the IR to create opportunities for fusion. Instead, it checks if the necessary conditions are met and if so it fuses two loops together. The pass has not been added to the pass pipeline yet, and thus is not enabled by default. It can be run stand alone using the -loop-fusion option. Differential Revision: https://reviews.llvm.org/D55851 llvm-svn: 358607	2019-04-17 18:53:27 +00:00
Nick Desaulniers	a2077bab40	[AsmPrinter] defer %c to base class for ARM, PPC, and Hexagon. NFC Summary: None of these derived classes do anything that the base class cannot. If we remove these case statements, then the base class can handle them just fine. Reviewers: peter.smith, echristo Reviewed By: echristo Subscribers: nemanjai, javed.absar, eraman, kristof.beyls, hiraditya, kbarton, jsji, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60803 llvm-svn: 358603	2019-04-17 18:22:48 +00:00
Steven Wu	05a358cdcd	[ThinLTO] Fix ThinLTOCodegenerator to export llvm.used symbols Summary: Reapply r357931 with fixes to ThinLTO testcases and llvm-lto tool. ThinLTOCodeGenerator currently does not preserve llvm.used symbols and it can internalize them. In order to pass the necessary information to the legacy ThinLTOCodeGenerator, the input to the code generator is rewritten to be based on lto::InputFile. Now ThinLTO using the legacy LTO API requires a data layout in the Module. The "internalize" thinlto action in llvm-lto is updated to run both "promote" and "internalize" with the same configuration as ThinLTOCodeGenerator. The old "promote" + "internalize" option does not produce the same output as ThinLTOCodeGenerator. This fixes: PR41236 rdar://problem/49293439 Reviewers: tejohnson, pcc, kromanova, dexonsmith Reviewed By: tejohnson Subscribers: ormris, bd1976llvm, mehdi_amini, inglorion, eraman, hiraditya, jkorous, dexonsmith, arphaman, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60421 llvm-svn: 358601	2019-04-17 17:38:09 +00:00
Nikita Popov	2039581002	[LVI][CVP] Constrain values in with.overflow branches If a branch is conditional on extractvalue(op.with.overflow(%x, C), 1) then we can constrain the value of %x inside the branch based on makeGuaranteedNoWrapRegion(). We do this by extending the edge-value handling in LVI. This allows CVP to then fold comparisons against %x, as illustrated in the tests. Differential Revision: https://reviews.llvm.org/D60650 llvm-svn: 358597	2019-04-17 16:57:42 +00:00
Dmitry Preobrazhensky	394d0a1637	[AMDGPU][MC] Corrected handling of "-" before expressions See bug 41156: https://bugs.llvm.org/show_bug.cgi?id=41156 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D60622 llvm-svn: 358596	2019-04-17 16:56:34 +00:00
Sanjay Patel	1964962b49	[ARM] tighten test checks; NFC llvm-svn: 358594	2019-04-17 16:51:09 +00:00
Rhys Perry	c2814e12e7	AMDGPU: Force skip over SMRD, VMEM and s_waitcnt instructions Summary: This fixes a large Dawn of War 3 performance regression with RADV from Mesa 19.0 to master which was caused by creating less code in some branches. Reviewers: arsen, nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60824 llvm-svn: 358592	2019-04-17 16:31:52 +00:00
Sanjay Patel	1f2c81af72	[ARM] make test checks more thorough; NFC This will change with the proposal in D60214. Unfortunately, the triple is not supported for auto-generation via script, and the multiple RUN lines have diffs on this test, but I can't tell exactly what is required by this test. PR7162 was an assert/crash, so hopefully, this is good enough. llvm-svn: 358587	2019-04-17 16:02:07 +00:00
Florian Hahn	893aea58ea	[LoopUnroll] Allow unrolling if the unrolled size does not exceed loop size. Summary: In the following cases, unrolling can be beneficial, even when optimizing for code size: 1) very low trip counts 2) potential to constant fold most instructions after fully unrolling. We can unroll in those cases, by setting the unrolling threshold to the loop size. This might highlight some cost modeling issues and fixing them will have a positive impact in general. Reviewers: vsk, efriedma, dmgreen, paquette Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D60265 llvm-svn: 358586	2019-04-17 15:57:43 +00:00
Dmitry Preobrazhensky	20d52e3aa2	[AMDGPU][MC] Corrected parsing of registers See bug 41280: https://bugs.llvm.org/show_bug.cgi?id=41280 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D60621 llvm-svn: 358581	2019-04-17 14:44:01 +00:00
Tim Renouf	59e8bd3093	[AMDGPU] Flag new raw/struct atomic ops as source of divergence Differential Revision: https://reviews.llvm.org/D60731 Change-Id: I821d93dec8b9cdd247b8172d92fb5e15340a9e7d llvm-svn: 358579	2019-04-17 14:04:31 +00:00
Simon Pilgrim	9daacec816	[CostModel][X86] Add bool anyof/allof reduction costs On pre-AVX512 targets we can use MOVMSK to extract reduced boolean results. This is properly optimized; annoyingly, AVX512 isn't, and produces code that is almost as bad as the (unchanged) costs suggest. Differential Revision: https://reviews.llvm.org/D60403 llvm-svn: 358574	2019-04-17 10:58:19 +00:00
Jordan Rupprecht	b0b65cae59	[llvm-objcopy] Support full list of bfd targets that lld uses. Summary: This change takes the full list of bfd targets that lld supports (see `ScriptParser.cpp`), including generic handling for `*-freebsd` targets (which uses the same settings but with a FreeBSD OSABI). In particular this adds mips support for `--output-target` (but not yet via `--binary-architecture`). lld and llvm-objcopy use their own different custom data structures, so I'd prefer to check this in as-is (add support directly in llvm-objcopy, including all the test coverage) and do separate NFC patch(es) that consolidate the two by putting this mapping into libobject. See PR41462 (https://bugs.llvm.org/show_bug.cgi?id=41462). Reviewers: jhenderson, jakehehrlich, espindola, alexshap, arichardson Reviewed By: arichardson Subscribers: fedor.sergeev, emaste, sdardis, krytarowski, atanasyan, llvm-commits, MaskRay, arichardson Tags: #llvm Differential Revision: https://reviews.llvm.org/D60773 llvm-svn: 358562	2019-04-17 07:42:31 +00:00
Roman Lebedev	0080645846	[CVP] processOverflowIntrinsic(): don't crash if constant-folding happened As reported by Mikael Holmén in post-commit review in https://reviews.llvm.org/D60791#1469765 llvm-svn: 358559	2019-04-17 06:35:07 +00:00
Craig Topper	5ca2e04c7a	[X86] Autogenerate complete checks. NFC llvm-svn: 358556	2019-04-17 06:09:16 +00:00
Eric Christopher	e29874eaa0	Revert "Add basic loop fusion pass." Per request. This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358553	2019-04-17 04:55:24 +00:00
Eric Christopher	cee313d288	Revert "Temporarily Revert "Add basic loop fusion pass."" The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552	2019-04-17 04:52:47 +00:00
Eric Christopher	a863435128	Temporarily Revert "Add basic loop fusion pass." As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546	2019-04-17 02:12:23 +00:00
Kit Barton	ab70da0728	Add basic loop fusion pass. This patch adds a basic loop fusion pass. It will fuse loops that conform to the following 4 conditions: 1. Adjacent (no code between them) 2. Control flow equivalent (if one loop executes, the other loop executes) 3. Identical bounds (both loops iterate the same number of iterations) 4. No negative distance dependencies between the loop bodies. The pass does not make any changes to the IR to create opportunities for fusion. Instead, it checks if the necessary conditions are met and if so it fuses two loops together. The pass has not been added to the pass pipeline yet, and thus is not enabled by default. It can be run stand alone using the -loop-fusion option. Phabricator: https://reviews.llvm.org/D55851 llvm-svn: 358543	2019-04-17 01:37:00 +00:00
Sanjay Patel	d5bc5ca3e4	[x86] adjust LEA tests for better coverage; NFC The scale can be 1, 2, or 3. llvm-svn: 358539	2019-04-16 23:10:41 +00:00
Sanjay Patel	e08783e2f5	[EarlyCSE] detect equivalence of selects with inverse conditions and commuted operands (PR41101) This is 1 of the problems discussed in the post-commit thread for: rL355741 / http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190311/635516.html and filed as: https://bugs.llvm.org/show_bug.cgi?id=41101 Instcombine tries to canonicalize some of these cases (and there's room for improvement there independently of this patch), but it can't always do that because of extra uses. So we need to recognize these commuted operand patterns here in EarlyCSE. This is similar to how we detect commuted compares and commuted min/max/abs. Differential Revision: https://reviews.llvm.org/D60723 llvm-svn: 358523	2019-04-16 20:41:20 +00:00
Nikita Popov	52b24ee932	[CVP] Simplify umulo and smulo that cannot overflow If a umul.with.overflow or smul.with.overflow operation cannot overflow, simplify it to a simple mul nuw / mul nsw. After the refactoring in D60668 this is just a matter of removing an explicit check against multiplications. Differential Revision: https://reviews.llvm.org/D60791 llvm-svn: 358521	2019-04-16 20:31:41 +00:00
Simon Pilgrim	82ffa88a04	[SLP] Refactoring of the operand reordering code. This is a refactoring patch which should have all the functionality of the current code. Its goal is twofold: i. Cleanup and simplify the reordering code, and ii. Generalize reordering so that it will work for an arbitrary number of operands, not just 2. This is the second patch in a series of patches that will enable operand reordering across chains of operations. An example of this was presented in EuroLLVM'18 https://www.youtube.com/watch?v=gIEn34LvyNo . Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D59973 llvm-svn: 358519	2019-04-16 19:27:00 +00:00
Nikita Popov	5a30177906	[CVP] Add tests for non-overflowing mulo; NFC Should be simplified to simple mul. llvm-svn: 358517	2019-04-16 19:25:35 +00:00
Simon Pilgrim	d769bb1e58	[X86][AVX] X86ISD::PERMV/PERMV3 node types can never fold index ops Improves codegen demonstrated by D60512 - instructions represented by X86ISD::PERMV/PERMV3 can never memory fold the operand used for their index register. This patch updates the 'isUseOfShuffle' helper into the more capable 'isFoldableUseOfShuffle' that recognises that the op is used for a X86ISD::PERMV/PERMV3 index mask and can't be folded - allowing us to use broadcast/subvector-broadcast ops to reduce the size of the mask constant pool data. Differential Revision: https://reviews.llvm.org/D60562 llvm-svn: 358516	2019-04-16 19:18:53 +00:00
Nikita Popov	5ecd6a48b9	[InstCombine] Prune fshl/fshr with masked operands If a constant shift amount is used, then only some of the LHS/RHS operand bits are demanded and we may be able to simplify based on that. InstCombineSimplifyDemanded already had the necessary support for that, we just weren't calling it with fshl/fshr as root. In particular, this allows us to relax some masked funnel shifts into simple shifts, as shown in the tests. Patch by Shawn Landden. Differential Revision: https://reviews.llvm.org/D60660 llvm-svn: 358515	2019-04-16 19:05:49 +00:00
Nikita Popov	f700081a7d	[InstCombine] Add tests for fshl/fshr with masked operands; NFC Baseline tests for D60660. Patch by Shawn Landden. Differential Revision: https://reviews.llvm.org/D60688 llvm-svn: 358514	2019-04-16 19:05:40 +00:00
Sanjay Patel	f136c46bd6	[x86] add more tests for LEA formation; NFC Promoting the shift to the wider type should allow LEA. llvm-svn: 358513	2019-04-16 18:58:03 +00:00
Philip Reames	c44b68e2b7	[Tests] Add branch_weights to latches so that test is not affected by future profitability patch to LoopPredication llvm-svn: 358506	2019-04-16 16:32:59 +00:00
Fangrui Song	29cca27140	[llvm-objdump] Test tabs in disassemble-align.s with a more visible character Summary: Apply rupprecht's suggestion in D60376 Reviewers: rupprecht Reviewed By: rupprecht Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60777 llvm-svn: 358504	2019-04-16 15:58:42 +00:00
Luis Marques	20d2424016	[RISCV] Custom lower SHL_PARTS, SRA_PARTS, SRL_PARTS When not optimizing for minimum size (-Oz) we custom lower wide shifts (SHL_PARTS, SRA_PARTS, SRL_PARTS) instead of expanding to a libcall. Differential Revision: https://reviews.llvm.org/D59477 llvm-svn: 358498	2019-04-16 14:38:32 +00:00
Ulrich Weigand	452060ab87	[SystemZ] Add missing intrinsics to intrinsics-immarg.ll As of r356091, support for the ImmArg intrinsics was added, including a SystemZ test case. However, that test case doesn't actually verify all SystemZ intrinsics with immediate arguments, only a subset. The rest of them actually work correctly, there's just no test for them. This patch adds all missing intrinsics. llvm-svn: 358495	2019-04-16 14:35:18 +00:00
Nico Weber	c035c243da	llvm-undname: Fix nullptr deref on invalid structor names in template args Similar to r358421: A StructorIdentifierNode has a Class field which is read when printing it, but if the StructorIdentifierNode appears in a template argument then demangleFullyQualifiedSymbolName() which sets Class isn't called. Since StructorIdentifierNodes are always leaf names, we can just reject them as well. Found by oss-fuzz. llvm-svn: 358491	2019-04-16 14:10:34 +00:00
Nico Weber	aa18ae862d	llvm-undname: Tweak arena allocator - Make `allocUnalignedBuffer` look more like `allocArray` and `alloc`. No behavior change. - Change `Head->Used < Head->Capacity` to `Head->Used <= Head->Capacity` in `allocArray` and `alloc`. No intended behavior change, might be a minuscule memory usage improvement. Noticed this since it was the logic used in `allocUnalignedBuffer`. - Don't let `allocArray` alloc too small buffers for names that have more than 512 levels of nesting (in 64-bit builds). Fixes a heap buffer overflow found by oss-fuzz. Differential Revision: https://reviews.llvm.org/D60774 llvm-svn: 358489	2019-04-16 13:52:30 +00:00
Nico Weber	5961b0203a	llvm-undname: add a missing CHECK: to a passing test llvm-svn: 358488	2019-04-16 13:30:50 +00:00
Nico Weber	ff92e715d3	Fix llvm-undname tests after r358485 llvm-svn: 358487	2019-04-16 13:18:51 +00:00
Hans Wennborg	21eb771dcb	Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259) The original commit caused false positives from AddressSanitizer's use-after-scope checks, which have now been fixed in r358478. > The code was previously checking that candidates for sinking had exactly > one use or were a store instruction (which can't have uses). This meant > we could sink call instructions only if they had a use. > > That limitation seemed a bit arbitrary, so this patch changes it to > "instruction has zero or one use" which seems more natural and removes > the need to special-case stores. > > Differential revision: https://reviews.llvm.org/D59936 llvm-svn: 358483	2019-04-16 12:13:25 +00:00
Hans Wennborg	6ae05777b8	Asan use-after-scope: don't poison allocas if there were untraced lifetime intrinsics in the function (PR41481) If there are any intrinsics that cannot be traced back to an alloca, we might have missed the start of a variable's scope, leading to false error reports if the variable is poisoned at function entry. Instead, if there are some intrinsics that can't be traced, fail safe and don't poison the variables in that function. Differential revision: https://reviews.llvm.org/D60686 llvm-svn: 358478	2019-04-16 07:54:20 +00:00
Fangrui Song	fa860ff733	[llvm-objdump] Align instructions to a tab stop in disassembly output This relands D60376/rL358405, with the difference: sed 'y/\t/ /' -> tr '\t' ' ' BSD sed doesn't support escape characters for the 'y' command. I didn't use it in rL358405 because it was not listed at https://llvm.org/docs/GettingStarted.html#software but it should be available. Original description: In GNU objdump, -w/--wide aligns instructions in the disassembly output. This patch does the same to llvm-objdump. However, we always use the wide format (-w/--wide is ignored), because the narrow format (instructions are misaligned) is probably not very useful. In llvm-readobj, we made a similar decision: always use the wide format, accept but ignore -W/--wide. To save some columns, we change the tab before hex bytes (controlled by --[no-]show-raw-insn) to a space. llvm-svn: 358474	2019-04-16 03:56:55 +00:00
Fangrui Song	051a699ed6	[llvm-objdump] Simplify PrintHelpMessage() logic This relands rL358418. It missed one test that should also use -macho Note, all the other -private-header -exports-trie tests are used together with -macho. llvm-svn: 358472	2019-04-16 02:37:29 +00:00
Alex Lorenz	d9d0c3e138	Revert r358405: "[llvm-objdump] Align instructions to a tab stop in disassembly output" The test fails on darwin due to a sed error: sed: 1: "y/\t/ /": transform strings are not the same length llvm-svn: 358459	2019-04-15 22:36:12 +00:00
Amara Emerson	02a90ea73d	[AArch64][GlobalISel] Don't do extending loads combine for non-pow-2 types. Since non-pow-2 types are going to get split up into multiple loads anyway, don't do the [SZ]EXTLOAD combine for those and save us trouble later in legalization. llvm-svn: 358458	2019-04-15 22:34:08 +00:00
Quentin Colombet	fda0426888	[LSR] Rewrite misses some fixup locations if it splits critical edge If LSR splits a critical edge while rewriting phi operands and the phi node has other pending fixup operands, we need to update those pending fixups. Otherwise the formulae will not be implemented completely and some instructions will not be eliminated. llvm.org/PR41445 Differential Revision: https://reviews.llvm.org/D60645 Patch by: Denis Bakhvalov <denis.bakhvalov@intel.com> llvm-svn: 358457	2019-04-15 22:23:46 +00:00
Sanjay Patel	800a0c3e4b	[EarlyCSE] add more tests for double-negated select condition; NFC llvm-svn: 358454	2019-04-15 21:51:51 +00:00
Craig Topper	0495f29e42	[X86] Limit the 'x' inline assembly constraint to zmm0-15 when used for a 512 type. The 'v' constraint is used to select zmm0-31. This makes 512 bit consistent with 128/256-bit. llvm-svn: 358450	2019-04-15 21:06:32 +00:00
Craig Topper	77439bb128	[X86] Fix a stack folding test to have a full xmm2-31 clobber list instead of stopping at xmm15. Add an additional dependency to keep instruction below inline asm block. llvm-svn: 358449	2019-04-15 21:06:23 +00:00
Matt Arsenault	101abd219b	AMDGPU: Fix unreachable when counting register usage of SGPR96 llvm-svn: 358447	2019-04-15 20:51:12 +00:00
Matt Arsenault	fbdd2a1887	AMDGPU: Fix printed format of SReg_96 These are artificial, so I think this should only come up with inline asm comments. llvm-svn: 358446	2019-04-15 20:42:18 +00:00
Sanjay Patel	5ae05d810c	[EarlyCSE] add test for select condition double-negation; NFC llvm-svn: 358444	2019-04-15 20:25:31 +00:00
Alex Lorenz	16256123d0	Revert r358418: "[llvm-objdump] Simplify PrintHelpMessage() logic" This reverts commit r358418 as it broke `test/Object/objdump-export-list` on Darwin. llvm-svn: 358443	2019-04-15 20:16:19 +00:00
Philip Reames	af808ee2ee	[Tests] Add a few more tests for LoopPredication w/invariant loads Making sure to cover an important legality cornercase. llvm-svn: 358439	2019-04-15 19:45:27 +00:00
Craig Topper	3d9b47c770	[X86] Block i32/i64 for 'k' and 'Yk' in getRegForInlineAsmConstraint without avx512bw. 32 and 64 bit k-registers require avx512bw. If we don't block this properly, it leads to a crash. llvm-svn: 358436	2019-04-15 18:39:45 +00:00
Sanjay Patel	8ae68f2648	[x86] update test checks; NFC llvm-svn: 358432	2019-04-15 17:38:47 +00:00
Wolfgang Pieb	4fe42214e2	[DEBUGINFO] Prevent Instcombine from dropping debuginfo when removing zexts Zexts can be treated like no-op casts when it comes to assessing whether their removal affects debug info. Reviewer: aprantl Differential Revision: https://reviews.llvm.org/D60641 llvm-svn: 358431	2019-04-15 17:36:29 +00:00
Don Hinton	b85f74a283	[CommandLineParser] Add DefaultOption flag Summary: Add DefaultOption flag to CommandLineParser which provides a default option or alias, but allows users to override it for some other purpose as needed. Also, add `-h` as a default alias to `-help`, which can be seamlessly overridden by applications like llvm-objdump and llvm-readobj which use `-h` as an alias for other options. (relanding after revert, r358414) Added DefaultOptions.clear() to reset(). Reviewers: alexfh, klimek Reviewed By: klimek Subscribers: kristina, MaskRay, mehdi_amini, inglorion, dexonsmith, hiraditya, llvm-commits, jhenderson, arphaman, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D59746 llvm-svn: 358428	2019-04-15 17:18:10 +00:00
Craig Topper	8e364c680f	[X86] Restore the pavg intrinsics. The pattern we replaced these with may be too hard to match as demonstrated by PR41496 and PR41316. This patch restores the intrinsics and then we can start focusing on optimizing the intrinsics. I've mostly reverted the original patch that removed them, though I modified the avx512 intrinsics to not have masking built in. Differential Revision: https://reviews.llvm.org/D60674 llvm-svn: 358427	2019-04-15 17:17:35 +00:00
Sean Fertile	8d856488a8	Add slbfee instruction. llvm-svn: 358425	2019-04-15 17:08:43 +00:00
Hiroshi Yamauchi	09e539fcae	[PGO] Profile guided code size optimization. Summary: Enable some of the existing size optimizations for cold code under PGO. A ~5% code size saving in big internal app under PGO. The way it gets BFI/PSI is discussed in the RFC thread http://lists.llvm.org/pipermail/llvm-dev/2019-March/130894.html Note it doesn't currently touch loop passes. Reviewers: davidxl, eraman Reviewed By: eraman Subscribers: mgorny, javed.absar, smeenai, mehdi_amini, eraman, zzheng, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59514 llvm-svn: 358422	2019-04-15 16:49:00 +00:00
Nico Weber	64041d7b90	llvm-undname: Fix nullptr deref on invalid conversion operator names in template args A ConversionOperatorIdentifierNode has a TargetType which is read when printing it, but if the ConversionOperatorIdentifierNode appears in a template argument there's nothing that can provide the TargetType. Normally the COIN is a symbol (leaf) name and takes its TargetType from the symbol's type, but in a template argument context the COIN can only be either a non-leaf name piece or a type, and must hence be invalid. Similar to the COIN check in demangleDeclarator(). Found by oss-fuzz. llvm-svn: 358421	2019-04-15 16:42:44 +00:00
Sanjay Patel	0e0bb0e24a	[EarlyCSE] add tests for selects with commuted operands (PR41101); NFC llvm-svn: 358420	2019-04-15 16:01:05 +00:00
Philip Reames	fbe64a2cfb	[LoopPred] Hoist and of predicated checks where legal If we have multiple range checks which can be predicated, hoist the and of the results outside the loop. This minorly cleans up the resulting IR, but the main motivation is as a building block for D60093. llvm-svn: 358419	2019-04-15 15:53:25 +00:00
Fangrui Song	204339a234	[llvm-objdump] Simplify PrintHelpMessage() logic llvm-svn: 358418	2019-04-15 15:52:32 +00:00
Ilya Biryukov	70921d4a86	Revert r358337: "[CommandLineParser] Add DefaultOption flag" The change causes test failures under asan. Reverting to unbreak our integrate. llvm-svn: 358414	2019-04-15 14:43:50 +00:00
Sanjay Patel	c71433335a	[EarlyCSE] regenerate test checks; NFC llvm-svn: 358407	2019-04-15 14:02:37 +00:00
Fangrui Song	b688a200e4	[llvm-objdump] Align instructions to a tab stop in disassembly output Summary: In GNU objdump, -w/--wide aligns instructions in the disassembly output. This patch does the same to llvm-objdump. However, we always use the wide format (-w/--wide is ignored), because the narrow format (instructions are misaligned) is probably not very useful. In llvm-readobj, we made a similar decision: always use the wide format, accept but ignore -W/--wide. To save some columns, we change the tab before hex bytes (controlled by --[no-]show-raw-insn) to a space. Reviewers: rupprecht, jhenderson, grimar Reviewed By: jhenderson Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60376 llvm-svn: 358405	2019-04-15 13:32:41 +00:00
Sanjay Patel	5e13cd2e61	[InstCombine] canonicalize fdiv after fmul if reassociation is allowed (X / Y) * Z --> (X * Z) / Y This can allow other optimizations/reassociations as shown in the test diffs. llvm-svn: 358404	2019-04-15 13:23:38 +00:00
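The rewrite above is only valid when reassociation is allowed because the two forms can round differently. A minimal Python sketch on IEEE doubles, with values chosen purely for illustration (they are not taken from the commit or its tests):

```python
# Illustrative values only; they are not taken from the commit or its tests.
x, y, z = 1.0, 10.0, 3.0

before = (x / y) * z   # original form: fdiv followed by fmul
after = (x * z) / y    # canonical form: fmul followed by fdiv

# The two results differ in the last bit, which is why strict FP semantics forbid the rewrite.
print(before, after, before == after)
```

Under strict semantics the optimizer has to preserve the first result, so the canonicalization is only legal when the instructions carry a reassociation flag.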
Eugene Leviant	4918738c07	[llvm-readelf] Correctly dump symbols whose section id is SHN_XINDEX Differential revision: https://reviews.llvm.org/D60614 llvm-svn: 358396	2019-04-15 11:21:47 +00:00
Stephen Tozer	19bb1d5739	[llvm-readobj] Reapply: Improve error message for --string-dump This is a resubmission of a previous patch that caused test failures, with the fixes for the relevant tests included. Fixes bug 40630: https://bugs.llvm.org/show_bug.cgi?id=40630 This patch changes the error message when the section specified by --string-dump cannot be found by including the name of the section in the error message and changing the prefix text to not imply that the file itself was invalid. As part of this change some uses of std::error_code have been replaced with the llvm Error class to better encapsulate the error info (rather than passing File strings around), and the WithColor class replaces string literal error prefixes. llvm-svn: 358395	2019-04-15 11:17:48 +00:00
Simon Tatham	301ed1cb49	[TableGen] Include schedule model name in diagnostic. If you have more than one schedule model in your TableGen target definitions, then the diagnostic "No schedule information for instruction 'foo'" is rather unhelpful, because it doesn't tell you _which_ schedule model is missing the necessary information (or, as it might be, missing the UnsupportedFeatures definition that would stop it thinking it needed it). Extended the message to include the name of the schedule model that it's complaining about. Reviewers: nhaehnle, hfinkel, javedabsar, efriedma, javed.absar Reviewed By: javed.absar Subscribers: javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60559 llvm-svn: 358389	2019-04-15 10:06:26 +00:00
Yevgeny Rouban	3992e9d229	Codegen: Fixed perf branch_weights in a couple of tests. NFC. This is needed to pass future checks of perf branch_weights metadata. llvm-svn: 358384	2019-04-15 09:30:31 +00:00
Serguei Katkov	f54328372b	[NewPM] Add Option handling for SimplifyCFG This patch enables passing options to SimplifyCFGPass via the passes pipeline. Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe Reviewed By: fedor.sergeev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D60675 llvm-svn: 358379	2019-04-15 08:57:53 +00:00
Bjorn Pettersson	60569363a5	[SelectionDAG] Use KnownBits::computeForAddSub/computeForAddCarry Summary: Use KnownBits::computeForAddSub/computeForAddCarry in SelectionDAG::computeKnownBits when doing value tracking for addition/subtraction. This should improve the precision of the known bits, as we only used to make a simple estimate of known zeroes. The KnownBits support functions are also able to deduce bits that are known to be one in the result. Reviewers: spatel, RKSimon, nikic, lebedev.ri Reviewed By: nikic Subscribers: nikic, javed.absar, lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60460 llvm-svn: 358372	2019-04-15 07:19:11 +00:00
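To make the precision claim concrete, here is a brute-force sketch of known-bits tracking for an addition. It is not the algorithm used by `KnownBits::computeForAddSub`; it enumerates every operand value consistent with the known bits, which is only feasible at the tiny width assumed here. The helper names and the 4-bit width are hypothetical.

```python
WIDTH = 4  # small width so brute force stays cheap (assumption for the sketch)
MASK = (1 << WIDTH) - 1

def candidates(known_zero, known_one):
    """All WIDTH-bit values consistent with the given known-zero/known-one masks."""
    return [v for v in range(1 << WIDTH)
            if v & known_zero == 0 and v & known_one == known_one]

def known_bits_of_add(lhs, rhs):
    """Brute-force known bits of lhs + rhs; each operand is a (zero, one) mask pair."""
    zero, one = MASK, MASK
    for a in candidates(*lhs):
        for b in candidates(*rhs):
            s = (a + b) & MASK
            zero &= ~s & MASK   # a bit stays known-zero only if it is 0 in every sum
            one &= s            # a bit stays known-one only if it is 1 in every sum
    return zero, one

# lhs = 0b01?? (top two bits known), rhs = 0b0100 (fully known)
lhs = (0b1000, 0b0100)
rhs = (0b1011, 0b0100)
zero, one = known_bits_of_add(lhs, rhs)
print(f"known zero: {zero:04b}, known one: {one:04b}")
```

Every possible sum here lies in 0b1000..0b1011, so bit 3 is known to be one and bit 2 is known to be zero. An estimate that only tracks known zeroes would miss the known-one bit.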
Craig Topper	abd87ff48b	[X86] Regenerate checks for domain-reassignment.mir Apparently there are some stray IMPLICIT_DEF operations that weren't in the checks. Not sure if they've always been there or something changed at some point. llvm-svn: 358371	2019-04-15 05:22:47 +00:00
Amara Emerson	946b1246d6	[GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with constants only. Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't be degraded. This change also improves the IRTranslator so that in most places, but not all, it creates constants using the MIRBuilder directly instead of first creating a new destination vreg and then creating a constant. By doing this, the buildConstant() method can just return the vreg of an existing G_CONSTANT instead of having to create a COPY from it. I measured a 0.2% improvement in compile time and a 0.9% improvement in code size at -O0 ARM64.
Compile time:
Program                                         base     cse      diff
test-suite...ark/tramp3d-v4/tramp3d-v4.test     9.04     9.12     0.8%
test-suite...Mark/mafft/pairlocalalign.test     2.68     2.66     -0.7%
test-suite...-typeset/consumer-typeset.test     5.53     5.51     -0.4%
test-suite :: CTMark/lencod/lencod.test         5.30     5.28     -0.3%
test-suite :: CTMark/Bullet/bullet.test         25.82    25.76    -0.2%
test-suite...:: CTMark/ClamAV/clamscan.test     6.92     6.90     -0.2%
test-suite...TMark/7zip/7zip-benchmark.test     34.24    34.17    -0.2%
test-suite :: CTMark/SPASS/SPASS.test           6.25     6.24     -0.1%
test-suite...:: CTMark/sqlite3/sqlite3.test     1.66     1.66     -0.1%
test-suite :: CTMark/kimwitu++/kc.test          13.61    13.60    -0.0%
Geomean difference                                                -0.2%
Code size:
Program                                         base     cse      diff
test-suite...-typeset/consumer-typeset.test     1315632  1266480  -3.7%
test-suite...:: CTMark/ClamAV/clamscan.test     1313892  1297508  -1.2%
test-suite :: CTMark/lencod/lencod.test         1439504  1423112  -1.1%
test-suite...TMark/7zip/7zip-benchmark.test     2936980  2904172  -1.1%
test-suite :: CTMark/Bullet/bullet.test         3478276  3445460  -0.9%
test-suite...ark/tramp3d-v4/tramp3d-v4.test     8082868  8033492  -0.6%
test-suite :: CTMark/kimwitu++/kc.test          3870380  3853972  -0.4%
test-suite :: CTMark/SPASS/SPASS.test           1434904  1434896  -0.0%
test-suite...Mark/mafft/pairlocalalign.test     764528   764528   0.0%
test-suite...:: CTMark/sqlite3/sqlite3.test     782092   782092   0.0%
Geomean difference                                                -0.9%
Differential Revision: https://reviews.llvm.org/D60580 llvm-svn: 358369	2019-04-15 05:04:20 +00:00
Nico Weber	ae050d214b	llvm-undname: Fix oss-fuzz-found crash-on-invalid with incomplete special table nodes llvm-svn: 358367	2019-04-14 23:32:37 +00:00
Nico Weber	63fe2593ae	llvm-undname: Fix another crash-on-invalid found by oss-fuzz llvm-svn: 358363	2019-04-14 23:08:12 +00:00
Craig Topper	3c57976447	[X86] Move VPTESTM matching from the isel table to custom code in X86ISelDAGToDAG. We had many tablegen patterns for these instructions. And due to the commutability of the patterns, tablegen expands them to even more patterns. All together VPTESTMD patterns accounted for more than 50K of the 610K isel table. This had gotten bad when we stopped canonicalizing AND to vXi64. This required a pattern for every combination of bitcast input type. This change moves the matching to custom code where it is easier to look through the bitcasts without being concerned with the specific types. The test changes are because we are now stricter with one use checks as it's required to make load folding legal. We now require the AND and any BITCAST to only have a single use. This prevents forming VPTESTM and a VPAND with the same inputs. We now support broadcast loads for 128/256 patterns without VLX. We'll widen to 512-bit like and still fold the broadcast since the amount of memory read doesn't change. There are a few tests that got slightly longer because we are now preferring load + VPTESTM over XOR+VPCMPEQ for (seteq (load), allzeros). Previously we were able to share the XOR with multiple VPTESTM instructions. llvm-svn: 358359	2019-04-14 18:26:11 +00:00
Craig Topper	b17e5ec61b	[X86] Don't form masked vpcmp/vcmp/vptestm operations if the setcc node has more than one use. We're better off emitting a single compare + kand rather than a compare for the other use and a masked compare. I'm looking into using custom instruction selection for VPTESTM to reduce the ridiculous number of permutations of patterns in the isel table. Putting a one use check on all masked compare folding makes load fold matching in the custom code easier. llvm-svn: 358358	2019-04-14 18:26:06 +00:00
Craig Topper	476dd06854	[X86] Update bool_reduction_v8f32 test cases from vector-compare-any_of.ll and vector-compare-all_of.ll to be proper reductions. One of the shuffles was used twice. While the intended shuffle wasn't connected. llvm-svn: 358346	2019-04-14 04:20:42 +00:00
Philip Reames	0eeb2cd491	[Tests] Add tests for D60659, and make adjustments to others to make diff clear Three related changes: 1) auto-gen several test files 2) Add the new tests at the bottom of said files 3) Adjust a couple of other test files not to use stores to constants when trying to test constexpr address handling llvm-svn: 358344	2019-04-13 22:12:56 +00:00
Bill Wendling	191f1487b6	[X86] Use PC-relative mode for the kernel code model Summary: The Linux kernel uses PC-relative mode, so allow that when the code model is "kernel". Reviewers: craig.topper Reviewed By: craig.topper Subscribers: llvm-commits, kees, nickdesaulniers Tags: #llvm Differential Revision: https://reviews.llvm.org/D60643 llvm-svn: 358343	2019-04-13 21:39:28 +00:00
Nikita Popov	040871db48	[CVP] Add tests for range of with.overflow result; NFC Test range of with.overflow result in the no-overflow branch. llvm-svn: 358341	2019-04-13 19:43:51 +00:00
Don Hinton	7d2021defc	[CommandLineParser] Add DefaultOption flag Summary: Add DefaultOption flag to CommandLineParser which provides a default option or alias, but allows users to override it for some other purpose as needed. Also, add `-h` as a default alias to `-help`, which can be seamlessly overridden by applications like llvm-objdump and llvm-readobj which use `-h` as an alias for other options. Reviewers: alexfh, klimek Reviewed By: klimek Subscribers: MaskRay, mehdi_amini, inglorion, dexonsmith, hiraditya, llvm-commits, jhenderson, arphaman, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D59746 llvm-svn: 358337	2019-04-13 16:55:28 +00:00
Nikita Popov	41e284b9c3	[CVP] Fix inverted predicates in test; NFC Checked the wrong direction in the umul tests... fix predicates to line up with the test name. llvm-svn: 358331	2019-04-13 11:47:36 +00:00
Nikita Popov	25c1aa15a7	[CVP] Add tests for with.overflow used as condition; NFC llvm-svn: 358330	2019-04-13 11:40:16 +00:00
Chen Zheng	87dd0e06dc	[InstCombine] Canonicalize (-X srem Y) to -(X srem Y). Differential Revision: https://reviews.llvm.org/D60647 llvm-svn: 358328	2019-04-13 09:21:22 +00:00
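The identity behind this canonicalization follows from `srem` taking the sign of its dividend. The sketch below checks it on unbounded integers and then shows an 8-bit case where a wrapping negation breaks it. The commit message does not state which overflow conditions the transform requires, so the second half is an observation about fixed-width arithmetic, not a description of the patch. `srem` and `wrap8` are hypothetical helpers.

```python
def srem(a, b):
    """Remainder with the sign of the dividend (truncating division), like LLVM's srem."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r

def wrap8(v):
    """Wrap an integer to a signed 8-bit value."""
    return (v + 128) % 256 - 128

# On unbounded integers the identity holds for every nonzero divisor.
holds = all(srem(-x, y) == -srem(x, y)
            for x in range(-50, 51) for y in range(-50, 51) if y != 0)

# In 8 bits, negating -128 wraps back to -128, so the two sides can differ.
x, y = -128, 3
lhs = srem(wrap8(-x), y)      # (-X) srem Y
rhs = wrap8(-srem(x, y))      # -(X srem Y)
print(holds, lhs, rhs)
```

The mismatch at the minimum signed value suggests the rewrite is only safe when the negation is known not to overflow.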
Chen Zheng	fc59a0326b	[InstCombine] [NFC] add testcases for canonicalizing (-X srem Y) to -(X srem Y). llvm-svn: 358327	2019-04-13 07:34:55 +00:00
Philip Reames	e03301a3b3	[StackMaps] Update llvm-readobj to parse V3 Stackmaps This updates the StackMap parser in the llvm-readobj tool to parse version 3 StackMaps, which were bumped in https://reviews.llvm.org/D32629. Version 3 StackMaps differ in that they have a uint16 sized "location size" field which was added to the Location block in a StackMap record. The record has additional padding for alignment. This was a backwards incompatible change resulting in a StackMap version bump. Patch By: jacob.hughes@kcl.ac.uk (with a rewrite of tests by me) Differential Revision: https://reviews.llvm.org/D59020 llvm-svn: 358325	2019-04-13 03:55:13 +00:00
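For orientation, a minimal sketch of decoding one version 3 Location entry. The field order and the kind encoding below are assumptions based on my reading of the LLVM StackMaps documentation, not something stated in this commit message; the parser is illustrative and is not the llvm-readobj implementation.

```python
import struct

# Assumed V3 Location layout (little-endian):
# uint8 kind, uint8 reserved, uint16 location size,
# uint16 DWARF regnum, uint16 reserved, int32 offset or small constant
LOCATION_V3 = struct.Struct("<BBHHHi")
KINDS = {1: "Register", 2: "Direct", 3: "Indirect", 4: "Constant", 5: "ConstantIndex"}

def parse_location(buf, offset=0):
    """Decode one Location entry starting at the given byte offset."""
    kind, _res0, size, regnum, _res1, value = LOCATION_V3.unpack_from(buf, offset)
    return {"kind": KINDS.get(kind, "Unknown"), "size": size,
            "regnum": regnum, "value": value}

# Hypothetical entry: a Constant location holding 1 with location size 8.
raw = LOCATION_V3.pack(4, 0, 8, 0, 0, 1)
loc = parse_location(raw)
print(loc)
```

Under this assumed layout each entry occupies 12 bytes, which is consistent with the commit's note that the added field and padding changed the record size.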
Philip Reames	eea989a909	[StackMaps] Add location size to llvm-readobj -stackmap output The size field of a location can be different for each entry, so it is useful to have this displayed in the output of llvm-readobj -stackmap. Below is an example of how the output would look: Record ID: 2882400000, instruction offset: 16 3 locations: #1: Constant 1, size: 8 #2: Constant 2, size: 8 #3: Constant 3, size: 8 0 live-outs: [ ] Patch By: jacob.hughes@kcl.ac.uk (with heavy modification by me) Differential Revision: https://reviews.llvm.org/D59169 llvm-svn: 358324	2019-04-13 03:08:45 +00:00
Amara Emerson	93e58d2396	[AArch64][GlobalISel] Enable copy elision in the pre-legalizer combine and fix a crash. This enables the simple copy combine that already exists in the CombinerHelper. However, it exposed a bug in the GISelChangeObserver where it wouldn't clear a set of MIs to process, and so would end up causing a crash when deleted MIs were being added to the combiner worklist again. Differential Revision: https://reviews.llvm.org/D60579 llvm-svn: 358318	2019-04-13 00:33:25 +00:00
Thomas Lively	fef8de66a6	[WebAssembly] Add DataCount section to object files Summary: This ensures that object files will continue to validate as WebAssembly modules in the presence of bulk memory operations. Engines that don't support bulk memory operations will not recognize the DataCount section and will report validation errors, but that's ok because object files aren't supposed to be run directly anyway. Reviewers: aheejin, dschuff, sbc100 Subscribers: jgravelle-google, hiraditya, sunfish, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60623 llvm-svn: 358315	2019-04-12 22:27:48 +00:00
Amara Emerson	bdb5e4e4ca	[GlobalISel] Fix a crash when handling an invalid MVT during call lowering. This crash was introduced in r358032 as we try to construct an EVT from an MVT in order to find the register type for the calling conv. Fall back instead of trying to do this with an invalid MVT coming from i256. llvm-svn: 358314	2019-04-12 22:05:46 +00:00
Alina Sbirlea	f9f073a861	[MemorySSA] Add previous def to cache when found, even if trivial. Summary: When inserting a new Def, MemorySSA may have a non-minimal number of Phis. While inserting, the walk to find the previous definition may clean up minimal Phis. When the last definition is trivial to obtain, we do not cache it. It is possible while getting the previous definition for a Def to get two different answers: - one that was straight-forward to find when walking the first path (a trivial phi in this case), and - another that follows a cleanup of the trivial phi, it determines it may need additional Phi nodes, it inserts them and returns a new phi in the same position as the former trivial one. While the Phis added for the second path are all redundant, they are not complete (the walk is only done upwards), and they are not properly cleaned up afterwards. A way to fix this problem is to cache the straight-forward answer we got on the first walk. The caching is only kept for the duration of a getPreviousDef call, and for Phis we use TrackingVH, so removing the trivial phi will lead to replacing it with the next dominating phi in the cache. Resolves PR40749. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60634 llvm-svn: 358313	2019-04-12 21:58:52 +00:00
Amara Emerson	2806fd01a1	[AArch64][GlobalISel] Fix a crash when selecting shufflevectors with an undef mask element. If a shufflevector's mask vector has an element with "undef" then the generic instruction defining that element register is a G_IMPLICIT_DEF instead of G_CONSTANT. This fixes the selector to handle this case, and for now assumes that undef just means zero. In future we'll optimize this case properly. llvm-svn: 358312	2019-04-12 21:31:21 +00:00
Thomas Lively	9e27514996	[WebAssembly] Add mutable-globals to bleeding-edge CPU Summary: This brings the backend in line with Clang. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60594 llvm-svn: 358310	2019-04-12 20:39:53 +00:00
Alina Sbirlea	57769382b1	[MemorySSA] Small fix for the clobber limit. Summary: After introducing the limit for clobber walking, `walkToPhiOrClobber` would assert that the limit is at least 1 on entry. The test included triggered that assert. The callsite in `tryOptimizePhi` making the calls to `walkToPhiOrClobber` is structured like this: ``` while (true) { if (getBlockingAccess()) { // calls walkToPhiOrClobber } for (...) { walkToPhiOrClobber(); } } ``` The cleanest fix is to check if the limit was reached inside `walkToPhiOrClobber`, and give an allowance of 1. This approach does not make any alias() calls (no calls to instructionClobbersQuery), so the performance condition is enforced. The limit is set back to 0 if not used, as this provides info on the fact that we stopped before reaching a true clobber. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60479 llvm-svn: 358303	2019-04-12 18:48:46 +00:00
Philip Reames	b091cc081d	[InstCombine] Fix a nasty miscompile introduced w/masked.gather demanded elts This fixes a miscompile which was introduced in r356510 (https://reviews.llvm.org/D57372). The problem is that the original patch removed pointer operands where the load results weren't demanded, but without considering the legality of the load itself. If the masked.gather had active, but undemanded, lanes, then we could end up creating a load which loaded from an undef address. The result could be a segfault, or, in theory, an arbitrary read from a random memory location into an unused register. llvm-svn: 358299	2019-04-12 18:26:56 +00:00
Nikita Popov	00a0d5d1de	[CVP] Set NSW/NUW flags when simplifying with.overflow When CVP determines that a with.overflow intrinsic cannot overflow, it currently inserts a simple add/sub. As we already determined that there can be no overflow, we should add the appropriate NUW/NSW flag. Differential Revision: https://reviews.llvm.org/D60585 llvm-svn: 358298	2019-04-12 18:18:17 +00:00
Philip Reames	7a60cd38af	[Tests] Checkin a test demonstrating a miscompile so that patch which fixes it shows a clear diff llvm-svn: 358296	2019-04-12 18:11:58 +00:00
Lang Hames	c7c1f21525	Simplify decoupling between RuntimeDyld/RuntimeDyldChecker, add 'got_addr' util. This patch reduces the number of functions in the interface between RuntimeDyld and RuntimeDyldChecker by combining "GetXAddress" and "GetXContent" functions into "GetXInfo" functions that return a struct describing both the address and content. The GetStubOffset function is also replaced with a pair of utilities, GetStubInfo and GetGOTInfo, that fit the new scheme. For RuntimeDyld both of these functions will return the same result, but for the new JITLink linker (https://reviews.llvm.org/D58704) these will provide the addresses of PLT stubs and GOT entries respectively. For JITLink's use, a 'got_addr' utility has been added to the rtdyld-check language, and the syntax of 'got_addr' and 'stub_addr' has been changed: both functions now take two arguments, a 'stub container name' and a target symbol name. For llvm-rtdyld/RuntimeDyld the stub container name is the object file name and section name, separated by a slash. E.g.: rtdyld-check: {8}(stub_addr(foo.o/__text, y)) = y For the upcoming llvm-jitlink utility, which creates stubs on a per-file basis rather than a per-section basis, the container name is just the file name. E.g.: jitlink-check: {8}(got_addr(foo.o, y)) = y llvm-svn: 358295	2019-04-12 18:07:28 +00:00
Brendon Cahoon	4df216cd62	[Hexagon] Fix reuse bug in Vector Loop Carried Reuse pass The Hexagon Vector Loop Carried Reuse pass was allowing reuse between two shufflevectors with different masks. The reason is that the masks are not instruction objects, so the code that checks each operand just skipped over the operands. This patch fixes the bug by checking if the operands are the same when they are not instruction objects. If the objects are not the same, then the code assumes that reuse cannot occur. Differential Revision: https://reviews.llvm.org/D60019 llvm-svn: 358292	2019-04-12 16:37:12 +00:00
Sanjay Patel	5e4ad39af7	[DAGCombiner] narrow shuffle of concatenated vectors // shuffle (concat X, undef), (concat Y, undef), Mask --> // concat (shuffle X, Y, Mask0), (shuffle X, Y, Mask1) The ARM changes with 'vtrn' and narrowed 'vuzp' are improvements. The x86 changes look neutral or better. There's one test with an extra instruction, but that could be reversed for a subtarget with the right attributes. But by default, we want to avoid the 256-bit op when possible (in my motivating benchmark, a handful of ymm ops sprinkled into a sequence of xmm ops are triggering frequency throttling on Haswell resulting in significantly worse perf). Differential Revision: https://reviews.llvm.org/D60545 llvm-svn: 358291	2019-04-12 16:31:56 +00:00
Simon Pilgrim	6c8f4ada36	[X86][SSE] Recognise vXi1 boolean anyof/allof reduction patterns Currently combineHorizontalPredicateResult only handles anyof/allof reduction patterns of legal types, which can be tricky to match as type legalization of bools can introduce bitcasts/truncs/extensions. This patch extends combineHorizontalPredicateResult to recognise vXi1 bool reductions as well and uses the existing combineBitcastvxi1 helper to create the MOVMSK necessary to then compare the signmask result. This ensures the accuracy of the reduction costs added in D60403 which assume the MOVMSK generation. Differential Revision: https://reviews.llvm.org/D60610 llvm-svn: 358286	2019-04-12 14:22:57 +00:00
Hans Wennborg	4e6b857922	Revert r358268 "[DebugInfo] DW_OP_deref_size in PrologEpilogInserter." It causes clang to crash while building Chromium. See https://crbug.com/952230 for reproducer. > The PrologEpilogInserter need to insert a DW_OP_deref_size before > prepending a memory location expression to an already implicit > expression to avoid having the existing expression act on the memory > address instead of the value behind it. > > The reason for using DW_OP_deref_size and not plain DW_OP_deref is that > big-endian targets need to read the right size as simply truncating a > larger read would yield the wrong result (LSB bytes are not at the lower > address). > > Differential Revision: https://reviews.llvm.org/D59687 llvm-svn: 358281	2019-04-12 12:54:52 +00:00
Eugene Leviant	88089fed9c	[llvm-objcopy] Fill .symtab_shndx section correctly Differential revision: https://reviews.llvm.org/D60555 llvm-svn: 358278	2019-04-12 11:59:30 +00:00
Kang Zhang	2446f843ae	[PowerPC] Add initialization for some ppc passes Summary: Some llc debug options need pass-name as the parameters. But if we use the pass-name ppc-early-ret, we will get below error: llc test.ll -stop-after ppc-early-ret LLVM ERROR: "ppc-early-ret" pass is not registered. Below pass-names have the pass is not registered error: ppc-ctr-loops ppc-ctr-loops-verify ppc-loop-preinc-prep ppc-toc-reg-deps ppc-vsx-copy ppc-early-ret ppc-vsx-fma-mutate ppc-vsx-swaps ppc-reduce-cr-ops ppc-qpx-load-splat ppc-branch-coalescing ppc-branch-select Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D60248 llvm-svn: 358271	2019-04-12 09:59:40 +00:00
Jeremy Morse	32afe6a1f8	[DebugInfo] Fix pr41175 Dead Store Elimination missing debug loc Bug: https://bugs.llvm.org/show_bug.cgi?id=41175 In the bug test case the DSE pass is shortening the range of memory that a memset is working on. A getelementptr is generated so that the new starting address can be passed to memset. This instruction was not given a DebugLoc. To fix the bug, copy the DebugLoc from the memset instruction. Patch by Orlando Cazalet-Hyams! Differential Revision: https://reviews.llvm.org/D60556 llvm-svn: 358270	2019-04-12 09:47:35 +00:00
Markus Lavin	138c76129b	[DebugInfo] DW_OP_deref_size in PrologEpilogInserter. The PrologEpilogInserter needs to insert a DW_OP_deref_size before prepending a memory location expression to an already implicit expression to avoid having the existing expression act on the memory address instead of the value behind it. The reason for using DW_OP_deref_size and not plain DW_OP_deref is that big-endian targets need to read the right size as simply truncating a larger read would yield the wrong result (LSB bytes are not at the lower address). Differential Revision: https://reviews.llvm.org/D59687 llvm-svn: 358268	2019-04-12 08:23:55 +00:00
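The big-endian argument can be checked with a small byte-level model. The sketch below is not DWARF evaluation code. It only compares a sized read (what DW_OP_deref_size would do) with a full-width read followed by truncation. The 2-byte variable, 8-byte word and filler bytes are hypothetical.

```python
# Hypothetical stack slot: a 2-byte variable 0x1234 followed by unrelated bytes.
VALUE, SIZE, WORD = 0x1234, 2, 8
junk = b"\xaa" * (WORD - SIZE)

def read(mem, nbytes, byteorder):
    """Read nbytes from the start of mem as an unsigned integer."""
    return int.from_bytes(mem[:nbytes], byteorder)

results = {}
for order in ("little", "big"):
    mem = VALUE.to_bytes(SIZE, order) + junk
    sized = read(mem, SIZE, order)                                # like DW_OP_deref_size 2
    truncated = read(mem, WORD, order) & ((1 << (8 * SIZE)) - 1)  # full-width read, then truncate
    results[order] = (sized, truncated)

for order, (sized, truncated) in results.items():
    print(f"{order:>6}: sized={sized:#06x} truncated={truncated:#06x}")
```

On the little-endian layout both reads agree, while on the big-endian layout the truncated read picks up the neighbouring bytes. That is the failure mode the commit message describes.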
Fangrui Song	d5c404246f	[ConstantFold] Don't evaluate FP or FP vector casts or truncations when simplifying icmp Fix PR41476 llvm-svn: 358262	2019-04-12 07:34:30 +00:00
Eric Christopher	b6926bdcff	Revert "[PowerPC] Add initialization for some ppc passes" This reverts commit `6f8f98ce8d` as it is breaking nearly every bot. llvm-svn: 358260	2019-04-12 07:16:58 +00:00
Craig Topper	3b1239d2a8	[TargetLowering][X86] Teach SimplifyDemandedBits to use ShrinkDemandedOp on ISD::SHL nodes. If the upper bits of the SHL result aren't used, we might be able to use a narrower shift. For example, on X86 this can turn a 64-bit into 32-bit enabling a smaller encoding. Differential Revision: https://reviews.llvm.org/D60358 llvm-svn: 358257	2019-04-12 06:49:28 +00:00
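The narrowing is sound because the low bits of a left shift depend only on the low bits of its input. A quick Python check of that property for 64-bit to 32-bit narrowing, with hypothetical helper names and shift amounts below 32:

```python
import random

M32 = (1 << 32) - 1
M64 = (1 << 64) - 1

def shl64_low32(x, k):
    """64-bit shift, keeping only the low 32 bits of the result."""
    return ((x << k) & M64) & M32

def shl32(x, k):
    """32-bit shift applied to the truncated operand."""
    return ((x & M32) << k) & M32

rng = random.Random(0)  # fixed seed so the check is reproducible
agree = all(shl64_low32(x, k) == shl32(x, k)
            for x in (rng.getrandbits(64) for _ in range(1000))
            for k in range(32))
print(agree)
```

This only models the arithmetic. Whether the narrower shift is actually profitable is a target decision, which the commit delegates to ShrinkDemandedOp.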
Kang Zhang	6f8f98ce8d	[PowerPC] Add initialization for some ppc passes Summary: Some llc debug options need pass-name as the parameters. But if we use the pass-name ppc-early-ret, we will get below error: llc test.ll -stop-after ppc-early-ret LLVM ERROR: "ppc-early-ret" pass is not registered. Below pass-names have the pass is not registered error: ppc-ctr-loops ppc-ctr-loops-verify ppc-loop-preinc-prep ppc-toc-reg-deps ppc-vsx-copy ppc-early-ret ppc-vsx-fma-mutate ppc-vsx-swaps ppc-reduce-cr-ops ppc-qpx-load-splat ppc-branch-coalescing ppc-branch-select Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D60248 llvm-svn: 358256	2019-04-12 06:35:15 +00:00
Zi Xuan Wu	ac79ef8f0e	[PowerPC] More precise exploitation of P9 maddld instruction when operands are constant There are 3 operands of maddld, (add (mul %1, %2), %3) and sometimes they are constant. If there is constant operand, it takes extra li to materialize the operand, and one more extra register too. So it's not profitable to use maddld to optimize mul-add pattern. Differential Revision: https://reviews.llvm.org/D60181 llvm-svn: 358253	2019-04-12 05:21:31 +00:00
Nico Weber	03db625c13	llvm-undname: Fix out-of-bounds read on invalid intrinsic function code Found by inspection. llvm-svn: 358239	2019-04-11 23:11:33 +00:00
Nico Weber	e5b62654a5	llvm-undname: Don't crash on incomplete enum tag manglings Found by inspection. llvm-svn: 358238	2019-04-11 22:59:25 +00:00
Nico Weber	b4f33bbbb0	llvm-undname: Fix crash on incomplete virtual this adjusts Found by oss-fuzz. Also remove an else-after-return, this part has no behavior change. llvm-svn: 358237	2019-04-11 22:47:18 +00:00
Nico Weber	f2d8f09d5d	llvm-undname: Fix crash on invalid name in a template parameter pointer to member arg Found by oss-fuzz. llvm-svn: 358234	2019-04-11 22:23:35 +00:00
Brendon Cahoon	57c3d4bed3	[Pipeliner] Fix incorrect loop carried dependence calculation The isLoopCarriedDep function does not correctly compute loop carried dependences when the array index offset is negative or the stride is smaller than the access size. Patch by Denis Antrushin. Differential Revision: https://reviews.llvm.org/D60135 llvm-svn: 358233	2019-04-11 21:57:51 +00:00
Nikita Popov	6ffa1511ea	[CVP] Generate full test checks for overflows.ll; NFC llvm-svn: 358229	2019-04-11 21:10:39 +00:00
Rong Xu	959ef16859	[PGO] Better handling of profile hash mismatch We currently assume profile hash conflicts will be caught by an upfront check and we assert for the cases that escape the check. The assumption is not always true as there are chances of conflict. This patch prints a warning and skips annotating the function for the escaped cases. Differential Revision: https://reviews.llvm.org/D60154 llvm-svn: 358225	2019-04-11 20:54:17 +00:00
Amara Emerson	7e9355f870	[AArch64][GlobalISel] Flesh out vector load/store support for more types. Some of these were legalizing into smaller vector types unnecessarily, others were simply not supported yet. llvm-svn: 358223	2019-04-11 20:40:01 +00:00
Amara Emerson	b956051415	[AArch64][GlobalISel] Legalization and ISel support for load/stores of vectors of pointers. Loads and store of values with type like <2 x p0> currently don't get imported because SelectionDAG has no knowledge of pointer types. To leverage the existing support for vector load/stores, we can bitcast the value to have s64 element types instead. We do this as a custom legalization. This patch also adds support for general loads of <2 x s64>, and relaxes some type conditions on selecting G_BITCAST. Differential Revision: https://reviews.llvm.org/D60534 llvm-svn: 358221	2019-04-11 20:32:24 +00:00
Aaron Smith	994023a3f1	[DebugInfo] Combine Trivial and NonTrivial flags Summary: Companion to https://reviews.llvm.org/D59347 Reviewers: rnk, zturner, probinson, dblaikie, deadalnix Subscribers: aprantl, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59348 llvm-svn: 358220	2019-04-11 20:25:10 +00:00
Craig Topper	68a5d619a4	[X86] Restrict vselect handling in scalarizeExtEltFP to only case to pre type legalization where the setcc result type is vXi1. If the vector setcc has been legalized then we will need to convert a vector boolean of 0 or -1 to a scalar boolean of 0 or 1. The added test case previously crashed in 32-bit mode by creating a setcc with an i64 condition that type legalization couldn't expand. llvm-svn: 358218	2019-04-11 19:57:44 +00:00
Craig Topper	a3635b94c4	[X86] Add 32-bit command line to extractelement-fp.ll so I can add a test case for a 32-bit only crasher. NFC This is a bit ugly for ABI reasons about how floats/doubles are returned. llvm-svn: 358217	2019-04-11 19:57:24 +00:00
Craig Topper	586fad50ac	[X86] Add patterns for using movss/movsd for atomic load/store of f32/64. Remove atomic fadd pseudos use isel patterns instead. This patch adds patterns for turning bitcasted atomic load/store into movss/sd. It also removes the pseudo instructions for atomic RMW fadd. Instead just adding isel patterns for folding an atomic load into addss/sd. And relying on the new movss/sd store pattern to handle the write part. This also makes the fadd patterns use VEX and EVEX instructions when AVX or AVX512F are enabled. Differential Revision: https://reviews.llvm.org/D60394 llvm-svn: 358215	2019-04-11 19:19:52 +00:00
Craig Topper	f7e548c076	Recommit r358211 "[X86] Use FILD/FIST to implement i64 atomic load on 32-bit targets with X87, but no SSE2" With correct test checks this time. If we have X87, but not SSE2 we can atomically load an i64 value into the significand of an 80-bit extended precision x87 register using fild. We can then use a fist instruction to convert it back to an i64 integer and store it to a stack temporary. From there we can do two 32-bit loads to get the value into integer registers without worrying about atomicness. This matches what gcc and icc do for this case and removes an existing FIXME. llvm-svn: 358214	2019-04-11 19:19:42 +00:00
Craig Topper	8200880c9a	Revert r358211 "[X86] Use FILD/FIST to implement i64 atomic load on 32-bit targets with X87, but no SSE2" I seem to have messed up the test checks. llvm-svn: 358212	2019-04-11 19:04:38 +00:00
Craig Topper	1c2dfc3100	[X86] Use FILD/FIST to implement i64 atomic load on 32-bit targets with X87, but no SSE2 If we have X87, but not SSE2 we can atomically load an i64 value into the significand of an 80-bit extended precision x87 register using fild. We can then use a fist instruction to convert it back to an i64 integer and store it to a stack temporary. From there we can do two 32-bit loads to get the value into integer registers without worrying about atomicness. This matches what gcc and icc do for this case and removes an existing FIXME. Differential Revision: https://reviews.llvm.org/D60156 llvm-svn: 358211	2019-04-11 18:40:21 +00:00
Craig Topper	1fe5a9963d	[X86] Pre-commit i64 volatile test case for D60156. NFC llvm-svn: 358210	2019-04-11 18:40:08 +00:00
Simon Pilgrim	8d083c5e0b	[ConstantFold] ExtractConstantBytes - handle shifts on large integer types Use APInt instead of getZExtValue from the ConstantInt until we can confirm that the shift amount is in range. Reduced from OSS-Fuzz #14169 - https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14169 llvm-svn: 358192	2019-04-11 16:39:31 +00:00
Simon Pilgrim	40b647ae8e	[X86] SimplifyDemandedVectorElts - add X86ISD::VPERMV3 mask support Completes SimplifyDemandedVectorElts's basic variable shuffle mask support which should help D60512 + D60562 llvm-svn: 358186	2019-04-11 15:29:15 +00:00
Serge Guelton	3742bb89f8	Make llvm-nm -help great again Only display help from the llvm-nm category instead of all llvm options, which makes it much more usable. There's still an issue with -s, which is probably a bug in llvm::cl and worth another commit. Differential Revision: https://reviews.llvm.org/D60411 llvm-svn: 358185	2019-04-11 15:22:48 +00:00
Roger Ferrer Ibanez	b621f04135	[RISCV] Diagnose invalid second input register operand when using %tprel_add RISCVMCCodeEmitter::expandAddTPRel asserts that the second operand must be x4/tp. As we are not currently checking this in the RISCVAsmParser, the assert is easy to trigger due to wrong assembly input. This patch does a late check of this constraint. An alternative could be using a singleton register class for x4/tp similar to the current one for sp. Unfortunately it does not result in a good diagnostic. Because add is an overloaded mnemonic, if no matching is possible, the diagnostic of the first failing alternative seems to be used as the diagnostic itself. This means that in this case the %tprel_add is diagnosed as an invalid operand (because the real add instruction only has 3 operands). Differential Revision: https://reviews.llvm.org/D60528 llvm-svn: 358183	2019-04-11 15:13:12 +00:00
Simon Pilgrim	a41275a398	[X86][AVX] Tweak X86ISD::VPERMV3 demandedelts test Original test was too dependent on the order of the combines that could cause the inserted element being demanded after all llvm-svn: 358182	2019-04-11 15:09:03 +00:00
Simon Pilgrim	34686b6e97	[X86][AVX] Add X86ISD::VPERMV3 demandedelts test llvm-svn: 358175	2019-04-11 14:48:46 +00:00
Simon Pilgrim	8a25154fa7	[X86] SimplifyDemandedVectorElts - add X86ISD::VPERMV mask support llvm-svn: 358174	2019-04-11 14:35:45 +00:00
Simon Pilgrim	b237b54c2d	[X86][AVX] Add X86ISD::VPERMV demandedelts test llvm-svn: 358173	2019-04-11 14:26:32 +00:00
Sanjay Patel	c0f4a35e68	[DAGCombiner][x86] scalarize inserted vector FP ops // bo (build_vec ...undef, x, undef...), (build_vec ...undef, y, undef...) --> // build_vec ...undef, (bo x, y), undef... The lifetime of the nodes in these examples is different for variables versus constants, but they are all build vectors briefly, so I'm proposing to catch them in this form to handle all of the leading examples in the motivating test file. Before we have build vectors, we might have insert_vector_element. After that, we might have scalar_to_vector and constant pool loads. It's going to take more work to ensure that FP vector operands are getting simplified with undef elements, so this transform can apply more widely. In a non-loose FP environment, we are likely simplifying FP elements to NaN values rather than undefs. We also need to allow more opcodes down this path. Eg, we don't handle FP min/max flavors yet. Differential Revision: https://reviews.llvm.org/D60514 llvm-svn: 358172	2019-04-11 14:21:57 +00:00
Diogo N. Sampaio	8ddfd46c61	[AArch64] Add lowering pattern for llvm.aarch64.neon.vcvtfxs2fp.f16.i64 Summary: Add lowering pattern for llvm.aarch64.neon.vcvtfxs2fp.f16.i64 Reviewers: pbarrio, DavidSpickett, LukeGeeson Reviewed By: LukeGeeson Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60259 llvm-svn: 358171	2019-04-11 14:19:43 +00:00
Simon Pilgrim	6f3866c6fb	[X86] SimplifyDemandedVectorElts - add X86ISD::VPERMILPV mask support llvm-svn: 358170	2019-04-11 14:15:01 +00:00
Simon Pilgrim	886e32e0f2	[X86][AVX] Add X86ISD::VPERMILPV demandedelts tests llvm-svn: 358168	2019-04-11 14:09:35 +00:00
Simon Pilgrim	cb5218ad48	[X86] SimplifyDemandedVectorElts - add X86ISD::VPERMIL2 mask support llvm-svn: 358167	2019-04-11 14:04:19 +00:00
Simon Pilgrim	7021dec26e	[X86][XOP] Add X86ISD::VPERMIL2 demandedelts test llvm-svn: 358166	2019-04-11 13:52:43 +00:00
Simon Pilgrim	e468cc7f14	[X86] SimplifyDemandedVectorElts - add VPPERM support We need to add support for all variable shuffle mask ops, but VPPERM is the only one that already has test coverage. llvm-svn: 358165	2019-04-11 13:30:38 +00:00
Andrea Di Biagio	2050dff996	[MCA] Remove wrong comments from a test. NFC llvm-svn: 358160	2019-04-11 10:15:04 +00:00
Roman Lebedev	fbb823891d	[llvm-exegesis] Fix serialization/deserialization of special NoRegister register (PR41448) Summary: A lot of instructions have this special register. It seems this never really worked, but i finally noticed it only because it happened to break for `CMOV16rm` instruction. We serialized that register as "" (empty string), which is naturally 'ignored' during deserialization, so we re-create a `MCInst` with too few operands. And when we then happened to try to resolve variant sched class for this mis-serialized instruction, and the variant predicate tried to read an operand that was out of bounds since we got less operands, we crashed. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=41448 \| PR41448 ]]. Reviewers: craig.topper, courbet Reviewed By: courbet Subscribers: tschuett, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60517 llvm-svn: 358153	2019-04-11 07:20:50 +00:00
Shiva Chen	7cc03bd064	[RISCV] Put data smaller than eight bytes to small data section Because of gp = sdata_start_address + 0x800, gp with signed twelve-bit offset could cover most of the small data section. Linker relaxation could transfer the multiple data accessing instructions to a gp base with signed twelve-bit offset instruction. Differential Revision: https://reviews.llvm.org/D57493 llvm-svn: 358150	2019-04-11 04:59:13 +00:00
Fangrui Song	6a285dfe71	[DWARF] Set discriminator to 0 for DW_LNS_copy Summary: Make DW_LNS_copy set the discriminator register to 0, to conform to DWARF 4 & 5: "Then it sets the discriminator register to 0, and sets the basic_block, prologue_end and epilogue_begin registers to false." Because all of DW_LNE_end_sequence, DW_LNS_copy, and special opcodes reset discriminator to 0, we can move discriminator=0 to appendRowToMatrix. Also, make DW_LNS_copy print before appending the row, as it is similar to an address+=0,line+=0 special opcode, which prints before appending the row. Reviewers: dblaikie, probinson, aprantl Reviewed By: dblaikie Subscribers: danielcdh, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60364 llvm-svn: 358148	2019-04-11 02:02:44 +00:00
Erik Pilkington	cb5c7bd9eb	Fix a hang when lowering __builtin_dynamic_object_size If the ObjectSizeOffsetEvaluator fails to fold the object size call, then it may litter some unused instructions in the function. When done repeatedly in InstCombine, this results in an infinite loop. Fix this by tracking the set of instructions that were inserted, then removing them on failure. rdar://49172227 Differential revision: https://reviews.llvm.org/D60298 llvm-svn: 358146	2019-04-10 23:42:11 +00:00
Amara Emerson	213e0bde04	[AArch64][GlobalISel] Make <2 x p0> = G_BUILD_VECTOR legal. The existing isel support already works for p0 once the legalizer accepts it. llvm-svn: 358144	2019-04-10 23:06:14 +00:00
Amara Emerson	a7ff111b04	[AArch64][GlobalISel] Add legalizer support for <8 x s16> and <16 x s8> G_ADD. llvm-svn: 358143	2019-04-10 23:06:11 +00:00
Amara Emerson	ae878dab03	[AArch64][GlobalISel] Scalarize vector SDIV. llvm-svn: 358142	2019-04-10 23:06:08 +00:00
Craig Topper	10048060f6	[X86] Add SSE1 command line to atomic-fp.ll and atomic-non-integer.ll. NFC llvm-svn: 358141	2019-04-10 22:35:32 +00:00
Craig Topper	a3ee7e2b3e	[X86] Autogenerate complete checks. NFC llvm-svn: 358140	2019-04-10 22:35:24 +00:00
Craig Topper	61f31cbcb2	[X86] Teach foldMaskedShiftToScaledMask to look through an any_extend from i32 to i64 between the and & shl foldMaskedShiftToScaledMask tries to reorder and & shl to enable the shl to fold into an LEA. But if there is an any_extend between them it doesn't work. This patch modifies the code to look through any_extend from i32 to i64 when the and mask only uses bits that weren't from the extended part. This will prevent a regression from D60358 caused by 64-bit SHL being narrowed to 32-bits when their upper bits aren't demanded. Differential Revision: https://reviews.llvm.org/D60532 llvm-svn: 358139	2019-04-10 21:42:08 +00:00
Craig Topper	4a32ce39b7	[X86] Make _Int instructions the preferred instruction for the assembly parser and disassembly parser to remove inconsistencies between VEX and EVEX. Many of our instructions have both a _Int form used by intrinsics and a form used by other IR constructs. In the EVEX space the _Int versions usually cover all the capabilities including broadcasting and rounding. While the other version only covers simple register/register or register/load forms. For this reason in EVEX, the non intrinsic form is usually marked isCodeGenOnly=1. In the VEX encoding space we were less consistent, but usually the _Int version was the isCodeGenOnly version. This commit makes the VEX instructions match the EVEX instructions. This was done by manually studying the AsmMatcher table so it's possible I missed some cases, but we should be closer now. I'm thinking about using the isCodeGenOnly bit to simplify the EVEX2VEX tablegen code that disambiguates the _Int and non _Int versions. Currently it checks register class sizes and Record the memory operands come from. I have some other changes I was looking into for D59266 that may break the memory check. I had to make a few scheduler hacks to keep the _Int versions from being treated differently than the non _Int version. Differential Revision: https://reviews.llvm.org/D60441 llvm-svn: 358138	2019-04-10 21:29:41 +00:00
David Green	deb3342018	[ARM] Add an extra test for constant hoist. NFC llvm-svn: 358128	2019-04-10 19:18:58 +00:00
Craig Topper	cacb70c94b	[X86] Add test case for LEA formation regression seen with D60358. NFC If we have an (add X, (and (aext (shl Y, C1)), C2)), we can pull the shift through and+aext to fold into an LEA with the. Assuming C1 is small enough and C2 masks off all of the extend bits. This pattern showed up in D60358. And we need to handle it to prevent a regression. llvm-svn: 358124	2019-04-10 19:09:06 +00:00
Roman Lebedev	5d9f656bb7	[TableGen] Introduce !listsplat 'binary' operator Summary: ``` ``!listsplat(a, size)`` A list value that contains the value ``a`` ``size`` times. Example: ``!listsplat(0, 2)`` results in ``[0, 0]``. ``` I plan to use this in X86ScheduleBdVer2.td for LoadRes handling. This is a little bit controversial because unlike every other binary operator the types aren't identical. Reviewers: stoklund, javed.absar, nhaehnle, craig.topper Reviewed By: javed.absar Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60367 llvm-svn: 358117	2019-04-10 18:26:36 +00:00
David Green	4e3fd7757a	[ARM] Add an extra constant hoisting test. NFC This adds a simple extra test for constant hoisting to show its usefulness with constant addresses like those seen in memory mapped registers in embedded systems. llvm-svn: 358114	2019-04-10 18:05:57 +00:00
David Green	0861c87b06	Revert rL357745: [SelectionDAG] Compute known bits of CopyFromReg Certain optimisations from ConstantHoisting and CGP rely on Selection DAG not seeing through to the constant in other blocks. Revert this patch while we come up with a better way to handle that. I will try to follow this up with some better tests. llvm-svn: 358113	2019-04-10 18:00:41 +00:00
Nico Weber	5f6eb1817a	llvm-undname: Fix another crash-on-invalid This fixes a regression from https://reviews.llvm.org/D60354. We used to SymbolNode Symbol = demangleEncodedSymbol(MangledName, QN); if (Symbol) { Symbol->Name = QN; } but changed that to SymbolNode Symbol = demangleEncodedSymbol(MangledName, QN); if (Error) return nullptr; Symbol->Name = QN; and one branch somewhere returned a nullptr without setting Error. Looking at the code changed in r340083 and r340710 that branch looks like a remnant from an earlier attempt to demangle RTTI descriptors that has since been rewritten -- so just remove this branch. It shouldn't change behavior for correctly mangled symbols. llvm-svn: 358112	2019-04-10 17:31:34 +00:00
Matt Arsenault	7187272b2b	GlobalISel: Support legalizing G_CONSTANT with irregular breakdown llvm-svn: 358109	2019-04-10 17:27:53 +00:00
Craig Topper	35fe07916a	[AArch64] Teach getTestBitOperand to look through ANY_EXTENDS This patch teaches getTestBitOperand to look through ANY_EXTENDs when the extended bits aren't used. The test case changed here is based on what D60358 did to test16 in tbz-tbnz.ll. So this patch will avoid that regression. Differential Revision: https://reviews.llvm.org/D60482 llvm-svn: 358108	2019-04-10 17:27:29 +00:00
Matt Arsenault	9e0eeba569	GlobalISel: Handle odd breakdowns for bit ops llvm-svn: 358105	2019-04-10 17:07:56 +00:00
Nikita Popov	0a8228fd28	[InstCombine] Handle ssubo always overflow Following D60483 and D60497, this adds support for AlwaysOverflows handling for ssubo. This is the last case we can handle right now. Differential Revision: https://reviews.llvm.org/D60518 llvm-svn: 358100	2019-04-10 16:32:15 +00:00
Nikita Popov	7a543c3758	[InstCombine] ssubo X, C -> saddo X, -C ssubo X, C is equivalent to saddo X, -C. Make the transformation in InstCombine and allow the logic implemented for saddo to fold prior usages of add nsw or sub nsw with constants. Patch by Dan Robertson. Differential Revision: https://reviews.llvm.org/D60061 llvm-svn: 358099	2019-04-10 16:27:36 +00:00
Simon Pilgrim	37d8d55823	[X86][AVX] getTargetConstantBitsFromNode - extract bits from X86ISD::SUBV_BROADCAST llvm-svn: 358096	2019-04-10 16:24:47 +00:00
Nikita Popov	ef23e88480	[InstCombine] Handle saddo always overflow Followup to D60483: Handle AlwaysOverflow conditions for saddo as well. Differential Revision: https://reviews.llvm.org/D60497 llvm-svn: 358095	2019-04-10 16:18:01 +00:00
Diogo N. Sampaio	aae424a2d2	[AArch64] Add lowering pattern for scalar fp16 facge and facgt Summary: The fp16 scalar version of facge and facgt requires custom pattern matching, as the result type is not the same width as the operands. Reviewers: olista01, javed.absar, pbarrio Reviewed By: javed.absar Subscribers: kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60212 llvm-svn: 358083	2019-04-10 13:34:18 +00:00
Diogo N. Sampaio	651463e4a8	[ARM] [FIX] Add missing f16 vector operations lowering Summary: Add missing <8xhalf> shufflevectors pattern, when using concat_vector dag node. As well, allows <8xhalf> and <4xhalf> vldup1 operations. These instructions are required for v8.2a fp16 lowering of vmul_n_f16, vmulq_n_f16 and vmulq_lane_f16 intrinsics. Reviewers: olista01, pbarrio, LukeGeeson, efriedma Reviewed By: efriedma Subscribers: efriedma, javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60319 llvm-svn: 358081	2019-04-10 13:28:06 +00:00
Xing GUO	8ab7414580	[llvm-readobj] Should declare `ListScope` for `verneed` entries. Summary: YAML mappings require keys to be unique. See: https://yaml.org/spec/1.2/spec.html#id2764652 Reviewers: jhenderson, grimar, rupprecht, espindola, ruiu Reviewed By: ruiu Subscribers: ruiu, emaste, arichardson, MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60438 llvm-svn: 358078	2019-04-10 12:47:21 +00:00
David Stenberg	b96943b6a0	[DebugInfo] Track multiple registers in DbgEntityHistoryCalculator Summary: When calculating the debug value history, DbgEntityHistoryCalculator would only keep track of register clobbering for the latest debug value per inlined entity. This meant that preceding register-described debug value fragments would live on until the next overlapping debug value, ignoring any potential clobbering. This patch amends DbgEntityHistoryCalculator so that it keeps track of all registers that an inlined entity's currently live debug values are described by. The DebugInfo/COFF/pieces.ll test case has had to be changed since previously a register-described fragment would incorrectly outlive its basic block. The parent patch D59941 is expected to increase the coverage slightly, as it makes sure that location list entries are inserted after clobbered fragments, and this patch is expected to decrease it, as it stops preceding register-described fragments from living longer than they should. All in all, this patch and the preceding patch have a negligible effect on the output from `llvm-dwarfdump -statistics` for a clang-3.4 binary built using the RelWithDebInfo build profile. "Scope bytes covered" increases by 0.5%, and "variables with location" increases from 2212083 to 2212088, but it should improve the accuracy quite a bit. This fixes PR40283. Reviewers: aprantl, probinson, dblaikie, rnk, bjope Reviewed By: aprantl Subscribers: llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D59942 llvm-svn: 358073	2019-04-10 11:28:28 +00:00
David Stenberg	5ffec6deef	[DebugInfo] Improve handling of clobbered fragments Summary: Currently the DbgValueHistoryMap only keeps track of clobbered registers for the last debug value that it has encountered. This could lead to preceding register-described debug values living on longer in the location lists than they should. See PR40283 for an example. This patch does not introduce tracking of multiple registers, but changes the DbgValueHistoryMap structure to allow for that in a follow-up patch. This patch is not NFC, as it at least fixes two bugs in DwarfDebug (both are covered in the new clobbered-fragments.mir test): * If a debug value was clobbered (its End pointer set), the value would still be added to OpenRanges, meaning that the succeeding location list entries could potentially contain stale values. * If a debug value was clobbered, and there were non-overlapping fragments that were still live after the clobbering, DwarfDebug would not create a location list entry starting directly after the clobbering instruction. This meant that the location list could have a gap until the next debug value for the variable was encountered. Before this patch, the history map was represented by <Begin, End> pairs, where a new pair was created for each new debug value. When dealing with partially overlapping register-described debug values, such as in the following example: DBG_VALUE $reg2, $noreg, !1, !DIExpression(DW_OP_LLVM_fragment, 32, 32) [...] DBG_VALUE $reg3, $noreg, !1, !DIExpression(DW_OP_LLVM_fragment, 64, 32) [...] $reg2 = insn1 [...] $reg3 = insn2 the history map would then contain the entries `[<DV1, insn1>, <DV2, insn2>]`. This would leave it up to the users of the map to be aware of the relative order of the instructions, which e.g. could make DwarfDebug::buildLocationList() needlessly complex.
Instead, this patch makes the history map structure monotonically increasing by dropping the End pointer, and replacing that with explicit clobbering entries in the vector. Each debug value has an "end index", which if set, points to the entry in the vector that ends the debug value. The ending entry can either be an overlapping debug value, or an instruction which clobbers the register that the debug value is described by. The ending entry's instruction can thus either be excluded or included in the debug value's range. If the end index is not set, the debug value that the entry introduces is valid until the end of the function. Changes to test cases: * DebugInfo/X86/pieces-3.ll: The range of the first DBG_VALUE, which describes that the fragment (0, 64) is located in RDI, was incorrectly ended by the clobbering of RAX, which the second (non-overlapping) DBG_VALUE was described by. With this patch we get a second entry that only describes RDI after that clobbering. * DebugInfo/ARM/partial-subreg.ll: This test seems to indicate a bug in LiveDebugValues that is caused by it not being aware of fragments. I have added some comments in the test case about that. Also, before this patch DwarfDebug would incorrectly include a register-described debug value from a preceding block in a location list entry. Reviewers: aprantl, probinson, dblaikie, rnk, bjope Reviewed By: aprantl Subscribers: javed.absar, kristof.beyls, jdoerfert, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D59941 llvm-svn: 358072	2019-04-10 11:28:20 +00:00
Diana Picus	b6e83b98f9	[ARM GlobalISel] Select G_FCONSTANT for VFP3 Make it possible to TableGen code for FCONSTS and FCONSTD. We need to make two changes to the TableGen descriptions of vfp_f32imm and vfp_f64imm respectively: * add GISelPredicateCode to check that the immediate fits in 8 bits; * extract the SDNodeXForms into separate definitions and create a GISDNodeXFormEquiv and a custom renderer function for each of them. There's a lot of boilerplate to get the actual value of the immediate, but it basically just boils down to calling ARM_AM::getFP32Imm or ARM_AM::getFP64Imm. llvm-svn: 358063	2019-04-10 09:14:32 +00:00
Diana Picus	3533ad6801	[ARM GlobalISel] Select G_FCONSTANT into pools Put all floating point constants into constant pools and load their values from there. llvm-svn: 358062	2019-04-10 09:14:24 +00:00
Diana Picus	165846b031	[ARM GlobalISel] Map G_FCONSTANT llvm-svn: 358061	2019-04-10 09:14:16 +00:00
David Stenberg	fab4bdf4b9	Add REQUIRES: asserts to test using -debug-only llvm-svn: 358057	2019-04-10 08:44:57 +00:00
Florian Hahn	db1a69c250	[VPLAN] Minor improvement to testing and debug messages. 1. Use computed VF for stress testing. 2. If the computed VF does not produce vector code (VF smaller than 2), force VF to be 4. 3. Test vectorization of i64 data on AArch64 to make sure we generate VF != 4 (on X86 that was already tested on AVX). Patch by Francesco Petrogalli <francesco.petrogalli@arm.com> Differential Revision: https://reviews.llvm.org/D59952 llvm-svn: 358056	2019-04-10 08:17:28 +00:00
Nikita Popov	09020ec2a7	[InstCombine] Handle usubo always overflow Check AlwaysOverflow condition for usubo. The implementation is the same as the existing handling for uaddo and umulo. Handling for saddo and ssubo will follow (smulo doesn't have the necessary ValueTracking support). Differential Revision: https://reviews.llvm.org/D60483 llvm-svn: 358052	2019-04-10 07:10:53 +00:00
Chen Zheng	5e13ff1da2	[InstCombine] Canonicalize (-X s/ Y) to -(X s/ Y). Differential Revision: https://reviews.llvm.org/D60395 llvm-svn: 358050	2019-04-10 06:52:09 +00:00
Akira Hatanaka	9ca9d32b6b	[ObjC][ARC] Convert the retainRV marker that is passed as a named metadata into a module flag in the auto-upgrader and make the ARC contract pass read the marker as a module flag. This is needed to fix a bug where ARC contract wasn't inserting the retainRV marker when LTO was enabled, which caused objects returned from a function to be auto-released. rdar://problem/49464214 Differential Revision: https://reviews.llvm.org/D60303 llvm-svn: 358047	2019-04-10 06:20:20 +00:00
Craig Topper	391d5caa10	[X86] Move the 2 byte VEX optimization for MOV instructions back to the X86AsmParser::processInstruction where it used to be. Block when {vex3} prefix is present. Years ago I moved this to an InstAlias using VR128H/VR128L. But now that we support {vex3} pseudo prefix, we need to block the optimization when it is set to match gas behavior. llvm-svn: 358046	2019-04-10 05:43:20 +00:00
Jim Lin	a49c95e02a	[Sparc] Fix incorrect MI insertion position for spilling f128. Summary: Obviously, the newly built MI (sethi+add or sethi+xor+add) for constructing a large offset should be inserted before the newly created MI for storing the even register into memory. So the insertion position should be *StMI instead of II. before fixed: std %f0, [%g1+80] sethi 4, %g1 <<< add %g1, %sp, %g1 <<< these two instructions should be put before "std %f0, [%g1+80]". sethi 4, %g1 add %g1, %sp, %g1 std %f2, [%g1+88] after fixed: sethi 4, %g1 add %g1, %sp, %g1 std %f0, [%g1+80] sethi 4, %g1 add %g1, %sp, %g1 std %f2, [%g1+88] Reviewers: venkatra, jyknight Reviewed By: jyknight Subscribers: jyknight, fedor.sergeev, jrtc27, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60397 llvm-svn: 358042	2019-04-10 01:56:32 +00:00
Craig Topper	9ca3a95f79	[X86] Support the EVEX versions vcvt(t)ss2si and vcvt(t)sd2si with the {evex} pseudo prefix in the assembler. The EVEX versions are ambiguous with the VEX versions based on operands alone so we had explicitly dropped them from the AsmMatcher table. Unfortunately, when we add them they incorrectly show in the table before their VEX counterparts. This is different from how the prioritization normally works. To fix this we have to explicitly reject the instructions unless the {evex} prefix has been seen. llvm-svn: 358041	2019-04-10 01:29:59 +00:00
Amara Emerson	9bf092d719	[AArch64][GlobalISel] Add isel support for vector G_ICMP and G_ASHR & G_SHL The selection for G_ICMP is unfortunately not currently importable from SDAG due to the use of custom SDNodes. To support this, this selection method has an opcode table which has been generated by a script, indexed by various instruction properties. Ideally in future we will have GISel native selection patterns that we can write in tablegen to improve on this. For selection of some types we also need support for G_ASHR and G_SHL which are generated as a result of legalization. This patch also adds support for them, generating the same code as SelectionDAG currently does. Differential Revision: https://reviews.llvm.org/D60436 llvm-svn: 358035	2019-04-09 21:22:43 +00:00
Amara Emerson	888dd5d198	[AArch64][GlobalISel] Legalize vector G_ICMP. Selection support will be coming in a later patch. Differential Revision: https://reviews.llvm.org/D60435 llvm-svn: 358034	2019-04-09 21:22:40 +00:00
Amara Emerson	92d74f19cf	[AArch64][GlobalISel] Add legalization for some vector G_SHL and G_ASHR. This is needed for some future support for vector ICMP. Differential Revision: https://reviews.llvm.org/D60433 llvm-svn: 358033	2019-04-09 21:22:37 +00:00
Amara Emerson	2b523f8162	[GlobalISel][AArch64] Allow CallLowering to handle types which are normally required to be passed as different register types. E.g. <2 x i16> may need to be passed as a larger <2 x i32> type, so formal arg lowering needs to be able to truncate it back. Likewise, when dealing with returns of these types, they need to be widened in the appropriate way back. Differential Revision: https://reviews.llvm.org/D60425 llvm-svn: 358032	2019-04-09 21:22:33 +00:00
Nikita Popov	c176b708e4	[InstCombine] Add with.overflow always overflow tests; NFC The uadd and umul cases are currently handled, the usub, sadd, ssub and smul cases are not. usub, sadd and ssub already have the necessary ValueTracking support, smul doesn't. llvm-svn: 358031	2019-04-09 20:02:23 +00:00
Craig Topper	ba55a40fd0	[AArch64] Add test case to show missed opportunity to remove a shift before tbnz when the shift has been zero extended from i32 to i64. NFC This pattern showed up in D60358 and it was suggested I add a test and fix that separately. llvm-svn: 358030	2019-04-09 19:23:37 +00:00
Craig Topper	8e2871cd2c	[X86] Add support for {vex2}, {vex3}, and {evex} to the assembler to match gas. Use {evex} to improve one of our 32-bit AVX512 tests. These can be used to force the encoding used for instructions. {vex2} will fail if the instruction is not VEX encoded, but otherwise won't do anything since we prefer vex2 when possible. Might need to skip use of the _REV MOV instructions for this too, but I haven't done that yet. {vex3} will force the instruction to use the 3 byte VEX encoding or fail if there is no VEX form. {evex} will force the instruction to use the EVEX version or fail if there is no EVEX version. Differential Revision: https://reviews.llvm.org/D59266 llvm-svn: 358029	2019-04-09 18:45:15 +00:00
Craig Topper	61e77b11d1	[DAGCombiner][X86][SystemZ] Canonicalize SSUBO with immediate RHS to SADDO by negating the immediate. This lines up with what we do for regular subtract and it matches up better with X86 assumptions in isel patterns that add with immediate is more canonical than sub with immediate. Differential Revision: https://reviews.llvm.org/D60020 llvm-svn: 358027	2019-04-09 18:33:56 +00:00
Nikita Popov	2f5e9de8d1	Revert "[InstCombine] [InstCombine] Canonicalize (-X s/ Y) to -(X s/ Y)." This reverts commit `1383a91689`. sdiv-canonicalize.ll fails after this revision. The fold needs to be moved outside the branch handling constant operands. However when this is done there are further test changes, so I'm reverting this in the meantime. llvm-svn: 358026	2019-04-09 18:32:38 +00:00
Nikita Popov	4b2323d1a3	[ValueTracking] Use computeConstantRange() for signed sub overflow determination This is the same change as D60420 but for signed sub rather than signed add: Range information is intersected into the known bits result, allows to detect more no/always overflow conditions. Differential Revision: https://reviews.llvm.org/D60469 llvm-svn: 358020	2019-04-09 17:01:49 +00:00
Simon Pilgrim	d7cc0ec581	[TargetLowering] SimplifyDemandedBits - add ISD::INSERT_SUBVECTOR support llvm-svn: 358019	2019-04-09 16:52:21 +00:00
Chen Zheng	1383a91689	[InstCombine] [InstCombine] Canonicalize (-X s/ Y) to -(X s/ Y). Differential Revision: https://reviews.llvm.org/D60395 llvm-svn: 358017	2019-04-09 16:34:31 +00:00
Stanislav Mekhanoshin	913ba8eeb4	Revert LIS handling in MachineDCE One of the out-of-tree targets has regressed with this patch. Reverting it for now and letting liveness be fully reconstructed in case the pass is used after the LIS is created, to resolve the regression. Differential Revision: https://reviews.llvm.org/D60466 llvm-svn: 358015	2019-04-09 16:13:53 +00:00
Nikita Popov	10edd2b79d	[ValueTracking] Use computeConstantRange() in signed add overflow determination This is D59386 for the signed add case. The computeConstantRange() result is now intersected into the existing known bits information, allowing to detect additional no-overflow/always-overflow conditions (though the latter isn't used yet). This (finally...) covers the motivating case from D59071. Differential Revision: https://reviews.llvm.org/D60420 llvm-svn: 358014	2019-04-09 16:12:59 +00:00
Sanjay Patel	49d9d17a77	[InstCombine] prevent possible miscompile with sdiv+negate of vector op Similar to: rL358005 Forego folding arbitrary vector constants to fix a possible miscompile bug. We can enhance the transform if we do want to handle the more complicated vector case. llvm-svn: 358013	2019-04-09 15:13:03 +00:00
Sanjay Patel	d5173f5acf	[InstCombine] add tests for sdiv with negated dividend and constant divisor; NFC llvm-svn: 358010	2019-04-09 14:48:44 +00:00
Sanjay Patel	7563b65ad4	[InstCombine] add tests for sdiv-by-int-min; NFC llvm-svn: 358008	2019-04-09 14:27:07 +00:00
Sanjay Patel	d469954d61	[InstCombine] auto-generate complete test checks; NFC llvm-svn: 358007	2019-04-09 14:27:03 +00:00
Sanjay Patel	f62dcea7ed	[InstCombine] prevent possible miscompile with negate+sdiv of vector op // 0 - (X sdiv C) -> (X sdiv -C) provided the negation doesn't overflow. This fold has been around for many years and nobody noticed the potential vector miscompile from overflow until recently... So it seems unlikely that there's much demand for a vector sdiv optimization on arbitrary vector constants, so just limit the matching to splat constants to avoid the possible bug. Differential Revision: https://reviews.llvm.org/D60426 llvm-svn: 358005	2019-04-09 14:09:06 +00:00
Sanjay Patel	a230bb5fc0	[InstCombine] add tests/comments for negate+sdiv; NFC llvm-svn: 358003	2019-04-09 13:41:29 +00:00
Chen Zheng	11cf397292	[InstCombine] add more testcases for canonicalize (-X s/ Y) to -(X s/ Y). llvm-svn: 358000	2019-04-09 12:47:29 +00:00
Simon Pilgrim	345eacd555	[TargetLowering] SimplifyDemandedBits - call SimplifyDemandedBits in bitcast handling When bitcasting from a source op to a larger bitwidth op, split the demanded bits and OR them on top of one another and demand those merged bits in the SimplifyDemandedBits call on the source op. llvm-svn: 357992	2019-04-09 10:27:59 +00:00
Tom Stellard	206b9927f8	AMDGPU/GlobalISel: Implement call lowering for shaders returning values Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, jvesely, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, volkan, llvm-commits Differential Revision: https://reviews.llvm.org/D57166 llvm-svn: 357964	2019-04-09 02:26:03 +00:00
Chen Zheng	19ce6719bc	[PowerPC] initialize SchedModel according to platform. Differential Revision: https://reviews.llvm.org/D60177 llvm-svn: 357962	2019-04-09 01:25:25 +00:00
Peter Collingbourne	df57979ba7	hwasan: Enable -hwasan-allow-ifunc by default. It's been on in Android for a while without causing problems, so it's time to make it the default and remove the flag. Differential Revision: https://reviews.llvm.org/D60355 llvm-svn: 357960	2019-04-09 00:25:59 +00:00
Sanjay Patel	74ccef1f4f	[InstCombine] add tests for negate+sdiv; NFC PR41425: https://bugs.llvm.org/show_bug.cgi?id=41425 llvm-svn: 357953	2019-04-08 22:55:10 +00:00
Shoaib Meenai	867131a96c	[BinaryFormat] Update Mach-O ARM64E CPU subtype and dumping The new value is taken from <mach/machine.h> in the MacOSX10.14 SDK from Xcode 10.1. Update llvm-objdump and llvm-readobj accordingly. Differential Revision: https://reviews.llvm.org/D58636 llvm-svn: 357945	2019-04-08 21:37:08 +00:00
Sanjay Patel	773e04c883	[InstCombine] peek through fdiv to find a squared sqrt A more general canonicalization between fdiv and fmul would not handle this case because that would have to be limited by uses to prevent 2 values from becoming 3 values: (x/y) * (x/y) --> (x*x) / (y*y) (But we probably should still have that limited -- but more general -- canonicalization independently of this change.) llvm-svn: 357943	2019-04-08 21:23:50 +00:00
Simon Pilgrim	9f74df7d5b	[TargetLowering] SimplifyDemandedBits - use DemandedElts in bitcast handling Be more selective in the SimplifyDemandedBits -> SimplifyDemandedVectorElts bitcast call based on the demanded elts. llvm-svn: 357942	2019-04-08 20:59:38 +00:00
Sanjay Patel	bf1417d7e4	[InstCombine] add extra-use tests for fmul+sqrt; NFC llvm-svn: 357939	2019-04-08 20:37:34 +00:00
Nikita Popov	15abd74de7	[InstCombine] Add more tests for signed saturating math overflow; NFC Overflow conditions for sadd.sat and ssub.sat which can be determined based on constant ranges, but not necessarily known bits. llvm-svn: 357938	2019-04-08 20:02:47 +00:00
Nico Weber	63b97d2a67	llvm-undname: Fix more crashes and asserts on invalid inputs For functions whose callers don't check that enough input is present, add checks at the start of the function that enough input is there and set Error otherwise. For functions that return AST objects, return nullptr instead of incomplete AST objects with nullptr fields if an error occurred during the function. Introduce a new function demangleDeclarator() for the sequence demangleFullyQualifiedSymbolName(); demangleEncodedSymbol() and use it in the two places that had this sequence. Let this new function check that ConversionOperatorIdentifiers have a valid TargetType. Some of the bad inputs found by oss-fuzz, others by inspection. Differential Revision: https://reviews.llvm.org/D60354 llvm-svn: 357936	2019-04-08 19:46:53 +00:00
Adrian Prantl	6ed5706a2b	Add LLVM IR debug info support for Fortran COMMON blocks COMMON blocks are a feature of Fortran that has no direct analog in C languages, but they are similar to data sections in assembly language programming. A COMMON block is a named area of memory that holds a collection of variables. Fortran subprograms may map the COMMON block memory area to their own, possibly distinct, non-empty list of variables. A Fortran COMMON block might look like the following example. COMMON /ALPHA/ I, J For this construct, the compiler generates a new scope-like DI construct (!DICommonBlock) into which variables (see I, J above) can be placed. As the common block implies a range of storage with global lifetime, the !DICommonBlock refers to a !DIGlobalVariable. The Fortran variables that comprise the COMMON block are also linked via metadata to offsets within the global variable that stands for the entire common block. @alpha_ = common global %alphabytes_ zeroinitializer, align 64, !dbg !27, !dbg !30, !dbg !33 !14 = distinct !DISubprogram(…) !20 = distinct !DICommonBlock(scope: !14, declaration: !25, name: "alpha") !25 = distinct !DIGlobalVariable(scope: !20, name: "common alpha", type: !24) !27 = !DIGlobalVariableExpression(var: !25, expr: !DIExpression()) !29 = distinct !DIGlobalVariable(scope: !20, name: "i", file: !3, type: !28) !30 = !DIGlobalVariableExpression(var: !29, expr: !DIExpression()) !31 = distinct !DIGlobalVariable(scope: !20, name: "j", file: !3, type: !28) !32 = !DIExpression(DW_OP_plus_uconst, 4) !33 = !DIGlobalVariableExpression(var: !31, expr: !32) The DWARF generated for this is as follows. DW_TAG_common_block: DW_AT_name: alpha DW_AT_location: @alpha_+0 DW_TAG_variable: DW_AT_name: common alpha DW_AT_type: array of 8 bytes DW_AT_location: @alpha_+0 DW_TAG_variable: DW_AT_name: i DW_AT_type: integer4 DW_AT_location: @Alpha+0 DW_TAG_variable: DW_AT_name: j DW_AT_type: integer4 DW_AT_location: @Alpha+4 Patch by Eric Schweitz! Differential Revision: https://reviews.llvm.org/D54327 llvm-svn: 357934	2019-04-08 19:13:55 +00:00
Steven Wu	f41e70d6eb	Revert [ThinLTO] Fix ThinLTOCodegenerator to export llvm.used symbols This reverts r357931 (git commit `8b70a5c11e`) llvm-svn: 357932	2019-04-08 18:53:21 +00:00
Steven Wu	8b70a5c11e	[ThinLTO] Fix ThinLTOCodegenerator to export llvm.used symbols Summary: ThinLTOCodeGenerator currently does not preserve llvm.used symbols and it can internalize them. In order to pass the necessary information to the legacy ThinLTOCodeGenerator, the input to the code generator is rewritten to be based on lto::InputFile. This fixes: PR41236 rdar://problem/49293439 Reviewers: tejohnson, pcc, dexonsmith Reviewed By: tejohnson Subscribers: mehdi_amini, inglorion, eraman, hiraditya, jkorous, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60226 llvm-svn: 357931	2019-04-08 18:24:10 +00:00
Brian M. Rzycki	887865c1ad	[JumpThreading] Fix incorrect fold conditional after indirectbr/callbr Fixes bug 40992: https://bugs.llvm.org/show_bug.cgi?id=40992 There is potential for miscompiled code emitted from JumpThreading when analyzing a block with one or more indirectbr or callbr predecessors. The ProcessThreadableEdges() function incorrectly folds conditional branches into an unconditional branch. This patch prevents incorrect branch folding without fully pessimizing other potential threading opportunities through the same basic block. This IR shape was manually fed in via opt and is unclear if clang and the full pass pipeline will ever emit similar code shapes. Thanks to Matthias Liedtke for the bug report and simplified IR example. Differential Revision: https://reviews.llvm.org/D60284 llvm-svn: 357930	2019-04-08 18:20:35 +00:00
Andrea Di Biagio	f6a60f1f80	[llvm-mca][scheduler-stats] Print issued micro opcodes per cycle. NFCI It makes more sense to print out the number of micro opcodes that are issued every cycle rather than the number of instructions issued per cycle. This behavior is also consistent with the dispatch-stats: numbers from the two views can now be easily compared. llvm-svn: 357919	2019-04-08 16:05:54 +00:00
Simon Pilgrim	86844a865e	[X86][AVX] Add PR34380 shuffle test cases llvm-svn: 357914	2019-04-08 14:05:42 +00:00
Sanjay Patel	50c3b290ed	[x86] make 8-bit shl undesirable I was looking at a potential DAGCombiner fix for 1 of the regressions in D60278, and it caused severe regression test pain because x86 TLI lies about the desirability of 8-bit shift ops. We've hinted at making all 8-bit ops undesirable for the reason in the code comment: // TODO: Almost no 8-bit ops are desirable because they have no actual // size/speed advantages vs. 32-bit ops, but they do have a major // potential disadvantage by causing partial register stalls. ...but that leads to massive diffs and exposes all kinds of optimization holes itself. Differential Revision: https://reviews.llvm.org/D60286 llvm-svn: 357912	2019-04-08 13:58:50 +00:00
Sanjay Patel	b33938df7a	[InstCombine] remove overzealous assert for shuffles (PR41419) As the TODO indicates, instsimplify could be improved. Should fix: https://bugs.llvm.org/show_bug.cgi?id=41419 llvm-svn: 357910	2019-04-08 13:28:29 +00:00
Simon Pilgrim	b4f1bfa659	[InstCombine][X86] Expand MOVMSK to generic IR (PR39927) First step towards removing the MOVMSK intrinsics completely - this patch expands MOVMSK to the pattern: e.g. PMOVMSKB(v16i8 x): %cmp = icmp slt <16 x i8> %x, zeroinitializer %int = bitcast <16 x i8> %cmp to i16 %res = zext i16 %int to i32 Which is correctly handled by ISel and FastIsel (give or take an annoying movzx move....): https://godbolt.org/z/rkrSFW Differential Revision: https://reviews.llvm.org/D60256 llvm-svn: 357909	2019-04-08 13:17:51 +00:00
Chen Zheng	923c7c9daa	[InstCombine] sdiv exact flag fixup. Differential Revision: https://reviews.llvm.org/D60396 llvm-svn: 357904	2019-04-08 12:08:03 +00:00
Roman Lebedev	a82235843b	[llvm-exegesis][X86] Randomize CMOVcc/SETcc OPERAND_COND_CODE CondCodes Reviewers: courbet, gchatelet Reviewed By: gchatelet Subscribers: tschuett, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60066 llvm-svn: 357898	2019-04-08 10:11:00 +00:00
Chen Zheng	edf91ed855	[InstCombine] add more testcases for sdiv exact flag fixup. llvm-svn: 357894	2019-04-08 09:19:42 +00:00
Chen Zheng	d3b1d74624	[InstCombine] add testcases for sdiv exact flag fixing - NFC. llvm-svn: 357884	2019-04-08 05:49:15 +00:00
Chen Zheng	c84107612a	[InstCombine] add testcase for sdiv canonicalization - NFC llvm-svn: 357883	2019-04-08 03:07:32 +00:00
Craig Topper	afb6b42691	[X86] Split floating point tests out of atomic-mi.ll into atomic-fp.ll. Add avx and avx512f command lines. NFC llvm-svn: 357882	2019-04-08 01:54:27 +00:00
Craig Topper	8aeefe3149	[X86] Add avx and avx512f command lines to atomic-non-integer.ll. NFC llvm-svn: 357881	2019-04-08 01:54:24 +00:00
Craig Topper	424417da79	[X86] Use (SUBREG_TO_REG (MOV32rm)) for extloadi64i8/extloadi64i16 when the load is 4 byte aligned or better and not volatile. Summary: Previously we would use MOVZXrm8/MOVZXrm16, but those are longer encodings. This is similar to what we do in the loadi32 predicate. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60341 llvm-svn: 357875	2019-04-07 19:19:44 +00:00
Nikita Popov	3db93ac5d6	Reapply [ValueTracking] Support min/max selects in computeConstantRange() Add support for min/max flavor selects in computeConstantRange(), which allows us to fold comparisons of a min/max against a constant in InstSimplify. This fixes an infinite InstCombine loop, with the test case taken from D59378. Relative to the previous iteration, this contains some adjustments for AMDGPU med3 tests: The AMDGPU target runs InstSimplify prior to codegen, which ends up constant folding some existing med3 tests after this change. To preserve these tests a hidden -amdgpu-scalar-ir-passes option is added, which allows disabling scalar IR passes (that use InstSimplify) for testing purposes. Differential Revision: https://reviews.llvm.org/D59506 llvm-svn: 357870	2019-04-07 17:22:16 +00:00
Simon Pilgrim	07adb6abda	[X86][SSE] SimplifyDemandedBitsForTargetNode - Add initial PACKSS support In the case where we only want the sign bit (e.g. when using PACKSS truncation of comparison results for MOVMSK) then we can just demand the sign bit of the source operands. This makes use of the fact that PACKSS saturates out of range values to the min/max int values - so the sign bit is always preserved. Differential Revision: https://reviews.llvm.org/D60333 llvm-svn: 357859	2019-04-07 10:40:01 +00:00
Fangrui Song	47a7662e29	[llvm-objdump] Fix split of source lines; don't ltrim source lines If the file does not end with a newline, it may be dropped. Fix the splitting algorithm. Also delete an unnecessary SourceCache lookup. llvm-svn: 357858	2019-04-07 10:16:46 +00:00
Fangrui Song	545ed223a6	[llvm-objdump] Simplify disassembleObject * Use std::binary_search to replace some std::lower_bound * Use llvm::upper_bound to replace some std::upper_bound * Use format_hex and support::endian::read{16,32} llvm-svn: 357853	2019-04-07 05:32:16 +00:00
Craig Topper	399102b464	[X86] When converting (x << C1) AND C2 to (x AND (C2>>C1)) << C1 during isel, try using andl over andq by favoring 32-bit unsigned immediates. llvm-svn: 357848	2019-04-06 19:00:11 +00:00
Craig Topper	f9b9f8d2e4	[X86] Use a signed mask in foldMaskedShiftToScaledMask to enable a shorter immediate encoding. This function reorders AND and SHL to enable the SHL to fold into an LEA. The upper bits of the AND will be shifted out by the SHL so it doesn't matter what mask value we use for these bits. By using sign bits from the original mask in these upper bits we might enable a shorter immediate encoding to be used. llvm-svn: 357846	2019-04-06 18:00:50 +00:00
Craig Topper	82448bc09e	[X86] Add test cases to show missed opportunities to use a sign extended 8 or 32 bit immediate AND when reversing SHL+AND to form an LEA. When we shift the AND mask over we should shift in sign bits instead of zero bits. The scale in the LEA will shift these bits out so it doesn't matter whether we mask the bits off or not. Using sign bits will potentially allow a sign extended immediate to be used. Also add some other test cases for cases that are currently optimal. llvm-svn: 357845	2019-04-06 18:00:45 +00:00
Craig Topper	9d7379c250	[X86] Autogenerate complete checks. NFC llvm-svn: 357844	2019-04-06 18:00:41 +00:00
Simon Pilgrim	ec28615f7f	[X86] Add AVX-target expandload and compressstore tests llvm-svn: 357842	2019-04-06 14:40:52 +00:00
Roman Lebedev	404bdb1c9e	[llvm-exegesis][X86] Handle CMOVcc/SETcc OPERAND_COND_CODE OperandType Summary: D60041 / D60138 refactoring changed how CMOV/SETcc opcodes are handled. The condcode is now an immediate, with its own operand type. This at least avoids crashing on the opcode. However, this still won't generate all the snippets with all the condcode enumerators. D60066 does that. Reviewers: courbet, gchatelet Reviewed By: gchatelet Subscribers: tschuett, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60057 llvm-svn: 357841	2019-04-06 14:16:26 +00:00
Simon Pilgrim	d23611f9ad	[X86] Split expandload and compressstore tests llvm-svn: 357840	2019-04-06 14:14:54 +00:00
Simon Pilgrim	18a8a64c9f	[X86][SSE] Add more exhaustive masked load/store tests Reordered/renamed some existing tests to match the cleaned up order llvm-svn: 357839	2019-04-06 14:01:37 +00:00
Simon Pilgrim	2ea8dbf564	[CostModel][X86] Add more exhaustive masked load/store/gather/scatter/expand/compress cost tests llvm-svn: 357838	2019-04-06 12:08:37 +00:00
Sanjay Patel	c538c50113	[InstCombine] add more tests for fmul+fdiv+sqrt; NFC llvm-svn: 357816	2019-04-05 20:54:35 +00:00
Francis Visoiu Mistrih	9d9d1b6b2b	[X86] Enable tail calls for CallingConv::Swift It's currently only enabled on AArch64 (enabled in r281376). llvm-svn: 357809	2019-04-05 20:18:25 +00:00
Francis Visoiu Mistrih	ab051a378c	[X86] Preserve operand flag when expanding TCRETURNri The expansion of TCRETURNri(64) would not keep operand flags like undef/renamable/etc. which can result in machine verifier issues. Also add plumbing to be able to use `-run-pass=x86-pseudo`. llvm-svn: 357808	2019-04-05 20:18:21 +00:00
Stanislav Mekhanoshin	c8f78f8dd3	[AMDGPU] Add MachineDCE pass after RenameIndependentSubregs Detect dead lanes can create some dead defs. Then RenameIndependentSubregs will break a REG_SEQUENCE which may use these dead defs. At this point a dead instruction can be removed but we do not run a DCE anymore. MachineDCE was only running before live variable analysis. The patch adds a means to preserve LiveIntervals and SlotIndexes in case it works past this. Differential Revision: https://reviews.llvm.org/D59626 llvm-svn: 357805	2019-04-05 20:11:32 +00:00
Craig Topper	80aa2290fb	[X86] Merge the different Jcc instructions for each condition code into single instructions that store the condition code as an operand. Summary: This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between Jcc instructions and condition codes. Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser. Reviewers: spatel, lebedev.ri, courbet, gchatelet, RKSimon Reviewed By: RKSimon Subscribers: MatzeB, qcolombet, eraman, hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60228 llvm-svn: 357802	2019-04-05 19:28:09 +00:00
Craig Topper	7323c2bf85	[X86] Merge the different SETcc instructions for each condition code into single instructions that store the condition code as an operand. Summary: This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between SETcc instructions and condition codes. Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser. Reviewers: andreadb, courbet, RKSimon, spatel, lebedev.ri Reviewed By: andreadb Subscribers: hiraditya, lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60138 llvm-svn: 357801	2019-04-05 19:27:49 +00:00
Craig Topper	e0bfeb5f24	[X86] Merge the different CMOV instructions for each condition code into single instructions that store the condition code as an immediate. Summary: Reorder the condition code enum to match their encodings. Move it to MC layer so it can be used by the scheduler models. This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between CMOV instructions and condition codes. Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser. This does complicate the scheduler models a little since we can't assign the A and BE instructions to a separate class now. I plan to make similar changes for SETcc and Jcc. Reviewers: RKSimon, spatel, lebedev.ri, andreadb, courbet Reviewed By: RKSimon Subscribers: gchatelet, hiraditya, kristina, lebedev.ri, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60041 llvm-svn: 357800	2019-04-05 19:27:41 +00:00
Guozhi Wei	36fc9c3107	[LCG] Add aliased functions as LCG roots Current LCG doesn't check aliased functions. So if an internal function has a public alias it will not be added to CG SCC, but it is still reachable from outside through the alias. So this patch adds aliased functions to SCC. Differential Revision: https://reviews.llvm.org/D59898 llvm-svn: 357795	2019-04-05 18:51:08 +00:00
Sanjay Patel	79df4454e1	[InstCombine] add tests for fdiv+fmul; NFC llvm-svn: 357782	2019-04-05 17:00:57 +00:00
Sanjay Patel	7e3e7f8040	[InstCombine] add tests for sqrt+fdiv+fmul; NFC Examples based on recent llvm-dev thread. These are specific patterns of more general enhancements that would solve these. llvm-svn: 357780	2019-04-05 16:52:57 +00:00
Sanjay Patel	9965f5aa70	[InstCombine] add test to show reassociation that creates a denormal constant; NFC llvm-svn: 357776	2019-04-05 16:42:21 +00:00
Stephen Tozer	bbeca849d7	Revert "[llvm-readobj] Improve error message for --string-dump" This reverts commit `681b0798db`. Reverted due to causing build failures: llvm-svn: 357772 llvm-svn: 357774	2019-04-05 16:32:25 +00:00
Stephen Tozer	681b0798db	[llvm-readobj] Improve error message for --string-dump Fixes bug 40630: https://bugs.llvm.org/show_bug.cgi?id=40630 This patch changes the error message when the section specified by --string-dump cannot be found by including the name of the section in the error message and changing the prefix text to not imply that the file itself was invalid. As part of this change some uses of std::error_code have been replaced with the llvm Error class to better encapsulate the error info (rather than passing File strings around), and the WithColor class replaces string literal error prefixes. Differential Revision: https://reviews.llvm.org/D59946 llvm-svn: 357772	2019-04-05 16:15:50 +00:00
Clement Courbet	1d8c9dfe03	[ExpandMemCmp][NFC] Add tests for `memcmp(p, q, n) < 0` case. llvm-svn: 357767	2019-04-05 15:03:25 +00:00
Simon Pilgrim	17586cda4a	[SelectionDAG] Add fcmp UNDEF handling to SelectionDAG::FoldSetCC Second half of PR40800, this patch adds DAG undef handling to fcmp instructions to match the behavior in llvm::ConstantFoldCompareInstruction, this permits constant folding of vector comparisons where some elements had been reduced to UNDEF (by SimplifyDemandedVectorElts etc.). This involves a lot of tweaking to reduced tests as bugpoint loves to reduce fcmp arguments to undef........ Differential Revision: https://reviews.llvm.org/D60006 llvm-svn: 357765	2019-04-05 14:56:21 +00:00
Matt Arsenault	4ed6ccab9b	AMDGPU/GlobalISel: Fix non-power-of-2 select llvm-svn: 357762	2019-04-05 14:03:04 +00:00
Sanjay Patel	50a8652785	[DAGCombiner][x86] scalarize splatted vector FP ops There are a variety of vector patterns that may be profitably reduced to a scalar op when scalar ops are performed using a subset (typically, the first lane) of the vector register file. For x86, this is true for float/double ops and element 0 because insert/extract is just a sub-register rename. Other targets should likely enable the hook in a similar way. Differential Revision: https://reviews.llvm.org/D60150 llvm-svn: 357760	2019-04-05 13:32:17 +00:00
Simon Pilgrim	faa5b939f0	[X86][AVX] Add PR34584 masked store test cases llvm-svn: 357757	2019-04-05 11:34:30 +00:00
Simon Pilgrim	329e63b915	[X86] Add SSE/AVX1/AVX2 masked trunc+store tests llvm-svn: 357756	2019-04-05 11:22:28 +00:00
Roger Ferrer Ibanez	e011e4f89c	[RISCV] Implement adding a displacement to a BlockAddress Recent change rL357393 uses MachineInstrBuilder::addDisp to add a displacement based on a BlockAddress but this case was not implemented. This patch adds the missing case and a test for RISC-V that exercises the new case. Differential Revision: https://reviews.llvm.org/D60136 llvm-svn: 357752	2019-04-05 08:40:57 +00:00
Pavel Labath	51d9fa0a22	Minidump: Add support for reading/writing strings Summary: Strings in minidump files are stored as a 32-bit length field, giving the length of the string in bytes, which is followed by the appropriate number of UTF16 code units. The string is also supposed to be null-terminated, and the null-terminator is not a part of the length field. This patch: - adds support for reading these strings out of the minidump file (this implementation does not depend on proper null-termination) - adds support for writing them to a minidump file - using the previous two pieces implements proper (de)serialization of the CSDVersion field of the SystemInfo stream. Previously, this was only read/written as hex, and no attempt was made to access the referenced string -- now this string is read and written correctly. The changes are tested via yaml2obj\|obj2yaml round-trip as well as a unit test which checks the corner cases of the string deserialization logic. Reviewers: jhenderson, zturner, clayborg Subscribers: llvm-commits, aprantl, markmentovai, amccarth, lldb-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59775 llvm-svn: 357749	2019-04-05 08:06:26 +00:00
Piotr Sobczak	0376ac1d94	[SelectionDAG] Compute known bits of CopyFromReg Summary: Teach SelectionDAG how to compute known bits of ISD::CopyFromReg if the virtual reg used has one def only. This can be particularly useful when calling isBaseWithConstantOffset() with the ISD::CopyFromReg argument, as more optimizations may get enabled in the result. Also add a missing truncation on X86, found by testing of this patch. Change-Id: Id1c9fceec862d118c54a5b53adf72ada5d6daefa Reviewers: bogner, craig.topper, RKSimon Reviewed By: RKSimon Subscribers: lebedev.ri, nemanjai, jvesely, nhaehnle, javed.absar, jsji, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59535 llvm-svn: 357745	2019-04-05 07:44:09 +00:00
Craig Topper	94f1772b1e	[X86] Promote i16 SRA instructions to i32 We already promote SRL and SHL to i32. This will introduce sign extends sometimes which might be harder to deal with than the zero we use for promoting SRL. I ran this through some of our internal benchmark lists and didn't see any major regressions. I think there might be some DAG combine improvement opportunities in the test changes here. Differential Revision: https://reviews.llvm.org/D60278 llvm-svn: 357743	2019-04-05 06:32:50 +00:00
Serguei Katkov	c39636cc2c	[FastISel] Fix crash for gc.relocate lowering Lowering safepoint checks that all gc.relocates observed in safepoint must be lowered. However Fast-Isel is able to skip dead gc.relocate. To resolve this issue we just ignore dead gc.relocate in the check. Reviewers: reames Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D60184 llvm-svn: 357742	2019-04-05 05:41:08 +00:00
David Callahan	f498bdcebf	Include invoke'd functions for recursive extract Summary: When recursively extracting a function from a bit code file, include functions mentioned in InvokeInst as well as CallInst Reviewers: loladiro, espindola, volkan Reviewed By: loladiro Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60231 llvm-svn: 357735	2019-04-04 23:30:47 +00:00
James Y Knight	a040174418	Revert [X86] When using Win64 ABI, exit with error if SSE is disabled for varargs It unnecessarily breaks previously-working code which used varargs, but didn't pass any float/double arguments (such as EDK2). Also revert the fixup on top of that: Revert [X86] Fix a test from r357317 This reverts r357317 (git commit `d413f41de6`) This reverts r357380 (git commit `7af32444b9`) llvm-svn: 357718	2019-04-04 19:05:48 +00:00
Sam Clegg	2a7cac932b	[WebAssembly] Add new explicit relocation types for PIC relocations See https://github.com/WebAssembly/tool-conventions/pull/106 Differential Revision: https://reviews.llvm.org/D59907 llvm-svn: 357710	2019-04-04 17:43:50 +00:00
Don Hinton	98e3954fe9	[llvm-objcopy] [llvm-symbolizer] Fix failing tests Summary: Fix failing tests that matched substrings in path. Reviewers: evgeny777, mattd, espindola, alexshap, rupprecht, jhenderson Reviewed By: jhenderson Subscribers: Bulletmagnet, emaste, arichardson, jakehehrlich, MaskRay, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60170 llvm-svn: 357709	2019-04-04 17:35:41 +00:00
Adrian Prantl	ce2d45e7ba	llvm-dwarfdump: Support alternative architecture names in the -arch filter <rdar://problem/47918606> llvm-svn: 357706	2019-04-04 15:48:40 +00:00
Sanjay Patel	17648b848e	[x86] eliminate unnecessary broadcast of horizontal op This is another pattern that comes up if we more aggressively scalarize FP ops. llvm-svn: 357703	2019-04-04 14:46:13 +00:00
Lewis Revill	aa79a3fe8e	[RISCV] Support assembling TLS add and associated modifiers This patch adds support in the MC layer for parsing and assembling the 4-operand add instruction needed for TLS addressing. This also involves parsing the %tprel_hi, %tprel_lo and %tprel_add operand modifiers. Differential Revision: https://reviews.llvm.org/D55341 llvm-svn: 357698	2019-04-04 14:13:37 +00:00

61110 Commits