llvm-project

Author	SHA1	Message	Date
Simon Pilgrim	51ff15f472	[X86][SSE] Simplify combineVectorTruncationWithPACKUS. NFCI. Move code only used by combineVectorTruncationWithPACKUS out of combineVectorTruncation. llvm-svn: 334201	2018-06-07 14:53:32 +00:00
Hiroshi Inoue	01ef4c2c64	[PowerPC] avoid unprofitable Repl32 flag in BitPermutationSelector BitPermutationSelector sets the Repl32 flag for bit groups which can (potentially) benefit from 32-bit rotate-and-mask instructions with bit replication, i.e. rlwinm/rlwimi copies the lower 32 bits into the upper 32 bits on 64-bit PowerPC before rotation. However, enforcing a 32-bit instruction sometimes results in redundant generated code. For example, the following simple code is compiled into rotldi + rlwimi while it can be compiled into only an rldimi instruction if the Repl32 flag is not set on the bit group for (a & 0xFFFFFFFF). uint64_t func(uint64_t a, uint64_t b) { return (a & 0xFFFFFFFF) | (b << 32) ; } To avoid this problem, this patch checks the potential benefit of the Repl32 flag before setting it. If a bit group does not require rotation (i.e. RLAmt == 0) and won't be merged into another group, we do not benefit from the Repl32 flag on this group. Differential Revision: https://reviews.llvm.org/D47867 llvm-svn: 334195	2018-06-07 13:21:14 +00:00
Petar Jovanovic	241f286bd7	[Mips] Silencing warnings in instruction info (NFC) isORCopyInst and isReadOrWriteToDSPReg functions were producing warnings that some statements may fall through. Patch by Nikola Prica. Differential Revision: https://reviews.llvm.org/D47876 llvm-svn: 334194	2018-06-07 13:06:06 +00:00
Simon Pilgrim	09953d8412	[X86][SSE] Simplify combineVectorTruncationWithPACKSS to reduce code duplication Simplify combineVectorTruncationWithPACKSS to just a SIGN_EXTEND_INREG followed by using the existing truncateVectorWithPACK instead of duplicating code. llvm-svn: 334193	2018-06-07 13:01:42 +00:00
Hiroshi Inoue	b557846083	[PowerPC] fix trivial typos in comment, NFC llvm-svn: 334191	2018-06-07 12:49:12 +00:00
Matt Arsenault	f1c868ef08	AMDGPU: Fix not including v2f64 in SReg_128 Fixes assertion with calls returning v2f64. llvm-svn: 334189	2018-06-07 12:16:31 +00:00
Florian Hahn	0d6b01761c	[Mem2Reg] Avoid replacing load with itself in promoteSingleBlockAlloca. We do the same thing in rewriteSingleStoreAlloca. Fixes PR37632. Reviewers: chandlerc, davide, efriedma Reviewed By: davide Differential Revision: https://reviews.llvm.org/D47825 llvm-svn: 334187	2018-06-07 11:09:05 +00:00
Matt Arsenault	697300bd4f	AMDGPU: Use scalar operations for f16 fabs/fneg patterns Fixes unnecessary differences between subtargets. llvm-svn: 334184	2018-06-07 10:15:20 +00:00
Matt Arsenault	90083d3088	AMDGPU: Try a lot harder to emit scalar loads This has two main components. First, widen short constant loads in DAG when they have the correct alignment. This is already done a bit in AMDGPUCodeGenPrepare, since that has access to DivergenceAnalysis. This can't help kernarg loads created in the DAG. Start to use DAG divergence analysis to help this case. The second part is to avoid kernel argument lowering breaking the alignment of short vector elements because calling convention lowering wants to split everything into legal register types. When loading a split type, load the nearest 4-byte aligned segment and shift to get the desired bits. This extra load of the earlier argument piece ends up merging, and the bit extract hopefully folds out. There are a number of improvements and regressions with this, but I think as-is this is a better compromise between several of the worst parts of SelectionDAG. Particularly when i16 is legal, this produces worse code for i8 and i16 element vector kernel arguments. This is partially due to the very weak load merging the DAG does. It only looks for fairly specific combines between pairs of loads which no longer appear. In particular this causes v4i16 loads to be split into 2 components when previously the two halves were merged. Worse, because of the newly introduced shifts, there is a lot more unnecessary vector packing and unpacking code emitted. At least some of this is due to reporting false for isTypeDesirableForOp for i16 as a workaround for the lack of divergence information in the DAG. The cases where this happens it doesn't actually matter, but the relevant code in SimplifyDemandedBits doesn't have the context to know to ignore this. The use of the scalar cache is probably more important than the mess of mostly scalar instructions doing this packing and unpacking. Future work can fix this, possibly by making better use of the new DAG divergence information for controlling promotion decisions, or adding another version of shift + trunc + shift combines that doesn't only know about the used types. llvm-svn: 334180	2018-06-07 09:54:49 +00:00
Clement Courbet	4281b1d3b5	[X86][NFC] Fix harmless typo in BtVer2 model. See D46356 for context. llvm-svn: 334178	2018-06-07 09:26:33 +00:00
Tomasz Krupa	f8c7637027	[X86] Block UndefRegUpdate Summary: Prevent folding of operations with memory loads when one of the sources has undefined register update. Reviewers: craig.topper Subscribers: llvm-commits, mike.dvoretsky, ashlykov Differential Revision: https://reviews.llvm.org/D47621 llvm-svn: 334175	2018-06-07 08:48:45 +00:00
Max Kazantsev	b4b2ccea6d	[NFC] Use variable instead of accessing pair many times llvm-svn: 334173	2018-06-07 08:47:19 +00:00
Tomasz Krupa	145825162a	Test commit access. Added a bunch of periods after comments. llvm-svn: 334171	2018-06-07 08:20:28 +00:00
Clement Courbet	9212ef0a0a	[X86][NFC] Fix harmless typos in BDW/ZnVer1 sched models. See D46356 for context. llvm-svn: 334164	2018-06-07 07:37:49 +00:00
Karl-Johan Karlsson	abb11f805f	[BranchFolding] Fix live-in's when hoisting code Summary: When the branch folder hoists code into a predecessor, it adjusts live-ins in the blocks it hoists code from. However, it fails to handle hoisted code that contains a defined register that originally is live-in in the block through a super register. This is fixed by replacing the live-in handling code with calls to utility functions in LivePhysRegs. Reviewers: kparzysz, gberry, MatzeB, uweigand, aprantl Reviewed By: kparzysz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47529 llvm-svn: 334163	2018-06-07 07:20:33 +00:00
Jonas Paulsson	e80d405760	[SystemZ] Build Load And Test from scratch in convertToLoadAndTest. This is needed to get CC operand in right place, as expected by the SchedModel. Review: Ulrich Weigand https://reviews.llvm.org/D47820 llvm-svn: 334161	2018-06-07 05:59:07 +00:00
Michael Zolotukhin	31800864dc	SpeculativeExecution Pass: Set PreserveCFG to avoid unnecessary analyses invalidation. The pass doesn't touch CFG in any way, only moves instructions between blocks. llvm-svn: 334150	2018-06-07 00:19:29 +00:00
Stanislav Mekhanoshin	df61be70b2	[AMDGPU] Improve reciprocal handling When denormals are supported we are producing a full division for 1.0f / x. That still can be replaced by the faster version: bool c = fabs(x) > 0x1.0p+96f; float s = c ? 0x1.0p-32f : 1.0f; x *= s; return s * v_rcp_f32(x) if the requested accuracy is 2.5ulp or less. The same version is used if denormals are not supported for non 1.0 numerators, where just v_rcp_f32 is then used for 1.0 numerator. The optimization of 1/x is extended to the case -1/x, which is the same except for the resulting sign bit. OpenCL conformance passed with both enabled and disabled denorms. Differential Revision: https://reviews.llvm.org/D47805 llvm-svn: 334142	2018-06-06 22:22:32 +00:00
Teresa Johnson	4ffc3e7834	[ThinLTO] Rename index IsAnalysis flag to HaveGVs (NFC) With the upcoming patch to add summary parsing support, IsAnalysis would be true in contexts where we are not performing module summary analysis. Rename to the more specific and appropriate HaveGVs, which is essentially what this flag is indicating. llvm-svn: 334140	2018-06-06 22:22:01 +00:00
Sanjay Patel	3cd1aa88f9	[InstCombine] fold another shifty abs pattern to cmp+sel (PR36036) The bug report: https://bugs.llvm.org/show_bug.cgi?id=36036 ...requests a DAG change for this, but an IR canonicalization probably handles most cases. If we still want to match this pattern in the backend, there's a proposal for that too: D47831 Alive proofs including nsw/nuw cases that were first noted in: D46988 https://rise4fun.com/Alive/Kmp This patch is largely copied from the existing code that was initially added with: D40984 ...but I didn't see much gain from trying to share code. llvm-svn: 334137	2018-06-06 21:58:12 +00:00
Matt Arsenault	e9524f1fb3	AMDGPU: Custom lower v2f16 fneg/fabs with illegal f16 Fixes terrible code on targets without f16 support. The legalization creates a mess that is difficult to recover from. Also should avoid randomly breaking these tests multiple times in sequence in future commits. Some regressions in cases where it happens to be better to pull the source modifier after the conversion. llvm-svn: 334132	2018-06-06 21:28:11 +00:00
Roman Lebedev	cbf8446359	[InstCombine] PR37603: low bit mask canonicalization Summary: This is [[ https://bugs.llvm.org/show_bug.cgi?id=37603 | PR37603 ]]. https://godbolt.org/g/VCMNpS https://rise4fun.com/Alive/idM When doing bit manipulations, it is quite common to calculate some bit mask, and apply it to some value via `and`. The typical C code looks like: ``` int mask_signed_add(int nbits) { return (1 << nbits) - 1; } ``` which is translated into (with `-O3`) ``` define dso_local i32 @mask_signed_add(int)(i32) local_unnamed_addr #0 { %2 = shl i32 1, %0 %3 = add nsw i32 %2, -1 ret i32 %3 } ``` But there is a second, less readable variant: ``` int mask_signed_xor(int nbits) { return ~(-(1 << nbits)); } ``` which is translated into (with `-O3`) ``` define dso_local i32 @mask_signed_xor(int)(i32) local_unnamed_addr #0 { %2 = shl i32 -1, %0 %3 = xor i32 %2, -1 ret i32 %3 } ``` Since we created such a mask, it is quite likely that we will use it in `and` next. And then we may get rid of the `not` op by folding into `andn`. But now that I have actually looked: https://godbolt.org/g/VTUDmU _some_ backend changes will be needed too. We clearly lose `bzhi` recognition. Reviewers: spatel, craig.topper, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47428 llvm-svn: 334127	2018-06-06 19:38:27 +00:00
Roman Lebedev	488d28d4e5	[X86] Emit BZHI when mask is ~(-1 << nbits)) Summary: In D47428, i propose to choose the `~(-(1 << nbits))` as the canonical form of low-bit-mask formation. As it is seen from these tests, there is a reason for that. AArch64 currently better handles `~(-(1 << nbits))`, but not the more traditional `(1 << nbits) - 1` (sic!). The other way around for X86. It would be much better to canonicalize. This patch is completely monkey-typing. I don't really understand how this works :) I have based it on `// x & (-1 >> (32 - y))` pattern. Also, when we only have `BMI`, i wonder if we could use `BEXTR` with `start=0` ? Related links: https://bugs.llvm.org/show_bug.cgi?id=36419 https://bugs.llvm.org/show_bug.cgi?id=37603 https://bugs.llvm.org/show_bug.cgi?id=37610 https://rise4fun.com/Alive/idM Reviewers: craig.topper, spatel, RKSimon, javed.absar Reviewed By: craig.topper Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D47453 llvm-svn: 334125	2018-06-06 19:38:16 +00:00
Krzysztof Parzyszek	c1e712baa5	[Hexagon] Implement vector-pair zero as V6_vsubw_dv llvm-svn: 334123	2018-06-06 19:34:40 +00:00
Craig Topper	ef813a5226	[X86] Properly disassemble gather/scatter instructions where xmm4/ymm4/zmm4 are used as the index. These encodings correspond to the cases in the normal encoding scheme where there is no index and our modrm reading code initially decodes it as such. The VSIB handling code tried to compensate for this, but failed to add the base needed to make later code do the right thing. Fixes PR37712. llvm-svn: 334121	2018-06-06 19:15:15 +00:00
Craig Topper	d04cc8e640	[X86] Rename vy512mem->vy512xmem and vz256xmem->vz256mem. The index size is represented by the letter after the 'v'. The number represents the memory size. If an 'x' appears after the number it means the index register can be from VR128X/VR256X instead of VR128/VR256. As vy512mem uses a VR256X index it should have an x. And vz256mem uses a VR512 index so it shouldn't have an x. I admit these names kind of suck and are confusing. llvm-svn: 334120	2018-06-06 19:15:12 +00:00
Simon Pilgrim	aef5bdbea1	[X86][BtVer2] Add support for all vector instructions that should match the dependency-breaking 'zero-idiom' As detailed on Agner's Microarchitecture doc (21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions), all these instructions are dependency breaking and zero the destination register. llvm-svn: 334119	2018-06-06 19:06:09 +00:00
Evandro Menezes	b2c8244715	[AArch64, ARM] Add support for Samsung Exynos M4 Create a separate feature set for Exynos M4 and add test cases. llvm-svn: 334115	2018-06-06 18:56:00 +00:00
Michael Berg	cc1c4b6912	guard fsqrt with fmf sub flags Summary: This change uses fmf subflags to guard optimizations as well as unsafe. These changes originated from D46483. It contains only context for fsqrt. Reviewers: spatel, hfinkel, arsenm Reviewed By: spatel Subscribers: hfinkel, wdng, andrew.w.kaylor, wristow, efriedma, nemanjai Differential Revision: https://reviews.llvm.org/D47749 llvm-svn: 334113	2018-06-06 18:47:55 +00:00
Krzysztof Parzyszek	0da1fe3770	[Hexagon] Split CTPOP of vector pairs llvm-svn: 334109	2018-06-06 18:03:29 +00:00
Petar Jovanovic	8cb6a521be	Change TII isCopyInstr way of returning arguments(NFC) Make TII isCopyInstr() return MachineOperands through pointer to pointer instead via reference. Patch by Nikola Prica. Differential Revision: https://reviews.llvm.org/D47364 llvm-svn: 334105	2018-06-06 16:36:30 +00:00
David Green	25312b2b6c	[GlobalMerge] Set the alignment on merged global structs If no alignment is set, the abi/preferred alignment of structs will be used which may be higher than required. This can lead to extra padding and in the end an increase in data size. Differential Revision: https://reviews.llvm.org/D47633 llvm-svn: 334099	2018-06-06 14:48:32 +00:00
Tim Northover	9b80060d7b	InstCombine: ignore debug instructions during fence combine We should never get different CodeGen based on whether the code is being compiled in debug mode so we must skip over @llvm.dbg.value (and similar) calls. Should fix at least the worst part of PR37690. llvm-svn: 334090	2018-06-06 12:46:02 +00:00
Simon Pilgrim	f06ff16049	Fix MSVC '*/' found outside of comment warning. NFCI. llvm-svn: 334086	2018-06-06 11:10:11 +00:00
Ilya Biryukov	3c9c10649b	Fix compilation of WebAssembly and RISCV after r334078 llvm-svn: 334085	2018-06-06 10:57:50 +00:00
Simon Pilgrim	3d14158891	[X86][BMI][TBM] Only demand bottom 16-bits of the BEXTR control op (PR34042) Only the bottom 16 bits of BEXTR's control op are required (7:0 INDEX, 15:8 LENGTH). Differential Revision: https://reviews.llvm.org/D47690 llvm-svn: 334083	2018-06-06 10:52:10 +00:00
Peter Smith	57f661bd7d	[MC] Pass MCSubtargetInfo to fixupNeedsRelaxation and applyFixup On targets like Arm some relaxations may only be performed when certain architectural features are available. As functions can be compiled with differing levels of architectural support we must make a judgement on whether we can relax based on the MCSubtargetInfo for the function. This change passes through the MCSubtargetInfo for the function to fixupNeedsRelaxation so that the decision on whether to relax can be made per function. In this patch, only the ARM backend makes use of this information. We must also pass the MCSubtargetInfo to applyFixup because some fixups skip error checking on the assumption that relaxation has occurred, to prevent code-generation errors applyFixup must see the same MCSubtargetInfo as fixupNeedsRelaxation. Differential Revision: https://reviews.llvm.org/D44928 llvm-svn: 334078	2018-06-06 09:40:06 +00:00
Petar Jovanovic	326ec32403	[MIPS GlobalISel] Add lowerCall Add minimal support to lower function calls. Support only functions with arguments/return that go through registers and have type i32. Patch by Petar Avramovic. Differential Revision: https://reviews.llvm.org/D45627 llvm-svn: 334071	2018-06-06 07:24:52 +00:00
Petr Hosek	fc9b29bd61	[Support] Use zx_cache_flush on Fuchsia to flush instruction cache Fuchsia doesn't use __clear_cache; instead it provides the zx_cache_flush system call. Use it to flush the instruction cache. Differential Revision: https://reviews.llvm.org/D47753 llvm-svn: 334068	2018-06-06 06:26:18 +00:00
Sanjay Patel	59313be8d3	[CodeGen] assume max/default throughput for unspecified instructions This is a fix for the problem arising in D47374 (PR37678): https://bugs.llvm.org/show_bug.cgi?id=37678 We may not have throughput info because it's not specified in the model or it's not available with variant scheduling, so assume that those instructions can execute/complete at max-issue-width. Differential Revision: https://reviews.llvm.org/D47723 llvm-svn: 334055	2018-06-05 23:34:45 +00:00
Amaury Sechet	a79b6b3ef0	[Mips] Remove uneeded variants of ADDC/ADDE lowering Summary: As it turns out, the lowering for the Mips16* family of target is the exact same thing as what the ops expands to, so the code handling them can be removed and the ops only enabled for the MipsSE* family of targets. Reviewers: smaksimovic, atanasyan, abeserminji Subscribers: sdardis, arichardson, llvm-commits Differential Revision: https://reviews.llvm.org/D47703 llvm-svn: 334052	2018-06-05 22:13:56 +00:00
Guozhi Wei	c4c6b548c5	[CodeGenPrepare] Move Extension Instructions Through Logical And Shift Instructions The CodeGenPrepare pass moves extension instructions close to load instructions in a different BB, so they can be combined later. But the extension instructions can't move through logical and shift instructions in the current implementation. This patch enables this enhancement, so we can eliminate more extension instructions. Differential Revision: https://reviews.llvm.org/D45537 This is a re-commit of r331783, which was reverted by r333305. The performance regression was caused by some unlucky alignment, not a code generation problem. llvm-svn: 334049	2018-06-05 21:03:52 +00:00
Zachary Turner	8ac1c38a72	[FileSystem] Remove OpenFlags param from several functions. There was only one place in the entire codebase where a non default value was being passed, and that place was already hidden in an implementation file. So we can delete the extra parameter and all existing clients continue to work as they always have, while making the interface a bit simpler. Differential Revision: https://reviews.llvm.org/D47789 llvm-svn: 334046	2018-06-05 19:58:26 +00:00
Matt Arsenault	57e541e87e	AMDGPU: Preserve metadata when widening loads Preserves the low bound of the !range. I don't think it's legal to do anything with the top half since it's theoretically reading garbage. llvm-svn: 334045	2018-06-05 19:52:56 +00:00
Matt Arsenault	9224c00d2b	AMDGPU: Use more custom insert/extract_vector_elt lowering Apply to i8 vectors. llvm-svn: 334044	2018-06-05 19:52:46 +00:00
Krzysztof Parzyszek	b984ffcc71	[Hexagon] Add pattern to generate 64-bit neg instruction llvm-svn: 334043	2018-06-05 19:52:39 +00:00
Krzysztof Parzyszek	d8b093efef	[Hexagon] Add more patterns for generating abs/absp instructions llvm-svn: 334038	2018-06-05 19:00:50 +00:00
Michael Berg	96925fe0df	guard fneg with fmf sub flags Summary: This change uses fmf subflags to guard optimizations as well as unsafe. These changes originated from D46483. Reviewers: spatel, hfinkel Reviewed By: spatel Subscribers: nemanjai Differential Revision: https://reviews.llvm.org/D47389 llvm-svn: 334037	2018-06-05 18:49:47 +00:00
Simon Dardis	0d95ff03f2	[mips] Fix the predicates for arithmetic operations Reviewers: smaksimovic, atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D47635 llvm-svn: 334031	2018-06-05 17:53:22 +00:00
Simon Pilgrim	f2f043acbb	[X86][SSE] Use multiplication scale factors for v8i16 SHL on pre-AVX2 targets. Similar to v4i32 SHL, convert v8i16 shift amounts to scale factors instead to improve performance and reduce instruction count. We were already doing this for constant shifts, this adds variable shift support. Reduces the serial nature of the codegen, which relies on chains of plendvb/pand+pandn+por shifts. This is a step towards adding support for vXi16 vector rotates. Differential Revision: https://reviews.llvm.org/D47546 llvm-svn: 334023	2018-06-05 15:17:39 +00:00
Nirav Dave	05b589101e	[MC][X86] Allow assembler variable assignment to register name. Summary: Allow extended parsing of variable assembler assignment syntax and modify X86 to permit VAR = register assignment. As we emit these as .set directives when possible, we inline such expressions in output assembly. Fixes PR37425. Reviewers: rnk, void, echristo Reviewed By: rnk Subscribers: nickdesaulniers, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D47545 llvm-svn: 334022	2018-06-05 15:13:39 +00:00
Matt Arsenault	191bc71541	DAG: Stop dropping invariant/dereferencable When legalizing illegal FP load results, this was for some reason dropping the invariant and dereferencable memory flags. There doesn't seem to be any reason for this, and the equivalent isn't done for integer loads. Fixes an issue in a future AMDGPU commit where some identical loads fail to merge because one of the loads ends up dropping the flags. llvm-svn: 334020	2018-06-05 14:52:24 +00:00
John Brawn	e4ff0bd401	[InstCombine] Correct the cmp operand type used when canonicalizing abs/nabs When adjusting a cmp in order to canonicalize an abs/nabs select pattern we need to use the type of the existing operand when creating a new operand not the type of a select operand, as the two may be different. This fixes PR37686. llvm-svn: 334019	2018-06-05 14:10:55 +00:00
Gabor Buella	1181f94ae4	[X86] NFC Fix typo introduced in r328016 HSI->HDI llvm-svn: 334016	2018-06-05 12:55:12 +00:00
Krzysztof Parzyszek	aafb8c204c	[Hexagon] Minor cleanups in isel lowering llvm-svn: 334015	2018-06-05 12:49:19 +00:00
Hiroshi Inoue	955655f558	[PowerPC] reduce rotate in BitPermutationSelector BitPermutationSelector builds the output value by repeating rotate-and-mask instructions with input registers. Here, we may avoid one rotate instruction if we start building from an input register that does not require rotation. For example of the test case bitfieldinsert.ll, it first rotates left r4 by 8 bits and then inserts some bits from r5 without rotation. This can be executed by one rlwimi instruction, which rotates r4 by 8 bits and inserts its bits into r5. This patch adds a check for rotation amounts in the comparator used in sorting to process the input without rotation first. Differential Revision: https://reviews.llvm.org/D47765 llvm-svn: 334011	2018-06-05 11:58:01 +00:00
Simon Pilgrim	fef9b6eea6	[X86][SSE] Add target shuffle support to X86TargetLowering::computeKnownBitsForTargetNode Ideally we'd use resolveTargetShuffleInputs to handle faux shuffles as well but: (a) that code path doesn't handle general/pre-legalized ops/types very well. (b) I'm concerned about the compute time as they recurse to calls to computeKnownBits/ComputeNumSignBits which would need depth limiting somehow. llvm-svn: 334007	2018-06-05 10:52:29 +00:00
Gabor Buella	349ffcee87	[X86] NFC Refactor some code in InstPrinters Summary: Bringing some code duplicated in the AT&T and the Intel printers into a common parent class. Reviewers: craig.topper Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D47682 llvm-svn: 334005	2018-06-05 10:41:39 +00:00
Peter Smith	ef945b2240	[MC][ARM] Add range checking for Thumb2 resolved fixups. When the branch target of a Thumb2 unconditional or conditional branch is resolved at assembly time, no range checking is performed on the result, leading to incorrect immediates. This change adds a range check: +- 16 Megabytes for unconditional branches, +- 1 Megabyte for the conditional branch. Differential Revision: https://reviews.llvm.org/D46306 llvm-svn: 333997	2018-06-05 10:00:56 +00:00
Simon Pilgrim	7bbe7a2920	[X86][SSE] Add basic PACKUS support to X86TargetLowering::computeKnownBitsForTargetNode Helps improve analysis of saturation ops llvm-svn: 333995	2018-06-05 09:45:03 +00:00
Peter Smith	0aafe0cee5	[MC][ARM] Correct Thumb BL instruction range The Thumb BL range is + or - either 16 Megabytes or 4 Megabytes depending on whether the CPU supports Thumb2 or the v8-m baseline ops. The existing check for BL range is incorrectly set at +- 32 Megabytes. This change corrects the higher range and uses the lower range if the featurebits don't have the necessary support for it. Differential Revision: https://reviews.llvm.org/D46305 llvm-svn: 333991	2018-06-05 09:32:28 +00:00
Alexander Ivchenko	964b27fa21	[X86][CET] Shadow stack fix for setjmp/longjmp This is the new version of D46181, allowing setjmp/longjmp to work correctly with the Intel CET shadow stack by storing SSP on setjmp and fixing it on longjmp. The patch has been updated to use the cf-protection-return module flag instead of HasSHSTK, and the bug that caused D46181 to be reverted has been fixed with the test expanded to track that fix. patch by mike.dvoretsky Differential Revision: https://reviews.llvm.org/D47311 llvm-svn: 333990	2018-06-05 09:22:30 +00:00
Craig Topper	f17b33d6c6	[X86] Make all instructions that operate on MMX types, but were added after the initial MMX support via one of the SSE features flags make them require the MMX feature as well. Passing -mattr=-mmx needs to disable these instructions since the MMX register class won't have been set up. But we don't want -mattr=-mmx to disable SSE so we have to do it separately. llvm-svn: 333984	2018-06-05 06:20:06 +00:00
Nirav Dave	e5eb99668c	[RegAllocGreedy] Use simpler map class for EvicteeInfo. NFCI. RegAlloc keeps an insertion-time ordered map of evictee information, but we only use membership. Replace MapVector with contextually equivalent DenseMap which is smaller and faster. llvm-svn: 333981	2018-06-05 03:16:28 +00:00
Francis Visoiu Mistrih	2c0ef67327	Use MF instead of Fn for MachineFunction references. NFC llvm-svn: 333973	2018-06-05 00:27:28 +00:00
Francis Visoiu Mistrih	ca69b3bf6d	[ShrinkWrap] Add optimization remarks to the shrink-wrapping pass Start by emitting remarks for very basic unsupported cases such as irreducible CFGs and EHFunclets. The end goal is to be able to cover all the cases where we give up with an explanation. llvm-svn: 333972	2018-06-05 00:27:24 +00:00
Amara Emerson	d496cc8ffb	[MIRParser] Add parser support for 'true' and 'false' i1s. We already output true and false in the printer, but the parser isn't able to read it. Differential Revision: https://reviews.llvm.org/D47424 llvm-svn: 333970	2018-06-05 00:17:13 +00:00
Reid Kleckner	adcaddb6da	Fix -Wcovered-switch-default warning and clang-format it llvm-svn: 333967	2018-06-04 23:47:29 +00:00
David Blaikie	10d25ffe7d	Move Compiler.h from Demangle back to Support Code review feedback from r328123 prefers copying the few feature test macros used by Demangle into there, rather than sinking the header into an odd corner like Demangle. llvm-svn: 333965	2018-06-04 22:53:38 +00:00
Derek Schuff	72f19241d6	Simplified WebAssemblyAsmBackend by removing explicit ELF variant. The ELF version was broken (does not deal with wasm specific fixups), and now is slightly less broken. It will be removed in its entirety in the future which this change makes slightly easier (just remove the IsELF bool). Differential Revision: https://reviews.llvm.org/D47745 Patch by Wouter van Oortmerssen llvm-svn: 333964	2018-06-04 22:53:36 +00:00
Sanjay Patel	dcb8d304c3	[InstCombine] refine UB-handling in shuffle-binop transform As noted in rL333782, we can be both better for optimization and safer with this transform: BinOp (shuffle V1, Mask), C --> shuffle (BinOp V1, NewC), Mask The only potentially unsafe-to-speculate binops are integer div/rem. All other binops are always safe (although I don't see a way to assert that in code here). For opcodes like shifts that can produce poison, it can't matter here because we know the lanes with undef are dropped by the subsequent shuffle. Differential Revision: https://reviews.llvm.org/D47686 llvm-svn: 333962	2018-06-04 22:26:45 +00:00
David Blaikie	31b98d2e99	Move Analysis/Utils/Local.h back to Transforms Review feedback from r328165. Split out just the one function from the file that's used by Analysis. (As chandlerc pointed out, the original change only moved the header and not the implementation anyway - which was fine for the one function that was used (since it's a template/inlined in the header) but not in general) llvm-svn: 333954	2018-06-04 21:23:21 +00:00
Jessica Paquette	aa087327ce	[MachineOutliner] NFC - Move intermediate data structures to MachineOutliner.h This is setting up to fix bug 37573 cleanly. This moves data structures that are technically both used in some way by the target and the general-purpose outlining algorithm into MachineOutliner.h. In particular, the `Candidate` class is of importance. Before, the outliner passed the locations of `Candidates` to the target, which would then make some decisions about the prospective outlined function. This change allows us to just pass `Candidates` along to the target. This will allow the target to discard `Candidates` that would be considered unsafe before cost calculation. Thus, we will be able to remove the unsafe candidates described in the bug without resorting to torching the entire prospective function. Also, as a side-effect, it makes the outliner a bit cleaner. https://bugs.llvm.org/show_bug.cgi?id=37573 llvm-svn: 333952	2018-06-04 21:14:16 +00:00
Alexander Ivchenko	2f038c4094	[X86][ELF][CET] Adding the .note.gnu.property ELF section in X86 In preparation for the proposed linker ABI changes (https://github.com/hjl-tools/linux-abi/wiki/linux-abi-draft.pdf, https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-cet.pdf), this patch enables emission of the .note.gnu.property section to ELF object files when building CET-enabled modules. patch by mike.dvoretsky Differential Revision: https://reviews.llvm.org/D47145 llvm-svn: 333951	2018-06-04 21:07:35 +00:00
Scott Linder	ba81d7f1eb	[CodeGen] Always update divergence in SelectionDAG::UpdateNodeOperands Some overloads failed to update divergence. Differential Revision: https://reviews.llvm.org/D47148 llvm-svn: 333947	2018-06-04 20:19:45 +00:00
Zachary Turner	63db25ba0d	[Support] Add functions that operate on native file handles on Windows. Windows' CRT has a limit of 512 open file descriptors, and fds which are generated by converting a HANDLE via _open_osfhandle count towards this limit as well. Regardless, often you find yourself marshalling back and forth between native HANDLE objects and fds anyway. If we know from the get-go that we're going to need to work directly with the handle, we can cut out the marshalling layer while also not contributing to filling up the CRT's very limited handle table. On Unix these functions just delegate directly to the existing set of functions since an fd is the native file type. It would be nice, very long term, if we could convert most uses of fds to file_t. Differential Revision: https://reviews.llvm.org/D47688 llvm-svn: 333945	2018-06-04 19:38:11 +00:00
Amaury Sechet	da661e9236	[DAGcombine] Teach the combiner about -a = ~a + 1 Summary: This includes variants for add, uaddo and addcarry. usubo and subcarry require the carry to be flipped to preserve semantics, but we chose to do the transform anyway in that case so as to push the transform down the carry chain. Reviewers: efriedma, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46505 llvm-svn: 333943	2018-06-04 19:23:22 +00:00
Amaury Sechet	93a7d2aa3c	Get rid of SETCCE Summary: It has been deprecated in favor of SETCCCARRY for a year now and isn't used by any in tree backend. Reviewers: efriedma, craig.topper, dblaikie, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47685 llvm-svn: 333939	2018-06-04 18:36:22 +00:00
Dmitry Mikulin	4539487650	In thin and full LTO + CFI, direct function calls may go through jump table entries to reach the target. Since these calls don't require type checks, we can short-circuit them to their real targets, except in cases when they can be pre-empted. Differential Revision: https://reviews.llvm.org/D46326 llvm-svn: 333937	2018-06-04 18:18:12 +00:00
Craig Topper	1f956e2c5f	[X86] Don't pass ParitySrc array into isAddSubOrSubAddMask. Instead use a bool output parameter to get the real piece of info we care about. NFC The ParitySrc array is more of an implementation detail. A single bool to get the final parity is sufficient. llvm-svn: 333935	2018-06-04 17:58:45 +00:00
Stanislav Mekhanoshin	838c07c531	[AMDGPU] Small refactoring in the scheduler After last changes some code can be simplified. Differential Revision: https://reviews.llvm.org/D47661 llvm-svn: 333934	2018-06-04 17:57:40 +00:00
Stanislav Mekhanoshin	28624f94d5	[AMDGPU] Factored out common part of GCNRPTracker::reset() Differential Revision: https://reviews.llvm.org/D47664 llvm-svn: 333931	2018-06-04 17:21:54 +00:00
Sam Clegg	675a51750a	[MachO] Add out-of-bounds check to MachOObjectFile.cpp This is a followup to rL333496. Differential Revision: https://reviews.llvm.org/D47544 llvm-svn: 333929	2018-06-04 17:01:20 +00:00
Sam Clegg	537afe6f0e	[WebAssembly] Fix .td files after rL333900 Differential Revision: https://reviews.llvm.org/D47727 llvm-svn: 333928	2018-06-04 16:59:26 +00:00
John Brawn	c5a6392be3	[ValueTracking] Match select abs pattern when there's an sext involved When checking a select to see if it matches an abs, allow the true/false values to be a sign-extension of the comparison value instead of requiring that they're directly the comparison value, as all the comparison cares about is the sign of the value. This fixes a regression due to r333702, where we were no longer generating ctlz due to isKnownNonNegative failing to match such a pattern. Differential Revision: https://reviews.llvm.org/D47631 llvm-svn: 333927	2018-06-04 16:53:57 +00:00
Mark Searles	f0b93f1e9e	[AMDGPU][Waitcnt] Fix handling of flat instrs On GFX9 and earlier, flat memory ops may decrement VMCNT out-of-order as well as LGKMCNT out-of-order. Differential Revision: https://reviews.llvm.org/D46616 llvm-svn: 333926	2018-06-04 16:51:59 +00:00
Simon Pilgrim	7c000d4267	[X86] Only accept const SelectionDAG to resolveTargetShuffleInputs/getFauxShuffleMask These methods should only be using SelectionDAG for analysis (known/sign bits etc), not node creation. llvm-svn: 333925	2018-06-04 16:48:13 +00:00
Benjamin Kramer	f663eba561	[NVPTX] Delete dead code from the AsmPrinter. llvm-svn: 333924	2018-06-04 16:12:33 +00:00
Andrea Di Biagio	39e5a5695f	[RFC][patch 3/3] Add support for variant scheduling classes in llvm-mca. This patch is the last of a sequence of three patches related to LLVM-dev RFC "MC support for variant scheduling classes". http://lists.llvm.org/pipermail/llvm-dev/2018-May/123181.html This fixes PR36672. The main goal of this patch is to teach llvm-mca how to solve variant scheduling classes. This patch does that, plus it adds new variant scheduling classes to the BtVer2 scheduling model to identify so-called zero-idioms (i.e. so-called dependency breaking instructions that are known to generate zero, and that are optimized out in hardware at register renaming stage). Without the BtVer2 change, this patch would not have had any meaningful tests. This patch is effectively the union of two changes: 1) a change that teaches llvm-mca how to resolve variant scheduling classes. 2) a change to the BtVer2 scheduling model that allows us to special-case packed XOR zero-idioms (this partially fixes PR36671). Differential Revision: https://reviews.llvm.org/D47374 llvm-svn: 333909	2018-06-04 15:43:09 +00:00
Krzysztof Parzyszek	623eb54361	[SelectionDAG] Add missing closing parentheses in comments, NFC llvm-svn: 333907	2018-06-04 14:54:53 +00:00
Nicolai Haehnle	59198ed040	AMDGPU: Make various NamedOperands upper case Summary: Avoid name clashes with the corresponding bit fields in the instruction encoding. Change-Id: Id1644e703e976e78f7af93788d9f44cb48c3251f Reviewers: arsenm, rampitec, kzhuravl Subscribers: wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D47433 llvm-svn: 333905	2018-06-04 14:45:20 +00:00
Nicolai Haehnle	01d261f18d	TableGen: Streamline the semantics of NAME Summary: The new rules are straightforward. The main rules to keep in mind are: 1. NAME is an implicit template argument of class and multiclass, and will be substituted by the name of the instantiating def/defm. 2. The name of a def/defm in a multiclass must contain a reference to NAME. If such a reference is not present, it is automatically prepended. And for some additional subtleties, consider these: 3. defm with no name generates a unique name but has no special behavior otherwise. 4. def with no name generates an anonymous record, whose name is unique but undefined. In particular, the name won't contain a reference to NAME. Keeping rules 1&2 in mind should allow a predictable behavior of name resolution that is simple to follow. The old "rules" were rather surprising: sometimes (but not always), NAME would correspond to the name of the toplevel defm. They were also plain bonkers when you pushed them to their limits, as the old version of the TableGen test case shows. Having NAME correspond to the name of the toplevel defm introduces "spooky action at a distance" and breaks composability: refactoring the upper layers of a hierarchy of nested multiclass instantiations can cause unexpected breakage by changing the value of NAME at a lower level of the hierarchy. The new rules don't suffer from this problem. Some existing .td files have to be adjusted because they ended up depending on the details of the old implementation. Change-Id: I694095231565b30f563e6fd0417b41ee01a12589 Reviewers: tra, simon_tatham, craig.topper, MartinO, arsenm, javed.absar Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D47430 llvm-svn: 333900	2018-06-04 14:26:05 +00:00
Simon Dardis	fb4dde1142	[mips] Restore the availability of trap for microMIPS Reviewers: smaksimovic, atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D47584 llvm-svn: 333895	2018-06-04 12:50:32 +00:00
Luke Geeson	43e4367961	[AArch64] Audit on rL333634 to fix FP16 Disasm BitPatterns llvm-svn: 333879	2018-06-04 09:41:32 +00:00
Sander de Smalen	d0a6f6a502	[AArch64][SVE] Fix range for DUP immediates (16bit elts) For immediates used in DUP instructions that have the range -128 to 127, or a multiple of 256 in the range -32768 to 32512, one could argue that when the result element size is 16bits (.h), the value can be considered both signed and unsigned. Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47619 llvm-svn: 333873	2018-06-04 07:24:23 +00:00
Sander de Smalen	fd54a781f6	[AArch64][SVE] Asm: Print indexed element 0 as FPR. Print the first indexed element as a FP register, for example: mov z0.d, z1.d[0] Is now printed as: mov z0.d, d1 Next to printing, this patch also adds aliases to parse 'mov z0.d, d1'. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47571 llvm-svn: 333872	2018-06-04 07:07:35 +00:00
Sander de Smalen	c33d668ab7	[AArch64][SVE] Asm: Support for indexed DUP instructions. Unpredicated copy of indexed SVE element to SVE vector, along with MOV-aliases. For example: dup z0.h, z1.h[0] duplicates the first 16-bit element from z1 to all elements in the result vector z0. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D47570 llvm-svn: 333871	2018-06-04 06:40:55 +00:00
Sander de Smalen	367a53b059	[AArch64][SVE] Asm: Support for FCPY immediate instructions. Predicated copy of floating-point immediate value to SVE vector, along with MOV-aliases. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: javed.absar Differential Revision: https://reviews.llvm.org/D47518 llvm-svn: 333869	2018-06-04 05:58:06 +00:00
Sander de Smalen	512d57f1a5	[AArch64][SVE] Asm: Support for CPY immediate instructions Predicated copy of possibly shifted immediate value into SVE vector, along with MOV-aliases. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47517 llvm-svn: 333868	2018-06-04 05:40:46 +00:00
Serguei Katkov	d894fb4288	[InstCombine] Fix div handling When we optimize a select based on the fact that div by 0 is undef, we should not traverse instructions which are not guaranteed to transfer execution to the next instruction. The guard intrinsic is an example. Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47576 llvm-svn: 333864	2018-06-04 02:52:36 +00:00
Vedant Kumar	adbd27a599	[Debugify] Don't apply DI before the bitcode writer pass Applying synthetic debug info before the bitcode writer pass has no testing-related purpose. This commit prevents that from happening. It also adds tests which check that IR produced with/without -debugify-each enabled is identical after stripping. This makes it possible to check that individual passes (or full pipelines) are invariant to debug info. llvm-svn: 333861	2018-06-04 00:11:49 +00:00
Craig Topper	9923eac358	[X86] Remove and autoupgrade masked avx512vnni intrinsics using the unmasked intrinsics and select instructions. llvm-svn: 333857	2018-06-03 23:24:17 +00:00
Lang Hames	d6155ff002	[ORC] Add a constructor to create an IRMaterializationUnit from a module and pre-existing SymbolFlags and SymbolToDefinition maps. This constructor is useful when delegating work from an existing IRMaterializationUnit to a new one, as it avoids the cost of re-computing these maps. llvm-svn: 333852	2018-06-03 19:22:48 +00:00
Sanjay Patel	3bd957b7ae	[InstCombine] improve sub with bool folds There's a patchwork of existing transforms trying to handle these cases, but as seen in the changed test, we weren't catching them all. llvm-svn: 333845	2018-06-03 16:35:26 +00:00
Amaury Sechet	99909e9308	Remove SETCCE use from Lanai's backend Summary: This creates a small perf regression, but after talking with Jacques Pienaar, he was good with it to get things moving toward removing SETCCE. Reviewers: jpienaar, bryant Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47626 llvm-svn: 333838	2018-06-03 12:56:24 +00:00
Ivan A. Kosarev	60a991ed1a	[NEON] Support VLD1xN intrinsics in AArch32 mode (LLVM part) We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47120 llvm-svn: 333825	2018-06-02 16:40:03 +00:00
Ivan A. Kosarev	73c5337a64	Revert r333819 "[NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)" The LLVM part was committed instead of the Clang part. Differential Revision: https://reviews.llvm.org/D47121 llvm-svn: 333824	2018-06-02 16:38:38 +00:00
Michael J. Spencer	ae6eeaea92	[MC] Add assembler support for .cg_profile. Object File Representation At codegen time this is emitted into the ELF file as a pair of symbol indices and a weight. In assembly it looks like: .cg_profile a, b, 32 .cg_profile freq, a, 11 .cg_profile freq, b, 20 When writing an ELF file these are put into a SHT_LLVM_CALL_GRAPH_PROFILE (0x6fff4c02) section as (uint32_t, uint32_t, uint64_t) tuples as (from symbol index, to symbol index, weight). Differential Revision: https://reviews.llvm.org/D44965 llvm-svn: 333823	2018-06-02 16:33:01 +00:00
Craig Topper	93d8fbd8f2	[X86] Add tied source operand to AVX5124FMAPS and AVX5124VNNIW instructions. This doesn't affect the assembly or disassembly, but is more accurate. llvm-svn: 333822	2018-06-02 16:30:39 +00:00
Craig Topper	27234f1d8f	[X86] Fix warning message for AVX5124FMAPS and AVX5124VNNIW instructions in the assembly parser. The caret was positioned on the wrong operand. It's too hard to get right so just put the caret at the beginning of the instruction. llvm-svn: 333821	2018-06-02 16:30:36 +00:00
Sanjay Patel	bbc6d60677	[InstCombine] call simplify before trying vector folds As noted in the review thread for rL333782, we could have made a bug harder to hit if we were simplifying instructions before trying other folds. The shuffle transform in question isn't ever a simplification; it's just a canonicalization. So I've renamed that to make that clearer. This is NFCI at this point, but I've regenerated the test file to show the cosmetic value naming difference of using instcombine's RAUW vs. the builder. Possible follow-ups: 1. Move reassociation folds after simplifies too. 2. Refactor common code; we shouldn't have so much repetition. llvm-svn: 333820	2018-06-02 16:27:44 +00:00
Ivan A. Kosarev	51f19b9ee1	[NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part) We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47121 llvm-svn: 333819	2018-06-02 16:26:42 +00:00
Fangrui Song	8ca769d204	[Support] Remove unused raw_ostream::handle whose anchor role was superseded by anchor() llvm-svn: 333817	2018-06-02 06:00:35 +00:00
Craig Topper	1534929623	[X86] Add encoding information for the AVX5124FMAPS and AVX5124VNNIW instructions so they can be assembled and disassembled. These instructions are unusual in that they operate on 4 consecutive registers so supporting them in codegen will be more difficult than normal. Includes an assembler check to warn if the source register is not the first register of a 4 register group. llvm-svn: 333812	2018-06-02 02:15:10 +00:00
Chandler Carruth	9281503e8f	[PM/LoopUnswitch] Fix how the cloned loops are handled when updating analyses. Summary: I noticed this issue because we didn't put the primary cloned loop into the `NonChildClonedLoops` vector and so never iterated on it. Once I fixed that, it made it clear why I had to do a really complicated and unnecessary dance when updating the loops to remain in canonical form -- I was unwittingly working around the fact that the primary cloned loop wasn't in the expected list of cloned loops. Doh! Now that we include it in this vector, we don't need to return it and we can consolidate the update logic as we correctly have a single place where it can be handled. I've just added a test for the iteration order aspect as every time I changed the update logic partially or incorrectly here, an existing test failed and caught it so that seems well covered (which is also evidenced by the extensive working around of this missing update). Reviewers: asbirlea, sanjoy Subscribers: mcrosier, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D47647 llvm-svn: 333811	2018-06-02 01:29:01 +00:00
Roman Tereshin	cf88ffaaf9	[DebugInfo] Refactoring DIType::setFlags to DIType::cloneWithFlags, NFC and using the latter in DIBuilder::createArtificialType and DIBuilder::createObjectPointerType methods as well as introducing mirroring DISubprogram::cloneWithFlags and DIBuilder::createArtificialSubprogram methods. The primary goal here is to add createArtificialSubprogram to support a pass downstream while keeping the method consistent with the existing ones and making sure we don't encourage changing already created DI-nodes. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D47615 llvm-svn: 333806	2018-06-01 23:15:09 +00:00
Craig Topper	3828ce7eab	[X86] Do something sensible when an expand load intrinsic is passed a 0 mask. Previously we just returned undef, but really we should be returning the pass thru input. We also need to make sure we preserve the chain output that the original intrinsic node had to maintain connectivity in the DAG. So we should just return the incoming chain as the output chain. llvm-svn: 333804	2018-06-01 22:59:07 +00:00
Vedant Kumar	7224c08141	Add a debug dump for DbgValueHistoryMap This makes it easier to inspect the results of DbgValueHistoryCalculator. Differential Revision: https://reviews.llvm.org/D47663 llvm-svn: 333801	2018-06-01 22:33:15 +00:00
Craig Topper	aa747412b1	[X86] Add isel patterns to use vexpand with zero masking when the passthru value is a zero vector. llvm-svn: 333800	2018-06-01 22:28:28 +00:00
Zachary Turner	b44d7a0da1	Move some function declarations out of WindowsSupport.h The idea behind WindowsSupport.h is that it's in the source directory so that windows.h'isms don't leak out into the larger LLVM project. To that end, any symbol that references a symbol from windows.h must be in this private header, and not in a public header. However, we had some useful utility functions in WindowsSupport.h which have no dependency on the Windows API, but still only make sense on Windows. Those functions should be usable outside of Support since there is no risk of causing a windows.h leak. Although this introduces some preprocessor logic in some header files, It's not too egregious and it's better than the alternative of duplicating a ton of code. Differential Revision: https://reviews.llvm.org/D47662 llvm-svn: 333798	2018-06-01 22:23:46 +00:00
Karl-Johan Karlsson	6d52e5c3e4	[ConstantFold] Disallow folding vector geps into bitcasts Summary: Getelementptr returns a vector of pointers, instead of a single address, when one or more of its arguments is a vector. In such case it is not possible to simplify the expression by inserting a bitcast of operand(0) into the destination type, as it will create a bitcast between different sizes. Reviewers: majnemer, mkuper, mssimpso, spatel Reviewed By: spatel Subscribers: lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D46379 llvm-svn: 333783	2018-06-01 19:34:35 +00:00
Sanjay Patel	66f7e19f6a	[InstCombine] fix vector shuffle transform to replace undef elements (PR37648) This bug: https://bugs.llvm.org/show_bug.cgi?id=37648 ...was created with the enhancement to this transform with rL332479. The urem test shows the disaster potential: any undef divisor lane makes the whole op undef. The test diffs show that vector demanded elements turns some of the potential, but not all, unused binop operands back into undef already. llvm-svn: 333782	2018-06-01 19:23:18 +00:00
Simon Atanasyan	e80c3ce9cc	[mips] Support 64-bit offsets for lb/sb/ld/sd/lld ... instructions The `MipsAsmParser::loadImmediate` can load immediates of various sizes into a register. Idea of this change is to use `loadImmediate` in the `MipsAsmParser::expandMemInst` method to load offset into a register and then call required load/store instruction. The patch removes separate `expandLoadInst` and `expandStoreInst` methods and does everything in the `expandMemInst` method to escape code duplication. Differential Revision: https://reviews.llvm.org/D47316 llvm-svn: 333774	2018-06-01 16:37:53 +00:00
Simon Atanasyan	3a44bcf95a	[mips] Extend list of relocations supported by the `.reloc` directive Supporting GOT and TLS related relocations by the `.reloc` directive is useful for purpose of testing various tools like a linker, for example. llvm-svn: 333773	2018-06-01 16:37:42 +00:00
Krzysztof Parzyszek	bc68385dad	[Hexagon] Avoid UB when shifting unsigned integer left by 32 llvm-svn: 333771	2018-06-01 15:39:10 +00:00
Vlad Tsyrklevich	6867ab7c90	[ThinLTOBitcodeWriter] Emit summaries for regular LTO modules Summary: Emit summaries for bitcode modules that are only destined for the regular LTO portion of the build so they can participate in summary-based dead stripping. This change reduces the size of a nacl_helper build with cfi-icall enabled by 7%, removing the majority of the overhead due to enabling cfi-icall. The cfi-icall size increase was caused by compiling in lots of unused code and cfi-icall generating jumptable references to unused symbols that could no longer be removed by -Wl,-gc-sections. Increasing the visibility of summary-based dead stripping prevented jumptable entries being created for unused symbols from the regular LTO portion of the build. Reviewers: pcc Reviewed By: pcc Subscribers: dschuff, mehdi_amini, inglorion, eraman, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D47594 llvm-svn: 333768	2018-06-01 15:20:47 +00:00
Nirav Dave	fc9a700f94	[DAG] Avoid checking for consecutive stores in store merge. NFCI. llvm-svn: 333766	2018-06-01 15:05:55 +00:00
Nirav Dave	39ece11ae5	[DAG] Simplify Expression. NFC. llvm-svn: 333765	2018-06-01 15:05:30 +00:00
Nirav Dave	0fc27acaa2	[DAG] Remove untriggerable check. NFCI. Candidate check precludes this check. llvm-svn: 333764	2018-06-01 15:05:05 +00:00
Nirav Dave	a74921a696	[DAG] Prune store merge legal store check to stop invalid size. NFCI. Do not consider store sizes larger than the maximum legal store size. llvm-svn: 333763	2018-06-01 15:04:40 +00:00
Krzysztof Parzyszek	aec2c0c9b6	[Hexagon] Select HVX code for vector CTPOP, CTLZ, and CTTZ llvm-svn: 333760	2018-06-01 14:52:58 +00:00
Hiroshi Inoue	9796b47df1	[NFC] Zero initialize local variables This patch makes local variables zero initialized to avoid broken values in debug output. llvm-svn: 333754	2018-06-01 14:23:15 +00:00
Krzysztof Parzyszek	0b6187c1a9	[SelectionDAG] Expand UADDO/USUBO into ADD/SUBCARRY if legal for target Additionally, implement handling of ADD/SUBCARRY on Hexagon, utilizing the UADDO/USUBO expansion. Differential Revision: https://reviews.llvm.org/D47559 llvm-svn: 333751	2018-06-01 14:00:32 +00:00
Amaury Sechet	8467411dad	Set ADDE/ADDC/SUBE/SUBC to expand by default Summary: They've been deprecated in favor of UADDO/ADDCARRY or USUBO/SUBCARRY for a while. Targets that use these opcodes are changed in order to ensure their behavior doesn't change. Reviewers: efriedma, craig.topper, dblaikie, bkramer Subscribers: jholewinski, arsenm, jyknight, sdardis, nemanjai, nhaehnle, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D47422 llvm-svn: 333748	2018-06-01 13:21:33 +00:00
Amara Emerson	5a3bb68e12	[AArch64][GlobalISel] Zero-extend s1 values when returning. Before we were relying on the any extend of the s1 to s32, but for AAPCS we need to zero-extend it to at least s8. Fixes PR36719 Differential Revision: https://reviews.llvm.org/D47425 llvm-svn: 333747	2018-06-01 13:20:32 +00:00
Florian Hahn	8a17f1f43e	Revert r333740: [IPSCCP] Use PredicateInfo to propagate facts from cmp. This is breaking the clang-with-thin-lto-ubuntu bot. llvm-svn: 333745	2018-06-01 12:58:43 +00:00
Sander de Smalen	f95ea047e5	[AArch64][SVE] Asm: Support for FDUP_ZI (copy fp immediate) instruction. Unpredicated copy of floating-point immediate value into SVE vector, along with MOV-aliases. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47482 llvm-svn: 333744	2018-06-01 12:54:46 +00:00
Simon Dardis	351aa594f6	[mips] Guard more aliases correctly. Also, duplicate an alias for microMIPS. llvm-svn: 333741	2018-06-01 10:57:13 +00:00
Florian Hahn	f4df554f32	Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE. As a follow up, we should be able to extend it to also propagate additional facts about nonnull. Reviewers: davide, mssimpso, dberlin, efriedma Reviewed By: davide, dberlin Differential Revision: https://reviews.llvm.org/D45330 llvm-svn: 333740	2018-06-01 10:48:54 +00:00
Simon Dardis	54217598b6	[mips] Guard 'nop' properly and add mips16's nop instruction Reviewers: smaksimovic, atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D47583 llvm-svn: 333739	2018-06-01 10:46:00 +00:00
Pavel Labath	d6ca063907	DWARFAcceleratorTable: Add an iterator-based api for accessing names in the index Summary: Back when we were introducing the DWARF v5 name index, there was a short discussion whether we shouldn't have a nicer api for iterating over the index. At that time, I did not find it necessary since the iteration over names was done only from within the index itself (and I figured the internal implementation can deal with a slightly rough interface). However, now I ran into a use for this kind of API in LLDB (for finding all names matching a regular expression), so it looked like a nice opportunity to introduce one. To make the API more useful, I've made the NameTableEntry class a bit smarter: it now stores the string section reference (so it can return its name) and its position in the name index (mainly useful for dumping/logging). I also convert the internal users to use the new API, which also gives test coverage for the added code. Reviewers: JDevlieghere, aprantl, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47590 llvm-svn: 333738	2018-06-01 10:33:11 +00:00
Simon Dardis	ee67dcb837	[mips] Select the correct instruction for computing frameindexes Reviewers: smaksimovic, atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D47582 llvm-svn: 333736	2018-06-01 10:07:10 +00:00
Gabor Buella	27c96d3d20	NFC Avoid a warning in WasmEHPrepare.cpp ``` ../lib/CodeGen/WasmEHPrepare.cpp:166:30: warning: extra ‘;’ [-Wpedantic] false, false); ^ ``` llvm-svn: 333732	2018-06-01 07:47:46 +00:00
Sander de Smalen	97ca6b9e09	[AArch64][SVE] Asm: Support for DUPM (masked immediate) instruction. Unpredicated copy of repeating immediate pattern to SVE vector, along with MOV-aliases. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D47328 llvm-svn: 333731	2018-06-01 07:25:46 +00:00
Craig Topper	c3cf55b935	[X86][Disassembler] Make it an error to set EVEX.R' to 0 when modrm.reg encodes a GPR. This is different than the behavior of EVEX.X extending modrm.rm to 5 bits. llvm-svn: 333728	2018-06-01 06:11:29 +00:00
Craig Topper	0838c4d6bc	[X86][Disassembler] Ignore EVEX.X extension of modrm.rm to 5-bits when modrm.rm encodes a k-register. llvm-svn: 333727	2018-06-01 05:36:08 +00:00
Craig Topper	74a61b02e0	[X86][Disassembler] Clamp index to 4-bits when decoding GPR registers. A 5-bit value can occur when EVEX.X is 0 due to it being used to extend modrm.rm to encode XMM16-31. But if modrm.rm instead encodes a GPR, the Intel documentation says EVEX.X should be ignored so just mask it to 4 bits once we know it's a GPR. llvm-svn: 333725	2018-06-01 05:12:44 +00:00
Craig Topper	5b1dd01e57	[X86][Disassembler] Make sure EVEX.X is not used to extend base registers of memory operations. This was an accidental side effect of EVEX.X being used to encode XMM16-XMM31 using modrm.rm with modrm.mod==0x3. I think there are still more bugs related to this. llvm-svn: 333722	2018-06-01 04:29:34 +00:00
Craig Topper	c6b2c2bb70	[X86][Disassembler] Use a local variable instead of using a field in the instruction object. NFC llvm-svn: 333721	2018-06-01 04:29:30 +00:00
Tom Stellard	e43778895c	AMDGPU/R600: Move intrinsics to IntrinsicsAMDGPU.td Reviewers: arsenm, nhaehnle, jvesely Reviewed By: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D47487 llvm-svn: 333720	2018-06-01 02:19:46 +00:00

113980 Commits