llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	d674e0ac0d	AMDGPU: Fix failure to select branch with optnone opt-bisect/optnone disable the AMDGPUUniformAnnotateValues pass. The heuristic in the custom selector for brcond deferred the branch uniformity check to the pattern, which would fail. llvm-svn: 315360	2017-10-10 20:34:49 +00:00
Rafael Espindola	12db383e20	Convert two uses of ErrorOr to Expected. llvm-svn: 315354	2017-10-10 20:00:07 +00:00
Yaxun Liu	de4b88d9a1	[AMDGPU] Lower enqueued blocks and generate runtime metadata This patch adds a post-linking pass which replaces the function pointer of enqueued block kernel with a global variable (runtime handle) and adds runtime-handle attribute to the enqueued block kernel. In LLVM CodeGen the runtime-handle metadata will be translated to RuntimeHandle metadata in code object. Runtime allocates a global buffer for each kernel with RuntimeHandle metadata and saves the kernel address required for the AQL packet into the buffer. __enqueue_kernel function in device library knows that the invoke function pointer in the block literal is actually runtime handle and loads the kernel address from it and puts it into AQL packet for dispatching. This cannot be done in FE since FE cannot create a unique global variable with external linkage across LLVM modules. The global variable with internal linkage does not work since optimization passes will try to replace loads of the global variable with its initialization value. Differential Revision: https://reviews.llvm.org/D38610 llvm-svn: 315352	2017-10-10 19:39:48 +00:00
Jake Ehrlich	36a2eb34ed	[llvm-objcopy] Add support for removing sections This change adds support for removing sections using the -R field (as GNU objcopy does as well). This change should let us add many helpful tests and is a proper stepping stone for adding more general kinds of stripping. Differential Revision: https://reviews.llvm.org/D38260 llvm-svn: 315346	2017-10-10 18:47:09 +00:00
Jake Ehrlich	c5ff72708d	Revert "temporary" I forgot to add a proper commit message. I'm reverting this to fix that. This reverts commit r315344. llvm-svn: 315345	2017-10-10 18:32:22 +00:00
Jake Ehrlich	77ec1ffe5c	temporary llvm-svn: 315344	2017-10-10 18:28:15 +00:00
Adrian Prantl	16b8b47152	Debug Info: Fix the SDLoc propagation for a DAGCombiner rule This patch ensures that the rule: fold (zext (load x)) -> (zext (truncate (zextload x))) propagates the SDLoc of the load to the zextload. <rdar://problem/33755881> llvm-svn: 315340	2017-10-10 18:08:32 +00:00
Francis Ricci	5776f26fa1	[llvm-objdump] Disable leak checking on an llvm-objdump test Summary: This leak doesn't reproduce locally on macOS 10.12, but is causing buildbot failures. Disable leak checking until it can be fixed. Reviewers: sqlbyme, qcolombet, enderby, bruno Reviewed By: bruno Subscribers: bruno, llvm-commits Differential Revision: https://reviews.llvm.org/D38699 llvm-svn: 315337	2017-10-10 17:50:57 +00:00
Bruno Cardoso Lopes	57304923ca	Revert "[SCCP] Propagate integer range info for parameters in IPSCCP." This reverts commit r315288. This is part of fixing segfault introduced in: http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21675/ llvm-svn: 315329	2017-10-10 16:37:57 +00:00
Jacob Gravelle	37af00e7d0	[WebAssembly] Narrow the scope of WebAssemblyFixFunctionBitcasts Summary: The pass to fix function bitcasts generates thunks for functions that are called directly with a mismatching signature. It was also generating thunks in cases where the function was address-taken, causing aliasing problems in otherwise valid cases. This patch tightens the restrictions for when the pass runs. Reviewers: sunfish, dschuff Subscribers: jfb, sbc100, llvm-commits, aheejin Differential Revision: https://reviews.llvm.org/D38640 llvm-svn: 315326	2017-10-10 16:20:18 +00:00
Simon Pilgrim	053a299a9b	[X86][AVX512] Regenerate element insertion/extraction tests llvm-svn: 315322	2017-10-10 15:58:54 +00:00
Simon Dardis	96d35fe06a	[mips] Duplicate the reciprocal instruction definitions for FP32 Add instruction definitions for FP32 mode for recip.d and rsqrt.d. Previously these instructions were only defined when targeting the full 64-bit FPU model but were not guarded properly. Reviewers: nitesh.jain, atanasyan Differential Revision: https://reviews.llvm.org/D38400 llvm-svn: 315318	2017-10-10 14:41:11 +00:00
Jonas Devlieghere	aa6be823a4	Re-land "[llvm-dwarfdump] Print type names in DW_AT_type DIEs" This patch adds printing for DW_AT_type DIEs like it is already the case for DW_AT_specification DIEs. This is a rather naive approach and only a start. We should have pretty printers for different languages. Recommit after being reverted in r315299. Differential revision: https://reviews.llvm.org/D36993 llvm-svn: 315316	2017-10-10 14:15:25 +00:00
Sanjay Patel	7d52c7ca74	[x86] add tests for insertelement; NFC llvm-svn: 315312	2017-10-10 13:45:25 +00:00
Simon Dardis	a17a7b619a	[mips] Partially fix PR34391 Previously, the parsing of the 'subu $reg, ($reg,) imm' relied on a parser which also rendered the operand to the instruction. In some cases the general parser could construct an MCExpr which was not a MCConstantExpr which MipsAsmParser was expecting. Address this by altering the special handling to cope with unexpected inputs and fine-tune the handling of cases where a register name that is not available in the current ABI is regarded as not a match for the custom parser but also not as an outright error. Also enforces the binutils restriction that only constants are accepted. This partially resolves PR34391. Thanks to Ed Maste for reporting the issue! Reviewers: nitesh.jain, arichardson Differential Revision: https://reviews.llvm.org/D37476 llvm-svn: 315310	2017-10-10 13:34:45 +00:00
David Stuttard	51c1b22806	[DAGCombine] Fix for shuffle to vector extend for non power 2 vectors Summary: See https://llvm.org/PR33743 for more details It seems that for non-power of 2 vector sizes, the algorithm can produce non-matching sizes for input and result causing an assert. This usually isn't a problem as the isAnyExtend check will weed these out, but in some cases (most often with lots of undefined values for the mask indices) it can pass this check for non power of 2 vectors. Adding in an extra check that ensures that bit size will match for the result and input (as required) Subscribers: nhaehnle Differential Revision: https://reviews.llvm.org/D35241 llvm-svn: 315307	2017-10-10 12:45:45 +00:00
Oliver Stannard	30b732c942	[ARM, Asm] Harden GNU LDRD/STRD aliases against invalid inputs Previously, the code that implemented the GNU assembler aliases for the LDRD and STRD instructions (where the second register is omitted) assumed that the input was a valid instruction. This caused assertion failures for every example in ldrd-strd-gnu-bad-inst.s. This improves the code so that, if the instruction is not in the expected format, the check bails out and the asm parser is run on the unmodified instruction. It also relaxes the alias on thumb targets, so that unaligned pairs of registers can be used. The restriction that Rt must be even-numbered only applies to the ARM versions of these instructions. Differential revision: https://reviews.llvm.org/D36732 llvm-svn: 315305	2017-10-10 12:38:22 +00:00
Oliver Stannard	cd3306f62f	[ARM, Asm] Add diagnostics for floating-point register operands This adds diagnostic strings for the ARM floating-point register classes, which will be used when these classes are expected by the assembler, but the provided operand is not valid. One of these, DPR, requires C++ code to select the correct error message, as that class contains different registers depending on the FPU. The rest can all have their diagnostic strings stored in the tablegen description of them. Differential revision: https://reviews.llvm.org/D36693 llvm-svn: 315304	2017-10-10 12:35:09 +00:00
Oliver Stannard	bbad419e94	[ARM, Asm] Add diagnostics for general-purpose register operands This adds diagnostic strings for the ARM general-purpose register classes, which will be used when these classes are expected by the assembler, but the provided operand is not valid. One of these, rGPR, requires C++ code to select the correct error message, as that class contains different registers in pre-v8 and v8 targets. The rest can all have their diagnostic strings stored in the tablegen description of them. Differential revision: https://reviews.llvm.org/D36692 llvm-svn: 315303	2017-10-10 12:31:53 +00:00
Nicolai Haehnle	312b64f4d7	AMDGPU: Split MUBUF offset into aligned components Summary: Atomic buffer operations do not work (and trap on gfx9) when the components are unaligned, even if their sum is aligned. Previously, we generated an offset of 4156 without an SGPR by splitting it as 4095 + 61 (immediate + inline constant). The highest offset for which we can do this correctly is 4156 = 4092 + 64. Fixes dEQP-GLES31.functional.ssbo.atomic.* Reviewers: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D37850 llvm-svn: 315302	2017-10-10 12:22:23 +00:00
Jonas Devlieghere	5b0f885691	Revert "[llvm-dwarfdump] Print type names in DW_AT_type DIEs" This reverts commit r315297. llvm-svn: 315299	2017-10-10 11:49:56 +00:00
Jonas Devlieghere	2eb95c33f6	[llvm-dwarfdump] Print type names in DW_AT_type DIEs This patch adds printing for DW_AT_type DIEs like it is already the case for DW_AT_specification DIEs. This is a rather naive approach and only a start. We should have pretty printers for different languages. Differential revision: https://reviews.llvm.org/D36993 llvm-svn: 315297	2017-10-10 11:24:41 +00:00
Oliver Stannard	29ffd3f1d9	[AsmParser] Add DiagnosticString to register classes in tablegen This allows a DiagnosticType and/or DiagnosticString to be associated with a RegisterClass in tablegen, so that we can emit diagnostics in the assembler when a register operand is incorrect. DiagnosticType creates a predictable enum value, which gets returned as the error code when an operand does not match, and can be used by the assembly parser to map to a user-facing diagnostic. DiagnosticString creates an anonymous enum value (currently based on the tablegen class name), and a function to map from enum values to strings will be generated. Both of these work the same way as they do for AsmOperand. This isn't used by any targets yet, but has one (positive) side-effect. It improves the diagnostic codes returned by validateOperandClass - we always want to emit the diagnostic that relates to the expected operand class, but this wasn't always being done when the expected and actual classes were completely different (token/register/custom). This causes a few AArch64 diagnostics to be improved, as Match_InvalidOperand was being returned instead of a specific diagnostic type. Differential revision: https://reviews.llvm.org/D36691 llvm-svn: 315295	2017-10-10 11:00:40 +00:00
Gadi Haber	2b132eb4f8	[X86][SKYLAKE] Update regression test to differentiate between HASWELL and SKYLAKE scheduling.<NFC> NFC. Updated 6 regression tests to differentiate between HASWELL and SKYLAKE scheduling information. The fix is in preparation of a patch to update the information of the Skylake Client scheduling to include the appropriate load and store latencies. Reviewers: zvi, RKSimon Differential Revision: https://reviews.llvm.org/D38685 Change-Id: Ifc6b98d9eaf266913698f24c766fd994fc977555 llvm-svn: 315291	2017-10-10 09:53:18 +00:00
Florian Hahn	22a44bca40	[SCCP] Propagate integer range info for parameters in IPSCCP. Summary: This updates the SCCP solver to use of the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to `ret i32 2` with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 315288	2017-10-10 09:32:38 +00:00
Nemanja Ivanovic	7bf866eb10	Fix for PR34888. The issue is that we assume operand zero of the input to the add instruction is a register. In this case, the input comes from inline assembly and operand zero is not a register thereby causing a crash. The code will bail anyway if the input instruction doesn't have the right opcode. So do that check first and let short-circuiting prevent the crash. llvm-svn: 315285	2017-10-10 08:46:10 +00:00
Clement Courbet	e2e8a5c496	Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." (fixed stability issues) This reverts commit d6492333d3b478a1d88163315002022f8d5e58dc. llvm-svn: 315281	2017-10-10 08:00:45 +00:00
Craig Topper	a88306e6fb	[AVX512] Add patterns to commute integer comparison instructions during isel. This enables broadcast loads to be commuted and allows normal loads to be folded without the peephole pass. llvm-svn: 315274	2017-10-10 06:36:46 +00:00
Xinliang David Li	4cdc9dab0a	Re-enable r314928: Eliminate inttype phi with inttoptr/ptrtoint. This version fixed a bug in finding the matching phi -- the order of the incoming blocks may be different (triggered in self build on Windows). A new test case is added. llvm-svn: 315272	2017-10-10 05:07:54 +00:00
Reid Kleckner	97a2d5c42f	[MC] Properly diagnose badly scoped .cfi_ directives Removes two report_fatal_errors. Implement this by removing EmitCFICommon, and do the checking in getCurrentDwarfFrameInfo. Have the callers check for null before dereferencing it. llvm-svn: 315264	2017-10-10 01:49:21 +00:00
Reid Kleckner	78eb8b912f	Give a test a triple llvm-svn: 315263	2017-10-10 01:34:31 +00:00
Reid Kleckner	e52d1e6787	[SEH] Use reportError instead of report_fatal_error for bad directives This makes the .seh_ directives slightly more usable from standalone assembly files. This removes a large number of report_fatal_errors and recovers from the error by ignoring the directive. llvm-svn: 315262	2017-10-10 01:26:25 +00:00
Reid Kleckner	ab23dace56	[MC] Suppress .Lcfi labels when emitting textual assembly Summary: This suppresses the generation of .Lcfi labels in our textual assembler. It was annoying that this generated cascading .Lcfi labels: llc foo.ll -o - \| llvm-mc \| llvm-mc After three trips through MCAsmStreamer, we'd have three labels in the output when none are necessary. We should only bother creating the labels and frame data when making a real object file. This supersedes D38605, which moved the entire .seh_ implementation into MCObjectStreamer. This has the advantage that we do more checking when emitting textual assembly, at a minor efficiency cost. Outputting textual assembly is not performance critical, so this shouldn't matter. Reviewers: majnemer, MatzeB Subscribers: qcolombet, nemanjai, javed.absar, eraman, hiraditya, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D38638 llvm-svn: 315259	2017-10-10 00:57:36 +00:00
Aditya Nandakumar	c3bfc81a1f	[GISel]: Fix generation of illegal COPYs during CallLowering We end up creating COPY's that are either truncating/extending and this should be illegal. https://reviews.llvm.org/D37640 Patch for X86 and ARM by igorb, rovka llvm-svn: 315240	2017-10-09 20:07:43 +00:00
Zvi Rackover	c1d5955684	[X86] Unsigned saturation subtraction canonicalization [the backend part] Summary: On behalf of julia.koval@intel.com The patch transforms the canonical version of unsigned saturation, which is sub(max(a,b),a) or sub(a,min(a,b)), to the special psubus instruction on targets which support it (8bit and 16bit uints). umax(a,b) - b -> subus(a,b) a - umin(a,b) -> subus(a,b) There is also an extra case handled, when the right part of sub is 32 bit and can be truncated, using UMIN (this transformation was discussed in https://reviews.llvm.org/D25987). The example of special case code: ``` void foo(unsigned short p, int max, int n) { int i; unsigned m; for (i = 0; i < n; i++) { m = --p; p = (unsigned short)(m >= max ? m-max : 0); } } ``` Max in this example is truncated to max_short value, if it is greater than m, or just truncated to 16 bit, if it is not. It is a valid transformation, because if max > max_short, result of the expression will be zero. Here is the table of types, I try to support, special case items are bold: \| Size \| 128 \| 256 \| 512 \| ----- \| ----- \| ----- \| ----- \| i8 \| v16i8 \| v32i8 \| v64i8 \| i16 \| v8i16 \| v16i16 \| v32i16 \| i32 \| \| v8i32* \| v16i32 \| i64 \| \| \| v8i64 Reviewers: zvi, spatel, DavidKreitzer, RKSimon Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37534 llvm-svn: 315237	2017-10-09 20:01:10 +00:00
Alexey Bataev	6ab4d075ff	[SLP] Add test for reversed load, NFC. llvm-svn: 315232	2017-10-09 19:08:15 +00:00
Daniel Sanders	4d4e7650dc	[globalisel] Add support for ValueType operands in patterns. It's rare but there are a small number of patterns like this: (set i64:$dst, (add i64:$src1, i64:$src2)) These should be equivalent to register classes except they shouldn't check for a specific register bank. This doesn't occur in AArch64/ARM/X86 but does occasionally come up in other in-tree targets such as BPF. llvm-svn: 315226	2017-10-09 18:14:53 +00:00
Francis Ricci	01ab402463	[dsymutil] Emit valid debug locations when no symbol flags are set Summary: swiftc emits symbols without flags set, which led dsymutil to ignore them when searching for global symbols, causing dwarf location data to be omitted. Xcode's dsymutil handles this case correctly, and emits valid location data. Add this functionality to llvm-dsymutil by allowing parsing of symbols with no flags set. Reviewers: aprantl, friss, JDevlieghere Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38587 llvm-svn: 315218	2017-10-09 17:27:47 +00:00
Alexey Bataev	aadc2331e4	[SLP] Test for wrongly vectorized set of extractelements, NFC. llvm-svn: 315217	2017-10-09 17:14:03 +00:00
Zachary Turner	bd3a9dbabb	[llvm-rc] Have the tokenizer discard single & block comments. This allows rc files to have comments. Eventually we should just use clang's c preprocessor, but that's a bit larger effort for minimal gain, and this is straightforward. Differential Revision: https://reviews.llvm.org/D38651 llvm-svn: 315207	2017-10-09 15:46:13 +00:00
Sanjay Patel	2a61a821a0	[DAG] combine assertsexts around a trunc This was a suggested follow-up to: D37017 / https://reviews.llvm.org/rL313577 llvm-svn: 315206	2017-10-09 15:22:20 +00:00
Amara Emerson	24ca39ce71	[AArch64] Improve codegen for inverted overflow checking intrinsics E.g. if we have a (xor(overflow-bit), 1) where overflow-bit comes from an intrinsic like llvm.sadd.with.overflow then we can kill the xor and use the inverted condition code for the CSEL. rdar://28495949 Reviewed By: kristof.beyls Differential Revision: https://reviews.llvm.org/D38160 llvm-svn: 315205	2017-10-09 15:15:09 +00:00
Sanjay Patel	8557e29408	[x86] regenerate test checks; NFC llvm-svn: 315204	2017-10-09 15:01:58 +00:00
Sanjay Patel	be37ab864c	[AArch64] fix typos in test assertions llvm-svn: 315203	2017-10-09 01:29:54 +00:00
Craig Topper	4f8656a7af	[X86] Enable extended comparison predicate support for SETUEQ/SETONE when targeting AVX instructions. We believe that, despite AMD's documentation, they really do support all 32 comparison predicates under AVX. Differential Revision: https://reviews.llvm.org/D38609 llvm-svn: 315201	2017-10-09 01:05:15 +00:00
Simon Pilgrim	135a2639f4	[X86][SSE] Add test case for PR27708 llvm-svn: 315186	2017-10-08 19:18:10 +00:00
Craig Topper	977c546b0c	[X86] Regenerate fast-isel-select-pseudo-cmov.ll to prepare for D38609. llvm-svn: 315184	2017-10-08 17:54:50 +00:00
Simon Pilgrim	dc32c844f9	[X86] getTargetConstantBitsFromNode - add support for decoding scalar constants llvm-svn: 315182	2017-10-08 17:21:18 +00:00
Craig Topper	c97775c03c	[X86] Prefer MOVSS/SD over BLENDI during legalization. Remove BLENDI versions of scalar arithmetic patterns Summary: We currently disable some converting of shuffles to MOVSS/MOVSD during legalization if SSE41 is enabled. But later during shuffle combining we go back to preferring MOVSS/MOVSD. Additionally we have patterns that look for BLENDIs to detect scalar arithmetic operations. I believe due to the combining using MOVSS/MOVSD these are unnecessary. Interestingly, we still codegen blend instructions even though lowering/isel emit movss/movsd instructions. Turns out machine CSE commutes them to blend, and then commuting those blends back into blends that are equivalent to the original movss/movsd. This patch fixes the inconsistency in legalization to prefer MOVSS/MOVSD. The one test change was caused by this change. The problem is that we have integer types and are mostly selecting integer instructions except for the shufps. This shufps forced the execution domain, but the vpblendw couldn't have its domain changed with a naive instruction swap. We could fix this by special casing VPBLENDW based on the immediate to widen the element type. The rest of the patch is removing all the excess scalar patterns. Long term we should probably add isel patterns to make MOVSS/MOVSD emit blends directly instead of relying on the double commute. We may also want to consider emitting movss/movsd for optsize. I also wonder if we should still use the VEX encoded blendi instructions even with AVX512. Blends have better throughput, and that may outweigh the register constraint. Reviewers: RKSimon, zvi Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38023 llvm-svn: 315181	2017-10-08 16:57:23 +00:00
Amara Emerson	1f83777bd3	[AArch64][GlobalISel] Add a test case for G_PHI of p0 instruction selection. llvm-svn: 315179	2017-10-08 15:29:35 +00:00
Amara Emerson	e205cab66f	[AArch64][GlobalISel] Add a test case for G_PHI of p0 regbank selection. llvm-svn: 315178	2017-10-08 15:29:31 +00:00
Amara Emerson	1cd89ca669	[AArch64][GlobalISel] Make G_PHI of p0 types legal. Differential Revision: https://reviews.llvm.org/D38621 llvm-svn: 315177	2017-10-08 15:29:11 +00:00
Simon Pilgrim	6410ca70aa	[X86][XOP] Add XOP oddshuffles tests XOP codegen is often different to generic AVX - thank you vpperm! llvm-svn: 315176	2017-10-08 12:58:15 +00:00
Gadi Haber	684944b822	[X86][SKX] Adding the scheduling information for the SKX target. Adding the scheduling information for the SkylakeServer (SKX) target. This patch adds the instruction scheduling information for the SkylakeServer (SKX) architecture target by adding the file X86SchedSkylakeServer.td located under the X86 Target. We used the scheduling information retrieved from the Skylake architects in order to create the file. The scheduling information includes latency, number of micro-Ops and used ports by each SKL instruction. The patch continues the scheduling replacement and insertion effort started with the SNB target in r310792, the HSW target in r311879 and the SkylakeClient (SKL) target in rL313613. Please expect some performance fluctuations due to code alignment effects. Reviewers: zvi, RKSimon, craig.topper, chandlerc, aymanmu Differential Revision: https://reviews.llvm.org/D38443 Change-Id: I5c228fcc09e9e5a99b6116e62b356c4f9b971185 llvm-svn: 315175	2017-10-08 12:52:54 +00:00
Craig Topper	bbca2f2978	[X86] Stop LowerSIGN_EXTEND_AVX512 from creating v8i16/v16i16/v16i8 vselects with a v8i1/v16i1 condition when BWI is not available. Some of the tests in vector-shuffle-v1.ll would get into an infinite loop without this. llvm-svn: 315172	2017-10-08 08:50:59 +00:00
Craig Topper	27170fee8d	[X86] If we see an insert of a bitcast into zero vector, canonicalize it to move the bitcast to the other side of the insert. This improves detection of zeroing of upper bits during isel. llvm-svn: 315161	2017-10-08 01:33:41 +00:00
Simon Pilgrim	9508fe7924	[X86][SSE] Match bitcasted BUILD_VECTOR of constants for v2i64 shifts on 64-bit targets (PR34855) Extension to rL315155, generate constant shifts on 64-bits as well as 32-bits. llvm-svn: 315156	2017-10-07 17:57:22 +00:00
Simon Pilgrim	70e1db78db	[X86][SSE] Match bitcasted v4i32 BUILD_VECTORS for v2i64 shifts on 64-bit targets (PR34855) We were already doing this for 32-bit targets, but we can generate these on 64-bits as well. llvm-svn: 315155	2017-10-07 17:42:17 +00:00
Craig Topper	2f60295364	[X86] Add X86ISD::CMOV to computeKnownBitsForTargetNode and ComputeNumSignBitsForTargetNode. Summary: Implementations based on ISD::SELECT. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38663 llvm-svn: 315153	2017-10-07 16:51:19 +00:00
Sanjay Patel	865e6c852b	[InstSimplify] add tests to show we can do better at folding poison; NFC llvm-svn: 315152	2017-10-07 15:39:06 +00:00
Simon Pilgrim	73f143e774	[X86][SSE] Improve shuffling combining with horizontal operations Recognise cases when we can merge the shuffles with their horizontal (HADD/HSUB/PACK) instruction inputs. Replaces an older implementation which performed some of this during lowering, expanding an existing target shuffle combine stage instead. Differential Revision: https://reviews.llvm.org/D38506 llvm-svn: 315150	2017-10-07 12:42:23 +00:00
Jessica Paquette	13593843f6	[MachineOutliner] Disable outlining from LinkOnceODRs by default Say you have two identical linkonceodr functions, one in M1 and one in M2. Say that the outliner outlines A,B,C from one function, and D,E,F from another function (where letters are instructions). Now those functions are not identical, and cannot be deduped. Locally to M1 and M2, these outlining choices would be good-- to the whole program, however, this might not be true! To mitigate this, this commit makes it so that the outliner sees linkonceodr functions as unsafe to outline from. It also adds a flag, -enable-linkonceodr-outlining, which allows the user to specify that they want to outline from such functions when they know what they're doing. Changing this handles most code size regressions in the test suite caused by competing with linker dedupe. It also doesn't have a huge impact on the code size improvements from the outliner. There are 6 tests that regress > 5% from outlining WITH linkonceodrs to outlining WITHOUT linkonceodrs. Overall, most tests either improve or are not impacted. Not outlined vs outlined without linkonceodrs: https://hastebin.com/raw/qeguxavuda Not outlined vs outlined with linkonceodrs: https://hastebin.com/raw/edepoqoqic Outlined with linkonceodrs vs outlined without linkonceodrs: https://hastebin.com/raw/awiqifiheb Numbers generated using compare.py with -m size.__text. Tests run for AArch64 with -Oz -mllvm -enable-machine-outliner -mno-red-zone. llvm-svn: 315136	2017-10-07 00:16:34 +00:00
Sanjay Patel	72d339abb7	[InstCombine] use correct type when propagating constant condition in simplifyDivRemOfSelectWithZeroOp (PR34856) llvm-svn: 315130	2017-10-06 23:43:06 +00:00
Cameron McInally	9d64101fe8	[AVX512] Fix TERNLOG when folding broadcast Patch to fix ternlog instructions with a folded broadcast. The broadcast decorator, e.g. {1toX}, was missing. Differential Revision: https://reviews.llvm.org/D38649 llvm-svn: 315122	2017-10-06 22:31:29 +00:00
Jonas Devlieghere	f2fa9ebe3f	[dwarfdump] Verify that unit type matches root DIE This patch adds two new verifiers: - It checks that the root DIE of a CU is actually a valid unit DIE. (based on its tag) - For DWARF5 which contains a unit type in the CU header, it checks that this matches the type of the unit DIE. Differential revision: https://reviews.llvm.org/D38453 llvm-svn: 315121	2017-10-06 22:27:31 +00:00
Zachary Turner	a92eb33ad5	[llvm-rc] Implement escape sequences in .rc files. This allows the escape sequences (\a, \n, \r, \t, \\, \x[0-9a-f], \[0-7], "") to appear in .rc scripts. These are parsed and output in the same way as it's done in original MS implementation. The way these sequences are processed depends on the type of the resource it resides in, and on whether the user declared the string to be "wide" or "narrow". I tried to maintain the maximum compatibility with the original tool (and fail in some erroneous situations that are accepted by .rc). However, there are some (extremely rare) cases where Microsoft tool outputs nonsense. I found it infeasible to detect such cases. Patch by Marek Sokolowski Differential Revision: https://reviews.llvm.org/D38426 llvm-svn: 315118	2017-10-06 22:05:15 +00:00
Zachary Turner	9d8b358a49	[llvm-rc] Serialize user-defined resources to .res files. This allows rc to serialize user-defined resources, as documented at: msdn.microsoft.com/en-us/library/windows/desktop/aa381054.aspx Escape sequences are yet unavailable, and are to be added in one of child patches. Patch by: Marek Sokolowski Differential Revision: https://reviews.llvm.org/D38423 llvm-svn: 315117	2017-10-06 21:52:15 +00:00
Zachary Turner	da366693bf	[llvm-rc] Serialize STRINGTABLE statements to .res file. This allows llvm-rc to serialize STRINGTABLE resources. These are output in an unusual way: we locate them at the end of the file, and strings are merged into bundles of max 16 strings, depending on their IDs, language, and characteristics. Ref: msdn.microsoft.com/en-us/library/windows/desktop/aa381050.aspx Patch by: Marek Sokolowski Differential Revision: https://reviews.llvm.org/D38420 llvm-svn: 315112	2017-10-06 21:30:55 +00:00
Zachary Turner	07bc04ff38	[llvm-rc] Serialize VERSIONINFO resources to .res files. This is now able to dump VERSIONINFO resources. Ref: msdn.microsoft.com/en-us/library/windows/desktop/aa381058.aspx Differential Revision: https://reviews.llvm.org/D38410 Patch by: Marek Sokolowski llvm-svn: 315110	2017-10-06 21:26:06 +00:00
Zachary Turner	c3ab013aa1	[llvm-rc] Serialize CURSOR and ICON resources to .res This is part 6 of llvm-rc serialization. This adds ability to output cursors and icons as resources. Unfortunately, we can't just copy .cur or .ico files to output - as each file might contain multiple images, each of them needs to be unpacked and stored as a separate resource. This forces us to parse cursor and icon contents. (Fortunately, these formats are pretty similar and can be processed by mostly common code). As test files are binary, here is a short explanation of .cur and .ico files stored: cursor.cur, cursor-8.cur, cursor-32.cur are sample correct cursor files, differing in their bit depth. icon-old.ico, icon-new.ico are sample correct icon files; icon-png.ico is a sample correct icon file in PNG format (instead of usual BMP); cursor-eof.cur is an incorrect cursor file - this is cursor.cur with some of its final bytes removed. cursor-bad-offset.cur is an incorrect cursor file - image header states that image data begins at offset 0xFFFFFFFF. Sample correct cursors and icons were created by Nico Weber. Patch by Marek Sokolowski Differential Revision: https://reviews.llvm.org/D37878 llvm-svn: 315109	2017-10-06 21:25:44 +00:00
Reid Kleckner	b6b210e61f	Revert "Roll forward r314928" This appears to be miscompiling Clang, as shown on two Windows bootstrap bots: http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/7611 http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/6870 Nothing else is in the blame list. Both emit errors on this valid code in the Windows ucrt headers: C:\...\ucrt\malloc.h:95:32: error: invalid operands to binary expression ('char ' and 'int') _Ptr = (char)_Ptr + _ALLOCA_S_MARKER_SIZE; ~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~ I am attempting to reproduce this now. This reverts r315044 llvm-svn: 315108	2017-10-06 21:17:51 +00:00
Zachary Turner	420090af89	[llvm-rc] Add optional serialization support for DIALOG(EX) resources. This is part 5 of llvm-rc serialization support. This allows DIALOG and DIALOGEX to serialize if dialog-specific optional statements are provided. These are (as of now): CAPTION, FONT, and STYLE. Notably, FONT statement can take more than two arguments when describing DIALOGEX resources (as in msdn.microsoft.com/en-us/library/windows/desktop/aa381013.aspx). I made some changes to the parser to reflect this fact. Patch by Marek Sokolowski Differential Revision: https://reviews.llvm.org/D37864 llvm-svn: 315104	2017-10-06 20:51:20 +00:00
Adrian Prantl	59f30b8874	llvm-dwarfdump: Add an option to collect debug info quality metrics. At the last LLVM dev meeting we had a debug info for optimized code BoF session. In that session I presented some graphs that showed how the quality of the debug info produced by LLVM changed over the last couple of years. This is a cleaned up version of the patch I used to collect the this data. It is implemented as an extension of llvm-dwarfdump, adding a new --statistics option. The intended use-case is to automatically run this on the debug info produced by, e.g., our bots, to identify eyebrow-raising changes or regressions introduced by new transformations that we could act on. In the current form, two kinds of data are being collected: - The number of variables that have a debug location versus the number of variables in total (this takes into account inlined instances of the same function, so if a variable is completely missing form only one instance it will be found). - The PC range covered by variable location descriptions versus the PC range of all variables' containing lexical scopes. The output format is versioned and extensible, so I'm looking forward to both bug fixes and ideas for other data that would be interesting to track. Differential Revision: https://reviews.llvm.org/D36627 llvm-svn: 315101	2017-10-06 20:24:34 +00:00
Amara Emerson	ba0f79af92	[GlobalISel] Fix legalizer trying to process a deleted instruction. In some cases an instruction is deleted from the block during combining, however it can still exist in the legalizer worklist. This change modifies the combiner routines to add the given MI to the dead instruction list rather than trying to remove it from the block themselves. The responsibility is then on the caller to delete the instruction from the block and any worklists. Differential Revision: https://reviews.llvm.org/D38622 llvm-svn: 315092	2017-10-06 19:24:15 +00:00
Francis Ricci	85255eda91	Revert "[dsymutil] Emit valid debug locations when no symbol flags are set" This reverts commit r315082, which fails on non-darwin buildbots. llvm-svn: 315088	2017-10-06 18:19:52 +00:00
Saleem Abdulrasool	46a59fdab6	Bitcode: add an auto-upgrade for LTO section name The bitcode reader looks specifically for `__DATA, __objc_catlist` as a section name. However, SVN r304661 removed the spaces (the two names are functionally equivalent but do not compare equally lexicographically). This causes compatibility issues. Add an auto-upgrade path for removing the spaces as well as use the new name in the LTO plugin. llvm-svn: 315086	2017-10-06 18:06:59 +00:00
Zachary Turner	96b04b68ed	[lit] Improve tool substitution in lit. This addresses two sources of inconsistency in test configuration files. 1. Substitution boundaries. Previously you would specify a substitution, such as 'lli', and then additionally a set of characters that should fail to match before and after the tool. This was used, for example, so that matches that are parts of full paths would not be replaced. But not all tools did this, and those that did would often re-invent the set of characters themselves, leading to inconsistency. Now, every tool substitution defaults to using a sane set of reasonable defaults and you have to explicitly opt out of it. This actually fixed a few latent bugs that were never being surfaced, but only on accident. 2. There was no standard way for the system to decide how to locate a tool. Sometimes you have an explicit path, sometimes we would search for it and build up a path ourselves, and sometimes we would build up a full command line. Furthermore, there was no standardized way to handle missing tools. Do we warn, fail, ignore, etc? All of this is now encapsulated in the ToolSubst class. You either specify an exact command to run, or an instance of FindTool('<tool-name>') and everything else just works. Furthermore, you can specify an action to take if the tool cannot be resolved. Differential Revision: https://reviews.llvm.org/D38565 llvm-svn: 315085	2017-10-06 17:54:46 +00:00
Zachary Turner	c981448063	Run pyformat on lit code. llvm-svn: 315084	2017-10-06 17:54:27 +00:00
Diana Picus	57285d7a43	[ARM] GlobalISel: Make tests less strict These are intended as integration tests, so they shouldn't be too specific about what they're checking. llvm-svn: 315083	2017-10-06 17:47:27 +00:00
Francis Ricci	b468fd64f9	[dsymutil] Emit valid debug locations when no symbol flags are set Summary: swiftc emits symbols without flags set, which led dsymutil to ignore them when searching for global symbols, causing dwarf location data to be omitted. Xcode's dsymutil handles this case correctly, and emits valid location data. Add this functionality to llvm-dsymutil by allowing parsing of symbols with no flags set. Reviewers: aprantl, friss, JDevlieghere Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38587 llvm-svn: 315082	2017-10-06 17:43:37 +00:00
Stanislav Mekhanoshin	de42c29a68	[AMDGPU] New 64 bit div/rem expansion Old expansion was 20 VGPRs, 78 SGPRs and ~380 instructions. This expansion is 11 VGPRs, 12 SGPRs and ~120 instructions. Passes OpenCL conformance test_integer_ops quick_[u]long_math Differential Revision: https://reviews.llvm.org/D38607 llvm-svn: 315081	2017-10-06 17:24:45 +00:00
Dehao Chen	9bd60429e2	Directly return promoted direct call instead of rely on stripPointerCast. Summary: stripPointerCast is not reliably returning the value that's being type-casted. Instead it may look further at function attributes to further propagate the value. Instead of relying on stripPOintercast, the more reliable solution is to directly use the pointer to the promoted direct call. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D38603 llvm-svn: 315077	2017-10-06 17:04:55 +00:00
Diana Picus	e393bc72ee	[ARM] GlobalISel: Select shifts Unfortunately TableGen doesn't handle this yet: Unable to deduce gMIR opcode to handle Src (which is a leaf). Just add some temporary hand-written code to generate the proper MOVsr. llvm-svn: 315071	2017-10-06 15:39:16 +00:00
Diana Picus	a81a4b17e5	[ARM] GlobalISel: Map shift operands to GPRs llvm-svn: 315067	2017-10-06 14:52:43 +00:00
Francis Ricci	8aedfde298	[llvm-dsymutil] Add support for __swift_ast MachO DWARF section Summary: Xcode's dsymutil emits a __swift_ast DWARF section, which is required for debugging, and which contains a byte-for-byte dump of the swiftmodule file. Add this feature to llvm-dsymutil. Tested with `gobjdump --dwarf=info -s`, by verifying that the contents of `__DWARF.__swift_ast` match between Xcode's dsymutil and llvm-dsymutil (Xcode's dwarfdump and llvm-dwarfdump don't currently recognize the __swift_ast section). Reviewers: aprantl, friss Subscribers: llvm-commits, JDevlieghere Differential Revision: https://reviews.llvm.org/D38504 llvm-svn: 315066	2017-10-06 14:49:20 +00:00
Diana Picus	2c95730450	[ARM] GlobalISel: Mark shifts as legal for s32 The new legalize combiner introduces shifts all over the place, so we should support them sooner rather than later. llvm-svn: 315064	2017-10-06 14:30:05 +00:00
Jonas Paulsson	c63ed222b8	[SystemZ] Enable machine scheduler. The machine scheduler (before register allocation) is enabled by default for SystemZ. The SelectionDAG scheduling preference now becomes source order scheduling (was regpressure). Review: Ulrich Weigand https://reviews.llvm.org/D37977 llvm-svn: 315063	2017-10-06 13:59:28 +00:00
Clement Courbet	028d2eb671	[MergeICmp][NFC] Make test tuple-four-int8.ll more readable. llvm-svn: 315062	2017-10-06 13:45:16 +00:00
Simon Pilgrim	a29dbdf2ca	[X86][SSE] Add SKX cpu tests to SSE/AVX scheduling tests (D38443) llvm-svn: 315061	2017-10-06 13:40:29 +00:00
Clement Courbet	d12c189e2e	Revert "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." Still a few stability issues on windows. This reverts commit 67e3db9bc121ba244e20337aabc7cf341a62b545. llvm-svn: 315058	2017-10-06 13:02:24 +00:00
Clement Courbet	4e1bae8136	Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." (fixed unit tests by making comparisons stable) This reverts commit 1b2d359ce256fd6737da4e93833346a0bd6d7583. llvm-svn: 315056	2017-10-06 12:12:35 +00:00
Xinliang David Li	bcd36f7c5a	Roll forward r314928 Fixed ThinLTO bootstrap failure : track new bitcast per incomingVal. Added new tests. llvm-svn: 315044	2017-10-06 05:15:25 +00:00
Francis Ricci	b4e77d98ed	Revert "[llvm-dsymutil] Add support for __swift_ast MachO DWARF section" Breaks aarch64 builders This reverts commit r315014. llvm-svn: 315034	2017-10-05 23:09:17 +00:00
Peter Collingbourne	715bcfe0c9	ModuleUtils: Stop using comdat members to generate unique module ids. It is possible for two modules to define the same set of external symbols without causing a duplicate symbol error at link time, as long as each of the symbols is a comdat member. So we cannot use them as part of a unique id for the module. Differential Revision: https://reviews.llvm.org/D38602 llvm-svn: 315026	2017-10-05 21:54:53 +00:00
Derek Schuff	885dc59297	[WebAssembly] Add the rest of the atomic loads Add extending loads and constant offset patterns A bit more refactoring of the tablegen to make the patterns fairly nice and uniform between the regular and atomic loads. Differential Revision: https://reviews.llvm.org/D38523 llvm-svn: 315022	2017-10-05 21:18:42 +00:00
Sanjay Patel	7ac2db6a48	[InstCombine] improve folds for icmp gt/lt (shr X, C1), C2 We can always eliminate the shift in: icmp gt/lt (shr X, C1), C2 --> icmp gt/lt X, C' This patch was supposed to just be an efficiency improvement because we were doing this 3-step process to fold: IC: Visiting: %c = icmp ugt i4 %s, 1 IC: ADD: %s = lshr i4 %x, 1 IC: ADD: %1 = udiv i4 %x, 2 IC: Old = %c = icmp ugt i4 %1, 1 New = <badref> = icmp uge i4 %x, 4 IC: ADD: %c = icmp uge i4 %x, 4 IC: ERASE %2 = icmp ugt i4 %1, 1 IC: Visiting: %c = icmp uge i4 %x, 4 IC: Old = %c = icmp uge i4 %x, 4 New = <badref> = icmp ugt i4 %x, 3 IC: ADD: %c = icmp ugt i4 %x, 3 IC: ERASE %2 = icmp uge i4 %x, 4 IC: Visiting: %c = icmp ugt i4 %x, 3 IC: DCE: %1 = udiv i4 %x, 2 IC: ERASE %1 = udiv i4 %x, 2 IC: DCE: %s = lshr i4 %x, 1 IC: ERASE %s = lshr i4 %x, 1 IC: Visiting: ret i1 %c When we could go directly to canonical icmp form: IC: Visiting: %c = icmp ugt i4 %s, 1 IC: Old = %c = icmp ugt i4 %s, 1 New = <badref> = icmp ugt i4 %x, 3 IC: ADD: %c = icmp ugt i4 %x, 3 IC: ERASE %1 = icmp ugt i4 %s, 1 IC: ADD: %s = lshr i4 %x, 1 IC: DCE: %s = lshr i4 %x, 1 IC: ERASE %s = lshr i4 %x, 1 IC: Visiting: %c = icmp ugt i4 %x, 3 ...but then I noticed that the folds were incomplete too: https://godbolt.org/g/aB2hLE Here are attempts to prove the logic with Alive: https://rise4fun.com/Alive/92o Name: lshr_ult Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr i8 %x, C1 %r = icmp ult i8 %sh, C2 => %r = icmp ult i8 %x, (C2 << C1) Name: ashr_slt Pre: ((C2 << C1) >> C1) == C2 %sh = ashr i8 %x, C1 %r = icmp slt i8 %sh, C2 => %r = icmp slt i8 %x, (C2 << C1) Name: lshr_ugt Pre: (((C2+1) << C1) u>> C1) == (C2+1) %sh = lshr i8 %x, C1 %r = icmp ugt i8 %sh, C2 => %r = icmp ugt i8 %x, ((C2+1) << C1) - 1 Name: ashr_sgt Pre: (C2 != 127) && ((C2+1) << C1 != -128) && (((C2+1) << C1) >> C1) == (C2+1) %sh = ashr i8 %x, C1 %r = icmp sgt i8 %sh, C2 => %r = icmp sgt i8 %x, ((C2+1) << C1) - 1 Name: ashr_exact_sgt Pre: ((C2 << C1) >> C1) == C2 %sh = ashr exact i8 %x, C1 %r = icmp sgt i8 %sh, C2 => %r = icmp sgt i8 %x, (C2 << C1) Name: ashr_exact_slt Pre: ((C2 << C1) >> C1) == C2 %sh = ashr exact i8 %x, C1 %r = icmp slt i8 %sh, C2 => %r = icmp slt i8 %x, (C2 << C1) Name: lshr_exact_ugt Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr exact i8 %x, C1 %r = icmp ugt i8 %sh, C2 => %r = icmp ugt i8 %x, (C2 << C1) Name: lshr_exact_ult Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr exact i8 %x, C1 %r = icmp ult i8 %sh, C2 => %r = icmp ult i8 %x, (C2 << C1) We did something similar for 'shl' in D28406. Differential Revision: https://reviews.llvm.org/D38514 llvm-svn: 315021	2017-10-05 21:11:49 +00:00
Francis Ricci	6ae88262a8	[dsymutil] Fix typo in swift-ast.test llvm-svn: 315017	2017-10-05 20:16:16 +00:00
Dehao Chen	16f01fb1db	Annotate VP prof on indirect call if it is ICPed in the profiled binary. Summary: In SamplePGO, when an indirect call is promoted in the profiled binary, before profile annotation, it will be promoted and inlined. For the original indirect call, the current implementation will not mark VP profile on it. This is an issue when profile becomes stale. This patch annotates VP prof on indirect calls during annotation. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D38477 llvm-svn: 315016	2017-10-05 20:15:29 +00:00
Francis Ricci	2b513b5c99	[llvm-dsymutil] Add support for __swift_ast MachO DWARF section Summary: Xcode's dsymutil emits a __swift_ast DWARF section, which is required for debugging, and which contains a byte-for-byte dump of the swiftmodule file. Add this feature to llvm-dsymutil. Tested with `gobjdump --dwarf=info -s`, by verifying that the contents of `__DWARF.__swift_ast` match between Xcode's dsymutil and llvm-dsymutil (Xcode's dwarfdump and llvm-dwarfdump don't currently recognize the __swift_ast section). Reviewers: aprantl, friss Subscribers: llvm-commits, JDevlieghere Differential Revision: https://reviews.llvm.org/D38504 llvm-svn: 315014	2017-10-05 20:03:01 +00:00
Rafael Espindola	42eb1f2ba9	Added phdr upper bound checks to ElfObject. Ensure the program_headers call will fail correctly if the program headers are larger than the underlying buffer. Patch by Parker Thompson! llvm-svn: 315012	2017-10-05 20:01:32 +00:00

48136 Commits