llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	2a26a286db	AMDGPU/GlobalISel: Make f64 constants legal llvm-svn: 326101	2018-02-26 17:20:43 +00:00
Sanjay Patel	31a90468e1	[InstCombine] allow fdiv folds with less than fully 'fast' ops Note: gcc appears to allow this fold with -freciprocal-math alone, but clang/llvm require more than that with this patch. The wording in the definitions seems fuzzy enough that it could go either way, but we'll err on the conservative side of FMF interpretation. This patch also changes the newly created fmul to have FMF propagated by the last fdiv rather than intersecting the FMF of the fdivs. This matches the behavior of other folds near here. The new fmul is only used to produce an intermediate op for the final fdiv result, so it shouldn't be any stricter than that result. The previous behavior could result in dropping FMF via other folds in instcombine or CSE. Differential Revision: https://reviews.llvm.org/D43398 llvm-svn: 326098	2018-02-26 16:02:45 +00:00
Francis Visoiu Mistrih	e4fae4d5b6	[CodeGen] Don't omit any redundant information in -debug output In r322867, we introduced IsStandalone when printing MIR in -debug output. The default behaviour for that was: 1) If any of MBB, MI, or MO are -debug-printed separately, don't omit any redundant information. 2) When -debug-printing a MF entirely, don't print any redundant information. 3) When printing MIR, don't print any redundant information. I'd like to change 2) to: 2) When -debug-printing a MF entirely, don't omit any redundant information. Differential Revision: https://reviews.llvm.org/D43337 llvm-svn: 326094	2018-02-26 15:23:42 +00:00
Jonas Devlieghere	560ce2c70f	Re-land: "[Support] Replace HashString with djbHash." This patch removes the HashString function from StringExtraces and replaces its uses with calls to djbHash from DJB.h. This change is almost NFC. While the algorithm is identical, the djbHash implementation in StringExtras used 0 as its default seed while the implementation in DJB uses 5381. The latter has been shown to result in less collisions and improved avalanching and is used by the DWARF accelerator tables. Because some test were implicitly relying on the hash order, I've reverted to using zero as a seed for the following two files: lld/include/lld/Core/SymbolTable.h llvm/lib/Support/StringMap.cpp Differential revision: https://reviews.llvm.org/D43615 llvm-svn: 326091	2018-02-26 15:16:42 +00:00
Tim Renouf	832f90fa0c	[AMDGPU] Scratch setup fix on AMDPAL gfx9+ merge shader Summary: With OS type AMDPAL, the scratch descriptor is hardwired to be loaded from offset 0 of the global information table, whose low pointer is passed in s0. For a merge shader on gfx9+, it needs to be s8 instead, as the hardware reserves s0-s7. Reviewers: kzhuravl Subscribers: arsenm, nhaehnle, dstuttard, llvm-commits, t-tye, yaxunl, wdng, kzhuravl Differential Revision: https://reviews.llvm.org/D42203 llvm-svn: 326088	2018-02-26 14:46:43 +00:00
Tim Renouf	f40707a2db	[LiveIntervals] Handle moving up dead partial write Summary: In the test case, the machine scheduler moves a dead write to a subreg up into the middle of a segment of the overall reg's live range, where the segment had liveness only for other subregs in the reg. handleMoveUp created an invalid live range, causing an assert a bit later. This commit fixes it to handle that situation. The segment is split in two at the insertion point, and the part after the split, and any subsequent segments up to the old position, are changed to be defined by the moved def. V2: Better test. Subscribers: MatzeB, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D43478 Change-Id: Ibc42445ddca84e79ad1f616401015d22bc63832e llvm-svn: 326087	2018-02-26 14:42:13 +00:00
David Zarzycki	f3fa88b288	Test commit llvm-svn: 326085	2018-02-26 13:05:18 +00:00
Jonas Devlieghere	370bf3ef49	Revert "[Support] Replace HashString with djbHash." It looks like some of our tests depend on the ordering of hashed values. I'm reverting my changes while I try to reproduce and fix this locally. Failing builds: lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/18388 lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/6743 lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/15607 llvm-svn: 326082	2018-02-26 12:05:18 +00:00
Jonas Devlieghere	b9ad175935	[Support] Replace HashString with djbHash. This removes the HashString function from StringExtraces and replaces its uses with calls to djbHash from DJB.h This is almost NFC. While the algorithm is identical, the djbHash implementation in StringExtras used 0 as its seed while the implementation in DJB uses 5381. The latter has been shown to result in less collisions and improved avalanching. https://reviews.llvm.org/D43615 (cherry picked from commit 77f7f965bc9499a9ae768a296ca5a1f7347d1d2c) llvm-svn: 326081	2018-02-26 11:30:13 +00:00
Benjamin Kramer	b84e158df7	[WebAssembly] Relax constexpr for old standard libraries. This will still be constexpr when the standard library supports it, but doesn't force constexpr. Old libraries will get a global constructor, which is not too bad. llvm-svn: 326080	2018-02-26 11:07:25 +00:00
Renato Golin	9d1b2acaaa	[LV] Move isLegalMasked* functions from Legality to CostModel All SIMD architectures can emulate masked load/store/gather/scatter through element-wise condition check, scalar load/store, and insert/extract. Therefore, bailing out of vectorization as legality failure, when they return false, is incorrect. We should proceed to cost model and determine profitability. This patch is to address the vectorizer's architectural limitation described above. As such, I tried to keep the cost model and vectorize/don't-vectorize behavior nearly unchanged. Cost model tuning should be done separately. Please see http://lists.llvm.org/pipermail/llvm-dev/2018-January/120164.html for RFC and the discussions. Closes D43208. Patch by: Hideki Saito <hideki.saito@intel.com> llvm-svn: 326079	2018-02-26 11:06:36 +00:00
Florian Hahn	a1822cbabc	[LoopInterchange] Loops with empty dependency matrix are safe. The dependency matrix is only empty if no conflicting load/store instructions have been found. In that case, it is safe to interchange. For the LLVM test-suite, after this change around 1900 loops are interchanged, whereas it is 15 before this change. On cortex-a57, this gives an improvement of -0.57% on the geomean execution time of SPEC2006, SPEC2000 and the test-suite. There are a few small perf regressions, but I think we can improve on those by making the cost model better. Reviewers: karthikthecool, mcrosier Reviewed by: karthikthecool Differential Revision: https://reviews.llvm.org/D43236 llvm-svn: 326077	2018-02-26 10:45:25 +00:00
Andrew V. Tischenko	083891925b	The final step to close D41278 [MachineCombiner] Improve debug output (NFC). Differential Revision: https://reviews.llvm.org/D41278 llvm-svn: 326074	2018-02-26 09:43:21 +00:00
Serguei Katkov	c2f74638ac	[SCEV] Factor out getUsedLoops The patch introduces the new function in ScalarEvolution to get all loops used in specified SCEV. This is a preparation for re-writing isKnownPredicate utility as described in https://reviews.llvm.org/D42417. Reviewers: sanjoy, mkazantsev, reames Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43504 llvm-svn: 326072	2018-02-26 09:26:41 +00:00
Serguei Katkov	a95d2aee7d	[SCEV] Introduce SCEVPostIncRewriter The patch introduces the SCEVPostIncRewriter rewriter which is similar to SCEVInitRewriter but rewrites AddRec with post increment value of this AddRec. This is a preparation for re-writing isKnownPredicate utility as described in https://reviews.llvm.org/D42417. Reviewers: sanjoy, mkazantsev, reames Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43499 llvm-svn: 326071	2018-02-26 08:40:18 +00:00
Jonas Paulsson	b1e81479e9	[XCore] Return true in enableMultipleCopyHints(). Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Robert Lytton llvm-svn: 326069	2018-02-26 08:03:32 +00:00
Serguei Katkov	339c2e8287	[SCEV] Extends the SCEVInitRewriter The patch introduces an additional parameter IgnoreOtherLoops to SCEVInitRewriter. if it is equal to true then rewriter will not invalidate result in case SCEV depends on other loops then specified during creation. The patch does not change the default behavior. This is a preparation for re-writing isKnownPredicate utility as described in https://reviews.llvm.org/D42417. Reviewers: sanjoy, mkazantsev, reames Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43498 llvm-svn: 326067	2018-02-26 07:08:56 +00:00
Craig Topper	5c980eba47	[X86] Don't use getZExtValue when we have no idea how large the input elements are. llvm-svn: 326066	2018-02-26 04:43:24 +00:00
Craig Topper	2286058f46	[X86] Use SelectionDAG::SplitVectorOperand to simplify some code. NFC llvm-svn: 326065	2018-02-26 02:16:34 +00:00
Craig Topper	2bf8e3e0e1	[X86] Simplify the ReplaceNodeResults code for X86ISD::AVG. This code seemed to try to widen to 128, 256, or 512 bit vectors, but we only create X86ISD::AVG with a power of 2 number of elements. This means the only nodes that need to be legalized are less than 128-bits and need to be widened up to 128 bits. llvm-svn: 326064	2018-02-26 02:16:33 +00:00
Craig Topper	79d189f597	[X86] Remove VT.isSimple() check from detectAVGPattern. Which types are considered 'simple' is a function of the requirements of all targets that LLVM supports. That shouldn't directly affect what types we are able to handle. The remainder of this code checks that the number of elements is a power of 2 and takes care of splitting down to a legal size. llvm-svn: 326063	2018-02-26 02:16:31 +00:00
Nicolai Haehnle	0d4ad84aa2	TableGen: Remove VarInit::getFieldType It is redundant with the implementation in TypedInit. Change-Id: I8ab1fb5c77e4923f7eb3ffae5889f0f8af6093b4 Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43678 llvm-svn: 326061	2018-02-25 20:50:17 +00:00
Nicolai Haehnle	85e4e95e6c	TableGen: Get rid of Init::getFieldInit Summary: FieldInit will just rely on the standardized resolving mechanism to give us DefInits for folding, thus simplifying the code. Unlike the removal of resolveListElementReference, this shouldn't have performance implications, because DefInits do not recurse inside their record. Change-Id: Id4544c774c9d9ee92f293615af6ecff706453f21 Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43563 llvm-svn: 326060	2018-02-25 20:50:11 +00:00
Nicolai Haehnle	801403acb3	TableGen: Remove Init::resolveListElementReference Summary: Resolving a VarListElementInit should just resolve the list and then take its element. This eliminates a lot of duplicated logic and simplifies the next steps of refactoring resolveReferences. This does potentially cause sub-elements of the entire list to be resolved resulting in more work, but I didn't notice a measurable change in performance, and a later patch adds a caching mechanism that covers at least the common case of `var[i]` in a more generic way. Change-Id: I7b59185b855c7368585c329c31e5be38c5749dac Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43562 llvm-svn: 326059	2018-02-25 20:50:04 +00:00
Mandeep Singh Grang	46d02dee65	[DebugInfo] Stable sort symbols to remove non-deterministic ordering Summary: This fixes failure in DebugInfo/X86/multiple-aranges.ll uncovered by D39245. Reviewers: rafael, echristo, probinson Reviewed By: probinson Subscribers: probinson, llvm-commits, JDevlieghere Tags: #debug-info Differential Revision: https://reviews.llvm.org/D39950 llvm-svn: 326056	2018-02-25 19:52:34 +00:00
Craig Topper	6694df14e6	[X86] Use SDNode instead of SDPatternOperator. NFC llvm-svn: 326048	2018-02-25 06:21:04 +00:00
Simon Pilgrim	295e8b4e12	[TargetLowering] SimplifyDemandedVectorElts - pass demanded elts through ADD/SUB ops llvm-svn: 326044	2018-02-24 20:59:14 +00:00
Simon Pilgrim	c0dbdb86c3	[TargetLowering] SimplifyDemandedVectorElts - pass demanded elts through TRUNCATE ops llvm-svn: 326043	2018-02-24 19:28:34 +00:00
Craig Topper	81c0eaf4c8	[X86] Allow int_x86_sse2_cvtps2dq and int_x86_avx_cvt_ps2dq_256 to select EVEX encoded instructions. llvm-svn: 326041	2018-02-24 18:58:07 +00:00
Adam Nemet	e4e1de60aa	Revert "StructurizeCFG: Test for branch divergence correctly" This reverts commit r325881. Breaks many bots llvm-svn: 326037	2018-02-24 17:29:09 +00:00
Simon Pilgrim	a4fb569483	[X86][SSE] combineSubToSubus - support v8i64 handling from SSSE3 Our UMIN/UMAX, vector truncation and shuffle combining is good enough to efficiently handle v8i64 with the number of leading zeros that are necessary for PSUBUS. llvm-svn: 326034	2018-02-24 14:06:39 +00:00
Simon Pilgrim	8ad91261e8	[X86][SSE] combineSubToSubus - support v8i32 handling from SSSE3 (not SSE41) Now that UMIN etc are Legal/Custom for SSE2+, we can efficiently match SUBUS v8i32 cases from SSSE3 which can perform efficient truncation with PSHUFB. llvm-svn: 326033	2018-02-24 13:39:13 +00:00
Simon Pilgrim	744f008a75	[X86][SSE] combineSubToSubus - begun generalizing to work with any type sizes with SplitBinaryOpsAndApply llvm-svn: 326030	2018-02-24 12:44:12 +00:00
Simon Pilgrim	51ce2ed367	Fix spelling in comment. NFCI. llvm-svn: 326029	2018-02-24 12:27:02 +00:00
Jonas Paulsson	8ff0773b13	[Sparc] Return true in enableMultipleCopyHints(). Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: James Y Knight llvm-svn: 326028	2018-02-24 08:24:31 +00:00
Craig Topper	161c805da4	[X86] Use SelectionDAG::getNot instead of implementing manually. NFC llvm-svn: 326020	2018-02-24 03:15:54 +00:00
Stanislav Mekhanoshin	fa48c496e2	[AMDGPU] Shrinking V_SUBBREV_U32 V_SUBBREV_U32 is a commute opcode for V_SUBB_U32. However, when we try to commute V_SUBB_U32 in order to shrink it we do not then process V_SUBBREV_U32 and it stay VOP3. This is fixed. Differential Revision: https://reviews.llvm.org/D43699 llvm-svn: 326011	2018-02-24 01:32:32 +00:00
Heejin Ahn	9386bde11b	[WebAssembly] Add exception handling option and feature Summary: Add a llc command line option and WebAssembly architecture feature for exception handling. Reviewers: dschuff Subscribers: jfb, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D43683 llvm-svn: 326004	2018-02-24 00:40:50 +00:00
Pavel Labath	d99072bc97	Implement equal_range for the DWARF v5 accelerator table Summary: This patch implements the name lookup functionality of the .debug_names accelerator table and hooks it up to "llvm-dwarfdump -find". To make the interface of the two kinds of accelerator tables more consistent, I've created an abstract "DWARFAcceleratorTable::Entry" class, which provides a consistent interface to access the common functionality of the table entries (such as getting the die offset, die tag, etc.). I've also modified the apple table to vend entries conforming to this interface. Reviewers: JDevlieghere, aprantl, probinson, dblaikie Subscribers: vleschuk, clayborg, echristo, llvm-commits Differential Revision: https://reviews.llvm.org/D43067 llvm-svn: 326003	2018-02-24 00:35:21 +00:00
George Burgess IV	6f49f4a951	[MemorySSA] Remove a redundant dyn_cast. StartingAccess is a MemoryUseOrDef. No need to check again. llvm-svn: 326000	2018-02-24 00:15:21 +00:00
Craig Topper	7bcac492d4	[X86] Remove checks for '(scalar_to_vector (i8 (trunc GR32:)))' from scalar masked move patterns. This portion can be matched by other patterns. We don't need it to make the larger pattern valid. It's sufficient to have a v1i1 mask input without caring where it came from. llvm-svn: 325999	2018-02-24 00:15:05 +00:00
Yonghong Song	60fed1fef0	bpf: New optimization pass for eliminating unnecessary i32 promotions This pass performs peephole optimizations to cleanup ugly code sequences at MachineInstruction layer. Currently, the only optimization in this pass is to eliminate type promotion sequences for zero extending 32-bit subregisters to 64-bit registers. If the compiler could prove the zero extended source come from 32-bit subregistere then it is safe to erase those promotion sequece, because the upper half of the underlying 64-bit registers were zeroed implicitly already. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325991	2018-02-23 23:49:32 +00:00
Yonghong Song	ae961bb061	bpf: New decoder namespace for 32-bit subregister load/store When -mattr=+alu32 passed to the disassembler, use decoder namespace for 32-bit subregister. This is to disassemble load and store instructions in preferred B format as described in previous commit: w = (u8 ) (r + off) // BPF_LDX \| BPF_B w = (u16 )(r + off) // BPF_LDX \| BPF_H w = (u32 )(r + off) // BPF_LDX \| BPF_W (u8 ) (r + off) = w // BPF_STX \| BPF_B (u16 )(r + off) = w // BPF_STX \| BPF_H (u32 )(r + off) = w // BPF_STX \| BPF_W NOTE: all other instructions should still use the default decoder namespace. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325990	2018-02-23 23:49:31 +00:00
Yonghong Song	ca31c3bb3f	bpf: Enable 32-bit subregister support for -mattr=+alu32 After all those preparation patches, now we could enable 32-bit subregister support once -mattr=+alu32 specified. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325989	2018-02-23 23:49:30 +00:00
Yonghong Song	fcd1e0f625	bpf: Support 32-bit subregister in various InstrInfo hooks This patch support 32-bit subregister in three InstrInfo hooks, i.e. copyPhysReg, loadRegFromStackSlot and storeRegToStackSlot, Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325988	2018-02-23 23:49:29 +00:00
Yonghong Song	b1a52bd756	bpf: New instruction patterns for 32-bit subregister load and store The instruction mapping between eBPF/arm64/x86_64 are: eBPF arm64 x86_64 LD1 BPF_LDX \| BPF_B ldrb movzbl LD2 BPF_LDX \| BPF_H ldrh movzwl LD4 BPF_LDX \| BPF_W ldr movl movzbl/movzwl/movl on x86_64 accept 32-bit sub-register, for example %eax, the same for ldrb/ldrh on arm64 which accept 32-bit "w" register. And actually these instructions only accept sub-registers. There is no point to have LD1/2/4 (unsigned) for 64-bit register, because on these arches, upper 32-bits are guaranteed to be zeroed by hardware or VM, so load into the smallest available register class is the best choice for maintaining type information. For eBPF we should adopt the same philosophy, to change current format (A): r = (u8 ) (r + off) // BPF_LDX \| BPF_B r = (u16 )(r + off) // BPF_LDX \| BPF_H r = (u32 )(r + off) // BPF_LDX \| BPF_W (u8 ) (r + off) = r // BPF_STX \| BPF_B (u16 )(r + off) = r // BPF_STX \| BPF_H (u32 )(r + off) = r // BPF_STX \| BPF_W into B: w = (u8 ) (r + off) // BPF_LDX \| BPF_B w = (u16 )(r + off) // BPF_LDX \| BPF_H w = (u32 )(r + off) // BPF_LDX \| BPF_W (u8 ) (r + off) = w // BPF_STX \| BPF_B (u16 )(r + off) = w // BPF_STX \| BPF_H (u32 )(r + off) = w // BPF_STX \| BPF_W There is no change on encoding nor how should they be interpreted, everything is as it is, load the specified length, write into low bits of the register then zeroing all remaining high bits. The only change is their associated register class and how compiler view them. Format A still need to be kept, because eBPF LLVM backend doesn't support sub-registers at default, but once 32-bit subregister is enabled, it should use format B. This patch implemented this together with all those necessary extended load and truncated store patterns. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325987	2018-02-23 23:49:28 +00:00
Yonghong Song	63cf273f55	bpf: Support i32 in getScalarShiftAmountTy method getScalarShiftAmount method should be implemented for eBPF backend to make sure shift amount could still get correct type once 32-bit subregisters support are enabled. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325986	2018-02-23 23:49:26 +00:00
Yonghong Song	59fc805c7e	bpf: Support condition comparison on i32 We need to support condition comparison on i32. All these comparisons are supposed to be combined into BPF_J* instructions which only support i64. For ISD::BR_CC we need to promote it to i64 first, then do custom lowering. For ISD::SET_CC, just expand to SELECT_CC like what's been done for i64. For ISD::SELECT_CC, we also want to do custom lower for i32. However, after 32-bit subregister support enabled, it is possible the comparison operands are i32 while the selected value are i64, or the comparison operands are i64 while the selected value are i32. We need to define extra instruction pattern and support them in custom instruction inserter. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325985	2018-02-23 23:49:25 +00:00
Yonghong Song	219156cff0	bpf: Handle i32 for ALU operations without ISA support There is no eBPF ISA support for BSWAP, ROTR, ROTL, SREM, SDIVREM, MULHU, ADDC, ADDE etc on i32. They could be emulated by other basic BPF_ALU operations, we'd set their lowering action the same as i64. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325984	2018-02-23 23:49:24 +00:00
Yonghong Song	07a7a41753	bpf: New calling convention for 32-bit subregisters This patch add new calling conventions to allow GPR32RegClass as valid register class for arguments and return types. New calling convention will only be choosen when -mattr=+alu32 specified. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325983	2018-02-23 23:49:23 +00:00
Yonghong Song	42389377d8	bpf: New target attribute "alu32" for 32-bit subregister support This new attribute aims to control the enablement of 32-bit subregister support on eBPF backend. Name the interface as "alu32" is because we in particular want to enable the generation of BPF_ALU32 instructions by enable subregister support. This attribute could be used in the following format with llc: llc -mtriple=bpf -mattr=[+\|-]alu32 It is disabled at default. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325982	2018-02-23 23:49:22 +00:00
Yonghong Song	0252f35362	bpf: Define instruction patterns for extensions and truncations between i32 to i64 For transformations between i32 and i64, if it is explicit signed extension: - first cast the operand to i64 - then use SLL + SRA to finish the extension. if it is explicit zero extension: - first cast the operand to i64 - then use SLL + SRL to finish the extension. if it is explicit any extension: - just refer to 64-bit register. if it is explicit truncation: - just refer to 32-bit subregister. NOTE: Some of the zero extension sequences might be unnecessary, they will be removed by an peephole pass on MachineInstruction layer. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325981	2018-02-23 23:49:21 +00:00
Yonghong Song	3a564a8f6e	bpf: Tighten the immediate predication for 32-bit alu instructions These 32-bit ALU insn patterns which takes immediate as one operand were initially added to enable AsmParser support, and the AsmMatcher uses "ins" and "outs" fields to deduct the operand constraint. However, the instruction selector doesn't work the same as AsmMatcher. The selector will use the "pattern" field for which we are not setting the predication for immediate operands correctly. Without this patch, i32 would eventually means all i32 operands are valid, both imm and gpr, while these patterns should allow imm only. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325980	2018-02-23 23:49:19 +00:00
Yonghong Song	ec84e2f1b0	bpf: Use markSuperRegs to mark reserved registers markSuperRegs is the canonical helper function used to mark reserved registers. It could mark any overlapping sub-registers automatically. Reviewed-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 325979	2018-02-23 23:49:18 +00:00
Nemanja Ivanovic	bcc82c9a78	[PowerPC] Disable shrink-wrapping when getting PC address through the LR The instruction sequence used to get the address of the PC into a GPR requires that we clobber the link register. Doing so without having first saved it in the prologue leaves the function unable to return. Currently, this sequence is emitted into the entry block. To ensure the prologue is inserted before this sequence, disable shrink-wrapping. This fixes PR33547. Differential Revision: https://reviews.llvm.org/D43677 llvm-svn: 325972	2018-02-23 23:08:34 +00:00
George Burgess IV	68ac941780	[MemorySSA] Fix a cache invalidation bug with removed accesses I suspect there's a deeper issue here, but we probably shouldn't be using INVALID_MEMORYSSA_ID as liveOnEntry's ID anyway. llvm-svn: 325971	2018-02-23 23:07:18 +00:00
Scott Linder	16c7bdaf32	[DebugInfo] Support DWARF v5 source code embedding extension In DWARF v5 the Line Number Program Header is extensible, allowing values with new content types. In this extension a content type is added, DW_LNCT_LLVM_source, which contains the embedded source code of the file. Add new optional attribute for !DIFile IR metadata called source which contains source text. Use this to output the source to the DWARF line table of code objects. Analogously extend METADATA_FILE in Bitcode and .file directive in ASM to support optional source. Teach llvm-dwarfdump and llvm-objdump about the new values. Update the output format of llvm-dwarfdump to make room for the new attribute on file_names entries, and support embedded sources for the -source option in llvm-objdump. Differential Revision: https://reviews.llvm.org/D42765 llvm-svn: 325970	2018-02-23 23:01:06 +00:00
Sanjay Patel	2db2769499	[InstCombine] simplify code for fabs(X) * fabs(X) -> X * X; NFC llvm-svn: 325968	2018-02-23 22:38:10 +00:00
Eric Christopher	a70ec1308a	Sink the verification code around the assert where it's handled and wrap in NDEBUG. This has the advantage of making release only builds more warning free and there's no need to make this routine a class function if it isn't using class members anyhow. llvm-svn: 325967	2018-02-23 22:32:05 +00:00
Sanjay Patel	db53d1847b	[InstSimplify] sqrt(X) * sqrt(X) --> X This was misplaced in InstCombine. We can loosen the FMF as a follow-up step. llvm-svn: 325965	2018-02-23 22:20:13 +00:00
Sriraman Tallam	609f8c013c	Intrinsics calls should avoid the PLT when "RtLibUseGOT" metadata is present. Differential Revision: https://reviews.llvm.org/D42216 llvm-svn: 325962	2018-02-23 21:32:06 +00:00
Sanjay Patel	d32104e1b2	[InstCombine] allow fmul-sqrt folds with less than full -ffast-math Also, add a Builder method for intrinsics to reduce code duplication for clients. llvm-svn: 325960	2018-02-23 21:16:12 +00:00
Eric Christopher	545932bec9	Simplify a DEBUG statement to remove a set but not used variable in release builds. llvm-svn: 325959	2018-02-23 21:14:47 +00:00
Craig Topper	16b20245ba	[X86] Add assembler/disassembler support for blendm with zero masking and broacast. Fixes PR31617 llvm-svn: 325957	2018-02-23 20:48:44 +00:00
Stefan Pintilie	626b651016	[Power9] Add missing instructions to the Power 9 scheduler This is the first in a series of patches that will define more instructions using InstRW so that we can move away from ItinRW and ultimately have a complete Power 9 scheduler. Differential Revision: https://reviews.llvm.org/D43635 llvm-svn: 325956	2018-02-23 20:37:10 +00:00
Krzysztof Parzyszek	96690ceceb	[Hexagon] Recognize non-immediate constants in HexagonConstPropagation llvm-svn: 325954	2018-02-23 20:33:26 +00:00
Simon Pilgrim	69b8fa8391	Fixed unused variable warning. NFCI. llvm-svn: 325950	2018-02-23 20:16:18 +00:00
Craig Topper	61d6ddbf0a	[X86] Add DAG combine to remove (and X, 1) from in front of a v1i1 scalar to vector. These can be created by type legalization promoting the inputs to select to match scalar boolean contents. We were trying to pattern match them away during isel, but its better to just remove them from the DAG. I've cleaned up some patterns to not check for this 'and' anymore. But I suspect this has also opened up opportunities for pattern removal. llvm-svn: 325949	2018-02-23 20:13:42 +00:00
Benjamin Kramer	ae87f86ec4	[WebAssembly] Fix macro metaprogram to not duplicate code as much. No functionality change intended. llvm-svn: 325947	2018-02-23 20:13:03 +00:00
Simon Pilgrim	425965be0f	[X86][SSE] Generalize x > C-1 ? x+-C : 0 --> subus x, C combine for non-uniform constants llvm-svn: 325944	2018-02-23 19:58:44 +00:00
Benjamin Kramer	b941ababce	Shrink various scheduling tables by using narrower types. 16 bits ought to be enough for everyone. This shrinks clang by ~1MB. llvm-svn: 325941	2018-02-23 19:32:56 +00:00
Evandro Menezes	1afffac05b	[PATCH] [AArch64] Add new target feature to fuse conditional select This feature enables the fusion of the comparison and the conditional select instructions together. Differential revision: https://reviews.llvm.org/D42392 llvm-svn: 325939	2018-02-23 19:27:43 +00:00
Geoff Berry	d6ba3dbbbd	Fix compiler warning introduced in r325931. NFC. llvm-svn: 325938	2018-02-23 19:11:33 +00:00
Craig Topper	11704dcc72	[X86] Custom split v32i16/v64i8 bitcasts when AVX512F is available, but BWI is not. The test changes you can see are related to the changes in ReplaceNodeResults. Though shuffle-vs-trunc-512.ll does have a test that exercises the code in LowerBITCAST. Looks like the test output didn't change because DAG combining is able to clean up the resulting type legalization. Adding the custom hook just makes type legalization work less hard. Differential Revision: https://reviews.llvm.org/D43447 llvm-svn: 325933	2018-02-23 18:43:36 +00:00
Geoff Berry	f8bf2ec0a8	[MachineOperand][Target] MachineOperand::isRenamable semantics changes Summary: Add a target option AllowRegisterRenaming that is used to opt in to post-register-allocation renaming of registers. This is set to 0 by default, which causes the hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq fields of all opcodes to be set to 1, causing MachineOperand::isRenamable to always return false. Set the AllowRegisterRenaming flag to 1 for all in-tree targets that have lit tests that were effected by enabling COPY forwarding in MachineCopyPropagation (AArch64, AMDGPU, ARM, Hexagon, Mips, PowerPC, RISCV, Sparc, SystemZ and X86). Add some more comments describing the semantics of the MachineOperand::isRenamable function and how it is set and maintained. Change isRenamable to check the operand's opcode hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq bit directly instead of relying on it being consistently reflected in the IsRenamable bit setting. Clear the IsRenamable bit when changing an operand's register value. Remove target code that was clearing the IsRenamable bit when changing registers/opcodes now that this is done conservatively by default. Change setting of hasExtraSrcRegAllocReq in AMDGPU target to be done in one place covering all opcodes that have constant pipe read limit restrictions. Reviewers: qcolombet, MatzeB Subscribers: aemerson, arsenm, jyknight, mcrosier, sdardis, nhaehnle, javed.absar, tpr, arichardson, kristof.beyls, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, escha, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D43042 llvm-svn: 325931	2018-02-23 18:25:08 +00:00
Matt Davis	523c656e25	[Debug] Add dbg.value intrinsics for PHIs created during LCSSA. Summary: This patch is an enhancement to propagate dbg.value information when Phis are created on behalf of LCSSA. I noticed a case where a value carried across a loop was reported as <optimized out>. Specifically this case: ``` int bar(int x, int y) { return x + y; } int foo(int size) { int val = 0; for (int i = 0; i < size; ++i) { val = bar(val, i); // Both val and i are correct } return val; // <optimized out> } ``` In the above case, after all of the interesting computation completes our value is reported as "optimized out." This change will add a dbg.value to correct this. This patch also moves the dbg.value insertion routine from LoopRotation.cpp into Local.cpp, so that we can share it in both places (LoopRotation and LCSSA). Reviewers: mzolotukhin, aprantl, vsk, davide Reviewed By: aprantl, vsk Subscribers: dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D42551 llvm-svn: 325926	2018-02-23 17:38:27 +00:00
John Brawn	29bbed3613	[BPI] Detect branches in loops that make themselves not taken If we have a loop like this: int n = 0; while (...) { if (++n >= MAX) { n = 0; } } then the body of the 'if' statement will only be executed once every MAX iterations. Detect this by looking for branches in loops where taking the branch makes the branch condition evaluate to 'not taken' in the next iteration of the loop, and reduce the probability of such branches. This slightly improves EEMBC benchmarks on cortex-m4/cortex-m33 due to making better choices in if-conversion, but has no effect on any other cpu/benchmark that I could detect. Differential Revision: https://reviews.llvm.org/D35804 llvm-svn: 325925	2018-02-23 17:17:31 +00:00
Sanjay Patel	6b9c7a9c83	[InstCombine] refactor fmul with negated op folds; NFCI The existing code was inefficiently looking for 'nsz' variants. That's unnecessary because we canonicalize those to the expected form with -0.0. We may also want to adjust or remove the fold that sinks negation. We don't do that for fdiv (or integer ops?). That should be uniform? It may also lead to missed optimization as in PR21914: https://bugs.llvm.org/show_bug.cgi?id=21914 ...or we just have to fix other passes to avoid that problem. llvm-svn: 325924	2018-02-23 17:14:28 +00:00
Sanjay Patel	4a9116e897	[InstCombine] use FMF-copying functions to reduce code; NFCI llvm-svn: 325923	2018-02-23 17:07:29 +00:00
Stefan Pintilie	15e6b10ee0	[PowerPC] Code cleanup. Remove instructions that were withdrawn from Power 9. The following set of instructions was originally planned to be added for Power 9 and so code was added to support them. However, a decision was made later on to withdraw support for these instructions in the hardware. xscmpnedp xvcmpnesp xvcmpnedp This patch removes support for the instructions that were not added. Differential Revision: https://reviews.llvm.org/D43641 llvm-svn: 325918	2018-02-23 15:55:16 +00:00
Petar Jovanovic	a7bd36e63e	[mips] finish removal of unused fields in MipsInstructionSelector r325916 missed to remove calls in constructor. llvm-svn: 325917	2018-02-23 15:47:05 +00:00
Petar Jovanovic	f49c5ce3a6	[mips] remove unused fields in MipsInstructionSelector Unused fields cause buildbreak if -Werror,-Wunused-private-field is passed. llvm-svn: 325916	2018-02-23 15:34:02 +00:00
Hans Wennborg	89c35fc44d	Support for the mno-stack-arg-probe flag Adds support for this flag. There is also another piece for clang (separate review). More info: https://bugs.llvm.org/show_bug.cgi?id=36221 By Ruslan Nikolaev! Differential Revision: https://reviews.llvm.org/D43107 llvm-svn: 325900	2018-02-23 13:46:25 +00:00
Hans Wennborg	35d6e944e1	llvm-config: Add advapi32 to --system-libs on Windows (PR36372) llvm-svn: 325894	2018-02-23 12:20:26 +00:00
Amaury Sechet	893a6b89ff	[DAGCOmbine] Ensure that (brcond (setcc ...)) is handled in a canonical manner. Summary: There are transformation that change setcc into other constructs, and transform that try to reconstruct a setcc from the brcond condition. Depending on what order these transform are done, the end result differs. Most of the time, it is preferable to get a setcc as a brcond argument (and this is why brcond try to recreate the setcc in the first place) so we ensure this is done every time by also doing it at the setcc level when the only user is a brcond. Reviewers: spatel, hfinkel, niravd, craig.topper Subscribers: nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D41235 llvm-svn: 325892	2018-02-23 11:50:42 +00:00
Nicolai Haehnle	c10570f7c6	Revert "TableGen: Fix typeIsConvertibleTo for record types" This reverts r325884. Clang's TableGen has dependencies on the exact ordering of superclasses. Revert this change fully for now to fix the build. Change-Id: Ib297f5571cc7809f00838702ad7ab53d47335b26 llvm-svn: 325891	2018-02-23 11:31:49 +00:00
Petar Jovanovic	fac93e28f0	[MIPS GlobalISel] Adding GlobalISel Add GlobalISel infrastructure up to the point where we can select a ret void. Patch by Petar Avramovic. Differential Revision: https://reviews.llvm.org/D43583 llvm-svn: 325888	2018-02-23 11:06:40 +00:00
Nicolai Haehnle	c7711ba2ef	TableGen: Avoid using resolveListElementReference in TGParser A subsequent change intends to remove resolveListElementReference entirely. This part of the removal can be split out for better bisectability. Change-Id: Ibd762d88fd2d1e2cc116a259e2a27a5e9f9a8b10 Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43561 Change-Id: Ifb695041cef1964ad8a3102f448249501a9243f0 llvm-svn: 325886	2018-02-23 10:46:21 +00:00
Nicolai Haehnle	aecb68b549	TableGen: Fix typeIsConvertibleTo for record types Summary: Only check whether the left-hand side type is a subclass (or equal to) the right-hand side type. This requires a further fix in handling !if expressions and in type resolution. Furthermore, reverse the order of superclasses so that resolveTypes will find a least common ancestor at least in simple cases. Add a test that used to be accepted without flagging the obvious type error. Change-Id: Ib366db1a4e6a079f1a0851e469b402cddae76714 Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43559 llvm-svn: 325884	2018-02-23 10:46:13 +00:00
Nicolai Haehnle	0243aaf42c	TableGen: Add !size operation Summary: Returns the size of a list. I have found this to be rather useful in some development for the AMDGPU backend where we could simplify our .td files by concatenating list<LLVMType> for complex intrinsics. Doing so requires us to compute the position argument for LLVMMatchType. Basically, the usage is in a pattern that looks somewhat like this: list<LLVMType> argtypes = !listconcat(base, [llvm_any_ty, LLVMMatchType<!size(base)>]); Change-Id: I360a0b000fd488d18bea412228230fd93722bd2c Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits, tpr Differential Revision: https://reviews.llvm.org/D43553 llvm-svn: 325883	2018-02-23 10:46:07 +00:00
Nicolai Haehnle	6cf306deca	AMDGPU: Track physreg uses in SILoadStoreOptimizer Summary: This handles def-after-use of physregs, and allows us to merge loads and stores even across some physreg defs (typically M0 defs). Change-Id: I076484b2bda27c2cf46013c845a0380c5b89b67b Reviewers: arsenm, mareko, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D42647 llvm-svn: 325882	2018-02-23 10:45:56 +00:00
Nicolai Haehnle	43c1115cd4	StructurizeCFG: Test for branch divergence correctly Summary: This fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in basic block %if, its value is effectively non-uniform. This problem was encountered in https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change in itself is not sufficient to fix that bug, as there is another issue in the AMDGPU backend. Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4 Reviewers: arsenm, rampitec, jlebar Subscribers: wdng, tpr, llvm-commits Differential Revision: https://reviews.llvm.org/D40546 llvm-svn: 325881	2018-02-23 10:45:46 +00:00
Bjorn Steinbrink	983d6c3f18	Mark MergedLoadStoreMotion as not preserving MemDep results Summary: MemDep caches results that signify that a dependence is non-local, and there is currently no way to invalidate such cache entries. Unfortunately, when MLSM sinks a store that can result in a non-local dependence becoming a local one, and then MemDep gives wrong answers. The easiest way out here is to just say that MLSM does indeed not preserve MemDep results. Reviewers: davide, Gerolf Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43177 llvm-svn: 325880	2018-02-23 10:41:57 +00:00
Jonas Paulsson	07d6aea61a	[Mips] Return true in enableMultipleCopyHints(). Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Simon Dardis llvm-svn: 325870	2018-02-23 08:30:15 +00:00
Sam Clegg	6c899ba6de	[WebAssembly] Add first claass symbol table to wasm objects This is combination of two patches by Nicholas Wilson: 1. https://reviews.llvm.org/D41954 2. https://reviews.llvm.org/D42495 Along with a few local modifications: - One change I made was to add the UNDEFINED bit to the binary format to avoid the extra byte used when writing data symbols. Although this bit is redundant for other symbols types (i.e. undefined can be implied if a function or global is a wasm import) - I prefer to be explicit and consistent and not have derived flags. - Some field renaming. - Some reverting of unrelated minor changes. - No test output differences. Differential Revision: https://reviews.llvm.org/D43147 llvm-svn: 325860	2018-02-23 05:08:34 +00:00
Richard Smith	ade53736b0	Revert r325128 ("[X86] Reduce Store Forward Block issues in HW"). This is causing miscompiles in some situations. See the llvm-commits thread for the commit for details. llvm-svn: 325852	2018-02-23 01:43:46 +00:00
Craig Topper	0dcc88a500	[X86] Turn setne X, signedmax into setgt signedmax, X in LowerVSETCC to avoid an invert We won't be able to fold the constant pool load, but its still better than materialing ones and xoring for the invert if we used PCMPEQ. This will fix another regression from D42948. llvm-svn: 325845	2018-02-23 00:21:39 +00:00
Evandro Menezes	5c986b010b	[AArch64] Refactor macro fusion (NFC) Move checks for each fusion case into separate functions for better legibility and maintainability. Differential revision: https://reviews.llvm.org/D43649 llvm-svn: 325844	2018-02-23 00:14:39 +00:00
Aaron Smith	89a19ac38d	[PDB] Check the result of setLoadAddress() Summary: Change setLoadAddress() to return true or false on failure. Reviewers: zturner, llvm-commits Reviewed By: zturner Differential Revision: https://reviews.llvm.org/D43638 llvm-svn: 325843	2018-02-23 00:02:27 +00:00
Rafael Espindola	ba02f3f242	Fix grammar. NFC. Thank to Eric Christopher for noticing. llvm-svn: 325842	2018-02-22 23:59:46 +00:00
Craig Topper	d2fab30827	[X86] Turn setne X, signedmin into setgt X, signedmin in LowerVSETCC to avoid an invert This will fix one of the regressions from D42948. Differential Revision: https://reviews.llvm.org/D43531 llvm-svn: 325840	2018-02-22 23:46:28 +00:00
Eric Christopher	675dcf02a8	Update comment for whether or not we can optimize an alias - we're checking the alias and not the aliasee. If the alias can be interposed then we shouldn't do anything. llvm-svn: 325837	2018-02-22 23:12:11 +00:00
Benjamin Kramer	a01e97d748	Fix the build of the wasm backend. toString conflicts with llvm::toString here. Yay for overly generic function names. llvm-svn: 325833	2018-02-22 22:29:27 +00:00
Paul Robinson	70def12a96	[DWARFv5] Turn an assert into a diagnostic. Hand-coded assembler files should not trigger assertions. Differential Revision: https://reviews.llvm.org/D43152 llvm-svn: 325831	2018-02-22 21:03:33 +00:00
Craig Topper	a2cc3c055c	[TargetLowering] Rename isCondCodeLegal to isCondCodeLegalOrCustom. Add real isCondCodeLegal. Update callers to use one or the other. isCondCodeLegal internally checked Legal or Custom which is misleading. Though no targets set any cond code action to Custom today. So I've renamed isCondCodeLegal to isCondCodeLegalOrCustom and added a real isCondCodeLegal that only checks Legal. I've changed legalization code to use isCondCodeLegalOrCustom and left things reachable via DAG combine as isCondCodeLegal. I've also changed some places that called getCondCodeAction and compared to Legal to just use isCondCodeLegal. I'm looking at trying to keep SETCC all the way to isel for the AVX512 integer comparisons and I suspect I'll need to make some condition codes Custom to stop DAG combine from changing things post LegalizeOps. Prior to this only Expand stopped DAG combine, but that causes LegalizeOps to try to swap operands or invert rather than calling our Custom handler. Differential Revision: https://reviews.llvm.org/D43607 llvm-svn: 325829	2018-02-22 20:51:26 +00:00
Craig Topper	1aed540ea2	[X86] Make the subus special case in LowerVSETCC self contained Previously this code overrode the flags and opcode used by the later code in LowerVSETCC. This makes the code difficult to read and follow. This patch moves all the SUBUS code into its own function and makes it responsible for creating its own SDNodes on success. Differential Revision: https://reviews.llvm.org/D43530 llvm-svn: 325827	2018-02-22 20:24:18 +00:00
Aaron Smith	9161a6cb25	[PDB] Fix buildbot failure from missing include for DIAEnumLineNumbers llvm-svn: 325826	2018-02-22 20:00:07 +00:00
Sander de Smalen	a86f3cfb49	Revert "[DebugInfo][FastISel] Fix dropping dbg.value()" This patch reverts r325440 and r325438 because it triggers an assertion in SelectionDAGBuilder.cpp. Also having debug enabled may unintentionally affect code-gen. The patch is reverted until we find a better solution. llvm-svn: 325825	2018-02-22 19:53:59 +00:00
Aaron Smith	fbe65404fd	[PDB] Implement more find methods for PDB symbols Summary: Add additional find methods on PDB raw symbols. findChildrenByAddr() findChildrenByVA() findInlineFramesByAddr() findInlineFramesByVA() findInlineLines() findInlineLinesByAddr() findInlineLinesByRVA() findInlineLinesByVA() Reviewers: zturner, llvm-commits Reviewed By: zturner Differential Revision: https://reviews.llvm.org/D43637 llvm-svn: 325824	2018-02-22 19:47:43 +00:00
Easwaran Raman	385d8ea8b5	[ThinLTO] Represent relative BF using a scaled representation . Summary: The current integer representation of relative block frequency prevents representing relative block frequencies below 1. This change uses a 8 of the 29 bits to represent the decimal part by using a fixed scale of -8. Reviewers: tejohnson, davidxl Subscribers: mehdi_amini, inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D43520 llvm-svn: 325823	2018-02-22 19:44:08 +00:00
Peter Collingbourne	32f5405bff	Fix DataFlowSanitizer instrumentation pass to take parameter position changes into account for custom functions. When DataFlowSanitizer transforms a call to a custom function, the new call has extra parameters. The attributes on parameters must be updated to take the new position of each parameter into account. Patch by Sam Kerner! Differential Revision: https://reviews.llvm.org/D43132 llvm-svn: 325820	2018-02-22 19:09:07 +00:00
Vitaly Buka	a139b69e12	[ThinLTO] Always create linked objects file for --thinlto-index-only= Summary: ThinLTO indexing may decide to skip all objects. If we don't write something to the list build system may consider this as failure or linker can reuse a file from the previews build. Reviewers: pcc, tejohnson Subscribers: mehdi_amini, inglorion, eraman, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D43415 llvm-svn: 325819	2018-02-22 19:06:15 +00:00
Daniel Neilson	20c9207be3	[AlignmentFromAssumptions] Set source and dest alignments of memory intrinsiscs separately Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the AlignmentFromAssumptions pass to cease using the old getAlignment()/setAlignment API of MemoryIntrinsic in favour of getting/setting source & dest specific alignments through the new API. This allows us to simplify some of the code in this pass and also be more aggressive about setting the source and destination alignments separately. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html Reviewers: hfinkel, bollu, reames Reviewed By: reames Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D43081 llvm-svn: 325816	2018-02-22 18:55:59 +00:00
Simon Pilgrim	be72fe1fda	[SelectionDAG] Move matchUnaryPredicate/matchBinaryPredicate into SelectionDAGNodes.h This allows us to improve vector constant matching in more DAG code (backends, TargetLowering etc.). Differential Revision: https://reviews.llvm.org/D43466 llvm-svn: 325815	2018-02-22 18:45:13 +00:00
Simon Pilgrim	8831f6e57d	[MC] Don't crash on modulo by zero (PR35650) Extension to D12776, handle modulo by zero in the same way we handle divide by zero. Differential Revision: https://reviews.llvm.org/D43631 llvm-svn: 325810	2018-02-22 18:06:48 +00:00
Alexey Bataev	bd786944b9	[DEBUGINFO] Do not output labels for empty macinfo sections. Summary: If there is no debug info for macros, do not emit labels for empty macinfo sections. Reviewers: probinson, echristo Subscribers: aprantl, llvm-commits, JDevlieghere Differential Revision: https://reviews.llvm.org/D43589 llvm-svn: 325803	2018-02-22 16:20:30 +00:00
Nicolai Haehnle	d9f0b07ff7	TableGen: Add strict assertions to sanity check earlier type checking Summary: Both of these errors should have been caught by type-checking during parsing. Change-Id: I891087936fd1a91d21bcda57c256e3edbe12b94d Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43558 llvm-svn: 325800	2018-02-22 15:27:12 +00:00
Nicolai Haehnle	8fb962c04e	TableGen: Allow implicit casting between string and code Summary: Perhaps the distinction between the two should be removed entirely in the long term, and the [{ ... }] syntax should just be a convenient way of writing multi-line strings. In the meantime, a lot of existing .td files are quite relaxed about string vs. code, and this change allows switching on more consistent type checks without breaking those. Change-Id: If85e3e04469e41b58e2703b62ac0032d2711713c Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43557 llvm-svn: 325799	2018-02-22 15:27:03 +00:00
Nicolai Haehnle	81097ba6b5	TableGen: Fix type of resolved and converted lists Summary: There are no new test cases, but a subsequent patch will introduce assertions that would be triggered by existing test cases without this fix. Change-Id: I6a82d4b311b012aff3932978ae86f6a2dcfbf725 Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43556 llvm-svn: 325798	2018-02-22 15:26:45 +00:00
Nicolai Haehnle	6d64915c87	TableGen: Fix type deduction for !foreach Summary: In the case of !foreach(id, input-list, transform) where the type of input-list is list<A> and the type of transform is B, we now correctly deduce list<B> as the type of the !foreach. Change-Id: Ia19dd65eecc5991dd648280ba6a15f6a20fd61de Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43555 llvm-svn: 325797	2018-02-22 15:26:35 +00:00
Nicolai Haehnle	e4a2cf5761	TableGen: Generalize type deduction for !listconcat Summary: This way, it should work even with complex operands. Change-Id: Iaccf5bbb50bd5882a0ba5d59689e4381315fb361 Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43554 llvm-svn: 325796	2018-02-22 15:26:28 +00:00
Nicolai Haehnle	f19083d1ed	TableGen: Add some more helpful error messages Summary: Some fairly simple changes to start with. Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43552 Change-Id: I0c92731b36d309c6edfcae42595ae1a70cc051c9 llvm-svn: 325795	2018-02-22 15:26:21 +00:00
Nicolai Haehnle	40b140fef1	AMDGPU: Stop using .NAME in .td files Summary: .NAME is a bit of an odd duck, in that we should really treat it like a template argument, but we currently don't, and so when and where NAME is initialized and how is pretty inconsistent. Best to just avoid using it as a field of already instantiated records, and use cast to string instead. Change-Id: I5a0c202401cede3d5c3827ab9c7858ea48b29108 Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D43551 llvm-svn: 325794	2018-02-22 15:25:11 +00:00
Shiva Chen	7c17242b92	[RISCV] Implement c.lui immediate operand constraint Implement c.lui immediate constraint to [1, 31] and [0xfffe0, 0xfffff]. The RISC-V ISA describes the constraint as [1, 63], with that value being loaded in to bits 17-12 of the destination register and sign extended from bit 17. Therefore, this 6-bit immediate can represent values in the ranges [1, 31] and [0xfffe0, 0xfffff]. Differential Revision: https://reviews.llvm.org/D42834 llvm-svn: 325792	2018-02-22 15:02:28 +00:00
Luke Cheeseman	6c1e6bbe0c	[FunctionAttrs][ArgumentPromotion][GlobalOpt] Disable some optimisations passes for naked functions - Fix for bug 36078. - Prevent the functionattrs, function-attrs, globalopt and argpromotion passes from changing naked functions. - These passes can perform some alterations to the functions that should not be applied. An example is removing parameters that are seemingly not used because they are only referenced in the inline assembly. Another example is marking the function as fastcc. llvm-svn: 325788	2018-02-22 14:42:08 +00:00
Stefan Maksimovic	ed797a3049	[mips] Generate memory dependencies for byVal arguments There were no memory dependencies made between stores generated when lowering formal arguments and loads generated when call lowering byVal arguments which made the Post-RA scheduler place a load before a matching store. Make the fixed object stored to mutable so that the load instructions can have their memory dependencies added Set the frame object as isAliased which clears the underlying objects vector in ScheduleDAGInstrs::buildSchedGraph(). This results in addition of all stores as dependenies for loads. This problem appeared when passing a byVal parameter coupled with a fastcc function call. Differential Revision: https://reviews.llvm.org/D37515 llvm-svn: 325782	2018-02-22 13:40:42 +00:00
Serge Guelton	1fb81bcb9b	Syndicate duplicate code between CallInst and InvokeInst NFC intended, syndicate common code to a parametric base class. Part of the original problem is that InvokeInst is a TerminatorInst, unlike CallInst. the problem is solved by introducing a parametrized class paramtertized by its base. Differential Revision: https://reviews.llvm.org/D40727 llvm-svn: 325778	2018-02-22 13:30:32 +00:00
Alex Bradbury	8d8d0a733f	[RISCV][NFC] Make logic in RISCVMCCodeEmitter::getImmOpValue more defensive As pointed out by @sabuasal in a comment on D23568, the logic in RISCVMCCodeEmitter::getImmOpValue could be more defensive. Although with the current instruction definitions it is always the case that `VK_RISCV_LO` is always used with either an I- or S-format instruction, this may not always be the case in the future. Add a check to ensure we will get an assertion in debug builds if that changes. llvm-svn: 325775	2018-02-22 13:24:25 +00:00
Sjoerd Meijer	d31a8c0595	Recommit: [ARM] f16 constant pool fix This recommits r325754; the modified and failing test case actually didn't need any modifications. llvm-svn: 325765	2018-02-22 10:43:57 +00:00
David Green	01e0f25a9f	[ARM] Fix issue with large xor constants. Fixup to rL325573 for large xor constants. Thanks to Eli Friedman for the catch. Differential revision: https://reviews.llvm.org/D43549 llvm-svn: 325761	2018-02-22 09:38:57 +00:00
Sjoerd Meijer	9a25247f80	Revert r325754 and r325755 (f16 literal pool) because buildbots were unhappy. llvm-svn: 325756	2018-02-22 08:41:55 +00:00
Sjoerd Meijer	7d5909eb0f	[ARM] f16 constant pool fix This is a follow up of r325012, that allowed half types in constant pools. Proper alignment was enforced when a big basic block was split up, but not when a CPE was placed before/after a block; the successor block had the wrong alignment. Differential Revision: https://reviews.llvm.org/D43580 llvm-svn: 325754	2018-02-22 08:16:05 +00:00
Hiroshi Inoue	7f9f92f8b6	[NFC] fix trivial typos in comments "a a" -> "a" llvm-svn: 325752	2018-02-22 07:48:29 +00:00
Craig Topper	1d104b996a	[DAGCombiner] Add two calls to isVector before making calls to getVectorElementType/getVectorNumElements to avoid an assert. We looked through a BITCAST, but the bitcast might be a from a scalar type rather than a vector. I don't have a test case. I stumbled onto it while prototyping another change that isn't ready yet. llvm-svn: 325750	2018-02-22 07:05:27 +00:00
Mircea Trofin	56950974d4	[SampleProf] NFC. Expose reusable functionality in SampleProfile. Summary: Exposing getOffset and findFunctionSamples as members of SampleProfile. They are intimately tied to design choices of the sample profile format - using offsets instead of line numbers, and traversing inlined functions stack, respectively. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43605 llvm-svn: 325747	2018-02-22 06:42:57 +00:00
Max Kazantsev	80843a0acc	[SCEV][NFC] Factor out common logic into a separate method SCEV has multiple occurences of code when we need to prove some predicate on every iteration of a loop and do it with invocations of couple `isLoopEntryGuardedByCond`, `isLoopBackedgeGuardedByCond`. This patch factors out these two calls into a separate method. It is a preparation step to extend this logic: it is not the only way how we can prove such conditions. Differential Revision: https://reviews.llvm.org/D43373 llvm-svn: 325745	2018-02-22 06:27:32 +00:00
Nemanja Ivanovic	e54a9ee8ac	[PowerPC] Do not produce invalid CTR loop with an FRem An FRem instruction inside a loop should prevent the loop from being converted into a CTR loop since this is not an operation that is legal on any PPC subtarget. This will always be a call to a library function which means the loop will be invalid if this instruction is in the body. Fixes PR36292. llvm-svn: 325739	2018-02-22 03:02:41 +00:00
Vedant Kumar	1ceabcf080	[Utils] Avoid a hash table lookup in salvageDI, NFC According to the current coverage report salvageDebugInfo() is called 5.12 million times during testing and almost always returns early. The early return depends on LocalAsMetadata::getIfExists returning null, which involves a DenseMap lookup in an LLVMContextImpl. We can probably speed this up by simply checking the IsUsedByMD bit in Value. llvm-svn: 325738	2018-02-22 01:29:41 +00:00
Simon Pilgrim	55b7e01116	[X86][MMX] Generlize MMX_MOVD64rr combines to accept v4i16/v8i8 build vectors as well as v2i32 Also handle both cases where the lower 32-bits of the MMX is undef or zero extended. llvm-svn: 325736	2018-02-21 23:07:30 +00:00
Yonghong Song	9fdd139b41	bpf: disable DwarfUsesRelocationsAcrossSections The pahole does not work with BPF backend properly: -bash-4.2$ cat test.c struct test_t { int a; int b; }; int test(struct test_t s) { return s->a; } -bash-4.2$ clang -g -O2 -target bpf -c test.c -bash-4.2$ pahole test.o struct clang version 7.0.0 (trunk 325446) (llvm/trunk 325464) { clang version 7.0.0 (trunk 325446) (llvm/trunk 325464) clang version 7.0.0 (trunk 325446) (llvm/trunk 325464); / 0 4 / clang version 7.0.0 (trunk 325446) (llvm/trunk 325464) clang version 7.0.0 (trunk 325446) (llvm/trunk 325464); / 4 4 / / size: 8, cachelines: 1, members: 2 / / last cacheline: 8 bytes / }; -bash-4.2$ The reason is that BPF backend is not yet implemented in elfutils backend https://github.com/threatstack/elfutils/tree/master/backends and pahole depends on elfutils for dwarf parsing and resolving relocation. More specifically, the unsupported relocation in .debug_info for type/member name against symbol table caused the incorrect result above. The following is the raw .rel.debug_info for the above example, Hex dump of section '.rel.debug_info': 0x00000000 06000000 00000000 0a000000 0b000000 ................ 0x00000010 0c000000 00000000 0a000000 01000000 ................ 0x00000020 12000000 00000000 0a000000 02000000 ................ 0x00000030 16000000 00000000 0a000000 0e000000 ................ 0x00000040 1a000000 00000000 0a000000 03000000 ................ ----------------- -------- -------- reloc location type symtab index Hex dump of section '.debug_info': 0x00000000 7b000000 04000000 00000801 00000000 {............... 0x00000010 0c000000 00000000 00000000 00000000 ................ 0x00000020 00000000 00001000 00000200 00000000 ................ Based on "type", the proper value will be extracted from symbol table and filled in .debug_info so later on .debug_info can be properly resolved against debug strings. There are two ways to fix this problem. One is to fix elfutils by adding BPF support which is desirable. This could take a long time and won't work with already deployed pahole. For a short term workaround, we can disable dwarf cross-section relation which specifically avoids debug_info and symbol table cross relocation. This should help any dwarf-related tool which has not implement BPF specific relocations yet. Now .rel.debug_info does not have any relocation for symbol table and .debug_info itself contains necessary relocation information by itself. Hex dump of section '.debug_info': 0x00000000 7b000000 04000000 00000801 00000000 {............... 0x00000010 0c003700 00000000 00003e00 00000000 ..7.......>..... 0x00000020 00000000 00001000 00000200 00000000 ................ location 0xc has 0, 0x12 has 0x37, 0x1a has 0x3e in place which will be used in relocation resolution. Here, the values of 0, 0x37 and 0x3e are offset in .debug_str section. Please note the difference between two above .debug_info dumps. With the fix, pahole works properly with BPF backend: -bash-4.2$ clang -O2 -g -target bpf -c test.c -bash-4.2$ pahole test.o struct test_t { int a; / 0 4 / int b; / 4 4 / / size: 8, cachelines: 1, members: 2 / / last cacheline: 8 bytes */ }; Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 325735	2018-02-21 22:59:14 +00:00
Pavel Labath	3b17b84b9c	Resubmit r325107 (case folding DJB hash) The issue was that the has function was generating different results depending on the signedness of char on the host platform. This commit fixes the issue by explicitly using an unsigned char type to prevent sign extension and adds some extra tests. The original commit message was: This patch implements a variant of the DJB hash function which folds the input according to the algorithm in the Dwarf 5 specification (Section 6.1.1.4.5), which in turn references the Unicode Standard (Section 5.18, "Case Mappings"). To achieve this, I have added a llvm::sys::unicode::foldCharSimple function, which performs this mapping. The implementation of this function was generated from the CaseMatching.txt file from the Unicode spec using a python script (which is also included in this patch). The script tries to optimize the function by coalescing adjecant mappings with the same shift and stride (terms I made up). Theoretically, it could be made a bit smarter and merge adjecant blocks that were interrupted by only one or two characters with exceptional mapping, but this would save only a couple of branches, while it would greatly complicate the implementation, so I deemed it was not worth it. Since we assume that the vast majority of the input characters will be US-ASCII, the folding hash function has a fast-path for handling these, and only whips out the full decode+fold+encode logic if we encounter a character outside of this range. It might be possible to implement the folding directly on utf8 sequences, but this would also bring a lot of complexity for the few cases where we will actually need to process non-ascii characters. Reviewers: JDevlieghere, aprantl, probinson, dblaikie Subscribers: mgorny, hintonda, echristo, clayborg, vleschuk, llvm-commits Differential Revision: https://reviews.llvm.org/D42740 llvm-svn: 325732	2018-02-21 22:36:31 +00:00
Tobias Edler von Koch	ba7a1f08da	[Hexagon] Add TargetRegisterInfo::getPointerRegClass() override llvm-svn: 325731	2018-02-21 22:27:07 +00:00
Sanjay Patel	5a6f904520	[InstCombine] add and use Create*FMF functions; NFC llvm-svn: 325730	2018-02-21 22:18:55 +00:00
Lang Hames	a944589cc5	[ORC] Switch to shared_ptr ownership for SymbolSources in VSOs. This makes it easy to free a SymbolSource (and any related resources) when the last reference in a VSO is dropped. llvm-svn: 325727	2018-02-21 21:55:57 +00:00
Lang Hames	589eece132	[ORC] Switch RTDyldObjectLinkingLayer to take a unique_ptr<MemoryBuffer> rather than a shared ObjectFile/MemoryBuffer pair. There's no need to pre-parse the buffer into an ObjectFile before passing it down to the linking layer, and moving the parsing into the linking layer allows us remove the parsing code at each call site. llvm-svn: 325725	2018-02-21 21:55:49 +00:00
Rafael Espindola	9a2bf413a0	Revert "[IRMover] Implement name based structure type mapping" This reverts commit r325686. There was a misunderstanding and this has not been approved yet. llvm-svn: 325715	2018-02-21 20:12:18 +00:00
Evgeniy Stepanov	43271b1803	[hwasan] Fix inline instrumentation. This patch changes hwasan inline instrumentation: Fixes address untagging for shadow address calculation (use 0xFF instead of 0x00 for the top byte). Emits brk instruction instead of hlt for the kernel and user space. Use 0x900 instead of 0x100 for brk immediate (0x100 - 0x800 are unavailable in the kernel). Fixes and adds appropriate tests. Patch by Andrey Konovalov. Differential Revision: https://reviews.llvm.org/D43135 llvm-svn: 325711	2018-02-21 19:52:23 +00:00
Frederich Munch	33ef594c58	Handle IMAGE_REL_AMD64_ADDR32NB in RuntimeDyldCOFF Summary: IMAGE_REL_AMD64_ADDR32NB relocations are currently set to zero in all cases. This patch sets the relocation to the correct value when possible and shows an error when not. Reviewers: enderby, lhames, compnerd Reviewed By: compnerd Subscribers: LepelTsmok, compnerd, martell, llvm-commits Differential Revision: https://reviews.llvm.org/D30709 llvm-svn: 325700	2018-02-21 17:18:20 +00:00
Jonas Paulsson	77cdf3881c	[Hexagon] Return true in enableMultipleCopyHints(). Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Krzysztof Parzyszek llvm-svn: 325697	2018-02-21 16:37:45 +00:00
Simon Pilgrim	82d33b7c44	[X86] LowerBITCAST - pull out repeated calls to getOperand(0). NFCI. llvm-svn: 325695	2018-02-21 16:35:40 +00:00
Jonas Devlieghere	e0af7c390d	[Sparc] Include __tls_get_addr in symbol table for TLS calls to it Global Dynamic and Local Dynamic call relocations only implicitly reference __tls_get_addr; there is no connection in the ELF file between the relocations and the symbol other than the specification for the relocations' semantics. However, it still needs to be in the symbol table despite the lack of explicit references to the symbol table entry, since it needs to be bound at link time for these relocations, otherwise any objects will fail to link. For details, see https://sourceware.org/bugzilla/show_bug.cgi?id=22832. Path by: James Clarke (jrtc27) Differential revision: https://reviews.llvm.org/D43271 llvm-svn: 325688	2018-02-21 15:25:26 +00:00
Silviu Baranga	10ad93c6bf	[SCEV] Temporarily disable loop versioning for the purpose of turning SCEVUnknowns of PHIs into AddRecExprs. This feature is now hidden behind the -scev-version-unknown flag. Fixes PR36032 and PR35432. llvm-svn: 325687	2018-02-21 15:20:32 +00:00
Eugene Leviant	c556974f72	[IRMover] Implement name based structure type mapping Differential revision: https://reviews.llvm.org/D43199 llvm-svn: 325686	2018-02-21 15:13:48 +00:00
Nicolai Haehnle	770397f4cd	AMDGPU: Do not combine loads/store across physreg defs Summary: Since this pass operates on machine SSA form, this should only really affect M0 in practice. Fixes various piglit variable-indexing/vs-varying-array-mat4-index-* Change-Id: Ib2a1dc3a8d7b08225a8da49a86f533faa0986aa8 Fixes: r317751 ("AMDGPU: Merge S_BUFFER_LOAD_DWORD_IMM into x2, x4") Reviewers: arsenm, mareko, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D40343 llvm-svn: 325677	2018-02-21 13:31:35 +00:00
Dmitry Preobrazhensky	d6e1a9404d	[AMDGPU][MC] Added lds support for MUBUF instructions See bug 28234: https://bugs.llvm.org/show_bug.cgi?id=28234 Differential Revision: https://reviews.llvm.org/D43472 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 325676	2018-02-21 13:13:48 +00:00
Vedant Kumar	56492f9177	[BDCE] Salvage debug info from dying insts This results in 15 additional unique source variables in a stage2 build of FileCheck (at '-Os -g'), with a negligible increase in the size of the .debug_loc section. llvm-svn: 325660	2018-02-21 01:55:33 +00:00
Craig Topper	d710adac2d	[X86] Disable CLWB for Cannon Lake Cannon Lake does not support CLWB, therefore it does not include all features listed under SKX anymore. Instead, enumerate all SKX features with the exception of CLWB. Patch by Gabor Buella Differential Revision: https://reviews.llvm.org/D43380 llvm-svn: 325654	2018-02-21 00:15:48 +00:00
Simon Dardis	7bc8ad5849	[mips] Spectre variant two mitigation for MIPSR2 This patch provides mitigation for CVE-2017-5715, Spectre variant two, which affects the P5600 and P6600. It implements the LLVM part of -mindirect-jump=hazard. It is _not_ enabled by default for the P5600. The migitation strategy suggested by MIPS for these processors is to use hazard barrier instructions. 'jalr.hb' and 'jr.hb' are hazard barrier variants of the 'jalr' and 'jr' instructions respectively. These instructions impede the execution of instruction stream until architecturally defined hazards (changes to the instruction stream, privileged registers which may affect execution) are cleared. These instructions in MIPS' designs are not speculated past. These instructions are used with the attribute +use-indirect-jump-hazard when branching indirectly and for indirect function calls. These instructions are defined by the MIPS32R2 ISA, so this mitigation method is not compatible with processors which implement an earlier revision of the MIPS ISA. Performance benchmarking of this option with -fpic and lld using -z hazardplt shows a difference of overall 10%~ time increase for the LLVM testsuite. Certain benchmarks such as methcall show a substantially larger increase in time due to their nature. Reviewers: atanasyan, zoran.jovanovic Differential Revision: https://reviews.llvm.org/D43486 llvm-svn: 325653	2018-02-21 00:06:53 +00:00
Sanjay Patel	6f716a7c5e	[InstCombine] C / -X --> -C / X We already do this in DAGCombiner, but it should also be good to eliminate the fsub use in IR. This is similar to rL325648. llvm-svn: 325649	2018-02-21 00:01:45 +00:00
Sanjay Patel	d8dd0151fc	[InstCombine] -X / C --> X / -C for FP We already do this in DAGCombiner, but it should also be good to eliminate the fsub use in IR. llvm-svn: 325648	2018-02-20 23:51:16 +00:00
Konstantin Zhuravlyov	5c1237a1fd	Revert "[AMDGPU] Increased vector length for global/constant loads." https://reviews.llvm.org/rL325518 It breaks following OpenCL conformance tests: - Basic - parameter_types - Basic - vload_private llvm-svn: 325643	2018-02-20 23:30:21 +00:00
Sanjoy Das	737fa40ffa	[DSE] Don't DSE stores that subsequent memmove calls read from Summary: We used to remove the first memmove in cases like this: memmove(p, p+2, 8); memmove(p, p+2, 8); which is incorrect. Fix this by changing isPossibleSelfRead to what was most likely the intended behavior. Historical note: the buggy code was added in https://reviews.llvm.org/rL120974 to address PR8728. Reviewers: rsmith Subscribers: mcrosier, llvm-commits, jlebar Differential Revision: https://reviews.llvm.org/D43425 llvm-svn: 325641	2018-02-20 23:19:34 +00:00
Lang Hames	919f15a1b4	[PBQP] Fix PR33038 by pruning empty intervals in initializeGraph. Spilling may cause previously non-empty intervals (both for the spilled vreg and others) to become empty. Moving the pruning into initializeGraph catches these cases and fixes PR33038. llvm-svn: 325632	2018-02-20 22:15:09 +00:00
Benjamin Kramer	fd0630665b	[MemoryBuiltins] Check nobuiltin status when identifying calls to free. This is usually not a problem because this code's main purpose is eliminating unused new/delete pairs. We got deletes of nullptr or nobuiltin deletes of builtin new wrong though. llvm-svn: 325630	2018-02-20 22:00:33 +00:00
Sanjay Patel	7365b44b85	[InstCombine] remove unneeded operand swap: NFCI FMul is commutative, so complexity-based canonicalization should always take care of the swap via SimplifyAssociativeOrCommutative(). llvm-svn: 325628	2018-02-20 21:52:46 +00:00
Craig Topper	7fbea20b90	[SelectionDAG] Support known true/false SimplifySetCC cases for comparing against vector splats of constants. This is split off from D42948 and includes just the cases that constant fold to true or false. It also includes some refactoring to keep predicate checks together. This supports things like (setcc uge X, 0) -> true Differential Revision: https://reviews.llvm.org/D43489 llvm-svn: 325627	2018-02-20 21:48:14 +00:00
Evandro Menezes	72f3983633	[AArch64] Refactor instructions using SIMD immediates Get rid of icky goto loops and make the code easier to maintain. Otherwise, NFC. Restore r324903 and fix PR36369. Differentail revision: https://reviews.llvm.org/D43364 llvm-svn: 325621	2018-02-20 20:31:45 +00:00
Teresa Johnson	a344fd3db6	[LTO] Remove unused Path parameter to AddBufferFn Summary: With D43396, no clients use the Path parameter anymore. Depends on D43396. Reviewers: pcc Subscribers: mehdi_amini, inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D43400 llvm-svn: 325619	2018-02-20 20:21:53 +00:00
Sjoerd Meijer	4d5c40492a	[ARM] Lower BR_CC for f16 This case wasn't handled yet. Differential Revision: https://reviews.llvm.org/D43508 llvm-svn: 325616	2018-02-20 19:28:05 +00:00
Krzysztof Parzyszek	f9f2005f94	[Hexagon] Handle *Low8 register classes in early if-conversion llvm-svn: 325606	2018-02-20 18:19:17 +00:00
Craig Topper	df0c22fcd3	[X86] Correct SHRUNKBLEND creation to work correctly when there are multiple uses of the condition. SimplifyDemandedBits forces the demanded mask to all 1s if the node has multiple uses, unless the AssumeSingleUse flag is set. So previously we were only really likely to simplify something if the condition had a single use. And on the off chance we did simplify with multiple uses the demanded mask being used was all ones so there was no reason to create a shrunkblend. This patch now checks that the condition is only used by selects first, and then sets the AssumeSingleUse flag for the simplifcation. Then we convert the selects to shrunkblend, and finally replace condition. Differential Revision: https://reviews.llvm.org/D43446 llvm-svn: 325604	2018-02-20 17:58:17 +00:00
Craig Topper	35801fa5ce	[SelectionDAG] Add LegalTypes flag to getShiftAmountTy. Use it to unify and simplify DAGCombiner and simplifySetCC code and fix a bug. DAGCombiner and SimplifySetCC both use getPointerTy for shift amounts pre-legalization. DAGCombiner uses a single helper function to hide this. SimplifySetCC does it in multiple places. This patch adds a defaulted parameter to getShiftAmountTy that can make it return getPointerTy for scalar types. Use this parameter to simplify the SimplifySetCC and DAGCombiner. Additionally, there were two places in SimplifySetCC that were creating shifts using the target's preferred shift amount pre-legalization. If the target uses a narrow type and the type is illegal, this can cause SimplfiySetCC to create a shift with an amount that can't represent all possible shift values for the type. To fix this we should use pointer type there too. Alternatively we could make getScalarShiftAmountTy for each target return a safe value for large types as proposed in D43445. And maybe we should still do that, but fixing the SimplifySetCC code keeps other targets from tripping over this in the future. Fixes PR36250. Differential Revision: https://reviews.llvm.org/D43449 llvm-svn: 325602	2018-02-20 17:41:05 +00:00
Craig Topper	010ae8dcbb	[X86] Promote 16-bit cmovs to 32-bits This allows us to avoid an opsize prefix. And forcing some move immediates to i32 avoids a length changing prefix on those instructions. This mostly replaces the existing combine we had for zext/sext+cmov of constants. I left in a case for sign extending a 32 bit cmov of constants to 64 bits. Differential Revision: https://reviews.llvm.org/D43327 llvm-svn: 325601	2018-02-20 17:41:00 +00:00
Sanjay Patel	29b98ae337	[InstCombine] remove unneeded dyn_cast to prevent unused variable warning llvm-svn: 325597	2018-02-20 17:14:53 +00:00
Sanjay Patel	b2d978682b	[InstCombine] remove compound fdiv pattern folds These are fdiv-with-constant-divisor, so they already become reciprocal multiplies. The last gap for vector ops should be closed with rL325590. It's possible that we're missing folds for some edge cases with denormal intermediate constants after deleting these, but there are no tests for those patterns, and it would be better to handle denormals more consistently (and less conservatively) as noted in TODO comments. llvm-svn: 325595	2018-02-20 16:52:17 +00:00
Sanjay Patel	90f4c8ec29	[InstCombine] fold fdiv with non-splat divisor to fmul: X/C --> X * (1/C) llvm-svn: 325590	2018-02-20 16:08:15 +00:00
Simon Dardis	d3860e6670	[mips] Correct the definition of cvt.d.w An upcoming patch D41434, changes the ordering of the matcher table for assembly. This patch corrects the definition of the normal MIPS cvt.d.w not to be available in microMIPS. llvm-svn: 325589	2018-02-20 15:55:17 +00:00
Alexey Bataev	0d6aeadc40	[DEBUGINFO] Add support for emission of the inlined strings. Summary: Patch adds an option for emission of inlined strings rather than .debug_str section. Reviewers: echristo, jlebar Subscribers: eraman, llvm-commits, JDevlieghere Differential Revision: https://reviews.llvm.org/D43390 llvm-svn: 325583	2018-02-20 15:28:08 +00:00
Lei Huang	dfd41552f4	[PowerPC] Reduce stack frame for fastcc functions by only allocating parameter save area when needed Current implementation always allocates the parameter save area conservatively for fastcc functions. There is no reason to allocate the parameter save area if all the parameters can be passed via registers. Differential Revision: https://reviews.llvm.org/D42602 llvm-svn: 325581	2018-02-20 15:09:45 +00:00
Krzysztof Parzyszek	b404fae9e3	[Hexagon] Fix alignment calculation of stack objects in Hexagon bit tracker llvm-svn: 325580	2018-02-20 14:29:43 +00:00
Simon Pilgrim	2f29afb439	[VectorLegalizer] Fix uint64_t typo in ExpandUINT_TO_FLOAT (PR36391) ExpandUINT_TO_FLOAT can accept vXi32 or vXi64 inputs, so we need to use a uint64_t shift to generate the 2^(BW/2) constant. No test case unfortunately as no upstream target uses this, but its affecting a downstream target. llvm-svn: 325578	2018-02-20 13:24:24 +00:00
David Green	056476497e	[ARM] Mark -1 as cheap in xor's for thumb1 We can always convert xor %a, -1 into MVN, even in thumb 1 where the -1 would not otherwise be considered a cheap constant. This prevents the -1's from being pulled out into constants and potentially hoisted. Differential Revision: https://reviews.llvm.org/D43451 llvm-svn: 325573	2018-02-20 11:07:35 +00:00
George Rimar	da4f43a4b4	[llvm-mc] - Produce R_X86_64_PLT32 for "call/jmp foo". For instructions like call foo and jmp foo patch changes relocation produced from R_X86_64_PC32 to R_X86_64_PLT32. Relocation can be used as a marker for 32-bit PC-relative branches. Linker will reduce PLT32 relocation to PC32 if function is defined locally. Differential revision: https://reviews.llvm.org/D43383 llvm-svn: 325569	2018-02-20 10:17:57 +00:00
Tim Renouf	8234b4893a	[AMDGPU] stop buffer_store being moved illegally Summary: The machine instruction scheduler was illegally moving a buffer store past a buffer load with the same descriptor and offset. Fixed by marking buffer ops as mayAlias and isAliased. This may be overly conservative, and we may need to revisit. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D43332 Change-Id: Iff3173d9e0653e830474546276ab9d30318b8ef7 llvm-svn: 325567	2018-02-20 10:03:38 +00:00
George Rimar	9712113f8b	[MC] - Don't crash on unclosed frame. llvm-mc can crash when there is cfi_startproc without cfi_end_proc: .text .globl foo foo: .cfi_startproc Testcase shows the issue, patch fixes it. Differential revision: https://reviews.llvm.org/D43456 llvm-svn: 325564	2018-02-20 09:04:13 +00:00
Craig Topper	9256ac1a58	[X86] Add 512-bit unmasked pmulhrsw/pmulhw/pmulhuw intrinsics. Remove and auto upgrade 128/256/512 bit masked pmulhrsw/pmulhw/pmulhuw intrinsics. The 128 and 256 bit versions were already not used by clang. This adds an equivalent unmasked 512 bit version. Then autoupgrades all sizes to use unmasked intrinsics plus select. llvm-svn: 325559	2018-02-20 07:28:14 +00:00
Serge Pavlov	76d8ccee2e	Report fatal error in the case of out of memory This is the second part of recommit of r325224. The previous part was committed in r325426, which deals with C++ memory allocation. Solution for C memory allocation involved functions `llvm::malloc` and similar. This was a fragile solution because it caused ambiguity errors in some cases. In this commit the new functions have names like `llvm::safe_malloc`. The relevant part of original comment is below, updated for new function names. Analysis of fails in the case of out of memory errors can be tricky on Windows. Such error emerges at the point where memory allocation function fails, but manifests itself when null pointer is used. These two points may be distant from each other. Besides, next runs may not exhibit allocation error. In some cases memory is allocated by a call to some of C allocation functions, malloc, calloc and realloc. They are used for interoperability with C code, when allocated object has variable size and when it is necessary to avoid call of constructors. In many calls the result is not checked for null pointer. To simplify checks, new functions are defined in the namespace 'llvm': `safe_malloc`, `safe_calloc` and `safe_realloc`. They behave as corresponding standard functions but produce fatal error if allocation fails. This change replaces the standard functions like 'malloc' in the cases when the result of the allocation function is not checked for null pointer. Finally, there are plain C code, that uses malloc and similar functions. If the result is not checked, assert statement is added. Differential Revision: https://reviews.llvm.org/D43010 llvm-svn: 325551	2018-02-20 05:41:26 +00:00
Amara Emerson	db211892ed	[AArch64][GlobalISel] When copying from a gpr32 to an fpr16 reg, convert to fpr32 first. This is a follow on commit to r[x] where we fix the other direction of copy. For this case, after converting the source from gpr32 -> fpr32, we use a subregister copy, which is essentially what EXTRACT_SUBREG does in SDAG land. https://reviews.llvm.org/D43444 llvm-svn: 325550	2018-02-20 05:11:57 +00:00
Craig Topper	a05ed17316	[X86] Make XOP VPCOM instructions commutable to fold loads during isel. llvm-svn: 325547	2018-02-20 03:58:13 +00:00
Craig Topper	9b64bf54b9	[X86] Make a helper function for commuting AVX512 VPCMP immediates since we do it in two places. llvm-svn: 325546	2018-02-20 03:58:11 +00:00
Sanjay Patel	2816560b2c	[InstCombine] use CreateWithCopiedFlags to reduce code; NFCI Also, move the folds with constants closer to make it easier to follow. llvm-svn: 325541	2018-02-19 23:09:03 +00:00
Brian Gesiak	d1eabb1810	Revert "[mem2reg] Use range loops (NFCI)" This reverts commit r325532. llvm-svn: 325539	2018-02-19 22:48:51 +00:00
Craig Topper	b195ed8ce3	[X86] Use vpmovq2m/vpmovd2m for truncate to vXi1 when possible. Previously we used vptestmd, but the scheduling data for SKX says vpmovq2m/vpmovd2m is lower latency. We already used vpmovb2m/vpmovw2m for byte/word truncates. So this is more consistent anyway. llvm-svn: 325534	2018-02-19 22:07:31 +00:00
Sanjay Patel	1d14779aed	[InstCombine] allow fdiv with constant dividend folds with less than full -ffast-math It's possible that we could allow this either 'arcp' or 'reassoc' alone, but this should be conservatively better than what we have right now. GCC allows this with only -freciprocal-math. The last test is changed to show a case that is expected to fold, but we need D43398. llvm-svn: 325533	2018-02-19 21:46:52 +00:00
Brian Gesiak	49a9d1a4e6	[mem2reg] Use range loops (NFCI) Summary: Several for loops in PromoteMemoryToRegister.cpp leave their increment expression empty, instead incrementing the iterator within the for loop body. I believe this is because these loops were previously implemented as while loops; see https://reviews.llvm.org/rL188327. Incrementing the iterator within the body of the for loop instead of in its increment expression makes it seem like the iterator will be modified or conditionally incremented within the loop, but that is not the case in these loops. Instead, use range loops. Test Plan: `check-llvm` Reviewers: davide, bkramer Reviewed By: davide, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43473 llvm-svn: 325532	2018-02-19 21:44:52 +00:00
Sanjay Patel	e412954953	[InstCombine] refactor fdiv with constant dividend folds; NFC The last fold that used to be here was not necessary. That's a combination of 2 folds (and there's a regression test to show that). The transforms are guarded by isFast(), but that should be loosened. llvm-svn: 325531	2018-02-19 21:17:58 +00:00
Brian Gesiak	58434db098	[Coroutines] Move debug statement before assert Summary: Move a debug statement to above where an assertion is hit, so that the debug statement can be inspected before a stack trace. Test Plan: `check-llvm` llvm-svn: 325529	2018-02-19 20:50:09 +00:00
Craig Topper	e60f1472f1	[X86] Stop swapping the operands of AVX512 setge. We swapped the operands and used setle, but I don't see any reason to do that. I think this is a holdover from SSE where we swap and the invert to use pcmpgt. But with AVX512 we don't want an invert so we won't use pcmpgt. So there's no need to swap. llvm-svn: 325527	2018-02-19 19:23:35 +00:00
Craig Topper	9471a7c898	[X86] Reduce the number of isel pattern variations needed for VPTESTM/VPTESTNM matching. Canonicalize EQ/NE PCMPM to have build vector all zeros on the RHS so we don't have to pattern match it in both locations. This significantly reduces the number of isel patterns needed since we also had to multiply it out with loads being in either operand of the 'and' input node and in the 'and' masking node. This removes over 24000 bytes from the isel table. llvm-svn: 325526	2018-02-19 19:23:31 +00:00
Steven Wu	545d34a272	bitcode support change for fast flags compatibility Summary: The discussion and as per need, each vendor needs a way to keep the old fast flags and the new fast flags in the auto upgrade path of the IR upgrader. This revision addresses that issue. Patched by Michael Berg Reviewers: qcolombet, hans, steven_wu Reviewed By: qcolombet, steven_wu Subscribers: dexonsmith, vsk, mehdi_amini, andrewrk, MatzeB, wristow, spatel Differential Revision: https://reviews.llvm.org/D43253 llvm-svn: 325525	2018-02-19 19:22:28 +00:00

... 2 3 4 5 6 ...

111040 Commits