llvm-project

Commit Graph

Author	SHA1	Message	Date
Daniel Sanders	85fd10bd93	[mips] Range check simm16 Summary: There are too many instructions to exhaustively test so addiu and lwc2 are used as representative examples. It should be noted that many memory instructions that should have simm16 range checking do not because it is also necessary to support the macro of the same name which accepts simm32. The range checks for these occur in the macro expansion. Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18437 llvm-svn: 265019	2016-03-31 14:34:00 +00:00
Daniel Sanders	eab3146156	[mips] Range check simm11 and mem_simm11. Summary: ldc2/sdc2 now emit slightly worse diagnostics for MIPS-I. The problem is that they don't trigger the custom parser because all the candidates are disabled by feature bits. On all other subtargets, the diagnostics are accurate but are subject to the usual issues of needing to report multiple ways to correct the code (e.g. smaller offset, enable a CPU feature) but only being able to report one error. Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18436 llvm-svn: 265018	2016-03-31 14:23:20 +00:00
Dmitry Polukhin	cd835ad876	[IFUNC] Introduce GlobalIndirectSymbol as a base class for alias and ifunc This patch is a part of http://reviews.llvm.org/D15525 GlobalIndirectSymbol class contains common implementation for both aliases and ifuncs. This patch should be NFC change that just prepare common code for ifunc support. Differential Revision: http://reviews.llvm.org/D18433 llvm-svn: 265016	2016-03-31 14:16:21 +00:00
Sam Kolton	1048fb1818	[AMDGPU] Disassembler: support for DPP Review: http://reviews.llvm.org/D18642 llvm-svn: 265015	2016-03-31 14:15:04 +00:00
Daniel Sanders	dc0602a2c2	[mips] Split mem_msa into range checked mem_simm10 and mem_simm10_lsl[123] Summary: Also, made test_mi10.s formatting consistent with the majority of the MC tests. Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18435 llvm-svn: 265014	2016-03-31 14:12:01 +00:00
Nirav Dave	83ce54aac2	Prevent X86ISelLowering from merging volatile loads Change isConsecutiveLoads to check that loads are non-volatile as this is a requirement for any load merges. Propagate change to two callers. Reviewers: RKSimon Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18546 llvm-svn: 265013	2016-03-31 13:40:55 +00:00
Daniel Sanders	2e9f69d933	[mips] Range check simm9 and fix a bug this revealed. Summary: The bug was that microMIPS's [ls]w[lr]e instructions claimed to support a 12-bit offset when it is only 9-bit. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D18434 llvm-svn: 265010	2016-03-31 13:15:23 +00:00
Zlatko Buljan	6221be8e46	[mips][microMIPS] Implement MFC, MFHC and DMFC* instructions Differential Revision: http://reviews.llvm.org/D17334 llvm-svn: 265002	2016-03-31 08:51:24 +00:00
Jonas Paulsson	2ba315218b	Indentation fix in SystemZInstrInfo.cpp llvm-svn: 265000	2016-03-31 08:00:14 +00:00
Sanjoy Das	56df0ec610	[InstCombine] Fix incorrect rule from rL236202 The rule for SMIN introduced in rL236202 doesn't work as advertised: the check for Pred == ICmpInst::ICMP_SGT was missing. llvm-svn: 264996	2016-03-31 05:14:34 +00:00
Sanjoy Das	c9d6d8b106	Delete trailing whitespace llvm-svn: 264995	2016-03-31 05:14:29 +00:00
Sanjoy Das	e12c0e5159	[SCEV] Track NoWrap properties using MatchBinaryOp, NFC This way once we teach MatchBinaryOp to map more things into arithmetic, the non-wrapping add recurrence construction would understand it too. Right now MatchBinaryOp still only understands arithmetic, so this is solely a code-reorganization change. llvm-svn: 264994	2016-03-31 05:14:26 +00:00
Sanjoy Das	118d919a6a	[SCEV] NFC code motion to simplify later change llvm-svn: 264993	2016-03-31 05:14:22 +00:00
Craig Topper	d2aa03a60a	[X86] Use MVT instead of EVT in code called after legalization. llvm-svn: 264992	2016-03-31 04:37:41 +00:00
Hal Finkel	851b33a0b1	[PowerPC] Load two floats directly instead of using one 64-bit integer load When dealing with complex<float>, and similar structures with two single-precision floating-point numbers, especially when such things are being passed around by value, we'll sometimes end up loading both float values by extracting them from one 64-bit integer load. It looks like this: t13: i64,ch = load<LD8[%ref.tmp]> t0, t6, undef:i64 t16: i64 = srl t13, Constant:i32<32> t17: i32 = truncate t16 t18: f32 = bitcast t17 t19: i32 = truncate t13 t20: f32 = bitcast t19 The problem, especially before the P8 where those bitcasts aren't legal (and get expanded via the stack), is that it would have been better to use two floating-point loads directly. Here we add a target-specific DAGCombine to do just that. In short, we turn: ld 3, 0(5) stw 3, -8(1) rldicl 3, 3, 32, 32 stw 3, -4(1) lfs 3, -4(1) lfs 0, -8(1) into: lfs 3, 4(5) lfs 0, 0(5) llvm-svn: 264988	2016-03-31 02:56:05 +00:00
Sanjoy Das	021de058df	Introduce a @llvm.experimental.guard intrinsic Summary: As discussed on llvm-dev[1]. This change adds the basic boilerplate code around having this intrinsic in LLVM: - Changes in Intrinsics.td, and the IR Verifier - A lowering pass to lower @llvm.experimental.guard to normal control flow - Inliner support [1]: http://lists.llvm.org/pipermail/llvm-dev/2016-February/095523.html Reviewers: reames, atrick, chandlerc, rnk, JosephTremoulet, echristo Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18527 llvm-svn: 264976	2016-03-31 00:18:46 +00:00
Hans Wennborg	6596977130	[X86] Enable call frame optimization ("mov to push") not only for optsize (PR26325) The size savings are significant, and from what I can tell, both ICC and GCC do this. Differential Revision: http://reviews.llvm.org/D18573 llvm-svn: 264966	2016-03-30 23:38:01 +00:00
Matthias Braun	8d41436004	CodeGen: Factor out code for tail call result compatibility check; NFC llvm-svn: 264959	2016-03-30 22:46:04 +00:00
Matt Arsenault	2fe4fbc184	AMDGPU: Add frexp_exp intrinsic llvm-svn: 264944	2016-03-30 22:28:52 +00:00
Matt Arsenault	5cd4f8f89f	AMDGPU: Constant folding for frexp_mant llvm-svn: 264943	2016-03-30 22:28:26 +00:00
Teresa Johnson	d8d94652b2	Use existing PrintEscapedString in AssemblyWriter r264884 introduced a helper to escape the backslashes in the source file path, but I since discovered an existing mechanism to escape strings. llvm-svn: 264936	2016-03-30 22:17:28 +00:00
Peter Collingbourne	2bc252acd5	Cloning: Reduce complexity of debug info cloning and fix correctness issue. Commit r260791 contained an error in that it would introduce a cross-module reference in the old module. It also introduced O(N^2) complexity in the module cloner by requiring the entire module to be visited for each function. Fix both of these problems by avoiding use of the CloneDebugInfoMetadata function (which is only designed to do intra-module cloning) and cloning function-attached metadata in the same way that we clone all other metadata. Differential Revision: http://reviews.llvm.org/D18583 llvm-svn: 264935	2016-03-30 22:05:13 +00:00
Aaron Ballman	ef0fe1eed8	Silencing warnings from MSVC 2015 Update 2. All of these changes silence "C4334 '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)". NFC. llvm-svn: 264929	2016-03-30 21:30:00 +00:00
Matt Arsenault	46ba31650e	LegalizeDAG: Don't replace vector store with integer if not legal For the same reason as the corresponding load change. Note that ExpandStore is completely broken for non-byte sized element vector stores, but preserve the current broken behavior which has tests for it. The behavior should be the same, but now introduces a new typed store that is incorrectly split later rather than doing it directly. llvm-svn: 264928	2016-03-30 21:15:18 +00:00
Matt Arsenault	a4b1b6ea05	LegalizeDAG: Don't replace vector load with integer unless legal On AMDGPU we want to be able to promote i64/f64 loads to v2i32. If the access is unaligned, this would conclude that since i64 is legal, it would convert it back to i64 and there is an endless legalization loop. Extract the logic for scalarizing the load into a new TargetLowering function, where this can also replace the custom function AMDGPU has for this. llvm-svn: 264927	2016-03-30 21:15:10 +00:00
David Majnemer	5d518386b6	[IndVarSimplify] Don't insert after a catchswitch Widening a PHI requires us to insert a trunc. The logical place for this trunc is in the same BB as the PHI. This is not possible if the BB is terminated by a catchswitch. This fixes PR27133. llvm-svn: 264926	2016-03-30 21:12:06 +00:00
Simon Pilgrim	c49bd2ede0	[X86][AVX] Ensure EltsFromConsecutiveLoads tests the entire vector for consecutive loads/zeros Fix for issue introduced D17297, where we were breaking early from the loop detecting consecutive loads which could leave us thinking a consecutive load with zeros was possible. llvm-svn: 264922	2016-03-30 20:52:24 +00:00
Justin Lebar	e3804cc932	[NVPTX] Make NVVMReflect a function pass. Summary: Currently it's a module pass. Make it a function pass so that we can move it to PassManagerBuilder's EP_EarlyAsPossible extension point, which only accepts function passes. Reviewers: rnk Subscribers: tra, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D18615 llvm-svn: 264919	2016-03-30 20:40:11 +00:00
Justin Lebar	2fe1323112	[PassManager] Make PassManagerBuilder::addExtension take an std::function, rather than a function pointer. Summary: This gives callers flexibility to pass lambdas with captures, which lets callers avoid the C-style void*-ptr closure style. (Currently, callers in clang store state in the PassManagerBuilderBase arg.) No functional change, and the new API is backwards-compatible. Reviewers: chandlerc Subscribers: joker.eph, cfe-commits Differential Revision: http://reviews.llvm.org/D18613 llvm-svn: 264918	2016-03-30 20:39:29 +00:00
Hal Finkel	2e0ff2b244	[LoopVectorize] Don't vectorize loops when everything will be scalarized This change prevents the loop vectorizer from vectorizing when all of the vector types it generates will be scalarized. I've run into this problem on the PPC's QPX vector ISA, which only holds floating-point vector types. The loop vectorizer will, however, happily vectorize loops with purely integer computation. Here's an example: LV: The Smallest and Widest types: 32 / 32 bits. LV: The Widest register is: 256 bits. LV: Found an estimated cost of 0 for VF 1 For instruction: %indvars.iv25 = phi i64 [ 0, %entry ], [ %indvars.iv.next26, %for.body ] LV: Found an estimated cost of 0 for VF 1 For instruction: %arrayidx = getelementptr inbounds [1600 x i32], [1600 x i32]* %a, i64 0, i64 %indvars.iv25 LV: Found an estimated cost of 0 for VF 1 For instruction: %2 = trunc i64 %indvars.iv25 to i32 LV: Found an estimated cost of 1 for VF 1 For instruction: store i32 %2, i32* %arrayidx, align 4 LV: Found an estimated cost of 1 for VF 1 For instruction: %indvars.iv.next26 = add nuw nsw i64 %indvars.iv25, 1 LV: Found an estimated cost of 1 for VF 1 For instruction: %exitcond27 = icmp eq i64 %indvars.iv.next26, 1600 LV: Found an estimated cost of 0 for VF 1 For instruction: br i1 %exitcond27, label %for.cond.cleanup, label %for.body LV: Scalar loop costs: 3. LV: Found an estimated cost of 0 for VF 2 For instruction: %indvars.iv25 = phi i64 [ 0, %entry ], [ %indvars.iv.next26, %for.body ] LV: Found an estimated cost of 0 for VF 2 For instruction: %arrayidx = getelementptr inbounds [1600 x i32], [1600 x i32]* %a, i64 0, i64 %indvars.iv25 LV: Found an estimated cost of 0 for VF 2 For instruction: %2 = trunc i64 %indvars.iv25 to i32 LV: Found an estimated cost of 2 for VF 2 For instruction: store i32 %2, i32* %arrayidx, align 4 LV: Found an estimated cost of 1 for VF 2 For instruction: %indvars.iv.next26 = add nuw nsw i64 %indvars.iv25, 1 LV: Found an estimated cost of 1 for VF 2 For instruction: %exitcond27 = icmp eq i64 %indvars.iv.next26, 1600 LV: Found an estimated cost of 0 for VF 2 For instruction: br i1 %exitcond27, label %for.cond.cleanup, label %for.body LV: Vector loop of width 2 costs: 2. LV: Found an estimated cost of 0 for VF 4 For instruction: %indvars.iv25 = phi i64 [ 0, %entry ], [ %indvars.iv.next26, %for.body ] LV: Found an estimated cost of 0 for VF 4 For instruction: %arrayidx = getelementptr inbounds [1600 x i32], [1600 x i32]* %a, i64 0, i64 %indvars.iv25 LV: Found an estimated cost of 0 for VF 4 For instruction: %2 = trunc i64 %indvars.iv25 to i32 LV: Found an estimated cost of 4 for VF 4 For instruction: store i32 %2, i32* %arrayidx, align 4 LV: Found an estimated cost of 1 for VF 4 For instruction: %indvars.iv.next26 = add nuw nsw i64 %indvars.iv25, 1 LV: Found an estimated cost of 1 for VF 4 For instruction: %exitcond27 = icmp eq i64 %indvars.iv.next26, 1600 LV: Found an estimated cost of 0 for VF 4 For instruction: br i1 %exitcond27, label %for.cond.cleanup, label %for.body LV: Vector loop of width 4 costs: 1. ... LV: Selecting VF: 8. LV: The target has 32 registers LV(REG): Calculating max register usage: LV(REG): At #0 Interval # 0 LV(REG): At #1 Interval # 1 LV(REG): At #2 Interval # 2 LV(REG): At #4 Interval # 1 LV(REG): At #5 Interval # 1 LV(REG): VF = 8 The problem is that the cost model here is not wrong, exactly. Since all of these operations are scalarized, their cost (aside from the uniform ones) are indeed VF*(scalar cost), just as the model suggests. In fact, the larger the VF picked, the lower the relative overhead from the loop itself (and the induction-variable update and check), and so in a sense, picking the largest VF here is the right thing to do. The problem is that vectorizing like this, where all of the vectors will be scalarized in the backend, isn't really vectorizing, but rather interleaving. By itself, this would be okay, but then the vectorizer itself also interleaves, and that's where the problem manifests itself. There's aren't actually enough scalar registers to support the normal interleave factor multiplied by a factor of VF (8 in this example). In other words, the problem with this is that our register-pressure heuristic does not account for scalarization. While we might want to improve our register-pressure heuristic, I don't think this is the right motivating case for that work. Here we have a more-basic problem: The job of the vectorizer is to vectorize things (interleaving aside), and if the IR it generates won't generate any actual vector code, then something is wrong. Thus, if every type looks like it will be scalarized (i.e. will be split into VF or more parts), then don't consider that VF. This is not a problem specific to PPC/QPX, however. The problem comes up under SSE on x86 too, and as such, this change fixes PR26837 too. I've added Sanjay's reduced test case from PR26837 to this commit. Differential Revision: http://reviews.llvm.org/D18537 llvm-svn: 264904	2016-03-30 19:37:08 +00:00
Rong Xu	b534166fd4	[PGO] PGOFuncName in LTO optimizations PGOFuncNames are used as the key to retrieve the Function definition from the MD5 stored in the profile. For internal linkage function, we prefix the source file name to the PGOFuncNames. LTO's internalization privatizes many global linkage symbols. This happens after value profile annotation, but those internal linkage functions should not have a source prefix. To differentiate compiler generated internal symbols from original ones, PGOFuncName meta data are created and attached to the original internal symbols in the value profile annotation step. If a symbol does not have the meta data, its original linkage must be non-internal. Also add a new map that maps PGOFuncName's MD5 value to the function definition. Differential Revision: http://reviews.llvm.org/D17895 llvm-svn: 264902	2016-03-30 18:37:52 +00:00
Teresa Johnson	83c517c44e	Restore "[ThinLTO] Serialize the Module SourceFileName to/from LLVM assembly" This restores commit 264869, with a fix for windows bots to properly escape '\' in the path when serializing out. Added test. llvm-svn: 264884	2016-03-30 18:15:08 +00:00
Chad Rosier	f7ac5f28ab	[AArch64] Fix warnings pointed out by Hal. llvm-svn: 264882	2016-03-30 18:08:51 +00:00
Rong Xu	311ada11f8	[PGO] Use ArrayRef in annotateValueSite() Using ArrayRef in annotateValueSite's parameter instead of using an array and it's size. Differential Revision: http://reviews.llvm.org/D18568 llvm-svn: 264879	2016-03-30 16:56:31 +00:00
Tom Stellard	1d5e6d4bdc	AMDGPU/SI: Improve MachineSchedModel definition This patch contains a few improvements to the model, including: - Using a single resource with a defined buffers size for each memory unit. - Setting the IssueWidth correctly. - Fixing latency values for memory instructions. shader-db stats: 16429 shaders in 3231 tests Totals: SGPRS: 318232 -> 312328 (-1.86 %) VGPRS: 208996 -> 209346 (0.17 %) Code Size: 7147044 -> 7166440 (0.27 %) bytes LDS: 83 -> 83 (0.00 %) blocks Scratch: 1862656 -> 1459200 (-21.66 %) bytes per wave Max Waves: 49182 -> 49243 (0.12 %) Wait states: 0 -> 0 (0.00 %)A Differential Revision: http://reviews.llvm.org/D18453 llvm-svn: 264877	2016-03-30 16:35:13 +00:00
Tom Stellard	0bc954e3bc	AMDGPU/SI: Enable lanemask tracking in misched Summary: This results in higher register usage, but should make it easier for the compiler to hide latency. This pass is a prerequisite for some more scheduler improvements, and I think the increase register usage with this patch is acceptable, because when combined with the scheduler improvements, the total register usage will decrease. shader-db stats: 2382 shaders in 478 tests Totals: SGPRS: 48672 -> 49088 (0.85 %) VGPRS: 34148 -> 34847 (2.05 %) Code Size: 1285816 -> 1289128 (0.26 %) bytes LDS: 28 -> 28 (0.00 %) blocks Scratch: 492544 -> 573440 (16.42 %) bytes per wave Max Waves: 6856 -> 6846 (-0.15 %) Wait states: 0 -> 0 (0.00 %) Depends on D18451 Reviewers: nhaehnle, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18452 llvm-svn: 264876	2016-03-30 16:35:09 +00:00
Jonas Paulsson	f76123386a	[SystemZ] Add nop and nopr InstAliases. For compatability with GAS, nop and nopr are recognized as alises for bc and bcr, respectively. A mask of 0 turns these instructions effectively into no-operations. Reviewed by Ulrich Weigand. llvm-svn: 264875	2016-03-30 16:11:58 +00:00
Nirav Dave	8dd66e5753	Remove HasFnAttribute guards to getFnAttribute calls These checks are redundant and can be removed Reviewers: hans Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D18564 llvm-svn: 264872	2016-03-30 15:41:12 +00:00
Teresa Johnson	20beeea24a	Revert "[ThinLTO] Serialize the Module SourceFileName to/from LLVM assembly" This reverts commit r264869. I am seeing Windows bot failures due to the "\" in the path being mishandled at some point (seems to be interpreted wrongly at some point and llvm-as \| llvm-dis is yielding some junk characters). Need to investigate. llvm-svn: 264871	2016-03-30 15:16:04 +00:00
Simon Pilgrim	b87ffe8519	[X86][XOP] BITREVERSE lowering using VPPERM XOP's VPPERM has some great 'permute operations' that it can do as well as part of shuffling the bytes of a 128-bit vector - in this case we use it to perform BITREVERSE in a single instruction. llvm-svn: 264870	2016-03-30 14:14:00 +00:00
Teresa Johnson	832a6790f6	[ThinLTO] Serialize the Module SourceFileName to/from LLVM assembly Summary: This change serializes out and in the SourceFileName to LLVM assembly so that it is preserved through "llvm-dis \| llvm-as". This is necessary to ensure that the global identifiers created for local values in the module summary index are the same even if the bitcode is streamed out and read back from LLVM assembly. Serializing the summary itself to LLVM assembly is in progress. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18588 llvm-svn: 264869	2016-03-30 14:00:02 +00:00
Benjamin Kramer	9415e06da7	[NVPTX] Avoid temporary std::string and make single-use function local to the cpp file. No functionality change intended. llvm-svn: 264861	2016-03-30 12:31:51 +00:00
James Molloy	8e46cd05a1	[VectorUtils] Don't try and truncate PHIs to a smaller bitwidth We already try not to truncate PHIs in computeMinimalBitwidths. LoopVectorize can't handle it and we really don't need to, because both induction and reduction PHIs are truncated by other means. However, we weren't bailing out in all the places we should have, and we ended up by returning a PHI to be truncated, which has caused PR27018. This fixes PR17018. llvm-svn: 264852	2016-03-30 10:11:43 +00:00
Chandler Carruth	8e06a10d1f	[x86] Fix a horrible bug in our lowering of x86 floating point atomic operations. Specifically, we had code that tried to badly approximate reconstructing all of the possible variations on addressing modes in two x86 instructions based on those in one pseudo instruction. This is not the first bug uncovered with doing this, so stop doing it altogether. Instead generically and pedantically copy every operand from the address over to both new instructions, and strip kill flags from any register operands. This fixes a subtle bug seen in the wild where we would mysteriously drop parts of the addressing mode, causing for example the index argument in the added test case to just be completely ignored. Hypothetically, this was an extremely bad miscompile because it actually caused a predictable and leveragable write of a 64bit quantity to an unintended offset (the first element of the array intead of whatever other element was intended). As a consequence, in theory this could even have introduced security vulnerabilities. However, this was only something that could happen with an atomic floating point add. No other operation could trigger this bug, so it seems extremely unlikely to have occured widely in the wild. But it did in fact occur, and frequently in scientific applications which were using relaxed atomic updates of a floating point value after adding a delta. Those would end up being quite badly miscompiled by LLVM, which is how we found this. Of course, this often looks like a race condition in the code, but it was actually a miscompile. I suspect that this whole RELEASE_FADD thing was a complete mistake. There is no such operation, and I worry that anything other than add will get remarkably worse codegeneration. But that's not for this change.... llvm-svn: 264845	2016-03-30 08:41:59 +00:00
Duncan P. N. Exon Smith	9071729966	IR: Constify LLVMContext::discardValueNames, NFC llvm-svn: 264823	2016-03-30 04:32:29 +00:00
Duncan P. N. Exon Smith	7457ecbebe	BitcodeReader: Fix weird whitespace, NFC llvm-svn: 264822	2016-03-30 04:21:52 +00:00
George Burgess IV	49cad7d70b	[MemorySSA] Make the visitor more careful with calls. Prior to this patch, the MemorySSA caching visitor would cache all calls that it visited. When paired with phi optimization, this can be problematic. Consider: define void @foo() { ; 1 = MemoryDef(liveOnEntry) call void @clobberFunction() br i1 undef, label %if.end, label %if.then if.then: ; MemoryUse(??) call void @readOnlyFunction() ; 2 = MemoryDef(1) call void @clobberFunction() br label %if.end if.end: ; 3 = MemoryPhi(...) ; MemoryUse(?) call void @readOnlyFunction() ret void } When optimizing MemoryUse(?), we visit defs 1 and 2, so we note to cache them later. We ultimately end up not being able to optimize passed the Phi, so we set MemoryUse(?) to point to the Phi. We then cache the clobbering call for def 1 to be the Phi. This commit changes this behavior so that we wipe out any calls added to VisistedCalls while visiting the defs of a phi we couldn't optimize. Aside: With this patch, we now can bootstrap clang/LLVM without a single MemorySSA verifier failure. Woohoo. :) llvm-svn: 264820	2016-03-30 03:12:08 +00:00
Chandler Carruth	81c3ddeb1c	[x86] Extract a helper function to compute the full addressing mode from an x86 MachineInstr's operands. This will be super useful to fix some bad atomics code in my next commit. No functionality changed. llvm-svn: 264819	2016-03-30 03:10:24 +00:00
Xinliang David Li	a55fd1a9dc	[PGO] Handle invoke inst in IR based icall instrumentation Differential Revision: http://reviews.llvm.org/D18580 llvm-svn: 264818	2016-03-30 02:16:07 +00:00
George Burgess IV	82ee942a8c	[MemorySSA] Change how the walker views/walks visited phis. This patch teaches the caching MemorySSA walker a few things: 1. Not to walk Phis we've walked before. It seems that we tried to do this before, but it didn't work so well in cases like: define void @foo() { %1 = alloca i8 %2 = alloca i8 br label %begin begin: ; 3 = MemoryPhi({%0,liveOnEntry},{%end,2}) ; 1 = MemoryDef(3) store i8 0, i8* %2 br label %end end: ; MemoryUse(?) load i8, i8* %1 ; 2 = MemoryDef(1) store i8 0, i8* %2 br label %begin } Because we wouldn't put Phis in Q.Visited until we tried to visit them. So, when trying to optimize MemoryUse(?): - We would visit 3 above - ...Which would make us put {%0,liveOnEntry} in Q.Visited - ...Which would make us visit {%0,liveOnEntry} - ...Which would make us put {%end,2} in Q.Visited - ...Which would make us visit {%end,2} - ...Which would make us visit 3 - ...Which would realize we've already visited everything in 3 - ...Which would make us conservatively return 3. In the added test-case, (@looped_visitedonlyonce) this behavior would cause us to give incorrect results. Specifically, we'd visit 4 twice in the same query, but on the second visit, we'd skip while.cond because it had been visited, visit if.then/if.then2, and cache "1" as the clobbering def on the way back. 2. If we try to walk the defs of a {Phi,MemLoc} and see it has been visited before, just hand back the Phi we're trying to optimize. I promise this isn't as terrible as it seems. :) We now insert {Phi,MemLoc} pairs just before walking the Phi's upward defs. So, we check the cache for the {Phi,MemLoc} pair before checking if we've already walked the Phi. The {Phi,MemLoc} pair is (almost?) always guaranteed to have a cache entry if we've already fully walked it, because we cache as we go. So, if the {Phi,MemLoc} pair isn't in cache, either: (a) we must be in the process of visiting it (in which case, we can't give a better answer in a cache-as-we-go DFS walker) (b) we visited it, but didn't cache it on the way back (...which seems to require `ModifyingAccess` to not dominate `StartingAccess`, so I'm 99% sure that would be an error. If it's not an error, I haven't been able to get it to happen locally, so I suspect it's rare.) - - - - - As a consequence of this change, we no longer skip upward defs of phis, so we can kill the `VisitedOnlyOne` check. This gives us better accuracy than we had before, at the cost of potentially doing a bit more work when we have a loop. llvm-svn: 264814	2016-03-30 00:26:26 +00:00
Adam Nemet	fb8fbba584	[Aarch64] Turn on the LoopDataPrefetch pass for Cyclone llvm-svn: 264811	2016-03-30 00:21:29 +00:00
Adam Nemet	b81f1e0db3	[PPC] Remove -ppc-loop-prefetch-distance in favor of -prefetch-distance After the previous change, this can now be overridden centrally in the pass. llvm-svn: 264807	2016-03-29 23:45:56 +00:00
Adam Nemet	1428d41f9a	[LoopDataPrefetch] Centralize the tuning cl::opts under the pass This is effectively NFC, minus the renaming of the options (-cyclone-prefetch-distance -> -prefetch-distance). The change was requested by Tim in D17943. llvm-svn: 264806	2016-03-29 23:45:52 +00:00
Anna Zaks	1a470b6f7c	[tsan] Do not instrument reads/writes to instruction profile counters. We have known races on profile counters, which can be reproduced by enabling -fsanitize=thread and -fprofile-instr-generate simultaneously on a multi-threaded program. This patch avoids reporting those races by not instrumenting the reads and writes coming from the instruction profiler. llvm-svn: 264805	2016-03-29 23:19:40 +00:00
Kostya Serebryany	9e1a238357	[libFuzzer] more docs llvm-svn: 264803	2016-03-29 23:07:36 +00:00
Duncan P. N. Exon Smith	e8eb94a9a5	ADCE: Remove debug info intrinsics in dead scopes During ADCE, track which debug info scopes still have live references from the code, and delete debug info intrinsics for the dead ones. These intrinsics describe the locations of variables (in registers or stack slots). If there's no code left corresponding to a variable's scope, then there's no way to reference the variable in the debugger and it doesn't matter what its value is. I add a DEBUG printout when the described location in an SSA register, in case it helps some trying to track down why locations get lost. However, we still delete these; the scope itself isn't attached to any real code, so the ship has already sailed. llvm-svn: 264800	2016-03-29 22:57:12 +00:00
Fiona Glaser	44a2f7a298	MachineSink: make shouldSink a TII target hook Some targets may disagree on what they want sunk or not sunk, so make this a target hook instead of hardcoded. llvm-svn: 264799	2016-03-29 22:44:57 +00:00
Adam Nemet	85fba39390	[LoopDataPrefetch] Make more member functions private, NFC. llvm-svn: 264798	2016-03-29 22:40:02 +00:00
Derek Schuff	07636cd5e7	Add a print method to MachineFunctionProperties for better error messages This makes check failures much easier to understand. Make it empty (but leave it in the class) for NDEBUG builds. Differential Revision: http://reviews.llvm.org/D18529 llvm-svn: 264780	2016-03-29 20:28:20 +00:00
James Y Knight	7306cd47d4	[SPARC] Use AtomicExpandPass to expand AtomicRMW instructions. They were previously expanded to CAS loops in a custom isel expansion, but AtomicExpandPass knows how to do that generically. Testing is covered by the existing sparc atomics.ll testcases. llvm-svn: 264771	2016-03-29 19:09:54 +00:00
Matthias Braun	72a58c3e28	MachineVerifier: On dead-def live segments, check that corresponding machine operand has a dead flag llvm-svn: 264769	2016-03-29 19:07:43 +00:00
Matthias Braun	1c20c8280a	LiveVariables: Fix typo and shorten comment llvm-svn: 264768	2016-03-29 19:07:40 +00:00
Duncan P. N. Exon Smith	40b44e1d0a	IR: Add DbgInfoIntrinsic::getVariableLocation Create a common accessor, DbgInfoIntrinsic::getVariableLocation, which doesn't care about the type of debug info intrinsic. Use this to further unify the implementations of DbgDeclareInst::getAddress and DbgValueInst::getValue. Besides being a cleanup, I'm planning to use this to prepare DEBUG output without having to branch on the concrete type. llvm-svn: 264767	2016-03-29 18:56:03 +00:00
Teresa Johnson	b703c77b03	[ThinLTO] Remove post-pass metadata linking support Since we have moved to a model where functions are imported in bulk from each source module after making summary-based importing decisions, there is no longer a need to link metadata as a postpass, and all users have been removed. This essentially reverts r255909 and follow-on fixes. llvm-svn: 264763	2016-03-29 18:24:19 +00:00
Nirav Dave	2aab7f4358	Add support for no-jump-tables Add function soft attribute to the generation of Jump Tables in CodeGen as initial step towards clang support of gcc's no-jump-table support Reviewers: hans, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18321 llvm-svn: 264756	2016-03-29 17:46:23 +00:00
Derek Schuff	42666eeea2	Add MachineVerifier check for AllVRegsAllocated MachineFunctionProperty Summary: Check that any function that has the property set is free of virtual register operands. Also, it is actually VirtRegMap (and not the register allocators) that acutally remove the VReg operands (except for RegAllocFast). Reviewers: qcolombet Subscribers: MatzeB, llvm-commits, qcolombet Differential Revision: http://reviews.llvm.org/D18535 llvm-svn: 264755	2016-03-29 17:40:22 +00:00
Manman Ren	f46262e0b7	Swift Calling Convention: add swiftself attribute. Differential Revision: http://reviews.llvm.org/D17866 llvm-svn: 264754	2016-03-29 17:37:21 +00:00
Sanjoy Das	2381fcd557	[SCEV] Extract out a MatchBinaryOp; NFCI MatchBinaryOp abstracts out the IR instructions from the operations they represent. While this change is NFC, we will use this factoring later to map things like `(extractvalue 0 (sadd.with.overflow X Y))` to `(add X Y)`. llvm-svn: 264747	2016-03-29 16:40:44 +00:00
Sanjoy Das	260ad4dd63	[SCEV] Use Operator::getOpcode instead of manual dispatch; NFC llvm-svn: 264746	2016-03-29 16:40:39 +00:00
Justin Lebar	3db0b85fc8	Make InlineSimple's one-arg constructor explicit. NFC llvm-svn: 264744	2016-03-29 16:26:06 +00:00
Justin Lebar	bd145b3cb2	Reformat a comment in InlineSimple.cpp. NFC llvm-svn: 264743	2016-03-29 16:26:03 +00:00
Konstantin Zhuravlyov	ecc7cbf611	Test commit access llvm-svn: 264736	2016-03-29 15:15:44 +00:00
Teresa Johnson	efeae0e210	[ThinLTO] Use new GlobalValue::getGUID helper (NFC) This was already being used for functions and aliases, was missed when handling global variables. llvm-svn: 264734	2016-03-29 14:49:26 +00:00
Simon Dardis	9a3f32c00d	[mips] Test commit: Mark insertNoop as dead code (NFC) llvm-svn: 264728	2016-03-29 13:02:19 +00:00
Daniel Sanders	5d3840fdf9	[mips] Correct MIPS16 jal/jalx to have uimm26 offsets and add MC layer range checks. NFC. Summary: However, this has no effect at this time because the instructions affected are marked 'isCodeGenOnly=1' and have no alternative for the MC layer. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D18179 llvm-svn: 264712	2016-03-29 09:40:38 +00:00
Elena Demikhovsky	95629caaa9	AVX-512: fixed a bug in fp_to_uint pattern on KNL Fixed fp_to_uint instruction selection on KNL. One pattern was missing for <4 x double> to <4 x i32> Differential Revision: http://reviews.llvm.org/D18512 llvm-svn: 264701	2016-03-29 06:33:41 +00:00
Duncan P. N. Exon Smith	bb7ce3b850	BitcodeReader: Allow METADATA_STRINGS to only have !"" Support parsing a METADATA_STRINGS record that only has a single piece of metadata, !"". Fixes a corner case in r264551. llvm-svn: 264699	2016-03-29 05:25:17 +00:00
Hyojin Sung	4673f10568	[SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes is currently used to recognize potential loops of which the block is the header and keep the block. However, the current algorithm fails if the loops' exit condition is evaluated only with volatile values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested loop, the loop is collapsed into a single loop which prevent later optimizations from being applied (e.g., transforming nested loops into simplified forms and loop vectorization). The patch augments the existing PHI node-based check by adding a pre-test if the BB actually belongs to a set of loop headers and not eliminating it if yes. llvm-svn: 264697	2016-03-29 04:08:57 +00:00
Matthias Braun	f54530ef00	RegisterPressure: Simplify liveness tracking when lanemasks are not checked. Split RegisterOperands code that collects defs/uses into a variant with and without lanemask tracking. This is a bit of code duplication, but there are enough subtle differences between the two variants that this seems cleaner (and potentially faster). This also fixes a problem where lanes where tracked even though TrackLaneMasks was false. This is part of the fix for http://llvm.org/PR27106. I will commit the testcase when it is completely fixed. llvm-svn: 264696	2016-03-29 03:54:22 +00:00
Matthias Braun	82cff88691	LiveVariables: Do not remove dead flags from vreg operands Also add a FIXME comment on why Mips RDDSP causes bogus dead flags to be added which LiveVariables cleans up by accident. llvm-svn: 264695	2016-03-29 03:08:18 +00:00
Hal Finkel	fa7057a415	[PowerPC] Refactor popcnt[dw] target features Instead of using two feature bits, one to indicate the availability of the popcnt[dw] instructions, and another to indicate whether or not they're fast, use a single enum. This allows more consistent control via target attribute strings, and via Clang's command line. llvm-svn: 264690	2016-03-29 01:36:01 +00:00
Kyle Butt	5e241b11ed	[Codegen] Decrease minimum jump table density. Minimum density for both optsize and non optsize are now options -sparse-jump-table-density (default 10) for non optsize functions -dense-jump-table-density (default 40) for optsize functions, which matches the current default. This improves several benchmarks at google at the cost of a small codesize increase. For code compiled with -Os, the old behavior continues llvm-svn: 264689	2016-03-29 00:23:41 +00:00
Easwaran Raman	6f4903d985	Sample profile summary cleanup Replace references to MaxHeadSamples with MaxFunctionCount Differential Revision: http://reviews.llvm.org/D18522 llvm-svn: 264686	2016-03-28 23:14:29 +00:00
Derek Schuff	ecabac6244	[WebAssembly] Remove duplicate disabling of passes Also put all the disabled passes together llvm-svn: 264684	2016-03-28 22:52:20 +00:00
Hal Finkel	69ada2f514	[PowerPC] Clarify a comment in PPCTTI about vector loads This should say that we could do unaligned vector loads on the P7 using VSX instructions, not that we should. llvm-svn: 264683	2016-03-28 22:39:35 +00:00
Evgeniy Stepanov	f575b2687c	Remove personality for declarations in CloneModule. Personality is copied as part of copyFunctionAttributes, but it is invalid on a declaration. Remove the personality attribute it the function body is not cloned. Also add a verifier run over output modules in the llvm-split tool. llvm-svn: 264667	2016-03-28 21:37:02 +00:00
Simon Pilgrim	d3df400fa9	[X86][SSE] Vectorize a bit (AND/XOR/OR) op if a BUILD_VECTOR has the same op for all their scalar elements. If all a BUILD_VECTOR's source elements are the same bit (AND/XOR/OR) operation type and each has one constant operand, lower to a pair of BUILD_VECTOR and just apply the bit operation to the vectors. The constant operands will form a constant vector meaning that we still only have a single BUILD_VECTOR to lower and we will have replaced all the scalarized operations with a single SSE equivalent. Its not in our interest to start make a general purpose vectorizer from this, but I'm seeing enough of these scalar bit operations from the later legalization/scalarization stages to support them at least. Differential Revision: http://reviews.llvm.org/D18492 llvm-svn: 264666	2016-03-28 21:33:52 +00:00
Vedant Kumar	86705ba5b1	Reapply (2x) "[PGO] Fix name encoding for ObjC-like functions" Function names in ObjC can have spaces in them. This interacts poorly with name compression, which uses spaces to separate PGO names. Fix the issue by using a different separator and update a test. I chose "\01" as the separator because 1) it's non-printable, 2) we strip it from PGO names, and 3) it's the next natural choice once "\00" is discarded (that one's overloaded). What's changed since the original commit? - I fixed up the covmap-V2 binary format tests using a linux VM. - I weakened the CHECK lines in instrprof-comdat.h to account for the fact that there have been bugfixes to clang coverage. These will be fixed up in a follow-up. - I added an assert to make sure we don't get bitten by this again. - I constructed the c-general.profraw file without name compression enabled to appease some bots. Differential Revision: http://reviews.llvm.org/D18516 llvm-svn: 264658	2016-03-28 21:06:42 +00:00
Adrian Prantl	faebbb053d	Add an IR Verifier check for orphaned DICompileUnits. A DICompileUnit that is not listed in llvm.dbg.cu will cause assertion failures and/or crashes in the backend. The Verifier should reject this. rdar://problem/25369499 llvm-svn: 264657	2016-03-28 21:06:26 +00:00
Evgeniy Stepanov	a023f79db1	Handle section vs global name conflict. This is a fix for PR26941. When there is both a section and a global definition with the same name, the global wins. Section symbols are not added to the symbol table; section references are left undefined and fixed up in the object writer unless they've been satisfied by some other definition. llvm-svn: 264649	2016-03-28 20:36:28 +00:00
Ryan Govostes	653f9d0273	[asan] Support dead code stripping on Mach-O platforms On OS X El Capitan and iOS 9, the linker supports a new section attribute, live_support, which allows dead stripping to remove dead globals along with the ASAN metadata about them. With this change __asan_global structures are emitted in a new __DATA,__asan_globals section on Darwin. Additionally, there is a __DATA,__asan_liveness section with the live_support attribute. Each entry in this section is simply a tuple that binds together the liveness of a global variable and its ASAN metadata structure. Thus the metadata structure will be alive if and only if the global it references is also alive. Review: http://reviews.llvm.org/D16737 llvm-svn: 264645	2016-03-28 20:28:57 +00:00
Vedant Kumar	476a94d9ef	Revert "Reapply "[PGO] Fix name encoding for ObjC-like functions"" This reverts commit r264641 to investigate why c-general.test is failing on the bots. llvm-svn: 264643	2016-03-28 20:20:40 +00:00
Vedant Kumar	f20b6cec1c	Reapply "[PGO] Fix name encoding for ObjC-like functions" Function names in ObjC can have spaces in them. This interacts poorly with name compression, which uses spaces to separate PGO names. Fix the issue by using a different separator and update a test. I chose "\01" as the separator because 1) it's non-printable, 2) we strip it from PGO names, and 3) it's the next natural choice once "\00" is discarded (that one's overloaded). This reverts the revert commit beaf3d18. What's changed? - I fixed up the covmap-V2 binary format tests using a linux VM. - I updated the expected counts in instrprof-comdat.h to account for the fact that there have been bugfixes to clang coverage. - I added an assert to make sure we don't get bitten by this again. Differential Revision: http://reviews.llvm.org/D18516 llvm-svn: 264641	2016-03-28 20:12:07 +00:00
Easwaran Raman	8f6b9efc36	Profile summary cleanup. Differential Revision: http://reviews.llvm.org/D18468 llvm-svn: 264619	2016-03-28 18:58:05 +00:00
Adam Nemet	2f36f05951	[PGO] Comment how function pointers for indirect calls are mapped to function names Summary: Hopefully this will make it easier for the next person to figure all this out... Reviewers: bogner, davidxl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18490 llvm-svn: 264611	2016-03-28 18:27:44 +00:00
Matthias Braun	b74eb41d58	MIRParser: Add %subreg.xxx syntax for subregister index operands Differential Revision: http://reviews.llvm.org/D18279 llvm-svn: 264608	2016-03-28 18:18:46 +00:00
Haicheng Wu	6a6bc750d5	[AArch64] Do not lower scalar sdiv/udiv to a shifts + mul sequence when optimizing for minsize Mimic what x86 does when optimizing sdiv/udiv for minsize. llvm-svn: 264606	2016-03-28 18:17:07 +00:00
Reid Kleckner	ba85781f58	Revert "[SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops" This reverts commit r264596. It does not compile. llvm-svn: 264604	2016-03-28 18:07:40 +00:00
Hal Finkel	7059d41622	[PowerPC] On the A2, popcnt[dw] are very slow The A2 cores support the popcntw/popcntd instructions, but they're microcoded, and slower than our default software emulation. Specifically, popcnt[dw] take approximately 74 cycles, whereas our software emulation takes only 24-28 cycles. I've added a new target feature to indicate a slow popcnt[dw], instead of just removing the existing target feature from the a2/a2q processor models, because: 1. This allows us to return more accurate information via the TTI interface (I recognize that this currently makes no practical difference) 2. Is hopefully easier to understand (it allows the core's features to match its manual while still having the desired effect). llvm-svn: 264600	2016-03-28 17:52:08 +00:00
David Blaikie	b805f73ad1	Remove else after return llvm-svn: 264599	2016-03-28 17:45:48 +00:00

1 2 3 4 5 ...

88543 Commits