llvm-project

Commit Graph

Author	SHA1	Message	Date
Justin Lebar	62c20d8b3b	[StructurizeCFG] Refactor NearestCommonDominator. Summary: As far as I can tell, doing our own computations in NearestCommonDominator is a false optimization -- DomTree will build up what appears to be exactly this data when it decides it's worthwhile. Moreover, by building the cache ourselves, we cannot take advantage of the cache that the domtree might have available. In addition, I am not convinced of the correctness of the original code. In particular, setting ResultIndex = 1 on the first addBlock instead of setting it to 0 is quite fishy. Similarly, it's not clear to me that setting IndexMap[Node] = 0 for every node as we walk up the tree finding a common parent is correct. But rather than ponder over these questions, I'd rather just make the code do the obviously-correct thing. This patch also changes the NearestCommonDominator API a bit, improving the names and getting rid of the boolean parameter in addBlock -- see http://jlebar.com/2011/12/16/Boolean_parameters_to_API_functions_considered_harmful..html Reviewers: arsenm Subscribers: aemerson, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26998 llvm-svn: 288050	2016-11-28 18:49:59 +00:00
Simon Pilgrim	2228f70a85	[X86][SSE] Add initial support for combining (V)PMOVZX with shuffles. llvm-svn: 288049	2016-11-28 17:58:19 +00:00
Adam Nemet	a415a9bde6	[GVN, OptDiag] Include the value that is forwarded in load elimination This requires some changes to the opt-diag API. Hal and I have discussed this at the Dev Meeting and came up with a streaming delimiter (setExtraArgs) to solve this. Arguments after this delimiter are only included in the optimization records and not in the remarks printed in the compiler output. (Note, how in the test the content of the YAML file changes but the remarks on the compiler output don't.) This implements the green GVN message with a bug fix at line http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L446 The fix is that now we properly include the constant value in the message: "load of type i32 eliminated in favor of 7" Differential Revision: https://reviews.llvm.org/D26489 llvm-svn: 288047	2016-11-28 17:45:34 +00:00
Adam Nemet	e5112b14b9	[GVN] Basic optimization remark support Follow-on patches will add more interesting cases. The goal of this patch-set is to get the GVN messages printed in opt-viewer from Dhrystone as was presented in my Dev Meeting talk. This is the optimization view for the function (the last remark in the function has a bug which is fixed in this series): http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L430 Differential Revision: https://reviews.llvm.org/D26488 llvm-svn: 288046	2016-11-28 17:45:28 +00:00
Sanjay Patel	100bc01a72	[x86] fix formatting; NFC llvm-svn: 288045	2016-11-28 17:39:21 +00:00
Adam Nemet	58951d3869	[LTO] Move finishOptimizationRemarks after codegen This addresses the comment D26832. llvm-svn: 288041	2016-11-28 16:51:49 +00:00
Simon Pilgrim	3f10e66981	[X86][SSE] Added support for combining bit-shifts with shuffles. Bit-shifts by a whole number of bytes can be represented as a shuffle mask suitable for combining. Added a 'getFauxShuffleMask' function to allow us to create shuffle masks from other suitable operations. llvm-svn: 288040	2016-11-28 16:25:01 +00:00
Daniel Cederman	59168e28e0	Test commit llvm-svn: 288036	2016-11-28 15:33:03 +00:00
Nirav Dave	a413361798	Revert "[DAG] Improve loads-from-store forwarding to handle TokenFactor" This reverts commit r287773 which caused issues with ppc64le builds. llvm-svn: 288035	2016-11-28 14:30:29 +00:00
Ulrich Weigand	a29bf16ed5	[SystemZ] Fix build bot fallout from r288030 Remove unused variable that came in due to a copy-and-paste bug and caused build bot failures. llvm-svn: 288033	2016-11-28 14:24:14 +00:00
Ulrich Weigand	84404f30b3	[SystemZ] Support execution hint instructions This adds assembler support for the instructions provided by the execution-hint facility (NIAI and BP(R)P). This required adding support for the new relocation types for 12-bit and 24-bit PC- relative offsets used by the BP(R)P instructions. llvm-svn: 288031	2016-11-28 14:01:51 +00:00
Ulrich Weigand	2d9e3d9d3b	[SystemZ] Support load-and-trap instructions This adds support for the instructions provided with the load-and-trap facility. llvm-svn: 288030	2016-11-28 13:59:22 +00:00
Ulrich Weigand	758399131a	[SystemZ] Add remaining branch instructions This patch adds assembler support for the remaining branch instructions: the non-relative branch on count variants, and all variants of branch on index. The only one of those that can be readily exploited for code generation is BRCTH (branch on count using a high 32-bit register as count). Do use it, however, it is necessary to also introduce a hew CHIMux pseudo to allow comparisons of a 32-bit value agains a short immediate to go into a high register as well (implemented via CHI/CIH). This causes a bit of codegen changes overall, but those have proven to be neutral (or even beneficial) in performance measurements. llvm-svn: 288029	2016-11-28 13:40:08 +00:00
Ulrich Weigand	524f276c74	[SystemZ] Improve use of conditional instructions This patch moves formation of LOC-type instructions from (late) IfConversion to the early if-conversion pass, and in some cases additionally creates them directly from select instructions during DAG instruction selection. To make early if-conversion work, the patch implements the canInsertSelect / insertSelect callbacks. It also implements the commuteInstructionImpl and FoldImmediate callbacks to enable generation of the full range of LOC instructions. Finally, the patch adds support for all instructions of the load-store-on-condition-2 facility, which allows using LOC instructions also for high registers. Due to the use of the GRX32 register class to enable high registers, we now also have to handle the cases where there are still no single hardware instructions (conditional move from a low register to a high register or vice versa). These are converted back to a branch sequence after register allocation. Since the expandRAPseudos callback is not allowed to create new basic blocks, this requires a simple new pass, modelled after the ARM/AArch64 ExpandPseudos pass. Overall, this patch causes significantly more LOC-type instructions to be used, and results in a measurable performance improvement. llvm-svn: 288028	2016-11-28 13:34:08 +00:00
Chandler Carruth	0c6efff178	[PM] Remove weird marking of invalidated analyses as "preserved". This never made a lot of sense. They've been invalidated for one IR unit but they aren't really preserved in any normal sense. It seemed like it would be an elegant way of communicating to outer IR units that pass managers and adaptors had already handled invalidation, but we've since ended up adding sets that model this more clearly: we're now using the 'AllAnalysesOn<IRUnitT>' set to handle cases where the trick of "preserving" invalidated analyses didn't work. This patch moves to rely on that technique exclusively and removes the cumbersome API aspect of updating the preserved set when doing invalidation. This in turn will simplify a number of upcoming patches. This has a side benefit of exposing a number of places where we were failing to mark the 'AllAnalysesOn<IRUnitT>' set as preserved. This patch fixes those, and with those fixes shouldn't change any observable behavior. llvm-svn: 288023	2016-11-28 10:42:21 +00:00
Davide Italiano	0f0d5d8f8d	[ThreadPool] Rollback recent changes until I figure out the breakage. llvm-svn: 288018	2016-11-28 09:17:12 +00:00
Davide Italiano	3ea0bfa7e0	[ThreadPool] Simplify the interface. NFCI. The callers don't use the return value. Found by Michael Spencer. llvm-svn: 288016	2016-11-28 08:53:41 +00:00
Mehdi Amini	43c2428203	Revert "Improve error handling in YAML parsing" This reverts commit r288014, the unittest isn't passing llvm-svn: 288015	2016-11-28 04:57:04 +00:00
Mehdi Amini	c54281be4f	Improve error handling in YAML parsing Some scanner errors were not checked and reported by the parser. Fix PR30934 Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu> Differential Revision: https://reviews.llvm.org/D26419 llvm-svn: 288014	2016-11-28 04:44:13 +00:00
Craig Topper	17786f77f0	[X86][FMA4] Remove isCommutable from FMA4 scalar intrinsics. They aren't commutable as operand 0 should pass its upper bits through to the output. llvm-svn: 288011	2016-11-27 21:37:04 +00:00
Craig Topper	13b27a2748	[X86][FMA] Add missing Predicates qualifier around scalar FMA intrinsic patterns. llvm-svn: 288010	2016-11-27 21:37:02 +00:00
Craig Topper	ff9d45875a	[X86][FMA4] Add load folding support for FMA4 scalar intrinsic instructions. llvm-svn: 288009	2016-11-27 21:37:00 +00:00
Craig Topper	3674f44e40	[X86] Add SHL by 1 to the load folding tables. I don't think isel selects these today, favoring adding the register to itself instead. But the load folding tables shouldn't be so concerned with what isel will use and just represent the relationships. llvm-svn: 288007	2016-11-27 21:36:54 +00:00
Simon Pilgrim	91d6f5fbc1	[X86][SSE] Add support for combining target shuffles to 128/256-bit PSLL/PSRL bit shifts llvm-svn: 288006	2016-11-27 21:08:19 +00:00
Sanjay Patel	8ca30ab0c5	[InstSimplify] allow integer vector types to use computeKnownBits Note that the non-splat lshr+lshr test folded, but that does not work in general. Something is missing or wrong in computeKnownBits as the non-splat shl+shl test still shows. llvm-svn: 288005	2016-11-27 21:07:28 +00:00
Craig Topper	4fab487265	[AVX-512] Add integer and fp unpck instructions to load folding tables. llvm-svn: 288004	2016-11-27 19:51:41 +00:00
Simon Pilgrim	cdb2ce661d	[X86][SSE] Split lowerVectorShuffleAsShift ready for combines. NFCI. Moved most of matching code into matchVectorShuffleAsShift to share with target shuffle combines (in a future commit). llvm-svn: 288003	2016-11-27 19:28:39 +00:00
Craig Topper	7ad961cc70	[X86] Add TB_NO_REVERSE to entries in the load folding table where the instruction's load size is smaller than the register size. If we were to unfold these, the load size would be increased to the register size. This is not safe to do since the enlarged load can do things like cross a page boundary into a page that doesn't exist. I probably missed some instructions, but this should be a large portion of them. llvm-svn: 288001	2016-11-27 18:51:13 +00:00
Sanjay Patel	da9f7bf0fc	fix formatting; NFC llvm-svn: 287997	2016-11-27 15:53:48 +00:00
Craig Topper	c3b3926f8b	[AVX-512] Add masked EVEX vpmovzx/sx instructions to load folding tables. llvm-svn: 287995	2016-11-27 08:55:31 +00:00
Craig Topper	fb64a25ba1	[X86] Remove alignment restrictions from load folding table for some instructions that don't have a restriction. Most of these are the SSE4.1 PMOVZX/PMOVSX instructions which all read less than 128-bits. The only other was PMOVUPD which by definition is an unaligned load. llvm-svn: 287991	2016-11-27 01:52:51 +00:00
Craig Topper	837ff25da1	[X86] Remove hasOneUse check that is redundant with the one in IsProfitableToFold. llvm-svn: 287987	2016-11-26 18:43:26 +00:00
Craig Topper	e266e126ff	[X86] Fix the zero extending load detection in X86DAGToDAGISel::selectScalarSSELoad to pass the load node to IsProfitableToFold and IsLegalToFold. Previously we were passing the SCALAR_TO_VECTOR node. llvm-svn: 287986	2016-11-26 18:43:24 +00:00
Craig Topper	d3ab1a3905	[X86] Simplify control flow. NFCI llvm-svn: 287985	2016-11-26 18:43:21 +00:00
Craig Topper	991d1ca3ba	[X86] Add a hasOneUse check to selectScalarSSELoad to keep the same load from being folded multiple times. Summary: When selectScalarSSELoad is looking for a scalar_to_vector of a scalar load, it makes sure the load is only used by the scalar_to_vector. But it doesn't make sure the scalar_to_vector is only used once. This can cause the same load to be folded multiple times. This can be bad for performance. This also causes the chain output to be duplicated, but not connected to anything so chain dependencies will not be satisfied. Reviewers: RKSimon, zvi, delena, spatel Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D26790 llvm-svn: 287983	2016-11-26 17:29:25 +00:00
Sanjay Patel	8bd69b7ed9	[InstCombine] don't drop metadata in FoldOpIntoSelect() llvm-svn: 287980	2016-11-26 15:23:20 +00:00
Sanjay Patel	91e73a7bfa	add optional param to copy metadata when creating selects; NFC There are other spots where we can use this; we're currently dropping metadata in some places, and there are proposed changes where we will want to propagate metadata. IRBuilder's CreateSelect() already has a parameter like this, so this change makes the regular 'Create' API line up with that. llvm-svn: 287976	2016-11-26 15:01:59 +00:00
Craig Topper	10d5eec1a1	[AVX-512] Add unmasked EVEX vpmovzx/sx instructions to load folding tables. llvm-svn: 287975	2016-11-26 08:21:52 +00:00
Craig Topper	97169ea5f9	[AVX-512] Add masked 128/256-bit integer add/sub instructions to load folding tables. llvm-svn: 287974	2016-11-26 08:21:48 +00:00
Craig Topper	53b33de1e3	[AVX-512] Add masked 512-bit integer add/sub instructions to load folding tables. llvm-svn: 287972	2016-11-26 07:21:00 +00:00
Craig Topper	6677bb4e50	[AVX-512] Teach LowerFormalArguments to use the extended register class when available. Fix the avx512vl stack folding tests to clobber more registers or otherwise they use xmm16 after this change. llvm-svn: 287971	2016-11-26 07:20:57 +00:00
Craig Topper	39265bb1ce	[AVX-512] Add VLX versions of VDIVPD/PS and VMULPD/PS to load folding tables. llvm-svn: 287970	2016-11-26 07:20:53 +00:00
Tom Stellard	1473f07ceb	AMDGPU/SI: Use float as the operand type for amdgcn.interp intrinsics Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D26724 llvm-svn: 287962	2016-11-26 02:26:04 +00:00
Craig Topper	7f76c23781	[X86][XOP] Add a reversed reg/reg form for VPROT instructions. The W bit distinquishes which operand is the memory operand. But if the mod bits are 3 then the memory operand is a register and there are two possible encodings. We already did this correctly for several other XOP instructions. llvm-svn: 287961	2016-11-26 02:14:00 +00:00
Craig Topper	516fd7abfe	[X86] Add SSE, AVX, and AVX2 version of MOVDQU to the load/store folding tables for consistency. Not sure this is truly needed but we had the floating point equivalents, the aligned equivalents, and the EVEX equivalents. So this just makes it complete. llvm-svn: 287960	2016-11-26 02:13:58 +00:00
Craig Topper	a363d42973	[AVX-512] Put the AVX-512 sections of the load folding tables into mostly alphabetical order. This is consistent with the older sections of the table. NFC llvm-svn: 287956	2016-11-25 23:21:34 +00:00
David Majnemer	d5648c7a7d	Replace some callers of setTailCall with setTailCallKind We were a little sloppy with adding tailcall markers. Be more consistent by using setTailCallKind instead of setTailCall. llvm-svn: 287955	2016-11-25 22:35:09 +00:00
Marek Olsak	79c05871a2	AMDGPU/SI: Add back reverted SGPR spilling code, but disable it suggested as a better solution by Matt llvm-svn: 287942	2016-11-25 17:37:09 +00:00
Simon Pilgrim	c5fb167df0	Use SDValue helpers instead of explicitly going via SDValue::getNode(). NFCI llvm-svn: 287941	2016-11-25 17:25:21 +00:00
Simon Pilgrim	8e8ae7219f	Use SDValue helper instead of explicitly going via SDValue::getNode(). NFCI llvm-svn: 287940	2016-11-25 17:19:53 +00:00
Craig Topper	88071b37ab	[AVX-512] Add support for changing VSHUFF64x2 to VSHUFF32x4 when its feeding a vselect with 32-bit element size. Summary: Shuffle lowering may have widened the element size of a i32 shuffle to i64 before selecting X86ISD::SHUF128. If this shuffle was used by a vselect this can prevent us from selecting masked operations. This patch detects this and changes the element size to match the vselect. I don't handle changing integer to floating point or vice versa as its not clear if its better to push such a bitcast to the inputs of the shuffle or to the user of the vselect. So I'm ignoring that case for now. Reviewers: delena, zvi, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27087 llvm-svn: 287939	2016-11-25 16:48:05 +00:00
Craig Topper	1e48829747	[AVX-512] Add VPERMT2* and VPERMI2* instructions to load folding tables. llvm-svn: 287937	2016-11-25 16:33:53 +00:00
Marek Olsak	e3895bfb47	Revert "AMDGPU: Implement SGPR spilling with scalar stores" This reverts commit 4404d0d6e354e80dd7f8f0a0e12d8ad809cf007e. llvm-svn: 287936	2016-11-25 16:03:34 +00:00
Marek Olsak	dad553a5cf	Revert "AMDGPU: Fix MMO when splitting spill" This reverts commit 79d4f8b8b1ce430c3d5dac4fc72a9eebaed24fe1. llvm-svn: 287935	2016-11-25 16:03:27 +00:00
Marek Olsak	8cbbf65361	Revert "AMDGPU: Fix adding extra implicit def of register" This reverts commit e834ce5976567575621901fb967b8018b9916d71. llvm-svn: 287934	2016-11-25 16:03:22 +00:00
Marek Olsak	713e6fc531	Revert "AMDGPU: Fix not setting kill flag on temp reg when spilling" This reverts commit 057bbbe4ae170247ba37f08f2e70ef185267d1bb. llvm-svn: 287933	2016-11-25 16:03:19 +00:00
Marek Olsak	a45dae458d	Revert "AMDGPU: Make m0 unallocatable" This reverts commit 124ad83dae04514f943902446520c859adee0e96. llvm-svn: 287932	2016-11-25 16:03:15 +00:00
Marek Olsak	ea848df84c	Revert "AMDGPU: Remove m0 spilling code" This reverts commit f18de36554eb22416f8ba58e094e0272523a4301. llvm-svn: 287931	2016-11-25 16:03:06 +00:00
Marek Olsak	18a95bcb3c	Revert "AMDGPU: Preserve m0 value when spilling" This reverts commit a5a179ffd94fd4136df461ec76fb30f04afa87ce. llvm-svn: 287930	2016-11-25 16:03:02 +00:00
Abhilash Bhandari	54e5a1a4da	[Loop Unswitch] Patch to selective unswitch only the reachable branch instructions. Summary: The iterative algorithm for Loop Unswitching may render some of the branches unreachable in the unswitched loops. Given the exponential nature of the algorithm, this is quite an overhead. This patch fixes this problem by selectively unswitching only those branches within a loop that are reachable from the loop header. Reviewers: Michael Zolothukin, Anna Thomas, Weiming Zhao. Subscribers: llvm-commits. Differential Revision: http://reviews.llvm.org/D26299 llvm-svn: 287925	2016-11-25 14:07:44 +00:00
Simon Dardis	c08af6db5b	[mips] Correct jal expansion for local symbols in .local directives. This patch corrects the behaviour of code such as: .local foo jal foo foo: to use the correct jal expansion when writing ELF files. Patch by: Daniel Sanders Reviewers: zoran.jovanovic, seanbruno, vkalintiris Differential Revision: https://reviews.llvm.org/D24722 llvm-svn: 287918	2016-11-25 11:06:43 +00:00
Craig Topper	d4091494d3	[X86] Invert an 'if' and early out to fix a weird indentation. NFCI llvm-svn: 287909	2016-11-25 02:29:24 +00:00
Craig Topper	a46936185a	[X86] Size a SmallVector to the worst case mask size for a 512-bit shuffle. NFCI llvm-svn: 287908	2016-11-25 02:29:21 +00:00
Craig Topper	8c4cdf06db	[DAGCombine] Teach DAG combine that if both inputs of a vselect are the same, then the condition doesn't matter and the vselect can be removed. Selects with scalar condition already handle this correctly. llvm-svn: 287904	2016-11-24 21:48:52 +00:00
Serge Rogatch	a331133e6d	Test commit access. llvm-svn: 287898	2016-11-24 18:51:47 +00:00
Simon Pilgrim	f1ee930db0	Fix unused variable warning llvm-svn: 287889	2016-11-24 15:24:47 +00:00
Benjamin Kramer	fc54e35d94	[X86] Don't round trip a unique_ptr through a raw pointer for assignment. No functional change. llvm-svn: 287888	2016-11-24 15:17:39 +00:00
Simon Pilgrim	9c71e07276	[X86][SSE] Improve UINT_TO_FP v2i32 -> v2f64 Vectorize UINT_TO_FP v2i32 -> v2f64 instead of scalarization (albeit still on the SIMD unit). The codegen matches that generated by legalization (and is in fact used by AVX for UINT_TO_FP v4i32 -> v4f64), but has to be done in the x86 backend to account for legalization via 4i32. Differential Revision: https://reviews.llvm.org/D26938 llvm-svn: 287886	2016-11-24 15:12:56 +00:00
Simon Pilgrim	841d7ca463	[X86][AVX512] Add support for v2i64 fptosi/fptoui/sitofp/uitofp on AVX512DQ-only targets Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances llvm-svn: 287882	2016-11-24 14:46:55 +00:00
Simon Pilgrim	7c26a6f9ef	[X86][AVX512DQVL] Add awareness of vcvtqq2ps and vcvtuqq2ps implicit zeroing of upper 64-bits of xmm result llvm-svn: 287878	2016-11-24 14:02:30 +00:00
Simon Pilgrim	ab323ec411	[X86][AVX512DQVL] Add support for v2i64 -> v2f32 SINT_TO_FP/UINT_TO_FP lowering llvm-svn: 287877	2016-11-24 13:38:59 +00:00
Nikolai Bozhenov	3a8d108b2b	[x86] Fixing PR28755 by precomputing the address used in CMPXCHG8B The bug arises during register allocation on i686 for CMPXCHG8B instruction when base pointer is needed. CMPXCHG8B needs 4 implicit registers (EAX, EBX, ECX, EDX) and a memory address, plus ESI is reserved as the base pointer. With such constraints the only way register allocator would do its job successfully is when the addressing mode of the instruction requires only one register. If that is not the case - we are emitting additional LEA instruction to compute the address. It fixes PR28755. Patch by Alexander Ivchenko <alexander.ivchenko@intel.com> Differential Revision: https://reviews.llvm.org/D25088 llvm-svn: 287875	2016-11-24 13:23:35 +00:00
Nikolai Bozhenov	bb64aa14a3	[x86] Minor refactoring of X86TargetLowering::EmitInstrWithCustomInserter Move the definitions of three variables out of the switch. Patch by Alexander Ivchenko <alexander.ivchenko@intel.com> Differential Revision: https://reviews.llvm.org/D25192 llvm-svn: 287874	2016-11-24 13:15:49 +00:00
Nikolai Bozhenov	a2dabed3b6	[x86] Rewrite getAddressFromInstr helper function - It does not modify the input instruction - Second operand of any address is always an Index Register, make sure we actually check for that, instead of a check for an immediate value Patch by Alexander Ivchenko <alexander.ivchenko@intel.com> Differential Revision: https://reviews.llvm.org/D24938 llvm-svn: 287873	2016-11-24 13:05:43 +00:00
Simon Pilgrim	a3af79678e	[X86] Generalize CVTTPD2DQ/CVTTPD2UDQ and CVTDQ2PD/CVTUDQ2PD opcodes. NFCI Replace the CVTTPD2DQ/CVTTPD2UDQ and CVTDQ2PD/CVTUDQ2PD opcodes with general versions. This is an initial step towards similar FP_TO_SINT/FP_TO_UINT and SINT_TO_FP/UINT_TO_FP lowering to AVX512 CVTTPS2QQ/CVTTPS2UQQ and CVTQQ2PS/CVTUQQ2PS with illegal types. Differential Revision: https://reviews.llvm.org/D27072 llvm-svn: 287870	2016-11-24 12:13:46 +00:00
Peter Collingbourne	debb6f6cc1	Object: Add IRObjectFile::getTargetTriple(). This lets us remove a use of IRObjectFile::getModule() in llvm-nm. Differential Revision: https://reviews.llvm.org/D27074 llvm-svn: 287846	2016-11-24 01:13:09 +00:00
Peter Collingbourne	e32baa0c3e	Object: Simplify the IRObjectFile symbol iterator implementation. Change the IRObjectFile symbol iterator to be a pointer into a vector of PointerUnions representing either IR symbols or asm symbols. This change is in preparation for a future change for supporting multiple modules in an IRObjectFile. Although it causes an increase in memory consumption, we can deal with that issue separately by introducing a bitcode symbol table. Differential Revision: https://reviews.llvm.org/D26928 llvm-svn: 287845	2016-11-24 00:41:05 +00:00
Matt Arsenault	7b54dd039e	AMDGPU: Preserve m0 value when spilling llvm-svn: 287844	2016-11-24 00:26:50 +00:00
Matt Arsenault	94b32ffe8e	TRI: Add hook to pass scavenger during frame elimination The scavenger was not passed if requiresFrameIndexScavenging was enabled. I need to be able to test for the availability of an unallocatable register here, so I can't create a virtual register for it. It might be better to just always use the scavenger and stop creating virtual registers. llvm-svn: 287843	2016-11-24 00:26:47 +00:00
Matt Arsenault	5ee3325358	AMDGPU: Remove m0 spilling code Since m0 isn't allocatable it should never be spilled anymore. llvm-svn: 287842	2016-11-24 00:26:44 +00:00
Matt Arsenault	9e5c7b1031	AMDGPU: Make m0 unallocatable m0 may need to be written for spill code, so we don't want general code uses relying on the value stored in it. This introduces a few code quality regressions where copies from m0 are not coalesced into copies of a copy of m0. llvm-svn: 287841	2016-11-24 00:26:40 +00:00
Davide Italiano	8812f28f47	[lib/LTO] Rename few instances of Lto to LTO. llvm-svn: 287840	2016-11-24 00:23:09 +00:00
Greg Clayton	e65439797a	Rely on a single DWARF version instead of having two copies This patch makes AsmPrinter less reliant on DwarfDebug by relying on the DWARF version in the AsmPrinter's MCStreamer's MCContext. This allows us to remove the redundant DWARF version from DwarfDebug. It also lets us change code that used to access the AsmPrinter's DwarfDebug just to get to the DWARF version by changing the DWARF version accessor on AsmPrinter so that it grabs the version from its MCStreamer's MCContext. Differential Revision: https://reviews.llvm.org/D27032 llvm-svn: 287839	2016-11-23 23:30:37 +00:00
Eugene Zelenko	570e39a25c	[DebugInfo] Fix some Clang-tidy modernize-use-default and Include What You Use warnings; other minor fixes (NFC). Per Zachary Turner and Mehdi Amini suggestion to make only post-commit reviews. llvm-svn: 287838	2016-11-23 23:16:32 +00:00
Simon Pilgrim	3ce6a545c7	[X86][SSE] Add awareness of (v)cvtpd2dq and vcvtpd2udq implicit zeroing of upper 64-bits of xmm result We've already added the equivalent for (v)cvttpd2dq (rL284459) and vcvttpd2udq llvm-svn: 287835	2016-11-23 22:35:06 +00:00
Nicolai Haehnle	934470f536	[SelectionDAG] Early-out in TargetLowering::expandMUL (NFC) Summary: Reduce indentation level; preparation for D24956. Reviewers: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27063 llvm-svn: 287831	2016-11-23 22:14:20 +00:00
Matt Arsenault	a24d84beb9	AMDGPU: Cleanup immediate folding code Move code down to use, reorder to avoid hard to follow immediate folding logic. llvm-svn: 287818	2016-11-23 21:51:07 +00:00
Matt Arsenault	391c3ea9bc	AMDGPU: Fix debug printing The uint8_t was printed as a char which didn't really work. llvm-svn: 287817	2016-11-23 21:51:05 +00:00
Matt Arsenault	997a9abf4c	AMDGPU: Fix not setting kill flag on temp reg when spilling llvm-svn: 287808	2016-11-23 21:00:12 +00:00
Matt Arsenault	dd0cb2a3e5	AMDGPU: Fix adding extra implicit def of register In the scalar case, there's no reason to add an additional def of the same register. llvm-svn: 287807	2016-11-23 21:00:10 +00:00
Matt Arsenault	2669a76f01	AMDGPU: Fix MMO when splitting spill The size and offset were wrong. The size of the object was being used for the size of the access, when here it is really being split into 4-byte accesses. The underlying object size is set in the MachinePointerInfo, which also didn't have the offset set. llvm-svn: 287806	2016-11-23 20:52:53 +00:00
Haicheng Wu	731b04ca43	[LoopUnroll] Move code to exit early. NFC. Just to save some compilation time. Differential Revision: https://reviews.llvm.org/D26784 llvm-svn: 287800	2016-11-23 19:39:26 +00:00
Daniel Berlin	4056253c4d	Revert "[Triple] Add Facebook vendor" This reverts commit r287684 Objections on the review thread had not been addressed to prior to commit. I asked the committer to revert, but i expect they are gone for the US holiday or something. llvm-svn: 287798	2016-11-23 19:03:54 +00:00
Michael Kuperstein	47eb85a003	[X86] Allow folding of stack reloads when loading a subreg of the spilled reg We did not support subregs in InlineSpiller:foldMemoryOperand() because targets may not deal with them correctly. This adds a target hook to let the spiller know that a target can handle subregs, and actually enables it for x86 for the case of stack slot reloads. This fixes PR30832. Differential Revision: https://reviews.llvm.org/D26521 llvm-svn: 287792	2016-11-23 18:33:49 +00:00
Chandler Carruth	dab4eae274	[PM] Change the static object whose address is used to uniquely identify analyses to have a common type which is enforced rather than using a char object and a `void ` type when used as an identifier. This has a number of advantages. First, it at least helps some of the confusion raised in Justin Lebar's code review of why `void ` was being used everywhere by having a stronger type that connects to documentation about this. However, perhaps more importantly, it addresses a serious issue where the alignment of these pointer-like identifiers was unknown. This made it hard to use them in pointer-like data structures. We were already dodging this in dangerous ways to create the "all analyses" entry. In a subsequent patch I attempted to use these with TinyPtrVector and things fell apart in a very bad way. And it isn't just a compile time or type system issue. Worse than that, the actual alignment of these pointer-like opaque identifiers wasn't guaranteed to be a useful alignment as they were just characters. This change introduces a type to use as the "key" object whose address forms the opaque identifier. This both forces the objects to have proper alignment, and provides type checking that we get it right everywhere. It also makes the types somewhat less mysterious than `void `. We could go one step further and introduce a truly opaque pointer-like type to return from the `ID()` static function rather than returning `AnalysisKey `, but that didn't seem to be a clear win so this is just the initial change to get to a reliably typed and aligned object serving is a key for all the analyses. Thanks to Richard Smith and Justin Lebar for helping pick plausible names and avoid making this refactoring many times. =] And thanks to Sean for the super fast review! While here, I've tried to move away from the "PassID" nomenclature entirely as it wasn't really helping and is overloaded with old pass manager constructs. Now we have IDs for analyses, and key objects whose address can be used as IDs. Where possible and clear I've shortened this to just "ID". In a few places I kept "AnalysisID" to make it clear what was being identified. Differential Revision: https://reviews.llvm.org/D27031 llvm-svn: 287783	2016-11-23 17:53:26 +00:00
Alina Sbirlea	a3d2f703a5	[LoadStoreVectorizer] Enable vectorization of stores in the presence of an aliasing load Summary: The "getVectorizablePrefix" method would give up if it found an aliasing load for a store chain. In practice, the aliasing load can be treated as a memory barrier and all stores that precede it are a valid vectorizable prefix. Issue found by volkan in D26962. Testcase is a pruned version of the one in the original patch. Reviewers: jlebar, arsenm, tstellarAMD Subscribers: mzolotukhin, wdng, nhaehnle, anna, volkan, llvm-commits Differential Revision: https://reviews.llvm.org/D27008 llvm-svn: 287781	2016-11-23 17:43:15 +00:00
Nirav Dave	cf34556330	[DAG] Improve loads-from-store forwarding to handle TokenFactor Forward store values to matching loads down through token factors. Factored from D14834. Reviewers: jyknight, hfinkel Subscribers: hfinkel, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D26080 llvm-svn: 287773	2016-11-23 16:48:35 +00:00
John Brawn	150addb45c	[DAGCombiner] Fix infinite loop in vector mul/shl combining We have the following DAGCombiner transformations: (mul (shl X, c1), c2) -> (mul X, c2 << c1) (mul (shl X, C), Y) -> (shl (mul X, Y), C) (shl (mul x, c1), c2) -> (mul x, c1 << c2) Usually the constant shift is optimised by SelectionDAG::getNode when it is constructed, by SelectionDAG::FoldConstantArithmetic, but when we're dealing with vectors and one of those vector constants contains an undef element FoldConstantArithmetic does not fold and we enter an infinite loop. Fix this by making FoldConstantArithmetic use getNode to decide how to fold each vector element, the same as FoldConstantVectorArithmetic does, and rather than adding the constant shift to the work list instead only apply the transformation if it's already been folded into a constant, as if it's not we're going to loop endlessly. Additionally add missing NoOpaques to one of those transformations, which I noticed when writing the tests for this. Differential Revision: https://reviews.llvm.org/D26605 llvm-svn: 287766	2016-11-23 16:05:51 +00:00
Nemanja Ivanovic	10fc3cfc63	[PowerPC] Remove InstAlias definitions that cause incorrect assembly In rL283190, I added some InstAlias definitions to generate extended mnemonics for some uses of the XXPERMDI instruction. However, when the assembler matches these extended mnemonics, it matches the new instruction in situations where it should match the old one. This patch removes these definitions and accomplishes that by defining these mnemonics with additional instructions that are isCodeGenOnly. Fixes PR31127. llvm-svn: 287765	2016-11-23 15:51:52 +00:00
Simon Pilgrim	4e9b9cbee9	[X86][AVX512] Add support for v4i64 fptosi/fptoui/sitofp/uitofp on AVX512DQ-only targets Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances llvm-svn: 287762	2016-11-23 14:01:18 +00:00
Elena Demikhovsky	09375d98b8	Type legalization for compressstore and expandload intrinsics. Implemented widening (v2f32) and splitting (v16f64). On splitting, I use "popcnt" to calculate memory increment. More type legalization work will come in the next patches. llvm-svn: 287761	2016-11-23 13:58:24 +00:00
Simon Pilgrim	03cd8f887c	[CostModel][X86] Add missing AVX512DQ v8i64 fptosi/sitofp costs llvm-svn: 287760	2016-11-23 13:42:09 +00:00
Benjamin Kramer	8a3c49897f	[MD5] Use write32le instead of spelling it out with shifts. No functionality change intended. llvm-svn: 287757	2016-11-23 11:49:28 +00:00
Craig Topper	f57e17def0	[AVX-512] Remove intrinsics for valignd/q and autoupgrade them to native shuffles. llvm-svn: 287744	2016-11-23 06:54:55 +00:00
Zvi Rackover	14aba43ea9	[X86] Simplify lowerVectorShuffleAsBitMask to handle only integer VT's Summary: This function is only called with integer VT arguments, so remove code that handles FP vectors. Reviewers: RKSimon, craig.topper, delena, andreadb Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26985 llvm-svn: 287743	2016-11-23 06:45:25 +00:00
Rui Ueyama	c464fadc64	Fix builbots. llvm-svn: 287735	2016-11-23 03:58:12 +00:00
Kuba Mracek	06995e866b	[xray] Add XRay support for Mach-O in CodeGen Currently, XRay only supports emitting the XRay table (xray_instr_map) on ELF binaries. Let's add Mach-O support. Differential Revision: https://reviews.llvm.org/D26983 llvm-svn: 287734	2016-11-23 02:07:04 +00:00
Rui Ueyama	877c26c844	Add convenient functions to compute hashes of byte vectors. In many sitautions, you just want to compute a hash for one chunk of data. This patch adds convenient functions for that purpose. Differential Revision: https://reviews.llvm.org/D26988 llvm-svn: 287726	2016-11-23 00:46:09 +00:00
Justin Lebar	6c0f25aec6	[StructurizeCFG] Refactor OrderNodes. Summary: No need to copy the RPOT vector before using it. Switch from std::map to SmallDenseMap. Get rid of an unused variable (TempVisited). Get rid of a typedef, RNVector, which is now used only once. Differential Revision: https://reviews.llvm.org/D26997 llvm-svn: 287721	2016-11-22 23:14:11 +00:00
Justin Lebar	23aaf60277	[StructurizeCFG] Add whitespace in getAnalysisUsage. Summary: "addRequired" and "addPreserved" look very similar when squished up next to each other -- without the newline this code looked to me like it was addRequired'ing DominatorTreeWrapperPass twice. Reviewers: arsenm Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26996 llvm-svn: 287720	2016-11-22 23:14:07 +00:00
Justin Lebar	820db74c1e	[StructurizeCFG] Remove unnecessary "using" in class. Reviewers: arsenm Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26995 llvm-svn: 287719	2016-11-22 23:13:49 +00:00
Justin Lebar	73c4baf3a3	[StructurizeCFG] Merge the two constructors into one. Reviewers: arsenm Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26994 llvm-svn: 287718	2016-11-22 23:13:44 +00:00
Justin Lebar	1b60d70025	[StructurizeCFG] Use a for-each loop instead of iterators in runOnRegion. Summary: Reviewers: arsenm Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26993 llvm-svn: 287717	2016-11-22 23:13:37 +00:00
Justin Lebar	c7445d5731	[StructurizeCFG] Make hasOnlyUniformBranches a non-member function. Summary: Lets us get rid of one member variable too. Reviewers: arsenm Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26992 llvm-svn: 287716	2016-11-22 23:13:33 +00:00
Sanjay Patel	1e6ca44a8e	add and use isBitwiseLogicOp() helper function; NFCI llvm-svn: 287712	2016-11-22 22:54:36 +00:00
Dehao Chen	554f500ae2	Before sample pgo annotation, do not inline a function that has no debug info. (NFC) If there is no debug info in the callee, inlining it will not help annotator. This avoids infinite loop as reported in PR/31119. llvm-svn: 287710	2016-11-22 22:50:01 +00:00
Davide Italiano	e7ffae9dea	[SCCP] Remove code in visitBinaryOperator (and add tests). We visit and/or, we try to derive a lattice value for the instruction even if one of the operands is overdefined. If the non-overdefined value is still 'unknown' just return and wait for ResolvedUndefsIn to "plug in" the correct value. This simplifies the logic a bit. While I'm here add tests for missing cases. llvm-svn: 287709	2016-11-22 22:11:25 +00:00
Matthias Braun	7f423442d1	TargetSubtargetInfo: Move implementation to lib/CodeGen; NFC TargetSubtargetInfo is filled with CodeGen specific interfaces nowadays (getInstrInfo(), getFrameLowering(), getSelectionDAGInfo()) most of the tuning flags like enablePostRAScheduler(), getAntiDepBreakMode(), enableRALocalReassignment(), ... also do not seem to be universal enough to make sense outside of CodeGen. Differential Revision: https://reviews.llvm.org/D26948 llvm-svn: 287708	2016-11-22 22:09:03 +00:00
Sanjay Patel	e359eaaf70	[InstCombine] change bitwise logic type to eliminate bitcasts In PR27925: https://llvm.org/bugs/show_bug.cgi?id=27925 ...we proposed adding this fold to eliminate a bitcast. In D20774, there was some concern about changing the type of a bitwise op as well as creating bitcasts that might not be free for a target. However, if we're strictly eliminating an instruction (by limiting this to one-use ops), then we should be able to do this in InstCombine. But we're cautiously restricting the transform for now to vector types to avoid possible backend problems. A transform to make sure the logic op is legal for the target should be added to reverse this transform and improve codegen. Differential Revision: https://reviews.llvm.org/D26641 llvm-svn: 287707	2016-11-22 22:05:48 +00:00
Chandler Carruth	9eb857cb84	[LCG] Add a previously missing assert about the relationship of RefSCCs. No intended change, everything seems to be in working order already. llvm-svn: 287705	2016-11-22 21:40:10 +00:00
Vyacheslav Klochkov	9a630dfb57	Fixed the lost FastMathFlags in GVN(Global Value Numbering). Reviewer: Hal Finkel. Differential Revision: https://reviews.llvm.org/D26952 llvm-svn: 287700	2016-11-22 20:52:53 +00:00
Rui Ueyama	2b4ba04d57	Remove PDBFileBuilder::build() and related functions. PDBFileBuilder supports two different ways to create files. One is PDBFileBuilder::commit. That function takes a filename and write a result to the file. The other is PDBFileBuilder::build. That returns a new PDBFile object. This patch removes the latter because no one is using it and in a real life situation we are very unlikely to need it. Even if you need it, it'd be easy to write a new PDB to a memory buffer and read it back. Removing PDBFileBuilder::build enables us to remove other classes build transitively. Differential Revision: https://reviews.llvm.org/D26987 llvm-svn: 287697	2016-11-22 20:32:22 +00:00
Vyacheslav Klochkov	68a677ae5b	Fixed the lost FastMathFlags in Reassociate optimization. Reviewer: Hal Finkel. Differential Revision: https://reviews.llvm.org/D26957 llvm-svn: 287695	2016-11-22 20:23:04 +00:00
Paul Robinson	f428c9b298	Restructure DwarfDebug::beginInstruction(). [NFC] Will help a pending patch. Differential Revision: http://reviews.llvm.org/D26982 llvm-svn: 287686	2016-11-22 19:46:51 +00:00
Shoaib Meenai	5497121613	[Triple] Add Facebook vendor Add a compiler vendor for Facebook, to enable future vendor-specific behavior. Differential Revision: https://reviews.llvm.org/D25136 llvm-svn: 287684	2016-11-22 19:36:26 +00:00
Chandler Carruth	bae595b742	[LCG] Add utilities to compute parent and ascestor relationships between SCCs. These will be fairly expensive routines to call and might be abused in real code, but are quite useful when debugging or in asserts and are reasonable and well formed properties to query. I've used one of them in an assert that was requested in a code review here. In subsequent commits I'll start using these routines more heavily, for example in unittests etc. But this at least gets the groundwork in place. Differential Revision: https://reviews.llvm.org/D25506 llvm-svn: 287682	2016-11-22 19:23:31 +00:00
Simon Dardis	6efb8dd2e3	[mips] seb, seh instruction aliases Add the single operand form. Reviewers: vkalintiris Differential Revision: https://reviews.llvm.org/D26961 llvm-svn: 287681	2016-11-22 19:17:23 +00:00
Nemanja Ivanovic	b8e30d6db6	[PowerPC] Emit VMX loads/stores for aligned ops to avoid adding swaps on LE This patch corresponds to review: https://reviews.llvm.org/D26861 It also fixes PR30730. Committing on behalf of Lei Huang. llvm-svn: 287679	2016-11-22 19:02:07 +00:00
Simon Pilgrim	4aa876ca7c	[X86][SSE] Combine UNPCKL(FHADD,FHADD) -> FHADD for v2f64 shuffles. This occurs during UINT_TO_FP v2f64 lowering. We can easily generalize this to other horizontal ops (FHSUB, PACKSS, PACKUS) as required - we are doing something similar with PACKUS in lowerV2I64VectorShuffle llvm-svn: 287676	2016-11-22 17:50:06 +00:00
Vasileios Kalintiris	04dc211e6a	[mips] Add support for unaligned load/store macros. Add missing unaligned store macros (ush/usw) and fix the exisiting implementation of the unaligned load macros in order to generate identical expansions with the GNU assembler. llvm-svn: 287646	2016-11-22 16:43:49 +00:00
Tim Northover	b64fb453ea	CodeGen: simplify TargetMachine::getSymbol interface. NFC. No-one actually had a mangler handy when calling this function, and getSymbol itself went most of the way towards getting its own mangler (with a local TLOF variable) so forcing all callers to supply one was just extra complication. llvm-svn: 287645	2016-11-22 16:17:20 +00:00
Zvi Rackover	9a355219d1	[X86] Change lowerBuildVectorToBitOp() to take a BuildVectorSDNode. NFC. llvm-svn: 287644	2016-11-22 15:33:28 +00:00
Zvi Rackover	0aa1c32d14	[X86] Remove dead code from LowerVectorBroadcast Summary: Splat vectors are canonicalized to BUILD_VECTOR's so the code can be simplified. NFC-ish. Reviewers: craig.topper, delena, RKSimon, andreadb Subscribers: RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D26678 llvm-svn: 287643	2016-11-22 15:17:52 +00:00
Chad Rosier	ecc77273a0	[AArch64] Set the max interleave factor for Falkor. llvm-svn: 287642	2016-11-22 14:25:02 +00:00
Chad Rosier	2abc29c593	[AArch64] Maximize 80-column. NFC. llvm-svn: 287640	2016-11-22 14:12:09 +00:00
Simon Pilgrim	72e43570b7	[SelectionDAG] ComputeNumSignBits of TRUNCATE operations Add basic ComputeNumSignBits support for TRUNCATE ops for cases where the source's number of sign bits overlaps with the truncated size. Improves X86 SIGN_EXTEND_IN_REG vector cases which were needlessly sign extending boolean vector results. Differential Revision: https://reviews.llvm.org/D26851 llvm-svn: 287635	2016-11-22 11:29:19 +00:00
Coby Tayree	49b3733d57	[AVX512][inline-asm] Fix AVX512 inline assembly instruction resolution when the size qualifier of a memory operand is not specified explicitly. This commit handles cases where the size qualifier of an indirect memory reference operand in Intel syntax is missing (e.g. "vaddps xmm1, xmm2, [a]"). GCC will deduce the size qualifier for AVX512 vector and broadcast memory operands based on the possible matches: "vaddps xmm1, xmm2, [a]" matches only “XMMWORD PTR” qualifier. "vaddps xmm1, xmm2, [a]{1to4}" matches only “DWORD PTR” qualifier. This is different from the current behavior of LLVM, which deduces the size qualifier based on the size of the memory operand. For "vaddps xmm1, xmm2, [a]" "char a;" will imply "BYTE PTR" qualifier "short a;" will imply "WORD PTR" qualifier. This commit aligns LLVM to GCC’s behavior. This is the LLVM part of the review. The Clang part of the review: https://reviews.llvm.org/D26587 Differential Revision: https://reviews.llvm.org/D26586 llvm-svn: 287630	2016-11-22 09:30:29 +00:00
Adam Nemet	de33651bd9	Rename option to -lto-pass-remarks-output The new option -pass-remarks-output broke LLVM_LINK_LLVM_DYLIB because of the duplicate option name with opt. llvm-svn: 287627	2016-11-22 07:35:14 +00:00
Craig Topper	3dcf45f08d	[X86] Remove alternate CodeGenOnly version of (v)movq that declared the load size as i128mem. Change all uses to the use the i64mem version. I'm sure this caused the load size to misprint in Intel syntax output. We were also inconsistent about which patterns used which instruction between VEX and EVEX. There are two different reg/reg versions of movq, one from a GPR and one from the lower 64-bits of an XMM register. This changes the loading folding table to use the single i64mem memory form for folding both cases. But we need to use TB_NO_REVERSE to prevent a duplicate entry in the unfolding table. llvm-svn: 287622	2016-11-22 05:31:43 +00:00
Craig Topper	cada9f2275	[AVX-512] Add support for commuting VPERMT2(B/W/D/Q/PS/PD) to/from VPERMI2(B/W/D/Q/PS/PD). Summary: The index and one of the table operands can be swapped by changing the opcode to the other version. Neither of these operands are the one that can load from memory so this can't be used to increase memory folding opportunities. We need to handle the unmasked forms and the kz forms. Since the load operand isn't being commuted we can commute the load and broadcast instructions too. Reviewers: igorb, delena, Ayal, Farhana, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25652 llvm-svn: 287621	2016-11-22 04:57:34 +00:00
Saleem Abdulrasool	9b106ea072	MC: ensure that we have a section before accessing it We would attempt to access the symbol section without ensuring that the symbol was not absolute. When the assembler referenced relocation is not evaluated to the absolute, but when we record the relocation, we would query the section. Because the symbol is absolute, it does not have a section associated with it, triggering an assertion. Just be more careful about the access of the section. Addresses PR31064! llvm-svn: 287619	2016-11-22 04:32:54 +00:00
Craig Topper	da22267055	[AVX-512] Add support for changing the element size of PALIGNR/VALIGND/VALIGNQ shuffles if they feed a vselect with a different type Summary: Shuffle lowering widens the element size of a shuffle if elements are contiguous. This is sometimes help because wider element types have more shuffle options. If the shuffle is one of the arguments to a vselect this shuffle widening can introduce a bitcast between the vselect and the shuffle. This will prevent isel from selecting a masked operation. If the shuffle can be written equally efficiently with a different element size to match the vselect type we should change the shuffle type to allow masking. This patch does this conversion for all VALIGND/VALIGNQ sizes. It also supports turning 128-bit PALIGNR into VALIGND/VALIGNQ. This fixes the case shown in PR31018. I plan to add support for more operations in future patches. Reviewers: RKSimon, zvi, delena Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26902 llvm-svn: 287612	2016-11-22 03:51:53 +00:00
Peter Collingbourne	435890a4fe	Object: Make SymbolicFile::symbol_{begin,end}() virtual and remove unnecessary wrappers. llvm-svn: 287611	2016-11-22 03:38:40 +00:00
Stanislav Mekhanoshin	ae0f6620e4	[AMDGPU] Fix multiple vreg definitions in si-lower-control-flow Differential Revision: https://reviews.llvm.org/D26939 llvm-svn: 287608	2016-11-22 01:42:34 +00:00
Peter Collingbourne	0a4fc46321	Analysis: gep inbounds (gep inbounds (...)) is inbounds. Differential Revision: https://reviews.llvm.org/D26441 llvm-svn: 287604	2016-11-22 01:03:40 +00:00
Matt Arsenault	b30d2aca58	DAG: Ignore call site attributes when emitting target intrinsic A target intrinsic may be defined as possibly reading memory, but the call site may have additional knowledge that it doesn't read memory. The intrinsic lowering will expect the pessimistic assumption of the intrinsic definition, so the chain should still be used. llvm-svn: 287593	2016-11-21 22:56:42 +00:00
Geoff Berry	e0bf52f394	[AArch64LoadStoreOptimizer] Don't treat write to XZR/WZR as a clobber. Summary: When searching for load/store instructions to pair/merge don't treat writes to WZR/XZR as clobbers since they don't change the value read from WZR/XZR (which is always 0). Reviewers: mcrosier, junbuml, jmolloy, t.p.northover Subscribers: aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D26921 llvm-svn: 287592	2016-11-21 22:51:10 +00:00
Justin Lebar	3e50a5be8f	[CodeGenPrepare] Don't sink non-cheap addrspacecasts. Summary: Previously, CGP would unconditionally sink addrspacecast instructions, even going so far as to sink them into a loop. Now we check that the cast is "cheap", as defined by TLI. We introduce a new "is-cheap" function to TLI rather than using isNopAddrSpaceCast because some GPU platforms want the ability to ask for non-nop casts to be sunk. Reviewers: arsenm, tra Subscribers: jholewinski, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26923 llvm-svn: 287591	2016-11-21 22:49:15 +00:00
Justin Lebar	838c7f5a85	[CodeGenPrepare] Rewrite a loop in terms of llvm::none_of. NFC. Reviewers: arsenm Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26924 llvm-svn: 287590	2016-11-21 22:49:11 +00:00
Eli Friedman	c0bba1a96d	[LoopReroll] Make root-finding more aggressive. Allow using an instruction other than a mul or phi as the base for root-finding. For example, the included testcase includes a loop which requires using a getelementptr as the base for root-finding. Differential Revision: https://reviews.llvm.org/D26529 llvm-svn: 287588	2016-11-21 22:35:34 +00:00
Sanjay Patel	3b0bafee63	[InstCombine] canonicalize min/max constant to select's false value This is a first step towards canonicalization and improved folding/codegen for integer min/max as discussed here: http://lists.llvm.org/pipermail/llvm-dev/2016-November/106868.html Here, we're just matching the simplest min/max patterns and adjusting the icmp predicate while swapping the select operands. I've included FIXME tests in test/Transforms/InstCombine/select_meta.ll so it's easier to see how this might be extended (corresponds to the TODO comment in the code). That's also why I'm using matchSelectPattern() rather than a simpler check; once the backend is patched, we can just remove some of the restrictions to allow the obfuscated min/max patterns in the FIXME tests to be matched. Differential Revision: https://reviews.llvm.org/D26525 llvm-svn: 287585	2016-11-21 22:04:14 +00:00
Evgeny Stupachenko	8efbe6acae	LSR debug fix. Summary: Dump instruction instead of address. Reviewers: hfinkel Differential Revision: http://reviews.llvm.org/D26877 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 287584	2016-11-21 21:55:03 +00:00
Sanjay Patel	c89911ba02	fix formatting; NFC llvm-svn: 287582	2016-11-21 21:48:36 +00:00
Reid Kleckner	01660a3d2a	[asan] Make ASan compatible with linker dead stripping on Windows Summary: This is similar to what was done for Darwin in rL264645 / http://reviews.llvm.org/D16737, but it uses COFF COMDATs to achive the same result instead of relying on new custom linker features. As on MachO, this creates one metadata global per instrumented global. The metadata global is placed in the custom .ASAN$GL section, which the ASan runtime will iterate over during initialization. There are no other references to the metadata, so normal linker dead stripping would discard it. However, the metadata is put in a COMDAT group with the instrumented global, so that it will be discarded if and only if the instrumented global is discarded. I didn't update the ASan ABI version check since this doesn't affect non-Windows platforms, and the WinASan ABI isn't really stable yet. Implementing this for ELF will require extending LLVM IR and MC a bit so that we can use non-COMDAT section groups. Reviewers: pcc, kcc, mehdi_amini, kubabrecka Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26770 llvm-svn: 287576	2016-11-21 20:40:37 +00:00
Simon Dardis	43115a1ce4	[mips] seq macro support This patch adds the seq macro. This partially resolves PR/30381. Thanks to Sean Bruno for reporting the issue! Reviewers: zoran.jovanovic, vkalintiris, seanbruno Differential Revision: https://reviews.llvm.org/D24607 llvm-svn: 287573	2016-11-21 20:30:41 +00:00
Krzysztof Parzyszek	73c8a9bc2f	Check proper live range in extendPHIRanges The function extendPHIRanges checks the main range of the original live interval, even when dealing with a subrange. This could also lead to an assert when the subrange is not live at the extension point, but the main range is. To avoid this, check the corresponding subrange of the original live range, instead of always checking the main range. Review (as a part of a bigger set of changes): https://reviews.llvm.org/D26359 llvm-svn: 287571	2016-11-21 20:24:12 +00:00
Marcin Koscielnicki	6af8e6c3d5	[TLI] Fix breakage introduced by D21739. The initialize function has an early return for AMDGPU targets. If taken, the ShouldExtI32* initialization code will not be executed, resulting in invalid values in the corresponding fields. Fix this by moving the code to the top of the function. llvm-svn: 287570	2016-11-21 20:20:39 +00:00
Shoaib Meenai	106e05a0e8	[AsmPrinter] Enable codeview for windows-itanium Enable codeview emission for windows-itanium targets. Co-opt an existing test (which is derived from a C source file and should therefore be identical across the Itanium and MS ABIs). Differential Revision: https://reviews.llvm.org/D26693 llvm-svn: 287567	2016-11-21 20:13:32 +00:00
Mandeep Singh Grang	73f0095d71	[MemorySSA] Fix for non-determinism in codegen This patch fixes the non-determinism caused due to iterating SmallPtrSet's which was uncovered due to the experimental "reverse iteration order " patch: https://reviews.llvm.org/D26718 The following unit tests failed because of the undefined order of iteration. LLVM :: Transforms/Util/MemorySSA/cyclicphi.ll LLVM :: Transforms/Util/MemorySSA/many-dom-backedge.ll LLVM :: Transforms/Util/MemorySSA/many-doms.ll LLVM :: Transforms/Util/MemorySSA/phi-translation.ll Reviewers: dberlin, mgrang Subscribers: dberlin, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D26704 llvm-svn: 287563	2016-11-21 19:33:02 +00:00
Simon Pilgrim	5662074ba3	[VectorLegalizer] Remove EVT::getSizeInBits code duplications. NFCI. We were calling SVT.getSizeInBits() several times in a row - just call it once and reuse the result. llvm-svn: 287556	2016-11-21 18:24:44 +00:00
Jun Bum Lim	82f55c5446	[CodeGenPrep] Skip merging empty case blocks Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targetting. Reviewers: t.p.northover, mcrosier, manmanren, wmi, davidxl Subscribers: qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D22696 llvm-svn: 287553	2016-11-21 16:47:28 +00:00
Coby Tayree	94ddbb4a04	small fixup which enables the issuing of the aforementioned instruction (w/o operands), on MS/Intel syntax. Differential Revision: https://reviews.llvm.org/D26913 llvm-svn: 287548	2016-11-21 15:50:56 +00:00
Yaxun Liu	02f75f31e0	Fix known zero bits for addrspacecast. Currently LLVM assumes that a pointer addrspacecasted to a different addr space is equivalent to trunc or zext bitwise, which is not true. For example, in amdgcn target, when a null pointer is addrspacecasted from addr space 4 to 0, its value is changed from i64 0 to i32 -1. This patch teaches LLVM not to assume known bits of addrspacecast instruction to its operand. Differential Revision: https://reviews.llvm.org/D26803 llvm-svn: 287545	2016-11-21 15:42:31 +00:00
Simon Pilgrim	49d7eda968	[SelectionDAG] Add ComputeNumSignBits support for CONCAT_VECTORS opcode llvm-svn: 287541	2016-11-21 14:36:19 +00:00
Simon Pilgrim	b7bbaa669b	[X86][SSE] Allow PACKSS to be used to truncate any type of all/none sign bits input At the moment we only use truncateVectorCompareWithPACKSS with direct vector comparison results (just one example of a known all/none signbits input). This change relaxes the direct matching of a SETCC opcode by moving the logic up into SelectionDAG::ComputeNumSignBits and accepting any input with a known splatted signbit. llvm-svn: 287535	2016-11-21 12:05:49 +00:00
Marcin Koscielnicki	1c2bd1e9f3	[InstrProfiling] Mark __llvm_profile_instrument_target last parameter as i32 zeroext if appropriate. On some architectures (s390x, ppc64, sparc64, mips), C-level int is passed as i32 signext instead of plain i32. Likewise, unsigned int may be passed as i32, i32 signext, or i32 zeroext depending on the platform. Mark __llvm_profile_instrument_target properly (its last parameter is unsigned int). This (together with the clang change) makes compiler-rt profile testsuite pass on s390x. Differential Revision: http://reviews.llvm.org/D21736 llvm-svn: 287534	2016-11-21 11:57:19 +00:00
Marcin Koscielnicki	5ae2c526db	[TLI] Add functions determining if int parameters/returns should be zeroext/signext. On some architectures (s390x, ppc64, sparc64, mips), C-level int is passed as i32 signext instead of plain i32. Likewise, unsigned int may be passed as i32, i32 signext, or i32 zeroext depending on the platform. Add this information to TargetLibraryInfo, to be used whenever some LLVM pass inserts a compiler-rt call to a function involving int parameters or returns. Differential Revision: http://reviews.llvm.org/D21739 llvm-svn: 287533	2016-11-21 11:57:11 +00:00
Michael Zuckerman	8462faeaba	Fixing a small typo (A->U). This seem to fixes PR30992. - HasAVX512 ? X86::VMOVAPSZ128rm_NOVLX + HasAVX512 ? X86::VMOVUPSZ128rm_NOVLX llvm-svn: 287532	2016-11-21 11:52:11 +00:00
Craig Topper	9f2d632ee7	[AVX-512] Add EVEX form of VMOVZPQILo2PQIZrm to load folding tables to match SSE and AVX. llvm-svn: 287523	2016-11-21 07:51:31 +00:00
Alexei Starovoitov	7ab125dbf3	[bpf] fix dwarf elf relocs and line numbers - teach RelocVisitor to recognize bpf relocations - fix AsmInfo->PointerSize to make sure dwarf is emitted correctly - add a test for the above Signed-off-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 287521	2016-11-21 06:21:23 +00:00
Craig Topper	0dfc09372f	[X86] Remove duplicate instructions for (v)movq and replace with patterns on other instructions. NFC llvm-svn: 287519	2016-11-21 04:07:56 +00:00
Dean Michael Berris	31761f300d	[XRay][AArch64] Implemented a test for the compile-time sleds emitted, and fixed a bug in the jump instruction This patch adds a test for the assembly code emitted with XRay instrumentation. It also fixes a bug where the operand of a jump instruction must be not the number of bytes to jump over, but rather the number of 4-byte instructions. Author: rSerge Reviewers: dberris, rengolin Differential Revision: https://reviews.llvm.org/D26805 llvm-svn: 287516	2016-11-21 03:01:43 +00:00
Davide Italiano	2ae76dd239	[GlobalSplit] Port to the new pass manager. llvm-svn: 287511	2016-11-21 00:28:23 +00:00
Simon Dardis	1dcb911061	[mips] Restrict tail call optimization The tail call optimization was being used without proper consideration of ABI requirements for saving and restoring the GP. This patch restricts tail call optimization to functions within the same translation unit. Reviewers: vkalintiris Differential Revision: https://reviews.llvm.org/D24763 llvm-svn: 287505	2016-11-20 21:23:08 +00:00
Coby Tayree	99a6639047	The 'vpmultishiftqb' instruction was implemented falsely, this patch amend it. More specifically - (MS dialect) broadcasting variants were implemented falsely. Differential Revision: https://reviews.llvm.org/D26257 llvm-svn: 287501	2016-11-20 17:19:55 +00:00
Coby Tayree	97e9cf62f4	Some instructions were missing, other implemented falsely. this patch aims at amending those issues. full list: vcvtps2pd vcvtudq2pd vcvtps2qq vcvttps2qq vcvtps2uqq vcvttps2uqq variants are: [Dst]XMM(zero-masked/merge-masked/unmasked) [Src]Mem64 Differential Revision: https://reviews.llvm.org/D26799 llvm-svn: 287500	2016-11-20 17:09:56 +00:00
Simon Pilgrim	5fadce4a3f	[X86][AVX512] Combine unary + zero target shuffles to VPERMV3 with a zero vector where possible llvm-svn: 287497	2016-11-20 16:11:36 +00:00
Simon Pilgrim	5401bae523	[X86][AVX512] Add support for VBMI VPERMV3 target shuffle combines llvm-svn: 287496	2016-11-20 15:24:38 +00:00
Simon Pilgrim	3f40412e0f	[X86][AVX512] Add support for VBMI VPERMV target shuffle combines llvm-svn: 287495	2016-11-20 15:05:45 +00:00
Simon Pilgrim	c17e1b74b8	[X86][AVX512VL] Removed duplicate operation action Basic AVX512F already declared uint_to_fp v4i32 as legal llvm-svn: 287493	2016-11-20 14:19:29 +00:00
Simon Pilgrim	3f10e9953d	Strip trailing whitespace llvm-svn: 287492	2016-11-20 14:05:23 +00:00
Simon Pilgrim	096b6d4f81	[X86][AVX512F] Add support for uint_to_fp v2i32 to v2f64 on AVX512F-only targets Use 512-bit instructions (we already do something similar for uint_to_fp v4i32 to v4f64) llvm-svn: 287491	2016-11-20 14:03:23 +00:00
Simon Pilgrim	f2fbf43704	Fix comment typos. NFC. Identified by Pedro Giffuni in PR27636. llvm-svn: 287490	2016-11-20 13:47:59 +00:00
Simon Pilgrim	7d18a70dac	Fix spelling mistakes in Transforms comments. NFC. Identified by Pedro Giffuni in PR27636. llvm-svn: 287488	2016-11-20 13:19:49 +00:00
Simon Pilgrim	7a6b6d5656	Fix spelling mistakes in SelectionDAG comments. NFC. Identified by Pedro Giffuni in PR27636. llvm-svn: 287487	2016-11-20 13:14:57 +00:00
Simon Pilgrim	fbd2221de5	Fix comment typos. NFC. Identified by Pedro Giffuni in PR27636. llvm-svn: 287486	2016-11-20 13:10:51 +00:00
Oren Ben Simhon	c0f073b67f	[X86] RegCall - Handling long double arguments The change is part of RegCall calling convention support for LLVM. Long double (f80) requires special treatment as the first f80 parameter is saved in FP0 (floating point stack). This review present the change and the corresponding tests. Differential Revision: https://reviews.llvm.org/D26151 llvm-svn: 287485	2016-11-20 11:06:07 +00:00
Coby Tayree	179ff0e541	[X86][InlineAsm]Test commit. Fixing a wrong comment on X86AsmParser.cpp::ParseZ: "true" --> "false" Differential Revision: https://reviews.llvm.org/D26797 llvm-svn: 287484	2016-11-20 09:31:11 +00:00
Serge Pavlov	f258ff1fa9	Fix file name resolution in nested response files If a response file in construct `@file` was specified by relative name, constructs `@file` nested within it were resolved incorrectly if the flag RelativeNames in call to ExpandResponseFile was set to true. This feature is used in configuration files, tests for it are in respective change (see D24933). llvm-svn: 287482	2016-11-20 06:25:07 +00:00
Alexei Starovoitov	e6ddac0def	[bpf] add BPF disassembler add BPF disassembler, so tools like llvm-objdump can be used: $ llvm-objdump -d -no-show-raw-insn ./sockex1_kern.o ./sockex1_kern.o: file format ELF64-BPF Disassembly of section socket1: bpf_prog1: 0: r6 = r1 8: r0 = (u8 )skb[23] 10: (u32 )(r10 - 4) = r0 18: r1 = (u32 )(r6 + 4) 20: if r1 != 4 goto 8 28: r2 = r10 30: r2 += -4 ld_imm64 (the only 16-byte insn) and special ld_abs/ld_ind instructions had to be treated in a special way. The decoders for the rest of the insns are automatically generated. Add tests to cover new functionality. Signed-off-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 287477	2016-11-20 02:25:00 +00:00
Rui Ueyama	e5669cecde	Attempt to fix big-endian buildbots. llvm-svn: 287476	2016-11-20 01:41:28 +00:00
Rui Ueyama	567d9c4b8f	Style fix. NFC. llvm-svn: 287475	2016-11-20 01:15:56 +00:00
Rui Ueyama	218072a989	Fix buildbot. llvm-svn: 287474	2016-11-20 01:13:22 +00:00
Rui Ueyama	fe33661ab0	SHA1: unroll loop in hashBlock. This code is taken from public domain. https://github.com/jsonn/src/blob/trunk/common/lib/libc/hash/sha1/sha1.c I wrote a sha1 command and ran it on my Xeon E5-2680 v2 2.80GHz machine. Here is a result. The new hash function is 37% faster than before. Performance counter stats for './llvm-sha1-old /ssd/build/bin/lld' (10 runs): 6640.503687 task-clock (msec) # 1.001 CPUs utilized ( +- 0.03% ) 54 context-switches # 0.008 K/sec ( +- 5.03% ) 5 cpu-migrations # 0.001 K/sec ( +- 31.73% ) 183,803 page-faults # 0.028 M/sec ( +- 0.00% ) 18,527,954,113 cycles # 2.790 GHz ( +- 0.03% ) 4,993,237,485 stalled-cycles-frontend # 26.95% frontend cycles idle ( +- 0.11% ) <not supported> stalled-cycles-backend 50,217,149,423 instructions # 2.71 insns per cycle # 0.10 stalled cycles per insn ( +- 0.00% ) 6,094,322,337 branches # 917.750 M/sec ( +- 0.00% ) 11,778,239 branch-misses # 0.19% of all branches ( +- 0.01% ) 6.634017401 seconds time elapsed ( +- 0.03% ) Performance counter stats for './llvm-sha1-new /ssd/build/bin/lld' (10 runs): 4167.062720 task-clock (msec) # 1.001 CPUs utilized ( +- 0.02% ) 52 context-switches # 0.012 K/sec ( +- 16.45% ) 7 cpu-migrations # 0.002 K/sec ( +- 32.20% ) 183,804 page-faults # 0.044 M/sec ( +- 0.00% ) 11,626,611,958 cycles # 2.790 GHz ( +- 0.02% ) 4,491,897,976 stalled-cycles-frontend # 38.63% frontend cycles idle ( +- 0.05% ) <not supported> stalled-cycles-backend 24,320,180,617 instructions # 2.09 insns per cycle # 0.18 stalled cycles per insn ( +- 0.00% ) 1,574,674,576 branches # 377.886 M/sec ( +- 0.00% ) 11,769,693 branch-misses # 0.75% of all branches ( +- 0.00% ) 4.163251552 seconds time elapsed ( +- 0.02% ) Differential Revision: https://reviews.llvm.org/D26890 llvm-svn: 287473	2016-11-20 01:03:22 +00:00
Saleem Abdulrasool	a577509f0a	Demangle: remove references to allocator for default allocator The demangler had stopped using a custom allocator but had not been updated to remove the use of the explicit allocator passing. This removes that as we do not need to do anything special here anymore. This just makes the code more compact. NFC. llvm-svn: 287472	2016-11-20 00:20:27 +00:00
Saleem Abdulrasool	54ec3f9cf8	Demangle: remove unnecessary typedef for std::vector We could create a local typedef for std::vector called Vector. Inline the use of std::vector rather than use the typedef. NFC. llvm-svn: 287471	2016-11-20 00:20:25 +00:00
Saleem Abdulrasool	be1fd54f85	Demangle: replace custom typedef for std::string with std::string We created a local typedef for `std::basic_string<char, std::char_traits<char>>` which is just `std::string`. Remove the local typedef and propagate the type information through the rest of the demangler. NFC. llvm-svn: 287470	2016-11-20 00:20:23 +00:00
Saleem Abdulrasool	0da9050976	Demangle: use direct member initialization (NFC) Prefer direct member initialization over the explicit out-of-line initialization for the construction of the local type. NFC. llvm-svn: 287469	2016-11-20 00:20:20 +00:00
Benjamin Kramer	ffd3715d16	Give some helper classes/functions internal linkage. NFC. llvm-svn: 287462	2016-11-19 20:44:26 +00:00
Simon Pilgrim	a14e0cb852	[X86][SSE] Improve PSHUFB lowering from either input Canonicalization may leave the zeroable vector in the first input. llvm-svn: 287461	2016-11-19 20:41:48 +00:00
Simon Pilgrim	623a7c57b5	[X86][AVX512] Add VPERMV/VPERMV3 v64i8 byte shuffles on avx512vbmi targets llvm-svn: 287459	2016-11-19 20:12:34 +00:00
Mehdi Amini	fec2158292	[ThinLTO] Fix crash when importing an opaque type It seems that because ThinLTO does not import the full module, some invariant of the type mapper are broken. In Monolithic LTO, we import every globals: when calling IRLinker::copyFunctionProto() on @foo(), we end-up calling TypeMapTy::get(FTy) on the type of @foo(), which will map %0 and record the destination as opaque. ThinLTO skips this because @foo is not imported and goes directly to the next stage. Next we call computeTypeMapping() that map the types for each globals, and ends up checking for type isomorphism, and may add type mapping. However it doesn't record if there was an opaque destination type that was resolved. Instead of lazily "discovering" opaque type in the destination module on the go, we change the TypeFinder to eagerly record all types and not only the named ones. Differential Revision: https://reviews.llvm.org/D26840 llvm-svn: 287453	2016-11-19 18:44:16 +00:00
Mehdi Amini	19f176b982	[ThinLTO] Implement -pass-remarks-output in ThinLTOCodeGenerator Summary: This will also be added to the LTO API, right now this will bring ThinLTO on par with Monolithic LTO on Darwin. Reviewers: anemet Subscribers: tejohnson, llvm-commits Differential Revision: https://reviews.llvm.org/D26886 llvm-svn: 287450	2016-11-19 18:20:05 +00:00
Mehdi Amini	6f40836823	Change setDiagnosticsOutputFile to take a unique_ptr from a raw pointer (NFC) Summary: This makes it explicit that ownership is taken. Also replace all `new` with make_unique<> at call sites. Reviewers: anemet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26884 llvm-svn: 287449	2016-11-19 18:19:41 +00:00
Craig Topper	893ea9fb2c	[X86] Simplify some code a little by removing a dulicate variable and combinining two if statements. NFCI llvm-svn: 287443	2016-11-19 17:33:17 +00:00
Daniel Sanders	c95590bc45	Try again to fix unused variable warning on lld-x86_64-darwin13 after r287439. The previous attempt didn't work. I assume LLVM_ATTRIBUTE_UNUSED isn't available on that machine. llvm-svn: 287442	2016-11-19 14:47:41 +00:00
Daniel Sanders	c6d1986a84	Try to fix unused variable warning on lld-x86_64-darwin13 after r287439. Whether the variable is used or not depends on NDEBUG. llvm-svn: 287440	2016-11-19 13:50:32 +00:00
Daniel Sanders	72db2a390a	Check that emitted instructions meet their predicates on all targets except ARM, Mips, and X86. Summary: * ARM is omitted from this patch because this check appears to expose bugs in this target. * Mips is omitted from this patch because this check either detects bugs or deliberate emission of instructions that don't satisfy their predicates. One deliberate use is the SYNC instruction where the version with an operand is correctly defined as requiring MIPS32 while the version without an operand is defined as an alias of 'SYNC 0' and requires MIPS2. * X86 is omitted from this patch because it doesn't use the tablegen-erated MCCodeEmitter infrastructure. Patches for ARM and Mips will follow. Depends on D25617 Reviewers: tstellarAMD, jmolloy Subscribers: wdng, jmolloy, aemerson, rengolin, arsenm, jyknight, nemanjai, nhaehnle, tstellarAMD, llvm-commits Differential Revision: https://reviews.llvm.org/D25618 llvm-svn: 287439	2016-11-19 13:05:44 +00:00
Dylan McKay	1a55f201ef	[AVR] Remove a bunch of unused variables llvm-svn: 287416	2016-11-19 01:33:42 +00:00
Dylan McKay	19270f3438	[AVR] Remove a variable which was unused in release mode In release mode where assertions are not enabled, this caused an 'unused variable' warning. llvm-svn: 287414	2016-11-19 01:14:44 +00:00
Konstantin Zhuravlyov	aefee42e0f	[AMDGPU] Change frexp.exp intrinsic to return i16 for f16 input Differential Revision: https://reviews.llvm.org/D26862 llvm-svn: 287389	2016-11-18 22:31:08 +00:00
Simon Pilgrim	e40900dddd	[SelectionDAG] Add knowbits support for CONCAT_VECTOR opcode llvm-svn: 287387	2016-11-18 22:21:22 +00:00
Michael Zolotukhin	5020c9971b	[LoopSimplify] Preserve LCSSA when removing edges from unreachable blocks. This fixes PR30454. llvm-svn: 287379	2016-11-18 21:01:12 +00:00
Mehdi Amini	bf4d8d033b	Revert "Add link-time detection of LLVM_ABI_BREAKING_CHECKS mismatch" This reverts commit r287352, LLDB CI is broken. llvm-svn: 287374	2016-11-18 20:02:34 +00:00
Matthias Braun	db39fd6c53	Statistic/Timer: Include timers in PrintStatisticsJSON(). Differential Revision: https://reviews.llvm.org/D25588 llvm-svn: 287370	2016-11-18 19:43:24 +00:00
Matthias Braun	9f15a79e5d	Timer: Track name and description. The previously used "names" are rather descriptions (they use multiple words and contain spaces), use short programming language identifier like strings for the "names" which should be used when exporting to machine parseable formats. Also removed a unused TimerGroup from Hexxagon. Differential Revision: https://reviews.llvm.org/D25583 llvm-svn: 287369	2016-11-18 19:43:18 +00:00
Geoff Berry	b51774ac8c	[MIRPrinter] Print raw branch probabilities as expected by MIRParser Fixes PR28751. Reviewers: MatzeB, qcolombet Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D26775 llvm-svn: 287368	2016-11-18 19:37:24 +00:00
Matt Arsenault	afe614cb38	AMDGPU: Fix unused variable warning llvm-svn: 287362	2016-11-18 18:33:36 +00:00
Adam Nemet	e9bd022c41	[LTO] Add option to generate optimization records It is used to drive this from the clang driver via -mllvm. Same option name is used as in opt. Differential Revision: https://reviews.llvm.org/D26832 llvm-svn: 287356	2016-11-18 18:06:28 +00:00
Hans Wennborg	aeacdc258b	IRMover: Avoid accidentally mapping types from the destination module (PR30799) During Module linking, it's possible for SrcM->getIdentifiedStructTypes(); to return types that are actually defined in the destination module (DstM). Depending on how the bitcode file was read, getIdentifiedStructTypes() might do a walk over all values, including metadata nodes, looking for types. In my case, a debug info metadata node was shared between the two modules, and it referred to a type defined in the destination module (see test case). Differential Revision: https://reviews.llvm.org/D26212 llvm-svn: 287353	2016-11-18 17:33:05 +00:00
Mehdi Amini	c311528516	Add link-time detection of LLVM_ABI_BREAKING_CHECKS mismatch Summary: LLVM will define a symbol, either EnableABIBreakingChecks or DisableABIBreakingChecks depending on the configuration setting for LLVM_ABI_BREAKING_CHECKS. The llvm-config.h header will add weak references to these symbols in every clients that includes this header. This should ensure that a mismatch triggers a link failure (or a load time failure for DSO). On MSVC, the pragma "detect_mismatch" is used instead. Reviewers: rnk, jroelofs Subscribers: llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D26841 llvm-svn: 287352	2016-11-18 17:28:10 +00:00
Ehsan Amiri	395be572f0	[PPC] limit line width to 80 characters NFC. Forgot to fix this in the original commit. llvm-svn: 287350	2016-11-18 16:24:27 +00:00
Simon Dardis	0e2ee3b4b9	[mips][msa] Implement f16 support The MIPS MSA ASE provides instructions to convert to and from half precision floating point. This patch teaches the MIPS backend to treat f16 as a legal type and how to promote such values to f32 for the usual set of operations. As a result of this, the fexup[lr].w intrinsics no longer crash LLVM during type legalization. Reviewers: zoran.jovanvoic, vkalintiris Differential Revision: https://reviews.llvm.org/D26398 llvm-svn: 287349	2016-11-18 16:17:44 +00:00
Tom Stellard	df613198c0	GlobalISel: Fix unconditional fallback with global isel abort is disabled Reviewers: t.p.northover, ab, qcolombet Subscribers: mehdi_amini, vkalintiris, wdng, dberris, llvm-commits, rovka Differential Revision: https://reviews.llvm.org/D26765 llvm-svn: 287344	2016-11-18 14:14:35 +00:00
Tom Stellard	01e65d2cfc	AMDGPU/SI: Remove zero_extend patterns for i16 ops selected to 32-bit insts Summary: The 32-bit instructions don't zero the high 16-bits like the 16-bit instructions do. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D26828 llvm-svn: 287342	2016-11-18 13:53:34 +00:00
Florian Hahn	77382be56b	[simplifycfg][loop-simplify] Preserve loop metadata in 2 transformations. insertUniqueBackedgeBlock in lib/Transforms/Utils/LoopSimplify.cpp now propagates existing llvm.loop metadata to newly the added backedge. llvm::TryToSimplifyUncondBranchFromEmptyBlock in lib/Transforms/Utils/Local.cpp now propagates existing llvm.loop metadata to the branch instructions in the predecessor blocks of the empty block that is removed. Differential Revision: https://reviews.llvm.org/D26495 llvm-svn: 287341	2016-11-18 13:12:07 +00:00
Simon Pilgrim	7938bd666e	Cleanup function with clang-format. NFCI. llvm-svn: 287340	2016-11-18 12:16:18 +00:00
Nicolai Haehnle	ce2b589df5	AMDGPU: Fix legalization of MUBUF instructions in shaders Summary: The addr64-based legalization is incorrect for MUBUF instructions with idxen set as well as for BUFFER_LOAD/STORE_FORMAT_* instructions. This affects e.g. shaders that access buffer textures. Since we never actually need the addr64-legalization in shaders, this patch takes the easy route and keys off the calling convention. If this ever affects (non-OpenGL) compute, the type of legalization needs to be chosen based on some TSFlag. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98664 Reviewers: arsenm, tstellarAMD Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D26747 llvm-svn: 287339	2016-11-18 11:55:52 +00:00
Simon Pilgrim	dcd8433597	Fix spelling mistakes in MIPS target comments. NFC. Identified by Pedro Giffuni in PR27636. llvm-svn: 287338	2016-11-18 11:53:36 +00:00
Ehsan Amiri	ff0942e6ea	[Power9] Add patterns for vnegd, vnegw Exploit new instructions by adding patterns to .td file. https://reviews.llvm.org/D26551 llvm-svn: 287334	2016-11-18 11:05:55 +00:00
Simon Pilgrim	e995a8088d	Fix spelling mistakes in AMDGPU target comments. NFC. Identified by Pedro Giffuni in PR27636. llvm-svn: 287333	2016-11-18 11:04:02 +00:00
Simon Pilgrim	fd8bf984f4	Fix typo in comment. NFC. Identified by Pedro Giffuni in PR27636. llvm-svn: 287331	2016-11-18 10:52:12 +00:00
Ehsan Amiri	85818684c6	[PPC][DAGCombine] Convert SETCC to subtract when the result is zero extended When we see a SETCC whose only users are zero extend operations, we can replace it with a subtraction. This results in doing all calculations in GPRs and avoids CR use. Currently we do this only for ULT, ULE, UGT and UGE condition codes. There are ways that this can be extended. For example for signed condition codes. In that case we will be introducing additional sign extend instructions, so more careful profitability analysis may be required. Another direction to extend this is for equal, not equal conditions. Also when users of SETCC are any_ext or sign_ext, we might be able to do something similar. llvm-svn: 287329	2016-11-18 10:41:44 +00:00
Craig Topper	1de753f7f5	[InstCombine][AVX-512] Teach InstCombineCalls how to handle the intrinsics for variable shift with 16-bit elements. This is a straightforward extension of the existing support for 32/64-bit element types. Just needed to add the additional instrinsics to the switches. llvm-svn: 287316	2016-11-18 06:04:33 +00:00
Craig Topper	02b5a1b50f	[AVX-512] Replace masked 16-bit element variable shift intrinsics with new unmasked versions and selects. The same thing was done to 32-bit and 64-bit element sizes previously. This will allow us to support these shuffls in InstCombineCalls along with the other variable shift intrinsics. llvm-svn: 287312	2016-11-18 05:04:44 +00:00
Matt Arsenault	eff1ad8d8e	AMDGPU: Move redundant setting of inst properties llvm-svn: 287311	2016-11-18 04:42:59 +00:00
Matt Arsenault	742deb2495	AMDGPU: Fix crash on illegal type for inlineasm There are still crashes on non-MVT types in other places. llvm-svn: 287310	2016-11-18 04:42:57 +00:00
Peter Collingbourne	63e10c9c96	Object: Simplify; remove unnecessary use of unique_ptr. llvm-svn: 287305	2016-11-18 03:20:36 +00:00
Matthias Braun	637488dbf8	MachineOperand: Add dump() method llvm-svn: 287302	2016-11-18 02:40:40 +00:00
Alexei Starovoitov	8f9f8210c1	convert bpf assembler to look like kernel verifier output since bpf instruction set was introduced people learned to read and understand kernel verifier output whereas llvm asm output stayed obscure and unknown. Convert llvm to emit assembler text similar to kernel to avoid this discrepancy Signed-off-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 287300	2016-11-18 02:32:35 +00:00
Craig Topper	07f1c15995	[AVX-512] Support FCOPYSIGN for v16f32 and v8f64 Summary: This extends FCOPYSIGN support to 512-bit vectors. I've also added tests to show what the 128-bit and 256-bit cases look like with broadcast loads. Reviewers: delena, zvi, RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26791 llvm-svn: 287298	2016-11-18 02:25:34 +00:00
Simon Pilgrim	6ba672e542	Fix spelling mistakes in Hexagon target comments. NFC. Identified by Pedro Giffuni in PR27636. llvm-svn: 287248	2016-11-17 19:21:20 +00:00
Simon Pilgrim	9d15fb3c10	Fix spelling mistakes in X86 target comments. NFC. Identified by Pedro Giffuni in PR27636. llvm-svn: 287247	2016-11-17 19:03:05 +00:00
Anna Zaks	9cd5ed1241	[asan] Turn on Mach-O global metadata liveness tracking by default This patch turns on the metadata liveness tracking since all known issues have been resolved. The future has been implemented in https://reviews.llvm.org/D16737 and enables support of dead code stripping option on Mach-O platforms. As part of enabling the feature, I also plan on reverting the following patch to compiler-rt: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160704/369910.html Differential Revision: https://reviews.llvm.org/D26772 llvm-svn: 287235	2016-11-17 16:55:40 +00:00
Konstantin Zhuravlyov	0a1a7b6b23	Revert "AMDGPU: Enable ConstrainCopy DAG mutation" This reverts commit r287146. This breaks few conformance tests. llvm-svn: 287233	2016-11-17 16:41:49 +00:00
Daniil Fukalov	4c3322cc84	[SCEV] limit recursion depth of CompareSCEVComplexity Summary: CompareSCEVComplexity goes too deep (50+ on a quite a big unrolled loop) and runs almost infinite time. Added cache of "equal" SCEV pairs to earlier cutoff of further estimation. Recursion depth limit was also introduced as a parameter. Reviewers: sanjoy Subscribers: mzolotukhin, tstellarAMD, llvm-commits Differential Revision: https://reviews.llvm.org/D26389 llvm-svn: 287232	2016-11-17 16:07:52 +00:00
Simon Pilgrim	67ef3b984a	Wdocumentation fix llvm-svn: 287224	2016-11-17 12:21:45 +00:00
Simon Pilgrim	8eca5520dc	[X86][SSE] Improve lowering of vXi64 multiply with known zero 32-bit halves vXi64 multiplication is lowered into 3 calls of vpmuludq with the upper/lower 32-bit halves. If any of these halves are zero then we can remove individual calls. Although there was isBuildVectorAllZeros code to do this I don't think it ever worked (maybe just for constant folded cases that don't seem to be tested for any longer). This requires additional X86ISD support for computeKnownBitsForTargetNode, so far I've just added support for X86ISD::VZEXT (VPMOVZX* - helping the AVX2+ cases). Partial fix for PR30845 Differential Revision: https://reviews.llvm.org/D26590 llvm-svn: 287223	2016-11-17 12:14:49 +00:00
Simon Pilgrim	c4d733cd6a	Fix spelling in comment. NFC. llvm-svn: 287222	2016-11-17 12:03:05 +00:00
Pablo Barrio	c41e856f53	[ARM] Relax restriction on variadic functions for tailcall optimization Summary: Variadic functions can be treated in the same way as normal functions with respect to the number and types of parameters. Reviewers: grosbach, olista01, t.p.northover, rengolin Subscribers: javed.absar, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D26748 llvm-svn: 287219	2016-11-17 10:56:58 +00:00

... 3 4 5 6 7 ...

97253 Commits