llvm-project

Commit Graph

Author	SHA1	Message	Date
Hans Wennborg	2d5841fa73	Revert r296366 "[InlineFunction] add nonnull assumptions based on argument attributes" It causes miscompiles e.g. during self-host of Clang (PR32082). llvm-svn: 296398	2017-02-27 22:33:02 +00:00
Matt Arsenault	eb522e68bc	AMDGPU: Support v2i16/v2f16 packed operations llvm-svn: 296396	2017-02-27 22:15:25 +00:00
Arnold Schwaighofer	b2605f31ed	ISel: We need to notify FastIS of the IMPLICIT_DEF we created in createSwiftErrorEntriesInEntryBlock Otherwise, it will insert instructions before it. rdar://30536186 llvm-svn: 296395	2017-02-27 22:12:06 +00:00
Zachary Turner	120faca41b	[PDB] Partial resubmit of r296215, which improved PDB Stream Library. This was reverted because it was breaking some builds, and because of incorrect error code usage. Since the CL was large and contained many different things, I'm resubmitting it in pieces. This portion is NFC, and consists of: 1) Renaming classes to follow a consistent naming convention. 2) Fixing the const-ness of the interface methods. 3) Adding detailed doxygen comments. 4) Fixing a few instances of passing `const BinaryStream& X`. These are now passed as `BinaryStreamRef X`. llvm-svn: 296394	2017-02-27 22:11:43 +00:00
Matt Arsenault	4a7cc16e89	Revert "DAG: Check if extract_vector_elt is legal or custom" This reverts r295782. This could potentially result in some legalization loops and I avoided the need for this. llvm-svn: 296393	2017-02-27 21:59:07 +00:00
Xin Tong	fe422f7462	Empty line. NFCI llvm-svn: 296392	2017-02-27 21:51:48 +00:00
Rong Xu	cbb1140720	[PGO] Fix a bug in reading text format value profile. Summary: Should use the Valuekind read from the profile. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits, xur Differential Revision: https://reviews.llvm.org/D30420 llvm-svn: 296391	2017-02-27 21:42:39 +00:00
Sanjay Patel	ae7873fe55	[ARM] don't transform an add(ext Cond), C to select unless there's a setcc of the condition The transform in question claims to be doing: // fold (add (select cc, 0, c), x) -> (select cc, x, (add, x, c)) ...starting in PerformADDCombineWithOperands(), but it wasn't actually checking for a setcc node for the sext/zext patterns. This is exactly the opposite of a transform I'd like to add to DAGCombiner's foldSelectOfConstants(), so I was seeing infinite loops with my draft of a patch applied. The changes in select_const.ll look positive (less instructions). The change in arm-and-tst-peephole.ll is unrelated. We're changing the input IR in that test to preserve the intent of the test, but that's not affected by this code change. Differential Revision: https://reviews.llvm.org/D30355 llvm-svn: 296389	2017-02-27 21:30:54 +00:00
Matt Arsenault	c9f2517e96	AMDGPU: Add some of the new gfx9 VOP3 instructions llvm-svn: 296382	2017-02-27 21:04:41 +00:00
Simon Pilgrim	5c4efcdddf	[X86][SSE] Attempt to extract vector elements through target shuffles DAGCombiner already supports peeking through shuffles to improve vector element extraction, but legalization often leaves us in situations where we need to extract vector elements after shuffles have already been lowered. This patch adds support for VECTOR_EXTRACT_ELEMENT/PEXTRW/PEXTRB instructions to attempt to handle target shuffles as well. I've covered some basic scenarios including handling shuffle mask scaling and the implicit zero-extension of PEXTRW/PEXTRB, there is more that could be done here (that I've mentioned in TODOs) but I haven't found many cases where it's worth it. Differential Revision: https://reviews.llvm.org/D30176 llvm-svn: 296381	2017-02-27 21:01:57 +00:00
Matt Arsenault	7596f13d15	AMDGPU: Support inlineasm for packed instructions Add packed types as legal so they may be used with inlineasm. Keep all operations expanded for now. llvm-svn: 296379	2017-02-27 20:52:10 +00:00
Matt Arsenault	2ed2193218	AMDGPU: Don't fold immediate if clamp/omod are set Doesn't fix any practical problems because clamp/omod are currently folded after peephole optimizer. llvm-svn: 296375	2017-02-27 20:21:31 +00:00
Matt Arsenault	3cb390498e	AMDGPU: Fold omod into instructions llvm-svn: 296372	2017-02-27 19:35:42 +00:00
Taewook Oh	a49eb8578a	[TailDuplicator] Maintain DebugLoc for branch instructions Summary: Existing implementation of duplicateSimpleBB function drops DebugLoc metadata of branch instructions during the transformation. This patch addresses this issue by making newly created branch instructions to keep the metadata of replaced branch instructions. Reviewers: qcolombet, craig.topper, aprantl, MatzeB, sanjoy, dblaikie Reviewed By: dblaikie Subscribers: dblaikie, llvm-commits Differential Revision: https://reviews.llvm.org/D30026 llvm-svn: 296371	2017-02-27 19:30:01 +00:00
Matt Arsenault	e2d1d3a940	AMDGPU: Add f16 to shader calling conventions Mostly useful for writing tests for f16 features. llvm-svn: 296370	2017-02-27 19:24:47 +00:00
Matt Arsenault	9be7b0d485	AMDGPU: Add VOP3P instruction format Add a few non-VOP3P but instructions related to packed. Includes hack with dummy operands for the benefit of the assembler llvm-svn: 296368	2017-02-27 18:49:11 +00:00
Sanjay Patel	40975e05eb	[InlineFunction] add nonnull assumptions based on argument attributes This was suggested in D27855: have the inliner add assumptions, so we don't lose nonnull info provided by argument attributes. This still doesn't solve PR28430 (dyn_cast), but this gets us closer. https://reviews.llvm.org/D29999 llvm-svn: 296366	2017-02-27 18:13:48 +00:00
Krzysztof Parzyszek	e9be35596e	[Hexagon] Defs and clobbers can overlap llvm-svn: 296365	2017-02-27 18:03:35 +00:00
Xin Tong	16b85a6601	Fix a bug when unswitching on partial LIV for SwitchInst Summary: Fix a bug when unswitching on partial LIV for SwitchInst. Reviewers: hfinkel, efriedma, sanjoy Reviewed By: sanjoy Subscribers: david2050, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D29107 llvm-svn: 296363	2017-02-27 18:00:13 +00:00
Rong Xu	08d0840273	Fix comments. NFC. Change "Thin-LTO" to "ThinLTO" in the comments for consistency. llvm-svn: 296362	2017-02-27 17:59:01 +00:00
Craig Topper	7502119ce8	[X86] Use APInt instead of SmallBitVector tracking undef elements from getTargetConstantBitsFromNode and getConstVector. Summary: SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here use more than those number of elements and therefore cause a malloc. APInt on the other hand supports up to 64 bits without a malloc. That's the maximum number of bits we need here so we can avoid a malloc for all cases by using APInt. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30392 llvm-svn: 296355	2017-02-27 16:15:32 +00:00
Craig Topper	3917ca2af4	[X86] Use APInt instead of SmallBitVector for tracking Zeroable elements in shuffle lowering Summary: SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here use more than those number of elements and therefore cause a malloc. APInt on the other hand supports up to 64 bits without a malloc. That's the maximum number of bits we need here so we can avoid a malloc for all cases by using APInt. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30390 llvm-svn: 296354	2017-02-27 16:15:30 +00:00
Craig Topper	e1be95c3d0	[X86] Fix SmallVector sizes in constant pool shuffle decoding to avoid heap allocation Some of the vectors are under sized to avoid heap allocation. In one case the vector was oversized. Differential Revision: https://reviews.llvm.org/D30387 llvm-svn: 296353	2017-02-27 16:15:27 +00:00
Craig Topper	53e5a38da9	[X86] Use APInt instead of SmallBitVector for tracking undef elements in constant pool shuffle decoding Summary: SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here use more than those number of elements and therefore cause a malloc. APInt on the other hand supports up to 64 bits without a malloc. That's the maximum number of bits we need here so we can avoid a malloc for all cases by using APInt. This will incur a minor increase in stack usage due to APInt storing the bit count separately from the data bits unlike SmallBitVector, but that should be ok. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30386 llvm-svn: 296352	2017-02-27 16:15:25 +00:00
Artur Pilipenko	0860bfc676	Loop predication expand both sides of the widened condition This is a fix for a loop predication bug which resulted in malformed IR generation. Loop invariant side of the widened condition is not guaranteed to be available in the preheader as is, so we need to expand it as well. See added unsigned_loop_0_to_n_hoist_length test for example. Reviewed By: sanjoy, mkazantsev Differential Revision: https://reviews.llvm.org/D30099 llvm-svn: 296345	2017-02-27 15:44:49 +00:00
Sjoerd Meijer	32ecac7ac8	AArch64InstPrinter: rewrite of printSysAlias This is a cleanup/rewrite of the printSysAlias function. This was not using the tablegen instruction descriptions, but was "manually" decoding the instructions. This has been replaced with calls to lookup_XYZ_ByEncoding tablegen calls. This revealed several problems. First, instruction IVAU had the wrong encoding. This was cancelled out by the parser that incorrectly matched the wrong encoding. Second, instruction CVAP was missing from the SystemOperands tablegen descriptions, so this has been added. And third, the required target features were not captured in the tablegen descriptions, so support for this has also been added. Differential Revision: https://reviews.llvm.org/D30329 llvm-svn: 296343	2017-02-27 14:45:34 +00:00
John Brawn	c97b714ffb	[ARM] LSL #0 is an alias of MOV Currently we handle this correctly in arm, but in thumb we don't which leads to an unpredictable instruction being emitted for LSL #0 in an IT block and SP not being permitted in some cases when it should be. For the thumb2 LSL we can handle this by making LSL #0 an alias of MOV in the .td file, but for thumb1 we need to handle it in checkTargetMatchPredicate to get the IT handling right. We also need to adjust the handling of MOV rd, rn, LSL #0 to avoid generating the 16-bit encoding in an IT block. We should also adjust it to allow SP in the same way that it is allowed in MOV rd, rn, but I haven't done that here because it looks like it would take quite a lot of work to get right. Additionally correct the selection of the 16-bit shift instructions in processInstruction, where it was checking if the two registers were equal when it should have been checking if they were low. It appears that previously this code was never executed and the 16-bit encoding was selected by default, but the other changes I've done here have somehow made it start being used. Differential Revision: https://reviews.llvm.org/D30294 llvm-svn: 296342	2017-02-27 14:40:51 +00:00
Artur Pilipenko	f7196c8d9e	[DAGCombine] Fix for a load combine bug with non-zero offset patterns on BE targets This pattern is essentially a i16 load from p+1 address: %p1.i16 = bitcast i8* %p to i16* %p2.i8 = getelementptr i8, i8* %p, i64 2 %v1 = load i16, i16* %p1.i16 %v2.i8 = load i8, i8* %p2.i8 %v2 = zext i8 %v2.i8 to i16 %v1.shl = shl i16 %v1, 8 %res = or i16 %v1.shl, %v2 Current implementation would identify %v1 load as the first byte load and would mistakenly emit a i16 load from %p1.i16 address. This patch adds a check that the first byte is loaded from a non-zero offset of the first load address. This way this address can be used as the base address for the combined value. Otherwise just give up combining. llvm-svn: 296336	2017-02-27 13:04:23 +00:00
Artur Pilipenko	c43b20a43b	[DAGCombine] NFC. MatchLoadCombine extract MemoryByteOffset lambda helper This refactoring will simplify the upcoming change to fix the bug in folding patterns with non-zero offsets on BE targets. llvm-svn: 296332	2017-02-27 11:42:54 +00:00
Artur Pilipenko	f2c26e0bf2	[DAGCombine] NFC. MatchLoadCombine remember the first byte provider, not the load node This refactoring will simplify the upcoming change to fix a bug in folding patterns with non-zero offsets on BE targets. llvm-svn: 296331	2017-02-27 11:40:14 +00:00
Sjoerd Meijer	6d171006f4	AArch64AsmParser: don't try to parse “[1]” for non-vector register operands There are no instructions that have "[1]" as part of the assembly string; FMOVXDhighr is out of date. This removes dead code. Differential Revision: https://reviews.llvm.org/D30165 llvm-svn: 296327	2017-02-27 10:51:11 +00:00
Konstantin Zhuravlyov	972948b36e	[AMDGPU] Runtime metadata fixes: - Verify that runtime metadata is actually valid runtime metadata when assembling, otherwise we could accept the following when assembling, but ocl runtime will reject it: .amdgpu_runtime_metadata { amd.MDVersion: [ 2, 1 ], amd.RandomUnknownKey, amd.IsaInfo: ... - Make IsaInfo optional, and always emit it. Differential Revision: https://reviews.llvm.org/D30349 llvm-svn: 296324	2017-02-27 07:55:17 +00:00
Craig Topper	ed0101a0b9	[X86] Check for less than 0 rather than explicit compare with -1. NFC llvm-svn: 296321	2017-02-27 06:05:30 +00:00
Xin Tong	3ca169f3a9	Update comments. NFCI llvm-svn: 296298	2017-02-26 19:08:44 +00:00
Daniel Jasper	3ca4525612	Revert "[CGP] Split some critical edges coming out of indirect branches" This reverts commit r296149 as it leads to crashes when compiling for PPC. llvm-svn: 296295	2017-02-26 11:09:12 +00:00
Davide Italiano	49a0aac1aa	[LoopDeletion] Modernize and simplify a bit. NFCI. llvm-svn: 296294	2017-02-26 07:08:20 +00:00
Craig Topper	6028584d8c	[X86] Fix execution domain for cmpss/sd instructions. llvm-svn: 296293	2017-02-26 06:45:59 +00:00
Craig Topper	036693302b	[AVX-512] Fix execution domain for scalar commutable min/max instructions. llvm-svn: 296292	2017-02-26 06:45:56 +00:00
Craig Topper	e70231be51	[AVX-512] Fix execution domain for vmovhpd/lpd/hps/lps. llvm-svn: 296291	2017-02-26 06:45:54 +00:00
Craig Topper	fe25988c68	[AVX-512] Fix the execution domain for AVX-512 integer broadcasts. llvm-svn: 296290	2017-02-26 06:45:51 +00:00
Craig Topper	49ba3f5406	[AVX-512] Disable the redundant patterns in the VPBROADCASTBr_Alt and VPBROADCASTWr_Alt instructions. NFC llvm-svn: 296289	2017-02-26 06:45:48 +00:00
Craig Topper	6bf9b809ce	[AVX-512] Fix execution domain for VPMADD52 instructions. llvm-svn: 296288	2017-02-26 06:45:45 +00:00
Craig Topper	aa8e903150	[AVX-512] Fix the execution domain for VSCALEF instructions. llvm-svn: 296286	2017-02-26 06:45:40 +00:00
Craig Topper	cac5d698df	[AVX-512] Fix execution domain of scalar VRANGE/REDUCE/GETMANT with sae. llvm-svn: 296285	2017-02-26 06:45:37 +00:00
Craig Topper	ed64904c74	[X86] Fix the execution domain for scalar SQRT intrinsic instruction. llvm-svn: 296284	2017-02-26 06:45:35 +00:00
Xin Tong	42ef2177af	[SCCP] Remove manual folding of terminator instructions. Summary: BranchInst, SwitchInst (with non-default case) with Undef as input is not possible at this point. As we always default-fold terminator to one target in ResolvedUndefsIn and set the input accordingly. So we should only have constantint/blockaddress here. If ConstantFoldTerminator fails, that could mean 2 things. 1. ConstantFoldTerminator is doing something unexpected, i.e. not folding on constantint or blockaddress and not making blocks that should be dead dead. 2. This is not a terminator on constantint or blockaddress. It's on a constant or overdefined, then this block should not be dead. In both cases, we should assert. Reviewers: davide, efriedma, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30381 llvm-svn: 296281	2017-02-26 02:11:24 +00:00
Nirav Dave	73cd0194cf	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r296252 until 256-bit operations are more efficiently generated in X86. llvm-svn: 296279	2017-02-26 01:27:32 +00:00
Eric Christopher	4a8208c266	vec perm can go down either pipeline on P8. No observable changes, spotted while looking at the scheduling description. llvm-svn: 296277	2017-02-26 00:11:58 +00:00
Sanjoy Das	39a684d117	[ValueTracking] Don't do an unchecked shift in ComputeNumSignBits Summary: Previously we used to return a bogus result, 0, for IR like `ashr %val, -1`. I've also added an assert checking that `ComputeNumSignBits` at least returns 1. That assert found an already checked in test case where we were returning a bad result for `ashr %val, -1`. Fixes PR32045. Reviewers: spatel, majnemer Reviewed By: spatel, majnemer Subscribers: efriedma, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D30311 llvm-svn: 296273	2017-02-25 20:30:45 +00:00
Simon Pilgrim	0f5fb5f549	[APInt] Add APInt::extractBits() method to extract APInt subrange (reapplied) The current pattern for extract bits in range is typically: Mask.lshr(BitOffset).trunc(SubSizeInBits); Which can be particularly slow for large APInts (MaskSizeInBits > 64) as they require the allocation of memory for the temporary variable. This is another of the compile time issues identified in PR32037 (see also D30265). This patch adds the APInt::extractBits() helper method which avoids the temporary memory allocation. Differential Revision: https://reviews.llvm.org/D30336 llvm-svn: 296272	2017-02-25 20:01:58 +00:00
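
The last entry above (r296272) describes the pattern `Mask.lshr(BitOffset).trunc(SubSizeInBits)` for pulling a bit subrange out of a value. The following is a minimal sketch of that shift-then-truncate idea on a single 64-bit word, not the code from the commit. The helper name `extractBits64` and its parameter order are hypothetical and chosen only for illustration, and because it works on one machine word it does not show the temporary-allocation cost that the commit addresses for APInts wider than 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not LLVM's APInt API: return `numBits` bits of
// `value` starting at bit `bitOffset` (bit 0 is the least significant).
// Requires numBits > 0 and bitOffset + numBits <= 64.
inline std::uint64_t extractBits64(std::uint64_t value, unsigned numBits,
                                   unsigned bitOffset) {
  assert(numBits > 0 && bitOffset + numBits <= 64 && "range out of bounds");
  // Move the wanted range down to bit 0 (the "lshr" step).
  std::uint64_t shifted = value >> bitOffset;
  // Keep only the low numBits bits (the "trunc" step). Shifting a 64-bit
  // value by 64 is undefined behaviour in C++, so the full-width case is
  // handled separately.
  if (numBits == 64)
    return shifted;
  return shifted & ((std::uint64_t{1} << numBits) - 1);
}
```

As a usage example, `extractBits64(0xABCD, 8, 4)` shifts `0xABCD` right by four bits to get `0xABC` and then keeps the low eight bits, giving `0xBC`.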

100132 Commits