llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	4eab18f6b8	[X86][SSE] Detect unary PBLEND shuffles. These can appear during shuffle combining. llvm-svn: 293628	2017-01-31 13:58:01 +00:00
Simon Pilgrim	c29eab52e8	[X86][SSE] Add support for combining PINSRW into a target shuffle. Also add the ability to recognise PINSR(Vex, 0, Idx). Target shuffle combines won't replace multiple insertions with a bit mask until a depth of 3 or more, so we avoid codesize bloat. The unnecessary vpblendw in clearupper8xi16a will be fixed in an upcoming patch. llvm-svn: 293627	2017-01-31 13:51:10 +00:00
Nemanja Ivanovic	2f2a6ab991	[PowerPC][Altivec] Add vmr extended mnemonic Just adds the vmr (Vector Move Register) mnemonic for the VOR instruction in the PPC back end. Committing on behalf of brunoalr (Bruno Rosa). Differential Revision: https://reviews.llvm.org/D29133 llvm-svn: 293626	2017-01-31 13:43:11 +00:00
Florian Hahn	5364cf3b56	[LoopUnroll] Use addClonedBlockToLoopInfo to clone the top level loop (NFC) Summary: rL293124 added the necessary infrastructure to properly add the cloned top level loop to LoopInfo, which means we do not have to do it manually in CloneLoopBlocks. @mkuper sorry for not pointing this out during my review of D29156, I just realized that today. Reviewers: mzolotukhin, chandlerc, mkuper Reviewed By: mkuper Subscribers: llvm-commits, mkuper Differential Revision: https://reviews.llvm.org/D29173 llvm-svn: 293615	2017-01-31 11:13:44 +00:00
Simon Dardis	12850eeac5	[mips] Addition of the immediate cases for the instructions [d]div, [d]divu Related to http://reviews.llvm.org/D15772 Depends on http://reviews.llvm.org/D16888 Adds support for immediate operand for [D]DIV[U] instructions. Patch By: Srdjan Obucina Reviewers: zoran.jovanovic, vkalintiris, dsanders, obucina Differential Revision: https://reviews.llvm.org/D16889 llvm-svn: 293614	2017-01-31 10:49:24 +00:00
Craig Topper	2cfa2071bd	[AVX-512] Don't bother looking into the AVX512DQ execution domain fixing tables if AVX512DQ isn't supported since we can't do any conversion anyway. llvm-svn: 293608	2017-01-31 06:49:55 +00:00
Craig Topper	797e32dd98	[X86] Add AVX and SSE2 version of MOVSDmr to execution domain fixing table. AVX-512 already did this for the EVEX version. llvm-svn: 293607	2017-01-31 06:49:53 +00:00
Craig Topper	779e4c5bb4	[AVX-512] Fix copy and paste bug in execution domain fixing tables so that we can convert 256-bit movnt instructions. llvm-svn: 293606	2017-01-31 06:49:50 +00:00
Justin Lebar	1c9692a46f	[NVPTX] Implement NVPTXTargetLowering::getSqrtEstimate. Summary: This lets us lower to sqrt.approx and rsqrt.approx under more circumstances. * Now we emit sqrt.approx and rsqrt.approx for calls to @llvm.sqrt.f32, when fast-math is enabled. Previously, we only would emit it for calls to @llvm.nvvm.sqrt.f. (With this patch we no longer emit sqrt.approx for calls to @llvm.nvvm.sqrt.f; we rely on instcombine to simplify llvm.nvvm.sqrt.f into llvm.sqrt.f32.) * Now we emit the ftz version of rsqrt.approx when ftz is enabled. Previously, we only emitted rsqrt.approx when ftz was disabled. Reviewers: hfinkel Subscribers: llvm-commits, tra, jholewinski Differential Revision: https://reviews.llvm.org/D28508 llvm-svn: 293605	2017-01-31 05:58:22 +00:00
Craig Topper	06e038c6de	[X86] Update the broadcast fallback patterns to use shuffle instructions from the appropriate execution domain. llvm-svn: 293603	2017-01-31 05:18:29 +00:00
Craig Topper	88b0a47312	[X86] Add test cases for AVX1 broadcast fallback patterns when load can't be folded. Also add test cases that do an insertelement to all elements for the 8 element vector tests. llvm-svn: 293602	2017-01-31 05:18:27 +00:00
Craig Topper	e9e84c8284	[AVX-512] Fix the ExeDomain for VMOVDDUP, VMOVSLDUP, and VMOVSHDUP. llvm-svn: 293601	2017-01-31 05:18:24 +00:00
Matt Arsenault	f84e5d9a27	AMDGPU: Generalize matching of v_med3_f32 I think this is safe as long as no inputs are known to ever be nans. Also add an intrinsic for fmed3 to be able to handle all safe math cases. llvm-svn: 293598	2017-01-31 03:07:46 +00:00
Matt Arsenault	973c4aebad	InferAddressSpaces: Rename constant llvm-svn: 293594	2017-01-31 02:17:41 +00:00
Matt Arsenault	72f259b8eb	InferAddressSpaces: Handle icmp llvm-svn: 293593	2017-01-31 02:17:32 +00:00
Craig Topper	d064cc93b2	[X86] Remove patterns for X86VPermilpi with integer types. I don't think we've formed these since the shuffle lowering rewrite. llvm-svn: 293592	2017-01-31 02:09:53 +00:00
Craig Topper	85935f69fb	[X86] Remove duplicate patterns for X86VPermilpv that already exist in the instructions themselves. llvm-svn: 293591	2017-01-31 02:09:51 +00:00
Craig Topper	ced68315ce	[X86] Remove patterns for selecting PSHUFD with FP types. We don't seem to do this anymore and the AVX case definitely should be using VPERMILPS anyway. llvm-svn: 293590	2017-01-31 02:09:49 +00:00
Craig Topper	b76494e017	[X86] Remove 'else' after 'return'. NFC llvm-svn: 293589	2017-01-31 02:09:46 +00:00
Craig Topper	f9d901f0ea	[X86] Use integer broadcast instructions for integer broadcast patterns. I'm not sure why we were using an FP instruction before and had to have a comment calling attention to it, but not justifying it. llvm-svn: 293588	2017-01-31 02:09:43 +00:00
Matt Arsenault	6d5a8d48fd	InferAddressSpaces: Support memory intrinsics llvm-svn: 293587	2017-01-31 01:56:57 +00:00
Matt Arsenault	6c907a9bb3	InferAddressSpaces: Support atomics llvm-svn: 293584	2017-01-31 01:40:38 +00:00
Matt Arsenault	d89a6e11a7	InferAddressSpaces: Don't replace volatile users llvm-svn: 293582	2017-01-31 01:30:16 +00:00
Matt Arsenault	b6491cc854	AMDGPU: Implement hook for InferAddressSpaces For now just port some of the existing NVPTX tests and from an old HSAIL optimization pass which approximately did the same thing. Don't enable the pass yet until more testing is done. llvm-svn: 293580	2017-01-31 01:20:54 +00:00
Matt Arsenault	850657a439	NVPTX: Move InferAddressSpaces to generic code llvm-svn: 293579	2017-01-31 01:10:58 +00:00
Eugene Zelenko	342257ea92	[ARM] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 293578	2017-01-31 00:56:17 +00:00
Saleem Abdulrasool	6f5f001fdc	TableGen: use fully qualified name for StringLiteral Use the qualified name for StringLiteral (llvm::StringLiteral) when generating the sources. This is needed as the generated files may be used out-of-tree (e.g. swift) where you may not have a `using namespace llvm;` resulting in an undefined lookup. llvm-svn: 293577	2017-01-31 00:45:01 +00:00
Eli Friedman	10d1ff64fe	[SCEV] Simplify/generalize howFarToZero solving. Make SolveLinEquationWithOverflow take the start as a SCEV, so we can solve more cases. With that implemented, get rid of the special case for powers of two. The additional functionality probably isn't particularly useful, but it might help a little for certain cases involving pointer arithmetic. Differential Revision: https://reviews.llvm.org/D28884 llvm-svn: 293576	2017-01-31 00:42:42 +00:00
Reid Kleckner	71012aa945	Remove LLVM_CONFIG from config headers It appears to be dead, and it needlessly caused me to rebuild all of LLVM when I changed CMAKE_INSTALL_PREFIX. llvm-svn: 293574	2017-01-31 00:34:23 +00:00
Vedant Kumar	359785ddad	Fix llvm-readobj build error after r293569 Clang complains about an ambiguous call to printNumber() because it can't work out what size_t should convert to. I picked uint64_t. llvm-svn: 293573	2017-01-30 23:58:51 +00:00
Keno Fischer	578cf7aae7	[ExecutionDepsFix] Improve clearance calculation for loops Summary: In revision rL278321, ExecutionDepsFix learned how to pick a better register for undef register reads, e.g. for instructions such as `vcvtsi2sdq`. While this revision improved performance on a good number of our benchmarks, it unfortunately also caused significant regressions (up to 3x) on others. This regression turned out to be caused by loops such as: PH -> A -> B (xmm<Undef> -> xmm<Def>) -> C -> D -> EXIT ^ | +----------------------------------+ In the previous version of the clearance calculation, we would visit the blocks in order, remembering for each whether there were any incoming backedges from blocks that we hadn't processed yet and if so queuing up the block to be re-processed. However, for loop structures such as the above, this is clearly insufficient, since the block B does not have any unknown backedges, so we do not see the false dependency from the previous iteration's Def of xmm registers in B. To fix this, we need to consider all blocks that are part of the loop and reprocess them once the correct clearance values are known. As an optimization, we also want to avoid reprocessing any later blocks that are not part of the loop. In summary, the iteration order is as follows: Before: PH A B C D A' Corrected (Naive): PH A B C D A' B' C' D' Corrected (w/ optimization): PH A B C A' B' C' D To facilitate this optimization we introduce two new counters for each basic block. The first counts how many of its predecessors have completed primary processing. The second counts how many of its predecessors have completed all processing (we will call such a block done). Now, the criterion to reprocess a block is as follows: - All predecessors have completed primary processing - For x the number of predecessors that have completed primary processing at the time of primary processing of this block, the number of predecessors that are done has reached x.
The intuition behind this criterion is as follows: We need to perform primary processing on all predecessors in order to find out any direct defs in those predecessors. When predecessors are done, we also know that we have information about indirect defs (e.g. in block B though that were inherited through B->C->A->B). However, we can't wait for all predecessors to be done, since that would cause cyclic dependencies. However, it is guaranteed that all those predecessors that are prior to us in reverse postorder will be done before us. Since we iterate over the basic blocks in reverse postorder, the number x above is precisely the count of the number of predecessors prior to us in reverse postorder. Reviewers: myatsina Differential Revision: https://reviews.llvm.org/D28759 llvm-svn: 293571	2017-01-30 23:37:03 +00:00
Sanjay Patel	8c5f236197	[InstCombine] enable (X <<nsw C1) >>s C2 --> X <<nsw (C1 - C2) for vectors with splat constants llvm-svn: 293570	2017-01-30 23:35:52 +00:00
Derek Schuff	6d76b7b455	[WebAssembly] Add wasm support for llvm-readobj Create a WasmDumper subclass of ObjDumper to support Webassembly binary files. Patch by Sam Clegg Differential Revision: https://reviews.llvm.org/D27355 llvm-svn: 293569	2017-01-30 23:30:52 +00:00
Matt Arsenault	9f432ec24c	NVPTX: Trivial cleanups of NVPTXInferAddressSpaces - Move DEBUG_TYPE below includes - Change unknown address space constant to be consistent with other passes - Grammar fixes in debug output llvm-svn: 293567	2017-01-30 23:27:11 +00:00
Sanjay Patel	abbb118a78	[InstCombine] add vector test for (X <<nsw C1) >>s C2 --> X <<nsw (C1 - C2); NFC llvm-svn: 293566	2017-01-30 23:26:17 +00:00
Eugene Zelenko	dde94e4c4f	[Mips] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 293565	2017-01-30 23:21:32 +00:00
Benjamin Kramer	365c9bd941	[ICP] Fix bool conversion warning and actually write out the reason instead of dropping it. llvm-svn: 293564	2017-01-30 23:11:29 +00:00
Matt Arsenault	42b6478344	NVPTX: Refactor NVPTXInferAddressSpaces to check TTI Add a new TTI hook for getting the generic address space value. llvm-svn: 293563	2017-01-30 23:02:12 +00:00
Sanjay Patel	0c39d56a60	[InstCombine] enable more lshr(shl X, C1), C2 folds for vectors with splat constants llvm-svn: 293562	2017-01-30 23:01:05 +00:00
Simon Pilgrim	3905e03a47	[X86][SSE] Fix unsigned <= 0 warning in assert. NFCI. Thanks to @mkuper llvm-svn: 293561	2017-01-30 22:58:44 +00:00
Simon Pilgrim	a80a47afef	[X86][SSE] Generalize the number of decoded shuffle inputs. NFCI. combineX86ShufflesRecursively can still only handle a maximum of 2 shuffle inputs but everything before it now supports any number of shuffle inputs. This will be necessary for combining OR(SHUFFLE, SHUFFLE) patterns. llvm-svn: 293560	2017-01-30 22:48:49 +00:00
Dehao Chen	6775f5d629	Expose isLegalToPromote as a global helper function so that SamplePGO pass can call it for legality check. Summary: SamplePGO needs to check if it is legal to promote a target before it actually promotes it. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29306 llvm-svn: 293559	2017-01-30 22:46:37 +00:00
Dehao Chen	6217fa44b8	Revert r292979 which causes compile time failure. llvm-svn: 293557	2017-01-30 22:26:05 +00:00
Sanjay Patel	98cc841421	[InstCombine] add tests for more shift-shift patterns; NFC llvm-svn: 293555	2017-01-30 22:24:36 +00:00
Eli Friedman	2345733246	Fix line endings. llvm-svn: 293554	2017-01-30 22:04:23 +00:00
Tom Stellard	887a2562b7	AMDGPU: Fix release build broken by r293551 llvm-svn: 293553	2017-01-30 22:02:58 +00:00
Artem Tamazov	61eb79d7a7	Reapply [AMDGPU][mc][tests][NFC] Add coverage/smoke tests for Gfx7 and Gfx8. llvm-svn: 293552	2017-01-30 21:59:21 +00:00
Tom Stellard	ca16621b2a	Re-commit AMDGPU/GlobalISel: Add support for simple shaders Fix build when global-isel is disabled and fix a warning. Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP. Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D26730 llvm-svn: 293551	2017-01-30 21:56:46 +00:00
Tim Northover	2bf8c9d381	GlobalISel: correctly translate invoke when callee is a register. This should fix the GlobalISel verifier. llvm-svn: 293550	2017-01-30 21:45:21 +00:00
Stanislav Mekhanoshin	a3b72798af	[AMDGPU] Internalize non-kernel symbols Since we have no call support and late linking we can produce code only for used symbols. This saves compilation time, size of the final executable, and size of any intermediate dumps. Run Internalize pass early in the opt pipeline followed by global DCE pass. To enable it RT can pass -amdgpu-internalize-symbols option. Differential Revision: https://reviews.llvm.org/D29214 llvm-svn: 293549	2017-01-30 21:05:18 +00:00


144065 Commits