llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	b1cfcd1a56	[ScalarizeMaskedMemIntrin] Bitcast the mask to the scalar domain and use scalar bit tests for the branches for expandload/compressstore. Same as what was done for gather/scatter/load/store in r367489. Expandload/compressstore were delayed due to lack of constant masking handling that has since been fixed. llvm-svn: 367738	2019-08-02 23:43:53 +00:00
Craig Topper	de9b1d7912	[ScalarizeMaskedMemIntrin] Add constant mask support to expandload and compressstore scalarization This adds support for generating all the loads or stores for a constant mask into a single basic block with no conditionals. Differential Revision: https://reviews.llvm.org/D65613 llvm-svn: 367715	2019-08-02 20:04:34 +00:00
Craig Topper	b70026c43c	[ScalarizeMaskedMemIntrin] Bitcast the mask to the scalar domain and use scalar bit tests for the branches. X86 at least is able to use movmsk or kmov to move the mask to the scalar domain. Then we can just use test instructions to test individual bits. This is more efficient than extracting each mask element individually. I special cased v1i1 to use the previous behavior. This avoids poor type legalization of bitcast of v1i1 to i1. I've skipped expandload/compressstore as I think we need to handle constant masks for those better first. Many tests end up with duplicate test instructions due to tail duplication in the branch folding pass. But the same thing happens when constructing similar code in C. So it's not unique to the scalarization. Not sure if this lowering code will also be good for other targets, but we're only testing X86 today. Differential Revision: https://reviews.llvm.org/D65319 llvm-svn: 367489	2019-07-31 22:58:15 +00:00
Craig Topper	5f79d74946	[X86] Add test cases for masked store and masked scatter with an all zeroes mask. Fix bug in ScalarizeMaskedMemIntrin Need to cast only to Constant instead of ConstantVector to allow ConstantAggregateZero. llvm-svn: 362341	2019-06-02 22:52:34 +00:00
Craig Topper	9f0b17a248	[ScalarizeMaskedMemIntrin] Add support for scalarizing expandload and compressstore intrinsics. This adds support for scalarizing these intrinsics as well as the X86TargetTransformInfo support to avoid scalarizing them in the cases X86 can handle. I've omitted handling special cases for constant masks for this first pass. Though CodeGenPrepare can constant fold the branch conditions and remove some of the control flow anyway. Fixes PR40994 and covers most of PR3666. Might want to implement constant masks to close that. Differential Revision: https://reviews.llvm.org/D59180 llvm-svn: 356687	2019-03-21 17:38:52 +00:00
Craig Topper	8de7bc0bff	[ScalarizeMaskedMemIntrinsics] Reverse some if conditions to reduce indentation and remove curly braces. Pre-commit for D59180 llvm-svn: 356646	2019-03-21 05:54:37 +00:00
Craig Topper	69f8c1653d	[ScalarizeMaskedMemIntrin] Use IRBuilder functions that take uint32_t/uint64_t for getelementptr, extractelement, and insertelement. This saves needing to call getInt32 ourselves, making the code a little shorter. The test changes are because insert/extract use getInt64 internally. Shouldn't be a functional issue. This cleanup is because I plan to write similar code for expandload/compressstore. llvm-svn: 355767	2019-03-09 02:08:41 +00:00
Craig Topper	d84f605910	[ScalarizeMaskedMemIntrin] Only set the ModifiedDT flag if new basic blocks were added. There are special cases in the scalarization for constant masks. If we hit one of the special cases we don't need to reset the iteration. Noticed while starting work on adding expandload/compressstore to this pass. llvm-svn: 355754	2019-03-08 23:03:43 +00:00
James Y Knight	14359ef1b6	[opaque pointer types] Pass value type to LoadInst creation. This cleans up all LoadInst creation in LLVM to explicitly pass the value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57172 llvm-svn: 352911	2019-02-01 20:44:24 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Craig Topper	4104c00658	[ScalarizeMaskedMemIntrin] Limit the scope of some variables that are only used inside loops. llvm-svn: 345638	2018-10-30 20:33:58 +00:00
Craig Topper	bb50c38635	[ScalarizeMaskedMemIntrin] Use MinAlign to calculate alignment for the scalar load/stores to handle element types that are byte-sized but not powers of 2. This pass doesn't handle non-byte sized types correctly at all, but at least we can make byte sized types work. llvm-svn: 343294	2018-09-28 03:35:37 +00:00
Craig Topper	fdf4c76ca0	[ScalarizeMaskedMemIntrin] Fix the alignment calculation for the scalar stores of a masked store expansion. It should be the minimum of the original alignment and the scalar size. llvm-svn: 343284	2018-09-28 01:06:13 +00:00
Craig Topper	8b4f0e1b8c	[ScalarizeMaskedMemIntrin] Ensure the mask is a vector of ConstantInts before generating the expansion without control flow. It's possible the mask itself or one of the elements is a ConstantExpr and we shouldn't optimize in that case. llvm-svn: 343278	2018-09-27 22:31:42 +00:00
Craig Topper	10ec021621	[ScalarizeMaskedMemIntrin] Use cast instead of dyn_cast checked by an assert. Consistently make use of the element type variable we already have. NFCI cast will take care of asserting internally. llvm-svn: 343277	2018-09-27 22:31:40 +00:00
Craig Topper	6911bfe263	[ScalarizeMaskedMemIntrin] When expanding masked gathers, start with the passthru vector and insert the new load results into it. Previously we started with undef and did a final merge with the passthru at the end. llvm-svn: 343273	2018-09-27 21:28:59 +00:00
Craig Topper	7d234d6628	[ScalarizeMaskedMemIntrin] When expanding masked loads, start with the passthru value and insert each conditional load result over their element. Previously we started with undef and did one final merge at the end with a select. llvm-svn: 343271	2018-09-27 21:28:52 +00:00
Craig Topper	dfc0f289fa	[ScalarizeMaskedMemIntrin] Handle the case where the mask is an all zero vector. This shouldn't really happen in practice I hope, but we tried to handle other constant cases. We missed this one because we checked for ConstantVector without realizing that zero becomes ConstantAggregateZero instead. So instead just check for Constant and use getAggregateElement which will do the dirty work for us. llvm-svn: 343270	2018-09-27 21:28:46 +00:00
Craig Topper	dfe460db57	[ScalarizeMaskedMemIntrin] Remove some temporary variables that are only used by a single if condition. llvm-svn: 343268	2018-09-27 21:28:41 +00:00
Craig Topper	49dad8b8af	[ScalarizeMaskedMemIntrin] Cleanup comments. NFC llvm-svn: 343267	2018-09-27 21:28:39 +00:00
Craig Topper	0423681d4a	[ScalarizeMaskedMemIntrin] Don't emit 'icmp eq i1 %x, 1' to check mask values. That's just %x so use that directly. Had we emitted this IR earlier, InstCombine would have removed icmp so I'm going to assume using the i1 directly would be considered canonical. llvm-svn: 343244	2018-09-27 18:01:48 +00:00
Andrei Elovikov	822602a75e	[CodeGen] Do not allow opt-bisect-limit to skip ScalarizeMaskedMemIntrin. Summary: The pass is supposed to scalarize such intrinsics if the target does not support them natively, so if the scalarization does not happen instruction selection crashes due to inability to lower these intrinsics. Reviewers: andrew.w.kaylor, craig.topper Reviewed By: andrew.w.kaylor Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45947 llvm-svn: 330700	2018-04-24 09:24:29 +00:00
David Blaikie	b3bde2ea50	Fix a bunch more layering of CodeGen headers that are in Target All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around). llvm-svn: 318490	2017-11-17 01:07:10 +00:00
Eugene Zelenko	fa57bd0ced	[CodeGen] Fix some Clang-tidy modernize-use-default-member-init and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 314363	2017-09-27 23:26:01 +00:00
Reid Kleckner	0e8c4bb055	Sink some IntrinsicInst.h and Intrinsics.h out of llvm/include Many of these uses can get by with forward declarations. Hopefully this speeds up compilation after adding a single intrinsic. llvm-svn: 312759	2017-09-07 23:27:44 +00:00
Matthias Braun	1527baab0c	CodeGen: Rename DEBUG_TYPE to match passnames Rename the DEBUG_TYPE to match the names of corresponding passes where it makes sense. Also establish the pattern of simply referencing DEBUG_TYPE instead of repeating the passname where possible. llvm-svn: 303921	2017-05-25 21:26:32 +00:00
Ayman Musa	c5490e5a29	[X86] Relocate code of replacement of subtarget unsupported masked memory intrinsics to run also on -O0 option. Currently, when masked load, store, gather or scatter intrinsics are used, we check in CodeGenPrepare pass if the subtarget supports these intrinsics; if not, we replace them with scalar code - this is a functional transformation not an optimization (not optional). CodeGenPrepare pass does not run when the optimization level is set to CodeGenOpt::None (-O0). Functional transformation should run with all optimization levels, so here I created a new pass which runs on all optimization levels and does no more than this transformation. Differential Revision: https://reviews.llvm.org/D32487 llvm-svn: 303050	2017-05-15 11:30:54 +00:00

27 Commits
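
Several of the commits above (r367489, r343271, r343294) describe the shape of the scalarized expansion: move the mask to the scalar domain, test one bit per lane, start from the passthru value, and give each scalar access the minimum of the original alignment and the element size. The sketch below models that behavior in plain C++ rather than with the LLVM API. The names packMask, scalarizedMaskedLoad, and minAlign are hypothetical helpers for illustration, not functions from the pass, and the sketch assumes lane 0 maps to the least significant mask bit.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: pack per-lane booleans into one integer, the plain-C++
// analogue of bitcasting an <N x i1> mask to an N-bit scalar.
// Assumption: lane 0 becomes bit 0.
template <std::size_t N>
std::uint64_t packMask(const std::array<bool, N> &mask) {
  static_assert(N <= 64, "sketch only handles masks that fit in 64 bits");
  std::uint64_t bits = 0;
  for (std::size_t i = 0; i < N; ++i)
    if (mask[i])
      bits |= std::uint64_t{1} << i;
  return bits;
}

// Hypothetical model of a scalarized masked load: begin with the passthru
// vector and overwrite only the lanes whose mask bit is set. Each lane is
// guarded by a scalar bit test, so disabled lanes never touch memory.
template <typename T, std::size_t N>
std::array<T, N> scalarizedMaskedLoad(const T *ptr, std::uint64_t maskBits,
                                      std::array<T, N> passthru) {
  static_assert(N <= 64, "sketch only handles masks that fit in 64 bits");
  std::array<T, N> result = passthru;
  for (std::size_t i = 0; i < N; ++i)
    if (maskBits & (std::uint64_t{1} << i)) // scalar test of one mask bit
      result[i] = ptr[i];                   // conditional scalar load
  return result;
}

// Largest power of two that divides both values. This stays correct for
// byte-sized element types whose size is not a power of two.
inline std::uint64_t minAlign(std::uint64_t a, std::uint64_t b) {
  assert((a | b) != 0 && "at least one input must be non-zero");
  const std::uint64_t combined = a | b;
  return combined & (1 + ~combined); // isolate the lowest set bit
}
```

For example, loading from {10, 20, 30, 40} with lanes 0 and 2 enabled and a passthru of {-1, -2, -3, -4} yields {10, -2, 30, -4}, and a 3-byte element accessed under an original alignment of 4 gets a scalar alignment of 1.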