llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	432eff22ab	[CostModel][X86] Add 512-bit bswap costs	2021-06-06 22:36:34 +01:00
Simon Pilgrim	ae973380c5	[CostModel][X86] Improve AVX512 FDIV costs Add missing v16f32/v8f64 costs and adjust other costs as well based off the SkylakeServer model	2021-06-06 21:41:05 +01:00
Craig Topper	8bde5f06a1	[RISCV] Replace && with \|\|. Spotted by coverity. We should be exiting when the shift amount is greater than the bit width regardless of whether it is a power of 2. Reported by Simon Pilgrim here https://reviews.llvm.org/D96661 This requires getting a shift amount that is out of bounds that wasn't already optimized by SelectionDAG. This would be pretty trick to construct a test for. Or it would require a non-power of 2 shift amount and a mask that has runs of ones and zeros of the next lowest power of 2 from that shift amount. I tried a little to produce a test for this, but didn't get it to work.	2021-06-06 13:09:51 -07:00
Simon Pilgrim	8ab8b3fad7	[X86][SSE] LowerFP_TO_INT - remove dead code. NFCI. Non-Strict v2f32->v2i64 cases have already early-returned to be handled by legalization.	2021-06-06 20:04:15 +01:00
Simon Pilgrim	4879c8f3b0	[X86][SSE] combineVectorTruncation - simplify PSHUFB-is-better logic. NFCI. OutSVT is guaranteed to be i8/i16 and we accept any InSVT that isn't i64	2021-06-06 20:04:14 +01:00
Simon Pilgrim	b69e16b5cc	X86MachObjectWriter.cpp - silence null deference warnings. NFCI. The MCSymbol data should always be present for non-absolute sections so assert that it is to silence static analysis warnings.	2021-06-06 15:33:47 +01:00
Nikita Popov	1ffa6499ea	[TargetLowering] Use IRBuilderBase instead of IRBuilder<> (NFC) Don't require a specific kind of IRBuilder for TargetLowering hooks. This allows us to drop the IRBuilder.h include from TargetLowering.h. Differential Revision: https://reviews.llvm.org/D103759	2021-06-06 16:29:50 +02:00
Simon Pilgrim	0f938a6ed8	X86Operand.h - fix uninitialized variable warnings in constructor. NFCI.	2021-06-06 15:25:03 +01:00
Nikita Popov	9914200393	[CodeGen] Add missing includes (NFC) These currently rely on the IRBuilder.h include in TargetLowering.h. Make them explicit.	2021-06-06 15:48:27 +02:00
Simon Pilgrim	ab2d295552	BPFISelDAGToDAG.cpp - don't dereference a dyn_cast<> result. NFCI. Use cast<> instead which will assert that the cast is correct and not just return null. Fixes static analysis warnings.	2021-06-06 13:24:29 +01:00
David Green	12f53e5392	[AArch64] Remove AArch64ISD::NEG This NEG node is just a vector negation, easily represented as a SUB zero. Removing it from the one place it is generated is essentially an NFC, but can allow some extra folding. The updated tests are now loading different constant literals, which have already been negated. Differential Revision: https://reviews.llvm.org/D103703	2021-06-05 19:54:42 +01:00
Simon Pilgrim	be51737f59	Fix "not all control paths return a value" MSVC warning. NFCI.	2021-06-05 19:42:00 +01:00
Fangrui Song	06e7de795b	Fix some -Wunused-but-set-variable in -DLLVM_ENABLE_ASSERTIONS=off build	2021-06-04 23:34:43 -07:00
Jim Lin	170b70b74b	[RISCV] Replace (XLenVT (VLOp GPR:$vl)) with VLOpFrag This is for D100288 to reduce the changes. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103682	2021-06-05 12:49:31 +08:00
Roman Lebedev	852497711d	[X86] AMD Zen 3: double the LoopMicroOpBufferSize While the IndVars issue (PR50384) has been resolved, and the compile performance improved, a new blocker emerged, the codegen machine instruction scheduling is also quadratic. So we still can't really specify the right value here. Filed PR50584.	2021-06-05 01:23:58 +03:00
Rong Xu	8d581857d7	[SampleFDO] New hierarchical discriminator for FS SampleFDO (llvm-profdata part) This patch was split from https://reviews.llvm.org/D102246 [SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO This is for llvm-profdata part of change. It sets the bit masks for the profile reader in llvm-profdata. Also add an internal option "-fs-discriminator-pass" for show and merge command to process the profile offline. This patch also moved setDiscriminatorMaskedBitFrom() to SampleProfileReader::create() to simplify the interface. Differential Revision: https://reviews.llvm.org/D103550	2021-06-04 11:22:06 -07:00
Jessica Paquette	507d193ea7	[AArch64][GlobalISel] Handle multiple phis in fixupPHIOpBanks If we ended up with two phi instructions in a block, and we needed to fix up the banks for the first one, we'd end up inserting our COPY before the second phi. E.g. ``` %x = G_PHI ... %fixup = COPY ... %y = G_PHI ... ``` This is invalid MIR, and breaks assumptions made by the register allocator later down the line. With the verifier enabled, it also emits a verification error. This teaches fixupPHIOpBanks to walk past any phi instructions in the block when emitting the fixup copies. Here's an example of the crashing code (same as added testcase): https://godbolt.org/z/h5j1x3o6e Differential Revision: https://reviews.llvm.org/D103582	2021-06-04 09:59:36 -07:00
Mark Schimmel	12592a439a	Add commutable attribute to opcodes for ARC This patch sets the isCommutable attribute for several opcodes that have the "reg = OPCODE reg, reg" format. Differential Revision: https://reviews.llvm.org/D103653	2021-06-04 19:49:19 +03:00
Craig Topper	c653711fd3	[RISCV] Teach vsetvli insertion pass that operations on masks don't care about SEW/LMUL. All that really matters is that the VLMAX of the preceding instructions is the same as the VLMAX required by the mask operation. Also update the vmsge(u) handling to use the SEW/LMUL we use for other mask register operations. We were matching it to the compare before. Some cases will be improve if we fix masked compares to use tail agnostic policy. I think they ignore the tail policy anyway. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D103299	2021-06-04 09:17:46 -07:00
Bradley Smith	a85f5874e2	[AArch64] Remove SETCC of CSEL when the latter's condition can be inverted setcc (csel 0, 1, cond, X), 1, ne ==> csel 0, 1, !cond, X Where X is a condition code setting instruction. Co-authored-by: Paul Walker <paul.walker@arm.com> Differential Revision: https://reviews.llvm.org/D103256	2021-06-04 15:53:21 +01:00
Nicholas Guy	3043cbc436	[AArch64] Further enable UnrollAndJam Due to the dependency on runtime unrolling, UnJ is only enabled by default on in-order scheduling models, and if a cpu is specified through -mcpu. Differential Revision: https://reviews.llvm.org/D103604	2021-06-04 14:18:49 +01:00
Mirko Brkusanin	35ef4c940b	[AMDGPU][GlobalISel] Legalize G_ABS Legalize and select G_ABS so that we can use llvm.abs intrinsic Differential Revision: https://reviews.llvm.org/D102391	2021-06-04 14:46:43 +02:00
Dmitry Preobrazhensky	cd093cbb11	[AMDGPU][MC][NFC] Fixed typos in parser Differential Revision: https://reviews.llvm.org/D103680	2021-06-04 15:40:42 +03:00
Bradley Smith	e42ee2d509	[AArch64][SVE] Add support for using reverse forms of SVE2 shifts When using and ACLE intrinsic for an SVE2 shift, if the predicate passed has all relevant lanes active, then use a reversed version of the instruction if beneficial.	2021-06-04 12:56:53 +01:00
Tim Northover	b16ddd0375	AArch64: support atomic zext/sextloads	2021-06-04 09:45:51 +01:00
madhur13490	6a3beb1f68	[AMDGPU] [IndirectCalls] Don't propagate attributes to address taken functions and their callees Don't propagate launch bound related attributes to address taken functions and their callees. The idea is to do a traversal over the call graph starting at address taken functions and erase the attributes set by previous logic i.e. process(). This two phase approach makes sure that we don't miss out on deep nested callees from address taken functions as a function might be called directly as well as indirectly. This patch is also reattempt to D94585 as latent issues are fixed in hasAddressTaken function in the recent past. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D103138	2021-06-04 11:36:56 +05:30
hsmahesha	753437fc1d	Revert "[AMDGPU] Increase alignment of LDS globals if necessary before LDS lowering." This reverts commit `d71ff907ef`.	2021-06-04 11:16:46 +05:30
hsmahesha	d71ff907ef	[AMDGPU] Increase alignment of LDS globals if necessary before LDS lowering. Before packing LDS globals into a sorted structure, make sure that their alignment is properly updated based on their size. This will make sure that the members of sorted structure are properly aligned, and hence it will further reduce the probability of unaligned LDS access. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D103261	2021-06-04 09:34:37 +05:30
Craig Topper	e9313fa33a	[RISCV] Simplify some code in RISCVInsertVSETVLI by calling an existing function that does the same thing. NFCI	2021-06-03 17:31:54 -07:00
Julien Pagès	37821155c9	[AMDGPU] Fix a crash when selecting a particular case of buffer_load_format_d16 In this particular example, we had a crash when compiling it for several architectures. This patch extends the legalization of extract_subvector to avoid this problem. Differential Revision: https://reviews.llvm.org/D103344	2021-06-03 16:40:18 -04:00
Nikita Popov	983565a6fe	[ADT] Move DenseMapInfo for ArrayRef/StringRef into respective headers (NFC) This is a followup to D103422. The DenseMapInfo implementations for ArrayRef and StringRef are moved into the ArrayRef.h and StringRef.h headers, which means that these two headers no longer need to be included by DenseMapInfo.h. This required adding a few additional includes, as many files were relying on various things pulled in by ArrayRef.h. Differential Revision: https://reviews.llvm.org/D103491	2021-06-03 18:34:36 +02:00
David Green	929c54379a	[ARM] Prettify gather/scatter debug comments. NFC	2021-06-03 12:33:03 +01:00
Fraser Cormack	8790e85255	[RISCV] Reserve an emergency spill slot for any RVV spills This patch addresses an issue in which fixed-length (VLS) vector RVV code could fail to reserve an emergency spill slot for their frame index elimination. This is because we were previously only reserving a spill slot when there were `scalable-vector` frame indices being used. However, fixed-length codegen uses regular-type frame indices if it needs to spill. This patch does the fairly brute-force method of checking ahead of time whether the function contains any RVV spill instructions, in which case it reserves one slot. Note that the second RVV slot is still only reserved for `scalable-vector` frame indices. This unfortunately causes quite a bit of churn in existing tests, where we chop and change stack offsets for spill slots. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103269	2021-06-03 10:44:34 +01:00
Fangrui Song	aba67ba784	[MC] Delete unneeded MCAsmParser &Parser	2021-06-02 16:10:18 -07:00
Fangrui Song	c980d93d91	[MC] Change "unexpected tokens" to "expected newline" and remove unneeded "in .xxx directive"	2021-06-02 16:08:05 -07:00
Anshil Gandhi	1c5ff0b03f	[PowerPC] [GlobalISel] Implementation of formal arguments lowering in the IRTranslator for the PPC backend Differential Revision: https://reviews.llvm.org/D99812	2021-06-02 16:46:39 -06:00
Anshil Gandhi	3e5ddb83e3	Revert "Differential Revision: https://reviews.llvm.org/D99812 " This reverts commit `c729f2a48a`.	2021-06-02 16:36:00 -06:00
Simon Pilgrim	9f5d783d46	[X86][SSE] combineScalarToVector - only reuse broadcasts for scalar_to_vector if the source operands scalar types match We were hitting an issue when the scalar_to_vector source was being implicitly truncated (in this case to i8 to vXi1) but we were also using the i8 source in a broadcast to a vXi8 value. Fixes PR50374	2021-06-02 22:05:40 +01:00
Min-Yih Hsu	344e919b1a	[CodeGen][NFC] Remove unused virtual function `TargetFrameLowering::emitCalleeSavedFrameMoves` with 4 arguments is not used anywhere in CodeGen. Thus it shouldn't be exposed as a virtual function. NFC. Differential Revision: https://reviews.llvm.org/D103328	2021-06-02 13:11:12 -07:00
Anshil Gandhi	c729f2a48a	Differential Revision: https://reviews.llvm.org/D99812	2021-06-02 14:09:52 -06:00
Rong Xu	6745ffe4fa	[SampleFDO] New hierarchical discriminator for FS SampleFDO (ProfileData part) This patch was split from https://reviews.llvm.org/D102246 [SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO This is mainly for ProfileData part of change. It will load FS Profile when such profile is detected. For an extbinary format profile, create_llvm_prof tool will add a flag to profile summary section. For other format profiles, the users need to use an internal option (-profile-isfs) to tell the compiler that the profile uses FS discriminators. This patch also simplified the bit API used by FS discriminators. Differential Revision: https://reviews.llvm.org/D103041	2021-06-02 10:32:52 -07:00
Daniil Fukalov	0195e594fe	[TTI] NFC: Change getIntImmCodeSizeCost to return InstructionCost. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D102915	2021-06-02 16:04:11 +03:00
Irina Dobrescu	e971099a9b	[AArch64] Optimise bitreverse lowering in ISel Differential Revision: https://reviews.llvm.org/D103105	2021-06-02 12:51:12 +01:00
Fraser Cormack	3b0a33d0ad	[RISCV] Expand unaligned fixed-length vector memory accesses RVV vectors must be aligned to their element types, so anything less is unaligned. For regular loads and stores, our custom-lowering of fixed-length vectors meant that we opted out of LegalizeDAG's built-in unaligned expansion. This patch adds that logic in to our custom lower function. For masked intrinsics, we declare that anything unaligned is not legal, leaving the ScalarizeMaskedMemIntrin pass to do the expansion for us. Note that neither of these methods can handle the expansion of scalable-vector memory ops, so those cases are left alone by this patch. Scalable loads and stores already go through expansion by default but hit an assertion, and scalable masked intrinsics will silently generate incorrect code. It may be prudent to return an error in both of these cases. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102493	2021-06-02 09:27:44 +01:00
Craig Topper	41ff1e0e29	[RISCV] Improve register allocation for masked vwadd(u).wv, vwsub(u).wv, vfwadd.wv, and vfwsub.wv. The first source has the same EEW as the destination, but we're using earlyclobber which prevents them from ever being the same register. To workaround this, add a special TIED pseudo to use whenever the first source and merge operand are the same value. This allows us to use a single operand for the merge operand and first source which we can then tie to the destination. A tied source disables earlyclobber for that operand. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D103211	2021-06-01 18:59:00 -07:00
Stanislav Mekhanoshin	9e2e49328f	[AMDGPU] All GWS instructions need aligned VGPR on gfx90a Fixes: SWDEV-288006 Differential Revision: https://reviews.llvm.org/D103197	2021-06-01 17:08:03 -07:00
Arthur Eubanks	8961293851	[OpaquePtr] Create API to make a copy of a PointerType with some address space Some existing places use getPointerElementType() to create a copy of a pointer type with some new address space. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D103429	2021-06-01 16:52:32 -07:00
Michael Benfield	00d19c6704	[various] Remove or use variables which are unused but set. This is in preparation for the -Wunused-but-set-variable warning. Differential Revision: https://reviews.llvm.org/D102942	2021-06-01 15:38:48 -07:00
Daniel Sanders	aaac268285	[globalisel][legalizer] Separate the deprecated LegalizerInfo from the current one It's still in use in a few places so we can't delete it yet but there's not many at this point. Differential Revision: https://reviews.llvm.org/D103352	2021-06-01 13:23:48 -07:00
Arthur Eubanks	2983053d23	[NFC][OpaquePtr] Explicitly pass GEP source type to IRBuilder in more places	2021-06-01 13:13:37 -07:00

1 2 3 4 5 ...

62984 Commits