llvm-project

Commit Graph

Author	SHA1	Message	Date
Jingu Kang	a2a0ac42ab	[SimpleLoopBoundSplit] Split Bound of Loop which has conditional branch with IV This pass transforms loops that contain a conditional branch with induction variable. For example, it transforms left code to right code: newbound = min(n, c) while (iv < n) { while(iv < newbound) { A A if (iv < c) B B C C } } if (iv != n) { while (iv < n) { A C } } Differential Revision: https://reviews.llvm.org/D102234	2021-06-07 10:55:25 +01:00
Florian Hahn	23c2f2e6b2	[LV] Mark increment of main vector loop induction variable as NUW. This patch marks the induction increment of the main induction variable of the vector loop as NUW when not folding the tail. If the tail is not folded, we know that End - Start >= Step (either statically or through the minimum iteration checks). We also know that both Start % Step == 0 and End % Step == 0. We exit the vector loop if %IV + %Step == %End. Hence we must exit the loop before %IV + %Step unsigned overflows and we can mark the induction increment as NUW. This should make SCEV return more precise bounds for the created vector loops, used by later optimizations, like late unrolling. At the moment quite a few tests still need to be updated, but before doing so I'd like to get initial feedback to make sure I am not missing anything. Note that this could probably be further improved by using information from the original IV. Attempt of modeling of the assumption in Alive2: https://alive2.llvm.org/ce/z/H_DL_g Part of a set of fixes required for PR50412. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D103255	2021-06-07 10:47:52 +01:00
Esme-Yi	bcb20aa770	Fixed the build failure of yaml2obj in XCOFFEmitter.cpp: error: ambiguous overload for 'operator==' (operand types are 'llvm::yaml::Hex16' and 'llvm::XCOFF::MagicNumber') Is64Bit = Obj.Header.Magic == XCOFF::XCOFF64;	2021-06-07 05:45:05 +00:00
Esme-Yi	50bb1b930d	[yaml2obj] Initial the support of yaml2obj for 32-bit XCOFF. Summary: The patch implements the mapping of the Yaml information to XCOFF object file to enable the yaml2obj tool for XCOFF. Currently only 32-bit is supported. Reviewed By: jhenderson, shchenz Differential Revision: https://reviews.llvm.org/D95505	2021-06-07 04:14:44 +00:00
Simon Pilgrim	432eff22ab	[CostModel][X86] Add 512-bit bswap costs	2021-06-06 22:36:34 +01:00
Simon Pilgrim	ae973380c5	[CostModel][X86] Improve AVX512 FDIV costs Add missing v16f32/v8f64 costs and adjust other costs as well based off the SkylakeServer model	2021-06-06 21:41:05 +01:00
Craig Topper	8bde5f06a1	[RISCV] Replace && with \|\|. Spotted by coverity. We should be exiting when the shift amount is greater than the bit width regardless of whether it is a power of 2. Reported by Simon Pilgrim here https://reviews.llvm.org/D96661 This requires getting a shift amount that is out of bounds that wasn't already optimized by SelectionDAG. This would be pretty trick to construct a test for. Or it would require a non-power of 2 shift amount and a mask that has runs of ones and zeros of the next lowest power of 2 from that shift amount. I tried a little to produce a test for this, but didn't get it to work.	2021-06-06 13:09:51 -07:00
Simon Pilgrim	8ab8b3fad7	[X86][SSE] LowerFP_TO_INT - remove dead code. NFCI. Non-Strict v2f32->v2i64 cases have already early-returned to be handled by legalization.	2021-06-06 20:04:15 +01:00
Simon Pilgrim	4879c8f3b0	[X86][SSE] combineVectorTruncation - simplify PSHUFB-is-better logic. NFCI. OutSVT is guaranteed to be i8/i16 and we accept any InSVT that isn't i64	2021-06-06 20:04:14 +01:00
maekawatoshiki	0a9d079931	Revert "[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass" This reverts commit `2165360003`. To fix the crash problem in legacy pass manager	2021-06-07 01:26:47 +09:00
Simon Pilgrim	b69e16b5cc	X86MachObjectWriter.cpp - silence null deference warnings. NFCI. The MCSymbol data should always be present for non-absolute sections so assert that it is to silence static analysis warnings.	2021-06-06 15:33:47 +01:00
Nikita Popov	1ffa6499ea	[TargetLowering] Use IRBuilderBase instead of IRBuilder<> (NFC) Don't require a specific kind of IRBuilder for TargetLowering hooks. This allows us to drop the IRBuilder.h include from TargetLowering.h. Differential Revision: https://reviews.llvm.org/D103759	2021-06-06 16:29:50 +02:00
Simon Pilgrim	0f938a6ed8	X86Operand.h - fix uninitialized variable warnings in constructor. NFCI.	2021-06-06 15:25:03 +01:00
Simon Pilgrim	76a1be05fa	AssumeBundleQueries.cpp - don't dereference a dyn_cast<> result. NFCI. Use cast<> instead which will assert that the cast is correct and not just return null - the match() should have already failed if the cast isn't valid anyhow. Fixes static analysis warning.	2021-06-06 15:25:03 +01:00
Nikita Popov	506875c879	[TargetLowering] Move methods out of line (NFC) Move methods using IRBuilder out of line, so we can drop the dependency on the header.	2021-06-06 16:02:10 +02:00
Nikita Popov	9914200393	[CodeGen] Add missing includes (NFC) These currently rely on the IRBuilder.h include in TargetLowering.h. Make them explicit.	2021-06-06 15:48:27 +02:00
Simon Pilgrim	9ced408fe9	SimplifyCFG.cpp - remove dead early-return code added at rGcc63203908da. NFCI. We've already checked that ScanIdx == 0 a few lines above.	2021-06-06 14:15:11 +01:00
Simon Pilgrim	937c4cffd0	Fix implicit fall through compiler warning. NFCI.	2021-06-06 13:45:11 +01:00
Simon Pilgrim	ab2d295552	BPFISelDAGToDAG.cpp - don't dereference a dyn_cast<> result. NFCI. Use cast<> instead which will assert that the cast is correct and not just return null. Fixes static analysis warnings.	2021-06-06 13:24:29 +01:00
Liqiang Tao	48252d7570	Revert "[llvm] Add interface to order inlining"	2021-06-06 14:45:03 +08:00
Liqiang Tao	478dc47292	[llvm] Add interface to order inlining This patch abstract Calls in Inliner:run() to InlineOrder. With this patch, it's possible to customize the inlining order, i.e. use queue or priority queue. Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D103315	2021-06-06 12:03:02 +08:00
Simon Pilgrim	e8423dbf35	BranchProbability.cpp - add missing implicit cmath header dependency. NFCI. Noticed while investigating if we can remove an unnecessary MathExtras.h include from SmallVector.h (necessary for gcc builds but not MSVC)	2021-06-05 21:14:43 +01:00
David Green	12f53e5392	[AArch64] Remove AArch64ISD::NEG This NEG node is just a vector negation, easily represented as a SUB zero. Removing it from the one place it is generated is essentially an NFC, but can allow some extra folding. The updated tests are now loading different constant literals, which have already been negated. Differential Revision: https://reviews.llvm.org/D103703	2021-06-05 19:54:42 +01:00
Simon Pilgrim	be51737f59	Fix "not all control paths return a value" MSVC warning. NFCI.	2021-06-05 19:42:00 +01:00
Simon Pilgrim	24b9bc8498	MsgPackReader.cpp - add missing implicit MathExtras.h header dependency. NFCI. Noticed while investigating if we can remove an unnecessary MathExtras.h include from SmallVector.h	2021-06-05 18:05:40 +01:00
Simon Pilgrim	e32d73ef5e	NativeFormatting.cpp - add missing implicit MathExtras.h header dependency. NFCI. Noticed while investigating if we can remove an unnecessary MathExtras.h include from SmallVector.h	2021-06-05 18:05:39 +01:00
Roman Lebedev	e350494fb0	[NFC] Promote willNotOverflow() / getStrengthenedNoWrapFlagsFromBinOp() from IndVars into SCEV proper We might want to use it when creating SCEV proper in createSCEV(), now that we don't `forgetValue()` in `SimplifyIndvar::strengthenOverflowingOperation()`, which might have caused us to loose some optimization potential.	2021-06-05 12:17:51 +03:00
Nikita Popov	db45746821	[LoopUnroll] Separate peeling from unrolling Loop peeling is currently performed as part of UnrollLoop(). Outside test scenarios, it is always performed with an unroll count of 1. This means that unrolling doesn't actually do anything apart from performing post-unroll simplification. When testing, it's currently possible to specify both an explicit peel count and an explicit unroll count. This doesn't perform any sensible operation and may result in miscompiles, see https://bugs.llvm.org/show_bug.cgi?id=45939. This patch moves peeling from UnrollLoop() into tryToUnrollLoop(), so that peeling does not also perform a susequent unroll. We only run the post-unroll simplifications. Specifying both an explicit peel count and unroll count is forbidden. In the future, we may want to support both (non-PGO) peeling a loop and unrolling it, but this needs to be done by first performing the peel and then recalculating unrolling heuristics on a now possibly analyzable loop. Differential Revision: https://reviews.llvm.org/D103362	2021-06-05 10:32:00 +02:00
Vitaly Buka	e3258b0894	Revert "Update and improve compiler-rt tests for -mllvm -asan_use_after_return=(never\|[runtime]\|always)." Windows is still broken. This reverts commit `927688a4cd`.	2021-06-05 00:39:50 -07:00
Kevin Athey	927688a4cd	Update and improve compiler-rt tests for -mllvm -asan_use_after_return=(never\|[runtime]\|always). In addition: - optionally add global flag to capture compile intent for UAR: __asan_detect_use_after_return_always. The global is a SANITIZER_WEAK_ATTRIBUTE. for issue: https://github.com/google/sanitizers/issues/1394 Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D103304	2021-06-05 00:26:10 -07:00
Fangrui Song	06e7de795b	Fix some -Wunused-but-set-variable in -DLLVM_ENABLE_ASSERTIONS=off build	2021-06-04 23:34:43 -07:00
Jim Lin	170b70b74b	[RISCV] Replace (XLenVT (VLOp GPR:$vl)) with VLOpFrag This is for D100288 to reduce the changes. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103682	2021-06-05 12:49:31 +08:00
Vitaly Buka	d8a4a2cb93	Revert "Update and improve compiler-rt tests for -mllvm -asan_use_after_return=(never\|[runtime]\|always)." Reverts commits of D103304, it breaks Darwin. This reverts commit `60e5243e59`. This reverts commit `26b3ea224e`. This reverts commit `17600ec32a`.	2021-06-04 20:20:11 -07:00
Kevin Athey	60e5243e59	Update and improve compiler-rt tests for -mllvm -asan_use_after_return=(never\|[runtime]\|always). In addition: - optionally add global flag to capture compile intent for UAR: __asan_detect_use_after_return_always. The global is a SANITIZER_WEAK_ATTRIBUTE. for issue: https://github.com/google/sanitizers/issues/1394 Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D103304	2021-06-04 16:30:47 -07:00
Roman Lebedev	852497711d	[X86] AMD Zen 3: double the LoopMicroOpBufferSize While the IndVars issue (PR50384) has been resolved, and the compile performance improved, a new blocker emerged, the codegen machine instruction scheduling is also quadratic. So we still can't really specify the right value here. Filed PR50584.	2021-06-05 01:23:58 +03:00
Fangrui Song	9e51d1f348	[InstrProfiling] If no value profiling, make data variable private and (for Windows) use one comdat `__profd_` variables are referenced by code only when value profiling is enabled. If disabled (e.g. default -fprofile-instr-generate), the symbols just waste space on ELF/Mach-O. We change the comdat symbol from `__profd_` to `__profc_` because an internal symbol does not provide deduplication features on COFF. The choice doesn't matter on ELF. (In -DLLVM_BUILD_INSTRUMENTED_COVERAGE=on build, there is now no `__profd_` symbols.) On Windows this enables further optimization. We are no longer affected by the link.exe limitation: an external symbol in IMAGE_COMDAT_SELECT_ASSOCIATIVE can cause duplicate definition error. https://lists.llvm.org/pipermail/llvm-dev/2021-May/150758.html We can thus use llvm.compiler.used instead of llvm.used like ELF (D97585). This avoids many `/INCLUDE:` directives in `.drectve`. Here is rnk's measurement for Chrome: ``` This reduced object file size of base_unittests.exe, compiled with coverage, optimizations, and gmlt debug info by 10%: #BEFORE $ find . -iname '.obj' \| xargs du -b \| awk '{ sum += $1 } END { print sum}' 1047758867 $ du -cksh base_unittests.exe 82M base_unittests.exe 82M total # AFTER $ find . -iname '.obj' \| xargs du -b \| awk '{ sum += $1 } END { print sum}' 937886499 $ du -cksh base_unittests.exe 78M base_unittests.exe 78M total ``` The change is NFC for Mach-O. Reviewed By: davidxl, rnk Differential Revision: https://reviews.llvm.org/D103372	2021-06-04 13:27:56 -07:00
Nikita Popov	14f350daf2	[IndVars] Don't forget value when inferring nowrap flags When SimplifyIndVars infers IR nowrap flags from SCEV, this may happen in two ways: Either nowrap flags were already present in SCEV and just get transferred to IR. Or zero/sign extension of addrecs infers additional nowrap flags, and those get transferred to IR. In the latter case, calling forgetValue() ensures that the newly inferred nowrap flags get propagated to any other SCEV expressions based on the addrec. However, the invalidation can also have a major compile-time effect in some cases. For https://bugs.llvm.org/show_bug.cgi?id=50384 with n=512 compile- time drops from 7.1s to 0.8s without this invalidation. At the same time, removing the invalidation doesn't affect any codegen in test-suite. Differential Revision: https://reviews.llvm.org/D103424	2021-06-04 20:57:22 +02:00
Rong Xu	8d581857d7	[SampleFDO] New hierarchical discriminator for FS SampleFDO (llvm-profdata part) This patch was split from https://reviews.llvm.org/D102246 [SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO This is for llvm-profdata part of change. It sets the bit masks for the profile reader in llvm-profdata. Also add an internal option "-fs-discriminator-pass" for show and merge command to process the profile offline. This patch also moved setDiscriminatorMaskedBitFrom() to SampleProfileReader::create() to simplify the interface. Differential Revision: https://reviews.llvm.org/D103550	2021-06-04 11:22:06 -07:00
Adam Nemet	ffde966cd9	[Matrix] Fix transpose-multiply folding if transpose has multiple uses Don't add it to FusedInsts in this case. Differential Revision: https://reviews.llvm.org/D103627	2021-06-04 10:55:03 -07:00
Jessica Paquette	507d193ea7	[AArch64][GlobalISel] Handle multiple phis in fixupPHIOpBanks If we ended up with two phi instructions in a block, and we needed to fix up the banks for the first one, we'd end up inserting our COPY before the second phi. E.g. ``` %x = G_PHI ... %fixup = COPY ... %y = G_PHI ... ``` This is invalid MIR, and breaks assumptions made by the register allocator later down the line. With the verifier enabled, it also emits a verification error. This teaches fixupPHIOpBanks to walk past any phi instructions in the block when emitting the fixup copies. Here's an example of the crashing code (same as added testcase): https://godbolt.org/z/h5j1x3o6e Differential Revision: https://reviews.llvm.org/D103582	2021-06-04 09:59:36 -07:00
Mark Schimmel	12592a439a	Add commutable attribute to opcodes for ARC This patch sets the isCommutable attribute for several opcodes that have the "reg = OPCODE reg, reg" format. Differential Revision: https://reviews.llvm.org/D103653	2021-06-04 19:49:19 +03:00
Joseph Huber	4a08163c73	[Attributor] Check HeapToStack's state for isKnownHeapToStack This patch changes the `isKnownHeapToStack` and `isAssumedHeapToStack` member functions to return if a function call is going to be altered by HeapToStack. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D103574	2021-06-04 12:38:33 -04:00
Craig Topper	c653711fd3	[RISCV] Teach vsetvli insertion pass that operations on masks don't care about SEW/LMUL. All that really matters is that the VLMAX of the preceding instructions is the same as the VLMAX required by the mask operation. Also update the vmsge(u) handling to use the SEW/LMUL we use for other mask register operations. We were matching it to the compare before. Some cases will be improve if we fix masked compares to use tail agnostic policy. I think they ignore the tail policy anyway. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D103299	2021-06-04 09:17:46 -07:00
Alexey Bataev	c84a5448b5	[OPENMP]Fix PR50129: omp cancel parallel not working as expected. Need to emit a call for __kmpc_cancel_barrier in the exit block for __kmpc_cancel function call if cancellation of the parallel block is requested. Differential Revision: https://reviews.llvm.org/D103646	2021-06-04 08:24:55 -07:00
Bradley Smith	a85f5874e2	[AArch64] Remove SETCC of CSEL when the latter's condition can be inverted setcc (csel 0, 1, cond, X), 1, ne ==> csel 0, 1, !cond, X Where X is a condition code setting instruction. Co-authored-by: Paul Walker <paul.walker@arm.com> Differential Revision: https://reviews.llvm.org/D103256	2021-06-04 15:53:21 +01:00
Nico Weber	e9a9c85098	Revert "[InstrProfiling] If no value profiling, make data variable private and (for Windows) use one comdat" This reverts commit `a14fc749aa`. Breaks check-profile on macOS. See https://reviews.llvm.org/D103372 for details.	2021-06-04 10:00:12 -04:00
Nicholas Guy	3043cbc436	[AArch64] Further enable UnrollAndJam Due to the dependency on runtime unrolling, UnJ is only enabled by default on in-order scheduling models, and if a cpu is specified through -mcpu. Differential Revision: https://reviews.llvm.org/D103604	2021-06-04 14:18:49 +01:00
Mirko Brkusanin	35ef4c940b	[AMDGPU][GlobalISel] Legalize G_ABS Legalize and select G_ABS so that we can use llvm.abs intrinsic Differential Revision: https://reviews.llvm.org/D102391	2021-06-04 14:46:43 +02:00
Dmitry Preobrazhensky	cd093cbb11	[AMDGPU][MC][NFC] Fixed typos in parser Differential Revision: https://reviews.llvm.org/D103680	2021-06-04 15:40:42 +03:00
Bradley Smith	e42ee2d509	[AArch64][SVE] Add support for using reverse forms of SVE2 shifts When using and ACLE intrinsic for an SVE2 shift, if the predicate passed has all relevant lanes active, then use a reversed version of the instruction if beneficial.	2021-06-04 12:56:53 +01:00

1 2 3 4 5 ...

147671 Commits