llvm-project

Commit Graph

Author	SHA1	Message	Date
Daniil Suchkov	0e36288318	[LoopPredication] Report changes correctly when attempting loop exit predication To make the IR easier to analyze, this pass makes some minor transformations. After that, even if it doesn't decide to optimize anything, it can't report that it changed nothing and preserved all the analyses. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D109855	2021-09-16 22:49:55 +00:00
Jon Roelofs	4b19e7dfae	[LoopIdiomRecognize][Remarks] Track loop-strided store to/from blocks Differential revision: https://reviews.llvm.org/D109929	2021-09-16 15:46:26 -07:00
Jacob Lambert	4c1023b4b7	[AMDGPU] NFC: Fixing small spelling errors in AMDGPU header files Nonfunctional commit fixing several minor spelling errors in llvm/lib/Target/AMDGPU header files. Testing workflow as a new contributor. Differential Revision: https://reviews.llvm.org/D109733	2021-09-16 13:03:09 -07:00
Teresa Johnson	88cb3e2cb6	[MemProf] Don't instrument stack accesses unless requested Skip stack accesses unless requested, as the memory profiler runtime does not currently look at or report accesses for these addresses. Differential Revision: https://reviews.llvm.org/D109868	2021-09-16 12:21:51 -07:00
Nikita Popov	0fc624f029	[IR] Return AAMDNodes from Instruction::getMetadata() (NFC) getMetadata() currently uses a weird API where it populates a structure passed to it, and optionally merges into it. Instead, we can return the AAMDNodes and provide a separate merge() API. This makes usages more compact. Differential Revision: https://reviews.llvm.org/D109852	2021-09-16 21:06:57 +02:00
Craig Topper	73e5b9ea90	[RISCV] Select (srl (sext_inreg X, i32), uimm5) to SRAIW if only lower 32 bits are used. SimplifyDemandedBits can turn srl into sra if the bits being shifted in aren't demanded. This patch can recover the original sra in some cases. I've renamed the tablegen class for detecting W users since the "overflowing operator" term I originally borrowed from Operator.h does not include srl. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D109162	2021-09-16 11:03:35 -07:00
Vang Thao	106959acc1	[AMDGPU] Inline non-kernel functions using extern lds In https://reviews.llvm.org/D100481, forceful inline of all non-kernel functions using lds was disabled since AMDGPULowerModuleLDS pass now handles static lds. However that pass does not handle extern lds so non-kernel functions using extern lds must still be inlined. Reviewed By: hsmhsm, arsenm Differential Revision: https://reviews.llvm.org/D109773	2021-09-16 10:58:51 -07:00
Arthur Eubanks	d49cb5b303	[SimplifyCFG] Add bonus when seeing vector ops to branch fold to common dest This makes some tests in vector-reductions-logical.ll more stable when applying D108837. The cost of branching is higher when vector ops are involved due to potential SLP transformations. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D108935	2021-09-16 10:50:36 -07:00
Dávid Bolvanský	a4a426c9e0	[InstCombine] Added llvm.powi optimizations If power is even: powi(-x, p) -> powi(x, p) powi(fabs(x), p) -> powi(x, p) powi(copysign(x, y), p) -> powi(x, p)	2021-09-16 19:42:21 +02:00
Kazu Hirata	cfc7402419	[llvm] Use drop_begin (NFC)	2021-09-16 08:46:26 -07:00
Michael Liao	ffa5c3a555	Fix warning on `llvm-else-after-return`. NFC.	2021-09-16 11:25:43 -04:00
Doug Gregor	a773db7d76	Add a command-line flag to control the Swift extended async frame info. Introduce a new command-line flag `-swift-async-fp={auto|always|never}` that controls how code generation sets the Swift extended async frame info bit. There are three possibilities: * `auto`: which determines how to set the bit based on deployment target, either statically or dynamically via `swift_async_extendedFramePointerFlags`. * `always`: the default, always set the bit statically, regardless of deployment target. * `never`: never set the bit, regardless of deployment target. Patch by Doug Gregor <dgregor@apple.com> Reviewed By: doug.gregor Differential Revision: https://reviews.llvm.org/D109392	2021-09-16 06:57:45 -07:00
Bjorn Pettersson	d9fc3d879e	[NewPM] Replace 'kasan-module' by 'asan-module<kernel>' Change the asan-module pass into a MODULE_PASS_WITH_PARAMS in the pass registry, and add a single parameter called 'kernel' that can be set instead of having a special pass name 'kasan-module' to trigger that special pass config. Main reason is to make sure that we have a unique mapping from ClassName to PassName in the new passmanager framework, making it possible to correctly identify the passes when dealing with options such as -print-after and -print-pipeline-passes. This is a follow-up to D105006 and D105007.	2021-09-16 14:58:42 +02:00
Bjorn Pettersson	8f8616655c	[NewPM] Use a separate struct for ModuleThreadSanitizerPass Split ThreadSanitizerPass into ThreadSanitizerPass (as a function pass) and ModuleThreadSanitizerPass (as a module pass). Main reason is to make sure that we have a unique mapping from ClassName to PassName in the new passmanager framework, making it possible to correctly identify the passes when dealing with options such as -print-after and -print-pipeline-passes. This is a follow-up to D105006 and D105007.	2021-09-16 14:58:42 +02:00
Bjorn Pettersson	ab41eef9ac	[NewPM] Use a separate struct for ModuleMemorySanitizerPass Split MemorySanitizerPass into MemorySanitizerPass (as a function pass) and ModuleMemorySanitizerPass (as a module pass). Main reason is to make sure that we have a unique mapping from ClassName to PassName in the new passmanager framework, making it possible to correctly identify the passes when dealing with options such as -print-after and -print-pipeline-passes. This is a follow-up to D105006 and D105007.	2021-09-16 14:58:42 +02:00
Alexandros Lamprineas	1bd5ea968e	[ARM] Mitigate the cve-2021-35465 security vulnerability. A vulnerability was recently found in the implementation of the VLLDM instruction in the Arm Cortex-M33, Cortex-M35P and Cortex-M55. If the VLLDM instruction is abandoned due to an exception when it is partially completed, it is possible for a subsequent non-secure handler to access and modify the partially restored register values. This vulnerability is identified as CVE-2021-35465. The mitigation sequence varies between v8-m and v8.1-m as follows: v8-m.main --------- mrs r5, control tst r5, #8 /* CONTROL_S.SFPA */ it ne .inst.w 0xeeb00a40 /* vmovne s0, s0 */ 1: vlldm sp /* Lazy restore of d0-d16 and FPSCR. */ v8.1-m.main ----------- vscclrm {vpr} /* Clear VPR. */ vlldm sp /* Lazy restore of d0-d16 and FPSCR. */ More details on developer.arm.com/support/arm-security-updates/vlldm-instruction-security-vulnerability Differential Revision: https://reviews.llvm.org/D109157	2021-09-16 12:56:43 +01:00
Alexandros Lamprineas	61f25daa8d	[ARM][CMSE] Clear the secure fp-registers when using softfp abi. When expanding the non-secure call instruction we are emitting code to clear the secure floating-point registers only if the targeted architecture has floating-point support. The potential problem is when the source code containing non-secure calls is built with -mfloat-abi=soft but some other part of the system has been built with -mfloat-abi=softfp (soft and softfp are compatible as they use the same procedure calling standard). In this case floating-point registers could leak to non-secure state as the non-secure won't have cleared them assuming no floating point has been used. Differential Revision: https://reviews.llvm.org/D109153	2021-09-16 12:56:43 +01:00
Cullen Rhodes	17f1ccc759	[AArch64][SVE] NFC: Remove unnecessary if	2021-09-16 11:26:46 +00:00
Simon Pilgrim	1ef62cb200	[X86] SimplifyDemandedVectorEltsForTargetNode - add PSADBW handling Peek through PSADBW operands to handle non demanded elements.	2021-09-16 11:28:31 +01:00
Konstantin Schwarz	d2e66d7fa4	[GlobalISel] Add a combine for and(load , mask) -> zextload This only handles simple masks, not shifted masks, for now. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D109357	2021-09-16 10:42:46 +02:00
Anton Afanasyev	6a5f49a1ac	[AggressiveInstCombine] Add `{insert/extract}element` to `TruncInstCombine` DAG Alive2 for `{insert/extract}element`: https://alive2.llvm.org/ce/z/hwy_E- No file in the test suite is touched by this change, which means that this is a rare pattern not generated by the frontend. But it's worth being in place. Differential Revision: https://reviews.llvm.org/D109236	2021-09-16 11:24:31 +03:00
Jay Foad	128a49727a	[AMDGPU] Fix upcoming TableGen warnings on unused template arguments. NFC. The warning is implemented by D109359 which is still in review. Differential Revision: https://reviews.llvm.org/D109826	2021-09-16 09:07:18 +01:00
Sam Parker	c98a8a09b5	[HardwareLoops] Loop guard intrinsic to recognise zext If a loop count was initially represented by a 32b unsigned int in C then the hardware-loop pass can recognise the loop guard and insert the llvm.test.set.loop.iterations intrinsic. If this was instead an unsigned short/char then clang inserts a zext instruction to expand the loop count to an i32. This patch adds the necessary pattern matching to enable the use of llvm.test.set.loop.iterations in those cases. Patch by: sherwin-dc Differential Revision: https://reviews.llvm.org/D109631	2021-09-16 08:33:16 +01:00
Alok Kumar Sharma	a5b72abc9e	[DebugInfo] Enhance DIImportedEntity to accept children entities New field `elements` is added to '!DIImportedEntity', representing list of aliased entities. This is needed to dump optimized debugging information where all names in a module are imported, but a few names are imported with overriding aliases. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D109343	2021-09-16 10:41:55 +05:30
Kazu Hirata	24c8eaec94	[Transforms] Use make_early_inc_range (NFC)	2021-09-15 19:55:24 -07:00
Jessica Paquette	c8b3d7d6d6	[AArch64][GlobalISel] Ensure atomic loads always get assigned GPR destinations The default register bank selection code for G_LOAD assumes that we ought to use a FPR when the load is cast to a float/double. For atomics, this isn't true; we should always use GPRs. Without this patch, we crash in the following example: https://godbolt.org/z/MThjas441 Also make the code a little more stylistically consistent while we're here. Also test some other weird cast combinations as well. Differential Revision: https://reviews.llvm.org/D109771	2021-09-15 17:05:09 -07:00
Ahmed Bougacha	e159d3cbfc	[AArch64][GlobalISel] Use MI::getIntrinsicID in more spots. NFC. There's technically a difference in the logic used by these findIntrinsicID and MachineInstr::getIntrinsicID, but it shouldn't be a meaningful difference here, with G_INTRINSIC instructions. getIntrinsicID's "first non-def" logic should be correct for those.	2021-09-15 16:45:34 -07:00
Ahmed Bougacha	94a2f9cdb6	[GlobalISel] Fix CombinerHelper::isPredecessor for same def/use MI. The doc comment for isPredecessor says: Returns true if \p DefMI precedes \p UseMI or they are the same instruction. And dominates relies on that behavior for its own: Returns true if \p DefMI dominates \p UseMI. By definition an instruction dominates itself. Make both statements correct by fixing isPredecessor. Found by inspection.	2021-09-15 16:45:27 -07:00
Arthur Eubanks	c3ddc13d7d	[NFC] Split up PassBuilder.cpp PassBuilder.cpp is the slowest file to compile in LLVM. When trying to test changes to pipelines, it takes a long time to recompile. This doesn't actually speedup building PassBuilder.cpp itself since most of the time is spent in other large/duplicated functions caused by PassRegistry.def. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D109798	2021-09-15 15:30:39 -07:00
Owen Anderson	68079ef0eb	Teach SimplifyCFG to fold switches into lookup tables in more cases. In particular, it couldn't handle cases where lookup table constant expressions involved bitcasts. This does not seem to come up frequently in C++, but comes up reasonably often in Rust via `#[derive(Debug)]`. Originally reported by pcwalton. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D109565	2021-09-15 22:07:08 +00:00
Anna Thomas	f9e4aebe4a	Revert "[InstCombine] Improve TryToSinkInstruction with multiple uses" This reverts commit `4ac4e52189`. There are a couple of test failures that require updating the test cases. Doing a clean revert and will recommit the change along with fixed test cases.	2021-09-15 18:03:11 -04:00
David Blaikie	065bb08bb8	NFC: DWARFTypePrinter: Remove "type" from member function names to reduce redundancy	2021-09-15 14:46:28 -07:00
Anna Thomas	b6cb03e6b9	Revert use of getUniqueUndroppableUser in AssumeBundleBuilder Fix build bot failure in rG4ac4e521 caused by AssumeBundleBuilder using the new API (getUniqueUndroppableUser). We now continue using the existing API for AssumeBundleBuilder (getSingleUndroppableUser). Sorry for the noise here. Tests-Run: failing testcase passes.	2021-09-15 17:45:09 -04:00
Matt Arsenault	87c00878d3	SplitKit: Remove decade old live interval hack This was trying to fixup broken live intervals coming out of the coalescer. The verifier is more complete now and no tests seem to fail without this.	2021-09-15 17:35:59 -04:00
Anna Thomas	3273430406	Re-add getSingleUndroppableUse API The API was removed in `4ac4e52189` in favor of getUniqueUndroppableUser. However, this caused a buildbot failure in AbstractCallSiteTest.cpp, which uses the API and the AbstractCallSite class requires a "use" rather than a user. Retain the API so that the unittest compiles and passes.	2021-09-15 17:06:20 -04:00
Anna Thomas	4ac4e52189	[InstCombine] Improve TryToSinkInstruction with multiple uses This patch allows sinking an instruction which can have multiple uses in a single user. We were previously over-restrictive by looking for exactly one use, rather than one user. Also, the API for retrieving undroppable user has been updated accordingly since in both usecases (Attributor and InstCombine), we seem to care about the user, rather than the use. Reviewed-By: nikic Differential Revision: https://reviews.llvm.org/D109700	2021-09-15 20:39:38 +00:00
Kazu Hirata	385f380e80	[MemorySSA] Fix "set but not used" warnings	2021-09-15 11:41:41 -07:00
Sanjay Patel	e5a32d720e	[InstCombine] move extend after insertelement if both operands are extended I was wondering how instcombine does on the examples in D109236, and we're missing a basic transform: inselt (ext X), (ext Y), Index --> ext (inselt X, Y, Index) https://alive2.llvm.org/ce/z/z2aBu9 Note that there are several possible extensions of this fold (see TODO comments). Differential Revision: https://reviews.llvm.org/D109537	2021-09-15 14:38:03 -04:00
Philip Reames	9bdb19cca2	[SCEV] (udiv X, Y) * Y is always NUW Motivated by the removal done in D109782. This implements the correct flag part generically. Differential Revision: https://reviews.llvm.org/D109786	2021-09-15 11:34:50 -07:00
Alina Sbirlea	b759381b75	[MemorySSA] Add verification levels to MemorySSA. [NFC] Add two levels of verification for MemorySSA: Fast and Full. The defaults are kept the same. Full verification always occurs under EXPENSIVE_CHECKS, but now it can also be requested in a specific pass for debugging purposes.	2021-09-15 11:09:54 -07:00
Filipp Zhinkin	f5d8952356	[InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y) Enabled mul folding optimization that was previously disabled by being incorrect. To preserve correctness, mul's operand that is not compared with zero in select's condition is now frozen. Related bug: https://bugs.llvm.org/show_bug.cgi?id=51286 Correctness: https://alive2.llvm.org/ce/z/bHef7J https://alive2.llvm.org/ce/z/QcR7sf https://alive2.llvm.org/ce/z/vvBLzt https://alive2.llvm.org/ce/z/jGDXgq https://alive2.llvm.org/ce/z/3Pe8Z4 https://alive2.llvm.org/ce/z/LGga8M https://alive2.llvm.org/ce/z/CTG5fs Differential Revision: https://reviews.llvm.org/D108408	2021-09-15 09:04:06 -04:00
Simon Pilgrim	0767e43d87	[CostModel][X86] Adjust bitreverse/ctpop/ctlz/cttz AVX2+ costs based on llvm-mca reports Based off the worst-case numbers generated by D103695, the AVX2/512 bit reversing/counting costs were higher than necessary (based off instruction counts instead of actual throughput).	2021-09-15 13:04:40 +01:00
Martin Storsjö	b33a43e57c	[ARM] Move fetching of ARMSubtarget into the scopes that need it. NFC. This was requested in D38253, but missed back then. Differential Revision: https://reviews.llvm.org/D109046	2021-09-15 15:03:20 +03:00
David Green	a2332d5332	[ARM] Prevent continuous folding of SUBC Under some situations under Thumb1, we could be stuck in an infinite loop recombining the same instruction. This puts a limit on that, not combining SUBC with SUBE repeatedly.	2021-09-15 11:23:32 +01:00
David Green	61cc873a8e	[LV] Recognize intrinsic min/max reductions This extends the reduction logic in the vectorizer to handle intrinsic versions of min and max, both the floating point variants already created by instcombine under fastmath and the integer variants from D98152. As a bonus this allows us to match a chain of min or max operations into a single reduction, similar to how add/mul/etc work. Differential Revision: https://reviews.llvm.org/D109645	2021-09-15 10:45:50 +01:00
Simon Pilgrim	dcba994184	[X86] combineX86ShuffleChain - ensure we only peek through bitcasts to vectors (PR51858) When searching for hidden identity shuffles (added at rG41146bfe82aecc79961c3de898cda02998172e4b), only peek through bitcasts to the source operand if it is a vector type as well.	2021-09-15 10:21:05 +01:00
Simon Atanasyan	533471ff2f	[MIPS] Remove unused tblgen template args. NFC Identified in D109359.	2021-09-15 12:16:07 +03:00
Cullen Rhodes	18655140d6	[NVPTX] NFC: Remove unused imm type intrinsic arg Identified in D109359. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D109755	2021-09-15 08:56:51 +00:00
Florian Hahn	e90d55e1c9	[VPlan] Support sinking recipes with uniform users outside sink target. This is a first step towards addressing the last remaining limitation of the VPlan version of sinkScalarOperands: the legacy version can partially sink operands. For example, if a GEP has uniform users outside the sink target block, then the legacy version will sink all scalar GEPs, other than the one for lane 0. This patch works towards addressing this case in the VPlan version by detecting such cases and duplicating the sink candidate. All users outside of the sink target will be updated to use the uniform clone. Note that this highlights an issue with VPValue naming. If we duplicate a replicate recipe, they will share the same underlying IR value and both VPValues will have the same name ir<%gep>. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D104254	2021-09-15 09:21:39 +01:00
Xiang1 Zhang	1f1c71aeac	[X86][InlineAsm] Use mem size information (*word ptr) for "global variable + registers" memory expression in inline asm. Differential Revision: https://reviews.llvm.org/D109739	2021-09-15 16:11:14 +08:00
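
The [SCEV] entry above (9bdb19cca2, "(udiv X, Y) * Y is always NUW") rests on a small arithmetic fact: unsigned division rounds down, so (X udiv Y) * Y can never exceed X and therefore cannot wrap. The sketch below checks that claim by brute force for 8-bit values. It is only an illustration of the arithmetic, not code from the patch; the function names and the 8-bit width are assumptions made for this example.

```python
def udiv_mul_wraps(x: int, y: int, bits: int = 8) -> bool:
    """Return True if (x udiv y) * y would wrap in `bits`-bit unsigned arithmetic.

    Hypothetical helper for illustration; y must be non-zero.
    """
    limit = 1 << bits
    # Python integers are unbounded, so the product is computed exactly
    # and then compared against the largest representable value + 1.
    product = (x // y) * y
    return product >= limit


def never_wraps(bits: int = 8) -> bool:
    """Exhaustively check every (x, y) pair with y != 0 at the given width."""
    limit = 1 << bits
    for x in range(limit):
        for y in range(1, limit):
            if udiv_mul_wraps(x, y, bits):
                return False
    return True


if __name__ == "__main__":
    print(never_wraps(8))
```

The exhaustive loop is only practical for small widths; the general argument is that floor(X / Y) * Y <= X for any non-zero Y.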


150758 Commits