llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	1e04923d21	[MachineValueType] Don't allow MVT::getVectorNumElements() to be called for scalable vectors. Migrate the one caller that failed lit tests to use MVT::getVectorMinNumElements directly.	2022-01-13 09:16:25 -08:00
Simon Pilgrim	55029f017d	[X86] canonicalizeShuffleWithBinOps - add X86ISD::PSHUFHW/PSHUFLW handling	2022-01-13 17:08:59 +00:00
Matt Arsenault	59994c25f9	AMDGPU: Select workitem ID intrinsics to 0 with req_work_group_size Shockingly we weren't doing this already. We should probably have this be done earlier in the IR too, but it's still helpful to have the lowering guarantee it so that we can modify the ABI implicit inputs based on it.	2022-01-13 12:08:18 -05:00
Matt Arsenault	a6f49423c1	AMDGPU: Optimize outgoing workitem ID based on reqd_work_group_size If we know we we aren't using a component from the kernel, we can save a few bit packing instructions. We're still enabling the VGPR input to the kernel though.	2022-01-13 12:08:18 -05:00
David Sherwood	ba471ba8d2	Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants" This reverts commit `31009f0b5a`. It seems to be causing SVE VLA buildbot failures and has introduced a genuine regression. Reverting for now.	2022-01-13 15:59:43 +00:00
Eugene Zhulenev	764e52f0d4	[DebugInfo][InstrRef] Short-circuit unnecessary preferred location map construction Reviewed By: cota Differential Revision: https://reviews.llvm.org/D117162	2022-01-13 06:24:52 -08:00
Nikita Popov	aba7c3c033	[ConstantFold] Check uniform value in ConstantFoldLoadFromConst() This case is automatically handled if ConstantFoldLoadFromConstPtr() is used. Make sure that ConstantFoldLoadFromConst() also handles it.	2022-01-13 14:40:19 +01:00
Petar Avramovic	235886e174	AMDGPU/GlobalISel: Fix custom legalizatation for fceil	2022-01-13 14:29:30 +01:00
Sander de Smalen	b92102a6d7	[AArch64] Add native CPU detection for Neoverse-V1. Map Main ID part number 0xd40 to neoverse-v1, as described in the Neoverse-V1 Technical Reference Manual: https://developer.arm.com/documentation/101427/0101/Register-descriptions/AArch64-system-registers/MIDR-EL1--Main-ID-Register--EL1 Differential Revision: https://reviews.llvm.org/D117207	2022-01-13 12:58:54 +00:00
Simon Pilgrim	57a551a8df	[X86][AVX] lowerShuffleAsLanePermuteAndShuffle - don't split element rotate patterns Partial element rotate patterns (e.g. for element insertion on Issue #53124) were being split if every lane wasn't crossing, but really there's a good repeated mask hiding in there.	2022-01-13 11:59:08 +00:00
David Green	61888d97f6	[AArch64] Basic demand elements for some intrinsics A lot of neon intrinsics work lane-wise, meaning that non-demanded elements in and not demanded out. This teaches that to AArch64TTIImpl::simplifyDemandedVectorEltsIntrinsic for some simple single-input truncate intrinsics, which can help remove unnecessary instructions. Differential Revision: https://reviews.llvm.org/D117097	2022-01-13 11:53:12 +00:00
Florian Hahn	3f2fb767e3	[VPlan] Make IV operand explicit for VPWidenCanonicalIVRecipe (NFC). This makes the def-use relationship between VPCanonicalIVPHIRecipe and VPWidenCanonicalIVRecipe explicit. Needed for D117140.	2022-01-13 11:13:05 +00:00
Simon Pilgrim	4f414af6a7	Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC.	2022-01-13 11:10:50 +00:00
Simon Pilgrim	37ebec68a8	[MIPS] Mips16DAGToDAGISel::selectAddr - Use cast<> instead of dyn_cast<> to avoid dereference of nullptr The pointer is always dereferenced immediately below, so assert the cast is correct instead of returning nullptr	2022-01-13 11:10:49 +00:00
Hans Wennborg	2bc57d85eb	Don't override __attribute__((no_stack_protector)) by inlining (PR52886) Since `26c6a3e736`, LLVM's inliner will "upgrade" the caller's stack protector attribute based on the callee. This lead to surprising results with Clang's no_stack_protector attribute added in `4fbf84c173` (D46300). Consider the following code compiled with clang -fstack-protector-strong -Os (https://godbolt.org/z/7s3rW7a1q). extern void h(int* p); inline __attribute__((always_inline)) int g() { return 0; } int __attribute__((__no_stack_protector__)) f() { int a[1]; h(a); return g(); } LLVM will inline g() into f(), and f() would get a stack protector, against the users explicit wishes, potentially breaking the program e.g. if h() changes the value of the stack cookie. That's a miscompile. More recently, `bc044a88ee` (D91816) addressed this problem by preventing inlining when the stack protector is disabled in the caller and enabled in the callee or vice versa. However, the problem remained if the callee is marked always_inline as in the example above. This affected users, see e.g. http://crbug.com/1274129 and http://llvm.org/pr52886. One way to fix this would be to prevent inlining also in the always_inline case. Despite the name, always_inline does not guarantee inlining, so this would be legal but potentially surprising to users. However, I think the better fix is to not enable the stack protector in a caller based on the callee. The motivation for the old behaviour is unclear, it seems counter-intuitive, and causes real problems as we've seen. This commit implements that fix, which means in the example above, g() gets inlined into f() (also without always_inline), and f() is emitted without stack protector. I think that matches most developers' expectations, and that's also what GCC does. Another effect of this change is that a no_stack_protector function can now be inlined into a stack protected function, e.g. (https://godbolt.org/z/hafP6W856): extern void h(int* p); inline int __attribute__((__no_stack_protector__)) __attribute__((always_inline)) g() { return 0; } int f() { int a[1]; h(a); return g(); } I think that's fine. Such code would be unusual since no_stack_protector is normally applied to a program entry point which sets up the stack canary. And even if such code exists, inlining doesn't change the semantics: there is still no stack cookie setup/check around entry/exit of the g() code region, but there may be in the surrounding context, as there was before inlining. This also matches GCC. See also the discussion at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722 Differential revision: https://reviews.llvm.org/D116589	2022-01-13 12:04:49 +01:00
Sebastian Neubauer	f4139440f1	[Docs] Fix IR and TableGen grammar inconsistencies IR: - globals (and functions, ifuncs, aliases) can have a partition - catchret has a `to` before the label - the sint/int types do not exist - signext comes after the type - a variable was missing its type TableGen: - The second value after a `#` concatenation is optional See e.g. llvm/lib/Target/X86/X86InstrAVX512.td:L3351 - IncludeDirective and PreprocessorDirective were never referenced in the grammar - Add some missing ; - Parent classes of multiclasses can have generic arguments. Reuse the `ParentClassList` that is already used in other places. MIR: - liveins only allows physical registers, which start with a $ Differential Revision: https://reviews.llvm.org/D116674	2022-01-13 11:55:13 +01:00
Ties Stuij	7c70f96a91	[ARM] fix bug causing shrinkwrapping not always being off using PAC If you want to check for all uses of PAC, the SpillsLR argument to shouldSignReturnAddress should be true instead of false, as that value will be returned from the function if the other checks fall through. Reviewed By: miyuki Differential Revision: https://reviews.llvm.org/D116213	2022-01-13 10:37:00 +00:00
Nikita Popov	1cbb456123	[GlobalOpt] Fix global to select transform under opaque pointers We need to check that the load/store type is also the same, as this is no longer implicitly checked through the pointer type.	2022-01-13 11:13:06 +01:00
Paulo Matos	97ef15ad76	[WebAssembly] Fix reftype load/store match with idx from call Implement support for matching an index from a WebAssembly CALL instruction. Add test. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D115327	2022-01-13 11:04:22 +01:00
Jay Foad	821dd3b0e5	[FileCheck] Allow literal '['s before "[[var...]]" Change FileCheck to accept patterns like "[[[var...]]" and treat the excess open brackets at the start as literals. This makes the patterns for matching assembler output with literal brackets much cleaner. For example an AMDGPU pattern that used to be written like: buffer_store_dwordx2 v{{\[}}[[LO]]:[[HI]]{{\]}} can now be: buffer_store_dwordx2 v[[[LO]]:[[HI]]] (Even before this patch the final close bracket did not need to be wrapped in {{}}, but people tended to do it anyway for symmetry.) This does not introduce any ambiguity since "[[" was always followed by an identifier or '@' or '#', so "[[[" was always an error. I've included a few test updates in this patch just for illustration and testing. There are a couple of hundred tests that could be updated as a follow up, mostly in test/CodeGen/. Differential Revision: https://reviews.llvm.org/D117117 Change-Id: Ia6bc6f65cb69734821c911f54a43fe1c673bcca7	2022-01-13 09:47:37 +00:00
David Sherwood	31009f0b5a	[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants When we know the value we're extending is a negative constant then it makes sense to use SIGN_EXTEND because this may improve code quality in some cases, particularly when doing a constant splat of an unpacked vector type. For example, for SVE when splatting the value -1 into all elements of a vector of type <vscale x 2 x i32> the element type will get promoted from i32 -> i64. In this case we want the splat value to sign-extend from (i32 -1) -> (i64 -1), whereas currently it zero-extends from (i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use a single mov immediate instruction. New tests added here: CodeGen/AArch64/sve-vector-splat.ll I believe we see some code quality improvements in these existing tests too: CodeGen/AArch64/dag-numsignbits.ll CodeGen/AArch64/reduce-and.ll CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only occur because the test disables codegen prepare and branch folding. Differential Revision: https://reviews.llvm.org/D114357	2022-01-13 09:43:07 +00:00
Florian Hahn	7ce48be0fd	[LV] Inline CreateSplatIV call for scalar VFs (NFC). This is a NFC change split off from D116123, as suggested there. D116123 will remove the last user of CreateSplatIV.	2022-01-13 09:34:31 +00:00
David Sherwood	ef1ca4d3e9	[AArch64] Fix incorrect use of MVT::getVectorNumElements in AArch64TTIImpl::getVectorInstrCost If we are inserting into or extracting from a scalable vector we do not know the number of elements at runtime, so we can only let the index wrap for fixed-length vectors. Tests added here: Analysis/CostModel/AArch64/sve-insert-extract.ll Differential Revision: https://reviews.llvm.org/D117099	2022-01-13 09:27:14 +00:00
Vladislav Khmelevsky	6b22c370c8	RuntimeDyldELF: Don't abort on R_AARCH64_NONE relocation Do nothing on R_AARCH64_NONE relocation. The relocation is used by BOLT when re-linking the final binary. It is used as a dummy relocation hack in order to stop the RuntimeDyld to skip the allocation of the section. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D117066	2022-01-13 11:54:48 +03:00
luxufan	0ef5aa69e7	[JITLink] Add fixup value range check This patch makes jitlink to report an out of range error when the fixup value out of range Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D107328	2022-01-13 16:32:49 +08:00
Jim Lin	bb13036483	[M68k][NFC] Use Register instead of unsigned int	2022-01-13 15:49:39 +08:00
Christian Sigg	cc1b9acf55	[NVPTX] Lower fp16 fminnum, fmaxnum to native on sm_80. Reviewed By: bkramer, tra Differential Revision: https://reviews.llvm.org/D117122	2022-01-13 08:52:31 +01:00
Kazu Hirata	cd772844d8	[CSKY] Ensure a newline at the end of a file (NFC)	2022-01-12 22:11:57 -08:00
James Y Knight	55fcbf0a84	Revert "[Inline] Attempt to delete any discardable if unused functions" Somehow this ends up causing an infinite loop in the inliner. This reverts commit `d5be48c66d`.	2022-01-13 03:06:47 +00:00
Lian Wang	16877c5d2c	[RISCV] Add bfp and bfpw intrinsic in zbf extension Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116994	2022-01-13 02:53:00 +00:00
Philip Reames	9979299705	[Attributor] Simplify how we handle required alignment during heap-to-stack [NFC] The existing code duplicated the same concern in two places, and (weirdly) changed the inference of the allocation size based on whether we could meet the alignment requirement. Instead, just directly check the allocation requirement.	2022-01-12 17:34:17 -08:00
Philip Reames	d1f4c6a611	[Attributor] Generalize calloc handling in heap-to-stack for any init value [NFC] Rewrite the calloc specific handling in heap-to-stack to allow arbitrary init values. The basic problem being solved is that if an allocation is initilized to anything other than zero, this must be explicitly done for the formed alloca as well. This covers the calloc case today, but once a couple of earlier guards are removed in this code, downstream allocators with other init values could also be handled. Inspired by discussion on D116971	2022-01-12 16:58:39 -08:00
Philip Reames	8e76720cf2	[Attributor] Reuse object size evaluation code [NFC]	2022-01-12 16:58:39 -08:00
Philip Reames	db57065b36	[Attributor] Use getAllocAlignment where possible [NFC] Inspired by D116971.	2022-01-12 16:58:39 -08:00
Matt Arsenault	1adeebc2cf	AMDGPU: Fix assert on function argument as loop condition	2022-01-12 19:44:26 -05:00
Stanislav Mekhanoshin	d043822daa	[AMDGPU] Fixed physreg asm constraint parsing We are always failing parsing of the physreg constraint because we do not drop trailing brace, thus getAsInteger() returns a non-empty string and we delegate reparsing to the TargetLowering. In addition it did not parse register tuples. Fixed which has allowed to remove w/a in two places we call it. Differential Revision: https://reviews.llvm.org/D117055	2022-01-12 16:37:08 -08:00
Matt Arsenault	5a16306c09	GlobalISel: Always enable GISelKnownBits for InstructionSelect This wasn't running at -O0, and causing crashes for AMDGPU. AMDGPU needs this to match the addressing modes of stack access instructions, which is even more important at -O0 than with optimizations. It currently costs nothing to run ahead of time, so just always enable it.	2022-01-12 18:57:24 -05:00
Matt Arsenault	5f39a02ea9	RegScavenger: Remove used regs from scavenge candidates In a future change, AMDGPU will have 2 emergency scavenging indexes in some situations. The secondary scavenging index ends up being used recursively when the scavenger calls eliminateFrameIndex for the emergency spill slot. Without this, it would end up seeing the same register which was just scavenged in the parent call as free, inserts a second emergency spill to the same location and returns the same register when 2 unique free registers are required. We need to only do this if the register is used. SystemZ uses 2 scavenging slots, but calls the scavenger twice in sequence and not recursively. In this case the previously scavenged register can be re-clobbered, but is still tracked in the scavenger until it sees the deferred restore instruction.	2022-01-12 18:56:52 -05:00
Matt Arsenault	4515c24bbc	AMDGPU/GlobalISel: Fix assertions on legalize queries with huge align For some reason we pass around the alignment in bits as uint64_t. Two places were truncating it to unsigned, and losing bits in extreme cases.	2022-01-12 18:21:44 -05:00
Matt Arsenault	07ddfa95e3	GlobalISel: Add G_ASSERT_ALIGN hint instruction Insert it for call return values only for now, which is the only case the DAG handles also.	2022-01-12 18:20:58 -05:00
Tomas Matheson	2db4cf5962	clang support for Armv8.8/9.3 HBC This introduces clang command line support for new Armv8.8-A and Armv9.3-A Hinted Conditional Branches feature, previously introduced into LLVM in https://reviews.llvm.org/D116156. Patch by Tomas Matheson and Son Tuan Vu. Differential Revision: https://reviews.llvm.org/D116939	2022-01-12 22:07:35 +00:00
Luís Ferreira	6983968e83	[Demangle] Pass Ret parameter from decodeNumber by reference Since Ret parameter is never meant to be nullptr, let's pass it by reference instead of a raw pointer. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D117046	2022-01-12 21:57:31 +00:00
Luís Ferreira	b21ea1c270	[Demangle] Add support for D types back referencing This patch adds support for type back referencing, allowing demangling of compressed mangled symbols with repetitive types. Signed-off-by: Luís Ferreira <contact@lsferreira.net> Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D111419	2022-01-12 21:57:31 +00:00
Luís Ferreira	bec08795db	[Demangle] Add support for D symbols back referencing This patch adds support for identifier back referencing allowing compressed mangled names by avoiding repetitiveness. Signed-off-by: Luís Ferreira <contact@lsferreira.net> Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D111417	2022-01-12 21:57:31 +00:00
Luís Ferreira	669bfcf036	[Demangle] Add minimal support for D simple basic types This patch implements simple demangling of two basic types to add minimal type functionality. This will be later used in function type parsing. After that being implemented we can add the rest of the types and test the result of the type name. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D111416	2022-01-12 21:57:30 +00:00
Sanjay Patel	6bd127b079	[InstSimplify] use knownbits to fold more udiv/urem We could use knownbits on both operands for even more folds (and there are already tests in place for that), but this is enough to recover the example from: https://github.com/llvm/llvm-project/issues/51934 (the tests are derived from the code in that example) I am assuming no noticeable compile-time impact from this because udiv/urem are rare opcodes. Differential Revision: https://reviews.llvm.org/D116616	2022-01-12 14:59:43 -05:00
Nico Weber	66b2ed477f	Revert "[JITLink][AArch64] Add support for splitting eh-frames on AArch64." This reverts commit `253ce92844`. Breaks tests on Windows, see https://github.com/llvm/llvm-project/issues/52921#issuecomment-1011118896	2022-01-12 14:40:09 -05:00
Alex Bradbury	33d008b169	[RISCV] Update recently ratified Zb{a,b,c,s} extensions to no longer be experimental Agreed policy is that RISC-V extensions that have not yet been ratified should be marked as experimental, and enabling them requires the use of the -menable-experimental-extensions flag when using clang alongside the version number. These extensions have now been ratified, so this is no longer necessary, and the target feature names can be renamed to no longer be prefixed with "experimental-". Differential Revision: https://reviews.llvm.org/D117131	2022-01-12 19:33:44 +00:00
Matt Arsenault	bd2c01e937	AMDGPU/GlobalISel: Do not use terminator copy before waterfall loops Stop using the _term variants of the mov to save the initial exec value before the waterfall loop. This cannot be glued to the bottom of the block because we may need to spill the result register. Just use a regular mov, like the loops produced on the DAG path. Fixes some verification errors with regalloc fast.	2022-01-12 13:44:05 -05:00
Matt Arsenault	8a16201a0b	GlobalISel: Fix insert point in localizer This was inserting the new G_CONSTANT after the use, and the later block scan would run off the end. Fix calling SkipPHIsAndLabels for no apparent reason.	2022-01-12 13:44:05 -05:00

1 2 3 4 5 ...

154064 Commits