llvm-project

Commit Graph

Author	SHA1	Message	Date
Benjamin Kramer	8bc0bb9564	Add a conversion from double to bf16 This introduces a new compiler-rt function `__truncdfbf2`.	2022-06-15 12:56:31 +02:00
Benjamin Kramer	fb34d531af	Promote bf16 to f32 when the target doesn't support it This is modeled after the half-precision fp support. Two new nodes are introduced for casting from and to bf16. Since casting from bf16 is a simple operation I opted to always directly lower it to integer arithmetic. The other way round is more complicated if you want to preserve IEEE semantics, so it's handled by a new __truncsfbf2 compiler-rt builtin. This is of course very bare bones, but sufficient to get a semi-softened fadd on x86. Possible future improvements: - Targets with bf16 conversion instructions can now make fp_to_bf16 legal - The software conversion to bf16 can be replaced by a trivial implementation under fast math. Differential Revision: https://reviews.llvm.org/D126953	2022-06-15 12:56:31 +02:00
Keith Walker	94fac097ad	[DebugInfo][ARM] Not readonly check for RWPI globals When compiling for the RWPI relocation model [1], the debug information is wrong for readonly global variables. Writable global variables are accessed by the static base register (R9 on ARM) in the RWPI relocation model. This is being correctly generated. Readonly global variables are not accessed by the static base register in the RWPI relocation model. This case is incorrectly generating the same debugging information as for writable global variables. References: [1] ARM Read-Write Position Independence: https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#read-write-position-independence-rwpi Differential Revision: https://reviews.llvm.org/D126361	2022-06-15 11:52:12 +01:00
Simon Pilgrim	f096d5926d	[DAG] Fix SDLoc mismatch in (shl (srl x, c1), c2) -> and(shift(x,c3)) fold Noticed by @craig.topper on D125836 which uses a tweaked copy of the same code. Differential Revision: https://reviews.llvm.org/D127772	2022-06-15 11:07:59 +01:00
Ping Deng	c06f77ec0d	[SelectionDAG] fold 'Op0 - (X * MulC)' to 'Op0 + (X << log2(-MulC))' Reviewed By: craig.topper, spatel Differential Revision: https://reviews.llvm.org/D127474	2022-06-15 05:50:18 +00:00
Paul Robinson	0ce33c2941	[PS5] Default to 'sce' debugger tuning	2022-06-14 15:28:28 -07:00
Serguei Katkov	d713f0eab8	Revert "[MachineSSAUpdater] compile time improvement in GetValueInMiddleOfBlock" It looks like it causes buildbot failures. As an example: https://lab.llvm.org/buildbot/#/builders/121/builds/20364 Revert to investigate... This reverts commit `6bf2791814`.	2022-06-14 20:27:21 +07:00
Florian Hahn	e5c4308ba1	[InterleavedLoadComb] Rename uses when inserting new uses. This fixes a crash due to uses needing to be renamed.	2022-06-14 13:15:23 +01:00
Serguei Katkov	6bf2791814	[MachineSSAUpdater] compile time improvement in GetValueInMiddleOfBlock GetValueInMiddleOfBlock uses the result of GetValueAtEndOfBlockInternal if there is no value defined for the current basic block. If there is already a value, it tries, in this order: (1) to find a single register coming from all predecessors, (2) to find an existing phi node that matches the incoming registers, (3) to build a new phi. The compile time improvement is to use the currently available value if it is defined outside the current BB or is a PHI register, because such a value can be used in the middle of the basic block. Reviewed By: sameerds Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D126523	2022-06-14 18:00:34 +07:00
Guillaume Chatelet	c0e85f1c3b	[NFC][Alignment] Use Align in SafeStack	2022-06-14 10:56:36 +00:00
Guillaume Chatelet	6725d80640	[NFC][Alignment] Use Align in shouldAlignPointerArgs	2022-06-14 10:56:36 +00:00
Denis Antrushin	c0e965e222	[Statepoints] FixupStatepoint: Clear isKill flag if COPY is not deleted. When spilling CSRs, FixupStatepoint pass does simple copy propagation, trying to find COPY instruction which defines register being spilled and spill COPY source instead. I.e., if we have CSR $x and found $x = COPY $y we will spill $y instead. But we may be unable to delete COPY instruction for some reason. Then, spill will be inserted after it, adding another use of $y. If COPY instruction was last use of $y (killed it), after insertion of the spill it is not, so `isKill` flag must be cleared. We failed to do so and this patch fixes this issue. Reviewed By: skatkov Differential Revision: https://reviews.llvm.org/D127308	2022-06-14 10:52:32 +03:00
Kazu Hirata	a2232da2a5	[CodeGen] Remove addSEHCatchHandler and addSEHCleanupHandler (NFC) The last uses of these functions were removed on Oct 9, 2015 in commit `14e773500e`.	2022-06-13 23:08:49 -07:00
Kazu Hirata	34ff78c5cf	[CodeGen] Remove restrictRef (NFC) The last use was removed on Apr 14, 2017 in commit `4fe9d6c640`.	2022-06-13 23:08:48 -07:00
Serguei Katkov	095bf6be28	[Greedy RegAlloc] Fix the handling of split register in last chance re-coloring. This is a fix for https://github.com/llvm/llvm-project/issues/55827. When the register we are trying to re-color is split, the original register (the one we tried to recover) has no uses after the split. However, in the rollback actions we assign the physical register back to it. Later this causes various assertions; one of them is in the attached test. This CL fixes this by not assigning a physical register back to a register that has no uses or whose live interval is now empty. Reviewed By: arsenm, qcolombet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D127281	2022-06-14 12:04:17 +07:00
Kazu Hirata	145cc9db2b	[CodeGen] Remove futureWeight (NFC) The last use was removed on Jun 5, 2022 in commit `5c06f7168f`, which itself was a patch to remove unused code.	2022-06-13 17:10:23 -07:00
Guillaume Chatelet	5a293d21fc	[NFC][Alignment] Use getAlign in SelectionDAGBuilder	2022-06-13 15:13:05 +00:00
Kazu Hirata	23d9ca10ae	[CodeGen] Remove EvictionTrack (NFC) The last of getEvictor use was removed on Jun 5, 2022 in commit `5c06f7168f`, which was itself a patch to remove unused code. Once we remove getEvictor, EvictionTrack becomes a write-only data structure. The data in it won't affect compilation, so the entire class is essentially dead.	2022-06-13 07:21:29 -07:00
Kazu Hirata	246e83e973	[GlobalISel] Remove buildSequence (NFC) The last use was removed on Jun 27, 2019 in commit `8138996128`.	2022-06-13 06:58:36 -07:00
Nikita Popov	b9a7dea917	[SelectionDAG] Handle trapping aggregate (PR49839) Call canTrap() on Constant to account for trapping ConstantAggregate.	2022-06-13 15:06:53 +02:00
Simon Pilgrim	7d8fd4f5db	[DAG] visitINSERT_VECTOR_ELT - attempt to reconstruct BUILD_VECTOR before other fold interfere Another issue unearthed by D127115. We take a long time to canonicalize an insert_vector_elt chain before being able to convert it into a build_vector - even if they are already in ascending insertion order, we fold the nodes one at a time into the build_vector 'seed', leaving plenty of time for other folds to alter it (in particular recognising when they come from extract_vector_elt, resulting in a shuffle_vector that is much harder to fold with). D127115 makes this particularly difficult as we're almost guaranteed to have lost the sequence before all possible insertions have been folded. This patch proposes to begin at the last insertion and attempt to collect all the (oneuse) insertions right away and create the build_vector before it's too late. Differential Revision: https://reviews.llvm.org/D127595	2022-06-13 11:48:18 +01:00
Jez Ng	d4bcb45db7	[MC][re-land] Omit DWARF unwind info if compact unwind is present where eligible This reverts commit `d941d59783`. Differential Revision: https://reviews.llvm.org/D122258	2022-06-12 17:24:19 -04:00
Simon Pilgrim	1cf9b24da3	[DAG] Enable ISD::FSHL/R SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the source operand has other uses, enabling us to peek through the shifted value if we don't demand all the bits/elts. This helps with several of the regressions from D125836	2022-06-12 19:25:20 +01:00
Jez Ng	d941d59783	Revert "[MC] Omit DWARF unwind info if compact unwind is present where eligible" This reverts commit `ef501bf85d`.	2022-06-12 10:47:08 -04:00
Jez Ng	ef501bf85d	[MC] Omit DWARF unwind info if compact unwind is present where eligible Previously, omitting unnecessary DWARF unwinds was only done in two cases: * For Darwin + aarch64, if no DWARF unwind info is needed for all the functions in a TU, then the `__eh_frame` section would be omitted entirely. If any one function needed DWARF unwind, then MC would emit DWARF unwind entries for all the functions in the TU. * For watchOS, MC would omit DWARF unwind on a per-function basis, as long as compact unwind was available for that function. This diff makes it so that we omit DWARF unwind on a per-function basis for Darwin + aarch64 as well. In addition, we introduce the flag `--emit-dwarf-unwind=` which can toggle between `always`, `no-compact-unwind` (only emit DWARF when CU cannot be emitted for a given function), and the target platform `default`. `no-compact-unwind` is particularly useful for newer x86_64 platforms: we don't want to omit DWARF unwind for x86_64 in general due to possible backwards compat issues, but we should make it possible for people to opt into this behavior if they are only targeting newer platforms. Motivation: I'm working on adding support for `__eh_frame` to LLD, but I'm concerned that we would suffer a perf hit. Processing compact unwind is already expensive, and that's a simpler format than EH frames. Given that MC currently produces one EH frame entry for every compact unwind entry, I don't think processing them will be cheap. I tried to do something clever on LLD's end to drop the unnecessary EH frames at parse time, but this made the code significantly more complex. So I'm looking at fixing this at the MC level instead. Addendum: It turns out that there was a latent bug in the X86 backend when `OmitDwarfIfHaveCompactUnwind` is naively enabled, which is not too surprising given that this combination has not been heretofore used. For functions that have unwind info that cannot be encoded with CU, MC would end up dropping both the compact unwind entry (OK; existing behavior) as well as the DWARF entries (not OK). This diff fixes things so that we emit the DWARF entry, as well as a CU entry with encoding `UNWIND_X86_MODE_DWARF` -- this basically tells the unwinder to look for the DWARF entry. I'm not 100% sure the `UNWIND_X86_MODE_DWARF` CU entry is necessary, this was the simplest fix. ld64 seems to be able to handle both the absence and presence of this CU entry. Ultimately ld64 (and LLD) will synthesize `UNWIND_X86_MODE_DWARF` if it is absent, so there is no impact to the final binary size. Reviewed By: davide, lhames Differential Revision: https://reviews.llvm.org/D122258	2022-06-12 10:03:56 -04:00
Simon Pilgrim	54ae4ca755	[DAG] visitSRL - pull out ShiftVT. NFC.	2022-06-12 14:02:23 +01:00
Simon Pilgrim	cf5c63d187	[DAG] visitVECTOR_SHUFFLE - fold splat(insert_vector_elt()) and splat(scalar_to_vector()) to build_vector splats Addresses a number of regressions identified in D127115	2022-06-11 21:06:42 +01:00
Simon Pilgrim	44a0cd25df	[DAG] visitINSERT_VECTOR_ELT - add <1 x ???> insert_vector_elt(v0,extract_vector_elt(v1,0),0) special case handling Check if we're just replacing one v1x?? vector with another	2022-06-11 19:30:00 +01:00
Simon Pilgrim	a71ad6a3c8	[DAG] visitINSERT_VECTOR_ELT - fold insert_vector_elt(scalar_to_vector(x),v,i) -> build_vector() Allow scalar_to_vector nodes to be used for the start of a build_vector creation	2022-06-11 15:29:22 +01:00
Simon Pilgrim	693f4db1ec	[DAG] visitINSERT_VECTOR_ELT - refactor BUILD_VECTOR insertion to remove early-out. NFCI. Remove the early-out cases so we can more easily add additional folds in the future.	2022-06-11 12:01:13 +01:00
Paul Walker	10d55c4634	[SelectionDAG] Remove invalid TypeSize conversion from WidenVecOp_BITCAST. Differential Revision: https://reviews.llvm.org/D127322	2022-06-11 10:41:13 +01:00
Kazu Hirata	a98965d92f	[CodeGen] Use llvm::erase_value (NFC)	2022-06-10 22:59:48 -07:00
Fangrui Song	adf4142f76	[MC] De-capitalize SwitchSection. NFC Add SwitchSection to return switchSection. The API will be removed soon.	2022-06-10 22:50:55 -07:00
Eli Friedman	0ff51d5dde	Fix interaction of CFI instructions with MachineOutliner. 1. When checking if a candidate contains a CFI instruction, actually iterate over all of the instructions, instead of stopping halfway through. 2. Make sure copied CFI directives refer to the correct instruction. Fixes https://github.com/llvm/llvm-project/issues/55842 Differential Revision: https://reviews.llvm.org/D126930	2022-06-10 13:37:49 -07:00
Guillaume Chatelet	95083fa3b8	[NFC] Remove deadcode	2022-06-10 15:13:42 +00:00
Simon Pilgrim	91adbc3208	[DAG] SimplifyDemandedVectorElts - adding SimplifyMultipleUseDemandedVectorElts handling to ISD::CONCAT_VECTORS Attempt to look through multiple use operands of ISD::CONCAT_VECTORS nodes Another minor improvement for D127115	2022-06-10 16:06:43 +01:00
Guillaume Chatelet	38637ee477	[clang] Add support for __builtin_memset_inline In the same spirit as D73543 and in reply to https://reviews.llvm.org/D126768#3549920 this patch is adding support for `__builtin_memset_inline`. The idea is to get support from the compiler to easily write efficient memory function implementations. This patch could be split in two: - one for the LLVM part adding the `llvm.memset.inline.*` intrinsics. - and another one for the Clang part providing the intrinsic as a builtin. Differential Revision: https://reviews.llvm.org/D126903	2022-06-10 13:13:59 +00:00
Nikita Popov	c10921fa1a	[CGP] Also freeze ctlz/cttz operand when despeculating D125887 changed the ctlz/cttz despeculation transform to insert a freeze for the introduced branch on zero. While this does fix the "branch on poison" issue, we may still get in trouble if we pick a different value for the branch and for the ctz argument (i.e. non-zero for the branch, but zero for the ctz). To avoid this, we should use the same frozen value in both positions. This does cause a regression in RISCV codegen by introducing an additional sext. The DAG looks like this: t0: ch = EntryToken t2: i64,ch = CopyFromReg t0, Register:i64 %3 t4: i64 = AssertSext t2, ValueType:ch:i32 t23: i64 = freeze t4 t9: ch = CopyToReg t0, Register:i64 %0, t23 t16: ch = CopyToReg t0, Register:i64 %4, Constant:i64<32> t18: ch = TokenFactor t9, t16 t25: i64 = sign_extend_inreg t23, ValueType:ch:i32 t24: i64 = setcc t25, Constant:i64<0>, seteq:ch t28: i64 = and t24, Constant:i64<1> t19: ch = brcond t18, t28, BasicBlock:ch<cond.end 0x8311f68> t21: ch = br t19, BasicBlock:ch<cond.false 0x8311e80> I don't see a really obvious way to improve this, as we can't push the freeze past the AssertSext (which may produce poison). Differential Revision: https://reviews.llvm.org/D126638	2022-06-10 09:46:10 +02:00
Simon Moll	b8c2781ff6	[NFC] format InstructionSimplify & lowerCaseFunctionNames Clang-format InstructionSimplify and convert all "FunctionName"s to "functionName". This patch does touch a lot of files but gets done with the cleanup of InstructionSimplify in one commit. This is the alternative to the less invasive clang-format only patch: D126783 Reviewed By: spatel, rengolin Differential Revision: https://reviews.llvm.org/D126889	2022-06-09 16:10:08 +02:00
Simon Pilgrim	7dbfcfa735	[DAG] combineInsertEltToShuffle - if EXTRACT_VECTOR_ELT fails to match an existing shuffle op, try to replace an undef op if there is one. This should fix a number of shuffle regressions in D127115 where the re-ordered combines mean we fail to fold a EXTRACT_VECTOR_ELT/INSERT_VECTOR_ELT sequence into a BUILD_VECTOR if we extract from more than one vector source.	2022-06-09 14:56:14 +01:00
Guillaume Chatelet	dc3367970e	[SelectionDAG] Handle bzero/memset libcalls globally instead of per target Differential Revision: https://reviews.llvm.org/D127279	2022-06-09 08:34:55 +00:00
Craig Topper	4bcfc41846	[SelectionDAG] Teach computeKnownBits that a nsw self multiply produce a positive value. This matches what we do in IR. For the RISC-V test case, this allows us to use -8 for the AND mask instead of materializing a constant in a register. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D127335	2022-06-08 14:55:58 -07:00
Kai Nacke	d897a14c2e	[SystemZ] Fix check for zero size when lowering memcmp. During lowering of memcmp/bcmp, the check for a size of 0 is done in 2 different ways. In rare cases this can lead to a crash in SystemZSelectionDAGInfo::EmitTargetCodeForMemcmp(). The root cause is that SelectionDAGBuilder::visitMemCmpBCmpCall() checks for a constant int value which is not yet evaluated. When the value is turned into a SDValue, then the evaluation is done and results in a ConstantSDNode. But EmitTargetCodeForMemcmp() expects the special case of 0 length to be handled, which results in an assertion. The fix is to turn the value into a SDValue, so that both functions use the same check. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D126900	2022-06-08 14:52:13 -04:00
Simon Pilgrim	b84c10d4bc	[DAG] visitVSELECT - don't wait for truncation of sub before attempting to match with getTruncatedUSUBSAT Fixes some X86 PSUBUS regressions encountered in D127115 where the truncate was being replaced with a PACKSS/PACKUS before the fold got called again	2022-06-08 16:16:35 +01:00
Joseph Huber	9e0dbd2a2a	[Target] Remove `startswith` for adding `SHF_EXCLUDE` to offload section Summary: We use the special section name `.llvm.offloading` to store device images in the host object file. We want these to be stripped by the linker as they are not used after linking, so we use the `SHF_EXCLUDE` flag to instruct the linker to drop them. We used to do this for all sections that started with `.llvm.offloading` when we encoded metadata in the section name itself. Now that we embed a special binary containing the metadata, we should only add the flag on this name specifically.	2022-06-08 09:56:51 -04:00
Paul Walker	d88354213c	[SelectionDAG] Remove invalid TypeSize conversion from PromoteIntRes_BITCAST. Extend the TypeWidenVector case of PromoteIntRes_BITCAST to work with TypeSize directly rather than silently casting to unsigned. To accomplish this I've extended TypeSize with an interface that essentially allows TypeSize division when both operands have the same number of dimensions. There still exist combinations of scalable vector bitcasts that cause compiler crashes. I call these out by adding "is missing" entries to sve-bitcast. Depends on D126957. Fixes: #55114 Differential Revision: https://reviews.llvm.org/D127126	2022-06-08 10:30:07 +01:00
Paul Walker	a1121c31d8	[SVE] Fix incorrect code generation for bitcasts of unpacked vector types. Bitcasting between unpacked scalable vector types of different element counts is not a NOP because the live elements are laid out differently. 01234567 e.g. nxv2i32 = XX??XX?? nxv4f16 = X?X?X?X? Differential Revision: https://reviews.llvm.org/D126957	2022-06-08 10:30:07 +01:00
Chuanqi Xu	0e10f12844	[NFC] Remove commented cerr debugging loggings There is some unused, commented-out cerr debug logging in the code. It is odd to leave such commented-out debug helpers in the product.	2022-06-08 15:58:06 +08:00
Kito Cheng	7207373e1e	Revert "[SplitKit] Handle early clobber + tied to def correctly" Reverted due to failures with LLVM_ENABLE_EXPENSIVE_CHECKS. This reverts commit `e14d04909d`.	2022-06-08 13:05:35 +08:00
Kito Cheng	e14d04909d	[SplitKit] Handle early clobber + tied to def correctly The splitter tries to extend a live range into the `r` slot for a use operand. That works in most situations, but not when the operand is tied to a def and the def operand is early-clobber. An example of what goes wrong: 0 %0 = ... 16 early-clobber %0 = Op %0 (tied-def 0), ... 32 ... = Op %0 Before extend: %0 = [0r, 0d) [16e, 32d) The point we want to extend is 0d to 16e, not 16r, in this case; if we use 16r here we extend nothing because that is already contained in [16e, 32d). This patch adds a check to detect this case and adjusts the extension point. Detailed explanation of the test case: https://reviews.llvm.org/D126047 Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D126048	2022-06-08 11:33:05 +08:00
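
Note on fb34d531af: the message says that casting from bf16 is simple enough to lower directly to integer arithmetic, while casting to bf16 with IEEE semantics is left to the `__truncsfbf2` builtin. The sketch below illustrates why the two directions differ. It is not the compiler-rt source: the function names are hypothetical, and round-to-nearest-even with NaN quieting is an assumption about what "IEEE semantics" requires here. The input is assumed to be exactly representable as f32; a double is rounded to f32 first by `struct.pack`, which is the kind of double rounding a dedicated double-to-bf16 routine would avoid.

```python
import struct


def bf16_bits_to_float(bits: int) -> float:
    """Widen a bf16 bit pattern to f32: the 16 bits become the high half."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def float_to_bf16_bits(value: float) -> int:
    """Narrow an f32 value to a bf16 bit pattern (hypothetical helper).

    Assumes round-to-nearest-even; NaNs are quieted rather than rounded.
    """
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    if (bits & 0x7FFFFFFF) > 0x7F800000:
        # NaN: keep the sign and top payload bits, force the quiet bit.
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Add 0x7FFF plus the lowest kept bit so that exact ties round to even.
    lsb = (bits >> 16) & 1
    bits += 0x7FFF + lsb
    return (bits >> 16) & 0xFFFF
```

Widening is a single shift, whereas narrowing needs a rounding step and a NaN special case, which matches the split between inline lowering and a runtime call described in the commit message.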

32464 Commits
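
Note on c06f77ec0d and f096d5926d: both commits concern SelectionDAG folds whose arithmetic can be checked independently of LLVM. The sketch below models 32-bit wrapping integers in Python and compares each original expression with its folded form. The function names and the 32-bit width are assumptions for illustration, and any extra conditions the real combines apply (use counts, legality, type checks) are not modeled.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit wrapping arithmetic


def sub_of_mul(op0: int, x: int, mul_c: int) -> int:
    """Original form: Op0 - (X * MulC)."""
    return (op0 - x * mul_c) & MASK32


def add_of_shl(op0: int, x: int, mul_c: int) -> int:
    """Folded form: Op0 + (X << log2(-MulC)); -MulC must be a power of two."""
    neg = (-mul_c) & MASK32
    if neg == 0 or neg & (neg - 1) != 0:
        raise ValueError("-MulC must be a power of two")
    shift = neg.bit_length() - 1
    return (op0 + (x << shift)) & MASK32


def shl_of_srl(x: int, c1: int, c2: int) -> int:
    """Original form: (shl (srl x, c1), c2) on an unsigned 32-bit x."""
    return ((x >> c1) << c2) & MASK32


def and_of_shift(x: int, c1: int, c2: int) -> int:
    """Folded form: one shift by the net amount, then a mask."""
    mask = ((MASK32 >> c1) << c2) & MASK32
    if c2 >= c1:
        shifted = (x << (c2 - c1)) & MASK32
    else:
        shifted = x >> (c1 - c2)
    return shifted & mask
```

For any 32-bit inputs, `sub_of_mul(a, x, -8)` and `add_of_shl(a, x, -8)` should agree, as should `shl_of_srl(x, c1, c2)` and `and_of_shift(x, c1, c2)` for shift amounts below 32.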