llvm-project

Commit Graph

Author	SHA1	Message	Date
Konstantin Schwarz	90d5298759	[GlobalISel] Add convenience constructors to MemDesc This allows constructing a MemDesc from a MachineMemoryOperand, a pattern that starts to show up more frequently. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D109161	2021-09-03 12:52:18 +02:00
Chen Zheng	34badc409c	Revert "[HardwareLoops] Change order of SCEV expression construction for InitLoopCount." This causes https://bugs.llvm.org/show_bug.cgi?id=51714 and is not a right patch according to comments in D91724 This reverts commit `42eaf4fe0a`.	2021-09-03 02:55:43 +00:00
Jessica Paquette	844d8e0337	[GlobalISel] Combine icmp eq/ne x, 0/1 -> x when x == 0 or 1 This adds the following combines: ``` x = ... 0 or 1 c = icmp eq x, 1 -> c = x ``` and ``` x = ... 0 or 1 c = icmp ne x, 0 -> c = x ``` When the target's true value for the relevant types is 1. This showed up in the following situation: https://godbolt.org/z/M5jKexWTW SDAG currently supports the `ne` case, but not the `eq` case. This can probably be further generalized, but I don't feel like thinking that hard right now. This gives some minor code size improvements across the board on CTMark at -Os for AArch64. (0.1% for 7zip and pairlocalalign in particular.) Differential Revision: https://reviews.llvm.org/D109130	2021-09-02 15:05:31 -07:00
Heejin Ahn	28780e59f6	[WebAssembly] Add Wasm SjLj support This add support for SjLj using Wasm exception handling instructions: https://github.com/WebAssembly/exception-handling/blob/master/proposals/exception-handling/Exceptions.md This does not yet support the mixed use of EH and SjLj within a function. It will be added in a follow-up CL. This currently passes all SjLj Emscripten tests for wasm0/1/2/3/s, except for the below: - `test_longjmp_standalone`: Uses Node - `test_dlfcn_longjmp`: Uses NodeRAWFS - `test_longjmp_throw`: Mixes EH and SjLj - `test_exceptions_longjmp1`: Mixes EH and SjLj - `test_exceptions_longjmp2`: Mixes EH and SjLj - `test_exceptions_longjmp3`: Mixes EH and SjLj Reviewed By: dschuff, tlively Differential Revision: https://reviews.llvm.org/D108960	2021-09-02 10:51:02 -07:00
David Green	9cb8f4d1ad	[ARM] Add a tail-predication loop predicate register The semantics of tail predication loops means that the value of LR as an instruction is executed determines the predicate. In other words: mov r3, #3 DLSTP lr, r3 // Start tail predication, lr==3 VADD.s32 q0, q1, q2 // Lanes 0,1 and 2 are updated in q0. mov lr, #1 VADD.s32 q0, q1, q2 // Only first lane is updated. This means that the value of lr cannot be spilled and re-used in tail predication regions without potentially altering the behaviour of the program. More lanes than required could be stored, for example, and in the case of a gather those lanes might not have been setup, leading to alignment exceptions. This patch adds a new lr predicate operand to MVE instructions in order to keep a reference to the lr that they use as a tail predicate. It will usually hold the zeroreg meaning not predicated, being set to the LR phi value in the MVETPAndVPTOptimisationsPass. This will prevent it from being spilled anywhere that it needs to be used. A lot of tests needed updating. Differential Revision: https://reviews.llvm.org/D107638	2021-09-02 13:42:58 +01:00
Roman Lebedev	3f1f08f0ed	Revert @llvm.isnan intrinsic patchset. Please refer to https://lists.llvm.org/pipermail/llvm-dev/2021-September/152440.html (and that whole thread.) TLDR: the original patch had no prior RFC, yet it had some changes that really need a proper RFC discussion. It won't be productive to discuss such an RFC, once it's actually posted, while said patch is already committed, because that introduces bias towards already-committed stuff, and the tree is potentially in broken state meanwhile. While the end result of discussion may lead back to the current design, it may also not lead to the current design. Therefore i take it upon myself to revert the tree back to last known good state. This reverts commit `4c4093e6e3`. This reverts commit `0a2b1ba33a`. This reverts commit `d9873711cb`. This reverts commit `791006fb8c`. This reverts commit `c22b64ef66`. This reverts commit `72ebcd3198`. This reverts commit `5fa6039a5f`. This reverts commit `9efda541bf`. This reverts commit `94d3ff09cf`.	2021-09-02 13:53:56 +03:00
Fraser Cormack	ef78f2106c	[LegalizeTypes][VP] Add splitting support for binary VP ops This patch extends D107904's introduction of vector-predicated (VP) operation legalization to include vector splitting. When the result of a binary VP operation needs splitting, all of its operands are split in kind. The two operands and the mask are split as usual, and the vector-length parameter EVL is "split" such that the low and high halves each execute the correct number of elements. Tests have been added to the RISC-V target to show splitting several scenarios for fixed- and scalable-vector types. Without support for `umax` (e.g. in the `B` extension) the generated code starts to branch. Ideally a cost model would prevent their insertion in the first place. Through these tests many opportunities for better codegen can be seen: combining known-undef VP operations and for constant-folding operations on `ISD::VSCALE`, to name but a few. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D107957	2021-09-02 10:15:53 +01:00
Abinav Puthan Purayil	0baace5379	[DAGCombine] Add node level checks for fp-contract and fp-ninf in visitFMULForFMADistributiveCombine(). Differential Revision: https://reviews.llvm.org/D107551	2021-09-02 11:33:14 +05:30
Roman Lebedev	f5753125f0	[Codegen][TLI][X86] SimplifyMultipleUseDemandedBits(): 0'th vec subreg widening is free, try to perform it earlier I believe, the profitability reasoning here is correct "sub"reg is already located within the 0'th subreg of wider reg, so if we have suvector insertion at index 0 into undef, then it's always free do to. After this, D109065 finally avoids the regression in D108382. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D109074	2021-09-02 00:54:05 +03:00
Arthur Eubanks	52e6d70c40	[NFC] Use newly introduced *AtIndex methods Introduced in D108788. These are clearer.	2021-09-01 11:18:41 -07:00
Fraser Cormack	85fd44d7fe	[SelectionDAG][NFC] Fix typo in assertion message s/Uexpected/Unexpected.	2021-09-01 08:55:06 +01:00
Yonghong Song	89424a829f	[DWARF] Support new TAG DW_TAG_LLVM_annotation A new LLVM specific TAG DW_TAG_LLVM_annotation is added. The name is suggested by Paul Robinson ([1]). Currently, this tag is used to output __attribute__((btf_tag("string"))) annotations in dwarf. The following is an example for a global variable with two btf_tag attributes: 0x0000002a: DW_TAG_variable DW_AT_name ("g1") DW_AT_type (0x00000052 "int") DW_AT_external (true) DW_AT_decl_file ("/tmp/home/yhs/work/tests/llvm/btf_tag/t.c") DW_AT_decl_line (8) DW_AT_location (DW_OP_addr 0x0) 0x0000003f: DW_TAG_LLVM_annotation DW_AT_name ("btf_tag") DW_AT_const_value ("tag1") 0x00000048: DW_TAG_LLVM_annotation DW_AT_name ("btf_tag") DW_AT_const_value ("tag2") 0x00000051: NULL In the future, DW_TAG_LLVM_annotation may encode other type of non-string const value. [1] https://lists.llvm.org/pipermail/llvm-dev/2021-June/151250.html Differential Revision: https://reviews.llvm.org/D106621	2021-08-31 19:22:17 -07:00
Stanislav Mekhanoshin	d170945bb2	[RegAlloc] Immediately delete dead instructions with live uses When RA eliminated a dead def it can either immediately delete the instruction itself or replace it with KILL to defer the actual removal. If this instruction has a virtual register use killing the register it will shrink the LI of the use. However, if the LI covers the instruction and extends beyond it the shrink will not happen. In fact that is impossible to shrink such use because of the KILL still using it. If later the LI of the use will be split at the KILL and the KILL itself is eliminated after that point the new live segment ends up at an invalid slot index. This extremely rare condition was hit after D106408 which has enabled rematerialization of such instructions. The replacement with KILL is only done for rematerialized defs which became dead and such rematerialization did not generally happen before. The patch deletes an instruction immediately if it is a result of rematerialization and has such use. An alternative would be to prohibit a split at a KILL instruction, but it looks like it is better to split a live range rather then keeping a killed instruction just in case it can be rematerialized further. Fixes PR51655. Differential Revision: https://reviews.llvm.org/D108951	2021-08-31 13:46:00 -07:00
Jessica Paquette	94d3ff09cf	[GlobalISel] Don't use G_FPTOSI in G_ISNAN legalization As noted in the comments in D108227, using G_FPTOSI produces wrong results for G_ISNAN. Drop the G_FPTOSI and perform the operation on integer types. Elsewhere in LLVM, a bitcast would be the appropriate choice (as it is in SDAG). GlobalISel does not distinguish between integer and FP types, so a bitcast would be meaningless here.	2021-08-31 10:26:42 -07:00
Hussain Kadhem	524ded7d01	[VP] implementation of sdag support for VP memory intrinsics Followup to D99355: SDAG support for vector-predicated load/store/gather/scatter. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D105871	2021-08-31 17:01:50 +02:00
Nemanja Ivanovic	84d4ed1761	Revert "[DebugInfo] Emit DW_TAG_namelist and DW_TAG_namelist_item" This reverts commit `0a6fad754e`. It caused failures on a number of PowerPC bots.	2021-08-31 09:24:50 -05:00
Craig Topper	201f6446da	[LegalizeTypes][X86] Improve ExpandIntRes_FP_TO_SINT/ExpandIntRes_FP_TO_UINT when input is SoftPromoteHalf. Instead of splitting off the fp16 to float conversion and generating a libcall, we should split the operation into fp16 to float and float to integer operations. This will allow the float to integer conversion to go through any custom handling the target has. If the target doesn't have custom handling then we should come back to ExpandIntRes_FP_TO_SINT/ ExpandIntRes_FP_TO_UINT automatically to create the libcall. This avoids generating libcalls on 32-bit X86. These library functions may not exist in 32-bit libgcc. At least for LLVM, we never generate them when hardware floating point instructions are available. Differential Revision: https://reviews.llvm.org/D108933	2021-08-30 13:12:59 -07:00
Bjorn Pettersson	789f01283d	[SelectionDAG] Fix miscompile bugs related to smul.fix.sat with scale zero When expanding a SMULFIXSAT ISD node (usually originating from a smul.fix.sat intrinsic) we've applied some optimizations for the special case when the scale is zero. The idea has been that it would be cheaper to use an SMULO instruction (if legal) to perform the multiplication and at the same time detect any overflow. And in case of overflow we could use some SELECT:s to replace the result with the saturated min/max value. The only tricky part is to know if we overflowed on the min or max value, i.e. if the product is positive or negative. Unfortunately the implementation has been incorrect as it has looked at the product returned by the SMULO to determine the sign of the product. In case of overflow that product is truncated and won't give us the correct sign bit. This patch is adding an extra XOR of the multiplication operands, which is used to determine the sign of the non truncated product. This patch fixes PR51677. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D108938	2021-08-30 22:08:26 +02:00
Chih-Ping Chen	070090cfa5	[DebugInfo] Remove the restriction on the size of DIStringType in DebugHandlerBase::isUnsignedDIType. Differential Revision: https://reviews.llvm.org/D108559	2021-08-30 15:36:54 -04:00
Nikita Popov	0529e2e018	[InstrInfo] Use 64-bit immediates for analyzeCompare() (NFCI) The backend generally uses 64-bit immediates (e.g. what MachineOperand::getImm() returns), so use that for analyzeCompare() and optimizeCompareInst() as well. This avoids truncation for targets that support immediates larger 32-bit. In particular, we can avoid the bugprone value normalization hack in the AArch64 target. This is a followup to D108076. Differential Revision: https://reviews.llvm.org/D108875	2021-08-30 19:46:04 +02:00
Hongtao Yu	f39256e3a5	[CSSPGO] Avoid repeatedly computing md5 hash code for pseudo probe inline contexts. Md5 hashing is expansive. Using a hash map to look up already computed GUID for dwarf names. Saw a 2% build time improvement on an internal large application. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D108722	2021-08-30 10:11:47 -07:00
Kazu Hirata	c50faffb4e	[llvm] Remove redundant calls to str() and c_str() (NFC) Identified with readability-redundant-string-cstr.	2021-08-30 09:05:05 -07:00
Craig Topper	705d005781	[DAGCombiner][RISCV] Don't use vector types in DAGCombiner::tryStoreMergeOfLoads if we need a rotate. The check for whether a rotate is possible occurs before the memory legality checks for the integer type. So it's possible we decide we can use a rotate, but then fail the legality checks. If that happens we should not fall back to a vector type. This triggers an assertion in the rotate handling when it finds a vector type instead of an integer type. In theory we could use a shufflevector in place of the rotate, but right now I'd just like to fix the crash. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D108839	2021-08-30 08:47:15 -07:00
Djordje Todorovic	86f5288eae	[LiveDebugValues] Cleanup Transfers when removing Entry Value If we encounter a new debug value, describing the same parameter, we should stop tracking the parameter's Entry Value. At that point, in some cases, the Transfer which uses the parameter's Entry Value, is already emitted. Thanks to the RemoveRedundantDebugValues pass, many problems with incorrect instruction order and number of DBG_VALUEs are fixed. However, we still cannot rely on the rule that each new debug value is set by the previous non-debug instruction in Machine Basic Block. When new parameter debug value triggers removal of Backup Entry Value for the same parameter, do the cleanup of Transfers emitted from Backup Entry Values. Get the Transfer Instruction which created the new debug value and search for debug values already emitted from the to-be-deleted Backup Entry Value and attached to the Transfer Instruction. If found, delete the Transfer and remove "primary" Entry Value Var Loc from OpenRanges. This patch fixes PR47628. Patch by Nikola Tesic. Differential revision: https://reviews.llvm.org/D106856	2021-08-30 14:00:41 +02:00
Simon Pilgrim	7c25a32840	Fix MSVC "signed/unsigned mismatch" comparison warning. NFCI.	2021-08-30 12:11:09 +01:00
“bhkumarn”	0a6fad754e	[DebugInfo] Emit DW_TAG_namelist and DW_TAG_namelist_item This patch emits DW_TAG_namelist and DW_TAG_namelist_item for fortran namelist variables. DICompositeType is extended to support this fortran feature. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D108553	2021-08-30 13:40:39 +05:30
Matt Arsenault	1494298b51	GlobalISel: Remove check for empty functions as these are invalid IR	2021-08-27 09:27:06 -04:00
Carl Ritson	5d9de3ea18	[DAGCombine] Allow FMA combine with both FMA and FMAD Without this change only the preferred fusion opcode is tested when attempting to combine FMA operations. If both FMA and FMAD are available then FMA ops formed prior to legalization will not be merged post legalization as FMAD becomes the preferred fusion opcode. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D108619	2021-08-27 19:49:35 +09:00
Matt Arsenault	3fdcd9bb13	GlobalISel: Add CallBase to CallLoweringInfo The DAG version has this, and is necessary for call lowering to take advantage of any attributes at the call site.	2021-08-26 21:09:11 -04:00
Craig Topper	8bb24289f3	[SelectionDAG] Optimize bitreverse expansion to minimize the number of mask constants. We can halve the number of mask constants by masking before shl and after srl. This can reduce the number of mov immediate or constant materializations. Or reduce the number of constant pool loads for X86 vectors. I think we might be able to do something similar for bswap. I'll look at it next. Differential Revision: https://reviews.llvm.org/D108738	2021-08-26 09:33:24 -07:00
Andrew Wei	c9066c5d37	[CGP] Fix the crash for combining address mode when having cyclic dependency In the combination of addressing modes, when replacing the matched phi nodes, sometimes the phi node to be replaced has been modified. For example, there’s matcher set [A, B] and [C, A], which will have cyclic dependency: A is replaced by B and C will be replaced by A. Because we tried to match new phi node to another new phi node, we should ignore new phi nodes when mapping new phi node to old one. Reviewed By: skatkov Differential Revision: https://reviews.llvm.org/D108635	2021-08-26 22:52:42 +08:00
Jay Foad	985eb25546	[MachineScheduler] Fix tracing Consistently print a newline before "RegionInstrs:".	2021-08-26 09:27:01 +01:00
Heejin Ahn	2f88a30ca6	[WebAssembly] Extract longjmp handling in EmSjLj to a function (NFC) Emscripten SjLj and (soon-to-be-added) Wasm SjLj transformation share many steps: 1. Initialize `setjmpTable` and `setjmpTableSize` in the entry BB 2. Handle `setjmp` callsites 3. Handle `longjmp` callsites 4. Cleanup and update SSA 1, 3, and 4 are identical for Emscripten SjLj and Wasm SjLj. Only the step 2 is different. This CL extracts the current Emscripten SjLj's longjmp callsites handling into a function. The reason to make this a separate CL is, without this, the diff tool cannot compare things well in the presence of moved code and added code in the followup Wasm SjLj CL, and it ends up mixing them together, making the diff unreadable. Also fixes some typos and variable names. So far we've been calling the buffer argument to `setjmp` and `longjmp` `jmpbuf`, but the name used in the man page for those functions is `env`, so updated them to be consistent. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D108728	2021-08-25 15:45:38 -07:00
Heejin Ahn	c2c9a3fd9c	[WebAssembly] Rename wasm.catch.exn intrinsic back to wasm.catch The plan was to use `wasm.catch.exn` intrinsic to catch exceptions and add `wasm.catch.longjmp` intrinsic, that returns two values (setjmp buffer and return value), later to catch longjmps. But because we decided not to use multivalue support at the moment, we are going to use one intrinsic that returns a single value for both exceptions and longjmps. And even if it's not for that, I now think the naming of `wasm.catch.exn` is a little weird, because the intrinsic can still take a tag immediate, which means it can be used for anything, not only exceptions, as long as that returns a single value. This partially reverts D107405. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D108683	2021-08-25 14:19:22 -07:00
Sanjay Patel	e728d1a3e8	[DAGCombiner] create binop nodes with all of expected values This is another bug exposed by https://llvm.org/PR51612 (and the one that triggered the initial assertion) in the report. That example was suppressed with: `985b48f183` ...but these would still crash because we created nodes like UADDO without the expected 2 output values.	2021-08-25 16:14:22 -04:00
Sanjay Patel	985b48f183	[DAGCombiner] check uses more strictly on select-of-binop fold There are 2 bugs here: 1. We were not checking uses of operand 2 (the false value of the select). 2. We were not checking for multiple uses of nodes that produce >1 result. Correcting those is enough to avoid the crash in the reduced test based on: https://llvm.org/PR51612 The additional use check on operand 0 (the condition value of the select) should not strictly be necessary because we are only replacing one use with another (whether it makes performance sense to do the transform with that pattern is not clear). But as noted in the TODO, changing that uncovers another bug. Note: there's at least one more bug here - we aren't propagating EVTs correctly, but I plan to fix that in another patch.	2021-08-25 14:14:41 -04:00
Nick Desaulniers	846e562dcc	[Clang] add support for error+warning fn attrs Add support for the GNU C style __attribute__((error(""))) and __attribute__((warning(""))). These attributes are meant to be put on declarations of functions whom should not be called. They are frequently used to provide compile time diagnostics similar to _Static_assert, but which may rely on non-ICE conditions (ie. relying on compiler optimizations). This is also similar to diagnose_if function attribute, but can diagnose after optimizations have been run. While users may instead simply call undefined functions in such cases to get a linkage failure from the linker, these provide a much more ergonomic and actionable diagnostic to users and do so at compile time rather than at link time. Users instead may be able use inline asm .err directives. These are used throughout the Linux kernel in its implementation of BUILD_BUG and BUILD_BUG_ON macros. These macros generally cannot be converted to use _Static_assert because many of the parameters are not ICEs. The Linux kernel still needs to be modified to make use of these when building with Clang; I have a patch that does so I will send once this feature is landed. To do so, we create a new IR level Function attribute, "dontcall" (both error and warning boil down to one IR Fn Attr). Then, similar to calls to inline asm, we attach a !srcloc Metadata node to call sites of such attributed callees. The backend diagnoses these during instruction selection, while we still know that a call is a call (vs say a JMP that's a tail call) in an arch agnostic manner. The frontend then reconstructs the SourceLocation from that Metadata, and determines whether to emit an error or warning based on the callee's attribute. Link: https://bugs.llvm.org/show_bug.cgi?id=16428 Link: https://github.com/ClangBuiltLinux/linux/issues/1173 Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D106030	2021-08-25 10:34:18 -07:00
Jeremy Morse	0116ed0069	[DebugInfo][InstrRef] Don't use instr-ref for unoptimised functions InstrRefBasedLDV is marginally slower than VarlocBasedLDV when analysing optimised code -- however, it's much slower when analysing code compiled -O0. To avoid this: don't use instruction referencing for -O0 functions. In the "pure" case of unoptimised code, this won't really harm the debugging experience because most variables won't have been promoted off the stack, so can't go missing. It becomes more complicated when optimised code is inlined into functions marked optnone; however these are rare, and as -O0 doesn't run many optimisations there should be little damage to the debug experience as a result. I've taken the opportunity to refactor testing for instruction-referencing into a MachineFunction method, which seems the most appropriate place to put it. Differential Revision: https://reviews.llvm.org/D108585	2021-08-25 15:10:36 +01:00
Peilin Guo	4c4dbeeeea	[DAGCombine] Check the legality of the index of EXTRACT_SUBVECTOR For ISD::EXTRACT_SUBVECTOR, its second operand must be a constant multiple of the known-minimum vector length of the result type. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D107795	2021-08-25 19:33:39 +08:00
Jeremy Morse	cc1e87bf55	[DebugInfo][InstrRef] Avoid stack-slot-coloring changing codegen due to DI Stack slot colouring adds "weight" to slots if a non-dbg-value instruction refers to it. This, unfortunately, means that DBG_PHI instructions can have an effect on codegen. The fix is very simple, replace isDebugValue with isDebugInstr. The regression test contains a scenario that reproduces this problem; I've represented both normal-debug mode and instr-ref debug mode instructions in comment lines prefixed with AAAAAA and BBBBBB, and un-comment them with sed to test that the two different modes produce the same behaviour. Differential Revision: https://reviews.llvm.org/D108627	2021-08-25 12:04:59 +01:00
Konstantin Schwarz	4b4bc1ea16	[GlobalISel] Do not generate illegal G_SEXTLOADs after legalization The sext_inreg_of_load combine did not have the isLegalOrBeforeLegalizer check, leading to the generation of potentially illegal G_SEXTLOADs when run after legalization. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D108626	2021-08-25 10:13:39 +02:00
Vang Thao	549f6a819a	[MachineCopyPropagation] Check CrossCopyRegClass for cross-class copys On some AMDGPU subtargets, copying to and from AGPR registers using another AGPR register is not possible. A intermediate VGPR register is needed for AGPR to AGPR copy. This is an issue when machine copy propagation forwards a COPY $agpr, replacing a COPY $vgpr which results in $agpr = COPY $agpr. It is removing a cross class copy that may have been optimized by previous passes and potentially creating an unoptimized cross class copy later on. To avoid this issue, check CrossCopyRegClass if a different register class will be needed for the copy. If so then avoid forwarding the copy when the destination does not match the desired register class and if the original copy already matches the desired register class. Issue seen while attempting to optimize another AGPR to AGPR issue: Live-ins: $agpr0 $vgpr0 = COPY $agpr0 $agpr1 = V_ACCVGPR_WRITE_B32 $vgpr0 $agpr2 = COPY $vgpr0 $agpr3 = COPY $vgpr0 $agpr4 = COPY $vgpr0 After machine-cp: $vgpr0 = COPY $agpr0 $agpr1 = V_ACCVGPR_WRITE_B32 $vgpr0 $agpr2 = COPY $agpr0 $agpr3 = COPY $agpr0 $agpr4 = COPY $agpr0 Machine-cp propagated COPY $agpr0 to replace $vgpr0 creating 3 AGPR to AGPR copys. Later this creates a cross-register copy from AGPR->VGPR->AGPR for each copy when the prior VGPR->AGPR copy was already optimal. Reviewed By: lkail, rampitec Differential Revision: https://reviews.llvm.org/D108011	2021-08-24 21:22:36 -07:00
Stanislav Mekhanoshin	92c1fd19ab	Allow rematerialization of virtual reg uses Currently isReallyTriviallyReMaterializableGeneric() implementation prevents rematerialization on any virtual register use on the grounds that is not a trivial rematerialization and that we do not want to extend liveranges. It appears that LRE logic does not attempt to extend a liverange of a source register for rematerialization so that is not an issue. That is checked in the LiveRangeEdit::allUsesAvailableAt(). The only non-trivial aspect of it is accounting for tied-defs which normally represent a read-modify-write operation and not rematerializable. The test for a tied-def situation already exists in the /CodeGen/AMDGPU/remat-vop.mir, test_no_remat_v_cvt_f32_i32_sdwa_dst_unused_preserve. The change has affected ARM/Thumb, Mips, RISCV, and x86. For the targets where I more or less understand the asm it seems to reduce spilling (as expected) or be neutral. However, it needs a review by all targets' specialists. Differential Revision: https://reviews.llvm.org/D106408	2021-08-24 11:09:02 -07:00
Simon Pilgrim	194b08000c	[DAG] LoadedSlice::canMergeExpensiveCrossRegisterBankCopy - replace getABITypeAlign with allowsMemoryAccess (PR45116) One of the cases identified in PR45116 - we don't need to limit load combines to ABI alignment, we can use allowsMemoryAccess - which tests using getABITypeAlign, but also checks if a target permits (fast) misaligned memory loads by checking allowsMisalignedMemoryAccesses as a fallback.	2021-08-24 15:28:30 +01:00
Simon Pilgrim	6de0b55188	[DAG] TransformFPLoadStorePair - replace getABITypeAlign with allowsMemoryAccess (PR45116) One of the cases identified in PR45116 - we don't need to limit load combines (in this case for fp->int load/store copies) to ABI alignment, we can use allowsMemoryAccess - which tests using getABITypeAlign, but also checks if a target permits (fast) misaligned memory loads by checking allowsMisalignedMemoryAccesses as a fallback. Differential Revision: https://reviews.llvm.org/D108318	2021-08-24 13:11:27 +01:00
Simon Pilgrim	e431b280c9	[DAG] CombineConsecutiveLoads - replace getABITypeAlign with allowsMemoryAccess (PR45116) One of the cases identified in PR45116 - we don't need to limit load combines (in this case for ISD::BUILD_PAIR) to ABI alignment, we can use allowsMemoryAccess - which tests using getABITypeAlign, but also checks if a target permits (fast) misaligned memory loads by checking allowsMisalignedMemoryAccesses as a fallback. This helps in particular for 32-bit X86 cases loading 64-bit size data, reducing codegen diffs vs x86_64. Differential Revision: https://reviews.llvm.org/D108307	2021-08-24 12:31:22 +01:00
Stanislav Mekhanoshin	401a45c61b	Fix late rematerialization operands check D106408 enables rematerialization of instructions with virtual register uses. That has uncovered the bug in the allUsesAvailableAt implementation: https://bugs.llvm.org/show_bug.cgi?id=51516. In the majority of cases canRematerializeAt() called to check if an instruction can be rematerialized before the given UseIdx. However, SplitEditor::enterIntvAtEnd() calls it to rematerialize an instruction at the end of a block passing LIS.getMBBEndIdx() into the check. In the testcase from the bug it has attempted to rematerialize ADDXri after STRXui in bb.17. The use operand %55 of the ADD is killed by the STRX but that is undetected by the check because it adjusts passed UseIdx to the reg slot, before the kill. The value is dead at the index passed to the check however. This change uses a later of passed UseIdx and its reg slot. This shall be correct because if are checking an availability of operands before an instruction that instruction cannot be the one defining these operands. If we are checking for late rematerialization we are really interested if operands live past the instruction. The bug is not exploitable without D106408 but needed to reland reverted D106408. Differential Revision: https://reviews.llvm.org/D108475	2021-08-23 12:23:58 -07:00
Jessica Paquette	6760e2a7bc	[GlobalISel] Translate @llvm.llround.* -> G_LLROUND Translate it using `IRTranslator::translateSimpleIntrinsic`. Differential Revision: https://reviews.llvm.org/D108563	2021-08-23 09:42:53 -07:00
Ben Shi	f69fb7ac72	[DAGCombiner] Add target hook function to decide folding (mul (add x, c1), c2) Reviewed by: lebedev.ri, spatel, craig.topper, luismarques, jrtc27 Differential Revision: https://reviews.llvm.org/D107711	2021-08-22 16:53:32 +08:00
Fangrui Song	1dfb30e54c	[TargetCallingConv] Change OutputArg ctor to match its members This avoids unneeded MVT->EVT conversion.	2021-08-21 16:41:48 -07:00
Jessica Paquette	af8e09d4bb	[GlobalISel] Add G_LLROUND Basically the same as G_LROUND. Handles the llvm.llround family of intrinsics. Also add a helper function to the MachineVerifier for checking if all of the (virtual register) operands of an instruction are scalars. Seems like a useful thing to have. Differential Revision: https://reviews.llvm.org/D108429	2021-08-20 14:07:21 -07:00
Daniel Paoliello	8ecce69594	Fix SEH table addresses for Windows Issue Details: The addresses for SEH tables for Windows are incorrect as 1 was unconditionally being added to all addresses. +1 is required for the SEH end address (as it is exclusive), but the SEH start addresses is inclusive and so should be used as-is. In the IP2State tables, the addresses are +1 for AMD64 to handle the return address for a call being after the actual call instruction but are as-is for ARM and ARM64 as the `StateFromIp` function in the VC runtime automatically takes this into account and adjusts the address that is it looking up. Fix Details: * Split the `getLabel` function into two: `getLabel` (used for the SEH start address and ARM+ARM64 IP2State addresses) and `getLabelPlusOne` (for the SEH end address, and AMD64 IP2State addresses). Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D107784	2021-08-20 22:32:12 +03:00
Craig Topper	10020d41ee	[TypePromotion] Remove unused IRBuilder object. NFC	2021-08-20 12:20:09 -07:00
Jeremy Morse	ce8254d096	[DebugInfo][InstrRef] Correctly ignore DBG_VALUE_LIST in InstrRef mode This patch makes InstrRefBasedLDV "safe" to work with DBG_VALUE_LISTs. It doesn't actually interpret them, but it recognises that they specify variable locations and avoids propagating false locations, which is better than the current state. Observe the attached tes * We avoid propagating DBG_VALUE_LISTs into successor blocks, as they're not "currently" supported, * We don't propagate other variable locations across DBG_VALUE_LISTs, because we know that the variable location is terminated by the DBG_VALUE_LIST. Differential Revision: https://reviews.llvm.org/D108143	2021-08-20 14:51:02 +01:00
Jeremy Morse	c76c24e40b	[DebugInfo][InstrRef] Remove a faulty assertion This patch removes an assertion, and adds a regression test showing why the assertion is broken. For context, LocIdx is a key/index number for machine locations, so that we can describe locations as a single integer and ignore whether they're on the stack, in registers or otherwise. Back when InstrRefBasedLDV was added, I happened to bake in a "special" zero number for various reasons, which Vedant identified as undesirable in this review comment: https://reviews.llvm.org/D83047#inline-765495 . I subsequently removed that special zero number, but it looks like I didn't delete this assertion at the time, which assumes that a zero LocIdx is invalid. The attached test shows that this assertion is reachable on valid code -- on x86 $rsp always gets the LocIdx number zero, and if you transfer a variable value into it, InstrRefBasedLDV crashes on that assertion. The code might be a bit wild to be storing variables to $rsp like that, however we shouldn't crash on it. Differential Revision: https://reviews.llvm.org/D108134	2021-08-20 14:23:32 +01:00
Anshil Gandhi	508b06699a	[Remarks] [AMDGPU] Emit optimization remarks for atomics generating hardware instructions Produce remarks when atomic instructions are expanded into hardware instructions in SIISelLowering.cpp. Currently, these remarks are only emitted for atomic fadd instructions. Differential Revision: https://reviews.llvm.org/D108150	2021-08-19 20:51:19 -06:00
Jessica Paquette	3207ed196c	[GlobalISel] Add IRTranslator support for @llvm.lround.* -> G_LROUND Translate the `@llvm.lround.*` family to G_LROUND via `IRTranslator::translateSimpleIntrinsic`. Differential Revision: https://reviews.llvm.org/D108418	2021-08-19 17:08:08 -07:00
Jessica Paquette	3118926483	[GlobalISel] Add a G_LROUND instruction Meant to represent the `@llvm.lround.*` family. Add the opcode, docs, and verification. Differential Revision: https://reviews.llvm.org/D108417	2021-08-19 17:06:24 -07:00
Amara Emerson	95ac3d15e9	[AArch64][GlobalISel] Add G_VECREDUCE fewerElements support for full scalarization. For some reductions like G_VECREDUCE_OR on AArch64, we need to scalarize completely if the source is <= 64b. This change adds support for that in the legalizer. If the source has a pow-2 num elements, then we can do a tree reduction using the scalar operation in the individual elements. Otherwise, we just create a sequential chain of operations. For AArch64, we only need to scalarize if the input is <64b. If it's great than 64b then we can first do a fewElements step to 64b, taking advantage of vector instructions until we reach the point of scalarization. I also had to relax the verifier checks for reductions because the intrinsics support <1 x EltTy> types, which we lower to scalars for GlobalISel. Differential Revision: https://reviews.llvm.org/D108276	2021-08-19 16:38:52 -07:00
Adrian Prantl	1e586bcc3e	Move function definition out-of-line to fix the modularized build (NFC)	2021-08-19 12:26:23 -07:00
Craig Topper	84cea602f9	Revert "[SelectionDAGBuilder] Compute and cache PreferredExtendType on demand." This reverts commit `add08c8741`. There was a compile time jump on tramp3d-v4 on https://llvm-compile-time-tracker.com/ Want to see if it goes away with this reverted.	2021-08-19 08:42:05 -07:00
David Green	d10f23a25d	[ISel] Expand saddsat and ssubsat via asr and xor This changes the lowering of saddsat and ssubsat so that instead of using: r,o = saddo x, y c = setcc r < 0 s = c ? INTMAX : INTMIN ret o ? s : r into using asr and xor to materialize the INTMAX/INTMIN constants: r,o = saddo x, y s = ashr r, BW-1 x = xor s, INTMIN ret o ? x : r https://alive2.llvm.org/ce/z/TYufgD This seems to reduce the instruction count in most testcases across most architectures. X86 has some custom lowering added to compensate for cases where it can increase instruction count. Differential Revision: https://reviews.llvm.org/D105853	2021-08-19 16:08:07 +01:00
Craig Topper	add08c8741	[SelectionDAGBuilder] Compute and cache PreferredExtendType on demand. Previously we pre-calculated this and cached it for every instruction in the function. Most of the calculated results will never be used. So instead calculate it only on the first use, and then cache it. The cache was originally added to fix a compile time issue which caused r216066 to be reverted. This change exposed that we weren't pre-computing the Value for Arguments. I've explicitly disabled that for now as it seemed to regress some tests on AArch64 which has sext built into its compare instructions. Spotted while investigating how to improve heuristics to work better with RISCV preferring sign extend for unsigned compares for i32 on RV64. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D107976	2021-08-19 07:18:33 -07:00
Craig Topper	c60a4c1ba5	[TypePromotion] Use Instruction* instead of Value* for a couple functions. NFC This matches how they are called and allows some isa/cast/dyn_cast to be removed. Differential Revision: https://reviews.llvm.org/D108333	2021-08-19 07:09:38 -07:00
Fraser Cormack	e6b1ac8546	[LegalizeTypes][VP] Add widening support for binary VP ops This patch adds the beginnings of more thorough support in the legalizers for vector-predicated (VP) operations. The first step is the ability to widen illegal vectors. The more complicated scenario in which the result/operands need widening but the mask doesn't has not been handled here. That would require a lot of code without an in-tree target on which to test it. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D107904	2021-08-19 13:08:47 +01:00
Rong Xu	5fdaaf7fd8	[SampleFDO] Flow Sensitive Sample FDO (FSAFDO) profile loader This patch implements Flow Sensitive Sample FDO (FSAFDO) profile loader. We have two profile loaders for FS profile, one before RegAlloc and one before BlockPlacement. To enable it, when -fprofile-sample-use=<profile> is specified, add "-enable-fs-discriminator=true \ -disable-ra-fsprofile-loader=false \ -disable-layout-fsprofile-loader=false" to turn on the FS profile loaders. Differential Revision: https://reviews.llvm.org/D107878	2021-08-18 18:37:35 -07:00
Kyungwoo Lee	829616c241	[NFC][DebugInfo] getDwarfCompileUnitID This is a refactoring for the use in https://reviews.llvm.org/D108261 Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D108271	2021-08-18 17:35:03 -07:00
Arthur Eubanks	2fc075948c	[NFC] Remove some unnecessary AttributeList methods These rely on methods I'm trying to cleanup.	2021-08-18 11:15:20 -07:00
Jessica Paquette	791006fb8c	[GlobalISel] Implement lowering for G_ISNAN + use it in AArch64 GlobalISel equivalent to `TargetLowering::expandISNAN`. Use it in AArch64 and add a testcase. Differential Revision: https://reviews.llvm.org/D108227	2021-08-18 10:54:25 -07:00
Jessica Paquette	d9873711cb	[GlobalISel] Add IRTranslator support for G_ISNAN Translate the `@llvm.isnan` intrinsic to G_ISNAN when we see it. This is pretty much the same as the associated SelectionDAGBuilder code. Main difference is that we don't expand it here. It makes more sense to do that during legalization in GlobalISel. GlobalISel will just legalize the generated illegal types. Differential Revision: https://reviews.llvm.org/D108226	2021-08-18 10:48:10 -07:00
Jessica Paquette	0a2b1ba33a	[GlobalISel] Add G_ISNAN Add a generic opcode equivalent to the `llvm.isnan` intrinsic + MachineVerifier support for it. We need an opcode here because we may want target-specific lowering later on. Differential Revision: https://reviews.llvm.org/D108222	2021-08-18 10:42:05 -07:00
Petr Hosek	2d4470ab89	Revert "Allow rematerialization of virtual reg uses" This reverts commit `877572cc19` which introduced PR51516.	2021-08-18 00:12:41 -07:00
Arthur Eubanks	3f4d00bc3b	[NFC] More get/removeAttribute() cleanup	2021-08-17 21:05:41 -07:00
Arthur Eubanks	de0ae9e89e	[NFC] Cleanup more AttributeList::addAttribute()	2021-08-17 21:05:41 -07:00
Qiu Chaofan	5ca250a03d	[RegAlloc] Remove addAllocPriorityToGlobalRanges hook It was introduced in `1a6dc92` and only enabled on PowerPC/AMDGPU. That should be enabled for all targets. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D108010	2021-08-18 10:21:27 +08:00
jacquesguan	a7ebc4d145	[DAGCombiner] Teach isKnownToBeAPowerOfTwo handle SPLAT_VECTOR Make DAGCombine turn mul by power of 2 into shl for scalable vector. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D107883	2021-08-18 10:10:40 +08:00
Wang, Pengfei	2379949aad	[X86] AVX512FP16 instructions enabling 3/6 Enable FP16 conversion instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105265	2021-08-18 09:03:41 +08:00
Simon Pilgrim	d7f288502f	SelectionDAGBuilder::visitInlineAsm - don't dereference dyn_cast<> results. dyn_cast<> can return nullptr if the cast is illegal, use cast<> instead which will assert that the cast is correct. Fixes static analyser warning.	2021-08-17 18:40:59 +01:00
Fraser Cormack	f3e9047249	[VP] Add vector-predicated reduction intrinsics This patch adds vector-predicated ("VP") reduction intrinsics corresponding to each of the existing unpredicated `llvm.vector.reduce.*` versions. Unlike the unpredicated reductions, all VP reductions have a start value. This start value is returned when the no vector element is active. Support for expansion on targets without native vector-predication support is included. This patch is based on the ["reduction slice"](https://reviews.llvm.org/D57504#1732277) of the LLVM-VP reference patch (https://reviews.llvm.org/D57504). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104308	2021-08-17 17:56:35 +01:00
Sebastian Neubauer	fbae34635d	[GlobalISel] Add combine for PTR_ADD with regbanks Combine two G_PTR_ADDs, but keep the register bank of the constant. That way, the combine can be used in post-regbank-select combines. Introduce two helper methods in CombinerHelper, getRegBank and setRegBank that get and set an optional register bank to a register. That way, they can be used before and after register bank selection. Differential Revision: https://reviews.llvm.org/D103326	2021-08-17 13:58:16 +02:00
Tiehu Zhang	9cfa9b44a5	[CodeGenPrepare] The instruction to be sunk should be inserted before its user in a block In current implementation, the instruction to be sunk will be inserted before the target instruction without considering the def-use tree, which may case Instruction does not dominate all uses error. We need to choose a suitable location to insert according to the use chain Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D107262	2021-08-17 18:58:15 +08:00
Jeremy Morse	708cbda577	[DebugInfo][InstrRef] Honour too-much-debug-info cutouts This reapplies `54a61c94f9`, its follow up in `547b712500`, which were reverted `95fe61e639`. Original commit message: VarLoc based LiveDebugValues will abandon variable location propagation if there are too many blocks and variable assignments in the function. If it didn't, and we had (say) 1000 blocks and 1000 variables in scope, we'd end up with 1 million DBG_VALUEs just at the start of blocks. Instruction-referencing LiveDebugValues should honour this limitation too (because the same limitation applies to it). Hoist the relevant command line options into LiveDebugValues.cpp and pass it down into the implementation classes as an argument to ExtendRanges. I've duplicated all the run-lines in live-debug-values-cutoffs.mir to have an instruction-referencing flavour. Differential Revision: https://reviews.llvm.org/D107823	2021-08-17 11:34:49 +01:00
Arthur Eubanks	0d822da2bd	[NFC] Remove/replace some confusing attribute getters on Function	2021-08-16 16:12:37 -07:00
Afanasyev Ivan	913b5d2f7a	[AsmPrinter] fix nullptr dereference for MBBs with hasAddressTaken property without BB Basic block pointer is dereferenced unconditionally for MBBs with hasAddressTaken property. MBBs might have hasAddressTaken property without reference to BB. Backend developers must assign fake BB to MBB to workaround this issue and it should be fixed. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D108092	2021-08-16 15:32:09 -07:00
Anshil Gandhi	f22ba51873	[Remarks] Emit optimization remarks for atomics generating CAS loop Implements ORE in AtomicExpand pass to report atomics generating a compare and swap loop. Differential Revision: https://reviews.llvm.org/D106891	2021-08-16 14:56:01 -06:00
Stanislav Mekhanoshin	877572cc19	Allow rematerialization of virtual reg uses Currently isReallyTriviallyReMaterializableGeneric() implementation prevents rematerialization on any virtual register use on the grounds that is not a trivial rematerialization and that we do not want to extend liveranges. It appears that LRE logic does not attempt to extend a liverange of a source register for rematerialization so that is not an issue. That is checked in the LiveRangeEdit::allUsesAvailableAt(). The only non-trivial aspect of it is accounting for tied-defs which normally represent a read-modify-write operation and not rematerializable. The test for a tied-def situation already exists in the /CodeGen/AMDGPU/remat-vop.mir, test_no_remat_v_cvt_f32_i32_sdwa_dst_unused_preserve. The change has affected ARM/Thumb, Mips, RISCV, and x86. For the targets where I more or less understand the asm it seems to reduce spilling (as expected) or be neutral. However, it needs a review by all targets' specialists. Differential Revision: https://reviews.llvm.org/D106408	2021-08-16 12:42:42 -07:00
Stanislav Mekhanoshin	b9e433b02a	Prevent machine licm if remattable with a vreg use Check if a remateralizable nstruction does not have any virtual register uses. Even though rematerializable RA might not actually rematerialize it in this scenario. In that case we do not want to hoist such instruction out of the loop in a believe RA will sink it back if needed. This already has impact on AMDGPU target which does not check for this condition in its isTriviallyReMaterializable implementation and have instructions with virtual register uses enabled. The other targets are not impacted at this point although will be when D106408 lands. Differential Revision: https://reviews.llvm.org/D107677	2021-08-16 12:09:00 -07:00
Craig Topper	92abb1cf90	[TypePromotion] Don't mutate the result type of SwitchInst. SwitchInst should have a void result type. Add a check to the verifier to catch this error. Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D108084	2021-08-16 08:54:34 -07:00
Simon Pilgrim	d6fe8d37c6	[DAG] Fold concat_vectors(concat_vectors(x,y),concat_vectors(a,b)) -> concat_vectors(x,y,a,b) Follow-up to D107068, attempt to fold nested concat_vectors/undefs, as long as both the vector and inner subvector types are legal. This exposed the same issue in ARM's MVE LowerCONCAT_VECTORS_i1 (raised as PR51365) and AArch64's performConcatVectorsCombine which both assumed concat_vectors only took 2 subvector operands. Differential Revision: https://reviews.llvm.org/D107597	2021-08-16 16:06:54 +01:00
Jeremy Morse	95fe61e639	Revert `54a61c94f9` and its follow up in `547b712500` These were part of D107823, however asan has found something excitingly wrong happening: https://lab.llvm.org/buildbot/#/builders/5/builds/10543/steps/13/logs/stdio	2021-08-16 15:48:56 +01:00
Jeremy Morse	547b712500	Suppress signedness-comparison warning This is a follow-up to `54a61c94f9`.	2021-08-16 15:29:43 +01:00
Jeremy Morse	54a61c94f9	[DebugInfo][InstrRef] Honour too-much-debug-info cutouts VarLoc based LiveDebugValues will abandon variable location propagation if there are too many blocks and variable assignments in the function. If it didn't, and we had (say) 1000 blocks and 1000 variables in scope, we'd end up with 1 million DBG_VALUEs just at the start of blocks. Instruction-referencing LiveDebugValues should honour this limitation too (because the same limitation applies to it). Hoist the relevant command line options into LiveDebugValues.cpp and pass it down into the implementation classes as an argument to ExtendRanges. I've duplicated all the run-lines in live-debug-values-cutoffs.mir to have an instruction-referencing flavour. Differential Revision: https://reviews.llvm.org/D107823	2021-08-16 15:06:40 +01:00
Paul Walker	cd0e196413	[DAGCombiner] Stop visitEXTRACT_SUBVECTOR creating illegal BITCASTs post legalisation. visitEXTRACT_SUBVECTOR can sometimes create illegal BITCASTs when removing "redundant" INSERT_SUBVECTOR operations. This patch adds an extra check to ensure such combines only occur after operation legalisation if any resulting BITBAST is itself legal. Differential Revision: https://reviews.llvm.org/D108086	2021-08-15 18:25:49 +01:00
Qiu Chaofan	a240b29f21	[NFC] Simply update a FIXME comment X86 overrided LowerOperationWrapper was moved to common implementation in `a7eae62`.	2021-08-15 22:43:46 +08:00
Dávid Bolvanský	49de6070a2	Revert "[Remarks] Emit optimization remarks for atomics generating CAS loop" This reverts commit `435785214f`. Still same compile time issues for -O0 -g, eg. +1.3% for sqlite3.	2021-08-15 11:44:13 +02:00
Anshil Gandhi	435785214f	[Remarks] Emit optimization remarks for atomics generating CAS loop Implements ORE in AtomicExpand pass to report atomics generating a compare and swap loop. Differential Revision: https://reviews.llvm.org/D106891	2021-08-14 23:37:23 -06:00
Anshil Gandhi	29e11a1aa3	Revert "[Remarks] Emit optimization remarks for atomics generating CAS loop" This reverts commit `c4e5425aa5`.	2021-08-13 23:58:04 -06:00
Anshil Gandhi	c4e5425aa5	[Remarks] Emit optimization remarks for atomics generating CAS loop Implements ORE in AtomicExpandPass to report atomics generating a compare and swap loop. Differential Revision: https://reviews.llvm.org/D106891	2021-08-13 22:44:08 -06:00
Jessica Paquette	50efbf9cbe	[GlobalISel] Narrow binops feeding into G_AND with a mask This is a fairly common pattern: ``` %mask = G_CONSTANT iN <mask val> %add = G_ADD %lhs, %rhs %and = G_AND %add, %mask ``` We have combines to eliminate G_AND with a mask that does nothing. If we combined the above to this: ``` %mask = G_CONSTANT iN <mask val> %narrow_lhs = G_TRUNC %lhs %narrow_rhs = G_TRUNC %rhs %narrow_add = G_ADD %narrow_lhs, %narrow_rhs %ext = G_ZEXT %narrow_add %and = G_AND %ext, %mask ``` We'd be able to take advantage of those combines using the trunc + zext. For this to work (or be beneficial in the best case) - The operation we want to narrow then widen must only be used by the G_AND - The G_TRUNC + G_ZEXT must be free - Performing the operation at a narrower width must not produce a different value than performing it at the original width after masking. Example comparison between SDAG + GISel: https://godbolt.org/z/63jzb1Yvj At -Os for AArch64, this is a 0.2% code size improvement on CTMark/pairlocalign. Differential Revision: https://reviews.llvm.org/D107929	2021-08-13 18:31:13 -07:00
Matt Arsenault	cc56152f83	GlobalISel: Add helper function for getting EVT from LLT This can only give an imperfect approximation, but is enough to avoid crashing in places where we call into EVT functions starting from LLTs.	2021-08-13 21:10:13 -04:00
Arthur Eubanks	f80ae58068	[NFC] Cleanup calls to AttributeList::getAttribute(FunctionIndex) getAttribute() is confusing, use a clearer method.	2021-08-13 16:27:11 -07:00
Arthur Eubanks	d7593ebaee	[NFC] Clean up users of AttributeList::hasAttribute() AttributeList::hasAttribute() is confusing, use clearer methods like hasParamAttr()/hasRetAttr(). Add hasRetAttr() since it was missing from AttributeList.	2021-08-13 11:59:18 -07:00
Arthur Eubanks	92ce6db9ee	[NFC] Rename AttributeList::hasFnAttribute() -> hasFnAttr() This is more consistent with similar methods.	2021-08-13 11:09:18 -07:00
Ruiling Song	e1beebbac5	SplitKit: Don't further split subrange mask in buildCopy We may use several COPY instructions to copy the needed sub-registers during split. But the way we split the lanes during the COPYs may be different from the subranges of the old register. This would fail when we extend the subranges of the new register because the LaneMasks do not match exactly between subranges of new register and old register. Since we are bundling the COPYs, I think there is no need to further refine the subranges of the new register based on the set of LaneMasks of the inserted COPYs. I am not sure if there will be further breaking cases. But as the subranges of new register are created based on the LaneMasks of the subranges of old register, it will be highly possible we will always find an exact LaneMask match. We can think about how to make the extendPHIKillRanges() work for subrange mask mismatch case if we meet more such cases in the future. The test case was from D105065 by @arsenm. Differential Revision: https://reviews.llvm.org/D107829	2021-08-13 07:36:38 +08:00
Rong Xu	4c5909ba83	[SampleFDO] Add two passes of MIRAddFSDiscriminatorsPass This patch adds Pass1 of MIRADDFSDiscriminatorsPass before register allocation, and Pass2 of MIRAddFSDiscriminatorsPass before Block-Placement. This is still under --enable-fs-discrmininator option (default false). This would reduce the turn-around time for FSAFDO transition. Differential Revision: https://reviews.llvm.org/D104579	2021-08-11 11:11:04 -07:00
Fraser Cormack	885be620f9	[LegalizeTypes][NFC] Remove else-after-return Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D107890	2021-08-11 16:48:28 +01:00
Rainer Orth	7bbbf29561	[ELF] Don't emit SHF_GNU_RETAIN on Solaris The introduction of `SHF_GNU_RETAIN` has caused massive problems on Solaris. Initially, as reported in Bug 49437, it caused dozens of testsuite failures on both sparc and x86. The objects were marked as `ELFOSABI_NONE`, but `SHF_GNU_RETAIN` is a GNU extension. In the native Solaris ABI, that flag (in the range for OS-specific values) is `SHF_SUNW_ABSENT` with a completely different semantics, which confuses Solaris `ld` very much. Later, the objects became (correctly) marked `ELFOSABI_GNU`, which Solaris `ld` doesn't support, causing it to SEGV and break the build. The linker is currently being hardened to not accept non-native OS ABIs to avoid this. The need for linker support is already documented in `clang/include/clang/Basic/AttrDocs.td`, but not currently checked. This patch avoids all this by not emitting `SHF_GNU_RETAIN` on Solaris at all. Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11`, and `x86_64-pc-linux-gnu`. Differential Revision: https://reviews.llvm.org/D107747	2021-08-11 09:27:51 +02:00
madhur13490	61526b1262	[DAG] Reword comment for EnforceNodeIdInvariant and InvalidateNodeId. NFC. Reviewed By: niravd Differential Revision: https://reviews.llvm.org/D107845	2021-08-11 12:14:28 +05:30
Craig Topper	a8ae41fb51	[SelectionDAGBuilder] Save iterator to avoid second DenseMap lookup. NFC We were calling find and then using operator[]. Instead keep the iterator from find and use it to get the value. Just happened to notice while investigating how we decide what extends to use between basic blocks.	2021-08-10 22:37:48 -07:00
Christopher Di Bella	c874dd5362	[llvm][clang][NFC] updates inline licence info Some files still contained the old University of Illinois Open Source Licence header. This patch replaces that with the Apache 2 with LLVM Exception licence. Differential Revision: https://reviews.llvm.org/D107528	2021-08-11 02:48:53 +00:00
Amara Emerson	7ec4ce157b	[AArch64][GlobalISel] Relax oneuse restriction for PTR_ADD chain combining to check addressing legality. With contributions by Sebastian Neubauer Differential Revision: https://reviews.llvm.org/D105676	2021-08-10 16:41:18 -07:00
Adrian Prantl	d6b6880172	Streamline the API of salvageDebugInfoImpl (NFC) This patch refactors / simplifies salvageDebugInfoImpl(). The goal here is to simplify the implementation of coro::salvageDebugInfo() in a followup patch. 1. Change the return value to I.getOperand(0). Currently users of salvageDebugInfoImpl() assume that the first operand is I.getOperand(0). This patch makes this information explicit. A nice side-effect of this change is that it allows us to salvage expressions such as add i8 1, %a in the future. 2. Factor out the creation of a DIExpression and return an array of DIExpression operations instead. This change allows users that call salvageDebugInfoImpl() in a loop to avoid the costly creation of temporary DIExpressions and to defer the creation of a DIExpression until the end. This patch does not change any functionality. rdar://80227769 Differential Revision: https://reviews.llvm.org/D107383	2021-08-10 15:21:18 -07:00
Jinsong Ji	2cfd427626	[AIX] Don't crash on unimplemented lowerRelativeReference We may call lowerRelativeReference in MC to determine whether target supports this lowering. We should return nullptr instead of crashing when we haven't implemented the real lowering. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D107830	2021-08-10 17:43:06 +00:00
Matt Arsenault	1b41945da0	RegAllocGreedy: Add spaces between registers in debug message	2021-08-10 13:12:34 -04:00
Konstantin Schwarz	64bef13f08	[GlobalISel] Look through truncs and extends in narrowScalarShift If a G_SHL is fed by a G_CONSTANT, the lower and upper bits of the source can be shifted individually by the constant shift amount. However in case the shift amount came from a G_TRUNC(G_CONSTANT), the generic shift legalization code was used, producing intermediate shifts that are potentially illegal on some targets. This change teaches narrowScalarShift to look through G_TRUNCs and G_*EXTs. Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D89100	2021-08-10 13:49:22 +02:00
Wang, Pengfei	6f7f5b54c8	[X86] AVX512FP16 instructions enabling 1/6 1. Enable FP16 type support and basic declarations used by following patches. 2. Enable new instructions VMOVW and VMOVSH. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105263	2021-08-10 12:46:01 +08:00
Jeremy Morse	d4ce9e463d	[DWARF] Revert sharing subprograms across CUs This patch is a revert of `e08f205f5c`. In that patch, DW_TAG_subprograms were permitted to be referenced across CU boundaries, to improve stack trace construction using call site information. Unfortunately, as documented in PR48790, the way that subprograms are "owned" by dwarf units is sufficiently complicated that subprograms end up in unexpected units, invalidating cross-unit references. There's no obvious way to easily fix this, and several attempts have failed. Revert this to ensure correct DWARF is always emitted. Three tests change in addition to the reversion, but they're all very light alterations. Differential Revision: https://reviews.llvm.org/D107076	2021-08-09 12:43:43 +01:00
Luo, Yuanke	53642d5b80	[NFC] Fix the formula for reciprocal calculation. Differential Revision: https://reviews.llvm.org/D107713	2021-08-09 16:03:56 +08:00
Amara Emerson	4c2e01232c	[GlobalISel] Fix a combine causing DBG_VALUE with dangling vregs. We should use MachineInstr::eraseFromParentAndMarkDBGValuesForRemoval() instead of eraseFromParent(). We should probably use that in other places too but fix this issue which affects clang bootstrap builds for now.	2021-08-07 01:41:02 -07:00
Nemanja Ivanovic	62fe3dcf98	Fix PPC buildbot break caused by `4c4093e6e3` This commit adds the isnan intrinsic and provides a default expansion for it in the SDAG. However, it makes the assumption that types it operates on are IEEE-compliant types. This is not always the case. An example of that is PPC "double double" which has a representation that - Does not need to conform to IEEE requirements for isnan as it is not an IEEE-compliant type - Does not have a representation that allows for straightforward reinterpreting as an integer and use of integer operations The result was that this commit broke __builtin_isnan for ppc_fp128 making many valid numeric values report a NaN. This patch simply changes the expansion to always expand to unordered comparison (regardless of whether FP exceptions are tracked). This is inline with previous semantics.	2021-08-06 22:10:20 -05:00
Amara Emerson	2b067e3335	Change TargetLowering::canMergeStoresTo() to take a MF instead of DAG. DAG is unnecessary and we need this hook to implement store merging on GlobalISel too.	2021-08-06 12:57:53 -07:00
Jon Roelofs	eae4a44c1d	[GlobalISel][KnownBits] Implement G_CTPOP Implementation copied almost verbatim from ValueTracking. Differential revision: https://reviews.llvm.org/D107606	2021-08-06 09:48:39 -07:00
Craig Topper	b2ca4dc935	[LegalizeTypes] Add a simple expansion for SMULO when a libcall isn't available. This isn't optimal, but prevents crashing when the libcall isn't available. It just calculates the full product and makes sure the high bits match the sign of the low half. Each of the pieces should go through their own type legalization. This can make D107420 unnecessary. Needs tests, but I wanted to start discussion about D107420. Reviewed By: FreddyYe Differential Revision: https://reviews.llvm.org/D107581	2021-08-06 09:43:01 -07:00
Kazu Hirata	276be84d0a	[CodeGen] Remove computeDefOperandLatency (NFC) The last use was removed on Oct 9, 2016 in commit `5c924d7117`.	2021-08-06 08:26:55 -07:00
Jay Foad	57b9107e3f	[GlobalISel] Improve widening of cttz/cttz_zero_undef Differential Revision: https://reviews.llvm.org/D107631	2021-08-06 14:25:56 +01:00
Jay Foad	cd2594e1c6	[GlobalISel] Improve legalization of narrow CTTZ Differential Revision: https://reviews.llvm.org/D107457	2021-08-06 09:40:48 +01:00
Serge Pavlov	4c4093e6e3	Introduce intrinsic llvm.isnan This is recommit of the patch `16ff91ebcc`, reverted in `0c28a7c990` because it had an error in call of getFastMathFlags (base type should be FPMathOperator but not Instruction). The original commit message is duplicated below: Clang has builtin function '__builtin_isnan', which implements C library function 'isnan'. This function now is implemented entirely in clang codegen, which expands the function into set of IR operations. There are three mechanisms by which the expansion can be made. * The most common mechanism is using an unordered comparison made by instruction 'fcmp uno'. This simple solution is target-independent and works well in most cases. It however is not suitable if floating point exceptions are tracked. Corresponding IEEE 754 operation and C function must never raise FP exception, even if the argument is a signaling NaN. Compare instructions usually does not have such property, they raise 'invalid' exception in such case. So this mechanism is unsuitable when exception behavior is strict. In particular it could result in unexpected trapping if argument is SNaN. * Another solution was implemented in https://reviews.llvm.org/D95948. It is used in the cases when raising FP exceptions by 'isnan' is not allowed. This solution implements 'isnan' using integer operations. It solves the problem of exceptions, but offers one solution for all targets, however some can do the check in more efficient way. * Solution implemented by https://reviews.llvm.org/D96568 introduced a hook 'clang::TargetCodeGenInfo::testFPKind', which injects target specific code into IR. Now only SystemZ implements this hook and it generates a call to target specific intrinsic function. Although these mechanisms allow to implement 'isnan' with enough efficiency, expanding 'isnan' in clang has drawbacks: * The operation 'isnan' is hidden behind generic integer operations or target-specific intrinsics. It complicates analysis and can prevent some optimizations. * IR can be created by tools other than clang, in this case treatment of 'isnan' has to be duplicated in that tool. Another issue with the current implementation of 'isnan' comes from the use of options '-ffast-math' or '-fno-honor-nans'. If such option is specified, 'fcmp uno' may be optimized to 'false'. It is valid optimization in general, but it results in 'isnan' always returning 'false'. For example, in some libc++ implementations the following code returns 'false': std::isnan(std::numeric_limits<float>::quiet_NaN()) The options '-ffast-math' and '-fno-honor-nans' imply that FP operation operands are never NaNs. This assumption however should not be applied to the functions that check FP number properties, including 'isnan'. If such function returns expected result instead of actually making checks, it becomes useless in many cases. The option '-ffast-math' is often used for performance critical code, as it can speed up execution by the expense of manual treatment of corner cases. If 'isnan' returns assumed result, a user cannot use it in the manual treatment of NaNs and has to invent replacements, like making the check using integer operations. There is a discussion in https://reviews.llvm.org/D18513#387418, which also expresses the opinion, that limitations imposed by '-ffast-math' should be applied only to 'math' functions but not to 'tests'. To overcome these drawbacks, this change introduces a new IR intrinsic function 'llvm.isnan', which realizes the check as specified by IEEE-754 and C standards in target-agnostic way. During IR transformations it does not undergo undesirable optimizations. It reaches instruction selection, where is lowered in target-dependent way. The lowering can vary depending on options like '-ffast-math' or '-ffp-model' so the resulting code satisfies requested semantics. Differential Revision: https://reviews.llvm.org/D104854	2021-08-06 14:32:27 +07:00
Sean Fertile	23651c5ae0	[PowerPC][AIX] Create multiple constant sections. Fixes issue where late materialized constants can be more strictly aligned then their containing csect. Differential Revision: https://reviews.llvm.org/D103103	2021-08-05 21:19:16 -04:00
Jon Roelofs	5fc7b1a260	Revert "[GlobalISel][KnownBits] Implement G_CTPOP" This reverts commit `ce6eb4f15a`. It's broken on the windows bots: https://reviews.llvm.org/D107606#2930121	2021-08-05 17:47:47 -07:00
Jon Roelofs	ce6eb4f15a	[GlobalISel][KnownBits] Implement G_CTPOP Implementation copied almost verbatim from ValueTracking. Differential revision: https://reviews.llvm.org/D107606	2021-08-05 17:17:29 -07:00
Craig Topper	f7076cfd3a	[DAGCombiner][RISCV][AMDGPU] Call SimplifyDemandedBits at the end of visitMULHU to enable known bits contant folding. We don't have real demanded bits support for MULHU, but we can still use the known bits based constant folding support at the end of SimplifyDemandedBits to simplify a MULHU. This helps with cases where we know the LHS and RHS have enough leading zeros so that the high multiply result is always 0. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D106471	2021-08-05 08:31:26 -07:00
Simon Pilgrim	2cbf9fd402	[DAG] DAGCombiner::visitVECTOR_SHUFFLE - recognise INSERT_SUBVECTOR patterns IR typically creates INSERT_SUBVECTOR patterns as a widening of the subvector with undefs to pad to the destination size, followed by a shuffle for the actual insertion - SelectionDAGBuilder has to do something similar for shuffles when source/destination vectors are different sizes. This combine attempts to recognize these patterns by looking for a shuffle of a subvector (from a CONCAT_VECTORS) that starts at a modulo of its size into an otherwise identity shuffle of the base vector. This uncovered a couple of target-specific issues as we haven't often created INSERT_SUBVECTOR nodes in generic code - aarch64 could only handle insertions into the bottom of undefs (i.e. a vector widening), and x86-avx512 vXi1 insertion wasn't keeping track of undef elements in the base vector. Fixes PR50053 Differential Revision: https://reviews.llvm.org/D107068	2021-08-05 15:40:48 +01:00
Paul Robinson	75aa3d520d	Add a DIExpression const-folder to prevent silly expressions. It's entirely possible (because it actually happened) for a bool variable to end up with a 256-bit DW_AT_const_value. This came about when a local bool variable was initialized from a bitfield in a 32-byte struct of bitfields, and after inlining and constant propagation, the variable did have a constant value. The sequence of optimizations had it carrying "i256" values around, but once the constant made it into the llvm.dbg.value, no further IR changes could affect it. Technically the llvm.dbg.value did have a DIExpression to reduce it back down to 8 bits, but the compiler is in no way ready to emit an oversized constant and a DWARF expression to manipulate it. Depending on the circumstances, we had either just the very fat bool value, or an expression with no starting value. The sequence of optimizations that led to this state did seem pretty reasonable, so the solution I came up with was to invent a DWARF constant expression folder. Currently it only does convert ops, but there's no reason it couldn't do other ops if that became useful. This broke three tests that depended on having convert ops survive into the DWARF, so I added an operator that would abort the folder to each of those tests. Differential Revision: https://reviews.llvm.org/D106915	2021-08-05 06:14:40 -07:00
Petar Avramovic	66de26b1f9	GlobalISel: Fix matchEqualDefs for instructions with multiple defs Instructions that produceSameValue produce same values for operands with same index. matchEqualDefs used to return true for any two values from different instructions that produce same values. Fix this by checking if values are defined by operands with the same index. Differential Revision: https://reviews.llvm.org/D107362	2021-08-05 15:05:45 +02:00
Dominik Montada	cc947e29ea	[GlobalISel] Combine shr(shl x, c1), c2 to G_SBFX/G_UBFX Reviewed By: foad Differential Revision: https://reviews.llvm.org/D107330	2021-08-05 13:52:10 +02:00
Fraser Cormack	0b8471e91b	[SelectionDAG] Correctly determine the VECREDUCE_SEQ_FMUL action The LegalizeAction for this node should follow the logic for `VECREDUCE_SEQ_FADD` and be determined using the vector operand's type. here isn't an in-tree target that makes use of this, but I think it's safe to say this is how it should behave, should a target want to customize the action for this node. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D107478	2021-08-05 09:42:33 +01:00
Fangrui Song	a194438615	[CodeGen] Add -align-loops to `lib/CodeGen/CommandFlags.cpp`. It can replace -x86-experimental-pref-loop-alignment=. The loop alignment is only used by MachineBlockPlacement. The implementation uses a new `llvm::TargetOptions` for now, as an IR function attribute/module flags metadata may be overkill. This is the llvm part of D106701.	2021-08-04 12:45:18 -07:00
Craig Topper	c23405174a	[DAGCombiner][AMDGPU] Canonicalize constants to the RHS of MULHU/MULHS. This allows special constants like to 0 to be recognized. It's also expected by isel patterns if a target had a mulh with immediate instructions. The commuting done by tablegen won't commute patterns with immediates since it expects DAGCombine to have done it. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D107486	2021-08-04 11:39:23 -07:00
David Green	eeddcba525	[RDA] Attempt to make RDA subreg aware This attempts to make more of RDA aware of potentially overlapping subregisters. Some of this was already in place, with it iterating through MCRegUnitIterators. This also replaces calls to LiveRegs.contains(..) with !LiveRegs.available(..), and updates the isValidRegUseOf and isValidRegDefOf to search subregs. Differential Revision: https://reviews.llvm.org/D107351	2021-08-04 14:21:32 +01:00
Serge Pavlov	0c28a7c990	Revert "Introduce intrinsic llvm.isnan" This reverts commit `16ff91ebcc`. Several errors were reported mainly test-suite execution time. Reverted for investigation.	2021-08-04 17:18:15 +07:00
Serge Pavlov	16ff91ebcc	Introduce intrinsic llvm.isnan Clang has builtin function '__builtin_isnan', which implements C library function 'isnan'. This function now is implemented entirely in clang codegen, which expands the function into set of IR operations. There are three mechanisms by which the expansion can be made. * The most common mechanism is using an unordered comparison made by instruction 'fcmp uno'. This simple solution is target-independent and works well in most cases. It however is not suitable if floating point exceptions are tracked. Corresponding IEEE 754 operation and C function must never raise FP exception, even if the argument is a signaling NaN. Compare instructions usually does not have such property, they raise 'invalid' exception in such case. So this mechanism is unsuitable when exception behavior is strict. In particular it could result in unexpected trapping if argument is SNaN. * Another solution was implemented in https://reviews.llvm.org/D95948. It is used in the cases when raising FP exceptions by 'isnan' is not allowed. This solution implements 'isnan' using integer operations. It solves the problem of exceptions, but offers one solution for all targets, however some can do the check in more efficient way. * Solution implemented by https://reviews.llvm.org/D96568 introduced a hook 'clang::TargetCodeGenInfo::testFPKind', which injects target specific code into IR. Now only SystemZ implements this hook and it generates a call to target specific intrinsic function. Although these mechanisms allow to implement 'isnan' with enough efficiency, expanding 'isnan' in clang has drawbacks: * The operation 'isnan' is hidden behind generic integer operations or target-specific intrinsics. It complicates analysis and can prevent some optimizations. * IR can be created by tools other than clang, in this case treatment of 'isnan' has to be duplicated in that tool. Another issue with the current implementation of 'isnan' comes from the use of options '-ffast-math' or '-fno-honor-nans'. If such option is specified, 'fcmp uno' may be optimized to 'false'. It is valid optimization in general, but it results in 'isnan' always returning 'false'. For example, in some libc++ implementations the following code returns 'false': std::isnan(std::numeric_limits<float>::quiet_NaN()) The options '-ffast-math' and '-fno-honor-nans' imply that FP operation operands are never NaNs. This assumption however should not be applied to the functions that check FP number properties, including 'isnan'. If such function returns expected result instead of actually making checks, it becomes useless in many cases. The option '-ffast-math' is often used for performance critical code, as it can speed up execution by the expense of manual treatment of corner cases. If 'isnan' returns assumed result, a user cannot use it in the manual treatment of NaNs and has to invent replacements, like making the check using integer operations. There is a discussion in https://reviews.llvm.org/D18513#387418, which also expresses the opinion, that limitations imposed by '-ffast-math' should be applied only to 'math' functions but not to 'tests'. To overcome these drawbacks, this change introduces a new IR intrinsic function 'llvm.isnan', which realizes the check as specified by IEEE-754 and C standards in target-agnostic way. During IR transformations it does not undergo undesirable optimizations. It reaches instruction selection, where is lowered in target-dependent way. The lowering can vary depending on options like '-ffast-math' or '-ffp-model' so the resulting code satisfies requested semantics. Differential Revision: https://reviews.llvm.org/D104854	2021-08-04 15:27:49 +07:00
Heejin Ahn	9bd02c433b	[WebAssembly] Misc. cosmetic changes in EH (NFC) - Rename `wasm.catch` intrinsic to `wasm.catch.exn`, because we are planning to add a separate `wasm.catch.longjmp` intrinsic which returns two values. - Rename several variables - Remove an unnecessary parameter from `canLongjmp` and `isEmAsmCall` from LowerEmscriptenEHSjLj pass - Add `-verify-machineinstrs` in a test for a safety measure - Add more comments + fix some errors in comments - Replace `std::vector` with `SmallVector` for cases likely with small number of elements - Renamed `EnableEH`/`EnableSjLj` to `EnableEmEH`/`EnableEmSjLj`: We are soon going to add `EnableWasmSjLj`, so this makes the distincion clearer Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D107405	2021-08-03 21:03:46 -07:00
Arthur Eubanks	ad25344620	[MC][CodeGen] Emit constant pools earlier Previously we would emit constant pool entries for ldr inline asm at the very end of AsmPrinter::doFinalization(). However, if we're emitting dwarf aranges, that would end all sections with aranges. Then if we have constant pool entries to be emitted in those same sections, we'd hit an assert that the section has already been ended. We want to emit constant pool entries before emitting dwarf aranges. This patch splits out arm32/64's constant pool entry emission into its own MCTargetStreamer virtual method. Fixes PR51208 Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D107314	2021-08-03 20:55:31 -07:00
Simon Pilgrim	11396641e4	[DAG] Cleanup DAGCombiner::CombineConsecutiveLoads early-outs. NFCI. We had some similar hasOneUse/isNON_EXTLoad early-outs spread out over different parts of the method - we should pull them all together. Noticed while triaging PR45116	2021-08-03 13:47:55 +01:00
Eli Friedman	1f62af6346	[AArch64][SelectionDAG] Support passing/returning scalable vectors with unusual types. This adds handling for two cases: 1. A scalable vector where the element type is promoted. 2. A scalable vector where the element count is odd (or more generally, not divisble by the element count of the part type). (Some element types still don't work; for example, <vscale x 2 x i128>, or <vscale x 2 x fp128>.) Differential Revision: https://reviews.llvm.org/D105591	2021-08-02 15:53:16 -07:00
Max Kazantsev	c5b63714b5	[GC][NFC] Make getGCStrategy by name available in IR We might want to use info from GC strategy in middle end analysis. The motivation for this is provided in D99135: we may want to ask a GC if it's going to work with a given pointer (currently this code makes naive check by the method name). Differetial Revision: https://reviews.llvm.org/D100559 Reviewed By: reames	2021-08-02 14:26:04 +07:00
Matt Arsenault	ebc17a0d68	GlobalISel: Scalarize unaligned vector stores This has the same problems and limitations as the load path.	2021-07-31 10:37:15 -04:00
Simon Pilgrim	3a7c82efb8	[DAG] isGuaranteedNotToBeUndefOrPoison - handle ISD::BUILD_VECTOR nodes If all demanded elements of the BUILD_VECTOR pass a isGuaranteedNotToBeUndefOrPoison check, then we can treat this specific demanded use of the BUILD_VECTOR as guaranteed not to be undef or poison either. Differential Revision: https://reviews.llvm.org/D107174	2021-07-31 15:08:25 +01:00
Matt Arsenault	bc2cb91a20	GlobalISel: Have lowerStore handle some unaligned stores This is NFC until some of the AMDGPU legalization rules are ripped out.	2021-07-31 10:01:42 -04:00
Alexandros Lamprineas	7d940432c4	[AArch64] Legalize MVT::i64x8 in DAG isel lowering This patch legalizes the Machine Value Type introduced in D94096 for loads and stores. A new target hook named getAsmOperandValueType() is added which maps i512 to MVT::i64x8. GlobalISel falls back to DAG for legalization. Differential Revision: https://reviews.llvm.org/D94097	2021-07-31 09:51:28 +01:00

1 2 3 4 5 ...

31227 Commits