llvm-project

Commit Graph

Author	SHA1	Message	Date
Shengchen Kan	9e832a67fe	[Codegen][tablgen][NFC] Allow meta instruction to be target dependent An instruction is a meta-instruction if it doesn't produce any output in the form of executable instructions. So in the concept, a meta-instruction does not have to be target independent. Before this patch, `isMetaInstruction` is implemented by checking the opcode of the instruction, add we have no way to add target dependent opcode to the list, which does not make sense. After this patch, a bit `isMeta` is added for class `Instruction` in tablegen, which is used to indicate whether it's a meta instruction. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D121600	2022-03-18 13:09:01 +08:00
serge-sans-paille	989f1c72e0	Cleanup codegen includes This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681	2022-03-16 08:43:00 +01:00
serge-sans-paille	ed98c1b376	Cleanup includes: DebugInfo & CodeGen Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121332	2022-03-12 17:26:40 +01:00
Nico Weber	a278250b0f	Revert "Cleanup codegen includes" This reverts commit `7f230feeea`. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169	2022-03-10 07:59:22 -05:00
serge-sans-paille	7f230feeea	Cleanup codegen includes after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169	2022-03-10 10:00:30 +01:00
serge-sans-paille	3c4410dfca	Cleanup includes: LLVMTarget Most notably, Pass.h is no longer included by TargetMachine.h before: 1063570306 after: 1063332844 Differential Revision: https://reviews.llvm.org/D121168	2022-03-10 10:00:29 +01:00
Jeremy Morse	ab49dce01f	[DebugInfo][InstrRef][NFC] Use unique_ptr instead of raw pointers InstrRefBasedLDV allocates some big tables of ValueIDNum, to store live-in and live-out block values in, that then get passed around as pointers everywhere. This patch wraps the allocation in a std::unique_ptr, names some types based on unique_ptr, and passes references to those around instead. There's no functional change, but it makes it clearer to the reader that references to these tables are borrowed rather than owned, and we get some extra validity assertions too. Differential Revision: https://reviews.llvm.org/D118774	2022-03-01 12:49:50 +00:00
serge-sans-paille	ef736a1c39	Cleanup LLVMMC headers There's a few relevant forward declarations in there that may require downstream adding explicit includes: llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h Counting preprocessed lines required to rebuild llvm-project on my setup: before: 1052436830 after: 1049293745 Which is significant and backs up the change in addition to the usual benefits of decreasing coupling between headers and compilation units. Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D119244	2022-02-09 11:09:17 +01:00
Sebastian Neubauer	4a02562275	[AMDGPU] Lazily init pal metadata on first function Delay reading global metadata until the first function or the end of the file is emitted. That way, earlier module passes can set metadata that is emitted in the ELF. `emitStartOfAsmFile` gets called when the passes are initialized, which prevented earlier passes from changing the metadata. This fixes issues encountered after converting AMDGPUResourceUsageAnalysis to a Module pass in D117504. Differential Revision: https://reviews.llvm.org/D118492	2022-02-04 18:39:35 +01:00
Jeremy Morse	14aaaa1236	Re-apply `3fab2d138e`, now with a triple added Was reverted in `1c1b670a73` as it broke all non-x86 bots. Original commit message: [DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out In certain circumstances with things like autogenerated code and asan, you can end up with thousands of Values live at the same time, causing a large working set and a lot of information spilled to the stack. Unfortunately InstrRefBasedLDV doesn't cope well with this and consumes a lot of memory when there are many many stack slots. See the reproducer in D116821. It seems very unlikely that a developer would be able to reason about hundreds of live named local variables at the same time, so a huge working set and many stack slots is an indicator that we're likely analysing autogenerated or instrumented code. In those cases: gracefully degrade by setting an upper bound on the amount of stack slots to track. This limits peak memory consumption, at the cost of dropping some variable locations, but in a rare scenario where it's unlikely someone is actually going to use them. In terms of the patch, this adds a cl::opt for max number of stack slots to track, and has the stack-slot-numbering code optionally return None. That then filters through a number of code paths, which can then chose to not track a spill / restore if it touches an untracked spill slot. The added test checks that we drop variable locations that are on the stack, if we set the limit to zero. Differential Revision: https://reviews.llvm.org/D118601	2022-02-02 11:04:00 +00:00
Kevin Athey	1c1b670a73	Revert "[DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out" This reverts commit `3fab2d138e`. Breaking PPC sanitizer build: https://lab.llvm.org/buildbot/#/builders/105/builds/20857	2022-02-01 18:37:02 -08:00
Jeremy Morse	3fab2d138e	[DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out In certain circumstances with things like autogenerated code and asan, you can end up with thousands of Values live at the same time, causing a large working set and a lot of information spilled to the stack. Unfortunately InstrRefBasedLDV doesn't cope well with this and consumes a lot of memory when there are many many stack slots. See the reproducer in D116821. It seems very unlikely that a developer would be able to reason about hundreds of live named local variables at the same time, so a huge working set and many stack slots is an indicator that we're likely analysing autogenerated or instrumented code. In those cases: gracefully degrade by setting an upper bound on the amount of stack slots to track. This limits peak memory consumption, at the cost of dropping some variable locations, but in a rare scenario where it's unlikely someone is actually going to use them. In terms of the patch, this adds a cl::opt for max number of stack slots to track, and has the stack-slot-numbering code optionally return None. That then filters through a number of code paths, which can then chose to not track a spill / restore if it touches an untracked spill slot. The added test checks that we drop variable locations that are on the stack, if we set the limit to zero. Differential Revision: https://reviews.llvm.org/D118601	2022-02-01 19:25:29 +00:00
Matt Arsenault	99e8e17313	Reapply "Revert "GlobalISel: Add G_ASSERT_ALIGN hint instruction" This reverts commit `a97e20a3a8`.	2022-01-24 09:26:52 -05:00
James Y Knight	a97e20a3a8	Revert "GlobalISel: Add G_ASSERT_ALIGN hint instruction" This commit sometimes causes a crash when compiling a vtable thunk. E.g.: clang '--target=aarch64-grtev4-linux-gnu' -xc++ - -c -o /dev/null <<EOF struct a { virtual int f(); }; struct c { virtual int &g() const; }; struct d : a, c { int &g() const; }; int &d::g() const {} EOF Some follow-up commits have been reverted as well: Revert "IR: Make getRetAlign check callee function attributes" Revert "Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC." Revert "Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC." This reverts commit `4f414af6a7`. This reverts commit `a5507d2e25`. This reverts commit `3d2d208f6a`. This reverts commit `07ddfa95e3`.	2022-01-14 04:50:07 +00:00
Simon Pilgrim	a5507d2e25	Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC.	2022-01-13 11:10:50 +00:00
Matt Arsenault	07ddfa95e3	GlobalISel: Add G_ASSERT_ALIGN hint instruction Insert it for call return values only for now, which is the only case the DAG handles also.	2022-01-12 18:20:58 -05:00
Alexey Lapshin	39385d4cd1	[CodeGen][Debuginfo][NFC] Refactor DIE values SizeOf method to not depend on AsmPrinter. SizeOf() method of DIE values(unsigned SizeOf(const AsmPrinter *AP, dwarf::Form Form) const) depends on AsmPrinter. AsmPrinter is too specific class here. This patch removes dependency on AsmPrinter and use dwarf::FormParams structure instead. It allows calculate DIE values size without using AsmPrinter. That refactoring is useful for D96035([dsymutil][DWARFlinker] implement separate multi-thread processing for compile units.) Differential Revision: https://reviews.llvm.org/D116997	2022-01-12 13:15:26 +03:00
Petar Avramovic	29f88b93fd	[GlobalISel] Rework more/fewer elements for vectors Artifact combiner is not able to access individual elements after using LCMTy style merge/unmerge, extract and insert to change vector number of elements (pad with undef or split to sub-vector instructions). Use unmerge to individual elements instead and then merge elements into requested types. Change argument lowering for vectors and moreElementsVector to use buildPadVectorWithUndefElements and buildDeleteTrailingVectorElements. FewerElementsVector had a few helpers that had different behavior, introduce new helper for most of the opcodes. FewerElementsVector helper is more flexible since it can create leftover instruction smaller then requested type (useful in case target wants to avoid pad with undef and use fewer registers). If target does not want leftover of different type it should call more elements first. Some helpers were performing more elements first to have split without leftover. Opcodes that used this helper use clampMaxNumElementsStrict (does more elements first) in LegalizerInfo to avoid test changes. Fixes failures caused by failing to combine artifacts created during more/fewer elements vector. Differential Revision: https://reviews.llvm.org/D114198	2021-12-23 14:30:02 +01:00
Jay Foad	17006033f9	[GlobalISel] Verify operand types for G_SHL, G_LSHR, G_ASHR Differential Revision: https://reviews.llvm.org/D115868	2021-12-21 11:59:33 +00:00
Simon Pilgrim	fd722c5959	Fix MSVC "not all control paths return a value" warning. NFC.	2021-12-07 18:09:44 +00:00
Mircea Trofin	fa99cb64ff	[mlgo][regalloc] Add score calculation for training Add the calculation of a score, which will be used during ML training. The score qualifies the quality of a regalloc policy, and is independent of what we train (currently, just eviction), or the regalloc algo itself. We can then use scores to guide training (which happens offline), by formulating a reward based on score variation - the goal being lowering scores (currently, that reward is percentage reduction relative to Greedy's heuristic) Currently, we compute the score by factoring different instruction counts (loads, stores, etc) with the machine basic block frequency, regardless of the instructions' provenance - i.e. they could be due to the regalloc policy or be introduced previously. This is different from RAGreedy::reportStats, which accummulates the effects of the allocator alone. We explored this alternative but found (at least currently) that the more naive alternative introduced here produces better policies. We do intend to consolidate the two, however, as we are actively investigating improvements to our reward function, and will likely want to re-explore scoring just the effects of the allocator. In either case, we want to decouple score calculation from allocation algorighm, as we currently evaluate it after a few more passes after allocation (also, because score calculation should be reusable regardless of allocation algorithm). We intentionally accummulate counts independently because it facilitates per-block reporting, which we found useful for debugging - for instance, we can easily report the counts indepdently, and then cross-reference with perf counter measurements. Differential Revision: https://reviews.llvm.org/D115195	2021-12-07 09:00:27 -08:00
Jeremy Morse	8dda516b83	[DebugInfo][InstrRef] Avoid dropping fragment info during PHI elimination InstrRefBasedLDV used to crash on the added test -- the exit block is not in scope for the variable being propagated, but is still considered because it contains an assignment. The failure-mode was vlocJoin ignoring assign-only blocks and not updating DIExpressions, but pickVPHILoc would still find a variable location for it. That led to DBG_VALUEs created with the wrong fragment information. Fix this by removing a filter inherited from VarLocBasedLDV: vlocJoin will now consider assign-only blocks and will update their expressions. Differential Revision: https://reviews.llvm.org/D114727	2021-11-30 11:32:31 +00:00
Abinav Puthan Purayil	bc5dbb0bae	[GlobalISel] Add matchers for constant splat. This change exposes isBuildVectorConstantSplat() to the llvm namespace and uses it to implement the constant splat versions of m_SpecificICst(). CombinerHelper::matchOrShiftToFunnelShift() can now work with vector types and CombinerHelper::matchMulOBy2()'s match for a constant splat is simplified. Differential Revision: https://reviews.llvm.org/D114625	2021-11-30 15:18:50 +05:30
Jeremy Morse	0eee844539	[DebugInfo][InstrRef] Terminate overlapping variable fragments If we have a variable where its fragments are split into overlapping segments: DBG_VALUE $ax, $noreg, !123, !DIExpression(DW_OP_LLVM_fragment_0, 16) ... DBG_VALUE $eax, $noreg, !123, !DIExpression(DW_OP_LLVM_fragment_0, 32) we should only propagate the most recently assigned fragment out of a block. LiveDebugValues only deals with live-in variable locations, as overlaps within blocks is DbgEntityHistoryCalculators domain. InstrRefBasedLDV has kept the accumulateFragmentMap method from VarLocBasedLDV, we just need it to recognise DBG_INSTR_REFs. Once it's produced a mapping of variable / fragments to the overlapped variable / fragments, VLocTracker uses it to identify when a debug instruction needs to terminate the other parts it overlaps with. The test is updated for some standard "InstrRef picks different registers" variation, and the order of some unrelated DBG_VALUEs changes. Differential Revision: https://reviews.llvm.org/D114603	2021-11-29 23:37:20 +00:00
Amara Emerson	dc84770d55	[GlobalISel] Add a store-merging optimization pass and enable for AArch64. This is a first attempt at a constant value consecutive store merging pass, a counterpart to the DAGCombiner's store merging optimization. The high level goals of this pass: * Have a simple and efficient algorithm. As close to linear time as we can get. Thus, prioritizing scalability of the algorithm over merging every corner case we can find. The DAGCombiner's store merging code has been the source of compile time and complexity issues in the past and I wanted to avoid that. * Don't introduce any new data structures for ordering memory operations. In MIR, we don't have the concept of chains like we do in the DAG, and the instruction order is stricter than enforcing ordering with graph edges. Although I considered adding something similar, I couldn't justify the overhead. The pass is current split into 3 main parts. The main store merging code focuses on identifying candidate stores and managing the candidate group that's under consideration for merging. Analyzing addressing of stores is a potentially complex part and for now there's just a basic implementation to identify easy cases. Finally, the other main bit of complexity is the alias analysis, which tries to follow the same logic as the DAG's AA. Currently this implementation only supports merging of constant stores. Stores of arbitrary variables are technically possible with a very small change, but the DAG chooses not to do this. Doing so here makes most code worse since there's extra overhead in merging values into wider registers. On AArch64 -Os, this optimization results in very minor savings on CTMark. Differential Revision: https://reviews.llvm.org/D109131	2021-11-15 21:10:39 -08:00
Simon Pilgrim	d391e4fe84	[X86] Update RET/LRET instruction to use the same naming convention as IRET (PR36876). NFC Be more consistent in the naming convention for the various RET instructions to specify in terms of bitwidth. Helps prevent future scheduler model mismatches like those that were only addressed in D44687. Differential Revision: https://reviews.llvm.org/D113302	2021-11-07 15:06:54 +00:00
Jeremy Morse	c99fdd456f	[DebugInfo][NFC] Initialize a new object field in unittests Over in `e7084ceab3` the InstrRefBasedLDV class grew a MachineRegisterInfo pointer to lookup register sizes -- however, that field wasn't initialized in the corresponding unit tests. This patch initializes it! Fixes a buildbot failure reported on D112006	2021-10-27 14:29:43 +01:00
Jeremy Morse	4136897bd4	[DebugInfo][InstrRef][NFC] Switch to using DenseMaps and similar There are a few STL containers hanging around that can become DenseMaps, SmallVectors and similar. This recovers a modest amount of compile time performance. While I'm here, adjust the bit layout of ValueIDNum: this was always supposed to act like a value type, however it seems that clang doesn't compile the comparison functions to act that way. Add a uint64_t to a union that explicitly aliases the bitfields, so that we can compare the whole value as a single integer. Differential Revision: https://reviews.llvm.org/D112333	2021-10-25 18:07:17 +01:00
Jeremy Morse	97ddf49e43	[DebugInfo][InstrRef] Recover stack-slot tracking performance This patch is like D111627 -- instead of calculating IDF for every location on the stack, only do it for the smallest units of interference, and copy the PHIs for those units to any aliases. The test added runs placeMLocPHIs directly, and tests that: * A def of the lower 8 bits of a stack slot causes all aliasing regs to have PHIs placed, * It doesn't cause the equivalent location to x86's $ah, which isn't aliased, to have a PHI placed. Differential Revision: https://reviews.llvm.org/D112324	2021-10-25 17:31:09 +01:00
Jeremy Morse	e7084ceab3	[DebugInfo][Instr] Track subregisters across stack spills/restores Sometimes we generate code that writes to a subregister, then spills / restores a super-register to the stack, for example: $eax = MOV32ri 0 MOV64mr $rsp, 1, $noreg, 16, $noreg, $rax $rcx = MOV64rm $rsp, 1, $noreg, 8, $noreg This patch takes a different approach: it adds another index to MLocTracker that identifies a size/offset within a stack slot. A location on the stack is then a pari of {FrameIndex, SlotNum}. Spilling and restoring now involves pairing up the src/dest register numbers, and the dest/src stack position to be transferred to/from. Location coverage improves as a result, compile-time performance decreases, alas. One limitation is that if a PHI occurs inside a stack slot: DBG_PHI %stack.0, 1 We don't know how large the resulting value is, and so might have difficulty picking which value to use. DBG_PHI might need to be augmented in the future with such a size. Unit tests added ensure that spills and restores correctly transfer to positions in the Location => Value map, and that different register classes written to the stack will correctly clobber all other positions in the stack slot. Differential Revision: https://reviews.llvm.org/D112133	2021-10-22 19:20:55 +01:00
Jeremy Morse	d9eebe3cd7	[DebugInfo][InstrRef] Add unit tests for transfer-function building This patch adds some unit tests for the machine-location transfer-function building parts of InstrRefBasedLDV: i.e., test that if we feed some MIR into the transfer-function building code, does it create the correct transfer function. There are a number of minor defects that get corrected in the process: * The unit test was selecting the x86 (i.e. 32 bit) backend rather than x86_64's 64 bit backend, * COPY instructions weren't actually having their subregister values correctly represented in the transfer function. Subregisters were being defined by the COPY, rather than taking the value in the source register. * SP aliases were at risk of being clobbered, if an SP subregister was clobbered. Differential Revision: https://reviews.llvm.org/D112006	2021-10-22 18:29:03 +01:00
Jeremy Morse	89950ade21	[DebugInfo][InstrRef] Track a single variable at a time Here's another performance patch for InstrRefBasedLDV: rather than processing all variable values in a scope at a time, instead, process one variable at a time. The benefits are twofold: * It's easier to reason about one variable at a time in your mind, * It improves performance, apparently from increased locality. The downside is that the value-propagation code gets indented one level further, plus there's some churn in the unit tests. Differential Revision: https://reviews.llvm.org/D111799	2021-10-20 15:03:52 +01:00
Jeremy Morse	849b17949f	[DebugInfo][InstrRef] Avoid un-necessary densemap copies and comparisons This is purely a performance patch: InstrRefBasedLDV used to use three DenseMaps to store variable values, two for long term storage and one as a working set. This patch eliminates the working set, and updates the long term storage in place, thus avoiding two DenseMap comparisons and two DenseMap assignments, which can be expensive. Differential Revision: https://reviews.llvm.org/D111716	2021-10-19 11:10:14 +01:00
Luke Benes	9da51402f4	[DebugInfo][InstrRef] Fix Wdangling-else warning in InstrRefLDVTest Fix a dangling else that gcc-11 warned about. The EXPECT_EQ macro expands to an if-else, so the whole construction contains a hidden dangling else. Differential Revision: https://reviews.llvm.org/D112044	2021-10-19 10:17:57 +01:00
Jeremy Morse	b5426ced71	[DebugInfo][InstrRef] Place variable-values PHI using LLVM utilities This patch is very similar to D110173 / `a3936a6c19`, but for variable values rather than machine values. This is for the second instr-ref problem, calculating the correct variable value on entry to each block. The previous lattice based implementation was broken; we now use LLVMs existing PHI placement utilities to work out where values need to merge, then eliminate un-necessary ones through value propagation. Most of the deletions here happen in vlocJoin: it was trying to pick a location for PHIs to happen in, badly, leading to an infinite loop in the MIR test added, where it would repeatedly switch between register locations. The new approach is simpler: either PHIs can be eliminated, or they can't, and the location of the value is a different problem. Various bits and pieces move to the header so that they can be tested in the unit tests. The DbgValue class grows a "VPHI" kind to represent variable value PHIS that haven't been eliminated yet. Differential Revision: https://reviews.llvm.org/D110630	2021-10-14 14:43:43 +01:00
Jeremy Morse	a3936a6c19	[DebugInfo][InstrRef] Use PHI placement utilities for machine locations InstrRefBasedLDV used to try and determine which values are in which registers using a lattice approach; however this is hard to understand, and broken in various ways. This patch replaces that approach with a standard SSA approach using existing LLVM utilities. PHIs are placed at dominance frontiers; value propagation then eliminates un-necessary PHIs. This patch also adds a bunch of unit tests that should cover many of the weirder forms of control flow. Differential Revision: https://reviews.llvm.org/D110173	2021-10-13 12:49:04 +01:00
Jeremy Morse	838b4a533e	[DebugInfo][NFC] Move LiveDebugValues class to header This patch shifts the InstrRefBasedLDV class declaration to a header. Partially because it's already massive, but mostly so that I can start writing some unit tests for it. This patch also adds the boilerplate for said unit tests. Differential Revision: https://reviews.llvm.org/D110165	2021-10-12 16:07:26 +01:00
Reid Kleckner	89b57061f7	Move TargetRegistry.(h\|cpp) from Support to MC This moves the registry higher in the LLVM library dependency stack. Every client of the target registry needs to link against MC anyway to actually use the target, so we might as well move this out of Support. This allows us to ensure that Support doesn't have includes from MC/*. Differential Revision: https://reviews.llvm.org/D111454	2021-10-08 14:51:48 -07:00
Amara Emerson	08b3c0d995	[GlobalISel] Combine G_UMULH x, (1 << c)) -> x >> (bitwidth - c) In order to not generate an unnecessary G_CTLZ, I extended the constant folder in the CSEMIRBuilder to handle G_CTLZ. I also added some extra handing of vector constants too. It seems we don't have any support for doing constant folding of vector constants, so the tests show some other useless G_SUB instructions too. Differential Revision: https://reviews.llvm.org/D111036	2021-10-07 23:51:37 -07:00
Jack Andersen	bd4dad87f4	[MachineInstr] Move MIParser's DBG_VALUE RegState::Debug invariant into MachineInstr::addOperand Based on the reasoning of D53903, register operands of DBG_VALUE are invariably treated as RegState::Debug operands. This change enforces this invariant as part of MachineInstr::addOperand so that all passes emit this flag consistently. RegState::Debug is inconsistently set on DBG_VALUE registers throughout LLVM. This runs the risk of a filtering iterator like MachineRegisterInfo::reg_nodbg_iterator to process these operands erroneously when not parsed from MIR sources. This issue was observed in the development of the llvm-mos fork which adds a backend that relies on physical register operands much more than existing targets. Physical RegUnit 0 has the same numeric encoding as $noreg (indicating an undef for DBG_VALUE). Allowing debug operands into the machine scheduler correlates $noreg with RegUnit 0 (i.e. a collision of register numbers with different zero semantics). Eventually, this causes an assert where DBG_VALUE instructions are prohibited from participating in live register ranges. Reviewed By: MatzeB, StephenTozer Differential Revision: https://reviews.llvm.org/D110105	2021-10-07 16:08:52 +01:00
Bjorn Pettersson	8ed0e6b2cf	[SelectionDAG] Replace error prone index check in BaseIndexOffset::computeAliasing Deriving NoAlias based on having the same index in two BaseIndexOffset expressions seemed weird (and as shown in the added unittest the correctness of doing so depended on undocumented pre-conditions that the user of BaseIndexOffset::computeAliasing would need to take care of. This patch removes the code that dereived NoAlias based on indices being the same. As a compensation, to avoid regressions/diffs in various lit test, we also add a new check. The new check derives NoAlias in case the two base pointers are based on two different GlobalValue:s (neither of them being a GlobalAlias). Reviewed By: niravd Differential Revision: https://reviews.llvm.org/D110256	2021-10-05 12:15:55 +02:00
Bjorn Pettersson	1896fb2cff	[SelectionDAG] Assume that a GlobalAlias may alias other global values This fixes a bug detected in DAGCombiner when using global alias variables. Here is an example: @foo = global i16 0, align 1 @aliasFoo = alias i16, i16 * @foo define i16 @bar() { ... store i16 7, i16 * @foo, align 1 store i16 8, i16 * @aliasFoo, align 1 ... } BaseIndexOffset::computeAliasing would incorrectly derive NoAlias for the two accesses in the example above, resulting in DAGCombiner miscompiles. This patch fixes the problem by a defensive approach letting BaseIndexOffset::computeAliasing return false, i.e. that the aliasing couldn't be determined, when comparing two global values and at least one is a GlobalAlias. In the future we might improve this with a deeper analysis to look at the aliasee for the GlobalAlias etc. But that is a bit more complicated considering that we could have 'local_unnamed_addr' and situations with several 'alias' variables. Fixes PR51878. Differential Revision: https://reviews.llvm.org/D110064	2021-10-05 12:15:55 +02:00
Amara Emerson	8bde5e58c0	Delay outgoing register assignments to last. The delayed stack protector feature which is currently used for SDAG (and thus allows for more commonly generating tail calls) depends on being able to extract the tail call into a separate return block. To do this it also has to extract the vreg->physreg copies that set up the call's arguments, since if it doesn't then the call inst ends up using undefined physregs in it's new spliced block. SelectionDAG implementations can do this because they delay emitting register copies until after the stack arguments are set up. GISel however just processes and emits the arguments in IR order, so stack arguments always end up last, and thus this breaks the code that looks for any register arg copies that precede the call instruction. This patch adds a thunk argument to the assignValueToReg() and custom assignment hooks. For outgoing arguments, register assignments use this return param to return a thunk that does the actual generating of the copies. We collect these until all the outgoing stack assignments have been done and then execute them, so that the copies (and perhaps some artifacts like G_SEXTs) are placed after any stores. Differential Revision: https://reviews.llvm.org/D110610	2021-10-04 12:33:20 -07:00
Petar Avramovic	8bc7185668	GlobalISel/Utils: Refactor constant splat match functions Add generic helper function that matches constant splat. It has option to match constant splat with undef (some elements can be undef but not all). Add util function and matcher for G_FCONSTANT splat. Differential Revision: https://reviews.llvm.org/D104410	2021-09-21 12:09:35 +02:00
Petar Avramovic	d477a7c2e7	GlobalISel/Utils: Refactor integer/float constant match functions Rework getConstantstVRegValWithLookThrough in order to make it clear if we are matching integer/float constant only or any constant(default). Add helper functions that get DefVReg and APInt/APFloat from constant instr getIConstantVRegValWithLookThrough: integer constant, only G_CONSTANT getFConstantVRegValWithLookThrough: float constant, only G_FCONSTANT getAnyConstantVRegValWithLookThrough: either G_CONSTANT or G_FCONSTANT Rename getConstantVRegVal and getConstantVRegSExtVal to getIConstantVRegVal and getIConstantVRegSExtVal. These now only match G_CONSTANT as described in comment. Relevant matchers now return both DefVReg and APInt/APFloat. Replace existing uses of getConstantstVRegValWithLookThrough and getConstantVRegVal with new helper functions. Any constant match is only required in: ConstantFoldBinOp: for constant argument that was bit-cast of float to int getAArch64VectorSplat: AArch64::G_DUP operands can be any constant amdgpu select for G_BUILD_VECTOR_TRUNC: operands can be any constant In other places use integer only constant match. Differential Revision: https://reviews.llvm.org/D104409	2021-09-17 11:22:13 +02:00
Chris Lattner	735f46715d	[APInt] Normalize naming on keep constructors / predicate methods. This renames the primary methods for creating a zero value to `getZero` instead of `getNullValue` and renames predicates like `isAllOnesValue` to simply `isAllOnes`. This achieves two things: 1) This starts standardizing predicates across the LLVM codebase, following (in this case) ConstantInt. The word "Value" doesn't convey anything of merit, and is missing in some of the other things. 2) Calling an integer "null" doesn't make any sense. The original sin here is mine and I've regretted it for years. This moves us to calling it "zero" instead, which is correct! APInt is widely used and I don't think anyone is keen to take massive source breakage on anything so core, at least not all in one go. As such, this doesn't actually delete any entrypoints, it "soft deprecates" them with a comment. Included in this patch are changes to a bunch of the codebase, but there are more. We should normalize SelectionDAG and other APIs as well, which would make the API change more mechanical. Differential Revision: https://reviews.llvm.org/D109483	2021-09-09 09:50:24 -07:00
Petar Avramovic	2bf4eeeeb6	[GlobalISel] Avoid creating COPY in LegalizationArtifactCombiner When Src and Dst used in buildAnyExtOrTrunc or buildSExtOrTrunc have the same type (creates COPY) use Src register directly or use replaceRegOrBuildCopy instead. Differential Revision: https://reviews.llvm.org/D108306	2021-08-24 11:09:56 +02:00
Konstantin Schwarz	64bef13f08	[GlobalISel] Look through truncs and extends in narrowScalarShift If a G_SHL is fed by a G_CONSTANT, the lower and upper bits of the source can be shifted individually by the constant shift amount. However in case the shift amount came from a G_TRUNC(G_CONSTANT), the generic shift legalization code was used, producing intermediate shifts that are potentially illegal on some targets. This change teaches narrowScalarShift to look through G_TRUNCs and G_*EXTs. Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D89100	2021-08-10 13:49:22 +02:00
Jon Roelofs	eae4a44c1d	[GlobalISel][KnownBits] Implement G_CTPOP Implementation copied almost verbatim from ValueTracking. Differential revision: https://reviews.llvm.org/D107606	2021-08-06 09:48:39 -07:00
Jay Foad	57b9107e3f	[GlobalISel] Improve widening of cttz/cttz_zero_undef Differential Revision: https://reviews.llvm.org/D107631	2021-08-06 14:25:56 +01:00

1 2 3 4 5 ...

463 Commits