Commit Graph

463 Commits

Author SHA1 Message Date
Shengchen Kan 9e832a67fe [Codegen][tablgen][NFC] Allow meta instruction to be target dependent
An instruction is a meta-instruction if it doesn't produce any output
in the form of executable instructions. So in the concept, a
meta-instruction does not have to be target independent.

Before this patch, `isMetaInstruction` is implemented by checking the
opcode of the instruction, add we have no way to add target dependent
opcode to the list, which does not make sense.

After this patch, a bit `isMeta` is added for class `Instruction` in
tablegen, which is used to indicate whether it's a meta instruction.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D121600
2022-03-18 13:09:01 +08:00
serge-sans-paille 989f1c72e0 Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169

after:  1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
2022-03-16 08:43:00 +01:00
serge-sans-paille ed98c1b376 Cleanup includes: DebugInfo & CodeGen
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332
2022-03-12 17:26:40 +01:00
Nico Weber a278250b0f Revert "Cleanup codegen includes"
This reverts commit 7f230feeea.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10 07:59:22 -05:00
serge-sans-paille 7f230feeea Cleanup codegen includes
after:  1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169
2022-03-10 10:00:30 +01:00
serge-sans-paille 3c4410dfca Cleanup includes: LLVMTarget
Most notably, Pass.h is no longer included by TargetMachine.h
before: 1063570306
after:  1063332844

Differential Revision: https://reviews.llvm.org/D121168
2022-03-10 10:00:29 +01:00
Jeremy Morse ab49dce01f [DebugInfo][InstrRef][NFC] Use unique_ptr instead of raw pointers
InstrRefBasedLDV allocates some big tables of ValueIDNum, to store live-in
and live-out block values in, that then get passed around as pointers
everywhere. This patch wraps the allocation in a std::unique_ptr, names
some types based on unique_ptr, and passes references to those around
instead. There's no functional change, but it makes it clearer to the
reader that references to these tables are borrowed rather than owned, and
we get some extra validity assertions too.

Differential Revision: https://reviews.llvm.org/D118774
2022-03-01 12:49:50 +00:00
serge-sans-paille ef736a1c39 Cleanup LLVMMC headers
There's a few relevant forward declarations in there that may require downstream
adding explicit includes:

llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h

Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after:  1049293745

Which is significant and backs up the change in addition to the usual benefits of
decreasing coupling between headers and compilation units.

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
2022-02-09 11:09:17 +01:00
Sebastian Neubauer 4a02562275 [AMDGPU] Lazily init pal metadata on first function
Delay reading global metadata until the first function or the end of
the file is emitted. That way, earlier module passes can set metadata
that is emitted in the ELF.

`emitStartOfAsmFile` gets called when the passes are initialized,
which prevented earlier passes from changing the metadata.

This fixes issues encountered after converting
AMDGPUResourceUsageAnalysis to a Module pass in D117504.

Differential Revision: https://reviews.llvm.org/D118492
2022-02-04 18:39:35 +01:00
Jeremy Morse 14aaaa1236 Re-apply 3fab2d138e, now with a triple added
Was reverted in 1c1b670a73 as it broke all non-x86 bots. Original commit
message:

[DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out

In certain circumstances with things like autogenerated code and asan, you
can end up with thousands of Values live at the same time, causing a large
working set and a lot of information spilled to the stack. Unfortunately
InstrRefBasedLDV doesn't cope well with this and consumes a lot of memory
when there are many many stack slots. See the reproducer in D116821.

It seems very unlikely that a developer would be able to reason about
hundreds of live named local variables at the same time, so a huge working
set and many stack slots is an indicator that we're likely analysing
autogenerated or instrumented code. In those cases: gracefully degrade by
setting an upper bound on the amount of stack slots to track. This limits
peak memory consumption, at the cost of dropping some variable locations,
but in a rare scenario where it's unlikely someone is actually going to
use them.

In terms of the patch, this adds a cl::opt for max number of stack slots to
track, and has the stack-slot-numbering code optionally return None. That
then filters through a number of code paths, which can then chose to not
track a spill / restore if it touches an untracked spill slot. The added
test checks that we drop variable locations that are on the stack, if we
set the limit to zero.

Differential Revision: https://reviews.llvm.org/D118601
2022-02-02 11:04:00 +00:00
Kevin Athey 1c1b670a73 Revert "[DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out"
This reverts commit 3fab2d138e.

Breaking PPC sanitizer build:
https://lab.llvm.org/buildbot/#/builders/105/builds/20857
2022-02-01 18:37:02 -08:00
Jeremy Morse 3fab2d138e [DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out
In certain circumstances with things like autogenerated code and asan, you
can end up with thousands of Values live at the same time, causing a large
working set and a lot of information spilled to the stack. Unfortunately
InstrRefBasedLDV doesn't cope well with this and consumes a lot of memory
when there are many many stack slots. See the reproducer in D116821.

It seems very unlikely that a developer would be able to reason about
hundreds of live named local variables at the same time, so a huge working
set and many stack slots is an indicator that we're likely analysing
autogenerated or instrumented code. In those cases: gracefully degrade by
setting an upper bound on the amount of stack slots to track. This limits
peak memory consumption, at the cost of dropping some variable locations,
but in a rare scenario where it's unlikely someone is actually going to
use them.

In terms of the patch, this adds a cl::opt for max number of stack slots to
track, and has the stack-slot-numbering code optionally return None. That
then filters through a number of code paths, which can then chose to not
track a spill / restore if it touches an untracked spill slot. The added
test checks that we drop variable locations that are on the stack, if we
set the limit to zero.

Differential Revision: https://reviews.llvm.org/D118601
2022-02-01 19:25:29 +00:00
Matt Arsenault 99e8e17313 Reapply "Revert "GlobalISel: Add G_ASSERT_ALIGN hint instruction"
This reverts commit a97e20a3a8.
2022-01-24 09:26:52 -05:00
James Y Knight a97e20a3a8 Revert "GlobalISel: Add G_ASSERT_ALIGN hint instruction"
This commit sometimes causes a crash when compiling a vtable thunk. E.g.:

clang '--target=aarch64-grtev4-linux-gnu' -xc++ - -c -o /dev/null <<EOF
struct a {
  virtual int f();
};
struct c {
  virtual int &g() const;
};
struct d : a, c {
  int &g() const;
};
int &d::g() const {}
EOF

Some follow-up commits have been reverted as well:
Revert "IR: Make getRetAlign check callee function attributes"
Revert "Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC."
Revert "Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC."

This reverts commit 4f414af6a7.
This reverts commit a5507d2e25.
This reverts commit 3d2d208f6a.
This reverts commit 07ddfa95e3.
2022-01-14 04:50:07 +00:00
Simon Pilgrim a5507d2e25 Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC. 2022-01-13 11:10:50 +00:00
Matt Arsenault 07ddfa95e3 GlobalISel: Add G_ASSERT_ALIGN hint instruction
Insert it for call return values only for now, which is the only case
the DAG handles also.
2022-01-12 18:20:58 -05:00
Alexey Lapshin 39385d4cd1 [CodeGen][Debuginfo][NFC] Refactor DIE values SizeOf method to not depend on AsmPrinter.
SizeOf() method of DIE values(unsigned SizeOf(const AsmPrinter *AP, dwarf::Form Form) const)
depends on AsmPrinter. AsmPrinter is too specific class here. This patch removes dependency
on AsmPrinter and use dwarf::FormParams structure instead. It allows calculate DIE values
size without using AsmPrinter. That refactoring is useful for D96035([dsymutil][DWARFlinker]
implement separate multi-thread processing for compile units.)

Differential Revision: https://reviews.llvm.org/D116997
2022-01-12 13:15:26 +03:00
Petar Avramovic 29f88b93fd [GlobalISel] Rework more/fewer elements for vectors
Artifact combiner is not able to access individual elements after using
LCMTy style merge/unmerge, extract and insert to change vector number of
elements (pad with undef or split to sub-vector instructions).
Use unmerge to individual elements instead and then merge elements into
requested types.
Change argument lowering for vectors and moreElementsVector to use
buildPadVectorWithUndefElements and buildDeleteTrailingVectorElements.
FewerElementsVector had a few helpers that had different behavior,
introduce new helper for most of the opcodes.
FewerElementsVector helper is more flexible since it can create leftover
instruction smaller then requested type (useful in case target wants to
avoid pad with undef and use fewer registers). If target does not want
leftover of different type it should call more elements first.
Some helpers were performing more elements first to have split without
leftover. Opcodes that used this helper use clampMaxNumElementsStrict
(does more elements first) in LegalizerInfo to avoid test changes.
Fixes failures caused by failing to combine artifacts created during
more/fewer elements vector.

Differential Revision: https://reviews.llvm.org/D114198
2021-12-23 14:30:02 +01:00
Jay Foad 17006033f9 [GlobalISel] Verify operand types for G_SHL, G_LSHR, G_ASHR
Differential Revision: https://reviews.llvm.org/D115868
2021-12-21 11:59:33 +00:00
Simon Pilgrim fd722c5959 Fix MSVC "not all control paths return a value" warning. NFC. 2021-12-07 18:09:44 +00:00
Mircea Trofin fa99cb64ff [mlgo][regalloc] Add score calculation for training
Add the calculation of a score, which will be used during ML training. The
score qualifies the quality of a regalloc policy, and is independent of
what we train (currently, just eviction), or the regalloc algo itself.
We can then use scores to guide training (which happens offline), by
formulating a reward based on score variation - the goal being lowering
scores (currently, that reward is percentage reduction relative to
Greedy's heuristic)

Currently, we compute the score by factoring different instruction
counts (loads, stores, etc) with the machine basic block frequency,
regardless of the instructions' provenance - i.e. they could be due to
the regalloc policy or be introduced previously. This is different from
RAGreedy::reportStats, which accummulates the effects of the allocator
alone. We explored this alternative but found (at least currently) that
the more naive alternative introduced here produces better policies. We
do intend to consolidate the two, however, as we are actively
investigating improvements to our reward function, and will likely want
to re-explore scoring just the effects of the allocator.

In either case, we want to decouple score calculation from allocation
algorighm, as we currently evaluate it after a few more passes after
allocation (also, because score calculation should be reusable
regardless of allocation algorithm).

We intentionally accummulate counts independently because it facilitates
per-block reporting, which we found useful for debugging - for instance,
we can easily report the counts indepdently, and then cross-reference
with perf counter measurements.

Differential Revision: https://reviews.llvm.org/D115195
2021-12-07 09:00:27 -08:00
Jeremy Morse 8dda516b83 [DebugInfo][InstrRef] Avoid dropping fragment info during PHI elimination
InstrRefBasedLDV used to crash on the added test -- the exit block is not
in scope for the variable being propagated, but is still considered because
it contains an assignment. The failure-mode was vlocJoin ignoring
assign-only blocks and not updating DIExpressions, but pickVPHILoc would
still find a variable location for it. That led to DBG_VALUEs created with
the wrong fragment information.

Fix this by removing a filter inherited from VarLocBasedLDV: vlocJoin will
now consider assign-only blocks and will update their expressions.

Differential Revision: https://reviews.llvm.org/D114727
2021-11-30 11:32:31 +00:00
Abinav Puthan Purayil bc5dbb0bae [GlobalISel] Add matchers for constant splat.
This change exposes isBuildVectorConstantSplat() to the llvm namespace
and uses it to implement the constant splat versions of
m_SpecificICst().

CombinerHelper::matchOrShiftToFunnelShift() can now work with vector
types and CombinerHelper::matchMulOBy2()'s match for a constant splat is
simplified.

Differential Revision: https://reviews.llvm.org/D114625
2021-11-30 15:18:50 +05:30
Jeremy Morse 0eee844539 [DebugInfo][InstrRef] Terminate overlapping variable fragments
If we have a variable where its fragments are split into overlapping
segments:

    DBG_VALUE $ax, $noreg, !123, !DIExpression(DW_OP_LLVM_fragment_0, 16)
    ...
    DBG_VALUE $eax, $noreg, !123, !DIExpression(DW_OP_LLVM_fragment_0, 32)

we should only propagate the most recently assigned fragment out of a
block. LiveDebugValues only deals with live-in variable locations, as
overlaps within blocks is DbgEntityHistoryCalculators domain.

InstrRefBasedLDV has kept the accumulateFragmentMap method from
VarLocBasedLDV, we just need it to recognise DBG_INSTR_REFs. Once it's
produced a mapping of variable / fragments to the overlapped variable /
fragments, VLocTracker uses it to identify when a debug instruction needs
to terminate the other parts it overlaps with. The test is updated for
some standard "InstrRef picks different registers" variation, and the
order of some unrelated DBG_VALUEs changes.

Differential Revision: https://reviews.llvm.org/D114603
2021-11-29 23:37:20 +00:00
Amara Emerson dc84770d55 [GlobalISel] Add a store-merging optimization pass and enable for AArch64.
This is a first attempt at a constant value consecutive store merging pass,
a counterpart to the DAGCombiner's store merging optimization.

The high level goals of this pass:

* Have a simple and efficient algorithm. As close to linear time as we can get.
  Thus, prioritizing scalability of the algorithm over merging every corner case
  we can find. The DAGCombiner's store merging code has been the source of
  compile time and complexity issues in the past and I wanted to avoid that.
* Don't introduce any new data structures for ordering memory operations. In MIR,
  we don't have the concept of chains like we do in the DAG, and the instruction
  order is stricter than enforcing ordering with graph edges. Although I
  considered adding something similar, I couldn't justify the overhead.

The pass is current split into 3 main parts. The main store merging code focuses
on identifying candidate stores and managing the candidate group that's under
consideration for merging. Analyzing addressing of stores is a potentially
complex part and for now there's just a basic implementation to identify easy
cases. Finally, the other main bit of complexity is the alias analysis, which
tries to follow the same logic as the DAG's AA.

Currently this implementation only supports merging of constant stores. Stores
of arbitrary variables are technically possible with a very small change, but
the DAG chooses not to do this. Doing so here makes most code worse since
there's extra overhead in merging values into wider registers.

On AArch64 -Os, this optimization results in very minor savings on CTMark.

Differential Revision: https://reviews.llvm.org/D109131
2021-11-15 21:10:39 -08:00
Simon Pilgrim d391e4fe84 [X86] Update RET/LRET instruction to use the same naming convention as IRET (PR36876). NFC
Be more consistent in the naming convention for the various RET instructions to specify in terms of bitwidth.

Helps prevent future scheduler model mismatches like those that were only addressed in D44687.

Differential Revision: https://reviews.llvm.org/D113302
2021-11-07 15:06:54 +00:00
Jeremy Morse c99fdd456f [DebugInfo][NFC] Initialize a new object field in unittests
Over in e7084ceab3 the InstrRefBasedLDV class grew a MachineRegisterInfo
pointer to lookup register sizes -- however, that field wasn't initialized
in the corresponding unit tests. This patch initializes it!

Fixes a buildbot failure reported on D112006
2021-10-27 14:29:43 +01:00
Jeremy Morse 4136897bd4 [DebugInfo][InstrRef][NFC] Switch to using DenseMaps and similar
There are a few STL containers hanging around that can become DenseMaps,
SmallVectors and similar. This recovers a modest amount of compile time
performance.

While I'm here, adjust the bit layout of ValueIDNum: this was always
supposed to act like a value type, however it seems that clang doesn't
compile the comparison functions to act that way. Add a uint64_t to a
union that explicitly aliases the bitfields, so that we can compare the
whole value as a single integer.

Differential Revision: https://reviews.llvm.org/D112333
2021-10-25 18:07:17 +01:00
Jeremy Morse 97ddf49e43 [DebugInfo][InstrRef] Recover stack-slot tracking performance
This patch is like D111627 -- instead of calculating IDF for every location
on the stack, only do it for the smallest units of interference, and copy
the PHIs for those units to any aliases.

The test added runs placeMLocPHIs directly, and tests that:
 * A def of the lower 8 bits of a stack slot causes all aliasing regs to
   have PHIs placed,
 * It doesn't cause the equivalent location to x86's $ah, which isn't
   aliased, to have a PHI placed.

Differential Revision: https://reviews.llvm.org/D112324
2021-10-25 17:31:09 +01:00
Jeremy Morse e7084ceab3 [DebugInfo][Instr] Track subregisters across stack spills/restores
Sometimes we generate code that writes to a subregister, then spills /
restores a super-register to the stack, for example:

    $eax = MOV32ri 0
    MOV64mr $rsp, 1, $noreg, 16, $noreg, $rax
    $rcx = MOV64rm $rsp, 1, $noreg, 8, $noreg

This patch takes a different approach: it adds another index to
MLocTracker that identifies a size/offset within a stack slot. A location
on the stack is then a pari of {FrameIndex, SlotNum}. Spilling and
restoring now involves pairing up the src/dest register numbers, and the
dest/src stack position to be transferred to/from. Location coverage
improves as a result, compile-time performance decreases, alas.

One limitation is that if a PHI occurs inside a stack slot:

    DBG_PHI %stack.0, 1

We don't know how large the resulting value is, and so might have
difficulty picking which value to use. DBG_PHI might need to be augmented
in the future with such a size.

Unit tests added ensure that spills and restores correctly transfer to
positions in the Location => Value map, and that different register classes
written to the stack will correctly clobber all other positions in the
stack slot.

Differential Revision: https://reviews.llvm.org/D112133
2021-10-22 19:20:55 +01:00
Jeremy Morse d9eebe3cd7 [DebugInfo][InstrRef] Add unit tests for transfer-function building
This patch adds some unit tests for the machine-location transfer-function
building parts of InstrRefBasedLDV: i.e., test that if we feed some MIR
into the transfer-function building code, does it create the correct
transfer function.

There are a number of minor defects that get corrected in the process:
 * The unit test was selecting the x86 (i.e. 32 bit) backend rather than
   x86_64's 64 bit backend,
 * COPY instructions weren't actually having their subregister values
   correctly represented in the transfer function. Subregisters were being
   defined by the COPY, rather than taking the value in the source register.
 * SP aliases were at risk of being clobbered, if an SP subregister was
   clobbered.

Differential Revision: https://reviews.llvm.org/D112006
2021-10-22 18:29:03 +01:00
Jeremy Morse 89950ade21 [DebugInfo][InstrRef] Track a single variable at a time
Here's another performance patch for InstrRefBasedLDV: rather than
processing all variable values in a scope at a time, instead, process one
variable at a time. The benefits are twofold:
 * It's easier to reason about one variable at a time in your mind,
 * It improves performance, apparently from increased locality.

The downside is that the value-propagation code gets indented one level
further, plus there's some churn in the unit tests.

Differential Revision: https://reviews.llvm.org/D111799
2021-10-20 15:03:52 +01:00
Jeremy Morse 849b17949f [DebugInfo][InstrRef] Avoid un-necessary densemap copies and comparisons
This is purely a performance patch: InstrRefBasedLDV used to use three
DenseMaps to store variable values, two for long term storage and one as a
working set. This patch eliminates the working set, and updates the long
term storage in place, thus avoiding two DenseMap comparisons and two
DenseMap assignments, which can be expensive.

Differential Revision: https://reviews.llvm.org/D111716
2021-10-19 11:10:14 +01:00
Luke Benes 9da51402f4 [DebugInfo][InstrRef] Fix Wdangling-else warning in InstrRefLDVTest
Fix a dangling else that gcc-11 warned about. The EXPECT_EQ macro
expands to an if-else, so the whole construction contains a hidden
dangling else.

Differential Revision: https://reviews.llvm.org/D112044
2021-10-19 10:17:57 +01:00
Jeremy Morse b5426ced71 [DebugInfo][InstrRef] Place variable-values PHI using LLVM utilities
This patch is very similar to D110173 / a3936a6c19, but for variable
values rather than machine values. This is for the second instr-ref
problem, calculating the correct variable value on entry to each block.
The previous lattice based implementation was broken; we now use LLVMs
existing PHI placement utilities to work out where values need to merge,
then eliminate un-necessary ones through value propagation.

Most of the deletions here happen in vlocJoin: it was trying to pick a
location for PHIs to happen in, badly, leading to an infinite loop in the
MIR test added, where it would repeatedly switch between register
locations. The new approach is simpler: either PHIs can be eliminated, or
they can't, and the location of the value is a different problem.

Various bits and pieces move to the header so that they can be tested in
the unit tests. The DbgValue class grows a "VPHI" kind to represent
variable value PHIS that haven't been eliminated yet.

Differential Revision: https://reviews.llvm.org/D110630
2021-10-14 14:43:43 +01:00
Jeremy Morse a3936a6c19 [DebugInfo][InstrRef] Use PHI placement utilities for machine locations
InstrRefBasedLDV used to try and determine which values are in which
registers using a lattice approach; however this is hard to understand, and
broken in various ways. This patch replaces that approach with a standard
SSA approach using existing LLVM utilities. PHIs are placed at dominance
frontiers; value propagation then eliminates un-necessary PHIs.

This patch also adds a bunch of unit tests that should cover many of the
weirder forms of control flow.

Differential Revision: https://reviews.llvm.org/D110173
2021-10-13 12:49:04 +01:00
Jeremy Morse 838b4a533e [DebugInfo][NFC] Move LiveDebugValues class to header
This patch shifts the InstrRefBasedLDV class declaration to a header.
Partially because it's already massive, but mostly so that I can start
writing some unit tests for it. This patch also adds the boilerplate for
said unit tests.

Differential Revision: https://reviews.llvm.org/D110165
2021-10-12 16:07:26 +01:00
Reid Kleckner 89b57061f7 Move TargetRegistry.(h|cpp) from Support to MC
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.

This allows us to ensure that Support doesn't have includes from MC/*.

Differential Revision: https://reviews.llvm.org/D111454
2021-10-08 14:51:48 -07:00
Amara Emerson 08b3c0d995 [GlobalISel] Combine G_UMULH x, (1 << c)) -> x >> (bitwidth - c)
In order to not generate an unnecessary G_CTLZ, I extended the constant folder
in the CSEMIRBuilder to handle G_CTLZ. I also added some extra handing of
vector constants too. It seems we don't have any support for doing constant
folding of vector constants, so the tests show some other useless G_SUB
instructions too.

Differential Revision: https://reviews.llvm.org/D111036
2021-10-07 23:51:37 -07:00
Jack Andersen bd4dad87f4 [MachineInstr] Move MIParser's DBG_VALUE RegState::Debug invariant into MachineInstr::addOperand
Based on the reasoning of D53903, register operands of DBG_VALUE are
invariably treated as RegState::Debug operands. This change enforces
this invariant as part of MachineInstr::addOperand so that all passes
emit this flag consistently.

RegState::Debug is inconsistently set on DBG_VALUE registers throughout
LLVM. This runs the risk of a filtering iterator like
MachineRegisterInfo::reg_nodbg_iterator to process these operands
erroneously when not parsed from MIR sources.

This issue was observed in the development of the llvm-mos fork which
adds a backend that relies on physical register operands much more than
existing targets. Physical RegUnit 0 has the same numeric encoding as
$noreg (indicating an undef for DBG_VALUE). Allowing debug operands into
the machine scheduler correlates $noreg with RegUnit 0 (i.e. a collision
of register numbers with different zero semantics). Eventually, this
causes an assert where DBG_VALUE instructions are prohibited from
participating in live register ranges.

Reviewed By: MatzeB, StephenTozer

Differential Revision: https://reviews.llvm.org/D110105
2021-10-07 16:08:52 +01:00
Bjorn Pettersson 8ed0e6b2cf [SelectionDAG] Replace error prone index check in BaseIndexOffset::computeAliasing
Deriving NoAlias based on having the same index in two BaseIndexOffset
expressions seemed weird (and as shown in the added unittest the
correctness of doing so depended on undocumented pre-conditions that
the user of BaseIndexOffset::computeAliasing would need to take care
of.

This patch removes the code that dereived NoAlias based on indices
being the same. As a compensation, to avoid regressions/diffs in
various lit test, we also add a new check. The new check derives
NoAlias in case the two base pointers are based on two different
GlobalValue:s (neither of them being a GlobalAlias).

Reviewed By: niravd

Differential Revision: https://reviews.llvm.org/D110256
2021-10-05 12:15:55 +02:00
Bjorn Pettersson 1896fb2cff [SelectionDAG] Assume that a GlobalAlias may alias other global values
This fixes a bug detected in DAGCombiner when using global alias
variables. Here is an example:
  @foo = global i16 0, align 1
  @aliasFoo = alias i16, i16 * @foo
  define i16 @bar() {
    ...
    store i16 7, i16 * @foo, align 1
    store i16 8, i16 * @aliasFoo, align 1
    ...
  }

BaseIndexOffset::computeAliasing would incorrectly derive NoAlias
for the two accesses in the example above, resulting in DAGCombiner
miscompiles.

This patch fixes the problem by a defensive approach letting
BaseIndexOffset::computeAliasing return false, i.e. that the aliasing
couldn't be determined, when comparing two global values and at least
one is a GlobalAlias. In the future we might improve this with a
deeper analysis to look at the aliasee for the GlobalAlias etc. But
that is a bit more complicated considering that we could have
'local_unnamed_addr' and situations with several 'alias' variables.

Fixes PR51878.

Differential Revision: https://reviews.llvm.org/D110064
2021-10-05 12:15:55 +02:00
Amara Emerson 8bde5e58c0 Delay outgoing register assignments to last.
The delayed stack protector feature which is currently used for SDAG (and thus
allows for more commonly generating tail calls) depends on being able to extract
the tail call into a separate return block. To do this it also has to extract
the vreg->physreg copies that set up the call's arguments, since if it doesn't
then the call inst ends up using undefined physregs in it's new spliced block.

SelectionDAG implementations can do this because they delay emitting register
copies until  *after* the stack arguments are set up. GISel however just
processes and emits the arguments in IR order, so stack arguments always end up
last, and thus this breaks the code that looks for any register arg copies that
precede the call instruction.

This patch adds a thunk argument to the assignValueToReg() and custom assignment
hooks. For outgoing arguments, register assignments use this return param to
return a thunk that does the actual generating of the copies. We collect these
until all the outgoing stack assignments have been done and then execute them,
so that the copies (and perhaps some artifacts like G_SEXTs) are placed after
any stores.

Differential Revision: https://reviews.llvm.org/D110610
2021-10-04 12:33:20 -07:00
Petar Avramovic 8bc7185668 GlobalISel/Utils: Refactor constant splat match functions
Add generic helper function that matches constant splat. It has option to
match constant splat with undef (some elements can be undef but not all).
Add util function and matcher for G_FCONSTANT splat.

Differential Revision: https://reviews.llvm.org/D104410
2021-09-21 12:09:35 +02:00
Petar Avramovic d477a7c2e7 GlobalISel/Utils: Refactor integer/float constant match functions
Rework getConstantstVRegValWithLookThrough in order to make it clear if we
are matching integer/float constant only or any constant(default).
Add helper functions that get DefVReg and APInt/APFloat from constant instr
getIConstantVRegValWithLookThrough: integer constant, only G_CONSTANT
getFConstantVRegValWithLookThrough: float constant, only G_FCONSTANT
getAnyConstantVRegValWithLookThrough: either G_CONSTANT or G_FCONSTANT

Rename getConstantVRegVal and getConstantVRegSExtVal to getIConstantVRegVal
and getIConstantVRegSExtVal. These now only match G_CONSTANT as described
in comment.

Relevant matchers now return both DefVReg and APInt/APFloat.

Replace existing uses of getConstantstVRegValWithLookThrough and
getConstantVRegVal with new helper functions. Any constant match is
only required in:
ConstantFoldBinOp: for constant argument that was bit-cast of float to int
getAArch64VectorSplat: AArch64::G_DUP operands can be any constant
amdgpu select for G_BUILD_VECTOR_TRUNC: operands can be any constant

In other places use integer only constant match.

Differential Revision: https://reviews.llvm.org/D104409
2021-09-17 11:22:13 +02:00
Chris Lattner 735f46715d [APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`.  This achieves two things:

1) This starts standardizing predicates across the LLVM codebase,
   following (in this case) ConstantInt.  The word "Value" doesn't
   convey anything of merit, and is missing in some of the other things.

2) Calling an integer "null" doesn't make any sense.  The original sin
   here is mine and I've regretted it for years.  This moves us to calling
   it "zero" instead, which is correct!

APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go.  As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.

Included in this patch are changes to a bunch of the codebase, but there are
more.  We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.

Differential Revision: https://reviews.llvm.org/D109483
2021-09-09 09:50:24 -07:00
Petar Avramovic 2bf4eeeeb6 [GlobalISel] Avoid creating COPY in LegalizationArtifactCombiner
When Src and Dst used in buildAnyExtOrTrunc or buildSExtOrTrunc
have the same type (creates COPY) use Src register directly or
use replaceRegOrBuildCopy instead.

Differential Revision: https://reviews.llvm.org/D108306
2021-08-24 11:09:56 +02:00
Konstantin Schwarz 64bef13f08 [GlobalISel] Look through truncs and extends in narrowScalarShift
If a G_SHL is fed by a G_CONSTANT, the lower and upper bits of the source can be
shifted individually by the constant shift amount.

However in case the shift amount came from a G_TRUNC(G_CONSTANT), the generic shift legalization
code was used, producing intermediate shifts that are potentially illegal on some targets.

This change teaches narrowScalarShift to look through G_TRUNCs and G_*EXTs.

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D89100
2021-08-10 13:49:22 +02:00
Jon Roelofs eae4a44c1d [GlobalISel][KnownBits] Implement G_CTPOP
Implementation copied almost verbatim from ValueTracking.

Differential revision: https://reviews.llvm.org/D107606
2021-08-06 09:48:39 -07:00
Jay Foad 57b9107e3f [GlobalISel] Improve widening of cttz/cttz_zero_undef
Differential Revision: https://reviews.llvm.org/D107631
2021-08-06 14:25:56 +01:00