Commit Graph

429 Commits

Author SHA1 Message Date
Qunyan Mangus 12bae5f3e2 Remove duplicate fields in RAGreedy
RAGreedy has two fields of RegisterClassInfo, one called RCI and another RegClassInfo from its base class.
RCI is initialized without freezeReservedRegs first, while RegClassInfo does. Therefore, if reserved registers
information is changed between last time freezeReservedRegs is called and RAGreedy, it's not picked up by RCI.
Instead of having both fields in RAGreedy, remove RCI and use RegClassInfo instead. Also removed is the TRI field
which is present in its base class.

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D125926
2022-05-23 13:08:25 -07:00
Jay Foad 77480556c4 [RegAllocGreedy] New hook regClassPriorityTrumpsGlobalness
Add a new TargetRegisterInfo hook to allow targets to tweak the
priority of live ranges, so that AllocationPriority of the register
class will be treated as more important than whether the range is local
to a basic block or global. This is determined per-MachineFunction.

Differential Revision: https://reviews.llvm.org/D125102
2022-05-17 12:35:21 +01:00
Jay Foad 9ebbe25034 RegAllocGreedy: Common up part of the priority calculation. NFC. 2022-05-05 10:35:33 +01:00
Matt Arsenault 7714e03175 RegAllocGreedy: Allow last chance recolor to retry overlapping tuples
Last chance recoloring didn't try recoloring a done register with the
same class since it believed there was no point. This doesn't
necessarily apply if the members in that class overlap. Allow the
recoloring to proceed if the assigned interfering physical register
overlaps with the candidate register.

This avoids an allocation failure with overlapping tuples. This
testcase could be handled better, and I don't believe should reach
last chance recoloring. The failure only manifests with the mutually
unsatisfiable register hints to overlapping tuples. The earlier
assignment decisions probably should have figured out that using these
hints was a bad idea.
2022-04-25 17:07:17 -04:00
Matt Arsenault 681b9466c9 RegAllocGreedy: Remove redundant check for virtual registers
The set of interfering virtual registers obviously only includes
virtual registers.
2022-04-13 15:00:18 -04:00
serge-sans-paille fa5a4e1b95 [iwyu] Handle regressions in libLLVM header include
Running iwyu-diff on LLVM codebase since a96638e50e detected a few
regressions, fixing them.
2022-04-13 20:53:19 +02:00
Matt Arsenault eefed1dbf0 RegAllocGreedy: Roll back successful recolorings on failure
This is a replacement for the original fix attempted in
c46aab01c0.

This fixes "overlapping insert" assertion failures when trying to
unwind an unsuccessful recoloring attempt.

The problem would occur when there are multiple recoloring candidates
which recursively required recoloring. If one recoloring candidate was
successfully recolored at one level, and the next recoloring candidate
was unsuccessful, we would not roll back the first candidates
successful recoloring. The forgotten successful recoloring may have
been assigned to something that conflicts with a register that needs
to be restored in a parent recoloring attempt.

See the testcase added in issue48473 for a more concrete example with
explanation.
2022-04-12 19:02:48 -04:00
Max Kazantsev 9a2798c7a3 [CodeGen][NFC] Hoist budget check out of loop
Less computations & early exit if we know for sure that the limit will be exceeded.
2022-04-05 14:20:42 +07:00
Matt Arsenault 395f8ccfc9 RegAllocGreedy: Fix typo 2022-03-31 16:30:01 -04:00
Matt Arsenault 8d66603a48 Revert "RegAllocGreedy: Fix last chance recolor assert in impossible case"
This reverts commit c46aab01c0.

This evidently blocks compiling in some cases that used to work
before. I'm also not fully convinced this is the correct place to fix
this problem.
2022-03-17 13:12:01 -04:00
serge-sans-paille 989f1c72e0 Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169

after:  1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
2022-03-16 08:43:00 +01:00
Kazu Hirata 9286786e87 [CodeGen] Remove an unused variable introduced in D121128 2022-03-14 11:41:04 -07:00
Mircea Trofin 294eca35a0 [regalloc] Remove -consider-local-interval-cost
Discussed extensively on D98232. The functionality introduced in D35816
never worked correctly. In D98232, it was fixed, but, as it was
introducing a large compile-time regression, and the value of the
original patch was called into doubt, we disabled it by default
everywhere. A year later, it appears that caused no grief, so it seems
safe to remove the disabled code.

This should be accompanied by re-opening bug 26810.

Differential Revision: https://reviews.llvm.org/D121128
2022-03-14 10:49:16 -07:00
serge-sans-paille ed98c1b376 Cleanup includes: DebugInfo & CodeGen
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332
2022-03-12 17:26:40 +01:00
Nico Weber a278250b0f Revert "Cleanup codegen includes"
This reverts commit 7f230feeea.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10 07:59:22 -05:00
serge-sans-paille 7f230feeea Cleanup codegen includes
after:  1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169
2022-03-10 10:00:30 +01:00
Vasileios Porpodas 6f9640d6a3 [RegAlloc] Add a complexity limit in growRegion() to cap compilation time.
growRegion() does not scale in code with BBs with a very large number of edges.
In such code growRegion() becomes a compile-time bottleneck, consuming 60% of
the total compilation time.
This patch adds a limit to the complexity of growRegion() by incrementing a counter
in each iteration. We bail out once the limit is reached.

Differential Revision: https://reviews.llvm.org/D120752
2022-03-03 11:31:07 -08:00
Matt Arsenault c46aab01c0 RegAllocGreedy: Fix last chance recolor assert in impossible case
This example is not compilable without handling eviction of specific
subregisters. Last chance recoloring was deciding it could try
evicting an overlapping superregister, which doesn't help make any
progress. The LiveIntervalUnion would then assert due to an
overlapping / identical range when trying the new assignment.

Unfortunately this is also producing a verifier error after the
allocation fails. I've seen a number of these, and not sure if we
should just start deleting the function on error rather than trying to
figure out how to put together valid MIR.

I'm not super confident this is the right place to fix this. I also
have a number of failing testcases I need to fix by handling partial
evictions of superregisters.
2022-02-17 18:30:56 -05:00
Mircea Trofin 592f52de33 [nfc][regalloc] const LiveIntervals within the allocator
Once built, LiveIntervals are immutable. This patch captures that.

Differential Revision: https://reviews.llvm.org/D118918
2022-02-03 12:35:36 -08:00
Mircea Trofin 22d3bbdf4e [nfc][regalloc] Move DefaultEvictionAdvisor::* to RegAllocEvictionAdvisor.cpp
This is leftover from the advisor refactoring. Straight-forward copy and
paste.
2022-02-01 07:59:25 -08:00
Mircea Trofin d46305e22d [NFC][regalloc] Move evict advisor initialization before VRAI
This is because a subsequent patch will propose obtaining the VRAI from
the advisor, which will enable feature caching for the ML advisor, for
better compile time. Making this change first as it's both innocuous and
keeps the future patch to be reviewed small.
2022-01-31 14:04:59 -08:00
wangpc 8597458278 [regalloc] Fix assertion error when LiveInterval is empty
When evicting interference, it causes an asseertion error
since LiveIntervals::intervalIsInOneMBB assumes that input
is not empty.

This patch fixed bug mentioned in D118020.

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D118124
2022-01-26 14:06:57 +08:00
Mircea Trofin b191c1f0f9 [NFC][regalloc] Pull out some AllocationOrder/CostPerUseLimit eviction logic
We are reusing that logic in the ML implementation.

Differential Revision: https://reviews.llvm.org/D116075
2022-01-10 15:47:31 -08:00
Mircea Trofin e121269131 [NFC][regalloc] Pass RAGreedy to eviction adviser
This patch simplifies the interface between RAGreedy and the eviction
adviser by passing the allocator to the adviser, which allows the latter
to extract needed information as needed, rather than requiring it be passed
piecemeal at construction time (which would also complicate later
evolution).

Part of this, the patch also moves ExtraRegInfo back to RAGreedy. We
keep the encapsulation of ExtraRegInfo because it has benefits (e.g.
improved readability by abstracting access to the cascade info) and also
simpler re-initialization at regalloc pass re-entry time (we just flush
the Optional).

Differential Revision: https://reviews.llvm.org/D116669
2022-01-10 11:55:16 -08:00
Mircea Trofin c41610778b [NFC][regalloc] Introduce RegAllocGreedy.h
This was suggested in D114831. It should simplify the relation between
eviction advisor and the allocator, and simplify ingesting more features
tied to the internals of the allocator, in the future.

This change simply pulls out RAGreedy, places it in the llvm namespace,
and cleans up a bit the includes in the new header file.

Differential Revision: https://reviews.llvm.org/D116114
2022-01-04 08:04:55 -08:00
Mircea Trofin 09103807e7 [NFC][regalloc] Introduce the RegAllocEvictionAdvisorAnalysis
This patch introduces the eviction analysis and the eviction advisor,
the default implementation, and the scaffolding for introducing the
other implementations of the advisor.

Differential Revision: https://reviews.llvm.org/D115707
2021-12-16 17:56:46 -08:00
Mircea Trofin 657adcb077 [NFC][regalloc] Move ExtraRegInfo and related to LiveRangeStageManager
This would allow sharing the LiveRangeStageManager between different
RegAllocEvictionAdvisors. One scenario is for ML training, where we want
to capture what the default advisor would do, for bootstrapping (speeds
up training).

Differential Revision: https://reviews.llvm.org/D114831
2021-12-13 10:10:57 -08:00
Kazu Hirata ca2f53897a [CodeGen] Use range-based for loops (NFC) 2021-12-04 08:48:05 -08:00
Mircea Trofin a503cb00d1 [NFC][regalloc] Factor accesses to ExtraRegInfo
We'll move ExtraRegInfo to the RegAllocEvictionAdvisor subsequently.
This change prepares for that by factoring all accesses.

RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-November/153639.html

Differential Revision: https://reviews.llvm.org/D114759
2021-11-30 15:10:49 -08:00
Mircea Trofin e8b8304d76 [NFC][Regalloc] Split canEvictInterference into hint and general
There are 2 eviction queries. One is made by tryAssign, when it attempts to
free an interference occupying the hint of the candidate. The other is
during 'regular' interference resolution, where we scan over all
physical registers and try to see if we can evict live ranges in favor
of the candidate. We currently use the same logic in both cases, just
that the former never passes the cost to any subsequent query.
Technically, the 2 decisions could be implemented with different
policies.

This patch splits the 2.

RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-November/153639.html

Differential Revision: https://reviews.llvm.org/D114019
2021-11-29 16:04:03 -08:00
Mircea Trofin c6b9b702a0 [NFC][Regalloc] Factor out eviction decision from eviction attempt
This splits tryEvict into a const tryFindEvictionCandidate, which
attempts to find a candidate, and the actual eviction (should the former
be successful)

The newly introduced tryFindEvictionCandidate will move subsequently
into the RegAllocEvictionAdvisor.

RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-November/153639.html

Differential Revision: https://reviews.llvm.org/D113941
2021-11-16 10:50:23 -08:00
Mircea Trofin 19e6b730ce [NFC][Regalloc] Factor types that would be used by the eviction advisor
This is in prepartion of pulling the eviction decision-making into an
analysis pass, which would then allow swapping that decision making
process.

RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-November/153639.html

Differential Revision: https://reviews.llvm.org/D113929
2021-11-15 13:15:14 -08:00
Mircea Trofin 34f4fe3a90 [NFC][Regalloc] Ensure Query::interferingVRegs is accurate.
To correctly use Query, one had to first call collectInterferingVRegs to
pre-cache the query result, then call interferingVRegs. Failing the
former, interferingVRegs could be stale. This did cause a bug which was
addressed in D98232, but the underlying usability issue of the Query API
wasn't.

This patch addresses the latter by making collectInterferingVRegs an
implementation detail, and having interferingVRegs play both roles. One
side-effect of this is that interferingVRegs is not const anymore.

Differential Revision: https://reviews.llvm.org/D112882
2021-11-02 18:26:54 -07:00
Matt Arsenault 2875d3d484 RegAllocGreedy: Remove an unhelpful auto, and don't use a reference 2021-09-23 17:25:25 -04:00
Craig Topper d5c67bba62 [RegAlloc] Cast uint8_t to unsigned before printing it.
raw_ostream interprets uint8_t as wanting to print a character
with that ASCII value. In this case the uint8_t is an integer
that we want to print.
2021-09-23 08:49:44 -07:00
Matt Arsenault 87c00878d3 SplitKit: Remove decade old live interval hack
This was trying to fixup broken live intervals coming out of the
coalescer. The verifier is more complete now and no tests seem to fail
without this.
2021-09-15 17:35:59 -04:00
Matt Arsenault 4a36e96c3f RegAllocGreedy: Account for reserved registers in num regs heuristic
This simple heuristic uses the estimated live range length combined
with the number of registers in the class to switch which heuristic to
use. This was taking the raw number of registers in the class, even
though not all of them may be available. AMDGPU heavily relies on
dynamically reserved numbers of registers based on user attributes to
satisfy occupancy constraints, so the raw number is highly misleading.

There are still a few problems here. In the original testcase that
made me notice this, the live range size is incorrect after the
scheduler rearranges instructions, since the instructions don't have
the original InstrDist offsets. Additionally, I think it would be more
appropriate to use the number of disjointly allocatable registers in
the class. For the AMDGPU register tuples, there are a large number of
registers in each tuple class, but only a small fraction can actually
be allocated at the same time since they all overlap with each
other. It seems we do not have a query that corresponds to the number
of independently allocatable registers. Relatedly, I'm still debugging
some allocation failures where overlapping tuples seem to not be
handled correctly.

The test changes are mostly noise. There are a handful of x86 tests
that look like regressions with an additional spill, and a handful
that now avoid a spill. The worst looking regression is likely
test/Thumb2/mve-vld4.ll which introduces a few additional
spills. test/CodeGen/AMDGPU/soft-clause-exceeds-register-budget.ll
shows a massive improvement by completely eliminating a large number
of spills inside a loop.
2021-09-14 21:00:29 -04:00
Qiu Chaofan 5ca250a03d [RegAlloc] Remove addAllocPriorityToGlobalRanges hook
It was introduced in 1a6dc92 and only enabled on PowerPC/AMDGPU. That
should be enabled for all targets.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D108010
2021-08-18 10:21:27 +08:00
Matt Arsenault 1b41945da0 RegAllocGreedy: Add spaces between registers in debug message 2021-08-10 13:12:34 -04:00
Stefan Pintilie 1a6dc92be7 [PowerPC] Inefficient register allocation of ACC registers results in many copies.
ACC registers are a combination of four consecutive vector registers.
If the vector registers are assigned first this often forces a number
of copies to appear just before the ACC register is created. If the ACC
register is assigned first then fewer copies are generated when the vector
registers are assigned.

This patch tries to force the register allocator to assign the ACC registers first
and then the UACC registers and then the vector pair registers. It does this
by changing the priority of the register classes.

This patch also adds hints to help the register allocator assign UACC registers from
known ACC registers and vector pair registers from known UACC registers.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D105854
2021-07-20 10:53:40 -05:00
Matt Arsenault eebe841a47 RegAlloc: Allow targets to split register allocation
AMDGPU normally spills SGPRs to VGPRs. Previously, since all register
classes are handled at the same time, this was problematic. We don't
know ahead of time how many registers will be needed to be reserved to
handle the spilling. If no VGPRs were left for spilling, we would have
to try to spill to memory. If the spilled SGPRs were required for exec
mask manipulation, it is highly problematic because the lanes active
at the point of spill are not necessarily the same as at the restore
point.

Avoid this problem by fully allocating SGPRs in a separate regalloc
run from VGPRs. This way we know the exact number of VGPRs needed, and
can reserve them for a second run.  This fixes the most serious
issues, but it is still possible using inline asm to make all VGPRs
unavailable. Start erroring in the case where we ever would require
memory for an SGPR spill.

This is implemented by giving each regalloc pass a callback which
reports if a register class should be handled or not. A few passes
need some small changes to deal with leftover virtual registers.

In the AMDGPU implementation, a new pass is introduced to take the
place of PrologEpilogInserter for SGPR spills emitted during the first
run.

One disadvantage of this is currently StackSlotColoring is no longer
used for SGPR spills. It would need to be run again, which will
require more work.

Error if the standard -regalloc option is used. Introduce new separate
-sgpr-regalloc and -vgpr-regalloc flags, so the two runs can be
controlled individually. PBQB is not currently supported, so this also
prevents using the unhandled allocator.
2021-07-13 18:49:29 -04:00
Hongtao Yu 39ae5bf5c5 [CSSPGO] Fix an AV caused by a block that has only pseudo pseudo instructions.
Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D101415
2021-04-27 17:54:34 -07:00
Serguei Katkov 70193bdfc0 Re-land [GreedyRA ORE] Add Cost of spill locations into remark
Re-land the patch with a fix of clang test.

Cost of spill location is computed basing on relative branch frequency
where corresponding spill/reload/copy are located.

While the number itself is highly depends on incoming IR,
the total cost can be used when do some changes in RA.

Revert "Revert "[GreedyRA ORE] Add Cost of spill locations into remark""
This reverts commit 680f3d6de7.
2021-04-20 16:21:07 +07:00
Serguei Katkov 680f3d6de7 Revert "[GreedyRA ORE] Add Cost of spill locations into remark"
This reverts commit 328377307a.

This commit causes buildbot failures due to some clang tests are not updated.
Temporary revert to fix clang tests.
2021-04-20 11:08:24 +07:00
Serguei Katkov 328377307a [GreedyRA ORE] Add Cost of spill locations into remark
Cost of spill location is computed basing on relative branch frequency
where corresponding spill/reload/copy are located.

While the number itself is highly depends on incoming IR,
the total cost can be used when do some changes in RA.

Reviewers: reames, MatzeB, anemet, thegameg
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D100020
2021-04-20 10:47:22 +07:00
Hongtao Yu b98807df05 [CSSPGO] Exclude pseudo probes from slot index
Pseudo probe are currently given a slot index like other regular instructions. This affects register pressure and lifetime weight computation because of enlarged lifetime length with pseudo probe instructions. As a consequence, program could get different code generated w/ and w/o pseudo probes. I'm closing the gap by excluding pseudo probes from stack index and downstream register allocation related passes.

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D100334
2021-04-19 17:55:35 -07:00
Serguei Katkov 9f33943ee0 [GreedyRA ORE] Add stats for copy of virtual registers.
Greedy RA adds copies of virtual registers when splitting live interval.
This stat might be useful.

Reviewers: reames, MatzeB, anemet, thegameg
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D100017
2021-04-19 12:43:44 +07:00
Serguei Katkov cf0d3477aa [GreedyRA ORE] Separate Folder Reloads and Zero Cost Folder Reloads
Patchpoint instructions have operands which is actually zero cost
(or the same as register) to use the value from the stack.
In terms of statistic it makes same to separate them.

Move from computation instructions related to stack spill/reload to
number of stack slot referenced.

Reviewers: reames, MatzeB, anemet, thegameg
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D100016
2021-04-14 14:25:28 +07:00
Serguei Katkov c362179b0a [GreedyRA ORE] Add debug location for function level report
Reviewers: reames, MatzeB, anemet, thegameg
Reviewed By: thegameg
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D100168
2021-04-13 09:49:12 +07:00
Serguei Katkov f6e3b4fe58 [GreedyRA ORE] Re-factor computeNumberOfSplillsReloads.
Replace if-else to if-continue usage.
This simplifies further extension of the collected stats.
2021-04-09 12:44:11 +07:00