Commit Graph

67 Commits

Author SHA1 Message Date
Mircea Trofin 5dc47541f9 [NFC] Use Register/MCRegister
Differential Revision: https://reviews.llvm.org/D90724
2020-11-04 12:20:17 -08:00
Jay Foad 58de4b2053 [AMDGPU] Use pseudo instructions for readlane/writelane
This reverts r227987 "R600/SI: Determine target-specific encoding of READLANE and WRITELANE early v2".

All the codegen changes are caused by the post-RA scheduler no longer
treating readlane/writelane as scheduling barriers due to having
unmodelled side effects. (The pseudos are hasSideEffects = 0, but the
real instructions are hasSideEffects = ? which TableGen conservatively
treats as 1.)

Differential Revision: https://reviews.llvm.org/D90401
2020-10-29 16:00:53 +00:00
Jay Foad 69f5105f5c [AMDGPU] Simplify insertNoops functions. NFC. 2020-10-29 10:55:20 +00:00
Austin Kerbow de51867343 [AMDGPU] Add Reset function to GCNHazardRecognizer
Reset the tracked emitted instructions when starting scheduling on a new
region.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D90347
2020-10-28 16:32:32 -07:00
Austin Kerbow 8b127a8661 [AMDGPU] Fix inserting combined s_nop in bundles
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D90334
2020-10-28 14:34:04 -07:00
Austin Kerbow ebdcef20ce [AMDGPU] Avoid inserting noops during scheduling
Passes that are run after the post-RA scheduler may insert instructions like
waitcnt which eliminate the need for certain noops. After this patch the
scheduler is still aware of possible latency from hazards but noops will
not be inserted until the dedicated hazard recognizer pass is run.

Depends on D89753.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D89754
2020-10-20 17:11:36 -07:00
Austin Kerbow a4f35ab232 [AMDGPU] Fix mai hazard VALU to LD/ST
Fixes: SWDEV-251863

Differential Revision: https://reviews.llvm.org/D89079
2020-10-08 17:13:02 -07:00
Jay Foad 90777e2924 [AMDGPU] Enable scheduling around FP MODE-setting instructions
Pre-gfx10 all MODE-setting instructions were S_SETREG_B32 which is
marked as having unmodeled side effects, which makes the machine
scheduler treat it as a barrier. Now that we have proper implicit $mode
operands we can use a no-side-effects S_SETREG_B32_mode pseudo instead
for setregs that only touch the FP MODE bits, to give the scheduler more
freedom.

Differential Revision: https://reviews.llvm.org/D87446
2020-09-16 16:10:47 +01:00
Matt Arsenault 85490874b2 AMDGPU: Skip all meta instructions in hazard recognizer
This was not adding a necessary nop due to thinking the kill counted.
2020-09-09 19:45:40 -04:00
Stanislav Mekhanoshin 43a38dc251 [AMDGPU] Fix MAI ld/st hazard handling
It did not process hazard for ds_permute because it does not
load or store even though it is DS.

Differential Revision: https://reviews.llvm.org/D86003
2020-08-14 17:07:37 -07:00
Stanislav Mekhanoshin decfdb8ce3 [AMDGPU] Fixed formatting in GCNHazardRecognizer.cpp. NFC. 2020-07-29 12:21:28 -07:00
Stanislav Mekhanoshin 13b63be472 [AMDGPU] prefer non-mfma in post-RA schedule
MFMA instructions shall not be scheduled back to back
to avoid MAI SIMD stall. Tell post-RA schedule we would
prefer some other instruction instead.

Differential Revision: https://reviews.llvm.org/D84883
2020-07-29 12:17:50 -07:00
Dmitry Preobrazhensky 2e87acac9b [AMDGPU] Removed s_mov_regrd and mov_fed opcodes
These opcodes are not intended for public use.

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D81659
2020-07-17 19:52:54 +03:00
Carl Ritson 5bf2a9dd40 [AMDGPU] Update VMEM scalar write hazard mitigation sequence
Using s_waitcnt_depctr 0xffe3 is potentially faster than v_nop.

Reviewed By: rampitec, foad

Differential Revision: https://reviews.llvm.org/D83872
2020-07-16 11:37:45 +09:00
Jay Foad 29067aac46 [AMDGPU] Don't implement GCNHazardRecognizer::PreEmitNoops(SUnit *)
When called from the post-RA scheduler, hazards have already been
handled by getHazardType returning NoopHazard, so PreEmitNoops always
returns zero. Remove it. NFC.

Historical note: PreEmitNoops was added to the hazard recognizer
interface as an optional feature to support dispatch group formation on
the POWER target:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20131202/197470.html
So it seems right that we shouldn't need to implement it.

We do still implement the other overload PreEmitNoops(MachineInstr *)
because that is used by the PostRAHazardRecognizer pass.

Differential Revision: https://reviews.llvm.org/D79476
2020-05-06 16:11:19 +01:00
Jay Foad 3d76824b7f [AMDGPU] Better support for VMEM soft clauses in GCNHazardRecognizer
VMEM soft clauses only contain VMEM and FLAT instructions. Teaching
GCNHazardRecognizer::checkSoftClauseHazards that other kinds of
instructions will naturally break the clause means there are far fewer
cases where it has to insert an s_nop instruction to forcibly break the
clause.

Differential Revision: https://reviews.llvm.org/D79353
2020-05-05 15:49:09 +01:00
Jay Foad c5c935ab66 Make more use of MachineInstr::mayLoadOrStore. 2019-12-19 11:51:52 +00:00
Austin Kerbow 2f41a023af AMDGPU: Fix SMEM WAR hazard for gfx10 readlane
Summary: Hazard recognizer fails to see hazard with V_READLANE_B32_gfx10.

Reviewers: rampitec

Reviewed By: rampitec

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69172

llvm-svn: 375265
2019-10-18 18:20:30 +00:00
Daniel Sanders 0c47611131 Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).

Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor

Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&

Depends on D65919

Reviewers: arsenm, bogner, craig.topper, RKSimon

Reviewed By: arsenm

Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65962

llvm-svn: 369041
2019-08-15 19:22:08 +00:00
Nicolai Haehnle 8b7041a5c6 AMDGPU/GFX10: Apply the VMEM-to-scalar-write hazard also to writes to EXEC
Summary: Change-Id: I854fbf7d48e937bef9f8f3f5d0c8aeb970652630

Reviewers: rampitec, mareko

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64807

Change-Id: I4405b3a7f84186acea5a78d291bff71056e745fc
llvm-svn: 366314
2019-07-17 11:22:57 +00:00
Stanislav Mekhanoshin 7d2019bb96 [AMDGPU] gfx908 hazard recognizer
Differential Revision: https://reviews.llvm.org/D64593

llvm-svn: 365829
2019-07-11 21:30:34 +00:00
Stanislav Mekhanoshin bdf7f81b89 [AMDGPU] hazard recognizer for fp atomic to s_denorm_mode
This requires 3 wait states unless there is a wait or VALU in
between.

Differential Revision: https://reviews.llvm.org/D63619

llvm-svn: 364074
2019-06-21 16:30:14 +00:00
Matt Arsenault e4c2e9b016 AMDGPU: Consolidate some getGeneration checks
This is incomplete, and ideally these would all be removed, but it's
better to localize them to the subtarget first with comments about
what they're for.

llvm-svn: 363902
2019-06-19 23:54:58 +00:00
Stanislav Mekhanoshin 5f581c9f08 [AMDGPU] gfx1010 premlane instructions
Differential Revision: https://reviews.llvm.org/D63202

llvm-svn: 363185
2019-06-12 17:52:51 +00:00
Carl Ritson 34e95ce259 [AMDGPU] gfx1010 Avoid SMEM WAR hazard for some s_waitcnt values
Summary:
Avoid introducing hazard mitigation when lgkmcnt is reduced to 0.
Clarify code comments to explain assumptions made for this hazard
mitigation.  Expand and correct test cases to cover variants of
s_waitcnt.

Reviewers: nhaehnle, rampitec

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62058

llvm-svn: 361124
2019-05-20 07:20:12 +00:00
Austin Kerbow 8a3d3a9af6 [AMDGPU] Check MI bundles for hazards
Summary: GCNHazardRecognizer fails to identify hazards that are in and around bundles. This patch allows the hazard recognizer to consider bundled instructions in both scheduler and hazard recognizer mode. We ignore “bundledness” for the purpose of detecting hazards and examine the instructions individually.

Reviewers: arsenm, msearles, rampitec

Reviewed By: rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61564

llvm-svn: 360199
2019-05-07 22:12:15 +00:00
Stanislav Mekhanoshin 5ddd564e19 [AMDGPU] Fixed asan error after D61536
llvm-svn: 359963
2019-05-04 06:40:20 +00:00
Stanislav Mekhanoshin 51d1415a16 AMDGPU] gfx1010 hazard recognizer
Differential Revision: https://reviews.llvm.org/D61536

llvm-svn: 359961
2019-05-04 04:30:57 +00:00
Matt Arsenault 807bedab2e AMDGPU: Remove unnecessary subtarget get
llvm-svn: 357542
2019-04-03 00:01:05 +00:00
David Stuttard 81eec58a0d [AMDGPU] Omit KILL instructions from hazard recognizer
Summary:
In some cases the KILL was causing a hazard to be introduced as these were
scheduled into hazard slots, but don't result in an instruction.

KILL shouldn't be considered for hazard recognition.

Change-Id: Ib6d2a2160f8c94cd0ce611ab198c7e4f46aeffcf

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58898

llvm-svn: 355384
2019-03-05 10:25:16 +00:00
Stanislav Mekhanoshin f92ed6966e [AMDGPU] Fixed hazard recognizer to walk predecessors
Fixes two problems with GCNHazardRecognizer:
1. It only scans up to 5 instructions emitted earlier.
2. It does not take control flow into account. An earlier instruction
from the previous basic block is not necessarily a predecessor.
At the same time a real predecessor block is not scanned.

The patch provides a way to distinguish between scheduler and
hazard recognizer mode. It is OK to work with emitted instructions
in the scheduler because we do not really know what will be emitted
later and its order. However, when pass works as a hazard recognizer
the schedule is already finalized, and we have full access to the
instructions for the whole function, so we can properly traverse
predecessors and their instructions.

Differential Revision: https://reviews.llvm.org/D56923

llvm-svn: 351759
2019-01-21 19:11:26 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Marek Olsak c5cec5e1fa AMDGPU: Add llvm.amdgcn.ds.ordered.add & swap
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D52944

llvm-svn: 351351
2019-01-16 15:43:53 +00:00
Carl Ritson f898edd117 [AMDGPU] Prevent sequences of non-instructions disrupting GCNHazardRecognizer wait state counting
Summary:
This fixes a bug where a large number of implicit def instructions can fill the GCNHazardRecognizer lookahead buffer causing required NOPs to not be inserted.

Reviewers: nhaehnle, arsenm

Reviewed By: arsenm

Subscribers: sheredom, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D51726

Change-Id: Ie75338f94de704ee5816b05afd0c922c6748a95b
llvm-svn: 341798
2018-09-10 10:14:48 +00:00
Tom Stellard 5bfbae5cb1 AMDGPU: Refactor Subtarget classes
Summary:
This is a follow-up to r335942.
- Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget
- Rename AMDGPUCommonSubtarget to AMDGPUSubtarget
- Merge R600Subtarget::Generation and GCNSubtarget::Generation into
  AMDGPUSubtarget::Generation.

Reviewers: arsenm, jvesely

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D49037

llvm-svn: 336851
2018-07-11 20:59:01 +00:00
Tom Stellard 44b30b4537 AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers
Summary:
MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instuction
and register defintions, which are huge so we only want to include
them where needed.

This will also make it easier if we want to split the R600 and GCN
definitions into separate tablegenerated files.

I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h
because it uses some enums from the header to initialize default values
for the SIMachineFunction class, so I ended up having to remove includes of
SIMachineFunctionInfo.h from headers too.

Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D46272

llvm-svn: 332930
2018-05-22 02:03:23 +00:00
Shiva Chen 801bf7ebbe [DebugInfo] Examine all uses of isDebugValue() for debug instructions.
Because we create a new kind of debug instruction, DBG_LABEL, we need to
check all passes which use isDebugValue() to check MachineInstr is debug
instruction or not. When expelling debug instructions, we should expel
both DBG_VALUE and DBG_LABEL. So, I create a new function,
isDebugInstr(), in MachineInstr to check whether the MachineInstr is
debug instruction or not.

This patch has no new test case. I have run regression test and there is
no difference in regression test.

Differential Revision: https://reviews.llvm.org/D45342

Patch by Hsiangkai Wang.

llvm-svn: 331844
2018-05-09 02:42:00 +00:00
Mark Searles d29f24acfb [AMDGPU] Add GCNHazardRecognizer::checkInlineAsmHazards() and GCNHazardRecognizer::checkVALUHazardsHelper(). checkInlineAsmHazards() checks INLINEASM for hazards that we particularly care about (so not exhaustive); this patch adds a check for INLINEASM that defs vregs that hold data-to-be stored by immediately preceding store of more than 8 bytes. If the instr were not within an INLINEASM, this scenario would be handled by checkVALUHazard(). Add checkVALUHazardsHelper(), which will be called by both checkVALUHazards() and checkInlineAsmHazards().
Differential Revision: https://reviews.llvm.org/D40098

llvm-svn: 320083
2017-12-07 20:34:25 +00:00
Matt Arsenault a41351e37c AMDGPU: Move hazard avoidance out of waitcnt pass.
This is mostly moving VMEM clause breaking into
the hazard recognizer. Also move another hazard
currently handled in the waitcnt pass.

Also stops breaking clauses unless xnack is enabled.

llvm-svn: 318557
2017-11-17 21:35:32 +00:00
Matt Arsenault 4512d0a68b AMDGPU: Replace list of SMEM buffer opcodes
llvm-svn: 318506
2017-11-17 04:18:26 +00:00
Matt Arsenault 03c67d1eb2 AMDGPU: Fix breaking SMEM clauses
This was completely ignoring subregisters,
so was not very useful. Also only break them
if xnack is actually enabled.

llvm-svn: 318505
2017-11-17 04:18:24 +00:00
Marek Olsak 2232243863 AMDGPU: Handle s_buffer_load_dword hazard on SI
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D39171

llvm-svn: 316666
2017-10-26 14:43:02 +00:00
Nicolai Haehnle 523827145b AMDGPU: Make worst-case assumption about the wait states in inline assembly
Summary:
Mesa still uses a hack where empty inline assembly is used as a kind of
optimization barrier. This exposed a problem where not enough wait states
were inserted, because the hazard recognizer implicitly assumed that each
inline assembly "instruction" has at least one wait state.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D37205

llvm-svn: 312635
2017-09-06 13:50:13 +00:00
Nicolai Haehnle 75c98c365b AMDGPU: IMPLICIT_DEFs and DBG_VALUEs do not contribute to wait states
Summary:
This fixes a bug that was exposed on gfx9 in various
GL45-CTS.shaders.loops.*_iterations.select_iteration_count_fragment tests,
e.g. GL45-CTS.shaders.loops.do_while_uniform_iterations.select_iteration_count_fragment

Reviewers: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D36193

llvm-svn: 312337
2017-09-01 16:56:32 +00:00
Connor Abbott 00755362b9 [AMDGPU] Add missing hazard for DPP-after-EXEC-write
Summary:
Following the docs, we need at least 5 wait states between an EXEC write
and an instruction that uses DPP.

Reviewers: tstellar, arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D34849

llvm-svn: 310013
2017-08-04 01:09:43 +00:00
Chandler Carruth 6bda14b313 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

llvm-svn: 304787
2017-06-06 11:49:48 +00:00
Matt Arsenault 59ece95f6c AMDGPU: Fix broken condition in hazard recognizer
Fixes bug 32248.

llvm-svn: 298125
2017-03-17 21:36:28 +00:00
Simon Pilgrim 6778b8f715 Reverted unintended commit
llvm-svn: 297841
2017-03-15 14:47:30 +00:00
Simon Pilgrim 3804a12fc3 Fix Wint-in-bool-context warning (PR32248)
llvm-svn: 297840
2017-03-15 14:38:19 +00:00
Matt Arsenault e823d92f7f AMDGPU: Merge initial gfx9 support
llvm-svn: 295554
2017-02-18 18:29:53 +00:00