Commit Graph

9 Commits

Author SHA1 Message Date
Stanislav Mekhanoshin 555d8f4ef5 [AMDGPU] Bundle loads before post-RA scheduler
We are relying on atrificial DAG edges inserted by the
MemOpClusterMutation to keep loads and stores together in the
post-RA scheduler. This does not work all the time since it
allows to schedule a completely independent instruction in the
middle of the cluster.

Removed the DAG mutation and added pass to bundle already
clustered instructions. These bundles are unpacked before the
memory legalizer because it does not work with bundles but also
because it allows to insert waitcounts in the middle of a store
cluster.

Removing artificial edges also allows a more relaxed scheduling.

Differential Revision: https://reviews.llvm.org/D72737
2020-01-24 11:33:38 -08:00
Jay Foad 0412f518dc [AMDGPU] Fix typo in SIInstrInfo::memOpsHaveSameBasePtr
Summary:
The typo has been present since memOpsHaveSameBasePtr was introduced in
r313208.

It caused SIInstrInfo::shouldClusterMemOps to cluster more mem ops than
it was supposed to.

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71616
2019-12-17 18:54:27 +00:00
Matt Arsenault 190a17bbd1 AMDGPU: Fix i16 arithmetic pattern redundancy
There were 2 problems here. First, these patterns were duplicated to
handle the inverted shift operands instead of using the commuted
PatFrags.

Second, the point of the zext folding patterns don't apply to the
non-0ing high subtargets. They should be skipped instead of inserting
the extension. The zeroing high code would be emitted when necessary
anyway. This was also emitting unnecessary zexts in cases where the
high bits were undefined.

llvm-svn: 374092
2019-10-08 17:36:38 +00:00
Simon Pilgrim 6b079cc2d4 [AMDGPU] Regenerate idot tests. NFCI.
Reduces diff in D63281.

llvm-svn: 365754
2019-07-11 10:37:58 +00:00
Stanislav Mekhanoshin e917b3b4b8 [AMDGPU] gfx10 tests. NFC.
llvm-svn: 363946
2019-06-20 16:29:40 +00:00
Nirav Dave fe59e14031 [DAGCombine] Prune unnused nodes.
Summary:
Nodes that have no uses are eventually pruned when they are selected
from the worklist. Record nodes newly added to the worklist or DAG and
perform pruning after every combine attempt.

Reviewers: efriedma, RKSimon, craig.topper, spatel, jyknight

Reviewed By: jyknight

Subscribers: jdoerfert, jyknight, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58070

llvm-svn: 357283
2019-03-29 17:35:56 +00:00
Matt Arsenault 6f35f0c212 AMDGPU: Actually commit re-run of update_llc_test_checks
llvm-svn: 341218
2018-08-31 15:05:06 +00:00
Matt Arsenault 28c16bd534 AMDGPU: Fix broken generated check lines
This was incorrectly using the same check prefix for multiple lines

llvm-svn: 341214
2018-08-31 14:34:22 +00:00
Farhana Aleen 3528c80378 [AMDGPU] Support idot2 pattern.
Summary: Transform add (mul ((i32)S0.x, (i32)S1.x),

         add( mul ((i32)S0.y, (i32)S1.y), (i32)S3) => i/udot2((v2i16)S0, (v2i16)S1, (i32)S3)

Author: FarhanaAleen

Reviewed By: arsenm

Subscribers: llvm-commits, AMDGPU

Differential Revision: https://reviews.llvm.org/D50024

llvm-svn: 340295
2018-08-21 16:21:15 +00:00