Commit Graph

51 Commits

Author SHA1 Message Date
Matt Arsenault 0eb62d5b3f AMDGPU/GlobalISel: Select llvm.amdgcn.raw.tbuffer.store 2020-01-27 15:16:21 -05:00
Matt Arsenault 533d650e94 AMDGPU/GlobalISel: Move llvm.amdgcn.raw.buffer.store handling
Treat this the same way as loads. There's less value to the
intermediate nodes, but it's good to be consistent.
2020-01-27 14:59:30 -05:00
Matt Arsenault 09ed0e44d9 AMDGPU/GlobalISel: Select llvm.amdgcn.raw.tbuffer.load 2020-01-27 13:40:37 -05:00
Matt Arsenault 198624c39d AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.load.format 2020-01-27 13:02:19 -05:00
Matt Arsenault fc90222a91 AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.load
Use intermediate instructions, unlike with buffer stores. This is
necessary because of the need to have an internal way to distinguish
between signed and unsigned extloads. This introduces some duplication
and near duplication with the buffer store selection path. The store
handling should maybe be moved into legalization to match and
eliminate the duplication.
2020-01-27 12:49:23 -05:00
Matt Arsenault e60d658260 AMDGPU/GlobalISel: Handle VOP3NoMods 2020-01-27 09:03:44 -08:00
Matt Arsenault ac0b9b4ccf AMDPGPU/GlobalISel: Select more MUBUF global addressing modes
The handling of the high bits of the resource descriptor seem weird to
me, where the 3rd dword changes based on the instruction.
2020-01-27 07:28:36 -08:00
Matt Arsenault fdaad485e6 AMDGPU/GlobalISel: Initial selection of MUBUF addr64 load/store
Fixes the main reason for compile failures on SI, but doesn't really
try to use the addressing modes yet.
2020-01-27 07:13:56 -08:00
Matt Arsenault 4d14772f5c AMDGPU/GlobalISel: Remove redundant or patterns
These ended up with higher priority than or3 patterns in a future
patch. This also fixes the using VOP2 forms.
2020-01-22 21:45:51 -05:00
Matt Arsenault a174f0da62 AMDGPU/GlobalISel: Add pre-legalize combiner pass
Just copy the AArch64 pass as-is for now, except for removing the
memcpy handling.
2020-01-22 10:16:39 -05:00
Matt Arsenault a722cbf77c AMDGPU/GlobalISel: Handle atomic_inc/atomic_dec
The intermediate instruction drops the extra volatile argument. We are
missing an atomic ordering on these.
2020-01-22 09:26:17 -05:00
Matt Arsenault 592de0009f AMDGPU/GlobalISel: Select llvm.amdgcn.update.dpp
The existing test is overly reliant on -mattr=-flat-for-global, and
some missing optimizations to re-use.
2020-01-17 20:09:53 -05:00
Matt Arsenault b4a647449f TableGen/GlobalISel: Add way for SDNodeXForm to work on timm
The current implementation assumes there is an instruction associated
with the transform, but this is not the case for
timm/TargetConstant/immarg values. These transforms should directly
operate on a specific MachineOperand in the source
instruction. TableGen would assert if you attempted to define an
equivalent GISDNodeXFormEquiv using timm when it failed to find the
instruction matcher.

Specially recognize SDNodeXForms on timm, and pass the operand index
to the render function.

Ideally this would be a separate render function type that looks like
void renderFoo(MachineInstrBuilder, const MachineOperand&), but this
proved to be somewhat mechanically painful. Add an optional operand
index which will only be passed if the transform should only look at
the one source operand.

Theoretically it would also be possible to only ever pass the
MachineOperand, and the existing renderers would check the parent. I
think that would be somewhat ugly for the standard usage which may
want to inspect other operands, and I also think MachineOperand should
eventually not carry a pointer to the parent instruction.

Use it in one sample pattern. This isn't a great example, since the
transform exists to satisfy DAG type constraints. This could also be
avoided by just changing the MachineInstr's arbitrary choice of
operand type from i16 to i32. Other patterns have nontrivial uses, but
this serves as the simplest example.

One flaw this still has is if you try to use an SDNodeXForm defined
for imm, but the source pattern uses timm, you still see the "Failed
to lookup instruction" assert. However, there is now a way to avoid
it.
2020-01-09 17:37:52 -05:00
Matt Arsenault e71af77568 AMDGPU/GlobalISel: Add IMMPopCount xform
Partially fixes BFE pattern import.
2020-01-09 10:29:32 -05:00
Matt Arsenault 79450a4ea2 AMDGPU/GlobalISel: Add selectVOP3Mods_nnan
This doesn't enable any new imports yet, but moves the fmed patterns
from failing on this to hitting the "complex suboperand referenced
more than once" limitation in tablegen.
2020-01-09 10:29:32 -05:00
Matt Arsenault d964086c62 AMDGPU/GlobalISel: Add equiv xform for bitcast_fpimm_to_i32
Only partially fixes one pattern import.
2020-01-09 10:29:31 -05:00
Matt Arsenault 3952748ffd AMDGPU/GlobalISel: Fix add of neg inline constant pattern 2020-01-09 10:29:31 -05:00
Matt Arsenault c3a10faadc AMDGPU: Remove VOP3Mods0Clamp0OMod
Now that overridable default operands work, there's no reason to use
complex patterns to just produce 0s.
2020-01-07 15:10:08 -05:00
Matt Arsenault 171cf5302f AMDGPU/GlobalISel: Handle flat/global G_ATOMIC_CMPXCHG
Custom lower this to a target instruction with the merge operands. I
think it might be better to directly select this and emit a
REG_SEQUENCE, but this would be more work since it would require
splitting the tablegen patterns for these cases from the other
atomics.
2019-10-25 13:11:09 -07:00
Matt Arsenault 27269054d2 GlobalISel: Add target pre-isel instructions
Allows targets to introduce regbankselectable
pseudo-instructions. Currently the closet feature to this is an
intrinsic. However this requires creating a public intrinsic
declaration. This litters the public intrinsic namespace with
operations we don't necessarily want to expose to IR producers, and
would rather leave as private to the backend.

Use a new instruction bit. A previous attempt tried to keep using enum
value ranges, but it turned into a mess.

llvm-svn: 373937
2019-10-07 18:43:29 +00:00
Matt Arsenault 59b91aa93e AMDGPU/GlobalISel: Add support for init.exec intrinsics
TThe existing wave32 behavior seems broken and incomplete, but this
reproduces it.

llvm-svn: 373296
2019-10-01 02:07:25 +00:00
Matt Arsenault 7df5b3fd26 AMDGPU/GlobalISel: Select cvt pk intrinsics
llvm-svn: 371539
2019-09-10 17:17:05 +00:00
Matt Arsenault 77e3e9cafd AMDGPU/GlobalISel: Select llvm.amdgcn.class
Also fixes missing SubtargetPredicate on f16 class instructions.

llvm-svn: 371436
2019-09-09 18:29:45 +00:00
Matt Arsenault d6c1f5bb15 AMDGPU/GlobalISel: Select fmed3
llvm-svn: 371435
2019-09-09 18:29:37 +00:00
Matt Arsenault 63e6d8db1c AMDGPU/GlobalISel: Select atomic loads
A new check for an explicitly atomic MMO is needed to avoid
incorrectly matching pattern for non-atomic loads

llvm-svn: 371418
2019-09-09 16:18:07 +00:00
Matt Arsenault ebbd6e4976 AMDGPU: Remove code address space predicates
Fixes 8-byte, 8-byte aligned LDS loads. 16-byte case still broken due
to not be reported as legal.

llvm-svn: 371413
2019-09-09 16:02:07 +00:00
Matt Arsenault 508dff2ce1 AMDGPU/GlobalISel: Remove dead patterns
llvm-svn: 371404
2019-09-09 15:06:06 +00:00
Matt Arsenault 9952f46407 AMDGPU/GlobalISel: Fix flat load/store of pointer types
llvm-svn: 367513
2019-08-01 03:57:42 +00:00
Matt Arsenault 26cb53b260 AMDGPU/GlobalISel: Handle G_ATOMICRMW_FADD
llvm-svn: 367509
2019-08-01 03:33:15 +00:00
Matt Arsenault da5b9bfa95 AMDGPU/GlobalISel: Allow selection of DS atomicrmw
llvm-svn: 367507
2019-08-01 03:29:01 +00:00
Matt Arsenault 3baf4d3418 AMDGPU/GlobalISel: Select simple local stores
llvm-svn: 367504
2019-08-01 03:09:15 +00:00
Matt Arsenault 3594011de0 AMDGPU/GlobalISel: Select local loads
llvm-svn: 367498
2019-08-01 00:53:38 +00:00
Matt Arsenault f8c8284455 AMDGPU/GlobalISel: Select G_ASHR
llvm-svn: 366257
2019-07-16 20:31:25 +00:00
Matt Arsenault 7161fb0be5 AMDGPU/GlobalISel: Select private loads
llvm-svn: 366248
2019-07-16 19:22:21 +00:00
Matt Arsenault 35c96598b1 AMDGPU/GlobalISel: Select flat loads
Now that the patterns use the new PatFrag address space support, the
only blocker to importing most load patterns is the addressing mode
complex patterns.

llvm-svn: 366237
2019-07-16 18:05:29 +00:00
Matt Arsenault 0a52e9d026 AMDGPU/GlobalISel: Complete implementation of G_GEP
Also works around tablegen defect in selecting add with unused carry,
but if we have to manually select GEP, might as well handle add
manually.

llvm-svn: 364806
2019-07-01 16:34:48 +00:00
Matt Arsenault d810ff2588 AMDGPU/GlobalISel: Try to select VOP3 form of add
There are several things broken, but at least emit the right thing for
gfx9.

The import of the pattern with the unused carry out seems to not
work. Needs a special class for clamp, because OperandWithDefaultOps
doesn't really work.

llvm-svn: 364804
2019-07-01 16:27:32 +00:00
Tom Stellard 9e9dd30de3 AMDGPU/GlobalISel: Implement select for 32-bit G_ADD
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: hiraditya, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58804

llvm-svn: 364797
2019-07-01 16:09:33 +00:00
Stanislav Mekhanoshin 7895c03232 [AMDGPU] predicate and feature refactoring
We have done some predicate and feature refactoring lately but
did not upstream it. This is to sync.

Differential revision: https://reviews.llvm.org/D60292

llvm-svn: 357791
2019-04-05 18:24:34 +00:00
Konstantin Zhuravlyov 9a278bf6b5 Revert "AMDGPU/NFC: Cleanup subtarget predicates"
It breaks one of our downstream merges, so revert it
temporarily while investigating failures downstream

llvm-svn: 354700
2019-02-22 23:21:06 +00:00
Konstantin Zhuravlyov c2650178a1 AMDGPU/NFC: Cleanup subtarget predicates
Differential Revision: https://reviews.llvm.org/D58522

llvm-svn: 354620
2019-02-21 20:43:43 +00:00
Tom Stellard 79b5c3842b AMDGPU/GlobalISel: Move SMRD selection logic to TableGen
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D52922

llvm-svn: 354516
2019-02-20 21:02:37 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Tom Stellard 14d8807d9a AMDGPU/GlobalISel: Select amdgcn.cvt.pkrtz to 64-bit instructions
Summary: The 32-bit variants do not exist on VI+.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D52958

llvm-svn: 343985
2018-10-08 17:49:29 +00:00
Tom Stellard ac68471326 AMDGPU/GlobalISel: Implement select() for 32-bit @llvm.minnun and @llvm.maxnum
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D46172

llvm-svn: 337056
2018-07-13 22:16:03 +00:00
Tom Stellard 26fac0f8e1 AMDGPU/GlobalISel: legalize and select 32-bit G_ASHR
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D48196

llvm-svn: 335318
2018-06-22 02:54:57 +00:00
Tom Stellard 9a6535718e AMDGPU/GlobalISel: legalize and select 32-bit G_SITOFP
Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D48195

llvm-svn: 335316
2018-06-22 02:34:29 +00:00
Tom Stellard a92847359a AMDGPU/GlobalISel: Implement select() for @llvm.amdgcn.cvt.pkrtz
Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45907

llvm-svn: 334757
2018-06-14 19:26:37 +00:00
Tom Stellard 46bbbc33c0 AMDGPU/GlobalISel: Implement select() for 32-bit G_FADD and G_FMUL
Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D46171

llvm-svn: 334665
2018-06-13 22:30:47 +00:00
Tom Stellard dcc95e9385 AMDGPU/GlobalISel: Implement select() for 32-bit G_FPTOUI
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45883

llvm-svn: 332082
2018-05-11 05:44:16 +00:00