Matt Arsenault
45b98189bd
AMDGPU: Don't use MUBUF vaddr if address may overflow
...
Effectively revert r263964. Before we would not
allow this if vaddr was not known to be positive.
llvm-svn: 318240
2017-11-15 00:45:43 +00:00
Konstantin Zhuravlyov
27b0a033d8
AMDGPU/NFC: Split Processors.td into GCNProcessors.td and R600Processors.td
...
Differential Revision: https://reviews.llvm.org/D39880
llvm-svn: 317920
2017-11-10 20:01:58 +00:00
Matt Arsenault
28f52e51f1
AMDGPU: Add max-mix-insts subtarget feature
...
llvm-svn: 316553
2017-10-25 07:00:51 +00:00
Konstantin Zhuravlyov
eda425edd4
AMDGPU: Do not emit deprecated notes for code object v3
...
Differential Revision: https://reviews.llvm.org/D38749
llvm-svn: 315810
2017-10-14 15:59:07 +00:00
Matt Arsenault
cc85223f87
AMDGPU: Fix incorrect selection of pseudo-branches
...
These should only be used if the machine structurizer is enabled.
llvm-svn: 315357
2017-10-10 20:22:07 +00:00
Matt Arsenault
90c7593a75
AMDGPU: Remove global isGCN predicates
...
These are problematic because they apply to everything,
and can easily clobber whatever more specific predicate
you are trying to add to a function.
Currently instructions use SubtargetPredicate/PredicateControl
to apply this to patterns applied to an instruction definition,
but not to free standing Pats. Add a wrapper around Pat
so the special PredicateControls requirements can be appended
to the final predicate list like how Mips does it.
llvm-svn: 314742
2017-10-03 00:06:41 +00:00
Matt Arsenault
c6baa85fc6
AMDGPU: Fix typos
...
llvm-svn: 314715
2017-10-02 20:31:18 +00:00
Matt Arsenault
d7e2303df2
AMDGPU: Start selecting v_mad_mix_f32
...
llvm-svn: 312732
2017-09-07 18:05:07 +00:00
Matt Arsenault
efa1d655d4
AMDGPU: Add ds_{read|write}_addtid_b32 definitions
...
llvm-svn: 312349
2017-09-01 18:38:02 +00:00
Matt Arsenault
ed6e8f0a90
AMDGPU: Add most d16 load/store instruction definitions
...
Doesn't include the tied operand necessary for the loads,
but is enough for the assembler to work.
llvm-svn: 312347
2017-09-01 18:36:06 +00:00
Konstantin Zhuravlyov
68107657d4
AMDGPU: Fix gfx801 features
...
gfx801 has 1/2 rate F64, Fast F32 FMA
Differential Revision: https://reviews.llvm.org/D36981
llvm-svn: 311694
2017-08-24 20:03:07 +00:00
Dmitry Preobrazhensky
ff64aa514b
[AMDGPU][MC][GFX9] Added integer clamping support for VOP3 opcodes
...
See Bug 34152: https://bugs.llvm.org//show_bug.cgi?id=34152
Reviewers: SamWot, artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D36674
llvm-svn: 311006
2017-08-16 13:51:56 +00:00
Matt Arsenault
8728c5f2db
AMDGPU: Cleanup subtarget features
...
Try to avoid mutually exclusive features. Don't use
a real default GPU, and use a fake "generic". The goal
is to make it easier to see which sets of features are
incompatible between feature strings.
Most of the test changes are due to random scheduling changes
from not having a default fullspeed model.
llvm-svn: 310258
2017-08-07 14:58:04 +00:00
Matt Arsenault
a7eb14afc7
AMDGPU: Fix typo in feature description
...
llvm-svn: 310217
2017-08-06 18:13:23 +00:00
Matt Arsenault
ca7b0a1777
AMDGPU: Add instruction definitions for some scratch_* instructions
...
Omit atomics for now since they probably aren't useful.
llvm-svn: 308747
2017-07-21 15:36:16 +00:00
Matt Arsenault
c37fe66ec5
AMDGPU: Add encoding for carryless add/sub instructions
...
llvm-svn: 308639
2017-07-20 17:42:47 +00:00
Sam Kolton
a179d25b99
[AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions
...
Summary:
1. Instruction V_CVT_U32_F32 allows an omod operand (see SIInstrInfo.td:1435), although this operand shouldn't be allowed here. This fix checks whether the SDWA pseudo instruction has an OMod operand and copies it only if it does.
2. There were several problems with support of VOPC instructions in the SDWA peephole pass.
Reviewers: tstellar, arsenm, vpykhtin, airlied, kzhuravl
Subscribers: wdng, nhaehnle, yaxunl, dstuttard, tpr, sarnex, t-tye
Differential Revision: https://reviews.llvm.org/D34626
llvm-svn: 306413
2017-06-27 15:02:23 +00:00
Matt Arsenault
8bcf2f20a7
AMDGPU: Whitespace fixes
...
llvm-svn: 306265
2017-06-26 03:01:36 +00:00
Sam Kolton
3c4933fcc6
[AMDGPU] SDWA: add support for GFX9 in peephole pass
...
Summary:
Added support based on merged SDWA pseudo instructions. The peephole pass now allows one scalar operand, as well as omod and clamp modifiers.
Added several subtarget features for GFX9 SDWA.
This diff also contains changes from D34026.
Depends D34026
Reviewers: vpykhtin, rampitec, arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D34241
llvm-svn: 305986
2017-06-22 06:26:41 +00:00
Matt Arsenault
9698f1c862
AMDGPU: Start adding global_* instructions
...
llvm-svn: 305838
2017-06-20 19:54:14 +00:00
Wei Ding
7c3e5115a5
AMDGPU : Fix ISA Version Definitions.
...
Differential Revision: http://reviews.llvm.org/D28531
llvm-svn: 305137
2017-06-10 03:53:19 +00:00
Konstantin Zhuravlyov
be6c0ca5e2
AMDGPU: Make auto waitcnt before barrier a feature
...
Differential Revision: https://reviews.llvm.org/D33793
llvm-svn: 304571
2017-06-02 17:40:26 +00:00
Sam Kolton
f7659d71eb
[AMDGPU] SDWA: Add assembler support for GFX9
...
Summary:
Added separate pseudo and real instructions for GFX9 SDWA.
Currently supported only in the assembler.
Depends D32493
Reviewers: vpykhtin, artem.tamazov
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D33132
llvm-svn: 303620
2017-05-23 10:08:55 +00:00
Matt Arsenault
acdc7659cc
AMDGPU: Add new subtarget features for gfx9 flat instructions
...
Flat instructions gain an immediate offset, and 2 new
sets of segment specific flat instructions are added.
llvm-svn: 302729
2017-05-10 21:19:05 +00:00
Sam Kolton
5d99386b4d
[AMDGPU] DPP: add support for GFX9
...
Reviewers: artem.tamazov
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D32588
llvm-svn: 301551
2017-04-27 15:42:38 +00:00
Konstantin Zhuravlyov
f628406bbd
AMDGPU/GFX9: Enable FastFMAF32
...
Differential Revision: https://reviews.llvm.org/D32363
llvm-svn: 301029
2017-04-21 19:57:53 +00:00
Marek Olsak
e22fdb9cac
AMDGPU: Always use VGPR indexing on GFX9
...
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, dstuttard, tpr
Differential Revision: https://reviews.llvm.org/D31157
llvm-svn: 298396
2017-03-21 17:00:32 +00:00
Matt Arsenault
9be7b0d485
AMDGPU: Add VOP3P instruction format
...
Add a few instructions that are not VOP3P but are related to packed operations.
Includes a hack with dummy operands for the benefit of the assembler.
llvm-svn: 296368
2017-02-27 18:49:11 +00:00
Matt Arsenault
2fdf2a1a18
AMDGPU: Redefine clamp node as clamp 0.0-1.0
...
Change implementation to use max instead of add.
min/max/med3 do not flush denormals regardless of the mode,
so it is OK to use it whether or not they are enabled.
Also allow using clamp with f16, and use knowledge
of dx10_clamp.
llvm-svn: 295788
2017-02-21 23:35:48 +00:00
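The clamp semantics described in the commit above can be illustrated with a small Python sketch. It is not the backend code: the function name `clamp01` and the `dx10_clamp` parameter are hypothetical, and the NaN handling (NaN becomes 0.0 when DX10 clamp mode is on, and passes through otherwise) is an assumption about how the hardware mode behaves.

```python
import math


def clamp01(x: float, dx10_clamp: bool = True) -> float:
    """Clamp x to the range [0.0, 1.0] (hypothetical helper, illustration only)."""
    if math.isnan(x):
        # Assumption: in DX10 clamp mode a NaN input is clamped to 0.0;
        # otherwise the NaN is passed through unchanged.
        return 0.0 if dx10_clamp else x
    # Expressed with min/max rather than an add, mirroring the commit's
    # switch to a max-based implementation.
    return min(max(x, 0.0), 1.0)


if __name__ == "__main__":
    for value in (-2.0, 0.25, 1.5, float("nan")):
        print(value, "->", clamp01(value))
```

Knowing the dx10_clamp setting matters for a compiler because it determines whether a NaN input can survive the clamp.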
Matt Arsenault
2021f08080
AMDGPU: Fix assembler subtarget predicate for gfx9
...
This was accepting GFX9 instructions on VI.
llvm-svn: 295557
2017-02-18 19:12:26 +00:00
Matt Arsenault
e823d92f7f
AMDGPU: Merge initial gfx9 support
...
llvm-svn: 295554
2017-02-18 18:29:53 +00:00
Wei Ding
205bfdb3e9
AMDGPU : Add trap handler support.
...
Differential Revision: http://reviews.llvm.org/D26010
llvm-svn: 294692
2017-02-10 02:15:29 +00:00
Tom Stellard
ca16621b2a
Re-commit AMDGPU/GlobalISel: Add support for simple shaders
...
Fix build when global-isel is disabled and fix a warning.
Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP.
Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm
Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris
Differential Revision: https://reviews.llvm.org/D26730
llvm-svn: 293551
2017-01-30 21:56:46 +00:00
Tom Stellard
7a19d56f73
Revert "AMDGPU/GlobalISel: Add support for simple shaders"
...
This reverts commit r293503.
Revert while I investigate some of the buildbot failures.
llvm-svn: 293509
2017-01-30 17:42:41 +00:00
Tom Stellard
e48f60aec8
AMDGPU/GlobalISel: Add support for simple shaders
...
Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP.
Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm
Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris
Differential Revision: https://reviews.llvm.org/D26730
llvm-svn: 293503
2017-01-30 17:09:15 +00:00
Matt Arsenault
d8f7ea381f
AMDGPU: Enable FeatureFlatForGlobal on Volcanic Islands
...
Accomplishes what r292982 was supposed to, which ended up
only really making the necessary test changes.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <vedran@miletic.net>
llvm-svn: 293310
2017-01-27 17:42:26 +00:00
Matt Arsenault
7aad8fd8f4
Enable FeatureFlatForGlobal on Volcanic Islands
...
This switches to the workaround that HSA defaults to
for the mesa path.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <vedran@miletic.net>
llvm-svn: 292982
2017-01-24 22:02:15 +00:00
Matt Arsenault
a6867fd441
AMDGPU: Combine fp16/fp64 subtarget features
...
The same control register controls both, and they are set to
the same defaults. Keep the old names around as aliases.
llvm-svn: 292837
2017-01-23 22:31:03 +00:00
Sam Kolton
07dbde214b
[AMDGPU] Add subtarget features for SDWA/DPP
...
Reviewers: vpykhtin, artem.tamazov, tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D28900
llvm-svn: 292596
2017-01-20 10:01:25 +00:00
Marek Olsak
23ae31cca0
AMDGPU/SI: Remove XNACK feature from CI
...
Summary: CI doesn't have XNACK.
Reviewers: tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27175
llvm-svn: 289263
2016-12-09 19:49:58 +00:00
Marek Olsak
0f55fbae6c
AMDGPU/SI: Don't reserve XNACK when it's disabled
...
Summary:
This frees 2 additional scalar registers.
These are results from all of my 3 patches combined:
Polaris:
Spilled SGPRs: 2231 -> 1517 (-32.00 %)
Tonga:
Spilled SGPRs: 3829 -> 2608 (-31.89 %)
Spilled VGPRs: 100 -> 84 (-16.00 %)
Tonga even spills SGPRs via VGPRs to scratch. That's a compute shader
limited to 64 VGPRs.
Reviewers: tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27151
llvm-svn: 289262
2016-12-09 19:49:54 +00:00
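The percentages quoted in the summary above follow from the before/after spill counts. A minimal Python sketch of that arithmetic is below; the helper name `percent_change` and the dictionary layout are illustrative assumptions, while the counts are copied from the commit message.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Spill counts copied from the commit message: (before, after).
results = {
    "Polaris spilled SGPRs": (2231, 1517),
    "Tonga spilled SGPRs": (3829, 2608),
    "Tonga spilled VGPRs": (100, 84),
}

if __name__ == "__main__":
    for name, (before, after) in results.items():
        change = percent_change(before, after)
        print(f"{name}: {before} -> {after} ({change:.2f} %)")
```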
Konstantin Zhuravlyov
f86e4b7266
[AMDGPU] Add f16 support (VI+)
...
Differential Revision: https://reviews.llvm.org/D25975
llvm-svn: 286753
2016-11-13 07:01:11 +00:00
Tom Stellard
115a61560e
AMDGPU: Add VI i16 support
...
Patch By: Wei Ding
Differential Revision: https://reviews.llvm.org/D18049
llvm-svn: 286464
2016-11-10 16:02:37 +00:00
Tom Stellard
2d2d33f1dc
Revert "AMDGPU: Add VI i16 support"
...
This reverts commit r285939 and r285948. These broke some conformance tests.
llvm-svn: 285995
2016-11-04 13:06:34 +00:00
Tom Stellard
2b3379cdff
AMDGPU: Add VI i16 support
...
Patch By: Wei Ding
Differential Revision: https://reviews.llvm.org/D18049
llvm-svn: 285939
2016-11-03 17:13:50 +00:00
Matt Arsenault
f3dd863031
AMDGPU: Whitespace fixes
...
llvm-svn: 285659
2016-11-01 00:55:14 +00:00
Matt Arsenault
c88ba36eab
AMDGPU: Use 1/2pi inline imm on VI
...
I'm guessing at how it is supposed to be printed
llvm-svn: 285490
2016-10-29 04:05:06 +00:00
Matt Arsenault
7b6475568d
AMDGPU: Add definitions for scalar store instructions
...
Also add glc bit to the scalar loads since they exist on VI
and change the caching behavior.
This currently has an assembler bug where the glc bit is incorrectly
accepted on SI/CI which do not have it.
llvm-svn: 285463
2016-10-28 21:55:15 +00:00
Yaxun Liu
94add85adb
AMDGPU: Refactor processor definition to use ISA version features
...
Add missing ISA versions 7.0.2/8.0.4/8.1.0 to the backend.
Refactor processor definition to use ISA version features.
Fixed ISA version for stoney.
Based on Laurent Morichetti's patch.
Differential Revision: https://reviews.llvm.org/D25919
llvm-svn: 285210
2016-10-26 16:37:56 +00:00
Tom Stellard
64a9d0876c
AMDGPU/SI: Don't allow unaligned scratch access
...
Summary: The hardware doesn't support this.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D25523
llvm-svn: 284257
2016-10-14 18:10:39 +00:00