Commit Graph

19 Commits

Author SHA1 Message Date
Matt Arsenault 1349a04ef5 AMDGPU: Make v2i16/v2f16 legal on VI
This usually results in better code. Fixes using
inline asm with short2, and also fixes having a different
ABI for function parameters between VI and gfx9.

Partially cleans up the mess used for lowering of the d16
operations. Making v4f16 legal will help clean this up more,
but this requires additional work.

llvm-svn: 332953
2018-05-22 06:32:10 +00:00
Stanislav Mekhanoshin 160f85794d [AMDGPU] Use packed literals with zero either lower or hi part
Differential Revision: https://reviews.llvm.org/D45790

llvm-svn: 330365
2018-04-19 21:16:50 +00:00
Stanislav Mekhanoshin 8b20b7dc2b [AMDGPU] Enabled v2.16 literals for VOP3P
Literal encoding needs op_sel_hi to select low 16 bit in this case.

Differential Revision: https://reviews.llvm.org/D45745

llvm-svn: 330230
2018-04-17 23:09:05 +00:00
Yaxun Liu 0124b5484c [AMDGPU] Change constant addr space to 4
Differential Revision: https://reviews.llvm.org/D43170

llvm-svn: 325030
2018-02-13 18:00:25 +00:00
Konstantin Zhuravlyov c40d9f2e5d AMDGPU/GCN: Bring processors in sync with AMDGPUUsage
- Add gfx704
    - Change bonaire to gfx704
  - Remove gfx804
  - Remove gfx901
  - Remove gfx903

Differential Revision: https://reviews.llvm.org/D40046

llvm-svn: 320194
2017-12-08 20:52:28 +00:00
Dmitry Preobrazhensky a0342dc9eb [AMDGPU][MC][GFX8][GFX9] Corrected names of integer v_{add/addc/sub/subrev/subb/subbrev}
See bug 34765: https://bugs.llvm.org//show_bug.cgi?id=34765

Reviewers: tamazov, SamWot, arsenm, vpykhtin

Differential Revision: https://reviews.llvm.org/D40088

llvm-svn: 318675
2017-11-20 18:24:21 +00:00
Matt Arsenault 301162c4fe AMDGPU: Replace i64 add/sub lowering
Use VOP3 add/addc like usual.

This has some tradeoffs. Inline immediates fold
a little better, but other constants are worse off.
SIShrinkInstructions could be made smarter to handle
these cases.

This allows us to avoid selecting scalar adds where we
need to track the carry in scc and replace its users.
This makes it easier to use the carryless VALU adds.

llvm-svn: 318340
2017-11-15 21:51:43 +00:00
Matt Arsenault 4e309b0861 AMDGPU: Start selecting global instructions
llvm-svn: 309470
2017-07-29 01:03:53 +00:00
Matt Arsenault 6c29c5acfe AMDGPU: Allow SIShrinkInstructions to work in non-SSA
Immediates can be folded as long as the immediate is a vreg.

Also undo commuting instructions if it didn't fold an immediate.

llvm-svn: 307575
2017-07-10 19:53:57 +00:00
Stanislav Mekhanoshin 0330660403 [AMDGPU] Untangle SDWA pass from SIShrinkInstructions
Remove dependency of SDWA pass on SIShrinkInstructions.
The goal is to move SDWA even higher in the stack to avoid second run
of MachineLICM, MachineCSE and SIFoldOperands.

Also added handling to preserve original src modifiers.

Differential Revision: https://reviews.llvm.org/D33860

llvm-svn: 304665
2017-06-03 17:39:47 +00:00
Stanislav Mekhanoshin 56ea488d8b [AMDGPU] Allow SDWA in instructions with immediates and SGPRs
An encoding does not allow to use SDWA in an instruction with
scalar operands, either literals or SGPRs. That is however possible
to copy these operands into a VGPR first.

Several copies of the value are produced if multiple SDWA conversions
were done. To cleanup MachineLICM (to hoist copies out of loops),
MachineCSE (to remove duplicate copies) and SIFoldOperands (to replace
SGPR to VGPR copy with immediate copy right to the VGPR) runs are added
after the SDWA pass.

Differential Revision: https://reviews.llvm.org/D33583

llvm-svn: 304219
2017-05-30 16:49:24 +00:00
Stanislav Mekhanoshin 5fa289f0d8 [AMDGPU] Narrow lshl from 64 to 32 bit if possible
Turn expensive 64 bit shift into 32 bit if shift does not overflow int:
shl (ext x) => zext (shl x)

Differential Revision: https://reviews.llvm.org/D33367

llvm-svn: 303569
2017-05-22 16:58:10 +00:00
Konstantin Zhuravlyov 3d1cc88c68 AMDGPU: Temporarily disable packed inlinable literals (v2f16, v2i16)
Differential Revision: https://reviews.llvm.org/D32361

llvm-svn: 301028
2017-04-21 19:45:22 +00:00
Sam Kolton 9fa169601f [AMDGPU] Resubmit SDWA peephole: enable by default
Reviewers: vpykhtin, rampitec, arsenm

Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D31671

llvm-svn: 299654
2017-04-06 15:03:28 +00:00
Ivan Krasin d4f70c70b9 Revert r299536. [AMDGPU] SDWA peephole: enable by default.
Reason: breaks multiple bots:

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/3988
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/1173

Original Review URL: https://reviews.llvm.org/D31671

llvm-svn: 299583
2017-04-05 19:58:12 +00:00
Sam Kolton 34e29784fb [AMDGPU] SDWA peephole: enable by default
Reviewers: vpykhtin, rampitec, arsenm

Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D31671

llvm-svn: 299536
2017-04-05 12:00:45 +00:00
Matt Arsenault 3dbeefa978 AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.

Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).

llvm-svn: 298444
2017-03-21 21:39:51 +00:00
Matthias Braun dbcf9e2ee4 LiveRegMatrix: Fix some subreg interference checks
Surprisingly, one of the three interference checks in LiveRegMatrix was
using the main live range instead of the apropriate subregister range
resulting in unnecessarily conservative results.

llvm-svn: 296722
2017-03-02 00:35:08 +00:00
Matt Arsenault eb522e68bc AMDGPU: Support v2i16/v2f16 packed operations
llvm-svn: 296396
2017-02-27 22:15:25 +00:00