Dmitry Preobrazhensky
4c8f4234b6
[AMDGPU][MC][GFX8][GFX9][DISASSEMBLER] Added "_e32" suffix to 32-bit VINTRP opcodes
...
See bug 36751: https://bugs.llvm.org/show_bug.cgi?id=36751
Differential Revision: https://reviews.llvm.org/D44529
Reviewers: artem.tamazov, arsenm
llvm-svn: 327723
2018-03-16 16:38:04 +00:00
Mark Searles
c3c02bde73
[AMDGPU] Waitcnt pass: Modify the waitcnt pass to propagate info in the case of a single basic block loop. mergeInputScoreBrackets() does this for us; update it so that it processes the single bb's score bracket when processing the single bb's preds. It is, after all, a pred of itself, so its score bracket is needed.
...
Differential Revision: https://reviews.llvm.org/D44434
llvm-svn: 327583
2018-03-14 22:04:32 +00:00
Francis Visoiu Mistrih
e85b06d65f
[CodeGen] Use MIR syntax for MachineMemOperand printing
...
Get rid of the "; mem:" suffix and use the one we use in MIR: ":: (load 2)".
rdar://38163529
Differential Revision: https://reviews.llvm.org/D42377
llvm-svn: 327580
2018-03-14 21:52:13 +00:00
Yaxun Liu
a99e7d8e44
[AMDGPU] Fix lowering enqueue kernel when kernel has no name
...
Since the enqueued kernels have internal linkage, their names may be dropped.
In this case, give them unique names __amdgpu_enqueued_kernel or
__amdgpu_enqueued_kernel.n where n is a sequential number starting from 1.
Differential Revision: https://reviews.llvm.org/D44322
llvm-svn: 327291
2018-03-12 16:34:06 +00:00
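The naming scheme described in the enqueue-kernel commit above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the LLVM implementation: the helper `unique_enqueued_kernel_name` and the `taken` set are hypothetical, and the real pass operates on an LLVM module rather than a set of strings.

```python
def unique_enqueued_kernel_name(taken):
    """Return an unused name following the scheme in the commit message.

    `taken` is a hypothetical set of names already in use; it is updated
    in place so that repeated calls keep producing fresh names.
    """
    base = "__amdgpu_enqueued_kernel"
    if base not in taken:
        name = base
    else:
        n = 1  # the sequential suffix starts from 1, per the commit message
        while f"{base}.{n}" in taken:
            n += 1
        name = f"{base}.{n}"
    taken.add(name)
    return name
```

Calling the helper three times on an initially empty set yields the unsuffixed name first, then the `.1` and `.2` variants.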
Dmitry Preobrazhensky
da4a7c01bf
[AMDGPU][MC] Corrected GATHER4 opcodes
...
See bug 36252: https://bugs.llvm.org/show_bug.cgi?id=36252
Differential Revision: https://reviews.llvm.org/D43874
Reviewers: artem.tamazov, arsenm
llvm-svn: 327278
2018-03-12 15:03:34 +00:00
Matt Arsenault
7b9ed89dcf
AMDGPU/GlobalISel: Legality and RegBankInfo for G_{INSERT|EXTRACT}_VECTOR_ELT
...
llvm-svn: 327269
2018-03-12 13:35:53 +00:00
Matt Arsenault
c0aefd561e
AMDGPU/GlobalISel: InstrMapping for G_MERGE_VALUES
...
llvm-svn: 327268
2018-03-12 13:35:49 +00:00
Matt Arsenault
503afda95f
AMDGPU/GlobalISel: Make some G_MERGE_VALUEs legal
...
llvm-svn: 327267
2018-03-12 13:35:43 +00:00
Sanjay Patel
3b36bb0362
[AMDGPU] fix tests to be independent of FP undef
...
llvm-svn: 327211
2018-03-10 16:39:59 +00:00
Matt Arsenault
cbda7ff4ae
AMDGPU: Fix crash when constant folding with physreg operand
...
llvm-svn: 327209
2018-03-10 16:05:35 +00:00
Farhana Aleen
a7cb31123c
[AMDGPU] Supported ds_read_b128 generation; Widened vector length for local address-space.
...
Summary: Starting from the second GCN generation, the ISA supports ds_read_b128 in addition to ds_read_b64.
This patch adds the ds_read_b128 instruction pattern and generation of this instruction.
In the vectorizer, this patch also widens the vector length so that the vectorizer generates
128-bit loads for the local address space, which get translated to ds_read_b128.
Since the performance benefit is not clear, the compiler generates ds_read_b128 only under -amdgpu-ds128.
Author: FarhanaAleen
Reviewed By: rampitec, arsenm
Subscribers: llvm-commits, AMDGPU
Differential Revision: https://reviews.llvm.org/D44210
llvm-svn: 327153
2018-03-09 17:41:39 +00:00
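The width choice described in the ds_read_b128 commit above amounts to simple arithmetic, sketched here in Python. The function names and the `enable_ds128` parameter are hypothetical; the parameter stands in for the -amdgpu-ds128 option, and the sketch ignores the subtarget and alignment checks a real implementation would need.

```python
DWORD_BITS = 32

def max_local_load_bits(enable_ds128):
    """Widest local address-space load this sketch allows, in bits.

    128 bits corresponds to ds_read_b128, 64 bits to ds_read_b64.
    """
    return 128 if enable_ds128 else 64

def dwords_per_local_load(enable_ds128):
    """Number of dwords the vectorizer could pack into one local load."""
    return max_local_load_bits(enable_ds128) // DWORD_BITS
```

With the option enabled a single load covers four dwords; without it, two.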
Sanjay Patel
56d59c1f0f
[AMDGPU] fix test to be independent of FP undef
...
llvm-svn: 327147
2018-03-09 16:33:34 +00:00
Stanislav Mekhanoshin
c8127fc674
[AMDGPU] Fixed V_DIV_FIXUP_F16 selection on GFX9
...
GFX9 should select opsel version.
Differential Revision: https://reviews.llvm.org/D44279
llvm-svn: 327106
2018-03-09 07:21:43 +00:00
Sanjay Patel
672ad3269b
[AMDGPU] fix test to survive more FP undef constant folding
...
llvm-svn: 327066
2018-03-08 21:30:56 +00:00
Sanjay Patel
7325d12f58
[AMDGPU] fix test to survive the most basic undef constant folding
...
This will likely need to be changed again for anything more than:
fmul undef, undef -> undef
llvm-svn: 327034
2018-03-08 17:34:25 +00:00
Farhana Aleen
89196642f7
[AMDGPU] Increased vector length for global/constant loads.
...
Summary: The GCN ISA supports instructions that can read 16 consecutive dwords from memory through the scalar data cache;
the LoadStoreVectorizer should take advantage of the wider vector length and pack 16 dword or 8 quadword elements.
Author: FarhanaAleen
Reviewed By: rampitec
Subscribers: llvm-commits, AMDGPU
Differential Revision: https://reviews.llvm.org/D44179
llvm-svn: 326910
2018-03-07 17:09:18 +00:00
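The element counts quoted in the global/constant load commit above follow from the 16-dword limit. A small Python sketch of that arithmetic, where the constant and function names are hypothetical and only the figure of 16 dwords comes from the commit message:

```python
SCALAR_LOAD_DWORDS = 16  # widest scalar read, per the commit message
DWORD_BITS = 32

def max_vector_elements(element_bits):
    """Number of elements of the given width that fit in one 16-dword read."""
    total_bits = SCALAR_LOAD_DWORDS * DWORD_BITS  # 512 bits
    return total_bits // element_bits
```

For 32-bit elements this gives 16, and for 64-bit elements it gives 8, matching the 16/8 figures in the summary.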
Farhana Aleen
347d12b4ce
Revert "[AMDGPU] Widened vector length for global/constant address space."
...
This reverts commit ce988cc100dc65e7c6c727aff31ceb99231cab03.
llvm-svn: 326907
2018-03-07 16:55:27 +00:00
Farhana Aleen
0d03d0588d
[AMDGPU] Widened vector length for global/constant address space.
...
llvm-svn: 326904
2018-03-07 16:29:05 +00:00
Yaxun Liu
46439e8d4a
[AMDGPU] Fix lowering OpenCL enqueue_kernel
...
One addrspacecast disappeared from the clang-emitted IR for the
block invoke function due to the adoption of the new
address space mapping.
Differential Revision: https://reviews.llvm.org/D43785
llvm-svn: 326806
2018-03-06 16:04:39 +00:00
Matt Arsenault
e31ab94e97
AMDGPU/GlobalISel: Add InstrMapping for G_EXTRACT
...
llvm-svn: 326715
2018-03-05 16:25:18 +00:00
Matt Arsenault
71272e6d4e
AMDGPU/GlobalISel: Make some G_EXTRACTs legal
...
As far as I can tell legalization of weird sizes for the
output type isn't implemented.
llvm-svn: 326714
2018-03-05 16:25:15 +00:00
Alexander Timofeev
2e5eeceeb7
Pass Divergence Analysis data to Selection DAG to drive divergence
...
dependent instruction selection.
Differential revision: https://reviews.llvm.org/D35267
llvm-svn: 326703
2018-03-05 15:12:21 +00:00
Matt Arsenault
b9699c009d
AMDGPU/GlobalISel: InstrMapping for G_ZEXT
...
llvm-svn: 326589
2018-03-02 16:55:37 +00:00
Matt Arsenault
1c1aab99ae
AMDGPU/GlobalISel: InstrMapping for G_TRUNC
...
llvm-svn: 326588
2018-03-02 16:55:33 +00:00
Matt Arsenault
ef8db767d7
AMDGPU/GlobalISel: Define InstrMappings for G_FCMP
...
Patch by Tom Stellard
llvm-svn: 326587
2018-03-02 16:53:15 +00:00
Matt Arsenault
2607dc60de
AMDGPU/GlobalISel: Define instruction mapping for @llvm.minnum
...
Patch by Tom Stellard
llvm-svn: 326586
2018-03-02 16:40:17 +00:00
Matt Arsenault
b46c191c49
AMDGPU/GlobalISel: Define instruction mapping for @llvm.maxnum
...
Patch by Tom Stellard
llvm-svn: 326567
2018-03-02 12:23:00 +00:00
Jan Vesely
b283ea0f0f
AMDGPU/GCN: Promote i16 ctpop
...
i16 capable ASICs do not support i16 operands for this instruction.
Add tablegen pattern to merge chained i16 additions.
Differential Revision: https://reviews.llvm.org/D43985
llvm-svn: 326535
2018-03-02 02:50:22 +00:00
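Promoting an i16 ctpop, as in the commit above, can be illustrated with a short Python sketch. It assumes the promotion zero-extends the 16-bit operand to 32 bits before counting; the function names are hypothetical and the code models only the arithmetic, not the instruction selection or the tablegen pattern.

```python
def ctpop32(x):
    """Population count of a value treated as an unsigned 32-bit integer."""
    return bin(x & 0xFFFFFFFF).count("1")

def ctpop16_promoted(x):
    """ctpop on a 16-bit value, computed via the 32-bit operation."""
    widened = x & 0xFFFF  # zero-extend: the upper 16 bits are cleared
    # The result is at most 16, so truncating back to 16 bits is lossless.
    return ctpop32(widened) & 0xFFFF
```

Zero extension matters here: sign-extending a value with the top bit set would add sixteen spurious set bits to the count.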
Matt Arsenault
41d2e3d98e
AMDGPU/GlobalISel: Define instruction mapping for G_FPTOSI
...
Patch by Tom Stellard
llvm-svn: 326534
2018-03-02 02:19:16 +00:00
Matt Arsenault
b23041ad4d
AMDGPU/GlobalISel: Define instruction mapping for G_FPTOUI
...
Patch by Tom Stellard
llvm-svn: 326533
2018-03-02 02:19:11 +00:00
Matt Arsenault
327d5fb2e5
AMDGPU/GlobalISel: Define instruction mapping for G_FMUL
...
llvm-svn: 326532
2018-03-02 02:17:01 +00:00
Matt Arsenault
5a9e834eac
AMDGPU/GlobalISel: Define instruction mapping for G_FADD
...
Patch by Tom Stellard
llvm-svn: 326526
2018-03-02 01:22:13 +00:00
Matt Arsenault
d99317f1b3
AMDGPU/GlobalISel: Define instruction mapping for G_SHL
...
Patch by Tom Stellard
llvm-svn: 326525
2018-03-02 01:22:10 +00:00
Matt Arsenault
3c7a123ccc
AMDGPU/GlobalISel: Define instruction mapping for G_XOR
...
llvm-svn: 326524
2018-03-02 01:22:06 +00:00
Matt Arsenault
c0f34c9e36
AMDGPU/GlobalISel: Define instruction mapping for G_AND
...
Patch by Tom Stellard
llvm-svn: 326523
2018-03-02 01:22:01 +00:00
Matt Arsenault
364f12e8f9
AMDGPU/GlobalISel: Define instruction mapping for @llvm.amdgcn.cvt.pkrtz
...
Patch by Tom Stellard
llvm-svn: 326490
2018-03-01 21:25:30 +00:00
Matt Arsenault
5320ee4a05
AMDGPU/GlobalISel: Define instruction mapping for G_OR
...
Patch by Tom Stellard
llvm-svn: 326489
2018-03-01 21:25:25 +00:00
Matt Arsenault
62669ede94
AMDGPU/GlobalISel: Define instruction mapping for G_BITCAST
...
Patch by Tom Stellard
llvm-svn: 326482
2018-03-01 20:59:44 +00:00
Matt Arsenault
0529a8e2de
AMDGPU/GlobalISel: Mark i32->i64 zext as legal
...
llvm-svn: 326481
2018-03-01 20:56:21 +00:00
Matt Arsenault
36b99e1937
AMDGPU/GlobalISel: InstrMapping for llvm.amdgcn.exp.compr
...
Patch by Tom Stellard
llvm-svn: 326479
2018-03-01 20:40:55 +00:00
Matt Arsenault
8931bbf8df
AMDGPU/GlobalISel: Define instruction mapping for @llvm.amdgcn.exp
...
Patch by Tom Stellard
llvm-svn: 326477
2018-03-01 20:24:37 +00:00
Matt Arsenault
50721ab325
AMDGPU/GlobalISel: Define InstrMappings for G_ICMP
...
Patch by Tom Stellard
llvm-svn: 326472
2018-03-01 19:27:10 +00:00
Matt Arsenault
dc14ec05d4
AMDGPU/GlobalISel: Make i32 mul legal
...
llvm-svn: 326471
2018-03-01 19:22:05 +00:00
Matt Arsenault
06cbb27a79
AMDGPU/GlobalISel: Define instruction mapping for G_IMPLICIT_DEF
...
Patch by Tom Stellard
llvm-svn: 326470
2018-03-01 19:16:52 +00:00
Matt Arsenault
e3d9ecf2b9
AMDGPU/GlobalISel: Define instruction mapping for G_FCONSTANT
...
Patch by Tom Stellard
llvm-svn: 326468
2018-03-01 19:13:30 +00:00
Matt Arsenault
3f6a204eaa
AMDGPU/GlobalISel: Make i32 xor legal
...
llvm-svn: 326466
2018-03-01 19:09:21 +00:00
Matt Arsenault
8e80a5fbca
AMDGPU/GlobalISel: Mark 32/64-bit G_FCMP as legal
...
Patch by Tom Stellard
llvm-svn: 326465
2018-03-01 19:09:16 +00:00
Matt Arsenault
dd022ce064
AMDGPU/GlobalISel: Mark 32-bit G_FPTOSI as legal
...
Patch by Tom Stellard
llvm-svn: 326464
2018-03-01 19:04:25 +00:00
Tim Renouf
2a99fa2c08
[AMDGPU] added writelane intrinsic
...
Summary:
For use by LLPC SPV_AMD_shader_ballot extension.
The v_writelane instruction was already implemented for use by SGPR
spilling, but I had to add an extra dummy operand tied to the
destination, to represent that all lanes except the selected one keep
the old value of the destination register.
.ll test changes were due to schedule changes caused by that new
operand.
Differential Revision: https://reviews.llvm.org/D42838
llvm-svn: 326353
2018-02-28 19:10:32 +00:00
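The reason the writelane commit above needs an operand tied to the destination can be seen in a small Python model. This is only a sketch: a vector register is modeled as a list of per-lane values, and the function name and parameters are hypothetical rather than the intrinsic's actual signature.

```python
def writelane(old, value, lane):
    """Return the per-lane register contents after writing one lane.

    `old` plays the role of the tied operand: every lane except `lane`
    keeps its previous value, so the result depends on the old contents.
    """
    new = list(old)  # copy, so the caller's list is left untouched
    new[lane] = value
    return new
```

Because the output depends on `old`, a compiler that did not model this operand could treat the previous register contents as dead.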
Geoff Berry
a2b9011290
Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
...
Re-enable commit r323991 now that r325931 has been committed to make
MachineOperand::isRenamable() check more conservative w.r.t. code
changes and opt-in on a per-target basis.
llvm-svn: 326208
2018-02-27 16:59:10 +00:00