llvm-project/llvm/test/CodeGen/NVPTX
Artem Belevich 30512869ff [NVPTX] Make tensor shape part of WMMA intrinsic's name.
This is needed for the upcoming implementation of the
new 8x32x16 and 32x8x16 variants of WMMA instructions
introduced in CUDA 9.1.

Differential Revision: https://reviews.llvm.org/D44719

llvm-svn: 328158
2018-03-21 21:55:02 +00:00
LoadStoreVectorizer.ll [NVPTX] Added support for .f16x2 instructions. 2017-02-23 22:38:24 +00:00
MachineSink-call.ll [NVPTX] Annotate call machine instructions as calls. 2016-02-17 17:46:50 +00:00
MachineSink-convergent.ll NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0 2016-07-06 20:02:45 +00:00
TailDuplication-convergent.ll NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0 2016-07-06 20:02:45 +00:00
access-non-generic.ll [NVPTX] Use addrspacecast instead of target-specific intrinsics in NVPTXGenericToNVVM. 2018-02-28 23:57:48 +00:00
add-128bit.ll [DAGCombiner] add missing folds for scalar select of {-1,0,1} 2017-02-24 17:17:33 +00:00
addrspacecast-gvar.ll
addrspacecast.ll [NVPTX] Remove NVPTXFavorNonGenericAddrSpaces pass. 2016-10-31 21:51:42 +00:00
aggr-param.ll
aggregate-return.ll [NVPTX] Unify vectorization of load/stores of aggregate arguments and return values. 2017-02-21 22:56:05 +00:00
alias.ll [CUDA] Die gracefully when trying to output an LLVM alias. 2016-01-23 21:12:20 +00:00
annotations.ll Whitespace cleanup in test/CodeGen/NVPTX/annotations.ll. 2016-12-14 22:32:55 +00:00
arg-lowering.ll
arithmetic-fp-sm20.ll
arithmetic-int.ll [NVPTX] expand mul_lohi to mul_lo and mul_hi 2016-01-22 19:47:26 +00:00
atomics-sm60.ll [NVPTX] Implement __nvvm_atom_add_gen_d builtin. 2017-11-07 22:10:54 +00:00
atomics-with-scope.ll [NVPTX] Added intrinsics for atom.gen.{sys|cta}.* instructions. 2016-09-28 17:25:38 +00:00
atomics.ll
barrier.ll [NVPTX] Implemented bar.warp.sync, barrier.sync, and vote{.sync} instructions/intrinsics/builtins. 2017-09-21 18:44:49 +00:00
bfe.ll
branch-fold.ll Roll forward r242871 2015-07-29 18:59:09 +00:00
bug17709.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
bug21465.ll [NVPTX] Renamed NVPTXLowerKernelArgs -> NVPTXLowerArgs. NFC. 2016-07-20 21:44:07 +00:00
bug22246.ll
bug22322.ll Add address space mangling to lifetime intrinsics 2017-04-10 20:18:21 +00:00
bug26185-2.ll [NVPTX] Fix sign/zero-extending ldg/ldu instruction selection 2016-05-02 18:12:02 +00:00
bug26185.ll [NVPTX] Handle ldg created from sign-/zero-extended load 2016-04-05 12:38:01 +00:00
bypass-div.ll Use 32-bit divides instead of 64-bit divides where possible. 2015-08-11 22:16:34 +00:00
call-with-alloca-buffer.ll Fix NVPTX/call-with-alloca-buffer.ll after r276777. 2016-07-26 18:28:33 +00:00
callchain.ll
calling-conv.ll
combine-min-max.ll [NVPTX] Implement min/max in tablegen, rather than with custom DAGCombine logic. 2017-01-18 00:09:01 +00:00
compare-int.ll
constant-vectors.ll
convergent-mir-call.ll [NVPTX] Use different, convergent MIs for convergent calls. 2016-03-01 19:24:03 +00:00
convert-fp.ll [NVPTX] Add fptosi tests to convert-fp.ll. 2017-01-15 16:55:54 +00:00
convert-int-sm20.ll
ctlz.ll [NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146) 2017-05-12 19:56:43 +00:00
ctpop.ll [NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146) 2017-05-12 19:56:43 +00:00
cttz.ll [NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146) 2017-05-12 19:56:43 +00:00
disable-opt.ll [NVPTX] Disable performance optimizations when OptLevel==None 2016-02-04 04:15:36 +00:00
div-ri.ll
divrem-combine.ll [NVPTX] Compute 'rem' using the result of 'div', if possible. 2016-10-28 21:44:00 +00:00
envreg.ll
extloadv.ll
f16-instructions.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
f16x2-instructions.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
fast-math.ll [NVPTX] Enable combineRepeatedFPDivisors for NVPTX. 2017-02-03 15:13:50 +00:00
fcos-no-fast-math.ll [NVPTX] Only lower sin/cos to approximate instructions if unsafe math is allowed. 2017-01-13 18:48:13 +00:00
fma-assoc.ll [DAGCombine] require UnsafeFPMath for re-association of addition 2017-01-31 14:35:37 +00:00
fma-disable.ll
fma.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
fns.ll [NVPTX,CUDA] Added llvm.nvvm.fns intrinsic and matching __nvvm_fns builtin in clang. 2017-12-06 17:50:05 +00:00
fp-contract.ll
fp-literals.ll
fp16.ll
fsin-no-fast-math.ll [NVPTX] Only lower sin/cos to approximate instructions if unsafe math is allowed. 2017-01-13 18:48:13 +00:00
function-align.ll
generic-to-nvvm-ir.ll [NVPTX] Use addrspacecast instead of target-specific intrinsics in NVPTXGenericToNVVM. 2018-02-28 23:57:48 +00:00
generic-to-nvvm.ll [NVPTX] Use addrspacecast instead of target-specific intrinsics in NVPTXGenericToNVVM. 2018-02-28 23:57:48 +00:00
global-addrspace.ll [NVPTX] Allow undef value as global initializer 2015-08-22 05:40:26 +00:00
global-ctor-empty.ll [CUDA] Die if we ask the NVPTX backend to emit a global ctor/dtor. 2016-01-30 01:07:38 +00:00
global-ctor.ll [CUDA] Die if we ask the NVPTX backend to emit a global ctor/dtor. 2016-01-30 01:07:38 +00:00
global-dtor.ll [CUDA] Die if we ask the NVPTX backend to emit a global ctor/dtor. 2016-01-30 01:07:38 +00:00
global-ordering.ll
global-variable-big.ll [NVPTX] Support global variables of integer type larger than i64. 2017-01-18 00:29:53 +00:00
global-visibility.ll [NVPTX] Do not emit .hidden or .protected directives as they are not allowed by PTX. 2016-01-15 23:57:53 +00:00
globals_init.ll
globals_lowering.ll
gvar-init.ll
half.ll [NVPTX] Added support for half-precision floating point. 2017-01-13 20:56:17 +00:00
i1-global.ll
i1-int-to-fp.ll
i1-param.ll
i8-param.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
i128-global.ll [NVPTX] Add lowering of i128 params. 2017-07-20 21:16:03 +00:00
i128-param.ll [NVPTX] Add lowering of i128 params. 2017-07-20 21:16:03 +00:00
i128-retval.ll [NVPTX] Add lowering of i128 params. 2017-07-20 21:16:03 +00:00
idioms.ll [NVPTX] Lower integer absolute value idiom to abs instruction. 2017-01-18 00:08:44 +00:00
imad.ll
implicit-def.ll
inline-asm.ll
intrin-nocapture.ll
intrinsic-old.ll [NVVMIntrRange] Only set range metadata if none is already present 2016-12-22 00:51:59 +00:00
intrinsics.ll Fix some broken CHECK lines. 2017-01-22 20:28:56 +00:00
isspacep.ll
ld-addrspace.ll
ld-generic.ll
ld-st-addrrspace.py [NVPTX] allow address space inference for volatile loads/stores. 2017-10-24 20:31:44 +00:00
ldg-invariant.ll [NVPTX] Add tests that invariant vector loads get lowered to ld.global.nc. 2017-02-04 01:54:56 +00:00
ldparam-v4.ll [NVPTX] Unify vectorization of load/stores of aggregate arguments and return values. 2017-02-21 22:56:05 +00:00
ldu-i8.ll
ldu-ldg.ll
ldu-reg-plus-offset.ll
lit.local.cfg
load-sext-i1.ll
load-with-non-coherent-cache.ll [NVPTX] Use LDG for pointer induction variables. 2015-08-05 23:11:57 +00:00
local-stack-frame.ll
loop-vectorize.ll
lower-aggr-copies.ll Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1) 2018-01-19 17:13:12 +00:00
lower-alloca.ll NVPTX: Move InferAddressSpaces to generic code 2017-01-31 01:10:58 +00:00
lower-kernel-ptr-arg.ll [NVPTX] Improve lowering of byval args of device functions. 2016-07-20 18:39:47 +00:00
machine-sink.ll
managed.ll
match.ll [NVPTX] added match.{any,all}.sync instructions, intrinsics & builtins. 2017-09-26 17:07:23 +00:00
math-intrins.ll [NVPTX] Add codegen tests for llvm.fma. 2017-01-15 16:55:37 +00:00
minmax-negative.ll Improve clamp recognition in ValueTracking. 2017-10-27 20:53:41 +00:00
misaligned-vector-ldst.ll [NVPTX] Fixed lowering of unaligned loads/stores of f16 scalars and vectors. 2017-03-07 20:33:38 +00:00
module-inline-asm.ll
mulwide.ll
named-barriers.ll [NVPTX] Add intrinsics to support named barriers. 2017-01-28 16:38:15 +00:00
noduplicate-syncthreads.ll NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0 2016-07-06 20:02:45 +00:00
nounroll.ll
nvcl-param-align.ll
nvvm-reflect-module-flag.ll [NVPTX] Read __CUDA_FTZ from module flags in NVVMReflect. 2016-04-01 01:09:07 +00:00
nvvm-reflect.ll [NVPTX] Let there be One True Way to set NVVMReflect params. 2017-01-15 16:54:35 +00:00
param-align.ll [NVPTX] Make sure we adjust alignment at all call sites 2016-07-18 21:58:48 +00:00
param-load-store.ll [DAGCombiner] When combining zero_extend of a truncate, only mask before extending for vectors. 2018-03-01 22:32:25 +00:00
pr13291-i1-store.ll
pr16278.ll
pr17529.ll
read-global-variable-constant.ll [NVPTX] Lower loads from global constants using ld.global.nc (aka LDG). 2018-02-28 23:58:05 +00:00
refl1.ll
reg-copy.ll [NVPTX] allow register copy between float and int 2015-08-01 18:02:12 +00:00
reg-types.ll [NVPTX] Use untyped (.b) integer registers in PTX. 2016-08-12 22:02:19 +00:00
rotate.ll
sched1.ll Only enable LiveRangeShrink for x86. 2017-05-17 20:18:13 +00:00
sched2.ll Only enable LiveRangeShrink for x86. 2017-05-17 20:18:13 +00:00
sext-in-reg.ll
sext-params.ll
shfl-sync.ll [NVPTX] Implemented shfl.sync instruction and supporting intrinsics/builtins. 2017-09-20 21:23:07 +00:00
shfl.ll [NVPTX] Remove NVPTXFavorNonGenericAddrSpaces pass. 2016-10-31 21:51:42 +00:00
shift-parts.ll
simple-call.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
sm-version-20.ll
sm-version-21.ll
sm-version-30.ll
sm-version-32.ll
sm-version-35.ll
sm-version-37.ll
sm-version-50.ll
sm-version-52.ll
sm-version-53.ll
sm-version-60.ll [NVPTX] Add sm_60, sm_61, sm_62 targets to LLVM. 2016-07-06 21:06:10 +00:00
sm-version-61.ll [NVPTX] Add sm_60, sm_61, sm_62 targets to LLVM. 2016-07-06 21:06:10 +00:00
sm-version-62.ll [NVPTX] Add sm_60, sm_61, sm_62 targets to LLVM. 2016-07-06 21:06:10 +00:00
sm-version-70.ll [CUDA] Added rudimentary support for CUDA-9 and sm_70. 2017-09-07 18:14:32 +00:00
speculative-execution-divergent-target.ll Move divergent-target test into CodeGen/NVPTX because it requires an NVPTX target. 2016-04-15 01:20:52 +00:00
sqrt-approx.ll [NVPTX] Compute approx sqrt as 1/rsqrt(x) rather than x*rsqrt(x). 2017-01-31 23:08:57 +00:00
st-addrspace.ll
st-generic.ll
surf-read-cuda.ll
surf-read.ll
surf-write-cuda.ll
surf-write.ll
symbol-naming.ll [NVPTX] Assign valid global names 2017-12-04 14:19:33 +00:00
tex-read-cuda.ll
tex-read.ll
texsurf-queries.ll
tid-range.ll [SelectionDAG] Correctly transform range metadata to AssertZExt 2017-01-06 00:11:46 +00:00
tuple-literal.ll
vec-param-load.ll [NVPTX] Unify vectorization of load/stores of aggregate arguments and return values. 2017-02-21 22:56:05 +00:00
vec8.ll Revert r302938 "Add LiveRangeShrink pass to shrink live range within BB." 2017-05-18 18:50:05 +00:00
vector-args.ll
vector-call.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
vector-compare.ll
vector-global.ll
vector-loads.ll
vector-select.ll
vector-stores.ll
vote.ll [NVPTX] Implemented bar.warp.sync, barrier.sync, and vote{.sync} instructions/intrinsics/builtins. 2017-09-21 18:44:49 +00:00
weak-global.ll
weak-linkage.ll
wmma.py [NVPTX] Make tensor shape part of WMMA intrinsic's name. 2018-03-21 21:55:02 +00:00
zero-cs.ll llvm/test/CodeGen/NVPTX/zero-cs.ll: Relax an expression to match in -Asserts. 2016-09-21 04:43:11 +00:00
zeroext-32bit.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00