llvm-project/llvm/test/CodeGen/NVPTX
Steffen Larsen 1b4c85fc02 [NVPTX] Add NVPTX intrinsics for CUDA PTX 6.5 ldmatrix instructions
Adds NVPTX intrinsics for the CUDA PTX `ldmatrix.sync.aligned` instructions added in PTX 6.5.

PTX ISA description of `ldmatrix.sync.aligned`: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-ldmatrix

Authored-by: Steffen Larsen <steffen.larsen@codeplay.com>

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D107046
2021-08-06 16:13:35 -07:00
LoadStoreVectorizer.ll
MachineSink-call.ll
MachineSink-convergent.ll
TailDuplication-convergent.ll
access-non-generic.ll
add-128bit.ll
addrspacecast-gvar.ll
addrspacecast.ll
aggr-param.ll
aggregate-return.ll
alias.ll Revert "Revert "Reland "[Support] make report_fatal_error `abort` instead of `exit`""" 2020-02-13 10:16:06 -08:00
annotations.ll
arg-lowering.ll
arithmetic-fp-sm20.ll
arithmetic-int.ll
async-copy.ll [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX cp.async instructions 2021-05-17 09:46:59 -07:00
atomic-lower-local.ll [NVPTX] Enable lowering of atomics on local memory 2021-04-26 20:12:12 -04:00
atomics-sm60.ll [llvm] Fix missing FileCheck directive colons 2020-04-06 09:59:08 -06:00
atomics-with-scope.ll
atomics.ll
barrier.ll
bfe.ll
branch-fold.ll
bug17709.ll
bug21465.ll Revert "[NFC] remove explicit default value for strboolattr attribute in tests" 2021-05-24 19:43:40 +02:00
bug22246.ll
bug22322.ll Revert "[NFC] remove explicit default value for strboolattr attribute in tests" 2021-05-24 19:43:40 +02:00
bug26185-2.ll
bug26185.ll
bug41651.ll [NVPTX] Fix PR41651 2019-07-30 19:52:01 +00:00
bypass-div.ll
call-with-alloca-buffer.ll
callchain.ll
calling-conv.ll
calls-with-phi.ll
combine-min-max.ll [NVPTX] Fix typo in lit test 2020-08-17 16:02:11 -04:00
compare-int.ll
constant-vectors.ll
convergent-mir-call.ll
convert-fp.ll
convert-int-sm20.ll
ctlz.ll
ctpop.ll
cttz.ll
disable-opt.ll
div-ri.ll
divrem-combine.ll
envreg.ll
extloadv.ll
f16-instructions.ll Update @llvm.powi to handle different int sizes for the exponent 2021-06-17 09:38:28 +02:00
f16x2-instructions.ll Fix the default alignment of i1 vectors. 2021-07-31 14:09:59 -07:00
fast-math.ll [NVPTX] Add select(cc,binop(),binop()) fast-math tests 2021-07-18 15:30:24 +01:00
fcos-no-fast-math.ll Revert "Revert "Reland "[Support] make report_fatal_error `abort` instead of `exit`""" 2020-02-13 10:16:06 -08:00
fma-assoc.ll
fma-disable.ll
fma.ll
fns.ll
fp-contract.ll
fp-literals.ll
fp16.ll
fsin-no-fast-math.ll Revert "Revert "Reland "[Support] make report_fatal_error `abort` instead of `exit`""" 2020-02-13 10:16:06 -08:00
function-align.ll
generic-to-nvvm-ir.ll
generic-to-nvvm.ll
global-addrspace.ll
global-ctor-empty.ll
global-ctor.ll Revert "Revert "Reland "[Support] make report_fatal_error `abort` instead of `exit`""" 2020-02-13 10:16:06 -08:00
global-dtor.ll Revert "Revert "Reland "[Support] make report_fatal_error `abort` instead of `exit`""" 2020-02-13 10:16:06 -08:00
global-ordering.ll
global-variable-big.ll
global-visibility.ll
globals_init.ll
globals_lowering.ll
gvar-init.ll
half.ll
i1-global.ll
i1-int-to-fp.ll
i1-param.ll
i8-param.ll
i128-global.ll
i128-param.ll
i128-retval.ll
i128-struct.ll
idioms.ll
imad.ll
inline-asm.ll
inlineasm-output-template.ll
intrin-nocapture.ll
intrinsic-old.ll [NewPM][NVPTX] Port NVPTX opt passes 2021-01-07 15:12:35 -08:00
intrinsics.ll [llvm] Fix missing FileCheck directive colons 2020-04-06 09:59:08 -06:00
isspacep.ll
ld-addrspace.ll
ld-generic.ll
ld-st-addrrspace.py tests/CodeGen: Use %python lit substitution when invoking python 2021-07-06 18:46:36 -07:00
ldg-invariant.ll
ldparam-v4.ll
ldu-i8.ll
ldu-ldg.ll
ldu-reg-plus-offset.ll
libcall-fulfilled.ll [NVPTX] CUDA does provide malloc/free since compute capability 2.X 2021-03-15 22:45:56 -05:00
libcall-instruction.ll Revert "Revert "Reland "[Support] make report_fatal_error `abort` instead of `exit`""" 2020-02-13 10:16:06 -08:00
libcall-intrinsic.ll Update @llvm.powi to handle different int sizes for the exponent 2021-06-17 09:38:28 +02:00
lit.local.cfg [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX 6.5 and 7.0 WMMA and MMA instructions 2021-06-29 15:44:07 -07:00
load-sext-i1.ll
load-store.ll
load-with-non-coherent-cache.ll
local-stack-frame.ll
loop-vectorize.ll
lower-aggr-copies.ll
lower-alloca.ll
lower-args.ll [NVPTX] Avoid temp copy of byval kernel parameters. 2021-03-15 14:27:22 -07:00
lower-byval-args.ll [NVPTX] Handle bitcast and ASC(101) when trying to avoid argument copy. 2021-04-06 13:06:00 -07:00
lower-kernel-ptr-arg.ll [NVPTX] Avoid temp copy of byval kernel parameters. 2021-03-15 14:27:22 -07:00
machine-sink.ll
managed.ll
match.ll
math-intrins.ll Consolidate internal denormal flushing controls 2020-01-17 20:09:53 -05:00
mbarrier.ll [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX cp.async instructions 2021-05-17 09:46:59 -07:00
minmax-negative.ll
misaligned-vector-ldst.ll
module-inline-asm.ll [NVPTX] Fix for NVPTX module asm regression 2020-06-24 11:17:09 -07:00
mulwide.ll
named-barriers.ll
noduplicate-syncthreads.ll
nofunc.ll
nounroll.ll
nvcl-param-align.ll
nvvm-reflect-arch.ll [NVPTX][NewPM] Re-enable NVVMReflectPass 2021-02-08 13:58:17 -08:00
nvvm-reflect-module-flag.ll [NewPM][NVPTX] Port NVPTX opt passes 2021-01-07 15:12:35 -08:00
nvvm-reflect.ll [NewPM][NVPTX] Port NVPTX opt passes 2021-01-07 15:12:35 -08:00
param-align.ll OpaquePtr: Bulk update tests to use typed byval 2020-11-20 14:00:46 -05:00
param-load-store.ll Fix the default alignment of i1 vectors. 2021-07-31 14:09:59 -07:00
pow2_mask_cmp.ll [TargetLowering] Add i1 condition for bit comparison fold 2020-10-27 12:22:20 +00:00
pr13291-i1-store.ll
pr16278.ll
pr17529.ll Revert "[NFC] remove explicit default value for strboolattr attribute in tests" 2021-05-24 19:43:40 +02:00
proxy-reg-erasure-mir.ll
proxy-reg-erasure-ptx.ll
read-global-variable-constant.ll
redux-sync.ll [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX redux.sync instructions 2021-05-17 09:46:59 -07:00
refl1.ll
reg-copy.ll
reg-types.ll
rotate.ll
sched1.ll
sched2.ll
sext-in-reg.ll
sext-params.ll
shfl-p.ll [NVPTX] Restructure shfl instrinsics and add variants that return a predicate. 2019-10-14 16:53:34 +00:00
shfl-sync-p.ll [NVPTX] Restructure shfl instrinsics and add variants that return a predicate. 2019-10-14 16:53:34 +00:00
shfl-sync.ll
shfl.ll
shift-parts.ll
simple-call.ll
sm-version-20.ll
sm-version-21.ll
sm-version-30.ll
sm-version-32.ll
sm-version-35.ll
sm-version-37.ll
sm-version-50.ll
sm-version-52.ll
sm-version-53.ll
sm-version-60.ll
sm-version-61.ll
sm-version-62.ll
sm-version-70.ll
speculative-execution-divergent-target.ll
sqrt-approx.ll [DAGCombine] Remove the check for unsafe-fp-math when we are checking the AFN 2021-01-11 02:25:53 +00:00
st-addrspace.ll
st-generic.ll
surf-read-cuda.ll
surf-read.ll
surf-write-cuda.ll
surf-write.ll
symbol-naming.ll
tex-read-cuda.ll
tex-read.ll
texsurf-queries.ll
tid-range.ll
tuple-literal.ll
vec-param-load.ll
vec8.ll
vector-args.ll
vector-call.ll
vector-compare.ll
vector-global.ll
vector-loads.ll [llvm] Fix missing FileCheck directive colons 2020-04-06 09:59:08 -06:00
vector-select.ll
vector-stores.ll
vectorize-misaligned.ll
vote.ll
weak-global.ll
weak-linkage.ll
wmma.py [NVPTX] Add NVPTX intrinsics for CUDA PTX 6.5 ldmatrix instructions 2021-08-06 16:13:35 -07:00
zeroext-32bit.ll