llvm-project/llvm/test/CodeGen/NVPTX
Jonas Hahnfeld 20526bf483 [NVPTX] Select atomic loads and stores
According to PTX ISA .volatile has the same memory synchronization
semantics as .relaxed.sys, so it can be used to implement monotonic
atomic loads and stores. This is important for OpenMP's atomic
construct where
 - 'read's and 'write's are lowered to atomic loads and stores, and
 - an update of float or double types are lowered into a cmpxchg loop.
(Note that PTX could do better because it has atom.add.f{32,64} but
LLVM's atomicrmw instruction only allows integer types.)

Higher levels of atomicity (like acquire and release) need additional
synchronization properties which were added with PTX ISA 6.0 / sm_70.
So using these instructions still results in an error.

Differential Revision: https://reviews.llvm.org/D50391

llvm-svn: 339316
2018-08-09 07:45:49 +00:00
..
LoadStoreVectorizer.ll [NVPTX] Added support for .f16x2 instructions. 2017-02-23 22:38:24 +00:00
MachineSink-call.ll
MachineSink-convergent.ll NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0 2016-07-06 20:02:45 +00:00
TailDuplication-convergent.ll NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0 2016-07-06 20:02:45 +00:00
access-non-generic.ll [NVPTX] Use addrspacecast instead of target-specific intrinsics in NVPTXGenericToNVVM. 2018-02-28 23:57:48 +00:00
add-128bit.ll [DAGCombiner] add missing folds for scalar select of {-1,0,1} 2017-02-24 17:17:33 +00:00
addrspacecast-gvar.ll
addrspacecast.ll [NVPTX] Added a feature to use short pointers for const/local/shared AS. 2018-05-09 23:46:19 +00:00
aggr-param.ll
aggregate-return.ll [NVPTX] Unify vectorization of load/stores of aggregate arguments and return values. 2017-02-21 22:56:05 +00:00
alias.ll
annotations.ll Whitespace cleanup in test/CodeGen/NVPTX/annotations.ll. 2016-12-14 22:32:55 +00:00
arg-lowering.ll
arithmetic-fp-sm20.ll
arithmetic-int.ll
atomics-sm60.ll [NVPTX] Implement __nvvm_atom_add_gen_d builtin. 2017-11-07 22:10:54 +00:00
atomics-with-scope.ll [NVPTX] Added intrinsics for atom.gen.{sys|cta}.* instructions. 2016-09-28 17:25:38 +00:00
atomics.ll
barrier.ll [NVPTX] Implemented bar.warp.sync, barrier.sync, and vote{.sync} instructions/intrinsics/builtins. 2017-09-21 18:44:49 +00:00
bfe.ll
branch-fold.ll
bug17709.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
bug21465.ll [NVPTX] Renamed NVPTXLowerKernelArgs -> NVPTXLowerArgs. NFC. 2016-07-20 21:44:07 +00:00
bug22246.ll
bug22322.ll Add address space mangling to lifetime intrinsics 2017-04-10 20:18:21 +00:00
bug26185-2.ll [NVPTX] Fix sign/zero-extending ldg/ldu instruction selection 2016-05-02 18:12:02 +00:00
bug26185.ll
bypass-div.ll
call-with-alloca-buffer.ll Fix NVPTX/call-with-alloca-buffer.ll after r276777. 2016-07-26 18:28:33 +00:00
callchain.ll
calling-conv.ll
combine-min-max.ll [NVPTX] Implement min/max in tablegen, rather than with custom DAGComine logic. 2017-01-18 00:09:01 +00:00
compare-int.ll
constant-vectors.ll
convergent-mir-call.ll
convert-fp.ll [NVPTX] Add fptosi tests to convert-fp.ll. 2017-01-15 16:55:54 +00:00
convert-int-sm20.ll
ctlz.ll NFC - Various typo fixes in tests 2018-07-04 13:28:39 +00:00
ctpop.ll [NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146) 2017-05-12 19:56:43 +00:00
cttz.ll [NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146) 2017-05-12 19:56:43 +00:00
disable-opt.ll
div-ri.ll
divrem-combine.ll [NVPTX] Compute 'rem' using the result of 'div', if possible. 2016-10-28 21:44:00 +00:00
envreg.ll
extloadv.ll
f16-instructions.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
f16x2-instructions.ll finish: [FileCheck] Add -allow-deprecated-dag-overlap to failing llvm tests 2018-07-11 20:31:51 +00:00
fast-math.ll [NVPTX] Enable combineRepeatedFPDivisors for NVPTX. 2017-02-03 15:13:50 +00:00
fcos-no-fast-math.ll [NVPTX] Only lower sin/cos to approximate instructions if unsafe math is allowed. 2017-01-13 18:48:13 +00:00
fma-assoc.ll [DAGCombine] require UnsafeFPMath for re-association of addition 2017-01-31 14:35:37 +00:00
fma-disable.ll
fma.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
fns.ll [NVPTX,CUDA] Added llvm.nvvm.fns intrinsic and matching __nvvm_fns builtin in clang. 2017-12-06 17:50:05 +00:00
fp-contract.ll
fp-literals.ll
fp16.ll
fsin-no-fast-math.ll [NVPTX] Only lower sin/cos to approximate instructions if unsafe math is allowed. 2017-01-13 18:48:13 +00:00
function-align.ll
generic-to-nvvm-ir.ll [DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label. 2018-05-09 02:40:45 +00:00
generic-to-nvvm.ll [NVPTX] Use addrspacecast instead of target-specific intrinsics in NVPTXGenericToNVVM. 2018-02-28 23:57:48 +00:00
global-addrspace.ll
global-ctor-empty.ll
global-ctor.ll
global-dtor.ll
global-ordering.ll
global-variable-big.ll [NVPTX] Support global variables of integer type larger than i64. 2017-01-18 00:29:53 +00:00
global-visibility.ll
globals_init.ll
globals_lowering.ll
gvar-init.ll
half.ll [NVPTX] add support for initializing fp16 arrays. 2018-04-06 22:25:08 +00:00
i1-global.ll
i1-int-to-fp.ll
i1-param.ll
i8-param.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
i128-global.ll [NVPTX] Add lowering of i128 params. 2017-07-20 21:16:03 +00:00
i128-param.ll [NVPTX] Add lowering of i128 params. 2017-07-20 21:16:03 +00:00
i128-retval.ll [NVPTX] Add lowering of i128 params. 2017-07-20 21:16:03 +00:00
idioms.ll [NVPTX] Lower integer absolute value idiom to abs instruction. 2017-01-18 00:08:44 +00:00
imad.ll
inline-asm.ll
intrin-nocapture.ll
intrinsic-old.ll [FileCheck] Add -allow-deprecated-dag-overlap to failing llvm tests 2018-07-11 20:25:49 +00:00
intrinsics.ll Fix some broken CHECK lines. 2017-01-22 20:28:56 +00:00
isspacep.ll
ld-addrspace.ll [NVPTX] Added a feature to use short pointers for const/local/shared AS. 2018-05-09 23:46:19 +00:00
ld-generic.ll
ld-st-addrrspace.py [NVPTX] allow address space inference for volatile loads/stores. 2017-10-24 20:31:44 +00:00
ldg-invariant.ll [NVPTX] Fixed vectorized LDG for f16. 2018-04-06 21:10:24 +00:00
ldparam-v4.ll [NVPTX] Unify vectorization of load/stores of aggregate arguments and return values. 2017-02-21 22:56:05 +00:00
ldu-i8.ll
ldu-ldg.ll
ldu-reg-plus-offset.ll
lit.local.cfg
load-sext-i1.ll
load-store.ll [NVPTX] Select atomic loads and stores 2018-08-09 07:45:49 +00:00
load-with-non-coherent-cache.ll
local-stack-frame.ll
loop-vectorize.ll
lower-aggr-copies.ll Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1) 2018-01-19 17:13:12 +00:00
lower-alloca.ll NVPTX: Move InferAddressSpaces to generic code 2017-01-31 01:10:58 +00:00
lower-kernel-ptr-arg.ll [NVPTX] Improve lowering of byval args of device functions. 2016-07-20 18:39:47 +00:00
machine-sink.ll
managed.ll
match.ll [NVPTX] added match.{any,all}.sync instructions, intrinsics & builtins. 2017-09-26 17:07:23 +00:00
math-intrins.ll [NVPTX] Add codegen tests for llvm.fma. 2017-01-15 16:55:37 +00:00
minmax-negative.ll Improve clamp recognition in ValueTracking. 2017-10-27 20:53:41 +00:00
misaligned-vector-ldst.ll [NVPTX] Fixed lowering of unaligned loads/stores of f16 scalars and vectors. 2017-03-07 20:33:38 +00:00
module-inline-asm.ll
mulwide.ll
named-barriers.ll [NVPTX] Add intrinsics to support named barriers. 2017-01-28 16:38:15 +00:00
noduplicate-syncthreads.ll NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0 2016-07-06 20:02:45 +00:00
nounroll.ll
nvcl-param-align.ll
nvvm-reflect-arch.ll [NVPTX] Handle __nvvm_reflect("__CUDA_ARCH"). 2018-08-03 18:05:24 +00:00
nvvm-reflect-module-flag.ll
nvvm-reflect.ll [NVPTX] Let there be One True Way to set NVVMReflect params. 2017-01-15 16:54:35 +00:00
param-align.ll [NVPTX] Make sure we adjust alignment at all call sites 2016-07-18 21:58:48 +00:00
param-load-store.ll finish: [FileCheck] Add -allow-deprecated-dag-overlap to failing llvm tests 2018-07-11 20:31:51 +00:00
pr13291-i1-store.ll
pr16278.ll
pr17529.ll
read-global-variable-constant.ll [NVPTX] Lower loads from global constants using ld.global.nc (aka LDG). 2018-02-28 23:58:05 +00:00
refl1.ll
reg-copy.ll
reg-types.ll [NVPTX] Use untyped (.b) integer registers in PTX. 2016-08-12 22:02:19 +00:00
rotate.ll
sched1.ll Only enable LiveRangeShrink for x86. 2017-05-17 20:18:13 +00:00
sched2.ll Only enable LiveRangeShrink for x86. 2017-05-17 20:18:13 +00:00
sext-in-reg.ll
sext-params.ll
shfl-sync.ll [NVPTX] Implemented shfl.sync instruction and supporting intrinsics/builtins. 2017-09-20 21:23:07 +00:00
shfl.ll [NVPTX] Remove NVPTXFavorNonGenericAddrSpaces pass. 2016-10-31 21:51:42 +00:00
shift-parts.ll
simple-call.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
sm-version-20.ll
sm-version-21.ll
sm-version-30.ll
sm-version-32.ll
sm-version-35.ll
sm-version-37.ll
sm-version-50.ll
sm-version-52.ll
sm-version-53.ll
sm-version-60.ll [NVPTX] Add sm_60, sm_61, sm_62 targets to LLVM. 2016-07-06 21:06:10 +00:00
sm-version-61.ll [NVPTX] Add sm_60, sm_61, sm_62 targets to LLVM. 2016-07-06 21:06:10 +00:00
sm-version-62.ll [NVPTX] Add sm_60, sm_61, sm_62 targets to LLVM. 2016-07-06 21:06:10 +00:00
sm-version-70.ll [CUDA] Added rudimentary support for CUDA-9 and sm_70. 2017-09-07 18:14:32 +00:00
speculative-execution-divergent-target.ll
sqrt-approx.ll [NVPTX] Compute approx sqrt as 1/rsqrt(x) rather than x*rsqrt(x). 2017-01-31 23:08:57 +00:00
st-addrspace.ll [NVPTX] Added a feature to use short pointers for const/local/shared AS. 2018-05-09 23:46:19 +00:00
st-generic.ll
surf-read-cuda.ll
surf-read.ll
surf-write-cuda.ll
surf-write.ll
symbol-naming.ll [NVPTX] Assign valid global names 2017-12-04 14:19:33 +00:00
tex-read-cuda.ll
tex-read.ll
texsurf-queries.ll
tid-range.ll [SelectionDAG] Correctly transform range metadata to AssertZExt 2017-01-06 00:11:46 +00:00
tuple-literal.ll
vec-param-load.ll [NVPTX] Unify vectorization of load/stores of aggregate arguments and return values. 2017-02-21 22:56:05 +00:00
vec8.ll Revert r302938 "Add LiveRangeShrink pass to shrink live range within BB." 2017-05-18 18:50:05 +00:00
vector-args.ll
vector-call.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
vector-compare.ll
vector-global.ll
vector-loads.ll
vector-select.ll
vector-stores.ll
vote.ll [NVPTX] Implemented bar.warp.sync, barrier.sync, and vote{.sync} instructions/intrinsics/builtins. 2017-09-21 18:44:49 +00:00
weak-global.ll
weak-linkage.ll
wmma.py [NVPTX, CUDA] Added support for m8n32k16 and m32n8k16 variants of wmma instructions. 2018-04-18 21:51:48 +00:00
zero-cs.ll llvm/test/CodeGen/NVPTX/zero-cs.ll: Relax an expression to match in -Asserts. 2016-09-21 04:43:11 +00:00
zeroext-32bit.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00