llvm-project

Jonas Hahnfeld 20526bf483 [NVPTX] Select atomic loads and stores		2018-08-09 07:45:49 +00:00

According to PTX ISA .volatile has the same memory synchronization semantics as .relaxed.sys, so it can be used to implement monotonic atomic loads and stores. This is important for OpenMP's atomic construct where
 - 'read's and 'write's are lowered to atomic loads and stores, and
 - an update of float or double types are lowered into a cmpxchg loop.
(Note that PTX could do better because it has atom.add.f{32,64} but LLVM's atomicrmw instruction only allows integer types.)

Higher levels of atomicity (like acquire and release) need additional synchronization properties which were added with PTX ISA 6.0 / sm_70. So using these instructions still results in an error.

Differential Revision: https://reviews.llvm.org/D50391

llvm-svn: 339316
..
LoadStoreVectorizer.ll	[NVPTX] Added support for .f16x2 instructions.	2017-02-23 22:38:24 +00:00
MachineSink-call.ll	…
MachineSink-convergent.ll	NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0	2016-07-06 20:02:45 +00:00
TailDuplication-convergent.ll	NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0	2016-07-06 20:02:45 +00:00
access-non-generic.ll	[NVPTX] Use addrspacecast instead of target-specific intrinsics in NVPTXGenericToNVVM.	2018-02-28 23:57:48 +00:00
add-128bit.ll	[DAGCombiner] add missing folds for scalar select of {-1,0,1}	2017-02-24 17:17:33 +00:00
addrspacecast-gvar.ll	…
addrspacecast.ll	[NVPTX] Added a feature to use short pointers for const/local/shared AS.	2018-05-09 23:46:19 +00:00
aggr-param.ll	…
aggregate-return.ll	[NVPTX] Unify vectorization of load/stores of aggregate arguments and return values.	2017-02-21 22:56:05 +00:00
alias.ll	…
annotations.ll	Whitespace cleanup in test/CodeGen/NVPTX/annotations.ll.	2016-12-14 22:32:55 +00:00
arg-lowering.ll	…
arithmetic-fp-sm20.ll	…
arithmetic-int.ll	…
atomics-sm60.ll	[NVPTX] Implement __nvvm_atom_add_gen_d builtin.	2017-11-07 22:10:54 +00:00
atomics-with-scope.ll	[NVPTX] Added intrinsics for atom.gen.{sys|cta}.* instructions.	2016-09-28 17:25:38 +00:00
atomics.ll	…
barrier.ll	[NVPTX] Implemented bar.warp.sync, barrier.sync, and vote{.sync} instructions/intrinsics/builtins.	2017-09-21 18:44:49 +00:00
bfe.ll	…
branch-fold.ll	…
bug17709.ll	[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146)	2017-05-15 17:17:44 +00:00
bug21465.ll	[NVPTX] Renamed NVPTXLowerKernelArgs -> NVPTXLowerArgs. NFC.	2016-07-20 21:44:07 +00:00
bug22246.ll	…
bug22322.ll	Add address space mangling to lifetime intrinsics	2017-04-10 20:18:21 +00:00
bug26185-2.ll	[NVPTX] Fix sign/zero-extending ldg/ldu instruction selection	2016-05-02 18:12:02 +00:00
bug26185.ll	…
bypass-div.ll	…
call-with-alloca-buffer.ll	Fix NVPTX/call-with-alloca-buffer.ll after r276777.	2016-07-26 18:28:33 +00:00
callchain.ll	…
calling-conv.ll	…
combine-min-max.ll	[NVPTX] Implement min/max in tablegen, rather than with custom DAGComine logic.	2017-01-18 00:09:01 +00:00
compare-int.ll	…
constant-vectors.ll	…
convergent-mir-call.ll	…
convert-fp.ll	[NVPTX] Add fptosi tests to convert-fp.ll.	2017-01-15 16:55:54 +00:00
convert-int-sm20.ll	…
ctlz.ll	NFC - Various typo fixes in tests	2018-07-04 13:28:39 +00:00
ctpop.ll	[NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146)	2017-05-12 19:56:43 +00:00
cttz.ll	[NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146)	2017-05-12 19:56:43 +00:00
disable-opt.ll	…
div-ri.ll	…
divrem-combine.ll	[NVPTX] Compute 'rem' using the result of 'div', if possible.	2016-10-28 21:44:00 +00:00
envreg.ll	…
extloadv.ll	…
f16-instructions.ll	[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146)	2017-05-15 17:17:44 +00:00
f16x2-instructions.ll	finish: [FileCheck] Add -allow-deprecated-dag-overlap to failing llvm tests	2018-07-11 20:31:51 +00:00
fast-math.ll	[NVPTX] Enable combineRepeatedFPDivisors for NVPTX.	2017-02-03 15:13:50 +00:00
fcos-no-fast-math.ll	[NVPTX] Only lower sin/cos to approximate instructions if unsafe math is allowed.	2017-01-13 18:48:13 +00:00
fma-assoc.ll	[DAGCombine] require UnsafeFPMath for re-association of addition	2017-01-31 14:35:37 +00:00
fma-disable.ll	…
fma.ll	[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146)	2017-05-15 17:17:44 +00:00
fns.ll	[NVPTX,CUDA] Added llvm.nvvm.fns intrinsic and matching __nvvm_fns builtin in clang.	2017-12-06 17:50:05 +00:00
fp-contract.ll	…
fp-literals.ll	…
fp16.ll	…
fsin-no-fast-math.ll	[NVPTX] Only lower sin/cos to approximate instructions if unsafe math is allowed.	2017-01-13 18:48:13 +00:00
function-align.ll	…
generic-to-nvvm-ir.ll	[DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label.	2018-05-09 02:40:45 +00:00
generic-to-nvvm.ll	[NVPTX] Use addrspacecast instead of target-specific intrinsics in NVPTXGenericToNVVM.	2018-02-28 23:57:48 +00:00
global-addrspace.ll	…
global-ctor-empty.ll	…
global-ctor.ll	…
global-dtor.ll	…
global-ordering.ll	…
global-variable-big.ll	[NVPTX] Support global variables of integer type larger than i64.	2017-01-18 00:29:53 +00:00
global-visibility.ll	…
globals_init.ll	…
globals_lowering.ll	…
gvar-init.ll	…
half.ll	[NVPTX] add support for initializing fp16 arrays.	2018-04-06 22:25:08 +00:00
i1-global.ll	…
i1-int-to-fp.ll	…
i1-param.ll	…
i8-param.ll	[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146)	2017-05-15 17:17:44 +00:00
i128-global.ll	[NVPTX] Add lowering of i128 params.	2017-07-20 21:16:03 +00:00
i128-param.ll	[NVPTX] Add lowering of i128 params.	2017-07-20 21:16:03 +00:00
i128-retval.ll	[NVPTX] Add lowering of i128 params.	2017-07-20 21:16:03 +00:00
idioms.ll	[NVPTX] Lower integer absolute value idiom to abs instruction.	2017-01-18 00:08:44 +00:00
imad.ll	…
inline-asm.ll	…
intrin-nocapture.ll	…
intrinsic-old.ll	[FileCheck] Add -allow-deprecated-dag-overlap to failing llvm tests	2018-07-11 20:25:49 +00:00
intrinsics.ll	Fix some broken CHECK lines.	2017-01-22 20:28:56 +00:00
isspacep.ll	…
ld-addrspace.ll	[NVPTX] Added a feature to use short pointers for const/local/shared AS.	2018-05-09 23:46:19 +00:00
ld-generic.ll	…
ld-st-addrrspace.py	[NVPTX] allow address space inference for volatile loads/stores.	2017-10-24 20:31:44 +00:00
ldg-invariant.ll	[NVPTX] Fixed vectorized LDG for f16.	2018-04-06 21:10:24 +00:00
ldparam-v4.ll	[NVPTX] Unify vectorization of load/stores of aggregate arguments and return values.	2017-02-21 22:56:05 +00:00
ldu-i8.ll	…
ldu-ldg.ll	…
ldu-reg-plus-offset.ll	…
lit.local.cfg	…
load-sext-i1.ll	…
load-store.ll	[NVPTX] Select atomic loads and stores	2018-08-09 07:45:49 +00:00
load-with-non-coherent-cache.ll	…
local-stack-frame.ll	…
loop-vectorize.ll	…
lower-aggr-copies.ll	Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)	2018-01-19 17:13:12 +00:00
lower-alloca.ll	NVPTX: Move InferAddressSpaces to generic code	2017-01-31 01:10:58 +00:00
lower-kernel-ptr-arg.ll	[NVPTX] Improve lowering of byval args of device functions.	2016-07-20 18:39:47 +00:00
machine-sink.ll	…
managed.ll	…
match.ll	[NVPTX] added match.{any,all}.sync instructions, intrinsics & builtins.	2017-09-26 17:07:23 +00:00
math-intrins.ll	[NVPTX] Add codegen tests for llvm.fma.	2017-01-15 16:55:37 +00:00
minmax-negative.ll	Improve clamp recognition in ValueTracking.	2017-10-27 20:53:41 +00:00
misaligned-vector-ldst.ll	[NVPTX] Fixed lowering of unaligned loads/stores of f16 scalars and vectors.	2017-03-07 20:33:38 +00:00
module-inline-asm.ll	…
mulwide.ll	…
named-barriers.ll	[NVPTX] Add intrinsics to support named barriers.	2017-01-28 16:38:15 +00:00
noduplicate-syncthreads.ll	NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0	2016-07-06 20:02:45 +00:00
nounroll.ll	…
nvcl-param-align.ll	…
nvvm-reflect-arch.ll	[NVPTX] Handle __nvvm_reflect("__CUDA_ARCH").	2018-08-03 18:05:24 +00:00
nvvm-reflect-module-flag.ll	…
nvvm-reflect.ll	[NVPTX] Let there be One True Way to set NVVMReflect params.	2017-01-15 16:54:35 +00:00
param-align.ll	[NVPTX] Make sure we adjust alignment at all call sites	2016-07-18 21:58:48 +00:00
param-load-store.ll	finish: [FileCheck] Add -allow-deprecated-dag-overlap to failing llvm tests	2018-07-11 20:31:51 +00:00
pr13291-i1-store.ll	…
pr16278.ll	…
pr17529.ll	…
read-global-variable-constant.ll	[NVPTX] Lower loads from global constants using ld.global.nc (aka LDG).	2018-02-28 23:58:05 +00:00
refl1.ll	…
reg-copy.ll	…
reg-types.ll	[NVPTX] Use untyped (.b) integer registers in PTX.	2016-08-12 22:02:19 +00:00
rotate.ll	…
sched1.ll	Only enable LiveRangeShrink for x86.	2017-05-17 20:18:13 +00:00
sched2.ll	Only enable LiveRangeShrink for x86.	2017-05-17 20:18:13 +00:00
sext-in-reg.ll	…
sext-params.ll	…
shfl-sync.ll	[NVPTX] Implemented shfl.sync instruction and supporting intrinsics/builtins.	2017-09-20 21:23:07 +00:00
shfl.ll	[NVPTX] Remove NVPTXFavorNonGenericAddrSpaces pass.	2016-10-31 21:51:42 +00:00
shift-parts.ll	…
simple-call.ll	[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146)	2017-05-15 17:17:44 +00:00
sm-version-20.ll	…
sm-version-21.ll	…
sm-version-30.ll	…
sm-version-32.ll	…
sm-version-35.ll	…
sm-version-37.ll	…
sm-version-50.ll	…
sm-version-52.ll	…
sm-version-53.ll	…
sm-version-60.ll	[NVPTX] Add sm_60, sm_61, sm_62 targets to LLVM.	2016-07-06 21:06:10 +00:00
sm-version-61.ll	[NVPTX] Add sm_60, sm_61, sm_62 targets to LLVM.	2016-07-06 21:06:10 +00:00
sm-version-62.ll	[NVPTX] Add sm_60, sm_61, sm_62 targets to LLVM.	2016-07-06 21:06:10 +00:00
sm-version-70.ll	[CUDA] Added rudimentary support for CUDA-9 and sm_70.	2017-09-07 18:14:32 +00:00
speculative-execution-divergent-target.ll	…
sqrt-approx.ll	[NVPTX] Compute approx sqrt as 1/rsqrt(x) rather than x*rsqrt(x).	2017-01-31 23:08:57 +00:00
st-addrspace.ll	[NVPTX] Added a feature to use short pointers for const/local/shared AS.	2018-05-09 23:46:19 +00:00
st-generic.ll	…
surf-read-cuda.ll	…
surf-read.ll	…
surf-write-cuda.ll	…
surf-write.ll	…
symbol-naming.ll	[NVPTX] Assign valid global names	2017-12-04 14:19:33 +00:00
tex-read-cuda.ll	…
tex-read.ll	…
texsurf-queries.ll	…
tid-range.ll	[SelectionDAG] Correctly transform range metadata to AssertZExt	2017-01-06 00:11:46 +00:00
tuple-literal.ll	…
vec-param-load.ll	[NVPTX] Unify vectorization of load/stores of aggregate arguments and return values.	2017-02-21 22:56:05 +00:00
vec8.ll	Revert r302938 "Add LiveRangeShrink pass to shrink live range within BB."	2017-05-18 18:50:05 +00:00
vector-args.ll	…
vector-call.ll	[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146)	2017-05-15 17:17:44 +00:00
vector-compare.ll	…
vector-global.ll	…
vector-loads.ll	…
vector-select.ll	…
vector-stores.ll	…
vote.ll	[NVPTX] Implemented bar.warp.sync, barrier.sync, and vote{.sync} instructions/intrinsics/builtins.	2017-09-21 18:44:49 +00:00
weak-global.ll	…
weak-linkage.ll	…
wmma.py	[NVPTX, CUDA] Added support for m8n32k16 and m32n8k16 variants of wmma instructions.	2018-04-18 21:51:48 +00:00
zero-cs.ll	llvm/test/CodeGen/NVPTX/zero-cs.ll: Relax an expression to match in -Asserts.	2016-09-21 04:43:11 +00:00
zeroext-32bit.ll	[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146)	2017-05-15 17:17:44 +00:00