llvm-project/llvm/test/CodeGen/NVPTX
Justin Lebar 49fac56ea3 [NVPTX] Allow libcalls that are defined in the current module.
The patch adds a possibility to make library calls on NVPTX.

An important thing about library functions - they must be defined within
the current module. This basically should guarantee that we produce a
valid PTX assembly (without calls to not defined functions). The one who
wants to use the libcalls is probably will have to link against
compiler-rt or any other implementation.

Currently, it's completely impossible to make library calls because of
error LLVM ERROR: Cannot select: i32 = ExternalSymbol '...'. But we can
lower ExternalSymbol to TargetExternalSymbol and verify if the function
definition is available.

Also, there was an issue with a DAG during legalisation. When we expand
instruction into libcall, the inner call-chain isn't being "integrated"
into outer chain. Since the last "data-flow" (call retval load) node is
located in call-chain earlier than CALLSEQ_END node, the latter becomes
a leaf and therefore a dead node (and is being removed quite fast).
Proposed here solution relies on another data-flow pseudo nodes
(ProxyReg) which purpose is only to keep CALLSEQ_END at legalisation and
instruction selection phases - we remove the pseudo instructions before
register scheduling phase.

Patch by Denys Zariaiev!

Differential Revision: https://reviews.llvm.org/D34708

llvm-svn: 350069
2018-12-26 19:12:31 +00:00
..
LoadStoreVectorizer.ll [NVPTX] Added support for .f16x2 instructions. 2017-02-23 22:38:24 +00:00
MachineSink-call.ll
MachineSink-convergent.ll NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0 2016-07-06 20:02:45 +00:00
TailDuplication-convergent.ll NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0 2016-07-06 20:02:45 +00:00
access-non-generic.ll [NVPTX] Use addrspacecast instead of target-specific intrinsics in NVPTXGenericToNVVM. 2018-02-28 23:57:48 +00:00
add-128bit.ll [DAGCombiner] add missing folds for scalar select of {-1,0,1} 2017-02-24 17:17:33 +00:00
addrspacecast-gvar.ll
addrspacecast.ll [NVPTX] Added a feature to use short pointers for const/local/shared AS. 2018-05-09 23:46:19 +00:00
aggr-param.ll
aggregate-return.ll [NVPTX] Unify vectorization of load/stores of aggregate arguments and return values. 2017-02-21 22:56:05 +00:00
alias.ll
annotations.ll Whitespace cleanup in test/CodeGen/NVPTX/annotations.ll. 2016-12-14 22:32:55 +00:00
arg-lowering.ll
arithmetic-fp-sm20.ll
arithmetic-int.ll
atomics-sm60.ll [NVPTX] Implement __nvvm_atom_add_gen_d builtin. 2017-11-07 22:10:54 +00:00
atomics-with-scope.ll [NVPTX] Added intrinsics for atom.gen.{sys|cta}.* instructions. 2016-09-28 17:25:38 +00:00
atomics.ll
barrier.ll [NVPTX] Implemented bar.warp.sync, barrier.sync, and vote{.sync} instructions/intrinsics/builtins. 2017-09-21 18:44:49 +00:00
bfe.ll
branch-fold.ll
bug17709.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
bug21465.ll [NVPTX] Renamed NVPTXLowerKernelArgs -> NVPTXLowerArgs. NFC. 2016-07-20 21:44:07 +00:00
bug22246.ll
bug22322.ll DAG: Check no-signed-zeros instead of unsafe-fp-math 2018-08-12 19:09:12 +00:00
bug26185-2.ll [NVPTX] Fix sign/zero-extending ldg/ldu instruction selection 2016-05-02 18:12:02 +00:00
bug26185.ll
bypass-div.ll
call-with-alloca-buffer.ll Fix NVPTX/call-with-alloca-buffer.ll after r276777. 2016-07-26 18:28:33 +00:00
callchain.ll
calling-conv.ll
calls-with-phi.ll [NVPTX] Allow libcalls that are defined in the current module. 2018-12-26 19:12:31 +00:00
combine-min-max.ll [NVPTX] Implement min/max in tablegen, rather than with custom DAGComine logic. 2017-01-18 00:09:01 +00:00
compare-int.ll
constant-vectors.ll
convergent-mir-call.ll
convert-fp.ll [NVPTX] Add fptosi tests to convert-fp.ll. 2017-01-15 16:55:54 +00:00
convert-int-sm20.ll
ctlz.ll NFC - Various typo fixes in tests 2018-07-04 13:28:39 +00:00
ctpop.ll [NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146) 2017-05-12 19:56:43 +00:00
cttz.ll [NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146) 2017-05-12 19:56:43 +00:00
disable-opt.ll
div-ri.ll
divrem-combine.ll [NVPTX] Compute 'rem' using the result of 'div', if possible. 2016-10-28 21:44:00 +00:00
envreg.ll
extloadv.ll
f16-instructions.ll [NVPTX] Remove ftz variants of cvt with rounding mode 2018-08-21 18:44:25 +00:00
f16x2-instructions.ll [LegalizeVectorTypes] Don't use SplitVecOp_TruncateHelper if we're heading towards scalarizing the type. 2018-11-23 02:32:13 +00:00
fast-math.ll [NVPTX] Enable combineRepeatedFPDivisors for NVPTX. 2017-02-03 15:13:50 +00:00
fcos-no-fast-math.ll [NVPTX] Only lower sin/cos to approximate instructions if unsafe math is allowed. 2017-01-13 18:48:13 +00:00
fma-assoc.ll [DAGCombine] require UnsafeFPMath for re-association of addition 2017-01-31 14:35:37 +00:00
fma-disable.ll
fma.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
fns.ll [NVPTX,CUDA] Added llvm.nvvm.fns intrinsic and matching __nvvm_fns builtin in clang. 2017-12-06 17:50:05 +00:00
fp-contract.ll
fp-literals.ll
fp16.ll
fsin-no-fast-math.ll [NVPTX] Only lower sin/cos to approximate instructions if unsafe math is allowed. 2017-01-13 18:48:13 +00:00
function-align.ll
generic-to-nvvm-ir.ll [DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label. 2018-05-09 02:40:45 +00:00
generic-to-nvvm.ll [NVPTX] Use addrspacecast instead of target-specific intrinsics in NVPTXGenericToNVVM. 2018-02-28 23:57:48 +00:00
global-addrspace.ll
global-ctor-empty.ll
global-ctor.ll
global-dtor.ll
global-ordering.ll
global-variable-big.ll [NVPTX] Support global variables of integer type larger than i64. 2017-01-18 00:29:53 +00:00
global-visibility.ll
globals_init.ll
globals_lowering.ll
gvar-init.ll
half.ll [NVPTX] add support for initializing fp16 arrays. 2018-04-06 22:25:08 +00:00
i1-global.ll
i1-int-to-fp.ll
i1-param.ll
i8-param.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
i128-global.ll [NVPTX] Add lowering of i128 params. 2017-07-20 21:16:03 +00:00
i128-param.ll [NVPTX] Add lowering of i128 params. 2017-07-20 21:16:03 +00:00
i128-retval.ll [NVPTX] Add lowering of i128 params. 2017-07-20 21:16:03 +00:00
i128-struct.ll [NVPTX] Add lowering of i128 numbers as struct fields 2018-12-01 00:21:52 +00:00
idioms.ll [NVPTX] Lower integer absolute value idiom to abs instruction. 2017-01-18 00:08:44 +00:00
imad.ll
inline-asm.ll
intrin-nocapture.ll
intrinsic-old.ll [FileCheck] Add -allow-deprecated-dag-overlap to failing llvm tests 2018-07-11 20:25:49 +00:00
intrinsics.ll Fix some broken CHECK lines. 2017-01-22 20:28:56 +00:00
isspacep.ll
ld-addrspace.ll [NVPTX] Added a feature to use short pointers for const/local/shared AS. 2018-05-09 23:46:19 +00:00
ld-generic.ll
ld-st-addrrspace.py [NVPTX] allow address space inference for volatile loads/stores. 2017-10-24 20:31:44 +00:00
ldg-invariant.ll [NVPTX] Fixed vectorized LDG for f16. 2018-04-06 21:10:24 +00:00
ldparam-v4.ll [NVPTX] Unify vectorization of load/stores of aggregate arguments and return values. 2017-02-21 22:56:05 +00:00
ldu-i8.ll
ldu-ldg.ll
ldu-reg-plus-offset.ll
libcall-fulfilled.ll [NVPTX] Allow libcalls that are defined in the current module. 2018-12-26 19:12:31 +00:00
libcall-instruction.ll [NVPTX] Allow libcalls that are defined in the current module. 2018-12-26 19:12:31 +00:00
libcall-intrinsic.ll [NVPTX] Allow libcalls that are defined in the current module. 2018-12-26 19:12:31 +00:00
lit.local.cfg
load-sext-i1.ll
load-store.ll [NVPTX] Select atomic loads and stores 2018-08-09 07:45:49 +00:00
load-with-non-coherent-cache.ll
local-stack-frame.ll
loop-vectorize.ll
lower-aggr-copies.ll Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1) 2018-01-19 17:13:12 +00:00
lower-alloca.ll NVPTX: Move InferAddressSpaces to generic code 2017-01-31 01:10:58 +00:00
lower-kernel-ptr-arg.ll [NVPTX] Improve lowering of byval args of device functions. 2016-07-20 18:39:47 +00:00
machine-sink.ll
managed.ll
match.ll [NVPTX] added match.{any,all}.sync instructions, intrinsics & builtins. 2017-09-26 17:07:23 +00:00
math-intrins.ll [NVPTX] Add codegen tests for llvm.fma. 2017-01-15 16:55:37 +00:00
minmax-negative.ll Improve clamp recognition in ValueTracking. 2017-10-27 20:53:41 +00:00
misaligned-vector-ldst.ll [NVPTX] Fixed lowering of unaligned loads/stores of f16 scalars and vectors. 2017-03-07 20:33:38 +00:00
module-inline-asm.ll
mulwide.ll
named-barriers.ll [NVPTX] Add intrinsics to support named barriers. 2017-01-28 16:38:15 +00:00
noduplicate-syncthreads.ll NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0 2016-07-06 20:02:45 +00:00
nofunc.ll [NVPTX] do not rely on cached subtarget info. 2018-12-12 18:31:04 +00:00
nounroll.ll
nvcl-param-align.ll
nvvm-reflect-arch.ll [NVPTX] Handle __nvvm_reflect("__CUDA_ARCH"). 2018-08-03 18:05:24 +00:00
nvvm-reflect-module-flag.ll
nvvm-reflect.ll [NVPTX] Let there be One True Way to set NVVMReflect params. 2017-01-15 16:54:35 +00:00
param-align.ll [NVPTX] Make sure we adjust alignment at all call sites 2016-07-18 21:58:48 +00:00
param-load-store.ll finish: [FileCheck] Add -allow-deprecated-dag-overlap to failing llvm tests 2018-07-11 20:31:51 +00:00
pr13291-i1-store.ll
pr16278.ll
pr17529.ll
proxy-reg-erasure-mir.ll [NVPTX] Allow libcalls that are defined in the current module. 2018-12-26 19:12:31 +00:00
proxy-reg-erasure-ptx.ll [NVPTX] Allow libcalls that are defined in the current module. 2018-12-26 19:12:31 +00:00
read-global-variable-constant.ll [NVPTX] Lower loads from global constants using ld.global.nc (aka LDG). 2018-02-28 23:58:05 +00:00
refl1.ll
reg-copy.ll
reg-types.ll [NVPTX] Use untyped (.b) integer registers in PTX. 2016-08-12 22:02:19 +00:00
rotate.ll
sched1.ll Only enable LiveRangeShrink for x86. 2017-05-17 20:18:13 +00:00
sched2.ll Only enable LiveRangeShrink for x86. 2017-05-17 20:18:13 +00:00
sext-in-reg.ll
sext-params.ll
shfl-sync.ll [NVPTX] Implemented shfl.sync instruction and supporting intrinsics/builtins. 2017-09-20 21:23:07 +00:00
shfl.ll [NVPTX] Remove NVPTXFavorNonGenericAddrSpaces pass. 2016-10-31 21:51:42 +00:00
shift-parts.ll
simple-call.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
sm-version-20.ll
sm-version-21.ll
sm-version-30.ll
sm-version-32.ll
sm-version-35.ll
sm-version-37.ll
sm-version-50.ll
sm-version-52.ll
sm-version-53.ll
sm-version-60.ll [NVPTX] Add sm_60, sm_61, sm_62 targets to LLVM. 2016-07-06 21:06:10 +00:00
sm-version-61.ll [NVPTX] Add sm_60, sm_61, sm_62 targets to LLVM. 2016-07-06 21:06:10 +00:00
sm-version-62.ll [NVPTX] Add sm_60, sm_61, sm_62 targets to LLVM. 2016-07-06 21:06:10 +00:00
sm-version-70.ll [CUDA] Added rudimentary support for CUDA-9 and sm_70. 2017-09-07 18:14:32 +00:00
speculative-execution-divergent-target.ll
sqrt-approx.ll [NVPTX] Compute approx sqrt as 1/rsqrt(x) rather than x*rsqrt(x). 2017-01-31 23:08:57 +00:00
st-addrspace.ll [NVPTX] Added a feature to use short pointers for const/local/shared AS. 2018-05-09 23:46:19 +00:00
st-generic.ll
surf-read-cuda.ll
surf-read.ll
surf-write-cuda.ll
surf-write.ll
symbol-naming.ll [NVPTX] Assign valid global names 2017-12-04 14:19:33 +00:00
tex-read-cuda.ll
tex-read.ll
texsurf-queries.ll
tid-range.ll [SelectionDAG] Correctly transform range metadata to AssertZExt 2017-01-06 00:11:46 +00:00
tuple-literal.ll
vec-param-load.ll [NVPTX] Unify vectorization of load/stores of aggregate arguments and return values. 2017-02-21 22:56:05 +00:00
vec8.ll Revert r302938 "Add LiveRangeShrink pass to shrink live range within BB." 2017-05-18 18:50:05 +00:00
vector-args.ll
vector-call.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00
vector-compare.ll
vector-global.ll
vector-loads.ll
vector-select.ll
vector-stores.ll
vectorize-misaligned.ll [NVPTX] Implement isLegalToVectorizeLoadChain 2018-08-27 17:29:43 +00:00
vote.ll [NVPTX] Implemented bar.warp.sync, barrier.sync, and vote{.sync} instructions/intrinsics/builtins. 2017-09-21 18:44:49 +00:00
weak-global.ll
weak-linkage.ll
wmma.py [NVPTX, CUDA] Added support for m8n32k16 and m32n8k16 variants of wmma instructions. 2018-04-18 21:51:48 +00:00
zeroext-32bit.ll [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) 2017-05-15 17:17:44 +00:00