llvm-project

Bixia Zheng 6c21ccd245 [NVPTX] Fix the codegen for llvm.round. Summary: Previously, we translated llvm.round to PTX cvt.rni, which rounds to the even integer when the source is equidistant between two integers. This is not correct, as llvm.round should round away from zero. This change replaces llvm.round with a round-away-from-zero implementation through target-specific custom lowering. Modify a few affected tests to not check for cvt.rni. Instead, we check for the use of a few constants used in implementing round. We are also adding CUDA runnable tests to check for the values produced by llvm.round to test-suites/External/CUDA. Reviewers: tra Subscribers: jholewinski, sanjoy, jlebar, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59947 llvm-svn: 357407		2019-04-01 16:10:26 +00:00
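
The difference between the two rounding modes named in the commit message only shows up on exact ties. The sketch below is an illustration in Python, not the NVPTX lowering itself: `round_half_even` and `round_half_away` are hypothetical helper names, and the constants the real lowering uses are not reproduced here. It assumes only that Python's built-in `round` resolves ties to the even integer, which matches the tie behaviour the commit attributes to `cvt.rni`.

```python
import math


def round_half_even(x: float) -> float:
    """Ties go to the even integer (the behaviour the commit attributes to cvt.rni)."""
    if not math.isfinite(x):
        return x  # NaN and infinities pass through unchanged
    return math.copysign(float(round(x)), x)


def round_half_away(x: float) -> float:
    """Ties go away from zero (the behaviour the commit says llvm.round requires)."""
    if not math.isfinite(x):
        return x  # NaN and infinities pass through unchanged
    t = math.trunc(x)  # drop the fractional part, rounding toward zero
    if abs(x - t) >= 0.5:
        t += math.copysign(1, x)  # step one unit away from zero
    return math.copysign(float(t), x)  # keep the sign of the input, including -0.0


if __name__ == "__main__":
    # Only the exact ties (x.5) differ between the two modes.
    for value in (0.5, 1.5, 2.5, -2.5, 2.4, 2.6):
        print(value, round_half_even(value), round_half_away(value))
```

Comparing against the truncated value, rather than adding 0.5 and flooring, avoids misrounding inputs just below a tie such as 0.49999999999999994.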
..
LoadStoreVectorizer.ll	[NVPTX] Added support for .f16x2 instructions.	2017-02-23 22:38:24 +00:00
MachineSink-call.ll	…
MachineSink-convergent.ll	…
TailDuplication-convergent.ll	…
access-non-generic.ll	[NVPTX] Use addrspacecast instead of target-specific intrinsics in NVPTXGenericToNVVM.	2018-02-28 23:57:48 +00:00
add-128bit.ll	[DAGCombiner] add missing folds for scalar select of {-1,0,1}	2017-02-24 17:17:33 +00:00
addrspacecast-gvar.ll	…
addrspacecast.ll	[NVPTX] Added a feature to use short pointers for const/local/shared AS.	2018-05-09 23:46:19 +00:00
aggr-param.ll	…
aggregate-return.ll	[NVPTX] Unify vectorization of load/stores of aggregate arguments and return values.	2017-02-21 22:56:05 +00:00
alias.ll	…
annotations.ll	Whitespace cleanup in test/CodeGen/NVPTX/annotations.ll.	2016-12-14 22:32:55 +00:00
arg-lowering.ll	…
arithmetic-fp-sm20.ll	…
arithmetic-int.ll	…
atomics-sm60.ll	[NVPTX] Implement __nvvm_atom_add_gen_d builtin.	2017-11-07 22:10:54 +00:00
atomics-with-scope.ll	[NVPTX] Added intrinsics for atom.gen.{sys|cta}.* instructions.	2016-09-28 17:25:38 +00:00
atomics.ll	…
barrier.ll	[NVPTX] Implemented bar.warp.sync, barrier.sync, and vote{.sync} instructions/intrinsics/builtins.	2017-09-21 18:44:49 +00:00
bfe.ll	…
branch-fold.ll	…
bug17709.ll	[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146)	2017-05-15 17:17:44 +00:00
bug21465.ll	…
bug22246.ll	…
bug22322.ll	DAG: Check no-signed-zeros instead of unsafe-fp-math	2018-08-12 19:09:12 +00:00
bug26185-2.ll	…
bug26185.ll	…
bypass-div.ll	…
call-with-alloca-buffer.ll	…
callchain.ll	…
calling-conv.ll	…
calls-with-phi.ll	[NVPTX] Allow libcalls that are defined in the current module.	2018-12-26 19:12:31 +00:00
combine-min-max.ll	[NVPTX] Implement min/max in tablegen, rather than with custom DAGCombine logic.	2017-01-18 00:09:01 +00:00
compare-int.ll	…
constant-vectors.ll	…
convergent-mir-call.ll	…
convert-fp.ll	[NVPTX] Add fptosi tests to convert-fp.ll.	2017-01-15 16:55:54 +00:00
convert-int-sm20.ll	…
ctlz.ll	NFC - Various typo fixes in tests	2018-07-04 13:28:39 +00:00
ctpop.ll	[NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146)	2017-05-12 19:56:43 +00:00
cttz.ll	[NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146)	2017-05-12 19:56:43 +00:00
disable-opt.ll	…
div-ri.ll	…
divrem-combine.ll	[NVPTX] Compute 'rem' using the result of 'div', if possible.	2016-10-28 21:44:00 +00:00
envreg.ll	…
extloadv.ll	…
f16-instructions.ll	[NVPTX] Fix the codegen for llvm.round.	2019-04-01 16:10:26 +00:00
f16x2-instructions.ll	[NVPTX] Fix the codegen for llvm.round.	2019-04-01 16:10:26 +00:00
fast-math.ll	[NVPTX] Enable combineRepeatedFPDivisors for NVPTX.	2017-02-03 15:13:50 +00:00
fcos-no-fast-math.ll	[NVPTX] Only lower sin/cos to approximate instructions if unsafe math is allowed.	2017-01-13 18:48:13 +00:00
fma-assoc.ll	[DAGCombine] require UnsafeFPMath for re-association of addition	2017-01-31 14:35:37 +00:00
fma-disable.ll	…
fma.ll	[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146)	2017-05-15 17:17:44 +00:00
fns.ll	[NVPTX,CUDA] Added llvm.nvvm.fns intrinsic and matching __nvvm_fns builtin in clang.	2017-12-06 17:50:05 +00:00
fp-contract.ll	…
fp-literals.ll	…
fp16.ll	…
fsin-no-fast-math.ll	[NVPTX] Only lower sin/cos to approximate instructions if unsafe math is allowed.	2017-01-13 18:48:13 +00:00
function-align.ll	…
generic-to-nvvm-ir.ll	[DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label.	2018-05-09 02:40:45 +00:00
generic-to-nvvm.ll	[NVPTX] Use addrspacecast instead of target-specific intrinsics in NVPTXGenericToNVVM.	2018-02-28 23:57:48 +00:00
global-addrspace.ll	…
global-ctor-empty.ll	…
global-ctor.ll	…
global-dtor.ll	…
global-ordering.ll	…
global-variable-big.ll	[NVPTX] Support global variables of integer type larger than i64.	2017-01-18 00:29:53 +00:00
global-visibility.ll	…
globals_init.ll	…
globals_lowering.ll	…
gvar-init.ll	…
half.ll	[NVPTX] add support for initializing fp16 arrays.	2018-04-06 22:25:08 +00:00
i1-global.ll	…
i1-int-to-fp.ll	…
i1-param.ll	…
i8-param.ll	[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146)	2017-05-15 17:17:44 +00:00
i128-global.ll	[NVPTX] Add lowering of i128 params.	2017-07-20 21:16:03 +00:00
i128-param.ll	[NVPTX] Add lowering of i128 params.	2017-07-20 21:16:03 +00:00
i128-retval.ll	[NVPTX] Add lowering of i128 params.	2017-07-20 21:16:03 +00:00
i128-struct.ll	[NVPTX] Add lowering of i128 numbers as struct fields	2018-12-01 00:21:52 +00:00
idioms.ll	[NVPTX] Lower integer absolute value idiom to abs instruction.	2017-01-18 00:08:44 +00:00
imad.ll	…
inline-asm.ll	…
intrin-nocapture.ll	…
intrinsic-old.ll	[FileCheck] Add -allow-deprecated-dag-overlap to failing llvm tests	2018-07-11 20:25:49 +00:00
intrinsics.ll	[NVPTX] Some nvvm.read.ptx.sreg intrinsics should have IntrInaccessibleMemOnly attribute.	2019-01-26 00:28:32 +00:00
isspacep.ll	…
ld-addrspace.ll	[NVPTX] Added a feature to use short pointers for const/local/shared AS.	2018-05-09 23:46:19 +00:00
ld-generic.ll	…
ld-st-addrrspace.py	Python compat - print statement	2019-01-03 14:11:33 +00:00
ldg-invariant.ll	[NVPTX] Fixed vectorized LDG for f16.	2018-04-06 21:10:24 +00:00
ldparam-v4.ll	[NVPTX] Unify vectorization of load/stores of aggregate arguments and return values.	2017-02-21 22:56:05 +00:00
ldu-i8.ll	…
ldu-ldg.ll	…
ldu-reg-plus-offset.ll	…
libcall-fulfilled.ll	[NVPTX] Allow libcalls that are defined in the current module.	2018-12-26 19:12:31 +00:00
libcall-instruction.ll	[NVPTX] Allow libcalls that are defined in the current module.	2018-12-26 19:12:31 +00:00
libcall-intrinsic.ll	[NVPTX] Allow libcalls that are defined in the current module.	2018-12-26 19:12:31 +00:00
lit.local.cfg	…
load-sext-i1.ll	…
load-store.ll	[NVPTX] Select atomic loads and stores	2018-08-09 07:45:49 +00:00
load-with-non-coherent-cache.ll	…
local-stack-frame.ll	…
loop-vectorize.ll	…
lower-aggr-copies.ll	Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)	2018-01-19 17:13:12 +00:00
lower-alloca.ll	NVPTX: Move InferAddressSpaces to generic code	2017-01-31 01:10:58 +00:00
lower-kernel-ptr-arg.ll	…
machine-sink.ll	…
managed.ll	…
match.ll	[NVPTX] added match.{any,all}.sync instructions, intrinsics & builtins.	2017-09-26 17:07:23 +00:00
math-intrins.ll	[NVPTX] Fix the codegen for llvm.round.	2019-04-01 16:10:26 +00:00
minmax-negative.ll	Improve clamp recognition in ValueTracking.	2017-10-27 20:53:41 +00:00
misaligned-vector-ldst.ll	[NVPTX] Fixed lowering of unaligned loads/stores of f16 scalars and vectors.	2017-03-07 20:33:38 +00:00
module-inline-asm.ll	…
mulwide.ll	…
named-barriers.ll	[NVPTX] Add intrinsics to support named barriers.	2017-01-28 16:38:15 +00:00
noduplicate-syncthreads.ll	…
nofunc.ll	[NVPTX] do not rely on cached subtarget info.	2018-12-12 18:31:04 +00:00
nounroll.ll	…
nvcl-param-align.ll	…
nvvm-reflect-arch.ll	[NVPTX] Handle __nvvm_reflect("__CUDA_ARCH").	2018-08-03 18:05:24 +00:00
nvvm-reflect-module-flag.ll	…
nvvm-reflect.ll	[NVPTX] Let there be One True Way to set NVVMReflect params.	2017-01-15 16:54:35 +00:00
param-align.ll	…
param-load-store.ll	[Codegen] fix typos in test case	2019-03-02 08:03:59 +00:00
pr13291-i1-store.ll	…
pr16278.ll	…
pr17529.ll	…
proxy-reg-erasure-mir.ll	[NVPTX] Allow libcalls that are defined in the current module.	2018-12-26 19:12:31 +00:00
proxy-reg-erasure-ptx.ll	[NVPTX] Allow libcalls that are defined in the current module.	2018-12-26 19:12:31 +00:00
read-global-variable-constant.ll	[NVPTX] Lower loads from global constants using ld.global.nc (aka LDG).	2018-02-28 23:58:05 +00:00
refl1.ll	…
reg-copy.ll	…
reg-types.ll	[NVPTX] Use untyped (.b) integer registers in PTX.	2016-08-12 22:02:19 +00:00
rotate.ll	…
sched1.ll	Only enable LiveRangeShrink for x86.	2017-05-17 20:18:13 +00:00
sched2.ll	Only enable LiveRangeShrink for x86.	2017-05-17 20:18:13 +00:00
sext-in-reg.ll	…
sext-params.ll	…
shfl-sync.ll	[NVPTX] Implemented shfl.sync instruction and supporting intrinsics/builtins.	2017-09-20 21:23:07 +00:00
shfl.ll	[NVPTX] Remove NVPTXFavorNonGenericAddrSpaces pass.	2016-10-31 21:51:42 +00:00
shift-parts.ll	…
simple-call.ll	[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146)	2017-05-15 17:17:44 +00:00
sm-version-20.ll	…
sm-version-21.ll	…
sm-version-30.ll	…
sm-version-32.ll	…
sm-version-35.ll	…
sm-version-37.ll	…
sm-version-50.ll	…
sm-version-52.ll	…
sm-version-53.ll	…
sm-version-60.ll	…
sm-version-61.ll	…
sm-version-62.ll	…
sm-version-70.ll	[CUDA] Added rudimentary support for CUDA-9 and sm_70.	2017-09-07 18:14:32 +00:00
speculative-execution-divergent-target.ll	…
sqrt-approx.ll	[NVPTX] Compute approx sqrt as 1/rsqrt(x) rather than x*rsqrt(x).	2017-01-31 23:08:57 +00:00
st-addrspace.ll	[NVPTX] Added a feature to use short pointers for const/local/shared AS.	2018-05-09 23:46:19 +00:00
st-generic.ll	…
surf-read-cuda.ll	…
surf-read.ll	…
surf-write-cuda.ll	…
surf-write.ll	…
symbol-naming.ll	[NVPTX] Assign valid global names	2017-12-04 14:19:33 +00:00
tex-read-cuda.ll	…
tex-read.ll	…
texsurf-queries.ll	…
tid-range.ll	[SelectionDAG] Correctly transform range metadata to AssertZExt	2017-01-06 00:11:46 +00:00
tuple-literal.ll	…
vec-param-load.ll	[NVPTX] Unify vectorization of load/stores of aggregate arguments and return values.	2017-02-21 22:56:05 +00:00
vec8.ll	Revert r302938 "Add LiveRangeShrink pass to shrink live range within BB."	2017-05-18 18:50:05 +00:00
vector-args.ll	…
vector-call.ll	[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146)	2017-05-15 17:17:44 +00:00
vector-compare.ll	…
vector-global.ll	…
vector-loads.ll	…
vector-select.ll	…
vector-stores.ll	…
vectorize-misaligned.ll	[NVPTX] Implement isLegalToVectorizeLoadChain	2018-08-27 17:29:43 +00:00
vote.ll	[NVPTX] Implemented bar.warp.sync, barrier.sync, and vote{.sync} instructions/intrinsics/builtins.	2017-09-21 18:44:49 +00:00
weak-global.ll	…
weak-linkage.ll	…
wmma.py	Python compat - print statement	2019-01-03 14:11:33 +00:00
zeroext-32bit.ll	[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146)	2017-05-15 17:17:44 +00:00