Justin Holewinski
428cf0e49a
[NVPTX] Improve handling of FP fusion
...
We now consider the FPOpFusion flag when determining whether
to fuse ops. We also explicitly emit add.rn when fusion is
disabled to prevent ptxas from fusing the operations on its
own.
llvm-svn: 213287
2014-07-17 18:10:09 +00:00
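The effect of fusion on results can be illustrated outside PTX. The following is a minimal sketch in Python using IEEE doubles, not the back-end's code: `unfused_mul_add` and `fused_mul_add` are hypothetical helper names, and the fused result is modelled by computing exactly with `Fraction` and rounding once.

```python
from fractions import Fraction

def unfused_mul_add(a, b, c):
    # Two roundings: the product is rounded to a double, then the sum is rounded.
    return a * b + c

def fused_mul_add(a, b, c):
    # Model of a fused multiply-add: exact arithmetic, then a single rounding.
    return float(Fraction(a) * Fraction(b) + Fraction(c))

# Inputs chosen so the exact product (1 - 2**-54) is not representable
# as a double and rounds to 1.0 when the multiply is done on its own.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

print(unfused_mul_add(a, b, c))
print(fused_mul_add(a, b, c))
```

The two functions disagree on these inputs, which is why a mode that disables fusion has to stop every later stage from contracting the operations as well.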
Justin Holewinski
e5a1173f67
[NVPTX] Add missing .v4 qualifier on vector store instruction
...
llvm-svn: 213276
2014-07-17 16:58:56 +00:00
Justin Holewinski
360a5cfcd3
[NVPTX] Add support for [SHL,SRA,SRL]_PARTS
...
llvm-svn: 211936
2014-06-27 18:35:40 +00:00
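For readers unfamiliar with the `*_PARTS` nodes: they describe a shift of a wide value that is held as two narrower halves. The sketch below shows the arithmetic for the left-shift case with 32-bit halves. It is an illustration in Python under that assumption, not the instruction sequence the back-end emits, and `shl_parts` is a hypothetical helper name.

```python
MASK32 = 0xFFFFFFFF

def shl_parts(lo, hi, amt):
    """Shift the 64-bit value (hi:lo) left by amt (0..63) using 32-bit halves."""
    if amt == 0:
        return lo, hi
    if amt < 32:
        # Bits shifted out of the low half move into the high half.
        new_hi = ((hi << amt) | (lo >> (32 - amt))) & MASK32
        new_lo = (lo << amt) & MASK32
    else:
        # The whole low half moves into the high half; the low half becomes zero.
        new_hi = (lo << (amt - 32)) & MASK32
        new_lo = 0
    return new_lo, new_hi
```

SRL_PARTS and SRA_PARTS follow the same pattern in the other direction, with the arithmetic variant filling the high half with copies of the sign bit.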
Justin Holewinski
eafe26d082
[NVPTX] Implement fma and imad contraction as target DAGCombiner patterns
...
This also introduces DAGCombiner patterns for mul.wide to multiply two smaller integers and produce a larger integer
llvm-svn: 211935
2014-06-27 18:35:37 +00:00
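The idea behind a widening multiply can be sketched as follows: multiplying two sign-extended 16-bit values in 32 bits gives the same result as a single 16 x 16 -> 32 multiply. This is a Python illustration only; `sext`, `mul_wide_s16`, and `mul_after_extend` are hypothetical names, and the 16/32-bit widths are an assumption chosen for the example.

```python
def sext(value, bits):
    """Interpret the low `bits` bits of value as a signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def mul_wide_s16(a, b):
    """Signed 16 x 16 -> 32 multiply, returned as a 32-bit pattern."""
    return (sext(a, 16) * sext(b, 16)) & 0xFFFFFFFF

def mul_after_extend(a, b):
    """Sign-extend both operands to 32 bits, then keep the low 32 bits of the product."""
    a32 = sext(a, 16) & 0xFFFFFFFF
    b32 = sext(b, 16) & 0xFFFFFFFF
    return (a32 * b32) & 0xFFFFFFFF
```

Because both forms agree for every input, a combiner is free to replace the extend-then-multiply form with the single widening instruction.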
Justin Holewinski
832e09b4d9
[NVPTX] Add support for efficient rotate instructions on SM 3.2+
...
llvm-svn: 211934
2014-06-27 18:35:33 +00:00
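For reference, a rotate is the shift-and-or idiom below; without a dedicated instruction it costs two shifts and an or. This is a Python sketch of the 32-bit semantics only, with hypothetical helper names `rotl32` and `rotr32`; it says nothing about which PTX instructions the back-end selects.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate the 32-bit value x left by n bits (n taken modulo 32)."""
    n &= 31
    x &= MASK32
    return ((x << n) | (x >> (32 - n))) & MASK32

def rotr32(x, n):
    """Rotate the 32-bit value x right by n bits (n taken modulo 32)."""
    return rotl32(x, (32 - (n & 31)) & 31)
```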
Justin Holewinski
ca7a4f136d
[NVPTX] Add isel patterns for bit-field extract (bfe)
...
llvm-svn: 211932
2014-06-27 18:35:27 +00:00
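A bit-field extract corresponds to the familiar shift-and-mask idiom. The sketch below covers only the simple unsigned case where the field lies entirely inside a 32-bit word; the PTX `bfe` instruction defines additional behaviour for out-of-range positions and for signed fields that is not modelled here. `bfe_u32` is a hypothetical helper name.

```python
def bfe_u32(x, pos, length):
    """Extract `length` bits of the 32-bit value x starting at bit `pos`.

    Assumes 0 <= pos and pos + length <= 32.
    """
    return ((x & 0xFFFFFFFF) >> pos) & ((1 << length) - 1)
```

For example, `bfe_u32(0xABCD1234, 8, 8)` picks out the second-lowest byte.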
Justin Holewinski
10c25968d8
[NVPTX] Add support for isspacep instruction
...
llvm-svn: 211931
2014-06-27 18:35:24 +00:00
Justin Holewinski
7706107e6f
[NVPTX] Add missing patterns for div.approx with immediate denominator
...
llvm-svn: 199746
2014-01-21 14:40:05 +00:00
Justin Holewinski
3d49e5c655
[NVPTX] Fix handling of indirect calls
...
Using a special machine node is cleaner than an InlineAsm node, and fixes an assertion failure in InstrEmitter
llvm-svn: 194810
2013-11-15 12:30:04 +00:00
Justin Holewinski
debe686f05
[NVPTX] Add missing patterns for i1 [s,u]int_to_fp
...
llvm-svn: 187800
2013-08-06 14:13:34 +00:00
Justin Holewinski
871ec93909
[NVPTX] Fix bug in stack code generation caused by MC conversion
...
We do use a very small set of physical registers, so account for
them in the virtual register encoding between MachineInstr and MC
llvm-svn: 187799
2013-08-06 14:13:31 +00:00
Justin Holewinski
cd069e6dec
[NVPTX] Use approximate FP ops when unsafe-fp-math is used, and append .ftz to instructions if the nvptx-f32ftz attribute is set to "true"
...
llvm-svn: 186820
2013-07-22 12:18:04 +00:00
Justin Holewinski
318c625ff4
[NVPTX] Add support for native SIGN_EXTEND_INREG where available
...
llvm-svn: 185330
2013-07-01 12:58:56 +00:00
Justin Holewinski
8df08c73c6
[NVPTX] Select -1 instead of 1 when anyextend'ing i1 types
...
This makes it more consistent with the ZeroOrNegativeOneBooleanContent flag
llvm-svn: 185179
2013-06-28 17:58:15 +00:00
Justin Holewinski
af258be134
[NVPTX] Add (1.0 / sqrt(x)) => rsqrt(x) generation when allowable by FP flags
...
llvm-svn: 185178
2013-06-28 17:58:13 +00:00
Justin Holewinski
dc372df63b
[NVPTX] Add support for cttz/ctlz/ctpop
...
llvm-svn: 185176
2013-06-28 17:58:07 +00:00
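The three operations named in this commit have simple reference semantics, sketched below for 32-bit values. This is a Python illustration with hypothetical helper names. Returning 32 for a zero input is an assumption matching the defined-at-zero form of the LLVM intrinsics, not a statement about the PTX the back-end emits.

```python
MASK32 = 0xFFFFFFFF

def ctpop32(x):
    """Number of set bits in the 32-bit value x."""
    return bin(x & MASK32).count("1")

def ctlz32(x):
    """Number of leading zero bits in the 32-bit value x (32 when x is zero)."""
    return 32 - (x & MASK32).bit_length()

def cttz32(x):
    """Number of trailing zero bits in the 32-bit value x (32 when x is zero)."""
    x &= MASK32
    if x == 0:
        return 32
    # x & -x isolates the lowest set bit.
    return (x & -x).bit_length() - 1
```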
Justin Holewinski
dc5e3b68f5
[NVPTX] Clean up comparison/select/convert patterns and factor out PTX instructions from their patterns
...
Test case is that nothing breaks
llvm-svn: 185175
2013-06-28 17:58:04 +00:00
Justin Holewinski
f8f7091722
[NVPTX] Remove i8 register class. PTX support for i8 (.b8, .u8, .s8) is rather poor and we're better off just ignoring it and letting LLVM expand all i8 ops out to i16.
...
llvm-svn: 185174
2013-06-28 17:57:59 +00:00
Justin Holewinski
fe44314f21
[NVPTX] Add infrastructure for vector loads/stores of parameters
...
llvm-svn: 185171
2013-06-28 17:57:51 +00:00
Justin Holewinski
48f4ad3fc0
[NVPTX] Add @llvm.nvvm.sqrt.f() intrinsic
...
llvm-svn: 182394
2013-05-21 16:51:30 +00:00
Justin Holewinski
be8dc6499a
[NVPTX] Disable vector registers
...
Vectors were being manually scalarized by the backend. Instead,
let the target-independent code do all of the work. The manual
scalarization was from a time before good target-independent support
for scalarization in LLVM. However, this forces us to specially-handle
vector loads and stores, which we can turn into PTX instructions that
produce/consume multiple operands.
llvm-svn: 174968
2013-02-12 14:18:49 +00:00
Benjamin Kramer
bde9176663
Fix typos found by http://github.com/lyda/misspell-check
...
llvm-svn: 157885
2012-06-02 10:20:22 +00:00
Justin Holewinski
ae556d3ef7
This patch adds a new NVPTX back-end to LLVM which supports code generation for NVIDIA PTX 3.0. This back-end will (eventually) replace the current PTX back-end, while maintaining compatibility with it.
...
The new target machines are:
nvptx (old ptx32) => 32-bit PTX
nvptx64 (old ptx64) => 64-bit PTX
The sources are based on the internal NVIDIA NVPTX back-end, and
contain more functionality than the current PTX back-end
provides.
NV_CONTRIB
llvm-svn: 156196
2012-05-04 20:18:50 +00:00