Commit Graph

36 Commits

Author SHA1 Message Date
Dan Bailey b7ee561399 PTX: Reverting implementation of i8.
The .b8 operations in PTX are far more limiting than I first thought. The mov operation isn't even supported, so there's no way of converting a .pred value into a .b8 without going via .b16, which is
not sensible. An improved implementation needs to use the fact that loads and stores automatically extend and truncate to implement support for EXTLOAD and TRUNCSTORE in order to correctly support
boolean values.

llvm-svn: 133873
2011-06-25 18:16:28 +00:00
Dan Bailey dd01c4ac7a PTX: Add support for i8 type and introduce associated .b8 registers
The i8 type is required for boolean values, but can only use ld, st and mov instructions. The i1 type continues to be used for predicates.

llvm-svn: 133814
2011-06-24 19:27:10 +00:00
Justin Holewinski 1be944a466 PTX: Add preliminary support for outputting debug information in the form of
.file and .loc directives.

Ideally, we would utilize the existing support in AsmPrinter for this, but
I cannot find a way to get .file and .loc directives to print without the
rest of the associated DWARF sections, which ptxas cannot handle.

llvm-svn: 133812
2011-06-24 19:19:18 +00:00
Justin Holewinski 7d65756aba PTX: Re-work target sm/compute selection and add some basic GPU
targets: g80, gt200, gf100(fermi)

llvm-svn: 133799
2011-06-24 16:27:49 +00:00
Justin Holewinski 6cdd72a9ca PTX: Always use registers for return values, but use .param space for device
parameters if SM >= 2.0

- Update test cases to be more robust against register allocation changes
- Bump up the number of registers to 128 per type
- Include Python script to re-generate register file with any number of
  registers

llvm-svn: 133736
2011-06-23 18:10:13 +00:00
Justin Holewinski 90b336ba0c PTX: Whitespace fixes and remove commented out code
llvm-svn: 133734
2011-06-23 18:10:07 +00:00
Justin Holewinski 9b459d9d7b PTX: Prevent DCE from eliminating st.param calls, and unify the handling of
st.param and ld.param

FIXME: Test cases still need to be updated
llvm-svn: 133733
2011-06-23 18:10:05 +00:00
Justin Holewinski 9472541ba1 PTX: Use .param space for parameters in device functions for SM >= 2.0
FIXME: DCE is eliminating the final st.param.x calls, figure out why
llvm-svn: 133732
2011-06-23 18:10:03 +00:00
Justin Holewinski 08d0f3550a PTX: Fix FrameIndex mapping bug
llvm-svn: 133619
2011-06-22 16:07:03 +00:00
Justin Holewinski 54e3c0f5d9 PTX: Add .address_size directive if PTX version >= 2.3
Patch by Wei-Ren Chen

llvm-svn: 133589
2011-06-22 00:43:56 +00:00
Justin Holewinski c7b073da0f PTX: Add basic register spilling code
The current implementation generates stack loads/stores, which are
really just mov instructions from/to "special" registers.  This may
not be the most efficient implementation, compared to an approach where
the stack registers are directly folded into instructions, but this is
easier to implement and I have yet to see a case where ptxas is unable
to see through this kind of register usage and know what is really
going on.

llvm-svn: 133443
2011-06-20 15:56:20 +00:00
Jay Foad 6002068c13 Fix a FIXME by making GlobalVariable::getInitializer() return a
const Constant *.

llvm-svn: 133400
2011-06-19 18:37:11 +00:00
Justin Holewinski 7f191b2a3b PTX: Finish new calling convention implementation
llvm-svn: 133172
2011-06-16 17:50:00 +00:00
Justin Holewinski 6b356c1f3f PTX: Rename register classes for readability and combine int and fp registers
llvm-svn: 133171
2011-06-16 17:49:58 +00:00
Justin Holewinski 5ccf812b1d PTX: Fix whitespace errors
llvm-svn: 133158
2011-06-16 15:17:11 +00:00
Justin Holewinski 638f0eb116 PTX: patch to AsmPrinter
- immediate value cast as long not int
- handles initializer for constant array

Patch by Dan Bailey

llvm-svn: 130352
2011-04-28 00:19:50 +00:00
Justin Holewinski 7d8895e767 PTX: Add intrinsics to list of built-in intrinsics, which allows them to be
used by Clang.  To help Clang integration, the PTX target has been split
     into two targets: ptx32 and ptx64, depending on the desired pointer size.

- Add GCCBuiltin class to all intrinsics
- Split PTX target into ptx32 and ptx64

llvm-svn: 129851
2011-04-20 15:37:17 +00:00
Justin Holewinski 0984dcc077 PTX: Fix various codegen issues
- Emit mad instead of mad.rn for shader model 1.0
- Emit explicit mov.u32 instructions for reading global variables
- (most PTX instructions cannot take global variable immediates)

llvm-svn: 127895
2011-03-18 19:24:28 +00:00
Che-Liang Chiou b1df0fe1cc ptx: fix parameter order that is reversed
llvm-svn: 127874
2011-03-18 11:23:56 +00:00
Che-Liang Chiou ff9d938e33 ptx: add unconditional and conditional branch
llvm-svn: 127873
2011-03-18 11:08:52 +00:00
Justin Holewinski fbc8d301bf PTX: Emit global arrays with proper sizes
- Emit all arrays as type .b8 and proper sizes in bytes to conform
  to the output of nvcc

llvm-svn: 127584
2011-03-14 15:40:11 +00:00
Che-Liang Chiou a19f075974 ptx: add set.p instruction and related changes to predicate execution
llvm-svn: 127577
2011-03-14 11:26:01 +00:00
Che-Liang Chiou 58bae0e957 ptx: add basic support of predicate execution
llvm-svn: 127569
2011-03-13 17:26:00 +00:00
Justin Holewinski 969dfbcff6 PTX: Fix a couple of lint violations
llvm-svn: 126936
2011-03-03 13:34:29 +00:00
Che-Liang Chiou 7ed32cc51b ptx: fix lint and compiler warnings
llvm-svn: 126838
2011-03-02 07:58:46 +00:00
Che-Liang Chiou 59515dc703 Add 64-bit addressing to PTX backend
- Add '64bit' sub-target option.
- Select 32-bit/64-bit loads/stores based on '64bit' option.
- Fix function parameter order.

Patch by Justin Holewinski

llvm-svn: 126837
2011-03-02 07:36:48 +00:00
Che-Liang Chiou 65b1476031 Extend initial support for primitive types in PTX backend
- Allow i16, i32, i64, float, and double types, using the native .u16,
  .u32, .u64, .f32, and .f64 PTX types.
- Allow loading/storing of all primitive types.
- Allow primitive types to be passed as parameters.
- Allow selection of PTX Version and Shader Model as sub-target attributes.
- Merge integer/floating-point test cases for load/store.
- Use .u32 instead of .s32 to conform to output from NVidia nvcc compiler.

Patch by Justin Holewinski

llvm-svn: 126824
2011-03-02 03:20:28 +00:00
Che-Liang Chiou 75a800d3bf Add preliminary support for .f32 in the PTX backend.
- Add appropriate TableGen patterns for fadd, fsub, fmul.
- Add .f32 as the PTX type for the LLVM float type.
- Allow parameters, return values, and global variable declarations
  to accept the float type.
- Add appropriate test cases.

Patch by Justin Holewinski

llvm-svn: 126636
2011-02-28 06:34:09 +00:00
Che-Liang Chiou 84fde9ef2b ptx: add passing parameter to kernel functions
llvm-svn: 125279
2011-02-10 12:01:24 +00:00
Che-Liang Chiou 3ee0501338 ptx: add state spaces
llvm-svn: 122638
2010-12-30 10:41:27 +00:00
Che-Liang Chiou aaedf8be1c ptx: add ld instruction and test
llvm-svn: 122398
2010-12-22 10:38:51 +00:00
Che-Liang Chiou b2f77f6206 ptx: bug fix: use after free
llvm-svn: 120571
2010-12-01 11:45:53 +00:00
Che-Liang Chiou e9baf13657 ptx: add command-line options for gpu target and ptx version
llvm-svn: 120423
2010-11-30 10:14:14 +00:00
Che-Liang Chiou d816204056 ptx: add ld instruction
support register and register-immediate addressing mode

todo: immediate and register-register addressing mode
llvm-svn: 120407
2010-11-30 07:34:44 +00:00
Che-Liang Chiou c03d04ee1f Add simple arithmetics and %type directive for PTX
llvm-svn: 119485
2010-11-17 08:08:49 +00:00
Chris Lattner 66031ed839 move all the target's asmprinters into the main target. The piece
that should be split out is the InstPrinter (if a target is mc'ized).
This change makes all the targets be consistent.

llvm-svn: 119056
2010-11-14 18:43:56 +00:00