Commit Graph

87 Commits

Author SHA1 Message Date
Justin Holewinski a0d531f031 [NVPTX] Add reflect intrinsic (better than matching by function name)
Also clean up some of the logic in NVVMReflect.cpp while we're messing around in there.

llvm-svn: 211948
2014-06-27 18:36:11 +00:00
Justin Holewinski 2739c0175c [NVPTX] Add 'b' asm constraint
llvm-svn: 211946
2014-06-27 18:36:06 +00:00
Justin Holewinski 549c773619 [NVPTX] Error out if initializer is given for variable in an address space that does not support initialization
llvm-svn: 211943
2014-06-27 18:36:01 +00:00
Justin Holewinski 773ca40f5d [NVPTX] Add support for .managed variables for UVM
llvm-svn: 211942
2014-06-27 18:35:58 +00:00
Justin Holewinski d73767a80a [NVPTX] Emit .weak linkage for link_once, weak, available_externally, and common linkage
llvm-svn: 211941
2014-06-27 18:35:56 +00:00
Justin Holewinski b926d9d446 [NVPTX] Fix handling of ldg/ldu intrinsics.
The address space of the pointer must be global (1) for these intrinsics.  There must also be alignment metadata attached to the intrinsic calls, e.g.

%val = tail call i32 @llvm.nvvm.ldu.i.global.i32.p1i32(i32 addrspace(1)* %ptr), !align !0

!0 = metadata !{i32 4}

llvm-svn: 211939
2014-06-27 18:35:51 +00:00
Justin Holewinski 6e40f63e41 [NVPTX] Clean up argument lowering code and properly handle alignment for structs and vectors
llvm-svn: 211938
2014-06-27 18:35:44 +00:00
Justin Holewinski 360a5cfcd3 [NVPTX] Add support for [SHL,SRA,SRL]_PARTS
llvm-svn: 211936
2014-06-27 18:35:40 +00:00
Justin Holewinski eafe26d082 [NVPTX] Implement fma and imad contraction as target DAGCombiner patterns
This also introduces DAGCombiner patterns for mul.wide to multiply two smaller integers and produce a larger integer

llvm-svn: 211935
2014-06-27 18:35:37 +00:00
Justin Holewinski 832e09b4d9 [NVPTX] Add support for efficient rotate instructions on SM 3.2+
llvm-svn: 211934
2014-06-27 18:35:33 +00:00
Justin Holewinski 7be57de6b8 [NVPTX] Add missing isel patterns for 64-bit atomics
llvm-svn: 211933
2014-06-27 18:35:30 +00:00
Justin Holewinski ca7a4f136d [NVPTX] Add isel patterns for bit-field extract (bfe)
llvm-svn: 211932
2014-06-27 18:35:27 +00:00
Justin Holewinski 10c25968d8 [NVPTX] Add support for isspacep instruction
llvm-svn: 211931
2014-06-27 18:35:24 +00:00
Justin Holewinski 124fc1951f [NVPTX] Add support for envreg reads
llvm-svn: 211930
2014-06-27 18:35:21 +00:00
Justin Holewinski 7d5bf66f61 [NVPTX] Emit .weak when linkage is not external, internal, or private
llvm-svn: 211926
2014-06-27 18:35:10 +00:00
Jingyue Wu baabe5091c Canonicalize addrspacecast ConstExpr between different pointer types
As a follow-up to r210375 which canonicalizes addrspacecast
instructions, this patch canonicalizes addrspacecast constant
expressions.

Given clang uses ConstantExpr::getAddrSpaceCast to emit addrspacecast
cosntant expressions, this patch is also a step towards having the
frontend emit canonicalized addrspacecasts.

Piggyback a minor refactor in InstCombineCasts.cpp

Update three affected tests in addrspacecast-alias.ll,
access-non-generic.ll and constant-fold-gep.ll and added one new test in
constant-fold-address-space-pointer.ll

llvm-svn: 211004
2014-06-15 21:40:57 +00:00
Alp Toker d3d017cf00 Reduce verbiage of lit.local.cfg files
We can just split targets_to_build in one place and make it immutable.

llvm-svn: 210496
2014-06-09 22:42:55 +00:00
Rafael Espindola 42a4c9f9e0 Allow aliases to be unnamed_addr.
Alias with unnamed_addr were in a strange state. It is stored in GlobalValue,
the language reference talks about "unnamed_addr aliases" but the verifier
was rejecting them.

It seems natural to allow unnamed_addr in aliases:

* It is a property of how it is accessed, not of the data itself.
* It is perfectly possible to write code that depends on the address
of an alias.

This patch then makes unname_addr legal for aliases. One side effect is that
the syntax changes for a corner case: In globals, unnamed_addr is now printed
before the address space.

llvm-svn: 210302
2014-06-06 01:20:28 +00:00
Eli Bendersky 7cd70df708 Fix the test: DCE optimized away everything.
Use volatile store to protect the generated PTX from DCE.

Patch by Jingyue Wu.

llvm-svn: 206763
2014-04-21 17:23:12 +00:00
Justin Holewinski 30d56a7b86 [NVPTX] Add preliminary intrinsics and codegen support for textures/surfaces
This commit adds intrinsics and codegen support for the surface read/write and texture read instructions that take an explicit sampler parameter. Codegen operates on image handles at the PTX level, but falls back to direct replacement of handles with kernel arguments if image handles are not enabled. Note that image handles are explicitly disabled for all target architectures in this change (to be enabled later).

llvm-svn: 205907
2014-04-09 15:39:15 +00:00
Justin Holewinski 9d852a8e08 [NVPTX] Add support for addrspacecast in global variable initializers, including emitting generic() when casting to address space 0.
llvm-svn: 205906
2014-04-09 15:39:11 +00:00
Eli Bendersky bbef172f19 Optimize away unnecessary address casts.
Removes unnecessary casts from non-generic address spaces to the generic address
space for certain code patterns.

Patch by Jingyue Wu.

llvm-svn: 205571
2014-04-03 21:18:25 +00:00
Eli Bendersky 264cd4672d Fix for PR19099 - NVPTX produces invalid symbol names.
This is a more thorough fix for the issue than r203483. An IR pass will run
before NVPTX codegen to make sure there are no invalid symbol names that can't
be consumed by the ptxas assembler.

llvm-svn: 205212
2014-03-31 15:56:26 +00:00
Eli Bendersky a7b6679774 Add test to test/CodeGen/NVPTX for "alloca buffer" arguments.
Make sure such IR gets properly lowered to PTX.

llvm-svn: 204624
2014-03-24 16:52:30 +00:00
Justin Holewinski ba2fa6de4f [NVPTX] Add isel patterns for addrspacecast
llvm-svn: 204600
2014-03-24 11:17:53 +00:00
Eli Bendersky 2281ef91e6 Expose "noduplicate" attribute as a property for intrinsics.
The "noduplicate" function attribute exists to prevent certain optimizations
from duplicating calls to the function. This is important on platforms where
certain function call duplications are unsafe (for example execution barriers
for CUDA and OpenCL).

This patch makes it possible to specify intrinsics as "noduplicate" and
translates that to the appropriate function attribute.

llvm-svn: 204200
2014-03-18 23:51:07 +00:00
Rafael Espindola 2fb5bc33a3 Remove the linker_private and linker_private_weak linkages.
These linkages were introduced some time ago, but it was never very
clear what exactly their semantics were or what they should be used
for. Some investigation found these uses:

* utf-16 strings in clang.
* non-unnamed_addr strings produced by the sanitizers.

It turns out they were just working around a more fundamental problem.
For some sections a MachO linker needs a symbol in order to split the
section into atoms, and llvm had no idea that was the case. I fixed
that in r201700 and it is now safe to use the private linkage. When
the object ends up in a section that requires symbols, llvm will use a
'l' prefix instead of a 'L' prefix and things just work.

With that, these linkages were already dead, but there was a potential
future user in the objc metadata information. I am still looking at
CGObjcMac.cpp, but at this point I am convinced that linker_private
and linker_private_weak are not what they need.

The objc uses are currently split in

* Regular symbols (no '\01' prefix). LLVM already directly provides
whatever semantics they need.
* Uses of a private name (start with "\01L" or "\01l") and private
linkage. We can drop the "\01L" and "\01l" prefixes as soon as llvm
agrees with clang on L being ok or not for a given section. I have two
patches in code review for this.
* Uses of private name and weak linkage.

The last case is the one that one could think would fit one of these
linkages. That is not the case. The semantics are

* the linker will merge these symbol by *name*.
* the linker will hide them in the final DSO.

Given that the merging is done by name, any of the private (or
internal) linkages would be a bad match. They allow llvm to rename the
symbols, and that is really not what we want. From the llvm point of
view, these objects should really be (linkonce|weak)(_odr)?.

For now, just keeping the "\01l" prefix is probably the best for these
symbols. If we one day want to have a more direct support in llvm,
IMHO what we should add is not a linkage, it is just a hidden_symbol
attribute. It would be applicable to multiple linkages. For example,
on weak it would produce the current behavior we have for objc
metadata. On internal, it would be equivalent to private (and we
should then remove private).

llvm-svn: 203866
2014-03-13 23:18:37 +00:00
Eli Bendersky d47a5c2d3f Followup to r203483 - add test.
[forgot to 'svn add' before committing r203483]

llvm-svn: 203485
2014-03-10 20:36:04 +00:00
Gautam Chakrabarti 2c283400f9 [NVPTX] Fix emitting aggregate parameters
The code was missing the case for aggregate parameters and
hence was emitting them as .b0 type. Also fixed a couple
of comments.

llvm-svn: 200325
2014-01-28 18:35:29 +00:00
Justin Holewinski 7706107e6f [NVPTX] Add missing patterns for div.approx with immediate denominator
llvm-svn: 199746
2014-01-21 14:40:05 +00:00
Nico Rieck b5262d6d8f Fix non-deterministic SDNodeOrder-dependent codegen
Reset SelectionDAGBuilder's SDNodeOrder to ensure deterministic code
generation.

llvm-svn: 199050
2014-01-12 14:09:17 +00:00
Justin Holewinski 4459717bab [NVPTX] Fix off-by-one error when creating the VT list for an SDNode
llvm-svn: 196503
2013-12-05 12:58:00 +00:00
Justin Holewinski 3d49e5c655 [NVPTX] Fix handling of indirect calls
Using a special machine node is cleaner than an InlineAsm node, and fixes an assertion failure in InstrEmitter

llvm-svn: 194810
2013-11-15 12:30:04 +00:00
Justin Holewinski 124e93de93 [NVPTX] Properly handle bitcast ConstantExpr when checking for the alignment of function parameters
llvm-svn: 194410
2013-11-11 19:28:19 +00:00
Justin Holewinski 4f5bc9b33a [NVPTX] Fix logic error in loading vector parameters of more than 4 components
llvm-svn: 194409
2013-11-11 19:28:16 +00:00
Justin Holewinski a51418c1ca [NVPTX] Switch from StrongPHIElimination to PHIElimination in NVPTXTargetMachine, and add some missing optimization passes to addOptimizedRegAlloc
Fixes PR17529

llvm-svn: 192445
2013-10-11 12:39:39 +00:00
Justin Holewinski 660597d190 Make AsmPrinter::emitImplicitDef a virtual method so targets can emit custom comments for implicit defs
For NVPTX, this fixes a crash where the emitImplicitDef implementation was expecting physical registers,
while NVPTX uses virtual registers (with a couple of exceptions).  Now, the implicit def comment will be
emitted as a true PTX register name. Other targets can use this to customize the output of implicit def
comments.

Fixes PR17519

llvm-svn: 192444
2013-10-11 12:39:36 +00:00
Justin Holewinski a54daa4640 [NVPTX] Make constant vector test case endian-independent
llvm-svn: 190998
2013-09-19 13:14:44 +00:00
Justin Holewinski 95564bdf5e [NVPTX] Support constant vector globals
llvm-svn: 190997
2013-09-19 12:51:46 +00:00
Justin Holewinski aaa8b6e355 [NVPTX] Re-enable assembly printing support for inline assembly
This support was removed by accident during the MC conversion

llvm-svn: 189160
2013-08-24 01:17:23 +00:00
Daniel Dunbar 9efbedfd35 [tests] Cleanup initialization of test suffixes.
- Instead of setting the suffixes in a bunch of places, just set one master
   list in the top-level config. We now only modify the suffix list in a few
   suites that have one particular unique suffix (.ml, .mc, .yaml, .td, .py).

 - Aside from removing the need for a bunch of lit.local.cfg files, this enables
   4 tests that were inadvertently being skipped (one in
   Transforms/BranchFolding, a .s file each in DebugInfo/AArch64 and
   CodeGen/PowerPC, and one in CodeGen/SI which is now failing and has been
   XFAILED).

 - This commit also fixes a bunch of config files to use config.root instead of
   older copy-pasted code.

llvm-svn: 188513
2013-08-16 00:37:11 +00:00
Justin Holewinski debe686f05 [NVPTX] Add missing patterns for i1 [s,u]int_to_fp
llvm-svn: 187800
2013-08-06 14:13:34 +00:00
Justin Holewinski 871ec93909 [NVPTX] Fix bug in stack code generation causes by MC conversion
We do use a very small set of physical registers, so account for
them in the virtual register encoding between MachineInstr and MC

llvm-svn: 187799
2013-08-06 14:13:31 +00:00
Justin Holewinski a2a63d28df [NVPTX] Start conversion to MC infrastructure
This change converts the NVPTX target to use the MC infrastructure
instead of directly emitting MachineInstr instances. This brings
the target more up-to-date with LLVM TOT, and should fix PR15175
and PR15958 (libNVPTXInstPrinter is empty) as a side-effect.

llvm-svn: 187798
2013-08-06 14:13:27 +00:00
Justin Holewinski d3f2035a3c Add a target legalize hook for SplitVectorOperand (again)
CustomLowerNode was not being called during SplitVectorOperand,
meaning custom legalization could not be used by targets.

This also adds a test case for NVPTX that depends on this custom
legalization.

Differential Revision: http://llvm-reviews.chandlerc.com/D1195

Attempt to fix the buildbots by making the X86 test I just added platform independent

llvm-svn: 187202
2013-07-26 13:28:29 +00:00
Rafael Espindola 1d812728cc Revert "Add a target legalize hook for SplitVectorOperand"
This reverts commit 187198. It broke the bots.

The soft float test probably needs a -triple because of name differences.
On the hard float test I am getting a "roundss $1, %xmm0, %xmm0", instead of
"vroundss $1, %xmm0, %xmm0, %xmm0".

llvm-svn: 187201
2013-07-26 13:18:16 +00:00
Justin Holewinski f848a24e50 Add a target legalize hook for SplitVectorOperand
CustomLowerNode was not being called during SplitVectorOperand,
meaning custom legalization could not be used by targets.

This also adds a test case for NVPTX that depends on this custom
legalization.

Differential Revision: http://llvm-reviews.chandlerc.com/D1195

llvm-svn: 187198
2013-07-26 12:46:39 +00:00
Justin Holewinski cd069e6dec [NVPTX] Use approximate FP ops when unsafe-fp-math is used, and append
.ftz to instructions if the nvptx-f32ftz attribute is set to "true"

llvm-svn: 186820
2013-07-22 12:18:04 +00:00
Stephen Lin f799e3f944 Convert CodeGen/*/*.ll tests to use the new CHECK-LABEL for easier debugging. No functionality change and all tests pass after conversion.
This was done with the following sed invocation to catch label lines demarking function boundaries:
    sed -i '' "s/^;\( *\)\([A-Z0-9_]*\):\( *\)test\([A-Za-z0-9_-]*\):\( *\)$/;\1\2-LABEL:\3test\4:\5/g" test/CodeGen/*/*.ll
which was written conservatively to avoid false positives rather than false negatives. I scanned through all the changes and everything looks correct.

llvm-svn: 186258
2013-07-13 20:38:47 +00:00
Justin Holewinski d2bbdf05e0 [NVPTX] Add support for module-scope inline asm
Since we were explicitly not calling AsmPrinter::doInitialization,
any module-scope inline asm was not being printed.

llvm-svn: 185336
2013-07-01 13:00:14 +00:00