Commit Graph

28 Commits

Author SHA1 Message Date
Bill Schmidt ccbe0a8022 [PPC64LE] Fix wrong IR for vec_sld and vec_vsldoi
My original LE implementation of the vsldoi instruction, with its
altivec.h interfaces vec_sld and vec_vsldoi, produces incorrect
shufflevector operations in the LLVM IR.  Correct code is generated
because the back end handles the incorrect shufflevector in a
consistent manner.

This patch and a companion patch for LLVM correct this problem by
removing the fixup from altivec.h and the corresponding fixup from the
PowerPC back end.  Several test cases are also modified to reflect the
now-correct LLVM IR.

The vec_sums and vec_vsumsws interfaces in altivec.h are also fixed,
because they used vec_perm calls intended to be recognized as vsldoi
instructions.  These vec_perm calls are now replaced with code that
more clearly shows the intent of the transformation.

llvm-svn: 214801
2014-08-04 23:21:26 +00:00
Joerg Sonnenberger 466a31eb65 vcfsx and dss instructions require immediates, variables are not valid.
llvm-svn: 214635
2014-08-02 15:07:21 +00:00
Bill Schmidt 56a6967000 [PPC64LE] Fix vec_sld and vec_vsldoi for little endian
The vec_sld and vec_vsldoi interfaces perform a left-shift on vector
arguments for both big and little endian.  However, because they rely
on the vec_perm interface which is endian-dependent, the permutation
vector needs to be reversed for LE to get the proper shift direction.

I've added some extra testing for these interfaces for LE in the
builtins-ppc-altivec.c.

llvm-svn: 210657
2014-06-11 15:48:46 +00:00
Bill Schmidt 7f6596bb13 [PPC64LE] Implement little-endian semantics for vec_sums
The PowerPC vsumsws instruction, accessed via vec_sums, is defined
architecturally with a big-endian bias, in that the second input vector
and the result always reference big-endian element 3 (little-endian
element 0).  For ease of porting, the programmer wants elements 3 in
both cases.

To provide this semantics, for little endian we generate a permute for
the second input vector prior to the vsumsws instruction, and generate
a permute for the result vector following the vsumsws instruction.

The correctness of this code is tested by the new sums.c test added in
a previous patch, as well as the modifications to
builtins-ppc-altivec.c in the present patch.

llvm-svn: 210449
2014-06-09 03:31:47 +00:00
Bill Schmidt d7c53a91df [PPC64LE] Implement little-endian semantics for vec_unpack[hl]
The PowerPC vector-unpack-high and vector-unpack-low instructions
are defined architecturally with a big-endian bias, in that the vector
element numbering is assumed to be "left to right" regardless of
whether the processor is in big-endian or little-endian mode.  This
effectively reverses the meaning of "high" and "low."  Such a
definition is unnatural for little-endian code generation.

To facilitate ease of porting, the vec_unpackh and vec_unpackl
interfaces are designed to use natural element ordering, so that
elements are numbered according to little-endian design principles
when code is generated for a little-endian target.  The desired
semantics can be achieved by using the opposite instruction for
little-endian mode.  That is, when a call to vec_unpackh appears in
the code, a vector-unpack-low is generated, and when a call to
vec_unpackl appears in the code, a vector-unpack-high is generated.

The correctness of this code is tested by the new unpack.c test
added in a previous patch, as well as the modifications to
builtins-ppc-altivec.c in the present patch.

Note that these interfaces were originally incorrectly implemented
when they take a vector pixel argument.  This patch corrects this
implementation for both big- and little-endian code generation.

llvm-svn: 210391
2014-06-07 02:20:52 +00:00
Bill Schmidt 86f673a005 [PPC64LE] Update test for vec_sum2s interface
Commit r210384 prematurely included changes to the little-endian
implementation of the vec_sum2s interface.  This patch modifies
test/CodeGen/builtins-ppc-altivec.c to test those changes.

llvm-svn: 210389
2014-06-07 01:47:42 +00:00
Bill Schmidt 7f0a5c5141 [PPC64LE] Update builtins-ppc-altivec.c for PPC64 and PPC64LE
The Altivec builtin test case test/CodeGen/builtins-ppc-altivec.c has
always been executed only for 32-bit PowerPC.  These tests are equally
valid for 64-bit PowerPC.  This patch updates the test to be run for
three targets:  powerpc-unknown-unknown, powerpc64-unknown-unknown,
and powerpc64le-unknown-unknown.  The expected code generation changes
for some of the Altivec builtins for little endian, so this patch adds
new CHECK-LE variants to the test for the powerpc64le target.

These tests satisfy the testing requirements for some previous patches
committed over the last couple of days for lib/Headers/altivec.h:
r210279 for vec_perm, r210337 for vec_mul[eo], and r210340 for
vec_pack.

llvm-svn: 210384
2014-06-06 23:12:00 +00:00
NAKAMURA Takumi 7cbe30fc43 clang/test: REQUIRES: s/ppc{32|64}-registered-target/powerpc-registered-target/
llvm-svn: 196349
2013-12-04 03:41:15 +00:00
Stephen Lin 4362261b00 CHECK-LABEL-ify some code gen tests to improve diagnostic experience when tests fail.
llvm-svn: 188447
2013-08-15 06:47:53 +00:00
Anton Yartsev a3c9ba364e PR15480: fixed second parameter types of vec_lde, vec_lvebx, vec_lvehx, and vec_lvewx according to AltiVec Programming Interface Manual
llvm-svn: 176789
2013-03-10 16:25:43 +00:00
Jim Grosbach 2987c57924 Tests: check for target availability for target-specific tests.
Lots of tests are using an explicit target triple w/o first checking that the
target is actually available. Add a REQUIRES clause to a bunch of them. This should
hopefully unbreak bots which don't configure w/ all targets enabled.

llvm-svn: 159949
2012-07-09 18:34:21 +00:00
Eli Friedman 409943efcb Don't emit nsw flags for vector operations; there's basically no benefit, and a lot of downside (like PR9850, which is about clang's xmmintrin.h making an unexpected transformation on an expression involving _mm_add_epi32).
llvm-svn: 131000
2011-05-06 18:04:18 +00:00
Anton Yartsev 28ccef788b supported: AltiVec vector initialization with a single literal according to PIM section 2.5.1 - after initialization all elements have the value specified by the literal
llvm-svn: 128375
2011-03-27 09:32:40 +00:00
Anton Yartsev 85129b8a86 pre/post ++/-- for AltiVec vectors. (with builtins-ppc-altivec.c failure fixed)
llvm-svn: 125000
2011-02-07 02:17:30 +00:00
Eric Christopher 23ec82fa47 Revert r124146 for now. It appears to be failing on a few platforms.
llvm-svn: 124153
2011-01-24 23:07:03 +00:00
Anton Yartsev 3bad9afaba pre/post increase/decrease for AltiVec vectors
llvm-svn: 124146
2011-01-24 20:55:22 +00:00
Anton Yartsev 3f8f2886c1 comparison of AltiVec vectors now gives bool result (fix for 7533)
llvm-svn: 119678
2010-11-18 03:19:30 +00:00
Anton Yartsev 73d4023114 support for AltiVec extensions from the Cell architecture
llvm-svn: 116478
2010-10-14 14:37:46 +00:00
Anton Yartsev 583a1cf7b5 support for predicates with bool/pixel arguments
llvm-svn: 111515
2010-08-19 11:57:49 +00:00
Anton Yartsev fc83c60755 support for the rest of AltiVec functions with bool/pixel arguments and return values (except predicates)
llvm-svn: 111511
2010-08-19 03:21:36 +00:00
Anton Yartsev 9e96898032 support for vec_perm and all dependent functions (vec_mergeh, vec_mergel, vec_pack, vec_sld, vec_splat) with bool/pixel arguments and return values
llvm-svn: 111509
2010-08-19 03:00:09 +00:00
Anton Yartsev 2cc136d4e3 support for vec_add, vec_adds, vec_and, vec_andc with bool arguments
llvm-svn: 111141
2010-08-16 16:22:12 +00:00
Chris Lattner 3fcc790cd8 Change IR generation for return (in the simple case) to avoid doing silly
load/store nonsense in the epilog.  For example, for:

int foo(int X) {
  int A[100];
  return A[X];
}

we used to generate:

  %arrayidx = getelementptr inbounds [100 x i32]* %A, i32 0, i64 %idxprom ; <i32*> [#uses=1]
  %tmp1 = load i32* %arrayidx                     ; <i32> [#uses=1]
  store i32 %tmp1, i32* %retval
  %0 = load i32* %retval                          ; <i32> [#uses=1]
  ret i32 %0
}

which codegen'd to this code:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	subq	$408, %rsp              ## imm = 0x198
	movl	%edi, 400(%rsp)
	movl	400(%rsp), %edi
	movslq	%edi, %rax
	movl	(%rsp,%rax,4), %edi
	movl	%edi, 404(%rsp)
	movl	404(%rsp), %eax
	addq	$408, %rsp              ## imm = 0x198
	ret

Now we generate:

  %arrayidx = getelementptr inbounds [100 x i32]* %A, i32 0, i64 %idxprom ; <i32*> [#uses=1]
  %tmp1 = load i32* %arrayidx                     ; <i32> [#uses=1]
  ret i32 %tmp1
}

and:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	subq	$408, %rsp              ## imm = 0x198
	movl	%edi, 404(%rsp)
	movl	404(%rsp), %edi
	movslq	%edi, %rax
	movl	(%rsp,%rax,4), %eax
	addq	$408, %rsp              ## imm = 0x198
	ret

This actually does matter, cutting out 2000 lines of IR from CGStmt.ll 
for example.

Another interesting effect is that altivec.h functions which are dead
now get dce'd by the inliner.  Hence all the changes to 
builtins-ppc-altivec.c to ensure the calls aren't dead.

llvm-svn: 106970
2010-06-27 01:06:27 +00:00
Chris Lattner 51924e517b Implement support for -fwrapv, rdar://7221421
As part of this, pull together trapv handling into the same enum.

This also add support for NSW multiplies.

This also makes PCH disagreement on overflow behavior silent, since it
really doesn't matter except for warnings and codegen (no macros get 
defined etc).

llvm-svn: 106956
2010-06-26 21:25:03 +00:00
Chris Lattner 217e056e40 implement rdar://7432000 - signed negate should codegen as NSW.
While I'm in there, adjust pointer to member adjustments as well.

llvm-svn: 106955
2010-06-26 20:27:24 +00:00
Anton Korobeynikov cc50b7d7d5 More AltiVec support.
Patch by Anton Yartsev!

llvm-svn: 106387
2010-06-19 09:47:18 +00:00
Chris Lattner dd6697b4fa improve altivec c++ support by adding casts, patch by
Anton Yartsev!

llvm-svn: 101281
2010-04-14 20:35:39 +00:00
Chris Lattner dad4062b4d implement altivec.h and a bunch of support code, patch by Anton Yartsev!
llvm-svn: 101215
2010-04-14 03:54:58 +00:00