Commit Graph

23797 Commits

Author SHA1 Message Date
Chris Lattner 78c788b450 Align vectors to the size in bytes, not bits.
llvm-svn: 27376
2006-04-03 19:28:50 +00:00
Chris Lattner e1e3adf802 Add a missing check, this fixes UnitTests/Vector/sumarray.c
llvm-svn: 27375
2006-04-03 17:29:28 +00:00
Chris Lattner 04c00fc844 Add a missing check, which broke a bunch of vector tests.
llvm-svn: 27374
2006-04-03 17:21:50 +00:00
Chris Lattner d97f038972 shrinkify intrinsics more by using some local classes
llvm-svn: 27373
2006-04-03 17:20:06 +00:00
Chris Lattner 9ccd61c893 Add the full set of min/max instructions
llvm-svn: 27372
2006-04-03 15:58:28 +00:00
Chris Lattner 36a519b081 Add some classes to make it easier to define intrinsics. Add min/max intrinsics.
llvm-svn: 27371
2006-04-03 15:43:07 +00:00
Andrew Lenharth df7abf8b74 support x * (c1 + c2) where c1 and c2 are pow2s. special case for c2 == 4
llvm-svn: 27370
2006-04-03 04:19:17 +00:00
Andrew Lenharth 2636d2ac89 test powers of 2
llvm-svn: 27369
2006-04-03 04:14:39 +00:00
Andrew Lenharth 4e2c073a33 mul by const conversion sequences. more coming soon
llvm-svn: 27368
2006-04-03 03:18:59 +00:00
Andrew Lenharth 94f012f606 back this out
llvm-svn: 27367
2006-04-03 03:16:50 +00:00
Andrew Lenharth 0288ba764a test some more mul by constant removal
llvm-svn: 27366
2006-04-03 03:16:09 +00:00
Andrew Lenharth 04a8429572 Make sure mul by constant 5 is turned into a s4addq
llvm-svn: 27365
2006-04-02 21:47:07 +00:00
Andrew Lenharth 015eaf5f33 This should be a win of every arch
llvm-svn: 27364
2006-04-02 21:42:45 +00:00
Andrew Lenharth 444bdb069a This makes McCat/12-IOtest go 8x faster or so
llvm-svn: 27363
2006-04-02 21:08:39 +00:00
Andrew Lenharth 01bd5523a3 This will be needed soon
llvm-svn: 27362
2006-04-02 20:13:57 +00:00
Reid Spencer 11dd4b9d9c For PR722:
Change the check for llvm-gcc from using LLVMGCCDIR to LLVMGCC. This checks
for the actual tool rather than the directory in which the tool resides. In
the case of this bug, it is possible that the directory exists but that the
tools in that directory do not. This fix should avoid the makefile from
erroneously proceeding without the actual tools being available.

llvm-svn: 27361
2006-04-02 14:34:26 +00:00
Chris Lattner acf1fc8a28 add a note
llvm-svn: 27360
2006-04-02 07:20:00 +00:00
Chris Lattner c5287c0ece Inform the dag combiner that the predicate compares only return a low bit.
llvm-svn: 27359
2006-04-02 06:26:07 +00:00
Chris Lattner 6c1321ca3f relax assertion
llvm-svn: 27358
2006-04-02 06:19:46 +00:00
Chris Lattner e6025525fb Allow targets to compute masked bits for intrinsics.
llvm-svn: 27357
2006-04-02 06:15:09 +00:00
Chris Lattner 4993249a04 Add a little dag combine to compile this:
int %AreSecondAndThirdElementsBothNegative(<4 x float>* %in) {
entry:
        %tmp1 = load <4 x float>* %in           ; <<4 x float>> [#uses=1]
        %tmp = tail call int %llvm.ppc.altivec.vcmpgefp.p( int 1, <4 x float> < float 0x7FF8000000000000, float 0.000000e+00, float 0.000000e+00, float 0x7FF8000000000000 >, <4 x float> %tmp1 )           ; <int> [#uses=1]
        %tmp = seteq int %tmp, 0                ; <bool> [#uses=1]
        %tmp3 = cast bool %tmp to int           ; <int> [#uses=1]
        ret int %tmp3
}

into this:

_AreSecondAndThirdElementsBothNegative:
        mfspr r2, 256
        oris r4, r2, 49152
        mtspr 256, r4
        li r4, lo16(LCPI1_0)
        lis r5, ha16(LCPI1_0)
        lvx v0, 0, r3
        lvx v1, r5, r4
        vcmpgefp. v0, v1, v0
        mfcr r3, 2
        rlwinm r3, r3, 27, 31, 31
        mtspr 256, r2
        blr

instead of this:

_AreSecondAndThirdElementsBothNegative:
        mfspr r2, 256
        oris r4, r2, 49152
        mtspr 256, r4
        li r4, lo16(LCPI1_0)
        lis r5, ha16(LCPI1_0)
        lvx v0, 0, r3
        lvx v1, r5, r4
        vcmpgefp. v0, v1, v0
        mfcr r3, 2
        rlwinm r3, r3, 27, 31, 31
        xori r3, r3, 1
        cntlzw r3, r3
        srwi r3, r3, 5
        mtspr 256, r2
        blr

llvm-svn: 27356
2006-04-02 06:11:11 +00:00
Chris Lattner caba72b6ff vector casts of casts are eliminable. Transform this:
%tmp = cast <4 x uint> %tmp to <4 x int>                ; <<4 x int>> [#uses=1]
        %tmp = cast <4 x int> %tmp to <4 x float>               ; <<4 x float>> [#uses=1]

into:

        %tmp = cast <4 x uint> %tmp to <4 x float>              ; <<4 x float>> [#uses=1]

llvm-svn: 27355
2006-04-02 05:43:13 +00:00
Chris Lattner 7ee10dec05 vector casts never reinterpret bits
llvm-svn: 27354
2006-04-02 05:40:28 +00:00
Chris Lattner ebca476b27 Allow transforming this:
%tmp = cast <4 x uint>* %testData to <4 x int>*         ; <<4 x int>*> [#uses=1]
        %tmp = load <4 x int>* %tmp             ; <<4 x int>> [#uses=1]

to this:

        %tmp = load <4 x uint>* %testData               ; <<4 x uint>> [#uses=1]
        %tmp = cast <4 x uint> %tmp to <4 x int>                ; <<4 x int>> [#uses=1]

llvm-svn: 27353
2006-04-02 05:37:12 +00:00
Chris Lattner f42d0aeda1 Turn altivec lvx/stvx intrinsics into loads and stores. This allows the
elimination of one load from this:

int AreSecondAndThirdElementsBothNegative( vector float *in ) {
#define QNaN 0x7FC00000
const vector unsigned int testData = (vector unsigned int)( QNaN, 0, 0, QNaN );
vector float test = vec_ld( 0, (float*) &testData );
return ! vec_any_ge( test, *in );
}

Now generating:

_AreSecondAndThirdElementsBothNegative:
        mfspr r2, 256
        oris r4, r2, 49152
        mtspr 256, r4
        li r4, lo16(LCPI1_0)
        lis r5, ha16(LCPI1_0)
        addi r6, r1, -16
        lvx v0, r5, r4
        stvx v0, 0, r6
        lvx v1, 0, r3
        vcmpgefp. v0, v0, v1
        mfcr r3, 2
        rlwinm r3, r3, 27, 31, 31
        xori r3, r3, 1
        cntlzw r3, r3
        srwi r3, r3, 5
        mtspr 256, r2
        blr

llvm-svn: 27352
2006-04-02 05:30:25 +00:00
Chris Lattner 80fdc1eb6b Remove done item
llvm-svn: 27351
2006-04-02 05:28:54 +00:00
Jeff Cohen 2379db02bf Fix tablegen related dependencies in Visual Studio.
llvm-svn: 27350
2006-04-02 05:20:53 +00:00
Chris Lattner 42a5fca47e Implement promotion for EXTRACT_VECTOR_ELT, allowing v16i8 multiplies to work with PowerPC.
llvm-svn: 27349
2006-04-02 05:06:04 +00:00
Chris Lattner b80f114707 add a note
llvm-svn: 27348
2006-04-02 03:59:11 +00:00
Chris Lattner 87f080949b Implement the Expand action for binary vector operations to break the binop
into elements and operate on each piece.  This allows generic vector integer
multiplies to work on PPC, though the generated code is horrible.

llvm-svn: 27347
2006-04-02 03:57:31 +00:00
Chris Lattner a9c59156be Intrinsics that just load from memory can be treated like loads: they don't
have to serialize against each other.  This allows us to schedule lvx's
across each other, for example.

llvm-svn: 27346
2006-04-02 03:41:14 +00:00
Chris Lattner 89df307b61 Adjust the Intrinsics.gen interface a little bit
llvm-svn: 27345
2006-04-02 03:35:30 +00:00
Chris Lattner 70ec96fa32 Adjust to change in Intrinsics.gen interface.
llvm-svn: 27344
2006-04-02 03:35:01 +00:00
Chris Lattner 0442a18758 Constant fold all of the vector binops. This allows us to compile this:
"vector unsigned char mergeLowHigh = (vector unsigned char)
( 8, 9, 10, 11, 16, 17, 18, 19, 12, 13, 14, 15, 20, 21, 22, 23 );
vector unsigned char mergeHighLow = vec_xor( mergeLowHigh, vec_splat_u8(8));"

aka:

void %test2(<16 x sbyte>* %P) {
  store <16 x sbyte> cast (<4 x int> xor (<4 x int> cast (<16 x ubyte> < ubyte 8, ubyte 9, ubyte 10, ubyte 11, ubyte 16, ubyte 17, ubyte 18, ubyte 19, ubyte 12, ubyte 13, ubyte 14, ubyte 15, ubyte 20, ubyte 21, ubyte 22, ubyte 23 > to <4 x int>), <4 x int> cast (<16 x sbyte> < sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8 > to <4 x int>)) to <16 x sbyte>), <16 x sbyte> * %P
  ret void
}

into this:

_test2:
        mfspr r2, 256
        oris r4, r2, 32768
        mtspr 256, r4
        li r4, lo16(LCPI2_0)
        lis r5, ha16(LCPI2_0)
        lvx v0, r5, r4
        stvx v0, 0, r3
        mtspr 256, r2
        blr

instead of this:

_test2:
        mfspr r2, 256
        oris r4, r2, 49152
        mtspr 256, r4
        li r4, lo16(LCPI2_0)
        lis r5, ha16(LCPI2_0)
        vspltisb v0, 8
        lvx v1, r5, r4
        vxor v0, v1, v0
        stvx v0, 0, r3
        mtspr 256, r2
        blr

... which occurs here:
http://developer.apple.com/hardware/ve/calcspeed.html

llvm-svn: 27343
2006-04-02 03:25:57 +00:00
Chris Lattner ef598059f2 Add a new -view-legalize-dags command line option
llvm-svn: 27342
2006-04-02 03:07:27 +00:00
Chris Lattner e4e64b6b85 Implement constant folding of bit_convert of arbitrary constant vbuild_vector nodes.
llvm-svn: 27341
2006-04-02 02:53:43 +00:00
Chris Lattner 1c22728787 These entries already exist
llvm-svn: 27340
2006-04-02 02:51:27 +00:00
Chris Lattner 1985e1cbb8 Add some missing node names
llvm-svn: 27339
2006-04-02 02:41:18 +00:00
Chris Lattner 6e3b55792b simplify this method
llvm-svn: 27338
2006-04-02 02:28:52 +00:00
Chris Lattner 7a29cf3c7f New note
llvm-svn: 27337
2006-04-02 01:47:20 +00:00
Chris Lattner 6b3f475d23 Constant fold casts from things like <4 x int> -> <4 x uint>, likewise int<->fp.
llvm-svn: 27336
2006-04-02 01:38:28 +00:00
Chris Lattner 9b2d6e7886 Custom lower all BUILD_VECTOR's so that we can compile vec_splat_u8(8) into
"vspltisb v0, 8" instead of a constant pool load.

llvm-svn: 27335
2006-04-02 00:43:36 +00:00
Chris Lattner bec582f4cd Prefer larger register classes over smaller ones when a register occurs in
multiple register classes.  This fixes PowerPC/2006-04-01-FloatDoubleExtend.ll

llvm-svn: 27334
2006-04-02 00:24:45 +00:00
Chris Lattner a13540b896 New testcase that crashes the compiler.
llvm-svn: 27333
2006-04-02 00:23:59 +00:00
Chris Lattner 1b2436a624 add valuemapper support for inline asm
llvm-svn: 27332
2006-04-01 23:17:11 +00:00
Chris Lattner dc72c17798 Implement vnot using VNOR instead of using 'vspltisb v0, -1' and vxor
llvm-svn: 27331
2006-04-01 22:41:47 +00:00
Chris Lattner 6cf4914fd4 Fix InstCombine/2006-04-01-InfLoop.ll
llvm-svn: 27330
2006-04-01 22:05:01 +00:00
Chris Lattner 11739f7589 New testcase that caused instcombine to infinitely loop (with my recent
patch), distilled from Applications/JM/ldecod

llvm-svn: 27329
2006-04-01 22:04:40 +00:00
Chris Lattner dcd0792622 Fold A^(B&A) -> (B&A)^A
Fold (B&A)^A == ~B & A

This implements InstCombine/xor.ll:test2[56]

llvm-svn: 27328
2006-04-01 08:03:55 +00:00
Chris Lattner 2b11adcf5c new testcases
llvm-svn: 27327
2006-04-01 08:02:51 +00:00