Commit Graph

5487 Commits

Author SHA1 Message Date
Chris Lattner 13d5f3eb05 Revert Nate's CR patch from last night, which caused many regressions (e.g. fhourstones).
Loading and storing off R0 isn't what we wanted.  Also, taking some CR's out of
CRRC seems to cause failures as well.  Further investigation is required.

llvm-svn: 28097
2006-05-04 16:56:45 +00:00
Jeff Cohen 06041abeb6 Make external globals public; other minor cleanup.
llvm-svn: 28096
2006-05-04 16:20:22 +00:00
Jeff Cohen f812a4fa75 Make Intel syntax the default when LLVM is built with VC++.
llvm-svn: 28095
2006-05-04 16:19:27 +00:00
Chris Lattner ee64b6b40f Remove a bunch more dead V9 specific stuff
llvm-svn: 28094
2006-05-04 01:26:39 +00:00
Chris Lattner 940cc978ef Remove a bunch more SparcV9 specific stuff
llvm-svn: 28093
2006-05-04 01:15:02 +00:00
Chris Lattner 6e663f1c1e Remove some more V9-specific stuff.
llvm-svn: 28092
2006-05-04 00:49:59 +00:00
Chris Lattner 9f6639b64d Remove some more unused stuff from MachineInstr that was leftover from V9.
llvm-svn: 28091
2006-05-04 00:44:25 +00:00
Chris Lattner 2aef59f123 Simplify handling of relocations
llvm-svn: 28090
2006-05-04 00:42:08 +00:00
Evan Cheng 8b1cde2bbe Use movsd to shuffle in the lowest two elements of a v4f32 / v4i32 vector when
movlps cannot be used (e.g. when load from m64 has multiple uses).

llvm-svn: 28089
2006-05-03 20:32:03 +00:00
Chris Lattner e3a9c70ba0 Change from using MachineRelocation ctors to using static methods
in MachineRelocation to create Relocations.

llvm-svn: 28088
2006-05-03 20:30:20 +00:00
Chris Lattner 9e68942d78 inline a simple method
llvm-svn: 28083
2006-05-03 17:21:32 +00:00
Chris Lattner 1d8ee1fc80 Suck block address tracking out of targets into the JIT Emitter. This
simplifies the MachineCodeEmitter interface just a little bit and makes
BasicBlocks work like constant pools and jump tables.

llvm-svn: 28082
2006-05-03 17:10:41 +00:00
Chris Lattner 9954bc9c19 Fix a bug in Owen's checkin that broke the CBE on all non sparc v9 platforms.
llvm-svn: 28081
2006-05-03 05:48:41 +00:00
Nate Begeman 43b1ed7e3d Teach the x86 jit how to handle jump tables not directly used by a jump
instruction.

llvm-svn: 28080
2006-05-03 04:52:47 +00:00
Owen Anderson 20a631fde7 Refactor TargetMachine, pushing handling of TargetData into the target-specific subclasses. This has one caller-visible change: getTargetData() now returns a pointer instead of a reference.
This fixes PR 759.

llvm-svn: 28074
2006-05-03 01:29:57 +00:00
Chris Lattner d8b192ba3b Change the BasicBlockAddrs map to be a vector, indexed by MBB number.
llvm-svn: 28069
2006-05-03 00:32:55 +00:00
Chris Lattner 0267807ddc Keep the alpha JIT similar to the PPC/X86 jits
llvm-svn: 28068
2006-05-03 00:31:21 +00:00
Chris Lattner b8065a9a3a Several related changes:
1. Change several methods in the MachineCodeEmitter class to be pure virtual.
2. Suck emitConstantPool/initJumpTableInfo into startFunction, removing them
   from the MachineCodeEmitter interface, and reducing the amount of target-
   specific code.
3. Change the JITEmitter so that it allocates constantpools and jump tables
   *right* next to the functions that they belong to, instead of in a separate
   pool of memory.  This makes all memory for a function be contiguous, and
   means the JITEmitter only tracks one block of memory now.

llvm-svn: 28065
2006-05-02 23:22:24 +00:00
Nate Begeman 233391f5f5 Remove some stuff from the README
llvm-svn: 28063
2006-05-02 22:43:31 +00:00
Chris Lattner e1c96369e2 Fix a purely hypothetical problem (for now): emitWord emits in the host
byte format.  This doesn't work when using the code emitter in a cross target
environment.  Since the code emitter is only really used by the JIT, this
isn't a current problem, but if we ever start emitting .o files, it would be.

llvm-svn: 28060
2006-05-02 19:14:47 +00:00
Chris Lattner c9aa3715e8 Refactor the machine code emitter interface to pull the pointers for the current
code emission location into the base class, instead of being in the derived classes.

This change means that low-level methods like emitByte/emitWord now are no longer
virtual (yaay for speed), and we now have a framework to support growable code
segments.  This implements feature request #1 of PR469.

llvm-svn: 28059
2006-05-02 18:27:26 +00:00
Nate Begeman bbcbf48aab Since we don't handle callee-save CRs right yet, don't allocate them. Also
don't step on R11 in the middle of a function when saving and restoring CRs

llvm-svn: 28058
2006-05-02 17:37:31 +00:00
Nate Begeman 287dc5be0d Hooray, everyone now uses the same printBasicBlockLabel implementation
llvm-svn: 28056
2006-05-02 17:34:51 +00:00
Chris Lattner 5bc9c583e3 There is no reason to use a virtual method to store this word.
llvm-svn: 28053
2006-05-02 17:16:20 +00:00
Nate Begeman b9d4f8324d Extend printBasicBlockLabel a bit so that it can be used to print all
basic block labels, consolidating the code to do so in one place for each
target.

llvm-svn: 28050
2006-05-02 05:37:32 +00:00
Nate Begeman 01364fbba8 Update the PPC compilation callback code to not need weird abi-violating
prologs and epilogs, keep all the asm in one place, and remove use of
compiler builtin functions.

llvm-svn: 28049
2006-05-02 04:50:05 +00:00
Jeff Cohen 470f431f44 De-virtualize SwitchSection.
llvm-svn: 28047
2006-05-02 03:58:45 +00:00
Jeff Cohen f34ddb1e0d De-virtualize EmitZeroes.
llvm-svn: 28046
2006-05-02 03:46:13 +00:00
Jeff Cohen bfe9ffb449 Finish support for Microsoft ML/MASM. May still be a few rough edges.
llvm-svn: 28045
2006-05-02 03:11:50 +00:00
Jeff Cohen 24a62a9bc1 Make Intel syntax mode friendlier to Microsoft ML assembler (still needs more work).
llvm-svn: 28044
2006-05-02 01:16:28 +00:00
Chris Lattner 85e9909755 Put PHI/INLINEASM into the correct namespace.
llvm-svn: 28037
2006-05-01 17:00:49 +00:00
Chris Lattner 563f0417d2 Remove %'s from register names when in intel mode.
llvm-svn: 28027
2006-05-01 05:53:50 +00:00
Jeff Cohen 71c2e0f262 Mingw32 patches supplied by Anton Korobeynikov.
llvm-svn: 28023
2006-04-29 18:41:44 +00:00
Evan Cheng d369603df9 I can't spell: Register, not Regsiter.
llvm-svn: 28021
2006-04-28 23:19:39 +00:00
Evan Cheng b244b80172 Implemented x86 inline asm b, h, w, k modifiers.
llvm-svn: 28020
2006-04-28 23:11:40 +00:00
Chris Lattner 84b49d51be Fix CodeGen/Generic/2006-04-28-Sign-extend-bool.ll
llvm-svn: 28017
2006-04-28 21:56:10 +00:00
Evan Cheng 88decded82 Initial caller side support (for CCC only, not FastCC) of 128-bit vector
passing by value.

llvm-svn: 28015
2006-04-28 21:29:37 +00:00
Evan Cheng 68a44dc445 Bare-bone X86 inline asm printer support.
llvm-svn: 28014
2006-04-28 21:19:05 +00:00
Evan Cheng 3cd4362ade Implement four-wide shuffle with 2 shufps if no more than two elements come
from each vector. e.g.
        shuffle(G1, G2, 7, 1, 5, 2)
==>
        movaps _G2, %xmm0
        shufps $151, _G1, %xmm0
        shufps $216, %xmm0, %xmm0

llvm-svn: 28011
2006-04-28 07:03:38 +00:00
Evan Cheng d43c5c6046 TargetLowering::LowerArguments should return a VBIT_CONVERT of
FORMAL_ARGUMENTS SDOperand in the return result vector.

llvm-svn: 28009
2006-04-28 05:25:15 +00:00
Evan Cheng f0157cb0bc Use movaps instead of movapd for spill / restore.
llvm-svn: 28005
2006-04-28 02:23:35 +00:00
Chris Lattner a4c2c4a276 Add a note
llvm-svn: 27999
2006-04-28 00:04:05 +00:00
Chris Lattner b209131b56 Add a note
llvm-svn: 27998
2006-04-27 21:40:57 +00:00
Evan Cheng f4f3f0d25f Make x86 isel lowering produce tailcall nodes. They are match to normal calls
for now.

Patch contributed by Alexander Friedman.

llvm-svn: 27994
2006-04-27 08:40:39 +00:00
Evan Cheng ec04a37edd A couple of new entries.
llvm-svn: 27993
2006-04-27 08:31:33 +00:00
Evan Cheng 89001ad729 Support for passing 128-bit vector arguments via XMM registers.
llvm-svn: 27992
2006-04-27 08:31:10 +00:00
Evan Cheng a0374e1bed Oops
llvm-svn: 27989
2006-04-27 05:44:50 +00:00
Evan Cheng 24eb3f4765 Bug fix: not updating NumIntRegs.
llvm-svn: 27988
2006-04-27 05:35:28 +00:00
Evan Cheng 48940d16b2 - Clean up formal argument lowering code. Prepare for vector pass by value work.
- Fixed vararg support.

llvm-svn: 27985
2006-04-27 01:32:22 +00:00
Evan Cheng 1c39903297 Fix fastcc failures.
llvm-svn: 27980
2006-04-26 18:21:31 +00:00
Evan Cheng e0bcfbe811 Switching over FORMAL_ARGUMENTS mechanism to lower call arguments.
llvm-svn: 27975
2006-04-26 01:20:17 +00:00
Nate Begeman 4530327c04 Keep the stack from on darwin 16-byte aligned. This fixes many JIT
failres.

llvm-svn: 27973
2006-04-25 20:54:26 +00:00
Evan Cheng a9467aab0a Separate LowerOperation() into multiple functions, one per opcode.
llvm-svn: 27972
2006-04-25 20:13:52 +00:00
Evan Cheng 4cc3e0b05f Fix a typo.
llvm-svn: 27968
2006-04-25 17:48:41 +00:00
Nate Begeman 318bb96f9e No functionality changes, but cleaner code with correct comments.
llvm-svn: 27966
2006-04-25 04:45:59 +00:00
Evan Cheng fb46b2bf5d Explicitly specify result type for def : Pat<> patterns (if it produces a vector
result). Otherwise tblgen will pick the default (v16i8 for 128-bit vector).

llvm-svn: 27965
2006-04-25 00:50:01 +00:00
Evan Cheng 25b09295f8 Added X86 SSE2 intrinsics which can be represented as vector_shuffles. This is
a temporary workaround for the 2-wide vector_shuffle problem (i.e. its mask
would have type v2i32 which is not legal).

llvm-svn: 27964
2006-04-24 23:34:56 +00:00
Evan Cheng d03631ee76 Add a new entry.
llvm-svn: 27963
2006-04-24 23:30:10 +00:00
Evan Cheng 5c2bfb069e Special case handling two wide build_vector(0, x).
llvm-svn: 27961
2006-04-24 22:58:52 +00:00
Evan Cheng 63bd4d3730 Some missing movlps, movhps, movlpd, and movhpd patterns.
llvm-svn: 27960
2006-04-24 21:58:20 +00:00
Evan Cheng b0461080e4 A little bit more build_vector enhancement for v8i16 cases.
llvm-svn: 27959
2006-04-24 18:01:45 +00:00
Evan Cheng 2f9b0bcbd5 Remove a completed entry.
llvm-svn: 27958
2006-04-24 17:38:16 +00:00
Evan Cheng ab0ee6340c MakeMIInst() should handle jump table index operands.
llvm-svn: 27955
2006-04-24 05:37:35 +00:00
Chris Lattner f110527a29 Add a note
llvm-svn: 27954
2006-04-23 19:47:09 +00:00
Evan Cheng b4f31dd1a8 MOVL shuffle (i.e. movd or movss / movsd from memory) of undef, V2 == V2
llvm-svn: 27953
2006-04-23 06:35:19 +00:00
Nate Begeman 9f0b13c885 Optimized stores to the constant pool, while cool, are unnecessary.
llvm-svn: 27948
2006-04-22 22:31:45 +00:00
Nate Begeman 4ca2ea5b43 JumpTable support! What this represents is working asm and jit support for
x86 and ppc for 100% dense switch statements when relocations are non-PIC.
This support will be extended and enhanced in the coming days to support
PIC, and less dense forms of jump tables.

llvm-svn: 27947
2006-04-22 18:53:45 +00:00
Evan Cheng e728efdfce Don't do all the lowering stuff for 2-wide build_vector's. Also, minor optimization for shuffle of undef.
llvm-svn: 27946
2006-04-22 08:34:05 +00:00
Evan Cheng 16ef94f4e8 Fix a performance regression. Use {p}shuf* when there are only two distinct elements in a build_vector.
llvm-svn: 27945
2006-04-22 06:21:46 +00:00
Chris Lattner c8afdfec52 Teach the JIT how to relocate LI, this fixes the JIT on Prolangs-C/TimberWolfMC
llvm-svn: 27943
2006-04-22 06:17:56 +00:00
Evan Cheng 14215c36b6 Revamp build_vector lowering to take advantage of movss and movd instructions.
movd always clear the top 96 bits and movss does so when it's loading the
value from memory.
The net result is codegen for 4-wide shuffles is much improved. It is near
optimal if one or more elements is a zero. e.g.

__m128i test(int a, int b) {
  return _mm_set_epi32(0, 0, b, a);
}

compiles to

_test:
	movd 8(%esp), %xmm1
	movd 4(%esp), %xmm0
	punpckldq %xmm1, %xmm0
	ret

compare to gcc:

_test:
	subl	$12, %esp
	movd	20(%esp), %xmm0
	movd	16(%esp), %xmm1
	punpckldq	%xmm0, %xmm1
	movq	%xmm1, %xmm0
	movhps	LC0, %xmm0
	addl	$12, %esp
	ret

or icc:

_test:
        movd      4(%esp), %xmm0                                #5.10
        movd      8(%esp), %xmm3                                #5.10
        xorl      %eax, %eax                                    #5.10
        movd      %eax, %xmm1                                   #5.10
        punpckldq %xmm1, %xmm0                                  #5.10
        movd      %eax, %xmm2                                   #5.10
        punpckldq %xmm2, %xmm3                                  #5.10
        punpckldq %xmm3, %xmm0                                  #5.10
        ret                                                     #5.10

There are still room for improvement, for example the FP variant of the above example:

__m128 test(float a, float b) {
  return _mm_set_ps(0.0, 0.0, b, a);
}

_test:
	movss 8(%esp), %xmm1
	movss 4(%esp), %xmm0
	unpcklps %xmm1, %xmm0
	xorps %xmm1, %xmm1
	movlhps %xmm1, %xmm0
	ret

The xorps and movlhps are unnecessary. This will require post legalizer optimization to handle.

llvm-svn: 27939
2006-04-21 23:03:30 +00:00
Nate Begeman 57a32f0bc1 Fix the comment
llvm-svn: 27938
2006-04-21 22:11:27 +00:00
Nate Begeman 516b393992 Change the PPC JIT to use a Static relocation model
llvm-svn: 27937
2006-04-21 22:04:15 +00:00
Chris Lattner 3e62d4b289 fix thinko
llvm-svn: 27935
2006-04-21 21:05:22 +00:00
Chris Lattner e1f9ab7d53 add some low-prio notes
llvm-svn: 27934
2006-04-21 21:03:21 +00:00
Evan Cheng e8b5180044 Now generating perfect (I think) code for "vector set" with a single non-zero
scalar value.

e.g.
        _mm_set_epi32(0, a, 0, 0);
==>
	movd 4(%esp), %xmm0
	pshufd $69, %xmm0, %xmm0

        _mm_set_epi8(0, 0, 0, 0, 0, a, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
==>
	movzbw 4(%esp), %ax
	movzwl %ax, %eax
	pxor %xmm0, %xmm0
	pinsrw $5, %eax, %xmm0

llvm-svn: 27923
2006-04-21 01:05:10 +00:00
Chris Lattner 99d3da9d2c Fix the CodeGen/PowerPC/buildvec_canonicalize.ll regression last night.
llvm-svn: 27908
2006-04-20 19:01:30 +00:00
Chris Lattner d1c3a067ee add a note
llvm-svn: 27907
2006-04-20 18:49:28 +00:00
Chris Lattner 3e5521799c remove some v9 specific code
llvm-svn: 27900
2006-04-20 18:33:11 +00:00
Chris Lattner 2a875285f7 Remove this obsolete file
llvm-svn: 27895
2006-04-20 18:16:45 +00:00
Chris Lattner ac61195539 This target is no longer built. The ,v files now live in the reoptimizer.
llvm-svn: 27885
2006-04-20 17:15:44 +00:00
Evan Cheng 60f0b8998e - Added support to turn "vector clear elements", e.g. pand V, <-1, -1, 0, -1>
to a vector shuffle.
- VECTOR_SHUFFLE lowering change in preparation for more efficient codegen
of vector shuffle with zero (or any splat) vector.

llvm-svn: 27875
2006-04-20 08:58:49 +00:00
Chris Lattner 0cd0065c58 Make sure that the new instructions selected have the right type. This fixes
CodeGen/PowerPC/2006-04-19-vmaddfp-crash.ll

llvm-svn: 27868
2006-04-20 05:58:10 +00:00
Evan Cheng 15c264b753 Handle v2i64 BUILD_VECTOR custom lowering correctly. v2i64 is a legal type,
but i64 is not. If possible, change a i64 op to a f64 (e.g. load, constant)
and then cast it back.

llvm-svn: 27849
2006-04-20 00:11:39 +00:00
Evan Cheng 4a1b0d3292 isSplatMask() bug: first element can be an undef.
llvm-svn: 27847
2006-04-19 23:28:59 +00:00
Evan Cheng a3caaee503 - Added support to do aribitrary 4 wide shuffle with no more than three
instructions.
- Fixed a commute vector_shuff bug.

llvm-svn: 27845
2006-04-19 22:48:17 +00:00
Evan Cheng 6d5297dac3 Prefer {p}unpack* and mov*dup over {p}shuf* as well.
llvm-svn: 27844
2006-04-19 21:15:24 +00:00
Evan Cheng 52df74000a Renamed AddedCost to AddedComplexity.
llvm-svn: 27843
2006-04-19 20:38:28 +00:00
Evan Cheng b416a25174 - Renamed AddedCost to AddedComplexity.
- Added more movhlps and movlhps patterns.

llvm-svn: 27842
2006-04-19 20:37:34 +00:00
Evan Cheng 7855e4d032 Commute vector_shuffle to match more movlhps, movlp{s|d} cases.
llvm-svn: 27840
2006-04-19 20:35:22 +00:00
Evan Cheng cc7abc6c38 More mov{h|l}p{d|s} patterns.
llvm-svn: 27836
2006-04-19 18:20:17 +00:00
Evan Cheng aeb09ccdd3 - More mov{h|l}ps patterns.
- Increase cost (complexity) of patterns which match mov{h|l}ps ops. These
  are preferred over shufps in most cases.

llvm-svn: 27835
2006-04-19 18:11:52 +00:00
Evan Cheng aa3325e925 Allow "let AddedCost = n in" to increase pattern complexity.
llvm-svn: 27834
2006-04-19 18:07:24 +00:00
Chris Lattner 05bbec5020 add a note
llvm-svn: 27832
2006-04-19 16:22:38 +00:00
Chris Lattner a922a516b0 add a note
llvm-svn: 27828
2006-04-19 05:55:06 +00:00
Chris Lattner bfab82817a Add a note.
llvm-svn: 27827
2006-04-19 05:53:27 +00:00
Evan Cheng 3823aa1d0f - PEXTRW cannot take a memory location as its first source operand.
- PINSRWrmi encoding bug.

llvm-svn: 27818
2006-04-18 21:59:43 +00:00
Evan Cheng 43f4ef4ffb SHUFP{S|D}, PSHUF* encoding bugs. Left out the mask immediate operand.
llvm-svn: 27817
2006-04-18 21:56:36 +00:00
Evan Cheng a179ea631d Name change for clarity sake
llvm-svn: 27816
2006-04-18 21:55:35 +00:00
Evan Cheng 09e36ef710 Encoding bug: CMPPSrmi, CMPPDrmi dropped operand 2 (condtion immediate).
llvm-svn: 27815
2006-04-18 21:31:08 +00:00
Evan Cheng d799d680f4 Name change for clarity sake
llvm-svn: 27814
2006-04-18 21:29:50 +00:00
Evan Cheng 0ee281f37c Left a pattern out
llvm-svn: 27813
2006-04-18 21:29:08 +00:00
Chris Lattner 34c901b50e These are correctly encoded by the JIT. I checked :)
llvm-svn: 27810
2006-04-18 19:03:38 +00:00
Chris Lattner 197d762232 add a note
llvm-svn: 27809
2006-04-18 18:30:19 +00:00
Chris Lattner 518834c67e Fix a crash on:
void foo2(vector float *A, vector float *B) {
  vector float C = (vector float)vec_cmpeq(*A, *B);
  if (!vec_any_eq(*A, *B))
    *B = (vector float){0,0,0,0};
  *A = C;
}

llvm-svn: 27808
2006-04-18 18:28:22 +00:00
Evan Cheng e2d25a1a50 Fixed an encoding bug: movd from XMM to R32.
llvm-svn: 27807
2006-04-18 18:19:00 +00:00
Chris Lattner 1e174c87c3 pretty print node name
llvm-svn: 27806
2006-04-18 18:05:58 +00:00
Chris Lattner 9754d142a4 Implement an important entry from README_ALTIVEC:
If an altivec predicate compare is used immediately by a branch, don't
use a (serializing) MFCR instruction to read the CR6 register, which requires
a compare to get it back to CR's.  Instead, just branch on CR6 directly. :)

For example, for:
void foo2(vector float *A, vector float *B) {
  if (!vec_any_eq(*A, *B))
    *B = (vector float){0,0,0,0};
}

We now generate:

_foo2:
        mfspr r2, 256
        oris r5, r2, 12288
        mtspr 256, r5
        lvx v2, 0, r4
        lvx v3, 0, r3
        vcmpeqfp. v2, v3, v2
        bne cr6, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
        vxor v2, v2, v2
        stvx v2, 0, r4
        mtspr 256, r2
        blr
LBB1_2: ; UnifiedReturnBlock
        mtspr 256, r2
        blr

instead of:

_foo2:
        mfspr r2, 256
        oris r5, r2, 12288
        mtspr 256, r5
        lvx v2, 0, r4
        lvx v3, 0, r3
        vcmpeqfp. v2, v3, v2
        mfcr r3, 2
        rlwinm r3, r3, 27, 31, 31
        cmpwi cr0, r3, 0
        beq cr0, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
        vxor v2, v2, v2
        stvx v2, 0, r4
        mtspr 256, r2
        blr
LBB1_2: ; UnifiedReturnBlock
        mtspr 256, r2
        blr

This implements CodeGen/PowerPC/vec_br_cmp.ll.

llvm-svn: 27804
2006-04-18 17:59:36 +00:00
Chris Lattner 68c16a201e move some stuff around, clean things up
llvm-svn: 27802
2006-04-18 17:52:36 +00:00
Chris Lattner bfc2c68386 Teach the codegen about instructions used for SSE spill code, allowing it
to optimize cases where it has to spill a lot

llvm-svn: 27801
2006-04-18 16:44:51 +00:00
Chris Lattner 96d50487c9 Use vmladduhm to do v8i16 multiplies which is faster and simpler than doing
even/odd halves.  Thanks to Nate telling me what's what.

llvm-svn: 27793
2006-04-18 04:28:57 +00:00
Chris Lattner d6d82aa889 Implement v16i8 multiply with this code:
vmuloub v5, v3, v2
        vmuleub v2, v3, v2
        vperm v2, v2, v5, v4

This implements CodeGen/PowerPC/vec_mul.ll.  With this, v16i8 multiplies are
6.79x faster than before.

Overall, UnitTests/Vector/multiplies.c is now 2.45x faster with LLVM than with
GCC.

Remove the 'integer multiplies' todo from the README file.

llvm-svn: 27792
2006-04-18 03:57:35 +00:00
Evan Cheng 4d36a36900 Correct comments
llvm-svn: 27790
2006-04-18 03:45:01 +00:00
Chris Lattner 7e439874cb Lower v8i16 multiply into this code:
li r5, lo16(LCPI1_0)
        lis r6, ha16(LCPI1_0)
        lvx v4, r6, r5
        vmulouh v5, v3, v2
        vmuleuh v2, v3, v2
        vperm v2, v2, v5, v4

where v4 is:
LCPI1_0:                                        ;  <16 x ubyte>
        .byte   2
        .byte   3
        .byte   18
        .byte   19
        .byte   6
        .byte   7
        .byte   22
        .byte   23
        .byte   10
        .byte   11
        .byte   26
        .byte   27
        .byte   14
        .byte   15
        .byte   30
        .byte   31

This is 5.07x faster on the G5 (measured) than lowering to scalar code +
loads/stores.

llvm-svn: 27789
2006-04-18 03:43:48 +00:00
Chris Lattner a2cae1bb10 Custom lower v4i32 multiplies into a cute sequence, instead of having legalize
scalarize the sequence into 4 mullw's and a bunch of load/store traffic.

This speeds up v4i32 multiplies 4.1x (measured) on a G5.  This implements
PowerPC/vec_mul.ll

llvm-svn: 27788
2006-04-18 03:24:30 +00:00
Evan Cheng 0ef233509b Another entry
llvm-svn: 27786
2006-04-18 01:22:57 +00:00
Evan Cheng e008bd3d27 Another entry.
llvm-svn: 27784
2006-04-18 00:21:01 +00:00
Evan Cheng 5421206c4b Use movss to insert_vector_elt(v, s, 0).
llvm-svn: 27782
2006-04-17 22:45:49 +00:00
Evan Cheng 6e5e205841 Use two pinsrw to insert an element into v4i32 / v4f32 vector.
llvm-svn: 27779
2006-04-17 22:04:06 +00:00
Chris Lattner 63a5cdc423 remove done item
llvm-svn: 27778
2006-04-17 21:52:03 +00:00
Chris Lattner 6bd68ae81e Don't diddle VRSAVE if no registers need to be added/removed from it. This
allows us to codegen functions as:

_test_rol:
        vspltisw v2, -12
        vrlw v2, v2, v2
        blr

instead of:

_test_rol:
        mfvrsave r2, 256
        mr r3, r2
        mtvrsave r3
        vspltisw v2, -12
        vrlw v2, v2, v2
        mtvrsave r2
        blr

Testcase here: CodeGen/PowerPC/vec_vrsave.ll

llvm-svn: 27777
2006-04-17 21:48:13 +00:00
Evan Cheng 22c06f054b Encoding bug
llvm-svn: 27773
2006-04-17 21:33:57 +00:00
Chris Lattner 72d7c27069 Vectors that are known live-in and live-out are clearly already marked in
the vrsave register for the caller.  This allows us to codegen a function as:

_test_rol:
        mfspr r2, 256
        mr r3, r2
        mtspr 256, r3
        vspltisw v2, -12
        vrlw v2, v2, v2
        mtspr 256, r2
        blr

instead of:

_test_rol:
        mfspr r2, 256
        oris r3, r2, 40960
        mtspr 256, r3
        vspltisw v0, -12
        vrlw v2, v0, v0
        mtspr 256, r2
        blr

llvm-svn: 27772
2006-04-17 21:22:06 +00:00
Chris Lattner 14c4972b6d Prefer to allocate V2-V5 before V0,V1. This lets us generate code like this:
vspltisw v2, -12
        vrlw v2, v2, v2

instead of:

        vspltisw v0, -12
        vrlw v2, v0, v0

when a function is returning a value.

llvm-svn: 27771
2006-04-17 21:19:12 +00:00
Chris Lattner 6df094b4ab Move some knowledge about registers out of the code emitter into the register info.
llvm-svn: 27770
2006-04-17 21:07:20 +00:00
Chris Lattner 0f28d48da2 Use a small table instead of macros to do this conversion.
llvm-svn: 27769
2006-04-17 20:59:25 +00:00
Evan Cheng 5022b3426e Implement v8i16, v16i8 splat using unpckl + pshufd.
llvm-svn: 27768
2006-04-17 20:43:08 +00:00
Chris Lattner c070c621ac implement returns of a vector, testcase here: CodeGen/X86/vec_return.ll
llvm-svn: 27767
2006-04-17 20:32:50 +00:00
Chris Lattner e54133cfba Make sure to check splats of every constant we can, handle splat(31) by
being a bit more clever, add support for odd splats from -31 to -17.

llvm-svn: 27764
2006-04-17 18:09:22 +00:00
Evan Cheng bf0d13c54f Incorrect foldMemoryOperand entries
llvm-svn: 27763
2006-04-17 18:06:12 +00:00
Evan Cheng 5112b5c544 Errors in patterns preventing load folding
llvm-svn: 27762
2006-04-17 18:05:01 +00:00
Jeff Cohen e3955a05e4 Add checks for __OpenBSD__.
llvm-svn: 27761
2006-04-17 17:55:41 +00:00
Chris Lattner 264c908e3a Teach the ppc backend to use rol and vsldoi to generate splatted constants.
This implements vec_constants.ll:test_vsldoi and test_rol

llvm-svn: 27760
2006-04-17 17:55:10 +00:00
Chris Lattner 26fb8d9393 add a note
llvm-svn: 27758
2006-04-17 17:29:41 +00:00
Evan Cheng b3b41c4f3d FP SETOLT, SETOLT, SETUGE, SETUGT conditions were implemented incorrectly
llvm-svn: 27755
2006-04-17 07:24:10 +00:00
Chris Lattner 1b3806ace5 Make some code more general, adding support for constant formation of several
new patterns.

llvm-svn: 27754
2006-04-17 06:58:41 +00:00
Chris Lattner f8dd76df5b Learn how to make odd splatted constants in range [17,29]. This implements
PowerPC/vec_constants.ll:test_29.

llvm-svn: 27752
2006-04-17 06:07:44 +00:00
Chris Lattner 2a099c04c1 Pull some code out into a helper function.
Effeciently codegen even splats in the range [-32,30].

This allows us to codegen <30,30,30,30> as:

        vspltisw v0, 15
        vadduwm v2, v0, v0

instead of as a cp load.

llvm-svn: 27750
2006-04-17 06:00:21 +00:00
Chris Lattner 071ad01ceb Implement a TODO: for any shuffle that can be viewed as a v4[if]32 shuffle,
if it can be implemented in 3 or fewer discrete altivec instructions, codegen
it as such.  This implements Regression/CodeGen/PowerPC/vec_perf_shuffle.ll

llvm-svn: 27748
2006-04-17 05:28:54 +00:00
Chris Lattner 85bfa3c2bc Regenerate with adjusted costs
llvm-svn: 27746
2006-04-17 05:26:20 +00:00
Chris Lattner aac2a200cd Regenerate with correct offset
llvm-svn: 27744
2006-04-17 05:08:46 +00:00
Chris Lattner 311b1a6e23 Increase the opcodes by one each to disambiguate COPY from VMRGHW.
llvm-svn: 27742
2006-04-17 00:47:48 +00:00
Chris Lattner 07a3d01a91 Check in a table, generated by llvm-PerfectShuffle, of optimal shuffles
of various 4-element vectors.

llvm-svn: 27739
2006-04-17 00:37:02 +00:00
Evan Cheng 20712deecb movduprm, movshduprm bugs
llvm-svn: 27734
2006-04-16 18:11:28 +00:00
Evan Cheng 3064f9aaa6 Encoding bugs
llvm-svn: 27733
2006-04-16 07:02:22 +00:00
Evan Cheng 685ddd8152 Can't fold loads into alias vector SSE ops used for scalar operation. The load
address has to be 16-byte aligned but the values aren't spilled to 128-bit
locations.

llvm-svn: 27732
2006-04-16 06:58:19 +00:00
Chris Lattner 06a21ba96b Implement a TODO: have the legalizer canonicalize a bunch of operations to
one type (v4i32) so that we don't have to write patterns for each type, and
so that more CSE opportunities are exposed.

llvm-svn: 27731
2006-04-16 01:37:57 +00:00
Chris Lattner fa5aa396c2 Make the BUILD_VECTOR lowering code much more aggressive w.r.t constant vectors.
Remove some done items from the todo list.

llvm-svn: 27729
2006-04-16 01:01:29 +00:00
Chris Lattner 24acbe46c0 Fix a crash when faced with a shuffle vector that has an undef in its mask.
llvm-svn: 27726
2006-04-15 23:48:05 +00:00
Chris Lattner 873202fabd Add patterns for matching vnots with bit converted inputs. Most of these will
go away when I start using evan's binop type canonicalizer

llvm-svn: 27725
2006-04-15 23:45:24 +00:00
Chris Lattner 41df12ff4c Add a new vnot_conv predicate for matching vnot's where the allones vector is
bitconverted from some other type.

llvm-svn: 27724
2006-04-15 23:39:14 +00:00
Evan Cheng 8f1d801389 More encoding bugs
llvm-svn: 27722
2006-04-15 06:10:09 +00:00
Evan Cheng 91944e8699 pslldrm, psrawrm, etc. encoding bug
llvm-svn: 27721
2006-04-15 05:59:08 +00:00
Evan Cheng 1220b31a31 hsubp{s|d} encoding bug
llvm-svn: 27720
2006-04-15 05:52:42 +00:00
Evan Cheng 6222cf2a36 Silly bug
llvm-svn: 27719
2006-04-15 05:37:34 +00:00
Evan Cheng 65bb720a8b Do not use movs{h|l}dup for a shuffle with a single non-undef node.
llvm-svn: 27718
2006-04-15 03:13:24 +00:00
Evan Cheng 0ba896c75b Added SSE (and other) entries to foldMemoryOperand().
llvm-svn: 27716
2006-04-14 23:33:27 +00:00
Evan Cheng 00a5b3d9d3 Some clean up
llvm-svn: 27715
2006-04-14 23:32:40 +00:00
Chris Lattner 559c8ba466 Allow undef in a shuffle mask
llvm-svn: 27714
2006-04-14 23:19:08 +00:00
Evan Cheng 5d247f81c1 Last few SSE3 intrinsics.
llvm-svn: 27711
2006-04-14 21:59:03 +00:00
Evan Cheng 3bd605397b Misc. SSE2 intrinsics: clflush, lfench, mfence
llvm-svn: 27699
2006-04-14 07:43:12 +00:00
Evan Cheng e349d01acf We were not adjusting the frame size to ensure proper alignment when alloca /
vla are present in the function. This causes a crash when a leaf function
allocates space on the stack used to store / load with 128-bit SSE
instructions.

llvm-svn: 27698
2006-04-14 07:26:43 +00:00
Evan Cheng 8d76f3922b New entry
llvm-svn: 27697
2006-04-14 07:24:04 +00:00
Chris Lattner 4211ca9108 Move the rest of the PPCTargetLowering::LowerOperation cases out into
separate functions, for simplicity and code clarity.

llvm-svn: 27693
2006-04-14 06:01:58 +00:00
Chris Lattner 19e9055eb5 Pull the VECTOR_SHUFFLE and BUILD_VECTOR lowering code out into separate
functions, which makes the code much cleaner :)

llvm-svn: 27692
2006-04-14 05:19:18 +00:00
Evan Cheng eb0063a34f pcmpeq* and pcmpgt* intrinsics.
llvm-svn: 27685
2006-04-14 01:39:53 +00:00
Evan Cheng 16287444ff psll*, psrl*, and psra* intrinsics.
llvm-svn: 27684
2006-04-14 00:14:05 +00:00
Reid Spencer 64f6c11c59 Remove the .cvsignore file so this directory can be pruned.
llvm-svn: 27683
2006-04-13 22:00:10 +00:00
Reid Spencer 497ecf6840 Remove .cvsignore so that this directory can be pruned.
llvm-svn: 27682
2006-04-13 21:59:03 +00:00
Evan Cheng a84319719c Doh. PANDrm, etc. are not commutable.
llvm-svn: 27668
2006-04-13 18:11:28 +00:00
Chris Lattner 883fb053bd Force non-darwin targets to use a static relo model. This fixes PR734,
tested by CodeGen/Generic/vector.ll

llvm-svn: 27657
2006-04-13 17:10:48 +00:00
Chris Lattner 5879efe0c8 add a note, move an altivec todo to the altivec list.
llvm-svn: 27654
2006-04-13 16:48:00 +00:00
Reid Spencer 9857229aba Add the README files to the distribution.
llvm-svn: 27651
2006-04-13 06:39:24 +00:00
Evan Cheng ed3996743f psad, pmax, pmin intrinsics.
llvm-svn: 27647
2006-04-13 06:11:45 +00:00
Evan Cheng 58dad55959 Various SSE2 packed integer intrinsics: pmulhuw, pavgw, etc.
llvm-svn: 27645
2006-04-13 05:24:54 +00:00
Evan Cheng e4f97ccf7f X86 SSE2 supports v8i16 multiplication
llvm-svn: 27644
2006-04-13 05:10:25 +00:00
Evan Cheng d2eb662415 Update
llvm-svn: 27643
2006-04-13 05:09:45 +00:00
Evan Cheng b3fe00bdc6 padds{b|w}, paddus{b|w}, psubs{b|w}, psubus{b|w} intrinsics.
llvm-svn: 27639
2006-04-13 00:43:35 +00:00
Evan Cheng 0aab735a1a Naming inconsistency.
llvm-svn: 27638
2006-04-13 00:00:23 +00:00
Evan Cheng c88afc36a9 SSE / SSE2 conversion intrinsics.
llvm-svn: 27637
2006-04-12 23:42:44 +00:00
Evan Cheng 92232307d0 All "integer" logical ops (pand, por, pxor) are now promoted to v2i64.
Clean up and fix various logical ops issues.

llvm-svn: 27633
2006-04-12 21:21:57 +00:00
Chris Lattner 147e50e1c5 Add a new way to match vector constants, which make it easier to bang bits of
different types.

Codegen spltw(0x7FFFFFFF) and spltw(0x80000000) without a constant pool load,
implementing PowerPC/vec_constants.ll:test1.  This compiles:

typedef float vf __attribute__ ((vector_size (16)));
typedef int vi __attribute__ ((vector_size (16)));
void test(vi *P1, vi *P2, vf *P3) {
  *P1 &= (vi){0x80000000,0x80000000,0x80000000,0x80000000};
  *P2 &= (vi){0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF};
  *P3 = vec_abs((vector float)*P3);
}

to:

_test:
        mfspr r2, 256
        oris r6, r2, 49152
        mtspr 256, r6
        vspltisw v0, -1
        vslw v0, v0, v0
        lvx v1, 0, r3
        vand v1, v1, v0
        stvx v1, 0, r3
        lvx v1, 0, r4
        vandc v1, v1, v0
        stvx v1, 0, r4
        lvx v1, 0, r5
        vandc v0, v1, v0
        stvx v0, 0, r5
        mtspr 256, r2
        blr

instead of (with two constant pool entries):

_test:
        mfspr r2, 256
        oris r6, r2, 49152
        mtspr 256, r6
        li r6, lo16(LCPI1_0)
        lis r7, ha16(LCPI1_0)
        li r8, lo16(LCPI1_1)
        lis r9, ha16(LCPI1_1)
        lvx v0, r7, r6
        lvx v1, 0, r3
        vand v0, v1, v0
        stvx v0, 0, r3
        lvx v0, r9, r8
        lvx v1, 0, r4
        vand v1, v1, v0
        stvx v1, 0, r4
        lvx v1, 0, r5
        vand v0, v1, v0
        stvx v0, 0, r5
        mtspr 256, r2
        blr

GCC produces (with 2 cp entries):

_test:
        mfspr r0,256
        stw r0,-4(r1)
        oris r0,r0,0xc00c
        mtspr 256,r0
        lis r2,ha16(LC0)
        lis r9,ha16(LC1)
        la r2,lo16(LC0)(r2)
        lvx v0,0,r3
        lvx v1,0,r5
        la r9,lo16(LC1)(r9)
        lwz r12,-4(r1)
        lvx v12,0,r2
        lvx v13,0,r9
        vand v0,v0,v12
        stvx v0,0,r3
        vspltisw v0,-1
        vslw v12,v0,v0
        vandc v1,v1,v12
        stvx v1,0,r5
        lvx v0,0,r4
        vand v0,v0,v13
        stvx v0,0,r4
        mtspr 256,r12
        blr

llvm-svn: 27624
2006-04-12 19:07:14 +00:00
Chris Lattner 74cf9ff761 Rename get_VSPLI_elt -> get_VSPLTI_elt
Canonicalize BUILD_VECTOR's that match VSPLTI's into a single type for each
form, eliminating a bunch of Pat patterns in the .td file and allowing us to
CSE stuff more aggressively.  This implements
PowerPC/buildvec_canonicalize.ll:VSPLTI

llvm-svn: 27614
2006-04-12 17:37:20 +00:00
Evan Cheng e2157c6e41 Promote v4i32, v8i16, v16i8 load to v2i64 load.
llvm-svn: 27612
2006-04-12 17:12:36 +00:00
Chris Lattner e318a7574e Ensure that zero vectors are always v4i32, which forces them to CSE with
each other.  This implements CodeGen/PowerPC/vxor-canonicalize.ll

llvm-svn: 27609
2006-04-12 16:53:28 +00:00
Evan Cheng 29be057d92 Various SSE2 conversion intrinsics
llvm-svn: 27603
2006-04-12 05:20:24 +00:00
Evan Cheng 70c74a3ced Added __builtin_ia32_storelv4si, __builtin_ia32_movqv4si,
__builtin_ia32_loadlv4si, __builtin_ia32_loaddqu, __builtin_ia32_storedqu.

llvm-svn: 27599
2006-04-11 22:28:25 +00:00
Nate Begeman f19bcd5177 Fix SingleSource/UnitTests/Vector/sumarray-dbl
llvm-svn: 27594
2006-04-11 19:44:43 +00:00
Nate Begeman 1bb132099f Fix PR727, correctly handling large stack aligments on ppc
llvm-svn: 27593
2006-04-11 19:29:21 +00:00
Chris Lattner aaa04230bd we have a shuffle instr, add an example.
llvm-svn: 27592
2006-04-11 18:47:03 +00:00
Evan Cheng 6b60357f4a gcc lower SSE prefetch into generic prefetch intrinsic. Need to add support
later.

llvm-svn: 27591
2006-04-11 18:04:57 +00:00
Evan Cheng 6ea715af28 Misc. intrinsics.
llvm-svn: 27590
2006-04-11 17:35:57 +00:00
Jim Laskey 02b3b72bfc Suppress debug label when not debug.
llvm-svn: 27588
2006-04-11 08:11:53 +00:00
Evan Cheng 09a956271a movnt* and maskmovdqu intrinsics
llvm-svn: 27587
2006-04-11 06:57:30 +00:00
Chris Lattner e4db08a2f1 Vector function results go into V2 according to GCC. The darwin ABI doc
doesn't say where they go :-/

llvm-svn: 27579
2006-04-11 01:38:39 +00:00
Chris Lattner 92533cfb4a Move some return-handling code from lowerarguments to the ISD::RET handling stuff.
No functionality change.

llvm-svn: 27577
2006-04-11 01:21:43 +00:00
Evan Cheng 12ba3e23d0 Added support for _mm_move_ss and _mm_move_sd.
llvm-svn: 27575
2006-04-11 00:19:04 +00:00
Jim Laskey dca2655daa Use existing information.
llvm-svn: 27574
2006-04-10 23:09:19 +00:00
Evan Cheng f8ac02283c Remove some bogus patterns; clean up.
llvm-svn: 27569
2006-04-10 22:35:16 +00:00
Chris Lattner d99f57c1e1 add a note
llvm-svn: 27567
2006-04-10 21:51:03 +00:00
Evan Cheng 051de9a82b Remove an entry that is now done.
llvm-svn: 27565
2006-04-10 21:42:57 +00:00
Evan Cheng 76112c3cb8 Added some missing shuffle patterns.
llvm-svn: 27564
2006-04-10 21:42:19 +00:00
Evan Cheng 664fcba5fa Correct an entry
llvm-svn: 27563
2006-04-10 21:41:39 +00:00
Evan Cheng 395fa3d2a6 movups / movupd
llvm-svn: 27562
2006-04-10 21:11:06 +00:00
Evan Cheng 617a6a812e Conditional move of vector types.
llvm-svn: 27556
2006-04-10 07:23:14 +00:00
Evan Cheng 014849e121 New entries
llvm-svn: 27555
2006-04-10 07:22:03 +00:00
Evan Cheng c9ed8e4c1a Use movaps to do VR128 reg-to-reg copies for now. It's shorter and available for SSE1.
llvm-svn: 27554
2006-04-10 07:21:31 +00:00
Chris Lattner 3a68f3c3ca properly mark vector selects as expanded to select_cc
llvm-svn: 27544
2006-04-08 22:59:15 +00:00
Chris Lattner 0a3d1bbca4 Add VRRC select support
llvm-svn: 27543
2006-04-08 22:45:08 +00:00
Nate Begeman 3f9c17906f Disable switch lowering for targets based on the selection dag isel,
letting the code generator handle them directly.

llvm-svn: 27539
2006-04-08 19:46:55 +00:00
Chris Lattner d9e80f4516 Implement PowerPC/CodeGen/vec_splat.ll:spltish to use vsplish instead of a
constant pool load.

llvm-svn: 27538
2006-04-08 07:14:26 +00:00
Chris Lattner d71a1f946d Change the interface to the predicate that determines if vsplti* can be used.
No functionality changes.

llvm-svn: 27536
2006-04-08 06:46:53 +00:00
Reid Spencer cf905223c5 Initialize SDOperand values because the gcc 4.0.2 compiler complains about
them.

llvm-svn: 27534
2006-04-08 05:38:03 +00:00
Evan Cheng 0df9c9f57d ldmxcsr and stmxcsr.
llvm-svn: 27506
2006-04-08 00:47:44 +00:00
Evan Cheng ac847268c5 Code clean up.
llvm-svn: 27501
2006-04-07 21:53:05 +00:00
Evan Cheng aa18a52545 Added patterns for MOVHPSmr and MOVLPSmr.
llvm-svn: 27497
2006-04-07 21:20:58 +00:00
Evan Cheng 748e573ce5 Keep track of an Mac OS X / x86 ABI bug.
llvm-svn: 27496
2006-04-07 21:19:53 +00:00
Jim Laskey c0d6518f27 Make sure that debug labels are defined within the same section and after the
entry point of a function.

llvm-svn: 27494
2006-04-07 20:44:42 +00:00
Jim Laskey 2d7298c362 Foundation for call frame information.
llvm-svn: 27491
2006-04-07 16:34:46 +00:00
Evan Cheng d8e1a01be6 A MOVPS2SSmr, i.e. _mm_store_ss, encoding bug.
Also MOVPDI2DIrr.

llvm-svn: 27476
2006-04-06 23:53:29 +00:00
Evan Cheng c995b45f67 - movlp{s|d} and movhp{s|d} support.
- Normalize shuffle nodes so result vector lower half elements come from the
  first vector, the rest come from the second vector. (Except for the
  exceptions :-).
- Other minor fixes.

llvm-svn: 27474
2006-04-06 23:23:56 +00:00
Evan Cheng acf8b3c828 New entries.
llvm-svn: 27473
2006-04-06 23:21:24 +00:00
Andrew Lenharth 1596a1b276 This may be overconservative, but it lets the new cfe compile
llvm-svn: 27471
2006-04-06 23:18:45 +00:00
Chris Lattner e61cfad815 Add an item
llvm-svn: 27470
2006-04-06 23:16:19 +00:00
Chris Lattner 466841ddc7 Make sure to return the result in the right type.
llvm-svn: 27469
2006-04-06 23:12:19 +00:00
Chris Lattner a4bbfaed5c Match vpku[hw]um(x,x).
Convert vsldoi(x,x) to work the same way other (x,x) cases work.

llvm-svn: 27467
2006-04-06 22:28:36 +00:00
Chris Lattner f38e033270 Add support for matching vmrg(x,x) patterns
llvm-svn: 27463
2006-04-06 22:02:42 +00:00
Andrew Lenharth cee782d514 fix some linking problems with the new gcc
llvm-svn: 27460
2006-04-06 21:26:32 +00:00
Chris Lattner d1dcb52093 Pattern match vmrg* instructions, which are now lowered by the CFE into shuffles.
llvm-svn: 27457
2006-04-06 21:11:54 +00:00
Chris Lattner a4c727f1cc remove two done items
llvm-svn: 27453
2006-04-06 19:19:38 +00:00
Chris Lattner 1d33819194 Support pattern matching vsldoi(x,y) and vsldoi(x,x), which allows the f.e. to
lower it and LLVM to have one fewer intrinsic.  This implements
CodeGen/PowerPC/vec_shuffle.ll

llvm-svn: 27450
2006-04-06 18:26:28 +00:00
Chris Lattner e8b83b4206 Compile the vpkuhum/vpkuwum intrinsics into vpkuhum/vpkuwum instead of into
vperm with a perm mask lvx'd from the constant pool.

llvm-svn: 27448
2006-04-06 17:23:16 +00:00
Evan Cheng 695e45c252 POR encoded as PAND, yikes.
llvm-svn: 27446
2006-04-06 01:49:20 +00:00
Evan Cheng dddb688a40 An entry about comi / ucomi intrinsics.
llvm-svn: 27445
2006-04-05 23:46:04 +00:00
Evan Cheng 780382946e Support for comi / ucomi intrinsics.
llvm-svn: 27444
2006-04-05 23:38:46 +00:00
Chris Lattner c94d932447 Add all of the data stream intrinsics and instructions. woo
llvm-svn: 27442
2006-04-05 22:27:14 +00:00
Chris Lattner 39dc64c955 Fix a typo
llvm-svn: 27440
2006-04-05 20:15:25 +00:00
Chris Lattner 39cc717c65 Fix CodeGen/PowerPC/2006-04-05-splat-ish.ll
llvm-svn: 27439
2006-04-05 17:39:25 +00:00
Evan Cheng f3b52c84ea Handle canonical form of e.g.
vector_shuffle v1, v1, <0, 4, 1, 5, 2, 6, 3, 7>

This is turned into
vector_shuffle v1, <undef>, <0, 0, 1, 1, 2, 2, 3, 3>
by dag combiner.

It would match a {p}unpckl on x86.

llvm-svn: 27437
2006-04-05 07:20:06 +00:00
Evan Cheng 6d196db40d Bogus assert
llvm-svn: 27434
2006-04-05 06:11:20 +00:00
Evan Cheng 2cf4232ced Fallthrough to expand if a VECTOR_SHUFFLE cannot be custom lowered.
llvm-svn: 27433
2006-04-05 06:09:26 +00:00
Evan Cheng 59a6355e82 Handle v8i16 shuffle that must be broken into a pair of pshufhw / pshuflw.
llvm-svn: 27427
2006-04-05 01:47:37 +00:00
Chris Lattner 2f8e2b2895 add vsl
llvm-svn: 27425
2006-04-05 01:16:22 +00:00
Chris Lattner 575352ac20 add vmladduhm
llvm-svn: 27423
2006-04-05 00:49:48 +00:00
Chris Lattner 5a528e565b Add m[tf]vscr instructions.
llvm-svn: 27421
2006-04-05 00:03:57 +00:00
Chris Lattner 0c82447c66 add a note
llvm-svn: 27419
2006-04-04 23:45:11 +00:00
Chris Lattner 281bb5da1d Add missing byte merges.
llvm-svn: 27418
2006-04-04 23:43:56 +00:00
Chris Lattner fc50ae521c Add FP -> Int Conversions
llvm-svn: 27417
2006-04-04 23:25:02 +00:00
Chris Lattner 96338b6a21 add average intrinsics
llvm-svn: 27416
2006-04-04 23:14:00 +00:00
Chris Lattner 4464383a17 add a note
llvm-svn: 27414
2006-04-04 22:43:55 +00:00