Commit Graph

Owen Anderson e2f23a3abf Add lengthof and endof templates that hide a lot of sizeof computations.
Patch by Sterling Stein!

llvm-svn: 41758
2007-09-07 04:06:50 +00:00
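
Such helpers typically replace the sizeof(A)/sizeof(A[0]) idiom with template argument deduction. A minimal sketch of the usual shape (the names match the commit; the exact signatures in the patch may differ):

#include <cstddef>

// Deduce the element count of a fixed-size array at compile time,
// hiding the sizeof(A)/sizeof(A[0]) computation.
template <typename T, std::size_t N>
inline std::size_t lengthof(T (&)[N]) { return N; }

// Pointer one past the last element, for begin/end-style iteration.
template <typename T, std::size_t N>
inline T *endof(T (&Array)[N]) { return Array + N; }

// Example: iterate over a static table without a hand-written count.
int Regs[] = {3, 4, 5, 6};
int sum() {
  int S = 0;
  for (int *I = Regs, *E = endof(Regs); I != E; ++I)
    S += *I;
  return S;
}
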
Dale Johannesen 3cf889f75e Enhance APFloat to retain bits of NaNs (fixes oggenc).
Use APFloat interfaces for more references, mostly
of ConstantFPSDNode.

llvm-svn: 41632
2007-08-31 04:03:46 +00:00
Bill Wendling 10e18dea2a Use i64 on a PPC64 machine
llvm-svn: 41590
2007-08-30 00:59:19 +00:00
Chris Lattner d8c9cb9182 rename isOperandValidForConstraint to LowerAsmOperandForConstraint,
changing the interface to allow for future changes.

llvm-svn: 41384
2007-08-25 00:47:38 +00:00
Evan Cheng 581d2795dc Vector fneg must be expanded into fsub -0.0, X.
llvm-svn: 40586
2007-07-30 07:51:22 +00:00
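
The -0.0 (rather than 0.0) matters for signed zeros: fneg must flip the sign bit even for +0.0, and 0.0 - 0.0 yields +0.0 under the default rounding mode. A standalone illustration of the difference, independent of the backend:

#include <cmath>
#include <cstdio>

int main() {
  float x = 0.0f;
  float good = -0.0f - x;  // matches fneg: produces -0.0
  float bad = 0.0f - x;    // wrong expansion: produces +0.0
  // prints "1 0": only the -0.0 - x form negates the sign of +0.0
  printf("%d %d\n", std::signbit(good), std::signbit(bad));
  return 0;
}
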
Duncan Sands 644f917358 Support for trampolines, except for X86 codegen which is
still under discussion.

llvm-svn: 40549
2007-07-27 12:58:54 +00:00
Lauro Ramos Venancio 09d73c0305 Assert when TLS is not implemented.
llvm-svn: 39737
2007-07-11 17:19:51 +00:00
Dan Gohman f8f531bf69 Change getCopyToParts and getCopyFromParts to always use target-endian
register ordering, for both physical and virtual registers. Update the PPC
target lowering for calls to expect registers for the call result to
already be in target order.

llvm-svn: 38471
2007-07-09 20:59:04 +00:00
Dan Gohman 309d3d51b3 Move ComputeMaskedBits, MaskedValueIsZero, and ComputeNumSignBits from
TargetLowering to SelectionDAG so that they have more convenient
access to the current DAG, in preparation for the ValueType routines
being changed from standalone functions to members of SelectionDAG for
the pre-legalize vector type changes.

llvm-svn: 37704
2007-06-22 14:59:07 +00:00
Chris Lattner 7936d91f70 describe an argument, hide it.
llvm-svn: 37650
2007-06-19 05:46:06 +00:00
Chris Lattner 944200be45 If a function is vararg, never pass inreg arguments in registers. Thanks to
Anton for half of this patch.

llvm-svn: 37641
2007-06-19 00:13:10 +00:00
Dan Gohman 5c4413120f Rename MVT::getVectorBaseType to MVT::getVectorElementType.
llvm-svn: 37579
2007-06-14 22:58:02 +00:00
Dan Gohman c12dd5207d Apply this patch:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070514/049845.html

llvm-svn: 37240
2007-05-18 23:21:46 +00:00
Chris Lattner 0b7472da6f fix some subtle inline asm selection issues
llvm-svn: 37067
2007-05-15 01:31:05 +00:00
Chris Lattner 19ccd6226c Fix a bug in PPCTargetLowering::isLegalAddressingMode: scales other than 0/1/2
are always unsupported.

llvm-svn: 35835
2007-04-09 22:10:05 +00:00
Nicolas Geoffray 23710a7da3 Starting implementation of the ELF32 ABI specification of varargs handling.
LowerVASTART emits the right code if the subtarget is ELF32; the other intrinsics
(VAARG, VACOPY and VAEND) are not yet implemented.

llvm-svn: 35625
2007-04-03 13:59:52 +00:00
Nicolas Geoffray b3e99a18ee The PPC64 ELF ABI is "intended to use the same structure layout and calling convention rules
as the 64-bit PowerOpen ABI" (Reference http://www.linux-foundation.org/spec/ELF/ppc64/).
Change all ELF tests to ELF32.

llvm-svn: 35624
2007-04-03 12:35:28 +00:00
Nicolas Geoffray fbfc451ba9 The ELF ABI specifies F1-F8 registers as argument registers for double, not
F1-F10. This affects only ELF, not MachO.

llvm-svn: 35622
2007-04-03 10:27:07 +00:00
Chris Lattner 1eb94d973a implement the new addressing mode description hook.
llvm-svn: 35521
2007-03-30 23:15:24 +00:00
Lauro Ramos Venancio 682baf2dda "The C standards do say that "char" may either be a "signed char" or "unsigned
char" and it is up to the compiler's implementation or the platform which is
followed."
http://www.arm.linux.org.uk/docs/faqs/signedchar.php

llvm-svn: 35382
2007-03-27 16:33:08 +00:00
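
A tiny program that exposes the difference; it prints -1 where plain char is signed (x86, for example) and 255 where it is unsigned (ARM Linux, per the FAQ above):

#include <cstdio>

int main() {
  char c = '\xFF';
  // Implementation-defined: -1 if plain char is signed, 255 if unsigned.
  printf("%d\n", (int)c);
  return 0;
}
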
Chris Lattner d685514e2e switch TargetLowering::getConstraintType to take the entire constraint,
not just the first letter.  No functionality change.

llvm-svn: 35322
2007-03-25 02:14:49 +00:00
Nicolas Geoffray 7aad92868c Stack and register alignment of call arguments in the ELF ABI
llvm-svn: 35083
2007-03-13 15:02:46 +00:00
Evan Cheng b9dce9db85 More flexible TargetLowering LSR hooks for testing whether an immediate is a legal target address immediate or scale.
llvm-svn: 35074
2007-03-12 23:29:01 +00:00
Chris Lattner 4f2e4e0f92 Switch PPC return lower to use an autogenerated CC description.
llvm-svn: 34940
2007-03-06 00:59:59 +00:00
Nicolas Geoffray 75ab9799df Implemented the frameaddress intrinsic for PPC.
llvm-svn: 34787
2007-03-01 13:11:38 +00:00
Nicolas Geoffray 89d81878d2 Differentiate the CALL instruction between the MachO and the ELF ABI.
llvm-svn: 34667
2007-02-27 13:01:19 +00:00
Chris Lattner 535bd6d3ba always lower to RETFLAG, never leave it as just ret.
llvm-svn: 34639
2007-02-26 19:44:02 +00:00
Chris Lattner 1ee61ab414 no really, this is the right patch
llvm-svn: 34605
2007-02-25 20:01:40 +00:00
Chris Lattner 4d2f5f8740 always promote float varargs to double.
llvm-svn: 34604
2007-02-25 19:59:18 +00:00
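
This follows C's default argument promotions: a float passed through "..." always arrives as a double, so the backend must widen it the same way. A standalone reminder of the rule:

#include <cstdarg>
#include <cstdio>

// Variadic float arguments are promoted to double by the caller,
// so va_arg must read double; va_arg(ap, float) is undefined.
double sum(int n, ...) {
  va_list ap;
  va_start(ap, n);
  double total = 0;
  for (int i = 0; i < n; ++i)
    total += va_arg(ap, double);
  va_end(ap);
  return total;
}

int main() {
  printf("%g\n", sum(2, 1.5f, 2.5f));  // the 1.5f and 2.5f travel as doubles
  return 0;
}
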
Chris Lattner 43df5b335c implement support for the linux/ppc function call ABI. Patch by
Nicolas Geoffray!

llvm-svn: 34574
2007-02-25 05:34:32 +00:00
Jim Laskey e0008e23cf Simplify lowering and selection of exception ops.
llvm-svn: 34488
2007-02-22 14:56:36 +00:00
Jim Laskey 3796abea0f Support to provide exception and selector registers.
llvm-svn: 34482
2007-02-21 22:54:50 +00:00
Chris Lattner 1f7d60262e Fix ixaddrs as well, allowing ppc64 to compile to:
_test2:
        li r2, 0
        lis r3, 1
        std r2, 9024(r3)
        blr

instead of:

_test2:
        lis r2, 1
        li r3, 0
        ori r2, r2, 9024
        std r3, 0(r2)
        blr

This implements CodeGen/PowerPC/LargeAbsoluteAddr.ll:test2

llvm-svn: 34373
2007-02-17 06:57:26 +00:00
Chris Lattner 4a9c0bb147 Compile test/CodeGen/PowerPC/LargeAbsoluteAddr.ll to:
_test:
        lis r2, 743
        li r3, 0
        stw r3, 32751(r2)
        blr

instead of:

_test:
        li r2, 0
        stw r2, 32751(48693248)
        blr

Implement support for ppc64 as well, allowing it to produce better code.

llvm-svn: 34371
2007-02-17 06:44:03 +00:00
Nate Begeman eda5997cc8 Finish off bug 680, allowing targets to custom lower frame and return
address nodes.

llvm-svn: 33636
2007-01-29 22:58:52 +00:00
Anton Korobeynikov 037c867b54 Propagate changes from my local tree. This patch includes:
1. A new parameter attribute called 'inreg'. It means "place this
parameter in registers, if possible". This is a generalization of
gcc's regparm(n) attribute. It's currently used only in the X86-32 backend.
2. Completely rewritten CC handling/lowering code inside the X86 backend.
Merged the stdcall + c CCs and the fastcall + fast CC.
3. Dropped the CSRET CC. We cannot add a struct-return variant for each
target-specific CC (e.g. stdcall + csretcc and so on).
4. Instead of the CSRET CC, introduced an 'sret' parameter attribute. Setting it
on the first parameter means "this is a hidden pointer to the structure
return value; handle it gently".
5. Fixed a small bug in llvm-extract and added a new feature to the
FunctionExtraction pass, which relinks all internal-linkage callees
of the deleted function to external linkage. This allows everything
to be linked together afterwards.

NOTEs: 1. Documentation will be updated soon.
       2. llvm-upgrade should be improved to translate csret => sret.
          Before that, there will be some unexpected test failures.
llvm-svn: 33597
2007-01-28 13:31:35 +00:00
Jim Laskey f9e5445ed4 Make LABEL a builtin opcode.
llvm-svn: 33537
2007-01-26 14:34:52 +00:00
Evan Cheng 8bc7ddb84d setSetCCIsExpensive is gone.
llvm-svn: 32941
2007-01-05 23:42:53 +00:00
Jim Laskey f07cc990dd Provide support for FP_TO_UINT.
llvm-svn: 32599
2006-12-15 14:32:57 +00:00
Chris Lattner f4646a7e54 Another step forward in PPC64 JIT support: we no longer need stubs
emitted for external globals in PPC64-JIT-PIC mode (which is good because
we didn't handle them before!).

This also fixes a bug handling the picbase delta, which we would get wrong
in some cases.

llvm-svn: 32451
2006-12-11 23:22:45 +00:00
Jim Laskey 2b136a73bd Missing opcode.
llvm-svn: 32439
2006-12-11 18:45:56 +00:00
Anton Korobeynikov 3b7c257cae Cleaned up the setjmp/longjmp lowering interfaces. Now we're producing the right
code (both asm & cbe) for the Mingw32 target.
Removed autoconf checks for underscored versions of setjmp/longjmp.

llvm-svn: 32415
2006-12-10 23:12:42 +00:00
Chris Lattner d8e7451dc3 Fix i64 uint_to_fp on ppc64
llvm-svn: 32297
2006-12-07 01:24:16 +00:00
Jim Laskey e4f4d048dd Restoration of the stack pointer after deallocation of an alloca was not
updating the SP link.

llvm-svn: 32202
2006-12-04 22:04:42 +00:00
Jim Laskey 1b0bc794e6 1. In ppc64 mode we need only use one GPR.
2. Float values need to be promoted to double when they are vararg.

llvm-svn: 32074
2006-12-01 16:30:47 +00:00
Chris Lattner 09ed0ff2ac Fix the CodeGen/PowerPC/vec_constants.ll regression.
llvm-svn: 32057
2006-12-01 01:45:39 +00:00
Chris Lattner 2f648fc55d Fix bug codegen'ing FP constant vectors with integer splats. Make sure the
created intrinsics have the right integer types.  This fixes
PowerPC/2006-11-29-AltivecFPSplat.ll

llvm-svn: 32024
2006-11-29 19:58:49 +00:00
Jim Laskey 152671f0bf Offset for load of 32-bit arg in 64-bit world was incorrect.
llvm-svn: 32019
2006-11-29 13:37:09 +00:00
Jim Laskey 40182179b6 Remove debug code.
llvm-svn: 31970
2006-11-28 18:27:02 +00:00
Jim Laskey f4e2e009d9 32-bit int space was not accounted for properly in lowerCall.
llvm-svn: 31966
2006-11-28 14:53:52 +00:00
Evan Cheng 20350c4025 Change MachineInstr ctor's to take a TargetInstrDescriptor reference instead
of opcode and number of operands.

llvm-svn: 31947
2006-11-27 23:37:22 +00:00
Chris Lattner 2cca385fbb on ppc64, float arguments take 8-byte stack slots not 4-byte stack slots.
Also, valist should create a pointer RC reg class value, not a GPRC value.

llvm-svn: 31840
2006-11-18 01:57:19 +00:00
Chris Lattner be9377a1e3 convert PPC::BCC to use the 'pred' operand instead of separate predicate
value and CR reg #.  This requires swapping the order of these everywhere
that touches BCC and requires us to write custom matching logic for
PPCcondbranch :(

llvm-svn: 31835
2006-11-17 22:37:34 +00:00
Chris Lattner e0263794f4 rename PPC::COND_BRANCH to PPC::BCC
llvm-svn: 31834
2006-11-17 22:14:47 +00:00
Chris Lattner 8c6a41ea12 start using PPC predicates more consistently.
llvm-svn: 31833
2006-11-17 22:10:59 +00:00
Jim Laskey 48850c10c0 This is a general clean up of the PowerPC ABI. Address several problems and
bugs, including making sure that the TOS links back to the previous frame,
not counting the maximum call frame size twice when using frame
pointers, no longer growing the frame on calls, eliminating the double store
of SP, and a cleaner/faster dynamic alloca.

llvm-svn: 31792
2006-11-16 22:43:37 +00:00
Chris Lattner 474b5b7c95 fix ldu/stu jit encoding. Switch 64-bit preinc load instrs to use memri
addrmodes.

llvm-svn: 31757
2006-11-15 19:55:13 +00:00
Chris Lattner 97ff46b3cc lower "X = seteq Y, Z" to '(shr (ctlz (xor Y, Z)), 5)' instead of
'(shr (ctlz (sub Y, Z)), 5)'.

The use of xor better exposes the operation to bit-twiddling logic in the
dag combiner.  For example, this:

typedef struct {
  unsigned prefix : 4;
  unsigned code : 4;
  unsigned unsigned_p : 4;
} tree_common;

int foo(tree_common *a, tree_common *b) {
  return a->code == b->code;
}

Now compiles to:

_foo:
        lwz r2, 0(r4)
        lwz r3, 0(r3)
        xor r2, r3, r2
        rlwinm r2, r2, 28, 28, 31
        cntlzw r2, r2
        srwi r3, r2, 5
        blr

instead of:

_foo:
        lbz r2, 3(r4)
        lbz r3, 3(r3)
        srwi r2, r2, 4
        srwi r3, r3, 4
        subf r2, r2, r3
        cntlzw r2, r2
        srwi r3, r2, 5
        blr

saving a cycle.

llvm-svn: 31725
2006-11-14 05:28:08 +00:00
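
The trick works because x == y exactly when x ^ y == 0, and PPC's cntlzw returns 32 (and only 32) for a zero input, so shifting the leading-zero count right by 5 yields the boolean directly. A portable sketch of the identity (ctlz32 is a stand-in for cntlzw):

#include <cstdint>

// Leading-zero count of a 32-bit value; returns 32 for v == 0,
// matching PPC's cntlzw.
uint32_t ctlz32(uint32_t v) {
  uint32_t n = 0;
  for (uint32_t bit = 1u << 31; bit != 0 && (v & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// seteq as lowered above: the count is 32 (>>5 == 1) iff x == y,
// and at most 31 (>>5 == 0) otherwise.
int seteq(uint32_t x, uint32_t y) {
  return ctlz32(x ^ y) >> 5;
}
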
Chris Lattner 683712583a minor tweaks, reject vector preinc.
llvm-svn: 31717
2006-11-14 01:38:31 +00:00
Chris Lattner b314b155ed ppc64 doesn't have lwau, don't attempt to form it.
llvm-svn: 31656
2006-11-11 00:08:42 +00:00
Chris Lattner c9fa36d706 implement preinc support for r+i loads on ppc64
llvm-svn: 31654
2006-11-10 23:58:45 +00:00
Chris Lattner ce6455489a add an initial cut at preinc loads for ppc32. This is broken for ppc64
(because the 64-bit reg target versions aren't implemented yet), doesn't
support r+r addr modes, and doesn't handle stores, but it works otherwise. :)

This is disabled unless -enable-ppc-preinc is passed to llc for now.

llvm-svn: 31621
2006-11-10 02:08:47 +00:00
Evan Cheng 36a8fbf771 PPC supports i32 / i64 pre-inc load / store.
llvm-svn: 31599
2006-11-09 19:11:50 +00:00
Evan Cheng b15000736c Rename ISD::MemOpAddrMode to ISD::MemIndexedMode
llvm-svn: 31595
2006-11-09 17:55:04 +00:00
Chris Lattner a801fcedd3 Refactor all the addressing mode selection stuff into the isel lowering
class, where it can be used for preinc formation.

llvm-svn: 31536
2006-11-08 02:15:41 +00:00
Reid Spencer de46e48420 For PR786:
Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting
fallout by removing unused variables. Remaining warnings have to do with
unused functions (I didn't want to delete code without review) and unused
variables in generated code. Maintainers should clean up the remaining
issues when they see them. All changes pass DejaGnu tests and Olden.

llvm-svn: 31380
2006-11-02 20:25:50 +00:00
Chris Lattner 584a11ae22 Implement the getRegForInlineAsmConstraint method for PPC. With recent
sdisel changes, this eliminates a ton of copies around common inline asms.
For example:

int test2(int Y, int X) {
  asm("foo %0, %1" : "=r"(X): "r"(X));
  return X;
}

now compiles to:

_test2:
        foo r3, r4
        blr

instead of:

_test2:
        mr r2, r4
        foo r2, r2
        mr r3, r2
        blr

GCC produces:

_test2:
        foo r4, r4
        mr r3,r4
        blr

llvm-svn: 31367
2006-11-02 01:44:04 +00:00
Chris Lattner 8c6949e5b2 Change the prototype for TargetLowering::isOperandValidForConstraint
llvm-svn: 31318
2006-10-31 19:40:43 +00:00
Evan Cheng 0d41d19427 All targets expand BR_JT for now.
llvm-svn: 31294
2006-10-30 08:02:39 +00:00
Chris Lattner 454436dcc5 set the ppc64 stack pointer right; dynamic alloca now works for ppc64
llvm-svn: 31028
2006-10-18 01:20:43 +00:00
Chris Lattner ab4df83426 Expand alloca for ppc64
llvm-svn: 31027
2006-10-18 01:18:48 +00:00
Evan Cheng ab51cf2e78 Merge ISD::TRUNCSTORE to ISD::STORE. Switch to using StoreSDNode.
llvm-svn: 30945
2006-10-13 21:14:26 +00:00
Evan Cheng e71fe34d75 Reflects ISD::LOAD / ISD::LOADX / LoadSDNode changes.
llvm-svn: 30844
2006-10-09 20:57:25 +00:00
Evan Cheng df9ac47e5e Make use of getStore().
llvm-svn: 30759
2006-10-05 23:01:46 +00:00
Evan Cheng 5d9fd977d3 Combine ISD::EXTLOAD, ISD::SEXTLOAD, ISD::ZEXTLOAD into ISD::LOADX. Add an
extra operand to LOADX to specify the exact value extension type.

llvm-svn: 30714
2006-10-04 00:56:09 +00:00
Chris Lattner 601b86513d Legalize is no longer limited to cleverness with just constant shift amounts.
Allow it to be clever when possible and fall back to the gross code when needed.

This allows us to compile:

long long foo1(long long X, int C) {
  return X << (C|32);
}
long long foo2(long long X, int C) {
  return X << (C&~32);
}

to:
_foo1:
        rlwinm r2, r5, 0, 27, 31
        slw r3, r4, r2
        li r4, 0
        blr


        .globl  _foo2
        .align  4
_foo2:
        rlwinm r2, r5, 0, 27, 25
        subfic r5, r2, 32
        slw r3, r3, r2
        srw r5, r4, r5
        or r3, r3, r5
        slw r4, r4, r2
        blr

instead of:

_foo1:
        ori r2, r5, 32
        subfic r5, r2, 32
        addi r6, r2, -32
        srw r5, r4, r5
        slw r3, r3, r2
        slw r6, r4, r6
        or r3, r3, r5
        slw r4, r4, r2
        or r3, r3, r6
        blr


        .globl  _foo2
        .align  4
_foo2:
        rlwinm r2, r5, 0, 27, 25
        subfic r5, r2, 32
        addi r6, r2, -32
        srw r5, r4, r5
        slw r3, r3, r2
        slw r6, r4, r6
        or r3, r3, r5
        slw r4, r4, r2
        or r3, r3, r6
        blr

llvm-svn: 30507
2006-09-20 03:47:40 +00:00
Chris Lattner 3c48ea54ee Fold the PPCISD shifts when presented with 0 inputs. This occurs for code
like:
long long test(long long X, int Y) {
  return 1ULL << Y;
}
long long test2(long long X, int Y) {
  return -1LL << Y;
}

which we used to compile to:

_test:
        li r2, 1
        subfic r3, r5, 32
        li r4, 0
        addi r6, r5, -32
        srw r3, r2, r3
        slw r4, r4, r5
        slw r6, r2, r6
        or r3, r4, r3
        slw r4, r2, r5
        or r3, r3, r6
        blr
_test2:
        li r2, -1
        subfic r3, r5, 32
        addi r6, r5, -32
        srw r3, r2, r3
        slw r4, r2, r5
        slw r2, r2, r6
        or r3, r4, r3
        or r3, r3, r2
        blr

Now we produce:

_test:
        li r2, 1
        addi r3, r5, -32
        subfic r4, r5, 32
        slw r3, r2, r3
        srw r4, r2, r4
        or r3, r4, r3
        slw r4, r2, r5
        blr
_test2:
        li r2, -1
        subfic r3, r5, 32
        addi r6, r5, -32
        srw r3, r2, r3
        slw r4, r2, r5
        slw r2, r2, r6
        or r3, r4, r3
        or r3, r3, r2
        blr

llvm-svn: 30479
2006-09-19 05:22:59 +00:00
Evan Cheng 9a083a4121 Reflects MachineConstantPoolEntry changes.
llvm-svn: 30279
2006-09-12 21:04:05 +00:00
Reid Spencer e7141c8be6 For PR387:
Close out this long-standing bug by removing the remaining overloaded
virtual functions in LLVM. The -Woverloaded-virtual option is now turned on.

llvm-svn: 29934
2006-08-28 01:02:49 +00:00
Chris Lattner 095e4ad2ea Fix a bug in a recent refactoring that broke a bunch of stuff.
llvm-svn: 29649
2006-08-12 07:20:05 +00:00
Chris Lattner ed728e8dc9 Eliminate use of getNode that takes a vector.
llvm-svn: 29614
2006-08-11 17:38:39 +00:00
Chris Lattner d66f14e846 Convert vectors to fixed-size arrays and SmallVectors. Eliminate use of getNode that takes a vector.
llvm-svn: 29609
2006-08-11 17:18:05 +00:00
Chris Lattner 66f1fbaaad Fix miscompilation of float vector returns. Compile code to this:
_func:
        vsldoi v2, v3, v2, 12
        vsldoi v2, v2, v2, 4
        blr

instead of:

_func:
        vsldoi v2, v3, v2, 12
        vsldoi v2, v2, v2, 4
***     vor f1, v2, v2
        blr

llvm-svn: 29607
2006-08-11 16:47:32 +00:00
Chris Lattner 8298265042 Fix some ppc64 issues with vector code.
llvm-svn: 29384
2006-07-28 16:45:47 +00:00
Chris Lattner 9e56e5c003 Rename RelocModel::PIC to PIC_, to avoid conflicts with -DPIC.
llvm-svn: 29307
2006-07-26 21:12:04 +00:00
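
The underscore avoids a preprocessor collision: if -DPIC is in effect when the headers are compiled, the macro is substituted into the enumerator list before the compiler ever sees it. Roughly, with a hypothetical RelocModel enum standing in for the real definition:

// With -DPIC in effect, the macro rewrites the enumerator before the
// compiler sees it, and the declaration no longer parses:
//   #define PIC 1                      // what -DPIC amounts to
//   enum RelocModel { Static, PIC };   // expands to { Static, 1 }: error
//
// The trailing underscore keeps the identifier out of the macro's reach:
enum RelocModel { Static, PIC_ };
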
Chris Lattner a7976d329e Implement Regression/CodeGen/PowerPC/bswap-load-store.ll by folding bswaps
into i16/i32 load/stores.

llvm-svn: 29089
2006-07-10 20:56:58 +00:00
Chris Lattner 8aed3cc46b Implement 64-bit select, bswap, etc.
llvm-svn: 28935
2006-06-27 20:14:52 +00:00
Chris Lattner a07410c95b PPC doesn't have bit converts to/from i64
llvm-svn: 28932
2006-06-27 18:40:08 +00:00
Chris Lattner d48ce27532 Implement 64-bit undef, sub, shl/shr, srem/urem
llvm-svn: 28929
2006-06-27 18:18:41 +00:00
Chris Lattner cb5a84f446 Use i32 for shift amounts instead of i64. This gets bisort working.
llvm-svn: 28927
2006-06-27 17:34:57 +00:00
Chris Lattner 97b3da1519 Implement a bunch of 64-bit cleanliness work. With this, treeadd builds (but
doesn't work right).

llvm-svn: 28921
2006-06-27 00:04:13 +00:00
Chris Lattner ec78cade34 Improve PPC64 calling convention support
llvm-svn: 28919
2006-06-26 22:48:35 +00:00
Chris Lattner dc38e6f322 Correct returns of 64-bit values, though they seemed to work before...
llvm-svn: 28892
2006-06-21 00:34:03 +00:00
Chris Lattner a5190ae7a9 fix some assumptions that pointers can only be 32 bits.  With this, we can
now compile:

static unsigned long X;
void test1() {
  X = 0;
}

into:

_test1:
        lis r2, ha16(_X)
        li r3, 0
        stw r3, lo16(_X)(r2)
        blr

Totally amazing :)

llvm-svn: 28839
2006-06-16 21:01:35 +00:00
Chris Lattner a35f306740 Rename some subtarget features. A CPU can now *have* 64-bit instructions,
and in 32-bit mode we can choose to optionally *use* 64-bit registers.

llvm-svn: 28824
2006-06-16 17:34:12 +00:00
Evan Cheng 94bb93f8f7 Type of extract_element index operand should be iPTR.
llvm-svn: 28797
2006-06-15 08:18:06 +00:00
Chris Lattner 006b2c6ab9 Fix a problem exposed by the local allocator. CALL instructions are not marked
as using incoming argument registers, so the local allocator would clobber them
between their set and use.  To fix this, we give the call instructions a variable
number of uses in the CALL MachineInstr itself, so live variable analysis understands
the live ranges of these register arguments.

llvm-svn: 28744
2006-06-10 01:14:28 +00:00
Chris Lattner b9342afa56 Always reserve space for 8 spilled GPRs. GCC apparently assumes that this
space will be available, even if the callee isn't varargs.

llvm-svn: 28571
2006-05-30 21:21:04 +00:00
Evan Cheng a3add0fea8 Change the RET node to include signedness information for the return values, i.e.
RET chain, value1, sign1, value2, sign2, ...

llvm-svn: 28510
2006-05-26 23:10:12 +00:00
Evan Cheng c2cd473d9b CALL node change (arg / sign pairs instead of just arguments).
llvm-svn: 28462
2006-05-25 00:57:32 +00:00
Chris Lattner aa2372562e Patches to make the LLVM sources more -pedantic clean. Patch provided
by Anton Korobeynikov!  This is a step towards closing PR786.

llvm-svn: 28447
2006-05-24 17:04:05 +00:00
Chris Lattner 33165c246c Fix CodeGen/Generic/vector.ll:test_div with altivec.
llvm-svn: 28445
2006-05-24 00:15:25 +00:00
Chris Lattner b56d22c2f6 Handle SETO* like we handle SET*, restoring behavior after Evan's setcc
change.  This fixes PowerPC/fnegsel.ll.

llvm-svn: 28443
2006-05-24 00:06:44 +00:00
Chris Lattner eb755fc1b3 Make PPC call lowering more aggressive, making the isel matching code simple
enough to be autogenerated.

llvm-svn: 28354
2006-05-17 19:00:46 +00:00
Chris Lattner b1e9e37c58 Switch PPC over to a call-selection model where the lowering code creates
the copyto/fromregs instead of making the PPCISD::CALL selection code create
them.  This vastly simplifies the selection code, and moves the ABI handling
parts into one place.

llvm-svn: 28346
2006-05-17 06:01:33 +00:00
Chris Lattner b7552a88d6 3 changes, 2 of which are cleanup and one of which changes codegen:
1. Rearrange code a bit so that the special case doesn't require indenting lots
   of code.
2. Add comments describing the PPC calling convention.
3. Only round up to 56 bytes of stack space for an outgoing call if the callee
   is varargs.  This saves a bit of stack space.

llvm-svn: 28342
2006-05-17 00:15:40 +00:00
Chris Lattner f058f5aef1 implement passing/returning vector regs to calls, at least non-varargs calls.
llvm-svn: 28341
2006-05-16 23:54:25 +00:00
Chris Lattner aa40ec1b32 Instead of implementing LowerCallTo directly, let the default impl produce an
ISD::CALL node, then custom lower that.  This means that we only have to handle
LEGAL call operands/results, not every possible type.  This allows us to
simplify the call code, shrinking it by about 1/3.

llvm-svn: 28339
2006-05-16 22:56:08 +00:00
Chris Lattner 26e2fcd8b1 Simplify the argument counting logic by only incrementing the index.
llvm-svn: 28335
2006-05-16 18:58:15 +00:00
Chris Lattner 76c47b50e7 Simplify the dead argument handling code.
llvm-svn: 28334
2006-05-16 18:54:32 +00:00
Chris Lattner 318f0d2122 Vector args passed in registers don't reserve stack space.
llvm-svn: 28333
2006-05-16 18:51:52 +00:00
Chris Lattner 4302e8fb67 Switch the PPC backend over to using FORMAL_ARGUMENTS for formal argument
handling.  This makes the argument-lowering code significantly simpler (we
only need to handle legal argument types).

Incidentally, this also implements support for vector argument registers,
so long as they are not on the stack.

llvm-svn: 28331
2006-05-16 18:18:50 +00:00
Chris Lattner d2ca9abf57 Fit in 80 cols
llvm-svn: 28311
2006-05-16 04:20:24 +00:00
Chris Lattner ae48a894b1 Remove dead var, fix bad override.
llvm-svn: 28264
2006-05-12 21:09:57 +00:00
Chris Lattner 84b49d51be Fix CodeGen/Generic/2006-04-28-Sign-extend-bool.ll
llvm-svn: 28017
2006-04-28 21:56:10 +00:00
Nate Begeman 4ca2ea5b43 JumpTable support! This provides working asm and JIT support for
x86 and ppc for 100% dense switch statements when relocations are non-PIC.
This support will be extended and enhanced in the coming days to support
PIC, and less dense forms of jump tables.

llvm-svn: 27947
2006-04-22 18:53:45 +00:00
Chris Lattner 518834c67e Fix a crash on:
void foo2(vector float *A, vector float *B) {
  vector float C = (vector float)vec_cmpeq(*A, *B);
  if (!vec_any_eq(*A, *B))
    *B = (vector float){0,0,0,0};
  *A = C;
}

llvm-svn: 27808
2006-04-18 18:28:22 +00:00
Chris Lattner 1e174c87c3 pretty print node name
llvm-svn: 27806
2006-04-18 18:05:58 +00:00
Chris Lattner 9754d142a4 Implement an important entry from README_ALTIVEC:
If an altivec predicate compare is used immediately by a branch, don't
use a (serializing) MFCR instruction to read the CR6 register, which requires
a compare to get it back to CR's.  Instead, just branch on CR6 directly. :)

For example, for:
void foo2(vector float *A, vector float *B) {
  if (!vec_any_eq(*A, *B))
    *B = (vector float){0,0,0,0};
}

We now generate:

_foo2:
        mfspr r2, 256
        oris r5, r2, 12288
        mtspr 256, r5
        lvx v2, 0, r4
        lvx v3, 0, r3
        vcmpeqfp. v2, v3, v2
        bne cr6, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
        vxor v2, v2, v2
        stvx v2, 0, r4
        mtspr 256, r2
        blr
LBB1_2: ; UnifiedReturnBlock
        mtspr 256, r2
        blr

instead of:

_foo2:
        mfspr r2, 256
        oris r5, r2, 12288
        mtspr 256, r5
        lvx v2, 0, r4
        lvx v3, 0, r3
        vcmpeqfp. v2, v3, v2
        mfcr r3, 2
        rlwinm r3, r3, 27, 31, 31
        cmpwi cr0, r3, 0
        beq cr0, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
        vxor v2, v2, v2
        stvx v2, 0, r4
        mtspr 256, r2
        blr
LBB1_2: ; UnifiedReturnBlock
        mtspr 256, r2
        blr

This implements CodeGen/PowerPC/vec_br_cmp.ll.

llvm-svn: 27804
2006-04-18 17:59:36 +00:00
Chris Lattner 96d50487c9 Use vmladduhm to do v8i16 multiplies, which is faster and simpler than doing
even/odd halves.  Thanks to Nate for telling me what's what.

llvm-svn: 27793
2006-04-18 04:28:57 +00:00
Chris Lattner d6d82aa889 Implement v16i8 multiply with this code:
vmuloub v5, v3, v2
        vmuleub v2, v3, v2
        vperm v2, v2, v5, v4

This implements CodeGen/PowerPC/vec_mul.ll.  With this, v16i8 multiplies are
6.79x faster than before.

Overall, UnitTests/Vector/multiplies.c is now 2.45x faster with LLVM than with
GCC.

Remove the 'integer multiplies' todo from the README file.

llvm-svn: 27792
2006-04-18 03:57:35 +00:00
Chris Lattner 7e439874cb Lower v8i16 multiply into this code:
li r5, lo16(LCPI1_0)
        lis r6, ha16(LCPI1_0)
        lvx v4, r6, r5
        vmulouh v5, v3, v2
        vmuleuh v2, v3, v2
        vperm v2, v2, v5, v4

where v4 is:
LCPI1_0:                                        ;  <16 x ubyte>
        .byte   2
        .byte   3
        .byte   18
        .byte   19
        .byte   6
        .byte   7
        .byte   22
        .byte   23
        .byte   10
        .byte   11
        .byte   26
        .byte   27
        .byte   14
        .byte   15
        .byte   30
        .byte   31

This is 5.07x faster on the G5 (measured) than lowering to scalar code +
loads/stores.

llvm-svn: 27789
2006-04-18 03:43:48 +00:00
Chris Lattner a2cae1bb10 Custom lower v4i32 multiplies into a cute sequence, instead of having legalize
scalarize the sequence into 4 mullw's and a bunch of load/store traffic.

This speeds up v4i32 multiplies 4.1x (measured) on a G5.  This implements
PowerPC/vec_mul.ll

llvm-svn: 27788
2006-04-18 03:24:30 +00:00
Chris Lattner e54133cfba Make sure to check splats of every constant we can, handle splat(31) by
being a bit more clever, add support for odd splats from -31 to -17.

llvm-svn: 27764
2006-04-17 18:09:22 +00:00
Chris Lattner 264c908e3a Teach the ppc backend to use rol and vsldoi to generate splatted constants.
This implements vec_constants.ll:test_vsldoi and test_rol

llvm-svn: 27760
2006-04-17 17:55:10 +00:00
Chris Lattner 1b3806ace5 Make some code more general, adding support for constant formation of several
new patterns.

llvm-svn: 27754
2006-04-17 06:58:41 +00:00
Chris Lattner f8dd76df5b Learn how to make odd splatted constants in range [17,29]. This implements
PowerPC/vec_constants.ll:test_29.

llvm-svn: 27752
2006-04-17 06:07:44 +00:00
Chris Lattner 2a099c04c1 Pull some code out into a helper function.
Efficiently codegen even splats in the range [-32,30].

This allows us to codegen <30,30,30,30> as:

        vspltisw v0, 15
        vadduwm v2, v0, v0

instead of as a cp load.

llvm-svn: 27750
2006-04-17 06:00:21 +00:00
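
The trick: vspltisw's immediate is a 5-bit signed value in [-16,15], so any even splat value v in [-32,30] is reachable as half plus half. A hedged sketch of the selection logic (SplatPlan and planSplat are hypothetical names, not the commit's helper):

#include <cassert>

// Materialize splat(v): a single vspltisw when the value fits the
// 5-bit immediate, otherwise vspltisw(v/2) doubled with vadduwm.
struct SplatPlan { int Imm; bool Doubled; };

SplatPlan planSplat(int v) {
  if (v >= -16 && v <= 15)
    return {v, false};                 // vspltisw Vd, v
  assert(v % 2 == 0 && v >= -32 && v <= 30);
  return {v / 2, true};                // vspltisw Vt, v/2; vadduwm Vd, Vt, Vt
}
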
Chris Lattner 071ad01ceb Implement a TODO: for any shuffle that can be viewed as a v4[if]32 shuffle,
if it can be implemented in 3 or fewer discrete altivec instructions, codegen
it as such.  This implements Regression/CodeGen/PowerPC/vec_perf_shuffle.ll

llvm-svn: 27748
2006-04-17 05:28:54 +00:00
Chris Lattner 06a21ba96b Implement a TODO: have the legalizer canonicalize a bunch of operations to
one type (v4i32) so that we don't have to write patterns for each type, and
so that more CSE opportunities are exposed.

llvm-svn: 27731
2006-04-16 01:37:57 +00:00
Chris Lattner fa5aa396c2 Make the BUILD_VECTOR lowering code much more aggressive w.r.t constant vectors.
Remove some done items from the todo list.

llvm-svn: 27729
2006-04-16 01:01:29 +00:00
Chris Lattner 24acbe46c0 Fix a crash when faced with a shuffle vector that has an undef in its mask.
llvm-svn: 27726
2006-04-15 23:48:05 +00:00
Chris Lattner 559c8ba466 Allow undef in a shuffle mask
llvm-svn: 27714
2006-04-14 23:19:08 +00:00
Chris Lattner 4211ca9108 Move the rest of the PPCTargetLowering::LowerOperation cases out into
separate functions, for simplicity and code clarity.

llvm-svn: 27693
2006-04-14 06:01:58 +00:00
Chris Lattner 19e9055eb5 Pull the VECTOR_SHUFFLE and BUILD_VECTOR lowering code out into separate
functions, which makes the code much cleaner :)

llvm-svn: 27692
2006-04-14 05:19:18 +00:00
Chris Lattner 883fb053bd Force non-darwin targets to use a static relo model. This fixes PR734,
tested by CodeGen/Generic/vector.ll

llvm-svn: 27657
2006-04-13 17:10:48 +00:00
Chris Lattner 147e50e1c5 Add a new way to match vector constants, which makes it easier to bang bits of
different types.

Codegen spltw(0x7FFFFFFF) and spltw(0x80000000) without a constant pool load,
implementing PowerPC/vec_constants.ll:test1.  This compiles:

typedef float vf __attribute__ ((vector_size (16)));
typedef int vi __attribute__ ((vector_size (16)));
void test(vi *P1, vi *P2, vf *P3) {
  *P1 &= (vi){0x80000000,0x80000000,0x80000000,0x80000000};
  *P2 &= (vi){0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF};
  *P3 = vec_abs((vector float)*P3);
}

to:

_test:
        mfspr r2, 256
        oris r6, r2, 49152
        mtspr 256, r6
        vspltisw v0, -1
        vslw v0, v0, v0
        lvx v1, 0, r3
        vand v1, v1, v0
        stvx v1, 0, r3
        lvx v1, 0, r4
        vandc v1, v1, v0
        stvx v1, 0, r4
        lvx v1, 0, r5
        vandc v0, v1, v0
        stvx v0, 0, r5
        mtspr 256, r2
        blr

instead of (with two constant pool entries):

_test:
        mfspr r2, 256
        oris r6, r2, 49152
        mtspr 256, r6
        li r6, lo16(LCPI1_0)
        lis r7, ha16(LCPI1_0)
        li r8, lo16(LCPI1_1)
        lis r9, ha16(LCPI1_1)
        lvx v0, r7, r6
        lvx v1, 0, r3
        vand v0, v1, v0
        stvx v0, 0, r3
        lvx v0, r9, r8
        lvx v1, 0, r4
        vand v1, v1, v0
        stvx v1, 0, r4
        lvx v1, 0, r5
        vand v0, v1, v0
        stvx v0, 0, r5
        mtspr 256, r2
        blr

GCC produces (with 2 cp entries):

_test:
        mfspr r0,256
        stw r0,-4(r1)
        oris r0,r0,0xc00c
        mtspr 256,r0
        lis r2,ha16(LC0)
        lis r9,ha16(LC1)
        la r2,lo16(LC0)(r2)
        lvx v0,0,r3
        lvx v1,0,r5
        la r9,lo16(LC1)(r9)
        lwz r12,-4(r1)
        lvx v12,0,r2
        lvx v13,0,r9
        vand v0,v0,v12
        stvx v0,0,r3
        vspltisw v0,-1
        vslw v12,v0,v0
        vandc v1,v1,v12
        stvx v1,0,r5
        lvx v0,0,r4
        vand v0,v0,v13
        stvx v0,0,r4
        mtspr 256,r12
        blr

llvm-svn: 27624
2006-04-12 19:07:14 +00:00
Chris Lattner 74cf9ff761 Rename get_VSPLI_elt -> get_VSPLTI_elt
Canonicalize BUILD_VECTOR's that match VSPLTI's into a single type for each
form, eliminating a bunch of Pat patterns in the .td file and allowing us to
CSE stuff more aggressively.  This implements
PowerPC/buildvec_canonicalize.ll:VSPLTI

llvm-svn: 27614
2006-04-12 17:37:20 +00:00
Chris Lattner e318a7574e Ensure that zero vectors are always v4i32, which forces them to CSE with
each other.  This implements CodeGen/PowerPC/vxor-canonicalize.ll

llvm-svn: 27609
2006-04-12 16:53:28 +00:00
Chris Lattner e4db08a2f1 Vector function results go into V2 according to GCC. The darwin ABI doc
doesn't say where they go :-/

llvm-svn: 27579
2006-04-11 01:38:39 +00:00
Chris Lattner 92533cfb4a Move some return-handling code from LowerArguments to the ISD::RET handling stuff.
No functionality change.

llvm-svn: 27577
2006-04-11 01:21:43 +00:00
Chris Lattner 3a68f3c3ca properly mark vector selects as expanded to select_cc
llvm-svn: 27544
2006-04-08 22:59:15 +00:00
Chris Lattner 0a3d1bbca4 Add VRRC select support
llvm-svn: 27543
2006-04-08 22:45:08 +00:00
Chris Lattner d9e80f4516 Implement PowerPC/CodeGen/vec_splat.ll:spltish to use vspltish instead of a
constant pool load.

llvm-svn: 27538
2006-04-08 07:14:26 +00:00
Chris Lattner d71a1f946d Change the interface to the predicate that determines if vsplti* can be used.
No functionality changes.

llvm-svn: 27536
2006-04-08 06:46:53 +00:00
Chris Lattner 466841ddc7 Make sure to return the result in the right type.
llvm-svn: 27469
2006-04-06 23:12:19 +00:00
Chris Lattner a4bbfaed5c Match vpku[hw]um(x,x).
Convert vsldoi(x,x) to work the same way other (x,x) cases work.

llvm-svn: 27467
2006-04-06 22:28:36 +00:00
Chris Lattner f38e033270 Add support for matching vmrg(x,x) patterns
llvm-svn: 27463
2006-04-06 22:02:42 +00:00
Chris Lattner d1dcb52093 Pattern match vmrg* instructions, which are now lowered by the CFE into shuffles.
llvm-svn: 27457
2006-04-06 21:11:54 +00:00
Chris Lattner 1d33819194 Support pattern matching vsldoi(x,y) and vsldoi(x,x), which allows the front end to
lower it and LLVM to have one fewer intrinsic.  This implements
CodeGen/PowerPC/vec_shuffle.ll

llvm-svn: 27450
2006-04-06 18:26:28 +00:00