Chris Lattner
dc38e6f322
Correct returns of 64-bit values, though they seemed to work before...
...
llvm-svn: 28892
2006-06-21 00:34:03 +00:00
Chris Lattner
1f1b096142
Make these predicates correct in 64-bit mode too.
...
llvm-svn: 28890
2006-06-20 23:21:20 +00:00
Chris Lattner
52a956da52
Rename OR4 -> OR. Move some PPC64-specific stuff to the 64-bit file
...
llvm-svn: 28889
2006-06-20 23:18:58 +00:00
Chris Lattner
5705d4d519
remove unused flag
...
llvm-svn: 28888
2006-06-20 23:15:07 +00:00
Chris Lattner
9d65f3507e
add some logical ops
...
llvm-svn: 28887
2006-06-20 23:11:59 +00:00
Chris Lattner
7a856a6d88
remove some unused patterns
...
llvm-svn: 28886
2006-06-20 23:11:36 +00:00
Chris Lattner
d881f8257b
Add some more immediate patterns. This allows us to compile:
...
void test6() {
Y = 0xABCD0123BCDE4567;
}
into:
_test6:
lis r2, -21555
lis r3, ha16(_Y)
ori r2, r2, 291
rldicr r2, r2, 32, 31
oris r2, r2, 48350
ori r2, r2, 17767
std r2, lo16(_Y)(r3)
blr
llvm-svn: 28885
2006-06-20 23:03:01 +00:00
Chris Lattner
9834ad2fc6
Instead of li/xoris use li/oris. Note that this doesn't work if bit 15 is
...
set, so disable the pattern in that case.
llvm-svn: 28884
2006-06-20 22:38:59 +00:00
Chris Lattner
7e742e46ac
Add some 64-bit logical ops.
...
Split imm16Shifted into a sext/zext form for 64-bit support.
Add some patterns for immediate formation. For example, we now compile this:
static unsigned long long Y;
void test3() {
Y = 0xF0F00F00;
}
into:
_test3:
li r2, 3840
lis r3, ha16(_Y)
xoris r2, r2, 61680
std r2, lo16(_Y)(r3)
blr
GCC produces:
_test3:
li r0,0
lis r2,ha16(_Y)
ori r0,r0,61680
sldi r0,r0,16
ori r0,r0,3840
std r0,lo16(_Y)(r2)
blr
llvm-svn: 28883
2006-06-20 22:34:10 +00:00
Chris Lattner
d6e160d14d
64-bit bugfix: 0xFFFF0000 cannot be formed with a single lis.
...
llvm-svn: 28880
2006-06-20 21:39:30 +00:00
Chris Lattner
2d4e8f7e86
Add some patterns for globals, so we can now compile this:
...
static unsigned long long X, Y;
void test1() {
X = Y;
}
into:
_test1:
lis r2, ha16(_Y)
lis r3, ha16(_X)
ld r2, lo16(_Y)(r2)
std r2, lo16(_X)(r3)
blr
llvm-svn: 28879
2006-06-20 21:23:06 +00:00
Chris Lattner
868a75bec6
Remove some now-unneeded casts from instruction patterns. With the casts
...
removed, tblgen produces identical output to with them in.
llvm-svn: 28867
2006-06-20 00:39:56 +00:00
Chris Lattner
94d18df658
Add some patterns for ppc64
...
llvm-svn: 28866
2006-06-20 00:38:36 +00:00
Chris Lattner
49cadab385
Implement the getPointerRegClass method, which is required for the ptr_rc
...
magic to work.
llvm-svn: 28847
2006-06-17 00:01:04 +00:00
Chris Lattner
638ee4ee15
Upgrade some load/store instructions to use the proper addressing mode stuff.
...
llvm-svn: 28841
2006-06-16 21:29:41 +00:00
Chris Lattner
e8fe5e2bf4
In 64-bit mode, addr mode operands use G8RC instead of GPRC.
...
llvm-svn: 28840
2006-06-16 21:29:03 +00:00
Chris Lattner
a5190ae7a9
fix some assumptions that pointers can only be 32-bits. With this, we can
...
now compile:
static unsigned long X;
void test1() {
X = 0;
}
into:
_test1:
lis r2, ha16(_X)
li r3, 0
stw r3, lo16(_X)(r2)
blr
Totally amazing :)
llvm-svn: 28839
2006-06-16 21:01:35 +00:00
Chris Lattner
b429983988
Split 64-bit instructions out into a separate .td file
...
llvm-svn: 28838
2006-06-16 20:22:01 +00:00
Chris Lattner
61d703183e
Force 64-bit register availability in 64-bit mode. For real.
...
llvm-svn: 28837
2006-06-16 20:05:06 +00:00
Chris Lattner
a7d9db2fa5
Remove the -darwin and -aix llc options, inferring darwinism and aixism from
...
the target triple & subtarget info. woo.
llvm-svn: 28835
2006-06-16 18:50:48 +00:00
Chris Lattner
f3b5b92e58
Don't pass target name into TargetData anymore, it is never used or needed.
...
Remove explicit casts to std::string now that there is no overload resolution
issues in the TargetData ctors.
llvm-svn: 28830
2006-06-16 18:22:52 +00:00
Chris Lattner
16682fff2b
Document the subtarget features better, make sure that 64-bit mode, 64-bit
...
support, and 64-bit register use are all consistent with each other.
Add a new "IsPPC" feature, to distinguish ppc32 vs ppc64 targets, use this
to configure TargetData differently. This not makes ppc64 blow up on lots
of stuff :)
llvm-svn: 28825
2006-06-16 17:50:12 +00:00
Chris Lattner
a35f306740
Rename some subtarget features. A CPU now can *have* 64-bit instructions,
...
can in 32-bit mode we can choose to optionally *use* 64-bit registers.
llvm-svn: 28824
2006-06-16 17:34:12 +00:00
Chris Lattner
0c4aa14deb
First baby step towards ppc64 support. This adds a new -march=ppc64 backend
...
that is currently just like ppc32 :)
llvm-svn: 28813
2006-06-16 01:37:27 +00:00
Jim Laskey
19f964e048
1. Support standard dwarf format (was bootstrapping in Apple format.)
...
2. Add vector support.
llvm-svn: 28807
2006-06-15 20:51:43 +00:00
Evan Cheng
94bb93f8f7
Type of extract_element index operand should be iPTR.
...
llvm-svn: 28797
2006-06-15 08:18:06 +00:00
Jim Laskey
f67bec0579
Place dwarf headers at earliest possible point. Well behaved when skipping
...
functions.
llvm-svn: 28781
2006-06-14 11:35:03 +00:00
Chris Lattner
ac59ab515a
Gaar! Don't use r11 for CR save/restore, use R0. R11 can be register
...
allocated, thus live across the save/reload. This fixes
llc-beta /MultiSource/Applications/spiff/spiff
llc-beta /MultiSource/Benchmarks/sim/sim:
llc-beta /MultiSource/Benchmarks/Ptrdist/bc/bc
llc-beta /MultiSource/Benchmarks/McCat/12-IOtest/iotest:
llc-beta /MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow
llc-beta /MultiSource/Benchmarks/Fhourstones-3.1/fhourstones3.1
llc-beta /MultiSource/Benchmarks/mediabench/adpcm/rawdaudio/rawdaudio
llc-beta /MultiSource/Benchmarks/mediabench/adpcm/rawcaudio/rawcaudio
llc-beta /MultiSource/Benchmarks/mediabench/g721/g721encode/encode
llc-beta /MultiSource/Benchmarks/mediabench/jpeg/jpeg-6a/cjpeg
and probably others, with -regalloc=local.
llvm-svn: 28761
2006-06-12 23:59:16 +00:00
Chris Lattner
6b043a24a1
Fix spilling and reloading of CR regs to reload the right values. This fixes
...
Olden/power (and probably others) with -regalloc=local.
llvm-svn: 28760
2006-06-12 21:50:57 +00:00
Chris Lattner
b055c8737f
Work around a nasty tblgen bug where it doesn't add operands for varargs
...
nodes correctly.
llvm-svn: 28745
2006-06-10 01:15:02 +00:00
Chris Lattner
006b2c6ab9
Fix a problem exposed by the local allocator. CALL instructions are not marked
...
as using incoming argument registers, so the local allocator would clobber them
between their set and use. To fix this, we give the call instructions a variable
number of uses in the CALL MachineInstr itself, so live variables understands
the live ranges of these register arguments.
llvm-svn: 28744
2006-06-10 01:14:28 +00:00
Chris Lattner
c8587d4b81
Add PowerPC intrinsics to support dcbz[l]
...
llvm-svn: 28696
2006-06-06 21:29:23 +00:00
Chris Lattner
4442a70b3a
Silence -pedantic warning
...
llvm-svn: 28633
2006-06-01 17:17:06 +00:00
Chris Lattner
b9342afa56
Always reserve space for 8 spilled GPRs. GCC apparently assumes that this
...
space will be available, even if the callee isn't varargs.
llvm-svn: 28571
2006-05-30 21:21:04 +00:00
Evan Cheng
a3add0fea8
Change RET node to include signness information of the return values. i.e.
...
RET chain, value1, sign1, value2, sign2, ...
llvm-svn: 28510
2006-05-26 23:10:12 +00:00
Chris Lattner
1fbb0d38c7
Fix build failure of povray
...
llvm-svn: 28473
2006-05-25 18:06:16 +00:00
Chris Lattner
630bbcef8d
Fix Benchmarks/MallocBench/cfrac
...
llvm-svn: 28471
2006-05-25 16:54:16 +00:00
Evan Cheng
c2cd473d9b
CALL node change (arg / sign pairs instead of just arguments).
...
llvm-svn: 28462
2006-05-25 00:57:32 +00:00
Evan Cheng
4af59dac0b
Assert if InflightSet is not cleared after instruction selecting a BB.
...
llvm-svn: 28459
2006-05-25 00:24:28 +00:00
Evan Cheng
1a8e74d113
Clear HandleMap and ReplaceMap after instruction selection. Or it may cause
...
non-deterministic behavior.
llvm-svn: 28454
2006-05-24 20:46:25 +00:00
Chris Lattner
aa2372562e
Patches to make the LLVM sources more -pedantic clean. Patch provided
...
by Anton Korobeynikov! This is a step towards closing PR786.
llvm-svn: 28447
2006-05-24 17:04:05 +00:00
Chris Lattner
33165c246c
Fix CodeGen/Generic/vector.ll:test_div with altivec.
...
llvm-svn: 28445
2006-05-24 00:15:25 +00:00
Chris Lattner
b56d22c2f6
Handle SETO* like we handle SET*, restoring behavior after Evan's setcc
...
change. This fixes PowerPC/fnegsel.ll.
llvm-svn: 28443
2006-05-24 00:06:44 +00:00
Owen Anderson
80b1b4d41e
Make TargetData strings less redundant.
...
llvm-svn: 28423
2006-05-20 23:28:54 +00:00
Owen Anderson
88812b5c0a
Make all of the TargetMachine subclasses use the new string TargetData methods.
...
This is part of the on-going work on PR 761.
llvm-svn: 28414
2006-05-20 00:24:56 +00:00
Evan Cheng
305c49579c
getCalleeSaveRegs and getCalleeSaveRegClasses are no long TableGen'd.
...
llvm-svn: 28378
2006-05-18 00:12:58 +00:00
Evan Cheng
dcec882286
Remove PointerType from class Target
...
llvm-svn: 28368
2006-05-17 21:20:27 +00:00
Chris Lattner
6353807fdc
Add a note about a note
...
llvm-svn: 28355
2006-05-17 19:02:25 +00:00
Chris Lattner
eb755fc1b3
Make PPC call lowering more aggressive, making the isel matching code simple
...
enough to be autogenerated.
llvm-svn: 28354
2006-05-17 19:00:46 +00:00
Chris Lattner
b1e9e37c58
Switch PPC over to a call-selection model where the lowering code creates
...
the copyto/fromregs instead of making the PPCISD::CALL selection code create
them. This vastly simplifies the selection code, and moves the ABI handling
parts into one place.
llvm-svn: 28346
2006-05-17 06:01:33 +00:00
Chris Lattner
b7552a88d6
3 changes, 2 of which are cleanup one of which changes codegen:
...
1. Rearrange code a bit so that the special case doesn't require indenting lots
of code.
2. Add comments describing PPC calling convention.
3. Only round up to 56-bytes of stack space for an outgoing call if the callee
is varargs. This saves a bit of stack space.
llvm-svn: 28342
2006-05-17 00:15:40 +00:00
Chris Lattner
f058f5aef1
implement passing/returning vector regs to calls, at least non-varargs calls.
...
llvm-svn: 28341
2006-05-16 23:54:25 +00:00
Chris Lattner
aa40ec1b32
Instead of implementing LowerCallTo directly, let the default impl produce an
...
ISD::CALL node, then custom lower that. This means that we only have to handle
LEGAL call operands/results, not every possible type. This allows us to
simplify the call code, shrinking it by about 1/3.
llvm-svn: 28339
2006-05-16 22:56:08 +00:00
Chris Lattner
26e2fcd8b1
Simplify the argument counting logic by only incrementing the index.
...
llvm-svn: 28335
2006-05-16 18:58:15 +00:00
Chris Lattner
76c47b50e7
Simplify the dead argument handling code.
...
llvm-svn: 28334
2006-05-16 18:54:32 +00:00
Chris Lattner
318f0d2122
Vector args passed in registers don't reserve stack space.
...
llvm-svn: 28333
2006-05-16 18:51:52 +00:00
Chris Lattner
4302e8fb67
Switch the PPC backend over to using FORMAL_ARGUMENTS for formal argument
...
handling. This makes the lower argument code significantly simpler (we
only need to handle legal argument types).
Incidentally, this also implements support for vector argument registers,
so long as they are not on the stack.
llvm-svn: 28331
2006-05-16 18:18:50 +00:00
Chris Lattner
d2ca9abf57
Fit in 80 cols
...
llvm-svn: 28311
2006-05-16 04:20:24 +00:00
Chris Lattner
04a9e38369
Remove some dead code, identified by coverity.
...
llvm-svn: 28303
2006-05-15 05:48:32 +00:00
Chris Lattner
ae48a894b1
Remove dead var, fix bad override.
...
llvm-svn: 28264
2006-05-12 21:09:57 +00:00
Chris Lattner
f76c42776d
remove dead variable.
...
llvm-svn: 28248
2006-05-12 17:33:59 +00:00
Chris Lattner
a296339c87
Fix PowerPC/2006-05-12-rlwimi-crash.ll
...
Nate, please verify that if InsertMask is 0, rlwimi shouldn't be used.
This fixes the crash and causes no PPC testsuite regressions.
llvm-svn: 28243
2006-05-12 16:29:37 +00:00
Owen Anderson
8c2c1e90c4
Refactor a bunch of includes so that TargetMachine.h doesn't have to include
...
TargetData.h. This should make recompiles a bit faster with my current
TargetData tinkering.
llvm-svn: 28238
2006-05-12 06:33:49 +00:00
Chris Lattner
b25cb79604
Fix the PowerPC JIT-only failure on UnitTests/Vector/sumarray-dbl, which is
...
really a bad codegen bug that LLC happens to get lucky with. I must chat with
Nate for the proper fix.
llvm-svn: 28213
2006-05-10 06:38:32 +00:00
Chris Lattner
2814134a5d
Indent .data/.text in the .s file
...
llvm-svn: 28204
2006-05-09 16:15:00 +00:00
Chris Lattner
8488ba2e41
Split SwitchSection into SwitchTo{Text|Data}Section methods.
...
llvm-svn: 28184
2006-05-09 04:59:56 +00:00
Nate Begeman
ce6646c366
Yet more readme updating
...
llvm-svn: 28172
2006-05-08 20:54:02 +00:00
Nate Begeman
68a45419cc
New note about something bad happening in target independent optimizers
...
llvm-svn: 28170
2006-05-08 20:08:28 +00:00
Nate Begeman
0eb8f2e496
Proving once again that I am not as smart as the compiler
...
llvm-svn: 28169
2006-05-08 19:09:24 +00:00
Nate Begeman
9b6d4c2968
Fold more shifts into inserts, and update the README
...
llvm-svn: 28168
2006-05-08 17:38:32 +00:00
Nate Begeman
dc996b3f6c
Update some stuff now that the new rlwimi code has gone in
...
llvm-svn: 28162
2006-05-08 02:52:38 +00:00
Nate Begeman
1333cead5b
New rlwimi implementation, which is superior to the old one. There are
...
still a couple missed optimizations, but we now generate all the possible
rlwimis for multiple inserts into the same bitfield. More regression tests
to come.
llvm-svn: 28156
2006-05-07 00:23:38 +00:00
Chris Lattner
8b9e11c110
Print a grouping around inline asm blocks so that we can tell when we are
...
using them.
llvm-svn: 28134
2006-05-05 21:50:04 +00:00
Chris Lattner
304bbf3b1d
New note, Nate, please check to see if I'm full of it :)
...
llvm-svn: 28118
2006-05-05 05:36:15 +00:00
Chris Lattner
10b71c0d08
Rename MO_VirtualRegister -> MO_Register. Clean up immediate handling.
...
llvm-svn: 28104
2006-05-04 18:05:43 +00:00
Chris Lattner
10d6341618
Move some methods out of MachineInstr into MachineOperand
...
llvm-svn: 28102
2006-05-04 17:52:23 +00:00
Chris Lattner
fef7a2d0f5
There shalt be only one "immediate" operand type!
...
llvm-svn: 28099
2006-05-04 17:21:20 +00:00
Chris Lattner
13d5f3eb05
Revert Nate's CR patch from last night, which caused many regressions (e.g. fhourstones).
...
Loading and storing off R0 isn't what we wanted. Also, taking some CR's out of
CRRC seems to cause failures as well. Further investigation is required.
llvm-svn: 28097
2006-05-04 16:56:45 +00:00
Chris Lattner
940cc978ef
Remove a bunch more SparcV9 specific stuff
...
llvm-svn: 28093
2006-05-04 01:15:02 +00:00
Chris Lattner
9f6639b64d
Remove some more unused stuff from MachineInstr that was leftover from V9.
...
llvm-svn: 28091
2006-05-04 00:44:25 +00:00
Chris Lattner
e3a9c70ba0
Change from using MachineRelocation ctors to using static methods
...
in MachineRelocation to create Relocations.
llvm-svn: 28088
2006-05-03 20:30:20 +00:00
Chris Lattner
1d8ee1fc80
Suck block address tracking out of targets into the JIT Emitter. This
...
simplifies the MachineCodeEmitter interface just a little bit and makes
BasicBlocks work like constant pools and jump tables.
llvm-svn: 28082
2006-05-03 17:10:41 +00:00
Owen Anderson
20a631fde7
Refactor TargetMachine, pushing handling of TargetData into the target-specific subclasses. This has one caller-visible change: getTargetData() now returns a pointer instead of a reference.
...
This fixes PR 759.
llvm-svn: 28074
2006-05-03 01:29:57 +00:00
Chris Lattner
d8b192ba3b
Change the BasicBlockAddrs map to be a vector, indexed by MBB number.
...
llvm-svn: 28069
2006-05-03 00:32:55 +00:00
Chris Lattner
b8065a9a3a
Several related changes:
...
1. Change several methods in the MachineCodeEmitter class to be pure virtual.
2. Suck emitConstantPool/initJumpTableInfo into startFunction, removing them
from the MachineCodeEmitter interface, and reducing the amount of target-
specific code.
3. Change the JITEmitter so that it allocates constantpools and jump tables
*right* next to the functions that they belong to, instead of in a separate
pool of memory. This makes all memory for a function be contiguous, and
means the JITEmitter only tracks one block of memory now.
llvm-svn: 28065
2006-05-02 23:22:24 +00:00
Chris Lattner
e1c96369e2
Fix a purely hypothetical problem (for now): emitWord emits in the host
...
byte format. This doesn't work when using the code emitter in a cross target
environment. Since the code emitter is only really used by the JIT, this
isn't a current problem, but if we ever start emitting .o files, it would be.
llvm-svn: 28060
2006-05-02 19:14:47 +00:00
Chris Lattner
c9aa3715e8
Refactor the machine code emitter interface to pull the pointers for the current
...
code emission location into the base class, instead of being in the derived classes.
This change means that low-level methods like emitByte/emitWord now are no longer
virtual (yaay for speed), and we now have a framework to support growable code
segments. This implements feature request #1 of PR469.
llvm-svn: 28059
2006-05-02 18:27:26 +00:00
Nate Begeman
bbcbf48aab
Since we don't handle callee-save CRs right yet, don't allocate them. Also
...
don't step on R11 in the middle of a function when saving and restoring CRs
llvm-svn: 28058
2006-05-02 17:37:31 +00:00
Nate Begeman
287dc5be0d
Hooray, everyone now uses the same printBasicBlockLabel implementation
...
llvm-svn: 28056
2006-05-02 17:34:51 +00:00
Nate Begeman
b9d4f8324d
Extend printBasicBlockLabel a bit so that it can be used to print all
...
basic block labels, consolidating the code to do so in one place for each
target.
llvm-svn: 28050
2006-05-02 05:37:32 +00:00
Nate Begeman
01364fbba8
Update the PPC compilation callback code to not need weird abi-violating
...
prologs and epilogs, keep all the asm in one place, and remove use of
compiler builtin functions.
llvm-svn: 28049
2006-05-02 04:50:05 +00:00
Chris Lattner
84b49d51be
Fix CodeGen/Generic/2006-04-28-Sign-extend-bool.ll
...
llvm-svn: 28017
2006-04-28 21:56:10 +00:00
Chris Lattner
a4c2c4a276
Add a note
...
llvm-svn: 27999
2006-04-28 00:04:05 +00:00
Nate Begeman
318bb96f9e
No functionality changes, but cleaner code with correct comments.
...
llvm-svn: 27966
2006-04-25 04:45:59 +00:00
Nate Begeman
4ca2ea5b43
JumpTable support! What this represents is working asm and jit support for
...
x86 and ppc for 100% dense switch statements when relocations are non-PIC.
This support will be extended and enhanced in the coming days to support
PIC, and less dense forms of jump tables.
llvm-svn: 27947
2006-04-22 18:53:45 +00:00
Chris Lattner
c8afdfec52
Teach the JIT how to relocate LI, this fixes the JIT on Prolangs-C/TimberWolfMC
...
llvm-svn: 27943
2006-04-22 06:17:56 +00:00
Nate Begeman
57a32f0bc1
Fix the comment
...
llvm-svn: 27938
2006-04-21 22:11:27 +00:00
Nate Begeman
516b393992
Change the PPC JIT to use a Static relocation model
...
llvm-svn: 27937
2006-04-21 22:04:15 +00:00
Chris Lattner
99d3da9d2c
Fix the CodeGen/PowerPC/buildvec_canonicalize.ll regression last night.
...
llvm-svn: 27908
2006-04-20 19:01:30 +00:00
Chris Lattner
0cd0065c58
Make sure that the new instructions selected have the right type. This fixes
...
CodeGen/PowerPC/2006-04-19-vmaddfp-crash.ll
llvm-svn: 27868
2006-04-20 05:58:10 +00:00
Chris Lattner
05bbec5020
add a note
...
llvm-svn: 27832
2006-04-19 16:22:38 +00:00
Chris Lattner
a922a516b0
add a note
...
llvm-svn: 27828
2006-04-19 05:55:06 +00:00
Chris Lattner
34c901b50e
These are correctly encoded by the JIT. I checked :)
...
llvm-svn: 27810
2006-04-18 19:03:38 +00:00
Chris Lattner
197d762232
add a note
...
llvm-svn: 27809
2006-04-18 18:30:19 +00:00
Chris Lattner
518834c67e
Fix a crash on:
...
void foo2(vector float *A, vector float *B) {
vector float C = (vector float)vec_cmpeq(*A, *B);
if (!vec_any_eq(*A, *B))
*B = (vector float){0,0,0,0};
*A = C;
}
llvm-svn: 27808
2006-04-18 18:28:22 +00:00
Chris Lattner
1e174c87c3
pretty print node name
...
llvm-svn: 27806
2006-04-18 18:05:58 +00:00
Chris Lattner
9754d142a4
Implement an important entry from README_ALTIVEC:
...
If an altivec predicate compare is used immediately by a branch, don't
use a (serializing) MFCR instruction to read the CR6 register, which requires
a compare to get it back to CR's. Instead, just branch on CR6 directly. :)
For example, for:
void foo2(vector float *A, vector float *B) {
if (!vec_any_eq(*A, *B))
*B = (vector float){0,0,0,0};
}
We now generate:
_foo2:
mfspr r2, 256
oris r5, r2, 12288
mtspr 256, r5
lvx v2, 0, r4
lvx v3, 0, r3
vcmpeqfp. v2, v3, v2
bne cr6, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
vxor v2, v2, v2
stvx v2, 0, r4
mtspr 256, r2
blr
LBB1_2: ; UnifiedReturnBlock
mtspr 256, r2
blr
instead of:
_foo2:
mfspr r2, 256
oris r5, r2, 12288
mtspr 256, r5
lvx v2, 0, r4
lvx v3, 0, r3
vcmpeqfp. v2, v3, v2
mfcr r3, 2
rlwinm r3, r3, 27, 31, 31
cmpwi cr0, r3, 0
beq cr0, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
vxor v2, v2, v2
stvx v2, 0, r4
mtspr 256, r2
blr
LBB1_2: ; UnifiedReturnBlock
mtspr 256, r2
blr
This implements CodeGen/PowerPC/vec_br_cmp.ll.
llvm-svn: 27804
2006-04-18 17:59:36 +00:00
Chris Lattner
68c16a201e
move some stuff around, clean things up
...
llvm-svn: 27802
2006-04-18 17:52:36 +00:00
Chris Lattner
96d50487c9
Use vmladduhm to do v8i16 multiplies which is faster and simpler than doing
...
even/odd halves. Thanks to Nate telling me what's what.
llvm-svn: 27793
2006-04-18 04:28:57 +00:00
Chris Lattner
d6d82aa889
Implement v16i8 multiply with this code:
...
vmuloub v5, v3, v2
vmuleub v2, v3, v2
vperm v2, v2, v5, v4
This implements CodeGen/PowerPC/vec_mul.ll. With this, v16i8 multiplies are
6.79x faster than before.
Overall, UnitTests/Vector/multiplies.c is now 2.45x faster with LLVM than with
GCC.
Remove the 'integer multiplies' todo from the README file.
llvm-svn: 27792
2006-04-18 03:57:35 +00:00
Chris Lattner
7e439874cb
Lower v8i16 multiply into this code:
...
li r5, lo16(LCPI1_0)
lis r6, ha16(LCPI1_0)
lvx v4, r6, r5
vmulouh v5, v3, v2
vmuleuh v2, v3, v2
vperm v2, v2, v5, v4
where v4 is:
LCPI1_0: ; <16 x ubyte>
.byte 2
.byte 3
.byte 18
.byte 19
.byte 6
.byte 7
.byte 22
.byte 23
.byte 10
.byte 11
.byte 26
.byte 27
.byte 14
.byte 15
.byte 30
.byte 31
This is 5.07x faster on the G5 (measured) than lowering to scalar code +
loads/stores.
llvm-svn: 27789
2006-04-18 03:43:48 +00:00
Chris Lattner
a2cae1bb10
Custom lower v4i32 multiplies into a cute sequence, instead of having legalize
...
scalarize the sequence into 4 mullw's and a bunch of load/store traffic.
This speeds up v4i32 multiplies 4.1x (measured) on a G5. This implements
PowerPC/vec_mul.ll
llvm-svn: 27788
2006-04-18 03:24:30 +00:00
Chris Lattner
63a5cdc423
remove done item
...
llvm-svn: 27778
2006-04-17 21:52:03 +00:00
Chris Lattner
6bd68ae81e
Don't diddle VRSAVE if no registers need to be added/removed from it. This
...
allows us to codegen functions as:
_test_rol:
vspltisw v2, -12
vrlw v2, v2, v2
blr
instead of:
_test_rol:
mfvrsave r2, 256
mr r3, r2
mtvrsave r3
vspltisw v2, -12
vrlw v2, v2, v2
mtvrsave r2
blr
Testcase here: CodeGen/PowerPC/vec_vrsave.ll
llvm-svn: 27777
2006-04-17 21:48:13 +00:00
Chris Lattner
72d7c27069
Vectors that are known live-in and live-out are clearly already marked in
...
the vrsave register for the caller. This allows us to codegen a function as:
_test_rol:
mfspr r2, 256
mr r3, r2
mtspr 256, r3
vspltisw v2, -12
vrlw v2, v2, v2
mtspr 256, r2
blr
instead of:
_test_rol:
mfspr r2, 256
oris r3, r2, 40960
mtspr 256, r3
vspltisw v0, -12
vrlw v2, v0, v0
mtspr 256, r2
blr
llvm-svn: 27772
2006-04-17 21:22:06 +00:00
Chris Lattner
14c4972b6d
Prefer to allocate V2-V5 before V0,V1. This lets us generate code like this:
...
vspltisw v2, -12
vrlw v2, v2, v2
instead of:
vspltisw v0, -12
vrlw v2, v0, v0
when a function is returning a value.
llvm-svn: 27771
2006-04-17 21:19:12 +00:00
Chris Lattner
6df094b4ab
Move some knowledge about registers out of the code emitter into the register info.
...
llvm-svn: 27770
2006-04-17 21:07:20 +00:00
Chris Lattner
0f28d48da2
Use a small table instead of macros to do this conversion.
...
llvm-svn: 27769
2006-04-17 20:59:25 +00:00
Chris Lattner
e54133cfba
Make sure to check splats of every constant we can, handle splat(31) by
...
being a bit more clever, add support for odd splats from -31 to -17.
llvm-svn: 27764
2006-04-17 18:09:22 +00:00
Chris Lattner
264c908e3a
Teach the ppc backend to use rol and vsldoi to generate splatted constants.
...
This implements vec_constants.ll:test_vsldoi and test_rol
llvm-svn: 27760
2006-04-17 17:55:10 +00:00
Chris Lattner
26fb8d9393
add a note
...
llvm-svn: 27758
2006-04-17 17:29:41 +00:00
Chris Lattner
1b3806ace5
Make some code more general, adding support for constant formation of several
...
new patterns.
llvm-svn: 27754
2006-04-17 06:58:41 +00:00
Chris Lattner
f8dd76df5b
Learn how to make odd splatted constants in range [17,29]. This implements
...
PowerPC/vec_constants.ll:test_29.
llvm-svn: 27752
2006-04-17 06:07:44 +00:00
Chris Lattner
2a099c04c1
Pull some code out into a helper function.
...
Effeciently codegen even splats in the range [-32,30].
This allows us to codegen <30,30,30,30> as:
vspltisw v0, 15
vadduwm v2, v0, v0
instead of as a cp load.
llvm-svn: 27750
2006-04-17 06:00:21 +00:00
Chris Lattner
071ad01ceb
Implement a TODO: for any shuffle that can be viewed as a v4[if]32 shuffle,
...
if it can be implemented in 3 or fewer discrete altivec instructions, codegen
it as such. This implements Regression/CodeGen/PowerPC/vec_perf_shuffle.ll
llvm-svn: 27748
2006-04-17 05:28:54 +00:00
Chris Lattner
85bfa3c2bc
Regenerate with adjusted costs
...
llvm-svn: 27746
2006-04-17 05:26:20 +00:00
Chris Lattner
aac2a200cd
Regenerate with correct offset
...
llvm-svn: 27744
2006-04-17 05:08:46 +00:00
Chris Lattner
311b1a6e23
Increase the opcodes by one each to disambiguate COPY from VMRGHW.
...
llvm-svn: 27742
2006-04-17 00:47:48 +00:00
Chris Lattner
07a3d01a91
Check in a table, generated by llvm-PerfectShuffle, of optimal shuffles
...
of various 4-element vectors.
llvm-svn: 27739
2006-04-17 00:37:02 +00:00
Chris Lattner
06a21ba96b
Implement a TODO: have the legalizer canonicalize a bunch of operations to
...
one type (v4i32) so that we don't have to write patterns for each type, and
so that more CSE opportunities are exposed.
llvm-svn: 27731
2006-04-16 01:37:57 +00:00
Chris Lattner
fa5aa396c2
Make the BUILD_VECTOR lowering code much more aggressive w.r.t constant vectors.
...
Remove some done items from the todo list.
llvm-svn: 27729
2006-04-16 01:01:29 +00:00
Chris Lattner
24acbe46c0
Fix a crash when faced with a shuffle vector that has an undef in its mask.
...
llvm-svn: 27726
2006-04-15 23:48:05 +00:00
Chris Lattner
873202fabd
Add patterns for matching vnots with bit converted inputs. Most of these will
...
go away when I start using evan's binop type canonicalizer
llvm-svn: 27725
2006-04-15 23:45:24 +00:00
Chris Lattner
559c8ba466
Allow undef in a shuffle mask
...
llvm-svn: 27714
2006-04-14 23:19:08 +00:00
Chris Lattner
4211ca9108
Move the rest of the PPCTargetLowering::LowerOperation cases out into
...
separate functions, for simplicity and code clarity.
llvm-svn: 27693
2006-04-14 06:01:58 +00:00
Chris Lattner
19e9055eb5
Pull the VECTOR_SHUFFLE and BUILD_VECTOR lowering code out into separate
...
functions, which makes the code much cleaner :)
llvm-svn: 27692
2006-04-14 05:19:18 +00:00
Chris Lattner
883fb053bd
Force non-darwin targets to use a static relo model. This fixes PR734,
...
tested by CodeGen/Generic/vector.ll
llvm-svn: 27657
2006-04-13 17:10:48 +00:00
Chris Lattner
5879efe0c8
add a note, move an altivec todo to the altivec list.
...
llvm-svn: 27654
2006-04-13 16:48:00 +00:00
Reid Spencer
9857229aba
Add the README files to the distribution.
...
llvm-svn: 27651
2006-04-13 06:39:24 +00:00
Chris Lattner
147e50e1c5
Add a new way to match vector constants, which make it easier to bang bits of
...
different types.
Codegen spltw(0x7FFFFFFF) and spltw(0x80000000) without a constant pool load,
implementing PowerPC/vec_constants.ll:test1. This compiles:
typedef float vf __attribute__ ((vector_size (16)));
typedef int vi __attribute__ ((vector_size (16)));
void test(vi *P1, vi *P2, vf *P3) {
*P1 &= (vi){0x80000000,0x80000000,0x80000000,0x80000000};
*P2 &= (vi){0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF};
*P3 = vec_abs((vector float)*P3);
}
to:
_test:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
vspltisw v0, -1
vslw v0, v0, v0
lvx v1, 0, r3
vand v1, v1, v0
stvx v1, 0, r3
lvx v1, 0, r4
vandc v1, v1, v0
stvx v1, 0, r4
lvx v1, 0, r5
vandc v0, v1, v0
stvx v0, 0, r5
mtspr 256, r2
blr
instead of (with two constant pool entries):
_test:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
li r6, lo16(LCPI1_0)
lis r7, ha16(LCPI1_0)
li r8, lo16(LCPI1_1)
lis r9, ha16(LCPI1_1)
lvx v0, r7, r6
lvx v1, 0, r3
vand v0, v1, v0
stvx v0, 0, r3
lvx v0, r9, r8
lvx v1, 0, r4
vand v1, v1, v0
stvx v1, 0, r4
lvx v1, 0, r5
vand v0, v1, v0
stvx v0, 0, r5
mtspr 256, r2
blr
GCC produces (with 2 cp entries):
_test:
mfspr r0,256
stw r0,-4(r1)
oris r0,r0,0xc00c
mtspr 256,r0
lis r2,ha16(LC0)
lis r9,ha16(LC1)
la r2,lo16(LC0)(r2)
lvx v0,0,r3
lvx v1,0,r5
la r9,lo16(LC1)(r9)
lwz r12,-4(r1)
lvx v12,0,r2
lvx v13,0,r9
vand v0,v0,v12
stvx v0,0,r3
vspltisw v0,-1
vslw v12,v0,v0
vandc v1,v1,v12
stvx v1,0,r5
lvx v0,0,r4
vand v0,v0,v13
stvx v0,0,r4
mtspr 256,r12
blr
llvm-svn: 27624
2006-04-12 19:07:14 +00:00
Chris Lattner
74cf9ff761
Rename get_VSPLI_elt -> get_VSPLTI_elt
...
Canonicalize BUILD_VECTOR's that match VSPLTI's into a single type for each
form, eliminating a bunch of Pat patterns in the .td file and allowing us to
CSE stuff more aggressively. This implements
PowerPC/buildvec_canonicalize.ll:VSPLTI
llvm-svn: 27614
2006-04-12 17:37:20 +00:00
Chris Lattner
e318a7574e
Ensure that zero vectors are always v4i32, which forces them to CSE with
...
each other. This implements CodeGen/PowerPC/vxor-canonicalize.ll
llvm-svn: 27609
2006-04-12 16:53:28 +00:00
Nate Begeman
f19bcd5177
Fix SingleSource/UnitTests/Vector/sumarray-dbl
...
llvm-svn: 27594
2006-04-11 19:44:43 +00:00
Nate Begeman
1bb132099f
Fix PR727, correctly handling large stack aligments on ppc
...
llvm-svn: 27593
2006-04-11 19:29:21 +00:00
Chris Lattner
aaa04230bd
we have a shuffle instr, add an example.
...
llvm-svn: 27592
2006-04-11 18:47:03 +00:00
Jim Laskey
02b3b72bfc
Suppress debug label when not debug.
...
llvm-svn: 27588
2006-04-11 08:11:53 +00:00
Chris Lattner
e4db08a2f1
Vector function results go into V2 according to GCC. The darwin ABI doc
...
doesn't say where they go :-/
llvm-svn: 27579
2006-04-11 01:38:39 +00:00
Chris Lattner
92533cfb4a
Move some return-handling code from lowerarguments to the ISD::RET handling stuff.
...
No functionality change.
llvm-svn: 27577
2006-04-11 01:21:43 +00:00
Chris Lattner
3a68f3c3ca
properly mark vector selects as expanded to select_cc
...
llvm-svn: 27544
2006-04-08 22:59:15 +00:00
Chris Lattner
0a3d1bbca4
Add VRRC select support
...
llvm-svn: 27543
2006-04-08 22:45:08 +00:00
Nate Begeman
3f9c17906f
Disable switch lowering for targets based on the selection dag isel,
...
letting the code generator handle them directly.
llvm-svn: 27539
2006-04-08 19:46:55 +00:00
Chris Lattner
d9e80f4516
Implement PowerPC/CodeGen/vec_splat.ll:spltish to use vsplish instead of a
...
constant pool load.
llvm-svn: 27538
2006-04-08 07:14:26 +00:00
Chris Lattner
d71a1f946d
Change the interface to the predicate that determines if vsplti* can be used.
...
No functionality changes.
llvm-svn: 27536
2006-04-08 06:46:53 +00:00
Jim Laskey
c0d6518f27
Make sure that debug labels are defined within the same section and after the
...
entry point of a function.
llvm-svn: 27494
2006-04-07 20:44:42 +00:00
Jim Laskey
2d7298c362
Foundation for call frame information.
...
llvm-svn: 27491
2006-04-07 16:34:46 +00:00
Chris Lattner
e61cfad815
Add an item
...
llvm-svn: 27470
2006-04-06 23:16:19 +00:00
Chris Lattner
466841ddc7
Make sure to return the result in the right type.
...
llvm-svn: 27469
2006-04-06 23:12:19 +00:00
Chris Lattner
a4bbfaed5c
Match vpku[hw]um(x,x).
...
Convert vsldoi(x,x) to work the same way other (x,x) cases work.
llvm-svn: 27467
2006-04-06 22:28:36 +00:00
Chris Lattner
f38e033270
Add support for matching vmrg(x,x) patterns
...
llvm-svn: 27463
2006-04-06 22:02:42 +00:00
Chris Lattner
d1dcb52093
Pattern match vmrg* instructions, which are now lowered by the CFE into shuffles.
...
llvm-svn: 27457
2006-04-06 21:11:54 +00:00
Chris Lattner
a4c727f1cc
remove two done items
...
llvm-svn: 27453
2006-04-06 19:19:38 +00:00
Chris Lattner
1d33819194
Support pattern matching vsldoi(x,y) and vsldoi(x,x), which allows the f.e. to
...
lower it and LLVM to have one fewer intrinsic. This implements
CodeGen/PowerPC/vec_shuffle.ll
llvm-svn: 27450
2006-04-06 18:26:28 +00:00
Chris Lattner
e8b83b4206
Compile the vpkuhum/vpkuwum intrinsics into vpkuhum/vpkuwum instead of into
...
vperm with a perm mask lvx'd from the constant pool.
llvm-svn: 27448
2006-04-06 17:23:16 +00:00
Chris Lattner
c94d932447
Add all of the data stream intrinsics and instructions. woo
...
llvm-svn: 27442
2006-04-05 22:27:14 +00:00
Chris Lattner
39dc64c955
Fix a typo
...
llvm-svn: 27440
2006-04-05 20:15:25 +00:00
Chris Lattner
39cc717c65
Fix CodeGen/PowerPC/2006-04-05-splat-ish.ll
...
llvm-svn: 27439
2006-04-05 17:39:25 +00:00
Evan Cheng
2cf4232ced
Fallthrough to expand if a VECTOR_SHUFFLE cannot be custom lowered.
...
llvm-svn: 27433
2006-04-05 06:09:26 +00:00
Chris Lattner
2f8e2b2895
add vsl
...
llvm-svn: 27425
2006-04-05 01:16:22 +00:00
Chris Lattner
575352ac20
add vmladduhm
...
llvm-svn: 27423
2006-04-05 00:49:48 +00:00
Chris Lattner
5a528e565b
Add m[tf]vscr instructions.
...
llvm-svn: 27421
2006-04-05 00:03:57 +00:00
Chris Lattner
0c82447c66
add a note
...
llvm-svn: 27419
2006-04-04 23:45:11 +00:00
Chris Lattner
281bb5da1d
Add missing byte merges.
...
llvm-svn: 27418
2006-04-04 23:43:56 +00:00
Chris Lattner
fc50ae521c
Add FP -> Int Conversions
...
llvm-svn: 27417
2006-04-04 23:25:02 +00:00
Chris Lattner
96338b6a21
add average intrinsics
...
llvm-svn: 27416
2006-04-04 23:14:00 +00:00
Chris Lattner
4464383a17
add a note
...
llvm-svn: 27414
2006-04-04 22:43:55 +00:00
Chris Lattner
4a744e5c9d
Fix some broken logic that would cause us to codegen {2147483647,2147483647,2147483647,2147483647} as 'vspltisb v0, -1'.
...
llvm-svn: 27413
2006-04-04 22:28:35 +00:00
Chris Lattner
95c7adc7cb
Ask legalize to promote all vector shuffles to be v16i8 instead of having to
...
handle all 4 PPC vector types. This simplifies the matching code and allows
us to eliminate a bunch of patterns. This also adds cases we were missing,
such as CodeGen/PowerPC/vec_splat.ll:splat_h.
llvm-svn: 27400
2006-04-04 17:25:31 +00:00
Chris Lattner
b1e6d84544
Plug in the byte and short splats
...
llvm-svn: 27387
2006-04-04 00:05:13 +00:00
Chris Lattner
447a7968af
Revert accidentally committed hunks.
...
llvm-svn: 27386
2006-04-03 23:58:04 +00:00
Chris Lattner
533aed9a35
Make sure to mark unsupported SCALAR_TO_VECTOR operations as expand.
...
llvm-svn: 27385
2006-04-03 23:55:43 +00:00
Chris Lattner
5400727595
Force use of a frame-pointer if there is anything on the stack that is aligned
...
more than the OS keeps the stack aligned.
llvm-svn: 27381
2006-04-03 22:03:29 +00:00
Chris Lattner
9ccd61c893
Add the full set of min/max instructions
...
llvm-svn: 27372
2006-04-03 15:58:28 +00:00
Chris Lattner
acf1fc8a28
add a note
...
llvm-svn: 27360
2006-04-02 07:20:00 +00:00
Chris Lattner
c5287c0ece
Inform the dag combiner that the predicate compares only return a low bit.
...
llvm-svn: 27359
2006-04-02 06:26:07 +00:00
Chris Lattner
80fdc1eb6b
Remove done item
...
llvm-svn: 27351
2006-04-02 05:28:54 +00:00
Chris Lattner
b80f114707
add a note
...
llvm-svn: 27348
2006-04-02 03:59:11 +00:00
Chris Lattner
9b2d6e7886
Custom lower all BUILD_VECTOR's so that we can compile vec_splat_u8(8) into
...
"vspltisb v0, 8" instead of a constant pool load.
llvm-svn: 27335
2006-04-02 00:43:36 +00:00
Chris Lattner
dc72c17798
Implement vnot using VNOR instead of using 'vspltisb v0, -1' and vxor
...
llvm-svn: 27331
2006-04-01 22:41:47 +00:00
Chris Lattner
ff77dc0a08
Shrinkify some more intrinsic definitions.
...
llvm-svn: 27322
2006-03-31 22:41:56 +00:00
Chris Lattner
20d3f3726f
Pull operand asm string into base class, shrinkifying intrinsic definitions.
...
No functionality change.
llvm-svn: 27320
2006-03-31 22:34:05 +00:00
Chris Lattner
110fc74b97
Fix 80 column violations :)
...
llvm-svn: 27315
2006-03-31 21:57:36 +00:00
Chris Lattner
a4150f751d
fix a pasto
...
llvm-svn: 27308
2006-03-31 21:19:06 +00:00
Chris Lattner
e7fd4b0274
Add vperm support for all datatypes
...
llvm-svn: 27307
2006-03-31 20:00:35 +00:00
Chris Lattner
baa73e0d91
Rearrange code a bit
...
llvm-svn: 27306
2006-03-31 19:52:36 +00:00
Chris Lattner
754b41c84b
Add, sub and shuffle are legal for all vector types
...
llvm-svn: 27305
2006-03-31 19:48:58 +00:00
Chris Lattner
40ff17dc22
add a note
...
llvm-svn: 27302
2006-03-31 19:00:22 +00:00
Chris Lattner
829a061abf
note to self: *save* file, then check it in
...
llvm-svn: 27291
2006-03-31 06:04:53 +00:00
Chris Lattner
d4058a59d4
Implement an item from the readme, folding vcmp/vcmp. instructions with
...
identical instructions into a single instruction. For example, for:
void test(vector float *x, vector float *y, int *P) {
int v = vec_any_out(*x, *y);
*x = (vector float)vec_cmpb(*x, *y);
*P = v;
}
we now generate:
_test:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
lvx v0, 0, r4
lvx v1, 0, r3
vcmpbfp. v0, v1, v0
mfcr r4, 2
stvx v0, 0, r3
rlwinm r3, r4, 27, 31, 31
xori r3, r3, 1
stw r3, 0(r5)
mtspr 256, r2
blr
instead of:
_test:
mfspr r2, 256
oris r6, r2, 57344
mtspr 256, r6
lvx v0, 0, r4
lvx v1, 0, r3
vcmpbfp. v2, v1, v0
mfcr r4, 2
*** vcmpbfp v0, v1, v0
rlwinm r4, r4, 27, 31, 31
stvx v0, 0, r3
xori r3, r4, 1
stw r3, 0(r5)
mtspr 256, r2
blr
Testcase here: CodeGen/PowerPC/vcmp-fold.ll
llvm-svn: 27290
2006-03-31 06:02:07 +00:00
Chris Lattner
070181c927
compactify some more instruction definitions
...
llvm-svn: 27288
2006-03-31 05:38:32 +00:00
Chris Lattner
45c709388a
Compactify comparisons.
...
llvm-svn: 27287
2006-03-31 05:32:57 +00:00