Evan Cheng
84aec09fdb
Fix PR2138. Apparently any modification to a std::multimap (including removing entries for a different key) can invalidate multimap iterators.
...
llvm-svn: 48371
2008-03-14 20:44:01 +00:00
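One defensive way to modify a std::multimap while walking it is to never touch an iterator again once an erase has consumed it, and to continue only from the iterator that erase returns. The following is a minimal sketch of that pattern, assuming C++11 or later; the helper name eraseByValue is hypothetical and is not taken from the LLVM change itself.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical helper (not LLVM code): erase every entry whose mapped value
// equals `value`. After each erase we continue from the iterator that
// erase() returns instead of reusing any iterator obtained earlier.
std::size_t eraseByValue(std::multimap<int, int> &m, int value) {
  std::size_t erased = 0;
  for (auto it = m.begin(); it != m.end();) {
    if (it->second == value) {
      it = m.erase(it); // C++11: returns the iterator following the erased entry
      ++erased;
    } else {
      ++it;
    }
  }
  return erased;
}
```

The same idea applies to key-based removal: look the range up again with equal_range after each modification rather than caching its iterators across modifications.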
Evan Cheng
442d708bfb
New test case.
...
llvm-svn: 48338
2008-03-13 08:05:02 +00:00
Evan Cheng
ecde45ecb5
A test case I forgot to check in.
...
llvm-svn: 48335
2008-03-13 06:42:46 +00:00
Evan Cheng
5c26bde55e
TwoAddressInstructionPass enhancement. After it converts a two-address instruction into a 3-address one, sink it past the instruction that kills the read-mod-write register if its definition is used past the kill. This reduces the number of live registers by one.
...
llvm-svn: 48333
2008-03-13 06:37:55 +00:00
Evan Cheng
65e9d5f1a8
Experimental scheduler change to schedule / coalesce the copies added for function livein's. Take 2008-03-10-RegAllocInfLoop.ll, the schedule looks like this after these copies are inserted:
...
entry: 0x12049d0, LLVM BB @0x1201fd0, ID#0:
Live Ins: %EAX %EDX %ECX
%reg1031<def> = MOVPC32r 0
%reg1032<def> = ADD32ri %reg1031, <es:_GLOBAL_OFFSET_TABLE_>, %EFLAGS<imp-def>
%reg1028<def> = MOV32rr %EAX
%reg1029<def> = MOV32rr %EDX
%reg1030<def> = MOV32rr %ECX
%reg1027<def> = MOV8rm %reg0, 1, %reg0, 0, Mem:LD(1,1) [0x1201910 + 0]
%reg1025<def> = MOV32rr %reg1029
%reg1026<def> = MOV32rr %reg1030
%reg1024<def> = MOV32rr %reg1028
The copies unnecessarily increase register pressure and it will end up requiring a physical register to be spilled.
With -schedule-livein-copies:
entry: 0x12049d0, LLVM BB @0x1201fa0, ID#0:
Live Ins: %EAX %EDX %ECX
%reg1031<def> = MOVPC32r 0
%reg1032<def> = ADD32ri %reg1031, <es:_GLOBAL_OFFSET_TABLE_>, %EFLAGS<imp-def>
%reg1024<def> = MOV32rr %EAX
%reg1025<def> = MOV32rr %EDX
%reg1026<def> = MOV32rr %ECX
%reg1027<def> = MOV8rm %reg0, 1, %reg0, 0, Mem:LD(1,1) [0x12018e0 + 0]
Much better!
llvm-svn: 48307
2008-03-12 22:19:41 +00:00
Dan Gohman
f7492cf0ec
Fix this test on hosts that don't have sse2.
...
llvm-svn: 48296
2008-03-12 20:40:51 +00:00
Dan Gohman
35f8f07c00
Make this test x86-specific for now; targets that don't use
...
the automated CallingConv code to handle return values typically
don't support multiple return values.
llvm-svn: 48265
2008-03-12 00:25:14 +00:00
Anton Korobeynikov
80b53b8f6b
Testcase for PR2137
...
llvm-svn: 48258
2008-03-11 22:43:42 +00:00
Anton Korobeynikov
6f51973734
Update testcase for recent aliases change
...
llvm-svn: 48250
2008-03-11 21:42:20 +00:00
Dan Gohman
6616836e71
Add a test to ensure that all-ones vectors are materialized with pcmpeqd.
...
llvm-svn: 48247
2008-03-11 21:37:00 +00:00
Dan Gohman
44b4c07cd1
Use the correct value for InSignBit.
...
llvm-svn: 48245
2008-03-11 21:29:43 +00:00
Chris Lattner
8abed80a69
Implement basic support for the 'f' register class constraint. This basically
...
works, but probably won't if you mix it with 't' or 'u' yet.
llvm-svn: 48243
2008-03-11 19:50:13 +00:00
Evan Cheng
e88a625ecd
When the register allocator runs out of registers, spill a physical register around the defs and uses of the interval being allocated to make it possible for the interval to target a register and spill it right away and restore a register for uses. This likely generates terrible code but is better than aborting.
...
llvm-svn: 48218
2008-03-11 07:19:34 +00:00
Chris Lattner
7362d38391
Don't emit FP_REG_KILL into a block that just returns. Nothing
...
can be live out of the block anyway, so it isn't needed.
llvm-svn: 48192
2008-03-10 23:34:12 +00:00
Dan Gohman
272e234477
Fix mul expansion to check the correct number of bits for
...
zero extension when checking if an unsigned multiply is
safe.
llvm-svn: 48171
2008-03-10 20:42:19 +00:00
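The kind of check this message refers to can be illustrated outside the compiler: a wide unsigned multiply can be replaced by a narrower widening multiply only when the upper half of each operand is known to be zero. The sketch below is an assumption about the general idea, not the actual legalizer code; the function names and the 64/32-bit split are chosen for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// True if the upper 32 bits of x are zero, i.e. x is a zero-extended
// 32-bit value. Checking the wrong number of bits here is the sort of
// mistake that would make the narrowing unsafe.
bool upperHalfIsZero(std::uint64_t x) { return (x >> 32) == 0; }

// Hypothetical illustration: multiply two 64-bit values, using a
// 32x32->64 widening multiply when both operands fit in 32 bits.
std::uint64_t mulNarrowIfSafe(std::uint64_t a, std::uint64_t b,
                              bool &usedNarrow) {
  usedNarrow = upperHalfIsZero(a) && upperHalfIsZero(b);
  if (usedNarrow) {
    // Both operands are zero-extended, so the 32x32->64 product is exact.
    return static_cast<std::uint64_t>(static_cast<std::uint32_t>(a)) *
           static_cast<std::uint32_t>(b);
  }
  return a * b; // general case: full 64-bit multiply (wraps modulo 2^64)
}
```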
Dale Johannesen
fe2c0e2dca
These tests don't work unless SSE2 is active.
...
Judging from the checking comments this is intentional,
so add the flag (makes them pass on non-x86 host).
llvm-svn: 48157
2008-03-10 17:33:57 +00:00
Dale Johannesen
65aada6e8f
There is no "-mattr=+sse1" flag; fix test for non-x86 hosts.
...
llvm-svn: 48156
2008-03-10 17:13:37 +00:00
Evan Cheng
4a3c5eab34
- Fix a subtle bug in RemoveCopyByCommutingDef. ALR is the live range where the source is defined; BLR is the live range which is defined by the copy.
...
If ALR and BLR overlap and the end of BLR extends beyond the end of ALR, e.g.
A = or A, B
...
B = A
...
C = A<kill>
...
= B
then do not add kills of A to the newly created B interval.
- Also fix some kill info update bugs.
llvm-svn: 48141
2008-03-10 08:11:32 +00:00
Evan Cheng
b5d11980d9
Avoid creating BUILD_VECTOR of all zero elements of "non-normalized" type (e.g. v8i16 on x86) after legalizer. Instruction selection does not expect to see them. In all likelihood this can only be an issue in a bugpoint reduced test case.
...
llvm-svn: 48136
2008-03-10 07:19:13 +00:00
Chris Lattner
86829f0ff7
teach X86InstrInfo::copyRegToReg how to copy into ST(0) from
...
an RFP register class.
Teach ScheduleDAG how to handle CopyToReg with different src/dst
reg classes.
This allows us to compile trivial inline asms that expect stuff
on the top of x87-fp stack.
llvm-svn: 48107
2008-03-09 09:15:31 +00:00
Chris Lattner
9e07537e8c
Add ScheduleDAG support for copytoreg where the src/dst registers are
...
in different register classes, e.g. copy of ST(0) to RFP*. This gets
some really trivial inline asm working that plops things on the top of
stack (PR879)
llvm-svn: 48105
2008-03-09 08:49:15 +00:00
Chris Lattner
a6ce71fb84
reduce this testcase more
...
llvm-svn: 48092
2008-03-09 06:57:21 +00:00
Chris Lattner
b6387c8a74
Finish implementing a readme entry: when inserting an i64 variable
...
into a vector of zeros or undef, and when the top part is obviously
zero, we can just use movd + shuffle. This allows us to compile
vec_set-B.ll into:
_test3:
movl $1234567, %eax
andl 4(%esp), %eax
movd %eax, %xmm0
ret
instead of:
_test3:
subl $28, %esp
movl $1234567, %eax
andl 32(%esp), %eax
movl %eax, (%esp)
movl $0, 4(%esp)
movq (%esp), %xmm0
addl $28, %esp
ret
llvm-svn: 48090
2008-03-09 05:42:06 +00:00
Chris Lattner
eef374c197
Implement a readme entry, compiling
...
#include <xmmintrin.h>
__m128i doload64(short x) {return _mm_set_epi16(0,0,0,0,0,0,0,1);}
into:
movl $1, %eax
movd %eax, %xmm0
ret
instead of a constant pool load.
llvm-svn: 48063
2008-03-09 01:05:04 +00:00
Chris Lattner
031e04b7a3
make this test harder
...
llvm-svn: 48061
2008-03-09 00:30:06 +00:00
Chris Lattner
a1f25b0020
Teach SD some vector identities, allowing us to compile vec_set-9 into:
...
_test3:
movd %rdi, %xmm1
#IMPLICIT_DEF %xmm0
punpcklqdq %xmm1, %xmm0
ret
instead of:
_test3:
#IMPLICIT_DEF %rax
movd %rax, %xmm0
movd %rdi, %xmm1
punpcklqdq %xmm1, %xmm0
ret
This is still not ideal. There is no reason to use two xmm regs.
llvm-svn: 48058
2008-03-08 23:43:36 +00:00
Evan Cheng
95cf661534
Implement x86 support for @llvm.prefetch. It corresponds to prefetcht{0|1|2} and prefetchnta instructions.
...
llvm-svn: 48042
2008-03-08 00:58:38 +00:00
Chris Lattner
d4defb00df
mark frem as expand for all legal fp types on x86, regardless of whether
...
we're using SSE or not. This fixes PR2122.
llvm-svn: 48006
2008-03-07 06:36:32 +00:00
Chris Lattner
78e9cab229
Generalize FP constant shrinking optimization to apply to any vt
...
except ppc long double. This allows us to shrink constant pool
entries for x86 long double constants, which in turn allows us to
use flds/fldl instead of fldt.
llvm-svn: 47938
2008-03-05 06:48:13 +00:00
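Shrinking an FP constant is only valid when the narrower type represents the value exactly. A minimal sketch of such a test for double-to-float follows; it assumes IEEE-754 float and double, and the helper name fitsInFloatExactly is hypothetical rather than anything from the LLVM sources.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: returns true if `d` can be stored as a float and
// converted back to double without changing its value.
bool fitsInFloatExactly(double d) {
  // Reject NaN/infinity and values outside float's finite range up front,
  // so the narrowing conversion below is always well defined.
  if (!std::isfinite(d)) return false;
  if (std::fabs(d) > std::numeric_limits<float>::max()) return false;
  // Round-trip through float; equality means no precision was lost.
  return static_cast<double>(static_cast<float>(d)) == d;
}
```

Under this check 1.0 and 0.5 qualify for a smaller constant pool entry, while 0.1 does not because its double representation has more significant bits than a float can hold.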
Evan Cheng
0a62cb44ce
Add a target lowering hook to control whether it's worthwhile to compress fp constants.
...
For x86, if sse2 is available, it's not a good idea since cvtss2sd is slower than a movsd load and it prevents load folding. On x87, it's important to shrink fp constants since fldt is very expensive.
llvm-svn: 47931
2008-03-05 01:30:59 +00:00
Evan Cheng
16e2cf6db1
Really fix the test.
...
llvm-svn: 47882
2008-03-04 08:01:56 +00:00
Evan Cheng
7473c74d52
Fix broken test.
...
llvm-svn: 47881
2008-03-04 07:59:13 +00:00
Evan Cheng
62240d65fd
Add PR1501 test case.
...
llvm-svn: 47874
2008-03-04 00:47:45 +00:00
Chris Lattner
a70df9e2ee
Evan implemented these.
...
llvm-svn: 47828
2008-03-02 18:05:14 +00:00
Evan Cheng
507713de08
Set to default: x86 no longer folds an and into a test if it has more than one use.
...
llvm-svn: 47711
2008-02-28 07:46:38 +00:00
Evan Cheng
fdc732ab9a
Fix a bug in dead spill slot elimination.
...
llvm-svn: 47687
2008-02-27 19:57:11 +00:00
Chris Lattner
3df31ba41b
actually run llc, thanks Dan :)
...
llvm-svn: 47677
2008-02-27 17:46:54 +00:00
Evan Cheng
8ae8e2d50b
Don't track max alignment during stack object allocations since they can be deleted later. Let PEI compute it.
...
llvm-svn: 47668
2008-02-27 10:04:56 +00:00
Chris Lattner
83263b8cfb
Make X86TargetLowering::LowerSINT_TO_FP return without creating a dead
...
stack slot and store if the SINT_TO_FP is actually legal. This allows
us to compile:
double a(double b) {return (unsigned)b;}
to:
_a:
cvttsd2siq %xmm0, %rax
movl %eax, %eax
cvtsi2sdq %rax, %xmm0
ret
instead of:
_a:
subq $8, %rsp
cvttsd2siq %xmm0, %rax
movl %eax, %eax
cvtsi2sdq %rax, %xmm0
addq $8, %rsp
ret
crazy.
llvm-svn: 47660
2008-02-27 05:57:41 +00:00
Chris Lattner
3c7d3d5700
Compile x86-64-and-mask.ll into:
...
_test:
movl %edi, %eax
ret
instead of:
_test:
movl $4294967295, %ecx
movq %rdi, %rax
andq %rcx, %rax
ret
It would be great to write this as a Pat pattern that used subregs
instead of a 'pseudo' instruction, but I don't know how to do that
in td files.
llvm-svn: 47658
2008-02-27 05:47:54 +00:00
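The shorter sequence works because masking a 64-bit value with 4294967295 (0xFFFFFFFF) is the same as zero-extending its low 32 bits, and on x86-64 a write to a 32-bit register clears the upper 32 bits of the full register. A small sketch of that equivalence, with function names invented for illustration:

```cpp
#include <cassert>
#include <cstdint>

// The form from the longer sequence: an explicit 64-bit AND with 0xFFFFFFFF.
std::uint64_t maskLow32(std::uint64_t x) { return x & 0xFFFFFFFFull; }

// The form from the shorter sequence: truncate to 32 bits, then
// zero-extend back to 64 bits.
std::uint64_t zextLow32(std::uint64_t x) {
  return static_cast<std::uint64_t>(static_cast<std::uint32_t>(x));
}
```

Both functions return the same value for every input, which is what lets the compiler pick the single 32-bit move.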
Evan Cheng
6d56368caf
Spiller now removes unused spill slots.
...
llvm-svn: 47657
2008-02-27 03:04:06 +00:00
Evan Cheng
fa6b366892
Enable -coalescer-commute-instrs by default.
...
llvm-svn: 47623
2008-02-26 20:40:22 +00:00
Dan Gohman
9db0aa86d9
Avoid aborting on invalid shift counts.
...
llvm-svn: 47612
2008-02-26 18:50:50 +00:00
Eli Friedman
666bbe34f4
Fix for pr2093: direct operands aren't necessarily addresses, so don't
...
try to simplify them.
llvm-svn: 47610
2008-02-26 18:37:49 +00:00
Evan Cheng
2ff0b0e681
This is possible:
...
vr1 = extract_subreg vr2, 3
...
vr3 = extract_subreg vr1, 2
The end result is vr3 is equal to vr2 with subidx 2.
llvm-svn: 47592
2008-02-26 08:03:41 +00:00
Evan Cheng
1da250097b
Fix PR2076. CodeGenPrepare now sinks address computation for inline asm memory
...
operands into inline asm block.
llvm-svn: 47589
2008-02-26 02:42:37 +00:00
Evan Cheng
504c645b3e
Rematerialization logic was overly conservative when it comes to loads from fixed stack slots.
...
llvm-svn: 47529
2008-02-23 03:38:34 +00:00
Evan Cheng
405827dc26
Update test.
...
llvm-svn: 47527
2008-02-23 02:57:25 +00:00
Evan Cheng
174ef9a0bb
Remat of pic loads is now on by default.
...
llvm-svn: 47525
2008-02-23 02:08:30 +00:00
Evan Cheng
26e5d01ca1
Really. Why doesn't every arch support MMX?
...
llvm-svn: 47513
2008-02-23 00:56:14 +00:00