Chris Lattner
2237973438
Implement a significant optimization for inline asm:
When choosing between constraints with multiple options,
like "ir", test to see if we can use the 'i' constraint and
go with that if possible. This produces more optimal ASM in
all cases (sparing a register and an instruction to load it),
and fixes inline asm like this:
void test () {
asm volatile (" %c0 %1 " : : "imr" (42), "imr"(14));
}
Previously we would dump "42" into a memory location (which
is ok for the 'm' constraint) which would cause a problem
because the 'c' modifier is not valid on memory operands.
Isn't it great how inline asm turns 'missed optimization'
into 'compile failed'??
Incidentally, this was the todo in
PowerPC/2007-04-24-InlineAsm-I-Modifier.ll
Please do NOT pull this into Tak.
llvm-svn: 50315
2008-04-27 00:37:18 +00:00
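As a rough illustration of r50315 above, the sketch below passes a constant through a two-alternative "ir" constraint. The names add_42 and add_42_self_check, and the portable fallback branch, are assumptions made for this example and do not come from the commit. When the immediate alternative is chosen, the constant can be encoded in the instruction itself instead of being loaded into a register first.

```c
#include <assert.h>

/* Hypothetical example, not code from the commit: adds 42 to x. */
int add_42(int x) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "ir" offers the compiler an immediate or a register for 42.
       With the 'i' alternative this assembles as "addl $42, <reg>". */
    __asm__("addl %1, %0" : "+r"(x) : "ir"(42) : "cc");
#else
    x += 42; /* fallback so the sketch also builds on other targets */
#endif
    return x;
}

/* Small self-check; call it from main() to exercise the sketch. */
void add_42_self_check(void) {
    assert(add_42(1) == 43);
}
```

Inspecting the generated assembly (for example with -S) is one way to see which alternative the compiler actually picked.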
Chris Lattner
67ca6f6347
When SRoA'ing a global variable, make sure the new globals get the
appropriate alignment. This fixes a miscompilation of 252.eon on
x86-64 (rdar://5891920).
Bill, please pull this into Tak.
llvm-svn: 50308
2008-04-26 07:40:11 +00:00
Nate Begeman
98f0898e42
Feedback from chris
llvm-svn: 50305
2008-04-25 21:47:35 +00:00
Nate Begeman
f10b493fc0
Add a testcase for the recent "handle variable vector insert elt in mem" patch
llvm-svn: 50303
2008-04-25 21:26:59 +00:00
Evan Cheng
402572a149
Update tests.
llvm-svn: 50293
2008-04-25 20:13:47 +00:00
Evan Cheng
ccde6dd016
Special handling for MMX values being passed in either GPR64 or lower 64-bits of XMM registers.
llvm-svn: 50289
2008-04-25 19:11:04 +00:00
Dan Gohman
ca95a5f49f
Remove the code from CodeGenPrepare that moved getresult instructions
to the block that defines their operands. This doesn't work in the
case that the operand is an invoke, because invoke is a terminator
and must be the last instruction in a block.
Replace it with support in SelectionDAGISel for copying struct values
into sequences of virtual registers.
llvm-svn: 50279
2008-04-25 18:27:55 +00:00
Chris Lattner
b58a365cb6
new testcase
llvm-svn: 50274
2008-04-25 18:11:06 +00:00
Anton Korobeynikov
f18ec8b160
Update test
llvm-svn: 50272
2008-04-25 17:54:21 +00:00
Nick Lewycky
4d43d3c72c
Remove 'unwinds to' support from mainline. This patch undoes r47802 r47989
r48047 r48084 r48085 r48086 r48088 r48096 r48099 r48109 and r48123.
llvm-svn: 50265
2008-04-25 16:53:59 +00:00
Evan Cheng
df38b35a1e
MMX argument passing fixes:
On Darwin / Linux x86-32, v8i8, v4i16, v2i32 values are passed in MM[0-2].
On Darwin / Linux x86-32, v1i64 values are passed in memory.
On Darwin x86-64, v8i8, v4i16, v2i32 values are passed in XMM[0-7].
On Darwin x86-64, v1i64 values are passed in 64-bit GPRs.
llvm-svn: 50257
2008-04-25 07:56:45 +00:00
Chris Lattner
741c7a3b49
Loosen up an assertion to allow intrinsics. I really have no
idea what this code (findNonImmUse) does, so I'm only guessing
that this is the right thing. It would be really really nice
if this had comments and perhaps switched to SmallPtrSet
(hint hint) :)
This fixes rdar://5886601, a crash on gcc.target/i386/sse4_1-pblendw.c
llvm-svn: 50252
2008-04-25 05:13:01 +00:00
Chris Lattner
f7de528463
Don't infinitely thread branches when a threaded edge
goes back to the block, e.g.:
Threading edge through bool from 'bb37.us.thread3829' to 'bb37.us' with cost: 1, across block:
bb37.us: ; preds = %bb37.us.thread3829, %bb37.us, %bb33
%D1361.1.us = phi i32 [ %tmp36, %bb33 ], [ %D1361.1.us, %bb37.us ], [ 0, %bb37.us.thread3829 ] ; <i32> [#uses=2]
%tmp39.us = icmp eq i32 %D1361.1.us, 0 ; <i1> [#uses=1]
br i1 %tmp39.us, label %bb37.us, label %bb42.us
llvm-svn: 50251
2008-04-25 04:12:29 +00:00
Evan Cheng
9165e165dc
Fix bug in x86 memcpy / memset lowering. If there are trailing bytes not handled by rep instructions, a new memcpy / memset is introduced for them. However, since source / destination addresses are already adjusted, their offsets should be zero.
llvm-svn: 50239
2008-04-25 00:26:43 +00:00
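The offset issue fixed in r50239 above can be pictured with a plain C sketch. This is not the x86 lowering code: copy_bulk_then_tail, copy_is_correct and the 4-byte WORD size are hypothetical names and values chosen for illustration. The bulk loop stands in for the rep instructions, and the tail copy starts at offset zero from the already adjusted pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { WORD = 4 }; /* assumed chunk size for the bulk part */

/* Copies n bytes: whole WORD-sized chunks first, then the trailing bytes. */
void copy_bulk_then_tail(unsigned char *dst, const unsigned char *src, size_t n) {
    assert(dst != NULL && src != NULL);
    size_t tail = n % WORD;
    size_t bulk = n - tail;

    /* Bulk part, standing in for the rep-based copy. */
    for (size_t i = 0; i < bulk; i += WORD)
        memcpy(dst + i, src + i, WORD);

    /* The addresses are adjusted past the bulk part here ... */
    unsigned char *tail_dst = dst + bulk;
    const unsigned char *tail_src = src + bulk;

    /* ... so the trailing copy must use offset 0. Adding bulk a second
       time would be the kind of double adjustment the commit describes. */
    memcpy(tail_dst + 0, tail_src + 0, tail);
}

/* Returns nonzero if an n-byte copy is exact and writes nothing past n. */
int copy_is_correct(size_t n) {
    unsigned char src[32], dst[32];
    assert(n < sizeof src);
    for (size_t i = 0; i < sizeof src; ++i) {
        src[i] = (unsigned char)(i + 1);
        dst[i] = 0;
    }
    copy_bulk_then_tail(dst, src, n);
    return memcmp(dst, src, n) == 0 && dst[n] == 0;
}
```

Sizes that are not a multiple of WORD, such as 7 or 13, are the ones that exercise the trailing copy.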
Evan Cheng
a42d24003d
New test.
llvm-svn: 50229
2008-04-24 20:01:58 +00:00
Devang Patel
f7c3979bb0
Add EXTRA_OPTIONS on the llvmgxx command line.
llvm-svn: 50217
2008-04-24 17:59:03 +00:00
Devang Patel
1a5cfb05b5
Add EXTRA_OPTIONS on the llvmgcc command line.
llvm-svn: 50216
2008-04-24 17:54:25 +00:00
Chris Lattner
86bbf338e5
Split some code out of the main SimplifyCFG loop into its own function.
Fix said code to handle merging return instructions together correctly
when handling multiple return values.
llvm-svn: 50199
2008-04-24 00:01:19 +00:00
Anton Korobeynikov
3aec21fa0e
Fix tests due to llvm2cpp move to llc target
llvm-svn: 50191
2008-04-23 22:41:53 +00:00
Dan Gohman
b418aafabf
Add support to codegen for getresult instructions with undef operands.
llvm-svn: 50180
2008-04-23 20:21:29 +00:00
Anton Korobeynikov
dd4ef2e30c
Disable stack realignment for these tests
llvm-svn: 50172
2008-04-23 18:25:44 +00:00
Anton Korobeynikov
c3ada5c9c4
Fix test because ABI stack alignment dropped to 'normal' value
llvm-svn: 50171
2008-04-23 18:25:16 +00:00
Anton Korobeynikov
955a8a9101
Fix test, instruction count is valid only if stack is not realigned
llvm-svn: 50170
2008-04-23 18:24:48 +00:00
Chris Lattner
5a58a4dc6d
Rewrite multiple return value handling in SCCP. Before, the -sccp pass
would turn every getresult instruction into undef. This helps with
rdar://5778210
llvm-svn: 50140
2008-04-23 05:38:20 +00:00
Chris Lattner
14f41bfc49
remove this testcase. It isn't testing loop rotate, it is testing all
of -std-compile-opts and is now failing because other passes are generating
IR that looks different to input of loop rotate. Devang, please
introduce a testcase that only runs loop rotate.
llvm-svn: 50136
2008-04-23 05:36:04 +00:00
Chris Lattner
f9a4e4d723
returning an empty multiple return list is not valid.
llvm-svn: 50135
2008-04-23 05:29:14 +00:00
Chris Lattner
3376d6d824
make this test more interesting.
llvm-svn: 50128
2008-04-23 03:49:32 +00:00
Chris Lattner
2161d6c075
distill down the essence of this test.
llvm-svn: 50125
2008-04-23 03:03:42 +00:00
Dale Johannesen
c4d3c1cbe0
new test
llvm-svn: 50123
2008-04-23 01:22:22 +00:00
Evan Cheng
1c89ca7295
Don't do: "(X & 4) >> 1 == 2 --> (X & 4) == 4" if there is more than one use of the shift result.
llvm-svn: 50118
2008-04-23 00:38:06 +00:00
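For reference, the two forms named in r50118 above are equivalent as comparisons, which the following sketch checks by brute force. The function names are hypothetical. A plausible reading of the restriction is that when the shift result has other users the shift has to be kept anyway, so the fold would not remove it; the commit message itself does not spell out the reasoning.

```c
#include <assert.h>
#include <stdbool.h>

/* Original form: mask, shift, then compare. */
bool shifted_form(unsigned x) { return ((x & 4u) >> 1) == 2u; }

/* Folded form: compare the masked value directly. */
bool folded_form(unsigned x) { return (x & 4u) == 4u; }

/* Asserts that both forms agree for every x in [0, limit). */
void check_fold_equivalence(unsigned limit) {
    for (unsigned x = 0; x < limit; ++x)
        assert(shifted_form(x) == folded_form(x));
}
```

Calling check_fold_equivalence(1024) covers every combination of the low ten bits, which is more than enough since only bit 2 matters.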
Chris Lattner
37e9c187b0
Start doing the significantly useful part of jump threading: handle cases
where a comparison has a phi input and that phi is a constant. For example,
stuff like:
Threading edge through bool from 'bb2149' to 'bb2231' with cost: 1, across block:
bb2237: ; preds = %bb2231, %bb2149
%tmp2328.rle = phi i32 [ %tmp2232, %bb2231 ], [ %tmp2232439, %bb2149 ] ; <i32> [#uses=2]
%done.0 = phi i32 [ %done.2, %bb2231 ], [ 0, %bb2149 ] ; <i32> [#uses=1]
%tmp2239 = icmp eq i32 %done.0, 0 ; <i1> [#uses=1]
br i1 %tmp2239, label %bb2231, label %bb2327
or
bb38.i298: ; preds = %bb33.i295, %bb1693
%tmp39.i296.rle = phi %struct.ibox* [ null, %bb1693 ], [ %tmp39.i296.rle1109, %bb33.i295 ] ; <%struct.ibox*> [#uses=2]
%minspan.1.i291.reg2mem.1 = phi i32 [ 32000, %bb1693 ], [ %minspan.0.i288, %bb33.i295 ] ; <i32> [#uses=1]
%tmp40.i297 = icmp eq %struct.ibox* %tmp39.i296.rle, null ; <i1> [#uses=1]
br i1 %tmp40.i297, label %implfeeds.exit311, label %bb43.i301
This triggers thousands of times in spec.
llvm-svn: 50110
2008-04-22 21:40:39 +00:00
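The first block quoted in r50110 above can be modelled in C roughly as follows. The function names, the boolean that stands in for "which predecessor did we come from", and the use of block numbers as return values are all assumptions for illustration; this is not the pass itself. On the edge from bb2149 the phi is the constant 0, so the comparison is known and the branch can go straight to its target.

```c
#include <assert.h>
#include <stdbool.h>

/* Before threading: done_0 models %done.0, a phi that is 0 on the edge
   from bb2149 and done_2 on the edge from bb2231. */
int next_block_before(bool from_bb2149, int done_2) {
    int done_0 = from_bb2149 ? 0 : done_2;
    return done_0 == 0 ? 2231 : 2327;
}

/* After threading: the edge from bb2149 skips the compare entirely. */
int next_block_after(bool from_bb2149, int done_2) {
    if (from_bb2149)
        return 2231;                     /* phi input is 0, so 0 == 0 */
    return done_2 == 0 ? 2231 : 2327;    /* other edge still compares */
}

/* Asserts that both versions pick the same successor. */
void check_phi_threading(void) {
    for (int done_2 = -2; done_2 <= 2; ++done_2) {
        assert(next_block_before(true, done_2) == next_block_after(true, done_2));
        assert(next_block_before(false, done_2) == next_block_after(false, done_2));
    }
}
```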
Chris Lattner
d5425e8f8d
Dig through multiple levels of AND to thread jumps if needed.
llvm-svn: 50106
2008-04-22 20:46:09 +00:00
Chris Lattner
3df4c15dc7
Teach jump threading to thread through blocks like:
br (and X, phi(Y, Z, false)), label L1, label L2
This triggers once on 252.eon and 6 times on 176.gcc. Blocks
in question often look like this:
bb262: ; preds = %bb261, %bb248
%iftmp.251.0 = phi i1 [ true, %bb261 ], [ false, %bb248 ] ; <i1> [#uses=4]
%tmp270 = icmp eq %struct.rtx_def* %tmp.0.i, null ; <i1> [#uses=1]
%bothcond = or i1 %iftmp.251.0, %tmp270 ; <i1> [#uses=1]
br i1 %bothcond, label %bb288, label %bb273
In this case, it is clear that it doesn't matter if tmp.0.i is null when coming from bb261. When coming from bb248, it is all that matters.
Another random example:
check_asm_operands.exit: ; preds = %check_asm_operands.exit.thr_comm, %bb30.i, %bb12.i, %bb6.i413
%tmp.0.i420 = phi i1 [ true, %bb6.i413 ], [ true, %bb12.i ], [ true, %bb30.i ], [ false, %check_asm_operands.exit.thr_comm ] ; <i1> [#uses=1]
call void @llvm.stackrestore( i8* %savedstack ) nounwind
%tmp4389 = icmp eq i32 %added_sets_1.0, 0 ; <i1> [#uses=1]
%tmp4394 = icmp eq i32 %added_sets_2.0, 0 ; <i1> [#uses=1]
%bothcond80 = and i1 %tmp4389, %tmp4394 ; <i1> [#uses=1]
%bothcond81 = and i1 %bothcond80, %tmp.0.i420 ; <i1> [#uses=1]
br i1 %bothcond81, label %bb4398, label %bb4397
Here is the case from 252.eon:
bb290.i.i: ; preds = %bb23.i57.i.i, %bb8.i39.i.i, %bb100.i.i, %bb100.i.i, %bb85.i.i110
%myEOF.1.i.i = phi i1 [ true, %bb100.i.i ], [ true, %bb100.i.i ], [ true, %bb85.i.i110 ], [ true, %bb8.i39.i.i ], [ false, %bb23.i57.i.i ] ; <i1> [#uses=2]
%i.4.i.i = phi i32 [ %i.1.i.i, %bb85.i.i110 ], [ %i.0.i.i, %bb100.i.i ], [ %i.0.i.i, %bb100.i.i ], [ %i.3.i.i, %bb8.i39.i.i ], [ %i.3.i.i, %bb23.i57.i.i ] ; <i32> [#uses=3]
%tmp292.i.i = load i8* %tmp16.i.i100, align 1 ; <i8> [#uses=1]
%tmp293.not.i.i = icmp ne i8 %tmp292.i.i, 0 ; <i1> [#uses=1]
%bothcond.i.i = and i1 %tmp293.not.i.i, %myEOF.1.i.i ; <i1> [#uses=1]
br i1 %bothcond.i.i, label %bb202.i.i, label %bb301.i.i
Factoring out 3 common predecessors.
On the path from any blocks other than bb23.i57.i.i, the load and compare
are dead.
llvm-svn: 50096
2008-04-22 07:05:46 +00:00
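The bb262 block quoted in r50096 above can be sketched in C like this. The enum, the function names and the boolean parameters are hypothetical stand-ins for the predecessor edge and for the null test on %tmp.0.i; the sketch only shows why the rewrite preserves behaviour and is not the pass implementation.

```c
#include <assert.h>
#include <stdbool.h>

enum Target { BB273 = 273, BB288 = 288 };

/* Before: iftmp models the phi (true from bb261, false from bb248),
   and the branch tests "iftmp or ptr_is_null". */
enum Target branch_before(bool from_bb261, bool ptr_is_null) {
    bool iftmp = from_bb261 ? true : false;
    bool bothcond = iftmp | ptr_is_null; /* non-short-circuit, like 'or i1' */
    return bothcond ? BB288 : BB273;
}

/* After: the edge from bb261 goes straight to bb288, so the null test
   only matters when coming from bb248. */
enum Target branch_after(bool from_bb261, bool ptr_is_null) {
    if (from_bb261)
        return BB288;
    return ptr_is_null ? BB288 : BB273;
}

/* Asserts equivalence over all four input combinations. */
void check_or_threading(void) {
    for (int a = 0; a < 2; ++a)
        for (int b = 0; b < 2; ++b)
            assert(branch_before(a != 0, b != 0) == branch_after(a != 0, b != 0));
}
```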
Chris Lattner
3cc28ce1ed
add a basic testcase.
llvm-svn: 50093
2008-04-22 06:35:14 +00:00
Nick Lewycky
cd92245311
Start removing 'unwinds to' support from mainline in preparation for 2.3.
llvm-svn: 50086
2008-04-22 05:16:02 +00:00
Chris Lattner
c3a439351c
optimize "p != gep p, ..." better. This allows us to compile
getelementptr-seteq.ll into:
define i1 @test(i64 %X, %S* %P) {
%C = icmp eq i64 %X, -1 ; <i1> [#uses=1]
ret i1 %C
}
instead of:
define i1 @test(i64 %X, %S* %P) {
%A.idx.mask = and i64 %X, 4611686018427387903 ; <i64> [#uses=1]
%C = icmp eq i64 %A.idx.mask, 4611686018427387903 ; <i1> [#uses=1]
ret i1 %C
}
And fixes the second half of PR2235. This speeds up the insertion sort
case by 45%, from 1.12s to 0.77s. In practice, this will significantly
speed up for loops structured like:
for (double *P = Base + N; P != Base; --P)
...
Which happens frequently for C++ iterators.
llvm-svn: 50079
2008-04-22 02:53:33 +00:00
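A complete version of the loop shape mentioned in r50079 above might look like the sketch below. The function name sum_backwards and the choice to sum the elements are assumptions for illustration; the commit only shows the loop header. The loop condition is exactly a "p != gep p, ..." style comparison between a moving pointer and its base.

```c
#include <assert.h>
#include <stddef.h>

/* Walks the array from the end back to base, comparing pointers directly. */
double sum_backwards(const double *base, size_t n) {
    assert(base != NULL);
    double total = 0.0;
    for (const double *p = base + n; p != base; --p)
        total += p[-1]; /* p is one past the element being visited */
    return total;
}
```

For an array holding 1.0, 2.0 and 3.5 the function visits 3.5 first and ends at base after three iterations.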
Dan Gohman
f166d2d0d6
Implement an x86-64 ABI detail of passing structs by hidden first
argument. The x86-64 ABI requires the incoming value of %rdi to
be copied to %rax on exit from a function that is returning a
large C struct.
Also, add a README-X86-64 entry detailing the missed optimization
opportunity and proposing an alternative approach.
llvm-svn: 50075
2008-04-21 23:59:07 +00:00
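At the source level, the ABI detail in r50075 above corresponds roughly to the following model. struct Big and the function names are hypothetical, and the "lowered" function is only a C-level picture of what the compiler does: the caller passes a hidden pointer to the result slot (in %rdi on x86-64), and the callee hands that same pointer back (in %rax).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to be large enough that it is returned in memory. */
struct Big { long a, b, c, d, e; };

/* What the programmer writes. */
struct Big make_big(long seed) {
    struct Big r = { seed, seed + 1, seed + 2, seed + 3, seed + 4 };
    return r;
}

/* C-level model of the lowered form: fill the caller's slot and return
   the incoming hidden pointer. */
struct Big *make_big_lowered(struct Big *sret, long seed) {
    assert(sret != NULL);
    sret->a = seed;
    sret->b = seed + 1;
    sret->c = seed + 2;
    sret->d = seed + 3;
    sret->e = seed + 4;
    return sret;
}

/* Returns nonzero if the model returns its hidden pointer and produces
   the same field values as the source-level function. */
int lowered_matches_source(long seed) {
    struct Big direct = make_big(seed);
    struct Big slot;
    struct Big *ret = make_big_lowered(&slot, seed);
    return ret == &slot && slot.a == direct.a && slot.b == direct.b &&
           slot.c == direct.c && slot.d == direct.d && slot.e == direct.e;
}
```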
Duncan Sands
db70198618
Make these structs larger to ensure that they
are returned by struct return.
llvm-svn: 50038
2008-04-21 08:17:05 +00:00
Duncan Sands
568e5c2461
Make the struct bigger, to ensure it is returned
by struct return.
llvm-svn: 50037
2008-04-21 08:12:03 +00:00
Owen Anderson
6a7355caa2
Refactor memcpyopt based on Chris' suggestions. Consolidate several functions
and simplify code that was fallout from the separation of memcpyopt and gvn.
llvm-svn: 50034
2008-04-21 07:45:10 +00:00
Chris Lattner
470ab00c76
A better fix for my previous patch, MOVZQI2PQIrr just requires SSE2.
llvm-svn: 49986
2008-04-20 05:52:46 +00:00
Chris Lattner
a124f1e219
Not all x86-64 machines have sse3 apparently.
llvm-svn: 49985
2008-04-20 05:47:56 +00:00
Chris Lattner
b839c05a05
rename *.llx -> *.ll, last batch.
llvm-svn: 49971
2008-04-19 22:32:52 +00:00
Chris Lattner
50fb77f829
rename *.llx -> *.ll
llvm-svn: 49970
2008-04-19 22:29:10 +00:00
Chris Lattner
fe48fbc1f1
rename *.llx -> *.ll
llvm-svn: 49969
2008-04-19 22:26:29 +00:00
Chris Lattner
bc26e1bb8a
Implement PR2206.
llvm-svn: 49967
2008-04-19 22:17:26 +00:00
Chris Lattner
334d33cad1
refactor handling of symbolic constant folding, picking up
a few new cases (see Integer/a1.ll), but not anything that
would happen in practice.
llvm-svn: 49965
2008-04-19 21:58:19 +00:00
Evan Cheng
5102bd9359
64-bit atomic operations.
llvm-svn: 49949
2008-04-19 02:30:38 +00:00
Dan Gohman
41eb949aaf
Teach llvm-as to accept function types with multiple return types.
llvm-svn: 49945
2008-04-19 00:24:39 +00:00
Evan Cheng
7e4a55bc58
Be more careful with insert_subreg and extract_subreg where either source or destination operand has already been coalesced with another register that's defined by an insert_subreg or extract_subreg.
llvm-svn: 49843
2008-04-17 07:58:04 +00:00