Chris Lattner
b6a4ba808b
Change the rename pass to be "tail recursive", only adding N-1 successors
...
to the worklist, and handling the last one with a 'tail call'. This speeds
up PR1432 from 2.0578s to 2.0012s (2.8%)
llvm-svn: 40822
2007-08-04 20:40:27 +00:00
Chris Lattner
840259c8d3
cache computation of #preds for a BB. This speeds up
...
mem2reg from 2.0742->2.0522s on PR1432.
llvm-svn: 40821
2007-08-04 20:24:50 +00:00
Chris Lattner
050bac4bed
reserve operand space for phi nodes when we insert them.
...
llvm-svn: 40820
2007-08-04 20:14:34 +00:00
Chris Lattner
9318785df5
use continue to avoid nesting, no functionality change.
...
llvm-svn: 40819
2007-08-04 20:07:06 +00:00
Chris Lattner
6b04ecbaf9
Promoting allocas with the 'single store' fastpath is
...
faster than with the 'local to a block' fastpath. This speeds
up PR1432 from 2.1232 to 2.0686s (2.6%)
llvm-svn: 40818
2007-08-04 20:03:23 +00:00
Chris Lattner
4a930f9444
When PromoteLocallyUsedAllocas promoted allocas, it didn't remember
...
to increment NumLocalPromoted, and didn't actually delete the
dead alloca, leading to an extra iteration of mem2reg.
llvm-svn: 40817
2007-08-04 20:01:43 +00:00
Chris Lattner
63c039780c
std::map -> DenseMap
...
llvm-svn: 40816
2007-08-04 19:52:20 +00:00
Nick Lewycky
20f0811fc0
Clean up comments, fix up some confusing code logic.
...
Predsimplify fails llvm-gcc bootstrap.
llvm-svn: 40815
2007-08-04 18:45:32 +00:00
Chris Lattner
7d382f7680
fix a logic bug where we wouldn't promote single store allocas if the
...
stored value was a non-instruction value. Doh.
This increase the # single store allocas from 8982 to 9026, and
speeds up mem2reg on the testcase in PR1432 from 2.17 to 2.13s.
llvm-svn: 40813
2007-08-04 02:45:02 +00:00
Chris Lattner
1b215f0661
When we do the single-store optimization, delete both the store
...
and the alloca so they don't get reprocessed.
This speeds up PR1432 from 2.20s to 2.17s.
llvm-svn: 40812
2007-08-04 02:38:38 +00:00
Chris Lattner
862f125457
Three improvements:
...
1. Check for revisiting a block before checking domination, which is faster.
2. If the stored value isn't an instruction, we don't have to check for domination.
3. If we have a value used in the same block more than once, make sure to remove the
block from the UsingBlocks vector. Not doing so forces us to go through the slow
path for the alloca.
The combination of these improvements increases the number of allocas on the fastpath
from 8935 to 8982 on PR1432. This speeds it up from 2.90s to 2.20s (31%)
llvm-svn: 40811
2007-08-04 02:32:22 +00:00
Chris Lattner
ae1e00eb36
switch from using a std::set to using a SmallPtrSet. This speeds up the
...
testcase in PR1432 from 6.33s to 2.90s (2.22x)
llvm-svn: 40810
2007-08-04 02:21:22 +00:00
Chris Lattner
9181801bb7
In mem2reg, when handling the single-store case, make sure to remove
...
a using block from the list if we handle it. Not doing this caused us
to not be able to promote (with the fast path) allocas which have uses (whoops).
This increases the # allocas hitting this fastpath from 4042 to 8935 on the
testcase in PR1432, speeding up mem2reg by 2.6x
llvm-svn: 40809
2007-08-04 02:15:24 +00:00
Chandler Carruth
450f95c857
Regenerating.
...
llvm-svn: 40808
2007-08-04 01:56:21 +00:00
Chandler Carruth
7132e00de7
This is the patch to provide clean intrinsic function overloading support in LLVM. It cleans up the intrinsic definitions and generally smooths the process for more complicated intrinsic writing. It will be used by the upcoming atomic intrinsics as well as vector and float intrinsics in the future.
...
This also changes the syntax for llvm.bswap, llvm.part.set, llvm.part.select, and llvm.ct* intrinsics. They are automatically upgraded by both the LLVM ASM reader and the bitcode reader. The test cases have been updated, with special tests added to ensure the automatic upgrading is supported.
llvm-svn: 40807
2007-08-04 01:51:18 +00:00
Chris Lattner
886a41a007
split rewriting of single-store allocas into its own
...
method.
llvm-svn: 40806
2007-08-04 01:47:41 +00:00
Chris Lattner
3cede09c67
refactor some code to shrink PromoteMem2Reg::run a bit
...
llvm-svn: 40805
2007-08-04 01:41:18 +00:00
Chris Lattner
d524537fe9
add a typedef, no other change.
...
llvm-svn: 40804
2007-08-04 01:19:38 +00:00
Chris Lattner
df138be527
avoid an unneeded vector copy. This speeds up mem2reg on the testcase
...
in PR1432 by 6%
llvm-svn: 40803
2007-08-04 01:07:49 +00:00
Chris Lattner
fd838f0770
make RenamePassWorkList a local var instead of an ivar.
...
llvm-svn: 40802
2007-08-04 01:04:40 +00:00
Chris Lattner
81a9688e93
Implement codegen for __builtin_choose_expr. For example:
...
struct X { int A; };
void foo() {
struct X s;
int i;
i = __builtin_choose_expr(0, s, i);
}
compiles to:
%tmp = load i32* %i ; <i32> [#uses=1]
store i32 %tmp, i32* %i
wow :)
llvm-svn: 40801
2007-08-04 00:20:15 +00:00
Chris Lattner
374b06a080
the sse intrinsics are missing, leading to errors.
...
llvm-svn: 40800
2007-08-04 00:19:10 +00:00
Chris Lattner
2f48d5e2ed
fix hang in testsuite
...
llvm-svn: 40799
2007-08-04 00:18:28 +00:00
Chris Lattner
c9a15cabdc
fix constness issues.
...
llvm-svn: 40798
2007-08-04 00:14:36 +00:00
Dale Johannesen
b0c7585f2d
Make x86 long double alignment 32 for everything but
...
Darwin (which makes size within a struct==96)
llvm-svn: 40796
2007-08-03 22:46:15 +00:00
Steve Naroff
0104731e62
Restrict vector component access (using "." and "[]") to variables.
...
Chris suggested this, since it simplifies the code generator.
If this features is needed (and we don't think it is), we can revisit.
The following test case now produces an error.
[dylan:~/llvm/tools/clang] admin% cat t.c
typedef __attribute__(( ocu_vector_type(4) )) float float4;
static void test() {
float4 vec4;
vec4.rg.g;
vec4.rg[1];
}
[dylan:~/llvm/tools/clang] admin% ../../Debug/bin/clang t.c
t.c:8:12: error: vector component access limited to variables
vec4.rg.g;
^~
t.c:9:12: error: vector component access limited to variables
vec4.rg[1];
^~~
2 diagnostics generated.
llvm-svn: 40795
2007-08-03 22:40:33 +00:00
Steve Naroff
9efdabc565
Implement __builtin_choose_expr.
...
llvm-svn: 40794
2007-08-03 21:21:27 +00:00
Dale Johannesen
65c8a9ee9a
long double patch 3 of N. Add to MVT.
...
llvm-svn: 40793
2007-08-03 20:51:37 +00:00
Dale Johannesen
c5283ecd6f
long double patch 2 of N. Handle it in TargetData.
...
(I've tried to get the info right for all targets,
but I'm not expert on all of them - check yours.)
llvm-svn: 40792
2007-08-03 20:20:50 +00:00
Owen Anderson
2d19aae4ca
Fix a subtle miscompilation. This allows 197.parser to be compiled correctly.
...
llvm-svn: 40791
2007-08-03 19:59:35 +00:00
Steve Naroff
7d451d614c
Add a test case to validate code gen for typeof/builtin_types_compatible.
...
This test case currently generates the following unexpected warnings (when compared with gcc).
[dylan:clang/test/Parser] admin% ../../../../Debug/bin/clang -parse-ast-check builtin_types_compatible.c
Warnings seen but not expected:
Line 28: expression result unused
Line 29: expression result unused
Line 30: expression result unused
Line 31: expression result unused
Line 32: expression result unused
Line 33: expression result unused
llvm-svn: 40789
2007-08-03 18:38:22 +00:00
Chris Lattner
4048005ad8
implement codegen support for __builtin_types_compatible_p
...
llvm-svn: 40788
2007-08-03 17:51:03 +00:00
Chris Lattner
477ac778ed
fix a buggy comment I added
...
llvm-svn: 40787
2007-08-03 17:47:51 +00:00
Chris Lattner
d268a7a268
Rename AddrLabel and OCUVectorComponent -> AddrLabelExpr and OCUVectorElementExpr respectively. This is for consistency with other expr nodes end with *Expr.
...
llvm-svn: 40785
2007-08-03 17:31:20 +00:00
Chris Lattner
b20b94de3a
testcase for vector element access stuff.
...
llvm-svn: 40783
2007-08-03 16:42:43 +00:00
Chris Lattner
3a44aa7461
implement codegen for multidest ocuvector expressions, like:
...
vec2.yx = vec2; // reverse
llvm-svn: 40782
2007-08-03 16:37:04 +00:00
Chris Lattner
41d480e4d8
add codegen support for storing into a single-element ocu lvalue, such as:
...
vec2.x = f;
llvm-svn: 40781
2007-08-03 16:28:33 +00:00
Chris Lattner
40ff701674
refactor handling of ocuvector lvalue->rvalue codegen into its own method.
...
llvm-svn: 40780
2007-08-03 16:18:34 +00:00
Chris Lattner
fb837dccac
In the common case where we are shuffling a vector, emit an
...
llvm vector shuffle instead of a bunch of insert/extract operations.
For: vec4 = vec4.yyyy; // splat
Emit:
%tmp1 = shufflevector <4 x float> %tmp, <4 x float> undef, <4 x i32> < i32 1, i32 1, i32 1, i32 1 >
instead of:
%tmp1 = extractelement <4 x float> %tmp, i32 1
%tmp2 = insertelement <4 x float> undef, float %tmp1, i32 0
%tmp3 = extractelement <4 x float> %tmp, i32 1
%tmp4 = insertelement <4 x float> %tmp2, float %tmp3, i32 1
%tmp5 = extractelement <4 x float> %tmp, i32 1
%tmp6 = insertelement <4 x float> %tmp4, float %tmp5, i32 2
%tmp7 = extractelement <4 x float> %tmp, i32 1
%tmp8 = insertelement <4 x float> %tmp6, float %tmp7, i32 3
llvm-svn: 40779
2007-08-03 16:09:33 +00:00
Chris Lattner
177bd450e0
add OCUVectorComponent::getNumComponents()
...
llvm-svn: 40778
2007-08-03 16:00:20 +00:00
Chris Lattner
a1036f9155
Add support for scalar-returning element accesses like V.x
...
llvm-svn: 40777
2007-08-03 15:52:31 +00:00
Owen Anderson
774761c503
Fix a subtle iterator invalidation bug in a recursive algorithm.
...
llvm-svn: 40776
2007-08-03 11:03:26 +00:00
Reid Spencer
d8a382f66d
Prepare for "core" website.
...
llvm-svn: 40775
2007-08-03 05:43:35 +00:00
Dale Johannesen
ff4c3be741
Long double, part 1 of N. Support in IR.
...
llvm-svn: 40774
2007-08-03 01:03:46 +00:00
Chris Lattner
99fbf13dc3
add an observation
...
llvm-svn: 40772
2007-08-03 00:17:42 +00:00
Chris Lattner
73ab9b3c14
implement lvalue to rvalue conversion for ocuvector components. We can now compile stuff
...
like this:
typedef __attribute__(( ocu_vector_type(4) )) float float4;
float4 test1(float4 V) {
return V.wzyx+V;
}
to:
_test1:
pshufd $27, %xmm0, %xmm1
addps %xmm0, %xmm1
movaps %xmm1, %xmm0
ret
and:
_test1:
mfspr r2, 256
oris r3, r2, 4096
mtspr 256, r3
li r3, lo16(LCPI1_0)
lis r4, ha16(LCPI1_0)
lvx v3, r4, r3
vperm v3, v2, v2, v3
vaddfp v2, v3, v2
mtspr 256, r2
blr
llvm-svn: 40771
2007-08-03 00:16:29 +00:00
Chris Lattner
9e751cae27
add support for codegen of an OCUVectorComponent as an lvalue.
...
We can now codegen:
vec4.xy;
as nothing!
llvm-svn: 40769
2007-08-02 23:37:31 +00:00
Chris Lattner
885b4959b6
Add support for encoding a OCUVectorComponent into a single integer.
...
llvm-svn: 40768
2007-08-02 23:36:59 +00:00
Chris Lattner
30709dc432
oops, this is the real fix.
...
llvm-svn: 40766
2007-08-02 22:41:43 +00:00
Chris Lattner
7aa350019a
update test
...
llvm-svn: 40765
2007-08-02 22:36:03 +00:00