Commit Graph

18308 Commits

Author SHA1 Message Date
Reid Spencer 94bae69764 For PR1328:
Use the new parameter on Function::getIntrinsicID to identify cases where
a function is being called with an "llvm." name but it isn't actually an
intrinsic. In such cases generate an error.

llvm-svn: 36121
2007-04-16 06:55:42 +00:00
Reid Spencer 9c2eec377e For PR1328:
Don't assert every time an intrinsic name isn't recognized. Instead, make
the assert optional when calling getIntrinsicID(). This allows the assembler
to handle invalid intrinsic names gracefully.

llvm-svn: 36120
2007-04-16 06:54:34 +00:00
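
A minimal C++ sketch of the check these two PR1328 changes enable. The boolean
parameter name below (noAssert) and the reportError helper are hypothetical;
the commits only say that getIntrinsicID gained a parameter making its assert
optional.

#include "llvm/Function.h"      // 2007-era header path (llvm/IR/Function.h in modern trees)
#include "llvm/Intrinsics.h"

void reportError(const char *Msg);   // stand-in for the assembler's error path

void checkLLVMPrefixedCall(llvm::Function *Callee) {
  // getIntrinsicID() used to assert on unrecognized "llvm." names; with the
  // optional-assert parameter it can return Intrinsic::not_intrinsic instead,
  // letting the caller diagnose the bogus name itself.
  if (Callee->getName().substr(0, 5) == "llvm." &&
      Callee->getIntrinsicID(/*noAssert=*/true) == llvm::Intrinsic::not_intrinsic)
    reportError("function uses the 'llvm.' name prefix but is not an intrinsic");
}
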
Reid Spencer a3cfb8a683 Revert last patch. It was already fixed.
llvm-svn: 36102
2007-04-16 02:24:41 +00:00
Reid Spencer 8be22e4e04 For PR1336:
Fix a div-by-zero bug noticed by APInt. This fixes:
test/Transforms/IndVarsSimplify/exit_value_tests.llx

llvm-svn: 36099
2007-04-16 01:48:37 +00:00
Owen Anderson f064c92298 Tabs -> Spaces
llvm-svn: 36094
2007-04-15 23:14:18 +00:00
Chris Lattner 343c88cdb9 Fix PR1335 and Transforms/Inline/2007-04-15-InlineEH.ll
llvm-svn: 36090
2007-04-15 21:38:06 +00:00
Chris Lattner cad61e81c1 Fix a nasty bug introduced when apint'ified. This fixes
Transforms/IndVarsSimplify/exit_value_tests.llx

llvm-svn: 36081
2007-04-15 19:52:49 +00:00
Owen Anderson f35a1dbc7a Remove ImmediateDominator analysis. The same information can be obtained from DomTree. A lot of code for
constructing ImmediateDominator is now folded into DomTree construction.

This is part of the ongoing work for PR217.

llvm-svn: 36063
2007-04-15 08:47:27 +00:00
Chris Lattner f8a7bf317e fix SimplifyLibCalls/IsDigit.ll
llvm-svn: 36047
2007-04-15 05:38:40 +00:00
Chris Lattner 4a6e0cbd41 Extend store merging to support the 'if/then' version in addition to if/then/else.
This sinks the two stores in this example into a single store in cond_next.  In this
case, it allows elimination of the load as well:

        store double 0.000000e+00, double* @s.3060
        %tmp3 = fcmp ogt double %tmp1, 5.000000e-01             ; <i1> [#uses=1]
        br i1 %tmp3, label %cond_true, label %cond_next
cond_true:              ; preds = %entry
        store double 1.000000e+00, double* @s.3060
        br label %cond_next
cond_next:              ; preds = %entry, %cond_true
        %tmp6 = load double* @s.3060            ; <double> [#uses=1]

This implements Transforms/InstCombine/store-merge.ll:test2

llvm-svn: 36040
2007-04-15 01:02:18 +00:00
Chris Lattner 14a251b937 refactor some code, no functionality change.
llvm-svn: 36037
2007-04-15 00:07:55 +00:00
Owen Anderson e59b36defa Fix some unsafe code. Also, tabs -> spaces.
llvm-svn: 36035
2007-04-14 23:57:00 +00:00
Owen Anderson 78cecc817f Make ETForest depend on DomTree rather than IDom. This is the first step
in the long process of fixing PR 217.

llvm-svn: 36034
2007-04-14 23:49:24 +00:00
Chris Lattner 28d921d04f fix long lines
llvm-svn: 36031
2007-04-14 23:32:02 +00:00
Chris Lattner e275463e2f add a note
llvm-svn: 36028
2007-04-14 23:06:09 +00:00
Chris Lattner 7bfdd0abe1 Implement Transforms/InstCombine/vec_extract_elt.ll, transforming:
define i32 @test(float %f) {
        %tmp7 = insertelement <4 x float> undef, float %f, i32 0
        %tmp17 = bitcast <4 x float> %tmp7 to <4 x i32>
        %tmp19 = extractelement <4 x i32> %tmp17, i32 0
        ret i32 %tmp19
}

into:

define i32 @test(float %f) {
        %tmp19 = bitcast float %f to i32                ; <i32> [#uses=1]
        ret i32 %tmp19
}

On PPC, this is the difference between:

_test:
        mfspr r2, 256
        oris r3, r2, 8192
        mtspr 256, r3
        stfs f1, -16(r1)
        addi r3, r1, -16
        addi r4, r1, -32
        lvx v2, 0, r3
        stvx v2, 0, r4
        lwz r3, -32(r1)
        mtspr 256, r2
        blr

and:

_test:
        stfs f1, -4(r1)
        nop
        nop
        nop
        lwz r3, -4(r1)
        blr

llvm-svn: 36025
2007-04-14 23:02:14 +00:00
Chris Lattner b37fb6a0da Implement InstCombine/vec_demanded_elts.ll:test2. This allows us to turn
unsigned test(float f) {
 return _mm_cvtsi128_si32( (__m128i) _mm_set_ss( f*f ));
}

into:

_test:
        movss 4(%esp), %xmm0
        mulss %xmm0, %xmm0
        movd %xmm0, %eax
        ret

instead of:

_test:
        movss 4(%esp), %xmm0
        mulss %xmm0, %xmm0
        xorps %xmm1, %xmm1
        movss %xmm0, %xmm1
        movd %xmm1, %eax
        ret

GCC gets:

_test:
        subl    $28, %esp
        movss   32(%esp), %xmm0
        mulss   %xmm0, %xmm0
        xorps   %xmm1, %xmm1
        movss   %xmm0, %xmm1
        movaps  %xmm1, %xmm0
        movd    %xmm0, 12(%esp)
        movl    12(%esp), %eax
        addl    $28, %esp
        ret

llvm-svn: 36020
2007-04-14 22:29:23 +00:00
Chris Lattner a6b5660209 avoid copying sets and vectors around.
llvm-svn: 36017
2007-04-14 22:10:17 +00:00
Jeff Cohen e7ce8f23f6 Fix PR1329.
llvm-svn: 36016
2007-04-14 21:50:21 +00:00
Chris Lattner 6bd7b7b30b disable switch lowering using shift/and. It still breaks ppc bootstrap for
some reason.  :(  Will investigate.

llvm-svn: 36011
2007-04-14 19:39:41 +00:00
Chris Lattner 6f58839b20 avoid iterator invalidation.
llvm-svn: 36002
2007-04-14 18:06:52 +00:00
Jeff Cohen 4bd0fd367a An even better fix.
llvm-svn: 35998
2007-04-14 17:18:29 +00:00
Jeff Cohen 7233aa9369 Fix recent regression that broke several llvm-tests.
llvm-svn: 35996
2007-04-14 16:55:19 +00:00
Anton Korobeynikov 8a1a84f96e Fix PR1325: Case range optimization was performed in a case where it
shouldn't be. Also fix a "latent" bug on 64-bit platforms

llvm-svn: 35990
2007-04-14 13:25:55 +00:00
Chris Lattner 7196f09edc disable shift/and lowering to work around PR1325 for now.
llvm-svn: 35985
2007-04-14 02:26:56 +00:00
Chris Lattner 49fa8d2bff Implement a few missing xforms: printf("foo\n") -> puts, printf("x") -> putchar,
printf("") -> noop.  Still need to do the xforms for fprintf.

This implements Transforms/SimplifyLibCalls/Printf.ll

llvm-svn: 35984
2007-04-14 01:17:48 +00:00
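
The source-level equivalences behind these xforms, as a hedged C++ illustration
(the actual pattern matching lives in SimplifyLibCalls and is not shown here;
the equivalences assume the printf return values are unused):

#include <cstdio>

void before() {
  std::printf("hello\n");   // constant format string ending in '\n', no specifiers
  std::printf("x");         // single-character constant string
  std::printf("");          // empty constant string
}

void after() {              // what the pass may rewrite the calls into
  std::puts("hello");       // puts appends the trailing newline itself
  std::putchar('x');        // one character needs no formatting machinery
                            // printf("") has no observable effect, so it is dropped
}
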
Chris Lattner 02137eec8f in addition to merging, constantmerge should also delete trivially dead globals,
in order to clean up after simplifylibcalls.

llvm-svn: 35982
2007-04-14 01:11:54 +00:00
Chris Lattner efb33d28c6 Implement PR1201 and test/Transforms/InstCombine/malloc-free-delete.ll
llvm-svn: 35981
2007-04-14 00:20:02 +00:00
Chris Lattner 164b76565b use an accessor to simplify code.
llvm-svn: 35979
2007-04-14 00:17:39 +00:00
Chris Lattner 45f15572f7 add GetElementPtrInst::hasAllZeroIndices, a long-overdue helper method.
Writing it twice in the same day was too much for me.

llvm-svn: 35978
2007-04-14 00:12:57 +00:00
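
For context, a sketch of the open-coded check the new helper replaces; the free
function name here is made up, and ConstantInt::isZero is the modern spelling:

#include "llvm/Instructions.h"   // 2007-era path (llvm/IR/Instructions.h today)
#include "llvm/Constants.h"

bool gepHasOnlyZeroIndices(const llvm::GetElementPtrInst *GEP) {
  // Operand 0 is the base pointer; the indices follow it.
  for (unsigned i = 1, e = GEP->getNumOperands(); i != e; ++i) {
    const llvm::ConstantInt *CI =
        llvm::dyn_cast<llvm::ConstantInt>(GEP->getOperand(i));
    if (!CI || !CI->isZero())
      return false;
  }
  return true;   // equivalent, after this commit: GEP->hasAllZeroIndices()
}
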
Reid Spencer 67378b22bb We want the number of bits needed, not the power of 2.
llvm-svn: 35977
2007-04-14 00:00:10 +00:00
Jeff Cohen b4c610fb89 Silence VC++ warning.
llvm-svn: 35975
2007-04-13 22:52:03 +00:00
Chris Lattner efd3051d60 Now that codegen prepare isn't defeating me, I can finally fix what I set
out to do! :)

This fixes a problem where LSR would insert a bunch of code into each MBB
that uses a particular subexpression (e.g. IV+base+C).  The problem is that
this code cannot be CSE'd back together if inserted into different blocks.

This patch changes LSR to attempt to insert a single copy of this code and
share it, allowing codegenprepare to duplicate the code if it can be sunk
into various addressing modes.  On CodeGen/ARM/lsr-code-insertion.ll,
for example, this gives us code like:

        add r8, r0, r5
        str r6, [r8, #+4]
..
        ble LBB1_4      @cond_next
LBB1_3: @cond_true
        str r10, [r8, #+4]
LBB1_4: @cond_next
...
LBB1_5: @cond_true55
        ldr r6, LCPI1_1
        str r6, [r8, #+4]

instead of:

        add r10, r0, r6
        str r8, [r10, #+4]
...
        ble LBB1_4      @cond_next
LBB1_3: @cond_true
        add r8, r0, r6
        str r10, [r8, #+4]
LBB1_4: @cond_next
...
LBB1_5: @cond_true55
        add r8, r0, r6
        ldr r10, LCPI1_1
        str r10, [r8, #+4]

Besides being smaller and more efficient, this makes it immediately
obvious that it is profitable to predicate LBB1_3 now :)

llvm-svn: 35972
2007-04-13 20:42:26 +00:00
Chris Lattner feee64e997 Completely rewrite addressing-mode related sinking of code. In particular,
this fixes problems where codegenprepare would sink expressions into load/stores
that are not valid, and fixes cases where it would miss important valid ones.

This fixes several serious codesize and perf issues, particularly on targets
with complex addressing modes like arm and x86.  For example, now we compile
CodeGen/X86/isel-sink.ll to:

_test:
        movl 8(%esp), %eax
        movl 4(%esp), %ecx
        cmpl $1233, %eax
        ja LBB1_2       #F
LBB1_1: #T
        movl $4, (%ecx,%eax,4)
        movl $141, %eax
        ret
LBB1_2: #F
        movl (%ecx,%eax,4), %eax
        ret

instead of:

_test:
        movl 8(%esp), %eax
        leal (,%eax,4), %ecx
        addl 4(%esp), %ecx
        cmpl $1233, %eax
        ja LBB1_2       #F
LBB1_1: #T
        movl $4, (%ecx)
        movl $141, %eax
        ret
LBB1_2: #F
        movl (%ecx), %eax
        ret

llvm-svn: 35970
2007-04-13 20:30:56 +00:00
Reid Spencer 9329e7b626 Implement a getBitsNeeded method to determine how many bits an APInt needs
to represent a string in binary form.

llvm-svn: 35968
2007-04-13 19:19:07 +00:00
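
A rough usage sketch; the argument list is an assumption based on the commit
message (the present-day API takes a string and a radix), and the exact value
returned is not asserted here:

#include "llvm/ADT/APInt.h"

unsigned bitsForDecimalLiteral() {
  // Ask how wide an APInt must be to hold the value of "12345" written in
  // base 10. In the spirit of the neighboring r35977 note, this is a bit
  // count sufficient for the value, not a width rounded up to a power of two.
  return llvm::APInt::getBitsNeeded("12345", 10);
}
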
Devang Patel 38705d5494 Remove use of SlowOperationInformer.
llvm-svn: 35967
2007-04-13 18:58:18 +00:00
Devang Patel b730fe57bf Undo previous check-in.
llvm-svn: 35966
2007-04-13 18:35:15 +00:00
Devang Patel f929b86140 Hello uses LLVMSupport.a (SlowOperationInformer)
llvm-svn: 35965
2007-04-13 18:28:23 +00:00
Anton Korobeynikov e288040abf Fix PR1323: we weren't updating phi nodes properly :)
llvm-svn: 35963
2007-04-13 06:53:51 +00:00
Chris Lattner 502c3f48d9 arm has r+r*s and r+i addr modes, but no r+i+r*s addr modes.
llvm-svn: 35962
2007-04-13 06:50:55 +00:00
Zhou Sheng 01c175ec52 Make the APInt construction more efficient.
llvm-svn: 35960
2007-04-13 05:57:32 +00:00
Chris Lattner e71f1447f7 CSE simple binary expressions when they are inserted. This makes LSR produce
less huge code that needs to be cleaned up by sdisel.

llvm-svn: 35959
2007-04-13 05:04:18 +00:00
Reid Spencer 9945235b55 Implement review feedback: don't double-search a set.
llvm-svn: 35957
2007-04-12 21:57:15 +00:00
Reid Spencer 1b9213730f Make sure intrinsics that are lowered to functions give the function weak
linkage so we only end up with one of them in a program. These are, after
all, overloaded and template-ish in nature.

llvm-svn: 35956
2007-04-12 21:53:38 +00:00
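
A sketch of the idea in C++: once an overloaded intrinsic has been lowered to
an ordinary function body, give that body weak linkage so duplicate copies
emitted by different modules fold into one at link time. The enum spelling
GlobalValue::WeakLinkage follows that era of LLVM and is an assumption, not a
quote from the patch.

#include "llvm/Function.h"       // 2007-era path
#include "llvm/GlobalValue.h"

void markLoweredIntrinsicBody(llvm::Function *LoweredBody) {
  // Weak linkage lets the linker keep a single copy when several translation
  // units each contain the lowered body of the same template-ish intrinsic.
  LoweredBody->setLinkage(llvm::GlobalValue::WeakLinkage);
}
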
Reid Spencer 0ef2ca8da7 Provide support for intrinsics that lower themselves to a function body.
This can happen for intrinsics that are overloaded.  In such cases it is
necessary to emit a function prototype before the body of the function
that calls the intrinsic and to ensure we don't emit it multiple times.

llvm-svn: 35954
2007-04-12 21:00:45 +00:00
Lauro Ramos Venancio e6818b2549 Implement Thread Local Storage (TLS) in CBackend.
llvm-svn: 35951
2007-04-12 18:42:08 +00:00
Lauro Ramos Venancio 749e4668e7 Implement the "thread_local" keyword.
llvm-svn: 35950
2007-04-12 18:32:50 +00:00
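
At the C level, a thread_local LLVM IR global presumably maps onto a
compiler-specific TLS qualifier; this C++ illustration uses the GCC/Clang
__thread extension as an assumption about what a C backend could emit, not as a
quote from the patch.

// Plain global: one instance shared by every thread.
int shared_counter = 0;

// Thread-local global: each thread sees its own instance, which is what an
// LLVM IR "thread_local" global expresses. __thread is the GCC-style spelling
// a C backend could plausibly emit for it (assumption).
__thread int per_thread_counter = 0;

void bump() {
  ++shared_counter;       // shared across threads (racy without synchronization)
  ++per_thread_counter;   // private to the calling thread
}
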
Reid Spencer 0f2f65f723 Fix bugs in generated code for part_select and part_set so that llc doesn't
barf when CBE is run with a program that contains these intrinsics.

llvm-svn: 35946
2007-04-12 13:30:14 +00:00
Reid Spencer 83faeb7611 Fix a bug in PartSet. The replacement value needs to be zext'd or trunc'd to
the size of the value, not just zext'd. Also, give better names to two BBs.

llvm-svn: 35945
2007-04-12 12:46:33 +00:00
Chris Lattner 5111499136 the result of an inline asm copy can be an arbitrary VT that the register
class supports.  In the case of vectors, this means we often get the wrong
type (e.g. we get v4f32 instead of v8i16).  Make sure to convert the vector
result to the right type.  This fixes CodeGen/X86/2007-04-11-InlineAsmVectorResult.ll

llvm-svn: 35944
2007-04-12 06:00:20 +00:00