Commit Graph

26717 Commits

Author SHA1 Message Date
Rafael Espindola ef01656ea4 fix some bugs affecting functions with no arguments
llvm-svn: 30767
2006-10-06 17:26:30 +00:00
Rafael Espindola 6024ea8383 fix the stack alignment
llvm-svn: 30766
2006-10-06 14:29:47 +00:00
Rafael Espindola 5fe7909e18 add support for calling functions that have double arguments
llvm-svn: 30765
2006-10-06 12:50:22 +00:00
Evan Cheng ff1beda569 Still need to support -mcpu=<> or cross compilation will fail. Doh.
llvm-svn: 30764
2006-10-06 09:17:41 +00:00
Evan Cheng 9274f72e58 Do away with CPU feature list. Just use CPUID to detect MMX, SSE, SSE2, SSE3, and 64-bit support.
llvm-svn: 30763
2006-10-06 08:21:07 +00:00
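
Note on the CPUID change above: a minimal sketch of how the named features can be read out of CPUID register values. The bit positions come from the x86 CPUID definition (leaf 1 and extended leaf 0x80000001); the function and table names are hypothetical and are not taken from the LLVM source, and the register values are passed in rather than read from hardware.

```python
# Illustrative only: names below are assumptions, not LLVM identifiers.
# Bit positions follow the x86 CPUID definition.
LEAF1_EDX = {"mmx": 23, "sse": 25, "sse2": 26}   # CPUID leaf 1, EDX
LEAF1_ECX = {"sse3": 0}                          # CPUID leaf 1, ECX
EXT_LEAF1_EDX = {"64bit": 29}                    # CPUID leaf 0x80000001, EDX


def decode_features(leaf1_edx, leaf1_ecx, ext_edx):
    """Return the set of feature names whose CPUID bit is set."""
    features = set()
    tables = (
        (LEAF1_EDX, leaf1_edx),
        (LEAF1_ECX, leaf1_ecx),
        (EXT_LEAF1_EDX, ext_edx),
    )
    for table, reg in tables:
        for name, bit in table.items():
            if (reg >> bit) & 1:
                features.add(name)
    return features
```

With this shape, a hand-maintained per-CPU feature list is unnecessary for the host: the feature set falls out of the register values directly.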
Evan Cheng 4c1a804a5b It appears the inline asm in GetCpuIDAndInfo() may clobber some registers if it isn't inlined (at < -O3). Force it to be inlined.
llvm-svn: 30762
2006-10-06 07:50:56 +00:00
Chris Lattner 469ea0c94d add an accessor
llvm-svn: 30761
2006-10-06 01:16:29 +00:00
Chris Lattner 16ae43e901 MachineBasicBlock::splice was incorrectly updating parent pointers on
instructions.

llvm-svn: 30760
2006-10-06 01:12:44 +00:00
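
Note on the splice fix above: the commit does not say what exactly went wrong, so the following is only a sketch of the invariant involved. When instructions move between blocks, each moved instruction's parent pointer must end up naming the destination block. `Instr` and `Block` are hypothetical stand-ins, not the LLVM classes, and the sketch assumes the source and destination are different blocks.

```python
class Instr:
    def __init__(self, name):
        self.name = name
        self.parent = None  # block that currently owns this instruction


class Block:
    def __init__(self, name):
        self.name = name
        self.instrs = []

    def append(self, instr):
        instr.parent = self
        self.instrs.append(instr)

    def splice(self, pos, other, first, last):
        """Move other.instrs[first:last] into this block at index pos.

        Assumes other is a different block from self.
        """
        moved = other.instrs[first:last]
        del other.instrs[first:last]
        for instr in moved:
            instr.parent = self  # the invariant: parent follows the move
        self.instrs[pos:pos] = moved
```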
Evan Cheng df9ac47e5e Make use of getStore().
llvm-svn: 30759
2006-10-05 23:01:46 +00:00
Evan Cheng af309d29b1 Add getStore() helper function to create ISD::STORE nodes.
llvm-svn: 30758
2006-10-05 22:57:11 +00:00
Chris Lattner 8b1a59a272 Don't crash if an MBB doesn't have an LLVM BB
llvm-svn: 30757
2006-10-05 21:40:14 +00:00
Rafael Espindola decfeca52d use a const ref for passing the vector to ArgumentLayout
llvm-svn: 30756
2006-10-05 17:46:48 +00:00
Rafael Espindola e04df41ca2 implement an ArgumentLayout class to factor code common to LowerFORMAL_ARGUMENTS and LowerCALL
implement FMDRR
add support for f64 function arguments

llvm-svn: 30754
2006-10-05 16:48:49 +00:00
Jim Laskey 6549d22ef9 Alias analysis code clean ups.
llvm-svn: 30753
2006-10-05 15:07:25 +00:00
Chris Lattner 2deeaeaca7 add a new SimplifyDemandedVectorElts method, which works similarly to
SimplifyDemandedBits.  The idea is that some operations can be simplified if
not all of the computed elements are needed.  Some targets (like x86) have a
large number of intrinsics that operate on a single element, but pass other
elts through unmodified.  If those other elements are not needed, the
intrinsics can be simplified to scalar operations, and insertelement ops can
be removed.

This turns (f.e.):

ushort %Convert_sse(float %f) {
        %tmp = insertelement <4 x float> undef, float %f, uint 0                ; <<4 x float>> [#uses=1]
        %tmp10 = insertelement <4 x float> %tmp, float 0.000000e+00, uint 1             ; <<4 x float>> [#uses=1]
        %tmp11 = insertelement <4 x float> %tmp10, float 0.000000e+00, uint 2           ; <<4 x float>> [#uses=1]
        %tmp12 = insertelement <4 x float> %tmp11, float 0.000000e+00, uint 3           ; <<4 x float>> [#uses=1]
        %tmp28 = tail call <4 x float> %llvm.x86.sse.sub.ss( <4 x float> %tmp12, <4 x float> < float 1.000000e+00, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp37 = tail call <4 x float> %llvm.x86.sse.mul.ss( <4 x float> %tmp28, <4 x float> < float 5.000000e-01, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp37, <4 x float> < float 6.553500e+04, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> zeroinitializer )          ; <<4 x float>> [#uses=1]
        %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
        %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
        ret ushort %tmp69
}

into:

ushort %Convert_sse(float %f) {
entry:
        %tmp28 = sub float %f, 1.000000e+00             ; <float> [#uses=1]
        %tmp37 = mul float %tmp28, 5.000000e-01         ; <float> [#uses=1]
        %tmp375 = insertelement <4 x float> undef, float %tmp37, uint 0         ; <<4 x float>> [#uses=1]
        %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp375, <4 x float> < float 6.553500e+04, float undef, float undef, float undef > )           ; <<4 x float>> [#uses=1]
        %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> < float 0.000000e+00, float undef, float undef, float undef > )            ; <<4 x float>> [#uses=1]
        %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
        %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
        ret ushort %tmp69
}

which improves codegen from:

_Convert_sse:
        movss LCPI1_0, %xmm0
        movss 4(%esp), %xmm1
        subss %xmm0, %xmm1
        movss LCPI1_1, %xmm0
        mulss %xmm0, %xmm1
        movss LCPI1_2, %xmm0
        minss %xmm0, %xmm1
        xorps %xmm0, %xmm0
        maxss %xmm0, %xmm1
        cvttss2si %xmm1, %eax
        andl $65535, %eax
        ret

to:

_Convert_sse:
        movss 4(%esp), %xmm0
        subss LCPI1_0, %xmm0
        mulss LCPI1_1, %xmm0
        movss LCPI1_2, %xmm1
        minss %xmm1, %xmm0
        xorps %xmm1, %xmm1
        maxss %xmm1, %xmm0
        cvttss2si %xmm0, %eax
        andl $65535, %eax
        ret


This is just a first step; it can be extended in many ways.  Testcase here:
Transforms/InstCombine/vec_demanded_elts.ll

llvm-svn: 30752
2006-10-05 06:55:50 +00:00
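
Note on SimplifyDemandedVectorElts above: a minimal sketch of the demanded-elements idea, not the InstCombine implementation. It assumes a 4-lane vector and models a scalar intrinsic such as sub.ss as "lane 0 is computed from both operands, lanes 1 to 3 pass through from the first operand". The function names are hypothetical.

```python
def demanded_operand_lanes(demanded):
    """Lanes each operand of a scalar 'ss' binop must supply.

    Result lane 0 = f(a[0], b[0]); result lanes 1..3 are copied from a.
    Returns (lanes demanded of a, lanes demanded of b).
    """
    a_lanes = set(demanded)                      # lane i of the result reads a[i]
    b_lanes = {0} if 0 in demanded else set()    # only b[0] is ever read
    return a_lanes, b_lanes


def prune_inserts(inserts, demanded):
    """Drop insertelement writes to lanes nobody reads.

    inserts: list of (lane, value) pairs applied in order to an undef vector.
    """
    return [(lane, value) for lane, value in inserts if lane in demanded]
```

In the example from the commit message, cvttss2si reads only lane 0, so each intrinsic in the chain demands only lane 0 of its operands, and the three zero inserts into lanes 1 to 3 are dead: `prune_inserts([(0, "%f"), (1, 0.0), (2, 0.0), (3, 0.0)], {0})` keeps only the write to lane 0.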
Chris Lattner 3d5e9818bd new testcase
llvm-svn: 30751
2006-10-05 06:51:54 +00:00
Chris Lattner 65511ff69d Add insertelement/extractelement helper ctors.
llvm-svn: 30750
2006-10-05 06:24:58 +00:00
Chris Lattner f2ef243580 Lower some min/max idioms to minss/maxss when unsafe fp math is enabled.
llvm-svn: 30748
2006-10-05 04:11:26 +00:00
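
Note on the min/max lowering above: the commit does not list which idioms are covered, so this is only a sketch of one plausible reason some of them need unsafe fp math. The SSE minss instruction returns its second operand unless the first is strictly less, so it returns the second operand whenever either input is NaN. A select written with the opposite comparison agrees with minss on ordinary values but not on NaN. The helper names are hypothetical.

```python
import math


def minss(a, b):
    # SSE MINSS semantics: first operand only if a < b, else the second.
    return a if a < b else b


def select_lt(x, y):
    # Source idiom: x < y ? x : y  (matches minss(x, y) exactly)
    return x if x < y else y


def select_ge(x, y):
    # Source idiom: x >= y ? y : x  (same as min for ordinary values)
    return y if x >= y else x


nan = float("nan")
same_on_numbers = select_ge(1.0, 2.0) == minss(1.0, 2.0)
differs_on_nan = math.isnan(select_ge(nan, 1.0)) and minss(nan, 1.0) == 1.0
```

Under these semantics `select_lt` can always become minss, while `select_ge` can only become minss if NaN behavior is allowed to change.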
Andrew Lenharth 16b8f95831 Check that jump tables wind up in the rodata section
llvm-svn: 30747
2006-10-05 03:27:52 +00:00
Chris Lattner 40a95dd347 remove JumpTableTextSection
llvm-svn: 30746
2006-10-05 03:14:23 +00:00
Chris Lattner 8cfd10eff3 Don't bother setting JumpTableTextSection, it is about to disappear
llvm-svn: 30745
2006-10-05 03:13:59 +00:00
Chris Lattner 66c1625a37 Emit pic jumptables to the same section that the function is emitted to,
allowing label differences to work.  This fixes CodeGen/X86/pic_jumptable.ll

llvm-svn: 30744
2006-10-05 03:13:28 +00:00
Chris Lattner bfe59e87e5 Verify that jump tables are emitted to the same section as the function is,
when codegen'ing in pic mode.  This fixes a miscompilation of a switch stmt
in a template, as the template goes to a non-.text section.

llvm-svn: 30743
2006-10-05 03:12:36 +00:00
Chris Lattner a6a570e02f Pass the MachineFunction into EmitJumpTableInfo.
llvm-svn: 30742
2006-10-05 03:01:21 +00:00
Chris Lattner 38e2c8a0a2 implement and use getSectionForFunction
llvm-svn: 30741
2006-10-05 02:51:36 +00:00
Chris Lattner 4431699187 Use getSectionForFunction.
llvm-svn: 30740
2006-10-05 02:49:23 +00:00
Chris Lattner d4d255a408 Use getSectionForFunction
llvm-svn: 30739
2006-10-05 02:48:40 +00:00
Chris Lattner c8c78982d4 use getSectionForFunction to decide which section to emit code into
llvm-svn: 30738
2006-10-05 02:47:13 +00:00
Chris Lattner b82247b168 Implement getSectionForFunction, use it when printing function body.
llvm-svn: 30737
2006-10-05 02:43:52 +00:00
Chris Lattner dc82241182 move getSectionForFunction to AsmPrinter
llvm-svn: 30736
2006-10-05 02:42:47 +00:00
Chris Lattner 028d663ee6 Move getSectionForFunction to AsmPrinter, change it to return a string.
llvm-svn: 30735
2006-10-05 02:42:20 +00:00
Chris Lattner 0dca927148 move getSectionForFunction to AsmPrinter.
llvm-svn: 30734
2006-10-05 02:41:43 +00:00
Chris Lattner 0d236450aa implement DarwinTargetAsmInfo::getSectionForFunction, use it when outputting
function bodies

llvm-svn: 30733
2006-10-05 00:35:50 +00:00
Chris Lattner afe6d7a179 Give TargetAsmInfo a virtual dtor, add a new getSectionForFunction method.
llvm-svn: 30732
2006-10-05 00:35:16 +00:00
Chris Lattner 41e22a5419 emit jump table before debug info
llvm-svn: 30731
2006-10-05 00:26:05 +00:00
Chris Lattner aad26a19f0 Always emit the jump table after the function so it's part of the same 'atom'
as the function body.

llvm-svn: 30730
2006-10-05 00:24:46 +00:00
Chris Lattner 19721e8749 getFilename/getDirectory shouldn't abort if the global has no init. This
can happen on bugpoint reduced testcases, for example.

llvm-svn: 30729
2006-10-04 23:06:26 +00:00
Evan Cheng f80dfa83a0 Fix some typos that can cause a flag value to have more than one use.
llvm-svn: 30727
2006-10-04 22:23:53 +00:00
Chris Lattner c374ec43b5 Fix a static dtor issue
llvm-svn: 30726
2006-10-04 22:13:11 +00:00
Chris Lattner 8111c59279 Fix more static dtor issues
llvm-svn: 30725
2006-10-04 21:52:35 +00:00
Chris Lattner 538c6eb05c Fix some more static dtor issues.
llvm-svn: 30724
2006-10-04 21:49:37 +00:00
Evan Cheng 8c5766ef3f Added option -disable-x86-shuffle-opti to disable X86-specific vector shuffle optimizations.
llvm-svn: 30723
2006-10-04 18:33:38 +00:00
Evan Cheng 412aaabcbe Formatting.
llvm-svn: 30722
2006-10-04 18:33:00 +00:00
Jim Laskey 708d0db2d8 More extensive alias analysis.
llvm-svn: 30721
2006-10-04 16:53:27 +00:00
Jim Laskey 0d5a0eae57 More long term solution
llvm-svn: 30720
2006-10-04 10:40:15 +00:00
Chris Lattner 9259b1efb6 Pattern match min/max nodes when we have sse. This implements
CodeGen/X86/scalar_sse_minmax.ll

llvm-svn: 30719
2006-10-04 06:57:07 +00:00
Chris Lattner 1e21d3a5ae pattern match min/max nodes
llvm-svn: 30718
2006-10-04 06:56:02 +00:00
Chris Lattner 3e11d99a0a add a note :(
llvm-svn: 30717
2006-10-04 05:52:13 +00:00
Chris Lattner 52886e72d7 This case isn't implemented yet. It seems unlikely to be needed, but if it
ever is, we want to get an assert instead of silent bad codegen.

llvm-svn: 30716
2006-10-04 04:58:58 +00:00
Jim Laskey 66b0b55816 Workaround for some problems with templates.
llvm-svn: 30715
2006-10-04 01:43:13 +00:00