14593 Commits

Author SHA1 Message Date
Chris Lattner 02b6c918b7 Fix a bug in the safety analysis routine
llvm-svn: 16804
2004-10-07 06:01:25 +00:00
Chris Lattner f64799683e Comment cleanups
llvm-svn: 16803
2004-10-07 06:00:24 +00:00
Chris Lattner 25db58032d * Rename pass to globalopt, since we do more than just constify
* Instead of handling dead functions specially, just nuke them.
* Be more aggressive about cleaning up after constification, in
  particular, handle getelementptr instructions and constantexprs.
* Be a little bit more structured about how we process globals.

*** Delete globals that are only stored to, and never read.  These are
    clearly not useful, so they should go.  This implements deadglobal.llx

This last one triggers quite a few times.  In particular, 2208 in the
external tests, 1865 of which are in 252.eon.  This shrinks eon from
1995094 to 1732341 bytes of bytecode.

llvm-svn: 16802
2004-10-07 04:16:33 +00:00
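The "only stored to, and never read" rule from commit 25db58032d can be illustrated with a toy model. This is only a sketch: the dictionary layout, the `find_store_only_globals` helper, and the example globals are hypothetical and are not LLVM's data structures.

```python
# Toy model (hypothetical, not LLVM's representation): each global maps to
# the kinds of instructions that use it, written as "load" or "store".
def find_store_only_globals(uses_by_global):
    """Return the globals that are written at least once but never read."""
    dead = []
    for name, uses in uses_by_global.items():
        if uses and all(kind == "store" for kind in uses):
            dead.append(name)
    return sorted(dead)


# Invented example data, for illustration only.
example = {
    "counter": ["store", "store"],   # written twice, never read
    "verbosity": ["store", "load"],  # read once, so it must stay
    "unused": [],                    # no uses at all; not the case sketched here
}
print(find_store_only_globals(example))
```

A global whose value is never loaded cannot influence the program, so both the global and its stores can go.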
Chris Lattner fa3cfd3955 Rename pass
llvm-svn: 16801
2004-10-07 04:12:02 +00:00
Chris Lattner b0c8aab038 This pass is not needed, as there is only ever one global: the stack
llvm-svn: 16800
2004-10-07 04:10:36 +00:00
Chris Lattner 381fbf1616 Add new testcase, rename pass
llvm-svn: 16799
2004-10-07 04:07:08 +00:00
Chris Lattner fc303099c9 Don't add libz or libbz2 to the USEDLIBS lists, those are for LLVM libraries.
llvm-svn: 16798
2004-10-07 00:03:11 +00:00
Chris Lattner 76319a83bd Don't call memset if malloc returns a null pointer
llvm-svn: 16797
2004-10-06 23:08:03 +00:00
Chris Lattner 1f849a08a3 Implement GlobalConstifier/trivialstore.llx, and also do some
simplifications of the resultant program to avoid making later passes
do it all.

This allows us to constify globals whose only stores write back the same
constant they were initialized with.

Surprisingly this comes up ALL of the freaking time, dozens of times in
SPEC, 30 times in vortex alone.

For example, on 256.bzip2, it allows us to constify these two globals:

%smallMode = internal global ubyte 0             ; <ubyte*> [#uses=8]
%verbosity = internal global int 0               ; <int*> [#uses=49]

Which (with later optimizations) results in the bytecode file shrinking
from 82286 to 69686 bytes!  Let's hear it for IPO :)

For the record, it's nuking lots of "if (verbosity > 2) { do lots of stuff }"
code.

llvm-svn: 16793
2004-10-06 20:57:02 +00:00
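The trivial-store check described in commit 1f849a08a3 can be sketched as a single predicate. The `can_constify` helper and the store lists below are hypothetical; they only model the idea that a global may be treated as constant when every store writes its initializer back.

```python
def can_constify(initializer, stored_values):
    """True if every store writes the initializer value back (hypothetical helper)."""
    return all(value == initializer for value in stored_values)


# Invented store lists, for illustration only.
print(can_constify(0, [0, 0, 0]))  # every store rewrites the initial 0
print(can_constify(0, [0, 3]))     # one store changes the value
```

Once such a global is known to hold its initializer forever, loads from it fold to that constant, which is presumably how the "if (verbosity > 2)" blocks mentioned above become dead code.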
Chris Lattner 645bcf6c5d New testcase
llvm-svn: 16791
2004-10-06 20:42:51 +00:00
Chris Lattner af88fcd4c9 Don't let null nodes sneak past cast instructions
llvm-svn: 16779
2004-10-06 19:29:13 +00:00
Misha Brukman fe643e314f Un-doxygenify internal method.
llvm-svn: 16774
2004-10-06 17:19:58 +00:00
Misha Brukman 74a1195bd6 Doxygen-ify comments
llvm-svn: 16773
2004-10-06 16:56:16 +00:00
Chris Lattner 43e03c9cdf Change Type::isAbstract to have better comments, a more correct name
(PromoteAbstractToConcrete), and to use a set to avoid recomputation.
In particular, this set eliminates the potentially exponential cases
from this little recursive algorithm.

On a particularly nasty testcase, llvm-dis on the .bc file went from 34
minutes (which is when I killed it, it still hadn't finished) to 0.57s.
Remember kids, exponential algorithms are bad.

llvm-svn: 16772
2004-10-06 16:36:46 +00:00
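The effect of the set in commit 43e03c9cdf can be shown on a small model. This is a sketch under assumptions: the graph shape built by `build_chain` and both counting functions are hypothetical and only illustrate why remembering visited nodes removes the exponential case from a recursive walk.

```python
def build_chain(depth):
    """Graph in which each level refers to the next level twice (hypothetical shape)."""
    graph = {f"t{i}": [f"t{i + 1}", f"t{i + 1}"] for i in range(depth)}
    graph[f"t{depth}"] = []  # leaf with no children
    return graph


def count_visits_naive(graph, node):
    """Recurse into every child every time, revisiting shared nodes."""
    return 1 + sum(count_visits_naive(graph, child) for child in graph[node])


def count_visits_with_set(graph, node, seen=None):
    """Same walk, but skip any node already recorded in the seen set."""
    if seen is None:
        seen = set()
    if node in seen:
        return 0
    seen.add(node)
    return 1 + sum(count_visits_with_set(graph, child, seen) for child in graph[node])


graph = build_chain(16)
print(count_visits_naive(graph, "t0"))     # grows as 2**(depth + 1) - 1
print(count_visits_with_set(graph, "t0"))  # grows as depth + 1
```

Each extra level doubles the naive walk, while the walk with the set grows by one visit.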
Chris Lattner f29560783a Rename method, change comment, add argument
llvm-svn: 16771
2004-10-06 16:34:23 +00:00
Chris Lattner f94f985bbd Correct some typos
llvm-svn: 16770
2004-10-06 16:28:24 +00:00
Chris Lattner 0aee4b7947 Instcombine: -(X sdiv C) -> (X sdiv -C), tested by sub.ll:test16
llvm-svn: 16769
2004-10-06 15:08:25 +00:00
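The identity behind the fold in commit 0aee4b7947 can be checked by brute force over a small range. This is a sketch: the `sdiv` helper is a hypothetical model of signed division that truncates toward zero, and the range deliberately avoids 32-bit overflow cases, which are not examined here.

```python
def sdiv(x, c):
    """Signed division truncating toward zero (hypothetical model of sdiv)."""
    q = abs(x) // abs(c)
    return q if (x < 0) == (c < 0) else -q


# Collect every (x, c) in a small range where -(x sdiv c) != (x sdiv -c).
mismatches = [
    (x, c)
    for x in range(-50, 51)
    for c in range(-9, 10)
    if c != 0 and -sdiv(x, c) != sdiv(x, -c)
]
print(mismatches)
```

An empty list means the two forms agree everywhere in the tested range; folding the negation into the constant saves one instruction.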
Chris Lattner 52783ab1d8 New testcase
llvm-svn: 16768
2004-10-06 15:07:56 +00:00
Chris Lattner 93867e516a Remove debugging code, fix encoding problem. This fixes the problems
the JIT had last night.

llvm-svn: 16766
2004-10-06 14:31:50 +00:00
Nate Begeman 9a1fbaf1e9 Turning on fsel code gen now that we can do so would be good.
llvm-svn: 16765
2004-10-06 11:03:30 +00:00
Nate Begeman fac8529df8 Implement floating point select for lt, gt, le, ge using the powerpc fsel
instruction.

Now, rather than emitting the following loop out of bisect:
.LBB_main_19:	; no_exit.0.i
	rlwinm r3, r2, 3, 0, 28
	lfdx f1, r3, r27
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f2, lo16(.CPI_main_1-"L00000$pb")(r3)
	fsub f2, f2, f1
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3)
	fcmpu cr0, f1, f4
	bge .LBB_main_64	; no_exit.0.i
.LBB_main_63:	; no_exit.0.i
	b .LBB_main_65	; no_exit.0.i
.LBB_main_64:	; no_exit.0.i
	fmr f2, f1
.LBB_main_65:	; no_exit.0.i
	addi r3, r2, 1
	rlwinm r3, r3, 3, 0, 28
	lfdx f1, r3, r27
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3)
	fsub f4, f4, f1
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f5, lo16(.CPI_main_1-"L00000$pb")(r3)
	fcmpu cr0, f1, f5
	bge .LBB_main_67	; no_exit.0.i
.LBB_main_66:	; no_exit.0.i
	b .LBB_main_68	; no_exit.0.i
.LBB_main_67:	; no_exit.0.i
	fmr f4, f1
.LBB_main_68:	; no_exit.0.i
	fadd f1, f2, f4
	addis r3, r30, ha16(.CPI_main_2-"L00000$pb")
	lfd f2, lo16(.CPI_main_2-"L00000$pb")(r3)
	fmul f1, f1, f2
	rlwinm r3, r2, 3, 0, 28
	lfdx f2, r3, r28
	fadd f4, f2, f1
	fcmpu cr0, f4, f0
	bgt .LBB_main_70	; no_exit.0.i
.LBB_main_69:	; no_exit.0.i
	b .LBB_main_71	; no_exit.0.i
.LBB_main_70:	; no_exit.0.i
	fmr f0, f4
.LBB_main_71:	; no_exit.0.i
	fsub f1, f2, f1
	addi r2, r2, -1
	fcmpu cr0, f1, f3
	blt .LBB_main_73	; no_exit.0.i
.LBB_main_72:	; no_exit.0.i
	b .LBB_main_74	; no_exit.0.i
.LBB_main_73:	; no_exit.0.i
	fmr f3, f1
.LBB_main_74:	; no_exit.0.i
	cmpwi cr0, r2, -1
	fmr f16, f0
	fmr f17, f3
	bgt .LBB_main_19	; no_exit.0.i

We emit this instead:
.LBB_main_19:	; no_exit.0.i
	rlwinm r3, r2, 3, 0, 28
	lfdx f1, r3, r27
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f2, lo16(.CPI_main_1-"L00000$pb")(r3)
	fsub f2, f2, f1
	fsel f1, f1, f1, f2
	addi r3, r2, 1
	rlwinm r3, r3, 3, 0, 28
	lfdx f2, r3, r27
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3)
	fsub f4, f4, f2
	fsel f2, f2, f2, f4
	fadd f1, f1, f2
	addis r3, r30, ha16(.CPI_main_2-"L00000$pb")
	lfd f2, lo16(.CPI_main_2-"L00000$pb")(r3)
	fmul f1, f1, f2
	rlwinm r3, r2, 3, 0, 28
	lfdx f2, r3, r28
	fadd f4, f2, f1
	fsub f5, f0, f4
	fsel f0, f5, f0, f4
	fsub f1, f2, f1
	addi r2, r2, -1
	fsub f2, f1, f3
	fsel f3, f2, f3, f1
	cmpwi cr0, r2, -1
	fmr f16, f0
	fmr f17, f3
	bgt .LBB_main_19	; no_exit.0.i

llvm-svn: 16764
2004-10-06 09:53:04 +00:00
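The branch-free selects in commit fac8529df8 rely on fsel choosing between two registers based on the sign of a third. The sketch below models that behaviour; the `fsel`, `select_max`, and `select_min` names are hypothetical, and the operand order follows the `fsel fD, fA, fC, fB` form used in the listing above.

```python
def fsel(a, c, b):
    """Model of PowerPC fsel: c if a >= 0.0, otherwise b (a NaN picks b)."""
    return c if a >= 0.0 else b


def select_max(x, y):
    # Mirrors "fsub f5, f0, f4; fsel f0, f5, f0, f4": keep x when x - y >= 0.
    return fsel(x - y, x, y)


def select_min(x, y):
    # Mirrors "fsub f2, f1, f3; fsel f3, f2, f3, f1": keep x when y - x >= 0.
    return fsel(y - x, x, y)


print(select_max(1.5, 2.5), select_min(1.5, 2.5))
```

Each compare-and-branch diamond becomes a subtract plus one fsel. The subtraction can behave differently from a direct compare for NaN or infinite inputs, which this sketch does not explore.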
Chris Lattner 6835dedb5b Codegen signed mod by 2 or -2 more efficiently. Instead of generating:
t:
        mov %EDX, DWORD PTR [%ESP + 4]
        mov %ECX, 2
        mov %EAX, %EDX
        sar %EDX, 31
        idiv %ECX
        mov %EAX, %EDX
        ret

Generate:
t:
        mov %ECX, DWORD PTR [%ESP + 4]
***     mov %EAX, %ECX
        cdq
        and %ECX, 1
        xor %ECX, %EDX
        sub %ECX, %EDX
***     mov %EAX, %ECX
        ret

Note that the two marked moves are redundant, and should be eliminated by the
register allocator, but aren't.

Compare this to GCC, which generates:

t:
        mov     %eax, DWORD PTR [%esp+4]
        mov     %edx, %eax
        shr     %edx, 31
        lea     %ecx, [%edx+%eax]
        and     %ecx, -2
        sub     %eax, %ecx
        ret

or ICC 8.0, which generates:

t:
        movl      4(%esp), %ecx                                 #3.5
        movl      $-2147483647, %eax                            #3.25
        imull     %ecx                                          #3.25
        movl      %ecx, %eax                                    #3.25
        sarl      $31, %eax                                     #3.25
        addl      %ecx, %edx                                    #3.25
        subl      %edx, %eax                                    #3.25
        addl      %eax, %eax                                    #3.25
        negl      %eax                                          #3.25
        subl      %eax, %ecx                                    #3.25
        movl      %ecx, %eax                                    #3.25
        ret                                                     #3.25

We would be in great shape if not for the moves.

llvm-svn: 16763
2004-10-06 05:01:07 +00:00
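The cdq/and/xor/sub sequence in commit 6835dedb5b can be replayed step by step. This is a sketch: `srem2` and the reference `c_rem` are hypothetical helpers, and Python integers stand in for 32-bit registers, which is safe here because no step overflows.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def srem2(x):
    """x rem 2 with truncation toward zero, following the emitted sequence."""
    sign = -1 if x < 0 else 0   # cdq: EDX becomes all ones when x is negative
    low = x & 1                 # and %ECX, 1
    return (low ^ sign) - sign  # xor, then sub: negates low when x is negative


def c_rem(x, m):
    """Reference remainder whose sign follows the dividend."""
    r = abs(x) % abs(m)
    return -r if x < 0 else r


samples = list(range(-1000, 1001)) + [INT_MIN, INT_MAX]
print(all(srem2(x) == c_rem(x, 2) == c_rem(x, -2) for x in samples))
```

The xor/sub pair is a conditional negate: with `sign` equal to -1 it computes `-low`, and with `sign` equal to 0 it leaves `low` unchanged. The same code serves both 2 and -2 because the remainder takes the sign of the dividend.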
Chris Lattner e4c60eb704 Really fix FreeBSD, which apparently doesn't tolerate the extern.
Thanks to Jeff Cohen for pointing out my goof.

llvm-svn: 16762
2004-10-06 04:21:52 +00:00
Chris Lattner 7bd8f1332d Fix a scary bug with signed division by a power of two. We used to generate:
s:   ;; X / 4
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, %EAX
        sar %ECX, 1
        shr %ECX, 30
        mov %EDX, %EAX
        add %EDX, %ECX
        sar %EAX, 2
        ret

When we really meant:

s:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, %EAX
        sar %ECX, 1
        shr %ECX, 30
        add %EAX, %ECX
        sar %EAX, 2
        ret

Hey, this reduces register pressure too :)

llvm-svn: 16761
2004-10-06 04:19:43 +00:00
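The bug fixed in commit 7bd8f1332d is easier to see when both sequences are replayed. This is a sketch with hypothetical helper names; Python integers stand in for 32-bit registers, with `MASK32` emulating the logical shift.

```python
MASK32 = 0xFFFFFFFF


def sdiv4_fixed(x):
    """X / 4 as in the corrected sequence."""
    bias = ((x >> 1) & MASK32) >> 30  # sar 1, shr 30: 3 if x < 0, else 0
    return (x + bias) >> 2            # add %EAX, %ECX; sar %EAX, 2


def sdiv4_buggy(x):
    """X / 4 as in the old sequence: the biased sum lands in EDX and is never read."""
    bias = ((x >> 1) & MASK32) >> 30
    _discarded = x + bias             # add %EDX, %ECX (result unused)
    return x >> 2                     # sar %EAX, 2 shifts the unbiased value


for x in (-8, -7, -1, 0, 7):
    print(x, sdiv4_fixed(x), sdiv4_buggy(x))
```

Without the bias the arithmetic shift rounds toward negative infinity, so the old code returned -1 for -1 / 4 instead of 0. The two versions agree for non-negative inputs and for negative multiples of 4.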
Chris Lattner 147edd2f7e Codegen signed divides by 2 and -2 more efficiently. In particular
instead of:

s:   ;; X / 2
        movl 4(%esp), %eax
        movl %eax, %ecx
        shrl $31, %ecx
        movl %eax, %edx
        addl %ecx, %edx
        sarl $1, %eax
        ret

t:   ;; X / -2
        movl 4(%esp), %eax
        movl %eax, %ecx
        shrl $31, %ecx
        movl %eax, %edx
        addl %ecx, %edx
        sarl $1, %eax
        negl %eax
        ret

Emit:

s:
        movl 4(%esp), %eax
        cmpl $-2147483648, %eax
        sbbl $-1, %eax
        sarl $1, %eax
        ret

t:
        movl 4(%esp), %eax
        cmpl $-2147483648, %eax
        sbbl $-1, %eax
        sarl $1, %eax
        negl %eax
        ret

llvm-svn: 16760
2004-10-06 04:02:39 +00:00
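The cmpl/sbbl trick in commit 147edd2f7e adds 1 to negative inputs before the shift. The sketch below replays it with hypothetical helper names; Python integers stand in for 32-bit registers, and `MASK32` gives the unsigned view that the carry flag depends on.

```python
MASK32 = 0xFFFFFFFF
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def sdiv2(x):
    """X / 2 following the emitted cmpl/sbbl/sarl sequence."""
    carry = 1 if (x & MASK32) < 0x80000000 else 0  # cmpl: CF is set when x >= 0
    adjusted = x + 1 - carry                       # sbbl $-1: x - (-1) - CF
    return adjusted >> 1                           # sarl $1


def sdiv_neg2(x):
    """X / -2: the same sequence followed by negl."""
    return -sdiv2(x)


def trunc_div(x, c):
    """Reference division truncating toward zero."""
    q = abs(x) // abs(c)
    return q if (x < 0) == (c < 0) else -q


samples = list(range(-100, 101)) + [INT_MIN, INT_MAX]
print(all(sdiv2(x) == trunc_div(x, 2) for x in samples))
print(all(sdiv_neg2(x) == trunc_div(x, -2) for x in samples))
```

Compared as unsigned values, non-negative inputs are below 0x80000000 and set the carry flag, so the subtract-with-borrow adds 1 only when the input is negative. That is the rounding bias a divide by 2 needs.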
Chris Lattner e9bfa5a2a4 Add some new instructions. Fix the asm string for sbb32rr
llvm-svn: 16759
2004-10-06 04:01:02 +00:00
Chris Lattner 2ce32df8b0 Reduce code growth implied by the tail duplication pass by not duplicating
an instruction if it can be hoisted to a common dominator of the block.
This implements: test/Regression/Transforms/TailDup/MergeTest.ll

llvm-svn: 16758
2004-10-06 03:27:37 +00:00
Chris Lattner 7d83efbc0b When tail duplicating these functions, the add instruction should not be
duplicated, even though the block it is in is duplicated.

llvm-svn: 16757
2004-10-06 03:26:38 +00:00
Chris Lattner 32ed828f46 FreeBSD uses GCC. Patch contributed by Jeff Cohen!
llvm-svn: 16756
2004-10-06 03:15:44 +00:00
Chris Lattner 18b88f71ad Fix the path to the fixinc'd headers. Patch contributed by Jeff Cohen!
llvm-svn: 16755
2004-10-06 03:13:47 +00:00
Brian Gaeke c5a630bd3c Must include sys/stat.h before declaring a 'struct stat'
llvm-svn: 16728
2004-10-05 18:46:59 +00:00
Brian Gaeke a3d1b776b9 Build BFtoLLVM example front-end by default
llvm-svn: 16719
2004-10-05 18:05:53 +00:00
Brian Gaeke ca70a78b71 Add BFtoLLVM example front end
llvm-svn: 16714
2004-10-05 18:05:25 +00:00
Chris Lattner 9b38ead89a Make sure the const bit gets inherited correctly when linking declarations
of disagreeing constness.  This fixes
test/Regression/Linker/ConstantGlobals[123].ll

llvm-svn: 16692
2004-10-05 02:28:11 +00:00
Chris Lattner 07d1d7ede5 Another testcase for constness linkage
llvm-svn: 16691
2004-10-05 02:16:01 +00:00
Chris Lattner e0d464bda2 Testcase to ensure that the 'constant' flag follows the definition when there
is a question.

llvm-svn: 16690
2004-10-05 02:12:20 +00:00
Reid Spencer abb04cfc79 Adjust sys/stat.h inclusion so it's only for SunOS.
llvm-svn: 16686
2004-10-05 00:56:46 +00:00
Tanya Lattner c3ef3cc7e5 Added a couple of includes to get this to compile on Sparc.
llvm-svn: 16685
2004-10-05 00:51:26 +00:00
Chris Lattner 9895937618 Solaris doesn't have MAP_FILE.
llvm-svn: 16682
2004-10-05 00:46:21 +00:00
Chris Lattner 2426ff991c Bug fixed
llvm-svn: 16671
2004-10-05 00:23:02 +00:00
Chris Lattner db76a3db91 New testcase for PR450
llvm-svn: 16670
2004-10-05 00:18:21 +00:00
Reid Spencer defd9699e6 Add checks for the ZLIB and BZIP2 header files, not just the libraries.
llvm-svn: 16669
2004-10-04 22:05:53 +00:00
Chris Lattner 69aa178674 Fix #include flavor
llvm-svn: 16658
2004-10-04 18:10:18 +00:00
Reid Spencer 38b846c8e1 Move the warning about no compression library down to the bottom, away
from the fray, so it gets noticed. This commit is made without the
corresponding configure script commit because it doesn't affect
functionality and we don't want to force everyone into another reconfigure.

llvm-svn: 16657
2004-10-04 18:02:55 +00:00
Reid Spencer 266cd00360 Fix typo in makefile variable name that prevents zlib from being recognized
llvm-svn: 16656
2004-10-04 17:49:19 +00:00
Reid Spencer 1bd6da2293 Add HAVE_BZIP2 and HAVE_ZLIB
llvm-svn: 16655
2004-10-04 17:48:37 +00:00
Reid Spencer 04f1e90657 Excise the ill-advised RLCOMP compression algorithm and simply leave the
previously temporary NULLCOMP implementation that merely copies the data
verbatim without compression. Also, don't warn if there's no compression
library as that is taken care of during configuration time.

llvm-svn: 16654
2004-10-04 17:45:44 +00:00
Misha Brukman c2fdea6d8f Add example 'abstract' architectures for LLI: MIX, MMIX, and DLX
llvm-svn: 16653
2004-10-04 17:36:35 +00:00
Reid Spencer 2e3cc54e42 Add a context for the callback so different compression scenarios can be
distinguished. Tidy up documentation.  Thanks, Chris.

llvm-svn: 16652
2004-10-04 17:29:25 +00:00
Reid Spencer 34637df55f Minor corrections suggested by Chris' ever-watchful eye.
llvm-svn: 16651
2004-10-04 17:26:26 +00:00