Commit Graph

14630 Commits

Reid Spencer f9fdfa7aa5 Add the --with-automake option to AutoRegen.sh and provide the automake
version of the configure script. This is an early commit of the automake
support so that it can be tested on multiple platforms. Many additional
Makefile.am files need to be added to LLVM before this is of any use.
Please wait until automake support is announced on the llvmdev list before
using the --with-automake option.

llvm-svn: 16837
2004-10-08 05:33:35 +00:00
Chris Lattner bff91d9a2e Instcombine (X & FF00) + xx00 -> (X+xx00) & FF00, implementing and.ll:test27
This comes up when doing adds to bitfield elements.
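
For illustration (a sketch that is not part of the commit; the struct and field names are invented), this is the kind of C source that produces the pattern: a read-modify-write of a bitfield member lowers to a masked load, an add of a shifted constant, and a re-mask.

struct Packed {
    unsigned low  : 8;
    unsigned mid  : 8;    /* lives in bits 8-15, i.e. under the 0xFF00 mask */
    unsigned rest : 16;
};

void bump_mid(struct Packed *p) {
    p->mid += 3;          /* an "add to a bitfield element" */
}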

llvm-svn: 16836
2004-10-08 05:07:56 +00:00
Chris Lattner 9467062bcf New testcase
llvm-svn: 16835
2004-10-08 05:03:25 +00:00
Chris Lattner 44bd392cbf Little patch to turn (shl (add X, 123), 4) -> (add (shl X, 4), 123 << 4)
This triggers in cases of bitfield additions, opening opportunities for
future improvements.
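
As a quick sanity check of the identity (a hedged sketch, not part of the commit): with unsigned arithmetic the wraparound is well defined, so the two forms agree for every input.

#include <assert.h>

unsigned before(unsigned x) { return (x + 123u) << 4; }           /* shl (add X, 123), 4      */
unsigned after(unsigned x)  { return (x << 4) + (123u << 4); }    /* add (shl X, 4), 123 << 4 */

int main(void) {
    for (unsigned x = 0; x < 1000000u; ++x)
        assert(before(x) == after(x));
    return 0;
}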

llvm-svn: 16834
2004-10-08 03:46:20 +00:00
Chris Lattner 7bfe4032fc New testcase
llvm-svn: 16833
2004-10-08 03:41:59 +00:00
Nate Begeman b58dd6799f Implement logical and with an immediate that consists of a contiguous block
of one or more 1 bits (which may wrap from the least significant bit to the most
significant bit) as a single rlwinm rather than andi., andis., or some longer
instruction sequence.

int andn4(int z) { return z & -4; }
int clearhi(int z) { return z & 0x0000FFFF; }
int clearlo(int z) { return z & 0xFFFF0000; }
int clearmid(int z) { return z & 0x00FFFF00; }
int clearwrap(int z) { return z & 0xFF0000FF; }

_andn4:
        rlwinm r3, r3, 0, 0, 29
        blr

_clearhi:
        rlwinm r3, r3, 0, 16, 31
        blr

_clearlo:
        rlwinm r3, r3, 0, 0, 15
        blr

_clearmid:
        rlwinm r3, r3, 0, 8, 23
        blr

_clearwrap:
        rlwinm r3, r3, 0, 24, 7
        blr
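
For reference, a hypothetical helper (not the LLVM selection code) that computes the rlwinm mask-begin/mask-end fields from such an immediate; IBM bit numbering is assumed, where bit 0 is the most significant bit.

#include <stdint.h>

static int clz32(uint32_t v) { int n = 0; while (!(v & 0x80000000u)) { v <<= 1; ++n; } return n; }
static int ctz32(uint32_t v) { int n = 0; while (!(v & 1u)) { v >>= 1; ++n; } return n; }

/* Returns 1 and fills mb/me if mask is a single contiguous run of 1 bits,
 * possibly wrapping from the least significant bit to the most significant
 * bit; returns 0 otherwise.  Callers must pass a non-zero mask. */
static int mask_to_mb_me(uint32_t mask, int *mb, int *me) {
    uint32_t m = mask;
    int wrapped = 0;
    if ((m & 1u) && (m & 0x80000000u) && ~m != 0u) {
        m = ~m;                               /* a wrapped run of 1s is a plain run of 0s */
        wrapped = 1;
    }
    uint32_t run = m >> ctz32(m);             /* shift the run down to bit 0 */
    if (((run + 1u) & run) != 0u) return 0;   /* not one contiguous run */
    int hi = clz32(m);                        /* IBM index of the run's first 1 bit */
    int lo = 31 - ctz32(m);                   /* IBM index of the run's last 1 bit  */
    if (!wrapped) { *mb = hi; *me = lo; }
    else          { *mb = lo + 1; *me = hi - 1; }
    return 1;
}

/* e.g. clearwrap's 0xFF0000FF yields mb = 24, me = 7, matching the rlwinm above */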

llvm-svn: 16832
2004-10-08 02:49:24 +00:00
Misha Brukman 3e0a20ebb8 Fix usage description typo
llvm-svn: 16831
2004-10-08 01:11:15 +00:00
Misha Brukman f5a92bda7b Make comment header span the entire line
llvm-svn: 16830
2004-10-08 01:10:52 +00:00
Misha Brukman ed3dc439e1 Describe how to configure tests to work with f2c
llvm-svn: 16829
2004-10-08 00:55:43 +00:00
Misha Brukman 8d08d36199 * Reformat to fit 80 cols
* Add missing <li> tags

llvm-svn: 16828
2004-10-08 00:41:27 +00:00
Nate Begeman 6e6514c47e Several fixes and enhancements to the PPC32 backend.
1. Fix an illegal argument to getClassB when deciding whether or not to
   sign extend a byte load.

2. Initial addition of isLoad and isStore flags to the instruction .td file
   for eventual use in a scheduler.

3. Rewrite of how constants are handled in emitSimpleBinaryOperation so
   that we can emit the PowerPC shifted immediate instructions far more
   often.  This allows us to emit the following code:

int foo(int x) { return x | 0x00F0000; }

_foo:
.LBB_foo_0:     ; entry
        ; IMPLICIT_DEF
        oris r3, r3, 15
        blr
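
As a hedged sketch of the immediate-selection idea (a hypothetical helper, not the backend code): an OR with a constant needs only one instruction when the constant fits entirely in one halfword, which is why 0x000F0000 above becomes a single oris.

#include <stdint.h>

enum OrForm { OR_ORI, OR_ORIS, OR_ORI_ORIS };

static enum OrForm classify_or_imm(uint32_t imm) {
    if ((imm & 0xFFFF0000u) == 0u) return OR_ORI;   /* ori  rD, rS, imm       */
    if ((imm & 0x0000FFFFu) == 0u) return OR_ORIS;  /* oris rD, rS, imm >> 16 */
    return OR_ORI_ORIS;                             /* needs both halves      */
}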

llvm-svn: 16826
2004-10-07 22:30:03 +00:00
Nate Begeman c6b63cd2ed Add ori reg, reg, 0 as a move instruction. This can be generated when
loading a 32-bit constant into a register whose low halfword is all zeroes.

We now omit the ori after the lis for the following C code:

int bar(int y) { return y * 0x00F0000; }

_bar:
.LBB_bar_0:     ; entry
        ; IMPLICIT_DEF
        lis r2, 15
        mullw r3, r3, r2
        blr
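
A minimal sketch of the materialization rule this relies on (illustrative only, not the emitter): lis sets the high halfword, and the trailing ori is needed only when the low halfword is non-zero.

#include <stdint.h>
#include <stdio.h>

static void emit_li32(uint32_t c) {
    printf("\tlis r2, %d\n", (int16_t)(c >> 16));              /* high halfword        */
    if ((c & 0xFFFFu) != 0u)
        printf("\tori r2, r2, %u\n", (unsigned)(c & 0xFFFFu)); /* omitted for 0x000F0000 */
}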

llvm-svn: 16825
2004-10-07 22:26:12 +00:00
Nate Begeman 70a9d9c0b1 Remove unnecessary header include
llvm-svn: 16824
2004-10-07 22:24:32 +00:00
Chris Lattner 617f1a34f1 Improve comments, no functionality changes
llvm-svn: 16814
2004-10-07 21:30:30 +00:00
Chris Lattner 3ae7bb6b7c Fix a nasty dangling pointer problem due to a freed pointer being left in
a map.  This caused problems if a later object happened to be allocated at
the freed object's address.
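
A minimal sketch of the failure mode (illustrative, not the actual code): the freed object's address stays behind as a map key, and a later allocation that reuses the address collides with the stale entry.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct Node { int id; };

int main(void) {
    struct Node *a = malloc(sizeof *a);
    uintptr_t stale_key = (uintptr_t)a;   /* left behind in a pointer-keyed map */
    free(a);                              /* ...but never erased from the map   */

    struct Node *b = malloc(sizeof *b);   /* may be placed at the same address  */
    if ((uintptr_t)b == stale_key)
        puts("new object aliases the stale map entry");
    free(b);
    return 0;
}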

llvm-svn: 16813
2004-10-07 20:01:31 +00:00
Chris Lattner b0b1cb2182 Get friendly with Type
llvm-svn: 16812
2004-10-07 19:21:43 +00:00
Chris Lattner 251093ca5d Unfortunately the fix for the previous bug reintroduced the earlier
exponential behavior (bork!).  This patch restructures the processing around
an explicit SCC finder, making the algorithm clearer, more
efficient, and also (as a bonus) correct!  This gets us back to taking
0.6s to disassemble my horrible .bc file that previously took something
> 30 mins.

llvm-svn: 16811
2004-10-07 19:20:48 +00:00
Chris Lattner d299ffee01 Change signature of this method again
llvm-svn: 16810
2004-10-07 19:19:12 +00:00
Chris Lattner 31d9e6f922 These files now live in Transforms/GlobalOpt
llvm-svn: 16809
2004-10-07 19:16:43 +00:00
Chris Lattner 5860106954 Move these files from Transforms/GlobalConstifier
llvm-svn: 16808
2004-10-07 19:16:26 +00:00
Chris Lattner cef3c06027 Fix a bug in my previous change. Unfortunately this reverts most of the
speedup, but has the advantage of not breaking a bunch of programs!

llvm-svn: 16806
2004-10-07 16:19:40 +00:00
Reid Spencer 50a425a56d Make these scripts work on SunOS too.
llvm-svn: 16805
2004-10-07 16:03:21 +00:00
Chris Lattner 02b6c918b7 Fix a bug in the safety analysis routine
llvm-svn: 16804
2004-10-07 06:01:25 +00:00
Chris Lattner f64799683e Comment cleanups
llvm-svn: 16803
2004-10-07 06:00:24 +00:00
Chris Lattner 25db58032d * Rename pass to globalopt, since we do more than just constify
* Instead of handling dead functions specially, just nuke them.
* Be more aggressive about cleaning up after constification, in
  particular, handle getelementptr instructions and constantexprs.
* Be a little bit more structured about how we process globals.

*** Delete globals that are only stored to, and never read.  These are
    clearly not useful, so they should go.  This implements deadglobal.llx

This last one triggers quite a few times.  In particular, 2208 in the
external tests, 1865 of which are in 252.eon.  This shrinks eon from
1995094 to 1732341 bytes of bytecode.
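
An illustrative C-level example of the store-only case (names invented): nothing ever reads the global, so both the stores and the global itself can be removed.

static int debug_last_seen;        /* written below, never read anywhere */

void record(int v) {
    debug_last_seen = v;           /* dead once the global is deleted */
}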

llvm-svn: 16802
2004-10-07 04:16:33 +00:00
Chris Lattner fa3cfd3955 Rename pass
llvm-svn: 16801
2004-10-07 04:12:02 +00:00
Chris Lattner b0c8aab038 This pass is not needed, as there is only ever one global: the stack
llvm-svn: 16800
2004-10-07 04:10:36 +00:00
Chris Lattner 381fbf1616 Add new testcase, rename pass
llvm-svn: 16799
2004-10-07 04:07:08 +00:00
Chris Lattner fc303099c9 Don't add libz or libbz2 to the USEDLIBS lists; those are for LLVM libraries.
llvm-svn: 16798
2004-10-07 00:03:11 +00:00
Chris Lattner 76319a83bd Don't call memset if malloc returns a null pointer
llvm-svn: 16797
2004-10-06 23:08:03 +00:00
Chris Lattner 1f849a08a3 Implement GlobalConstifier/trivialstore.llx, and also do some
simplifications of the resultant program to avoid making later passes
do it all.

This allows us to constify globals whose only stores write the same constant
value they are initialized with.

Surprisingly this comes up ALL of the freaking time, dozens of times in
SPEC, 30 times in vortex alone.

For example, on 256.bzip2, it allows us to constify these two globals:

%smallMode = internal global ubyte 0             ; <ubyte*> [#uses=8]
%verbosity = internal global int 0               ; <int*> [#uses=49]

Which (with later optimizations) results in the bytecode file shrinking
from 82286 to 69686 bytes!  Let's hear it for IPO :)

For the record, it's nuking lots of "if (verbosity > 2) { do lots of stuff }"
code.
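
An illustrative C-level shape of the trivial-store case (invented names, not from SPEC): the only store writes the value the global was initialized with, so the global can be treated as constant and the guarded code folded away.

static int verbosity = 0;          /* initializer is 0 */

void reset_options(void) {
    verbosity = 0;                 /* the only store writes the initializer value */
}

int report(void) {
    if (verbosity > 2)             /* after constification this folds to if (0 > 2) */
        return 1;                  /* "do lots of stuff" */
    return 0;
}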

llvm-svn: 16793
2004-10-06 20:57:02 +00:00
Chris Lattner 645bcf6c5d New testcase
llvm-svn: 16791
2004-10-06 20:42:51 +00:00
Chris Lattner af88fcd4c9 Don't let null nodes sneak past cast instructions
llvm-svn: 16779
2004-10-06 19:29:13 +00:00
Misha Brukman fe643e314f Undoxyfy internal method.
llvm-svn: 16774
2004-10-06 17:19:58 +00:00
Misha Brukman 74a1195bd6 Doxygen-ify comments
llvm-svn: 16773
2004-10-06 16:56:16 +00:00
Chris Lattner 43e03c9cdf Change Type::isAbstract to have better comments, a more correct name
(PromoteAbstractToConcrete), and to use a set to avoid recomputation.
In particular, this set eliminates the potentially exponential cases
from this little recursive algorithm.

On a particularly nasty testcase, llvm-dis on the .bc file went from 34
minutes (which is when I killed it; it still hadn't finished) to 0.57s.
Remember kids, exponential algorithms are bad.
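
A minimal sketch of the memoization idea (invented names, not Type's actual code; assumes the type graph is acyclic): caching each node's answer after its first visit turns the exponential walk over shared subtrees into a linear one.

#include <stdbool.h>

#define MAX_TYPES 1024

struct Ty {
    int id;                        /* index into the caches below */
    bool is_abstract_leaf;
    int num_operands;
    struct Ty *operands[2];        /* shared subtrees cause the blowup */
};

static bool visited[MAX_TYPES], cached[MAX_TYPES];

static bool contains_abstract(struct Ty *t) {
    if (visited[t->id]) return cached[t->id];     /* the "set" that kills the recomputation */
    bool result = t->is_abstract_leaf;
    for (int i = 0; i < t->num_operands && !result; ++i)
        result = contains_abstract(t->operands[i]);
    visited[t->id] = true;
    cached[t->id] = result;
    return result;
}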

llvm-svn: 16772
2004-10-06 16:36:46 +00:00
Chris Lattner f29560783a Rename method, change comment, add argument
llvm-svn: 16771
2004-10-06 16:34:23 +00:00
Chris Lattner f94f985bbd Correct some typos
llvm-svn: 16770
2004-10-06 16:28:24 +00:00
Chris Lattner 0aee4b7947 Instcombine: -(X sdiv C) -> (X sdiv -C), tested by sub.ll:test16
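
A quick check of the fold (sketch only; C's / already truncates toward zero like sdiv, and the equivalence holds whenever neither division overflows):

#include <assert.h>

int before(int x) { return -(x / 7); }   /* -(X sdiv 7) */
int after(int x)  { return x / -7; }     /* X sdiv -7   */

int main(void) {
    for (int x = -100000; x <= 100000; ++x)
        assert(before(x) == after(x));
    return 0;
}
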
llvm-svn: 16769
2004-10-06 15:08:25 +00:00
Chris Lattner 52783ab1d8 New testcase
llvm-svn: 16768
2004-10-06 15:07:56 +00:00
Chris Lattner 93867e516a Remove debugging code, fix encoding problem. This fixes the problems
the JIT had last night.

llvm-svn: 16766
2004-10-06 14:31:50 +00:00
Nate Begeman 9a1fbaf1e9 Turn on fsel code generation now that we can do so.
llvm-svn: 16765
2004-10-06 11:03:30 +00:00
Nate Begeman fac8529df8 Implement floating point select for lt, gt, le, ge using the PowerPC fsel
instruction.

Now, rather than emitting the following loop out of bisect:
.LBB_main_19:	; no_exit.0.i
	rlwinm r3, r2, 3, 0, 28
	lfdx f1, r3, r27
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f2, lo16(.CPI_main_1-"L00000$pb")(r3)
	fsub f2, f2, f1
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3)
	fcmpu cr0, f1, f4
	bge .LBB_main_64	; no_exit.0.i
.LBB_main_63:	; no_exit.0.i
	b .LBB_main_65	; no_exit.0.i
.LBB_main_64:	; no_exit.0.i
	fmr f2, f1
.LBB_main_65:	; no_exit.0.i
	addi r3, r2, 1
	rlwinm r3, r3, 3, 0, 28
	lfdx f1, r3, r27
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3)
	fsub f4, f4, f1
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f5, lo16(.CPI_main_1-"L00000$pb")(r3)
	fcmpu cr0, f1, f5
	bge .LBB_main_67	; no_exit.0.i
.LBB_main_66:	; no_exit.0.i
	b .LBB_main_68	; no_exit.0.i
.LBB_main_67:	; no_exit.0.i
	fmr f4, f1
.LBB_main_68:	; no_exit.0.i
	fadd f1, f2, f4
	addis r3, r30, ha16(.CPI_main_2-"L00000$pb")
	lfd f2, lo16(.CPI_main_2-"L00000$pb")(r3)
	fmul f1, f1, f2
	rlwinm r3, r2, 3, 0, 28
	lfdx f2, r3, r28
	fadd f4, f2, f1
	fcmpu cr0, f4, f0
	bgt .LBB_main_70	; no_exit.0.i
.LBB_main_69:	; no_exit.0.i
	b .LBB_main_71	; no_exit.0.i
.LBB_main_70:	; no_exit.0.i
	fmr f0, f4
.LBB_main_71:	; no_exit.0.i
	fsub f1, f2, f1
	addi r2, r2, -1
	fcmpu cr0, f1, f3
	blt .LBB_main_73	; no_exit.0.i
.LBB_main_72:	; no_exit.0.i
	b .LBB_main_74	; no_exit.0.i
.LBB_main_73:	; no_exit.0.i
	fmr f3, f1
.LBB_main_74:	; no_exit.0.i
	cmpwi cr0, r2, -1
	fmr f16, f0
	fmr f17, f3
	bgt .LBB_main_19	; no_exit.0.i

We emit this instead:
.LBB_main_19:	; no_exit.0.i
	rlwinm r3, r2, 3, 0, 28
	lfdx f1, r3, r27
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f2, lo16(.CPI_main_1-"L00000$pb")(r3)
	fsub f2, f2, f1
	fsel f1, f1, f1, f2
	addi r3, r2, 1
	rlwinm r3, r3, 3, 0, 28
	lfdx f2, r3, r27
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3)
	fsub f4, f4, f2
	fsel f2, f2, f2, f4
	fadd f1, f1, f2
	addis r3, r30, ha16(.CPI_main_2-"L00000$pb")
	lfd f2, lo16(.CPI_main_2-"L00000$pb")(r3)
	fmul f1, f1, f2
	rlwinm r3, r2, 3, 0, 28
	lfdx f2, r3, r28
	fadd f4, f2, f1
	fsub f5, f0, f4
	fsel f0, f5, f0, f4
	fsub f1, f2, f1
	addi r2, r2, -1
	fsub f2, f1, f3
	fsel f3, f2, f3, f1
	cmpwi cr0, r2, -1
	fmr f16, f0
	fmr f17, f3
	bgt .LBB_main_19	; no_exit.0.i
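
For reference, a C model of the selection (a sketch, ignoring the NaN and signed-zero caveats that restrict when the backend may use fsel): fsel fd, fa, fx, fy yields fx when fa >= 0.0 and fy otherwise, which is how the compare-and-branch max patterns above collapse to a subtract plus one fsel.

double fsel_model(double a, double x, double y) {
    return (a >= 0.0) ? x : y;             /* model only; real fsel has NaN/-0.0 caveats */
}

/* Branch-free max, matching "fsub f5, f0, f4 ; fsel f0, f5, f0, f4" above. */
double fmax_fsel(double x, double y) {
    return fsel_model(x - y, x, y);
}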

llvm-svn: 16764
2004-10-06 09:53:04 +00:00
Chris Lattner 6835dedb5b Codegen signed mod by 2 or -2 more efficiently. Instead of generating:
t:
        mov %EDX, DWORD PTR [%ESP + 4]
        mov %ECX, 2
        mov %EAX, %EDX
        sar %EDX, 31
        idiv %ECX
        mov %EAX, %EDX
        ret

Generate:
t:
        mov %ECX, DWORD PTR [%ESP + 4]
***     mov %EAX, %ECX
        cdq
        and %ECX, 1
        xor %ECX, %EDX
        sub %ECX, %EDX
***     mov %EAX, %ECX
        ret

Note that the two marked moves are redundant, and should be eliminated by the
register allocator, but aren't.

Compare this to GCC, which generates:

t:
        mov     %eax, DWORD PTR [%esp+4]
        mov     %edx, %eax
        shr     %edx, 31
        lea     %ecx, [%edx+%eax]
        and     %ecx, -2
        sub     %eax, %ecx
        ret

or ICC 8.0, which generates:

t:
        movl      4(%esp), %ecx                                 #3.5
        movl      $-2147483647, %eax                            #3.25
        imull     %ecx                                          #3.25
        movl      %ecx, %eax                                    #3.25
        sarl      $31, %eax                                     #3.25
        addl      %ecx, %edx                                    #3.25
        subl      %edx, %eax                                    #3.25
        addl      %eax, %eax                                    #3.25
        negl      %eax                                          #3.25
        subl      %eax, %ecx                                    #3.25
        movl      %ecx, %eax                                    #3.25
        ret                                                     #3.25

We would be in great shape if not for the moves.
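
The new sequence computes a branch-free remainder; here is a hedged C sketch of the identity it relies on (x % 2 and x % -2 agree under truncating division, and >> on a negative int is assumed to be arithmetic):

#include <assert.h>

int rem2_fast(int x) {
    int sign = x >> 31;                   /* all ones when x is negative (assumes arithmetic shift) */
    return ((x & 1) ^ sign) - sign;       /* the cdq/and/xor/sub sequence above */
}

int main(void) {
    for (int x = -100000; x <= 100000; ++x) {
        assert(rem2_fast(x) == x % 2);
        assert(rem2_fast(x) == x % -2);
    }
    return 0;
}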

llvm-svn: 16763
2004-10-06 05:01:07 +00:00
Chris Lattner e4c60eb704 Really fix FreeBSD, which apparently doesn't tolerate the extern.
Thanks to Jeff Cohen for pointing out my goof.

llvm-svn: 16762
2004-10-06 04:21:52 +00:00
Chris Lattner 7bd8f1332d Fix a scary bug with signed division by a power of two. We used to generate:
s:   ;; X / 4
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, %EAX
        sar %ECX, 1
        shr %ECX, 30
        mov %EDX, %EAX
        add %EDX, %ECX
        sar %EAX, 2
        ret

When we really meant:

s:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, %EAX
        sar %ECX, 1
        shr %ECX, 30
        add %EAX, %ECX
        sar %EAX, 2
        ret

Hey, this also reduces register pressure too :)
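
As a hedged C sketch of what the corrected sequence computes (assuming arithmetic right shift on negative ints): negative dividends get a bias of 3 added so the final shift rounds toward zero, matching C division.

#include <assert.h>

int div4_fast(int x) {
    int bias = (int)((unsigned)(x >> 1) >> 30);   /* 3 when x < 0, else 0 (the sar/shr pair above) */
    return (x + bias) >> 2;                       /* sar by 2 */
}

int main(void) {
    for (int x = -1000000; x <= 1000000; ++x)
        assert(div4_fast(x) == x / 4);
    return 0;
}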

llvm-svn: 16761
2004-10-06 04:19:43 +00:00
Chris Lattner 147edd2f7e Codegen signed divides by 2 and -2 more efficiently. In particular
instead of:

s:   ;; X / 2
        movl 4(%esp), %eax
        movl %eax, %ecx
        shrl $31, %ecx
        movl %eax, %edx
        addl %ecx, %edx
        sarl $1, %eax
        ret

t:   ;; X / -2
        movl 4(%esp), %eax
        movl %eax, %ecx
        shrl $31, %ecx
        movl %eax, %edx
        addl %ecx, %edx
        sarl $1, %eax
        negl %eax
        ret

Emit:

s:
        movl 4(%esp), %eax
        cmpl $-2147483648, %eax
        sbbl $-1, %eax
        sarl $1, %eax
        ret

t:
        movl 4(%esp), %eax
        cmpl $-2147483648, %eax
        sbbl $-1, %eax
        sarl $1, %eax
        negl %eax
        ret
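
As a hedged C sketch of what the cmpl/sbbl pair accomplishes (assuming arithmetic right shift): it adds 1 to negative inputs so the shift truncates toward zero, and the divide by -2 is the same followed by a negate.

#include <assert.h>

int div2_fast(int x)  { return (x + (int)((unsigned)x >> 31)) >> 1; }   /* the cmp/sbb/sar above */
int divm2_fast(int x) { return -div2_fast(x); }

int main(void) {
    for (int x = -1000000; x <= 1000000; ++x) {
        assert(div2_fast(x)  == x / 2);
        assert(divm2_fast(x) == x / -2);
    }
    return 0;
}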

llvm-svn: 16760
2004-10-06 04:02:39 +00:00
Chris Lattner e9bfa5a2a4 Add some new instructions. Fix the asm string for sbb32rr
llvm-svn: 16759
2004-10-06 04:01:02 +00:00
Chris Lattner 2ce32df8b0 Reduce code growth implied by the tail duplication pass by not duplicating
an instruction if it can be hoisted to a common dominator of the block.
This implements: test/Regression/Transforms/TailDup/MergeTest.ll

llvm-svn: 16758
2004-10-06 03:27:37 +00:00
Chris Lattner 7d83efbc0b When tail duplicating these functions, the add instruction should not be
duplicated, even though the block it is in is duplicated.

llvm-svn: 16757
2004-10-06 03:26:38 +00:00