llvm-project

Commit Graph

Author	SHA1	Message	Date
Chris Lattner	28a0885929	No need to clear the map here, it will always be empty llvm-svn: 11868	2004-02-26 05:21:21 +00:00
Chris Lattner	5e39cf9fbd	Fix a bug in the densemap that was killing the local allocator, and probably other clients. The problem is that the nullVal member was left to the default constructor to initialize, which for ints does nothing (i.e., leaves it unspecified). To get a zero value, we must use T(). Isn't C++ wonderful? :) llvm-svn: 11867	2004-02-26 05:00:15 +00:00
Alkis Evlogimenos	e008a4b28f	Remove .micro references as those files no longer exist and add some more recent Makefile additions to the list llvm-svn: 11866	2004-02-26 04:14:10 +00:00
Chris Lattner	973556b724	Fix typo. grow() cannot shrink storage. clear() should really nuke storage llvm-svn: 11865	2004-02-26 04:07:12 +00:00
Chris Lattner	36ab728fe5	Fix typo llvm-svn: 11864	2004-02-26 03:45:03 +00:00
Chris Lattner	128e84197b	The node doesn't have to have _no_ node flags, it just has to be complete and not have any globals. llvm-svn: 11863	2004-02-26 03:43:43 +00:00
Chris Lattner	c8167b0e7e	Add _more_ functions llvm-svn: 11862	2004-02-26 03:43:08 +00:00
Chris Lattner	73687be9d7	We have this snazzy link-time optimizer. How about we start using it? This removes some cruft from 255.vortex, cleaning up after DAE and IPCP, which do horrible, beautiful, things to vortex. llvm-svn: 11861	2004-02-26 03:34:30 +00:00
Chris Lattner	9192bbdad9	Fix some warnings, some of which were spurious, and some of which were real bugs. Thanks Brian! llvm-svn: 11859	2004-02-26 01:20:02 +00:00
Misha Brukman	1743c4090d	Instructions to call and return from functions. llvm-svn: 11858	2004-02-26 00:37:12 +00:00
Brian Gaeke	11331e5d59	One B00g fixed. llvm-svn: 11857	2004-02-26 00:08:25 +00:00
Alkis Evlogimenos	802cf52b91	Temporarily comment out asserts as they break things. I will uncomment them when all the problem areas are fixed. llvm-svn: 11855	2004-02-25 23:56:36 +00:00
Alkis Evlogimenos	19aaae3f3b	Fix typo. I wonder how this actually worked. llvm-svn: 11854	2004-02-25 23:47:17 +00:00
Alkis Evlogimenos	2cf83d3401	Complete the SPEC_ROOT and USE_SPEC to SPEC2000_ROOT and USE_SPEC200 rename. llvm-svn: 11853	2004-02-25 23:41:32 +00:00
Chris Lattner	71626b8f36	Two changes: 1. Functions do not make things incomplete, only variables 2. Constant global variables no longer need to be marked incomplete, because we are guaranteed that the initializer for the global will be in the graph we are hacking on now. This makes resolution of indirect calls happen a lot more in the bu pass, supports things like vtables and the C counterparts (giant constant arrays of function pointers), etc... Testcase here: test/Regression/Analysis/DSGraph/constant_globals.ll llvm-svn: 11852	2004-02-25 23:36:08 +00:00
Chris Lattner	fc0912d02a	New testcase llvm-svn: 11851	2004-02-25 23:34:04 +00:00
Chris Lattner	fab2872b6c	When building local graphs, clone the initializer for constant globals into each local graph that uses the global. llvm-svn: 11850	2004-02-25 23:31:02 +00:00
Alkis Evlogimenos	e62ddd405d	Fix bugs found with recent addition of assertions in MRegisterInfo::is{Physical,Virtual}Register. llvm-svn: 11849	2004-02-25 23:21:52 +00:00
Chris Lattner	6ce59b4a03	Simplify the dead node elimination stuff. Make the incompleteness marker faster by looping directly over the globals instead of over the scalars to find the globals. Fix a bug where we didn't mark a global incomplete if it didn't have any outgoing edges. This wouldn't break any current clients but is still wrong. llvm-svn: 11848	2004-02-25 23:08:00 +00:00
Chris Lattner	5e5e060618	Add a bunch more functions llvm-svn: 11847	2004-02-25 23:06:40 +00:00
Chris Lattner	17bce88100	Try harder to get symbol info llvm-svn: 11846	2004-02-25 23:06:30 +00:00
Brian Gaeke	7b4be13f94	Represent va_list in interpreter as a (ec-stack-depth . var-arg-index) pair, and look up varargs in the execution stack every time, instead of just pushing iterators (which can be invalidated during callFunction()) around. (union GenericValue now has a "pair of uints" member, to support this mechanism.) Fixes Bug 234. llvm-svn: 11845	2004-02-25 23:01:48 +00:00
Brian Gaeke	84b76c9be0	Great sparc renaming fallout IV: Sparc --> SparcV9. llvm-svn: 11844	2004-02-25 22:09:36 +00:00
Alkis Evlogimenos	ae54cfc19f	Duh, forgot to close the parenthesis. llvm-svn: 11843	2004-02-25 22:07:14 +00:00
Alkis Evlogimenos	cb69f50cb5	Add assert to isPhysicalRegister and isVirtualRegister to fail when passed the special 'register' 0. llvm-svn: 11842	2004-02-25 22:04:28 +00:00
Alkis Evlogimenos	a9f03fba9d	Remove assert since it is breaking cases that it shouldn't. llvm-svn: 11841	2004-02-25 22:01:06 +00:00
Alkis Evlogimenos	d8bace7f60	Add DenseMap template and actually use it for mapping virtual regs to objects. llvm-svn: 11840	2004-02-25 21:55:45 +00:00
Chris Lattner	b66a35ef9c	Add a new pass, run internalize first llvm-svn: 11839	2004-02-25 21:35:13 +00:00
Chris Lattner	0f39359dd2	Add a new pass llvm-svn: 11838	2004-02-25 21:35:02 +00:00
Chris Lattner	14da4ead95	Add prototype llvm-svn: 11837	2004-02-25 21:34:51 +00:00
Chris Lattner	8d1da1abee	My faith in programmers has been found to be totally misplaced. One would assume that if they don't intend to write to a global variable, that they would mark it as constant. However, there are people that don't understand that the compiler can do nice things for them if they give it the information it needs. This pass looks for blatantly obvious globals that are only ever read from. Though it uses a trivially simple "alias analysis" of sorts, it is still able to do amazing things to important benchmarks. 253.perlbmk, for example, contains several *GIANT* function pointer tables that are not marked constant and should be. Marking them constant allows the optimizer to turn a whole bunch of indirect calls into direct calls. Note that only a link-time optimizer can do this transformation, but perlbmk does have several strings and other minor globals that can be marked constant by this pass when run from GCCAS. 176.gcc has a ton of strings and large tables that are marked constant, both at compile time (38 of them) and at link time (48 more). Other benchmarks give similar results, though it seems like big ones have disproportionally more than small ones. This pass is extremely quick and does good things. I'm going to enable it in gccas & gccld. Not bad for 50 SLOC. llvm-svn: 11836	2004-02-25 21:34:36 +00:00
Misha Brukman	564654d654	SparcV8 regs are really 32-bit, not 64! Thanks, Chris. llvm-svn: 11835	2004-02-25 21:03:02 +00:00
Misha Brukman	f8dcdcc83b	Clean up the tablegen descriptions for SparcV8. llvm-svn: 11834	2004-02-25 21:02:21 +00:00
Misha Brukman	2122b969f9	Fix the SparcV8 register definitions that were imported from PPC template. llvm-svn: 11833	2004-02-25 21:00:05 +00:00
Misha Brukman	0e3a7ca53e	SparcV8 has different types of instructions, but F1 is only used for CALL. llvm-svn: 11832	2004-02-25 20:52:20 +00:00
Brian Gaeke	232483aecc	Note that this test is currently expected to fail. llvm-svn: 11831	2004-02-25 20:34:02 +00:00
Chris Lattner	f5a393a133	Add an assertion llvm-svn: 11830	2004-02-25 19:37:44 +00:00
Chris Lattner	64c9b223bd	Fix failures in 099.go due to the cfgsimplify pass creating switch instructions where there were none before llvm-svn: 11829	2004-02-25 19:30:19 +00:00
Brian Gaeke	9a5bd7fca7	SparcV8 skeleton llvm-svn: 11828	2004-02-25 19:28:19 +00:00
Brian Gaeke	068b4596d4	Great renaming part II: Sparc --> SparcV9 (also includes command-line options and Makefiles) llvm-svn: 11827	2004-02-25 19:08:12 +00:00
Brian Gaeke	94e95d2b3e	Great renaming: Sparc --> SparcV9 llvm-svn: 11826	2004-02-25 18:44:15 +00:00
Chris Lattner	864c901444	Add a bunch more functions used by perlbmk llvm-svn: 11824	2004-02-25 17:43:20 +00:00
John Criswell	9f547bcea9	Updated to use llc to generate CBE code. llvm-svn: 11823	2004-02-25 17:15:02 +00:00
Chris Lattner	8ebf253827	Substantial improvements and cleanups for the release notes. We were missing a bunch of stuff! :) llvm-svn: 11822	2004-02-25 16:36:51 +00:00
Chris Lattner	9c6833c5ca	Fix incorrect debug code llvm-svn: 11821	2004-02-25 15:15:04 +00:00
Chris Lattner	309327a4b5	Teach the instruction selector how to transform 'array' GEP computations into X86 scaled indexes. This allows us to compile GEP's like this: int* %test([10 x { int, { int } }]* %X, int %Idx) { %Idx = cast int %Idx to long %X = getelementptr [10 x { int, { int } }]* %X, long 0, long %Idx, ubyte 1, ubyte 0 ret int* %X } Into a single address computation: test: mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, DWORD PTR [%ESP + 8] lea %EAX, DWORD PTR [%EAX + 8*%ECX + 4] ret Before it generated: test: mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, DWORD PTR [%ESP + 8] shl %ECX, 3 add %EAX, %ECX lea %EAX, DWORD PTR [%EAX + 4] ret This is useful for things like int/float/double arrays, as the indexing can be folded into the loads&stores, reducing register pressure and decreasing the pressure on the decode unit. With these changes, I expect our performance on 256.bzip2 and gzip to improve a lot. On bzip2 for example, we go from this: 10665 asm-printer - Number of machine instrs printed 40 ra-local - Number of loads/stores folded into instructions 1708 ra-local - Number of loads added 1532 ra-local - Number of stores added 1354 twoaddressinstruction - Number of instructions added 1354 twoaddressinstruction - Number of two-address instructions 2794 x86-peephole - Number of peephole optimization performed to this: 9873 asm-printer - Number of machine instrs printed 41 ra-local - Number of loads/stores folded into instructions 1710 ra-local - Number of loads added 1521 ra-local - Number of stores added 789 twoaddressinstruction - Number of instructions added 789 twoaddressinstruction - Number of two-address instructions 2142 x86-peephole - Number of peephole optimization performed ... and these types of instructions are often in tight loops. Linear scan is also helped, but not as much. It goes from: 8787 asm-printer - Number of machine instrs printed 2389 liveintervals - Number of identity moves eliminated after coalescing 2288 liveintervals - Number of interval joins performed 3522 liveintervals - Number of intervals after coalescing 5810 liveintervals - Number of original intervals 700 spiller - Number of loads added 487 spiller - Number of stores added 303 spiller - Number of register spills 1354 twoaddressinstruction - Number of instructions added 1354 twoaddressinstruction - Number of two-address instructions 363 x86-peephole - Number of peephole optimization performed to: 7982 asm-printer - Number of machine instrs printed 1759 liveintervals - Number of identity moves eliminated after coalescing 1658 liveintervals - Number of interval joins performed 3282 liveintervals - Number of intervals after coalescing 4940 liveintervals - Number of original intervals 635 spiller - Number of loads added 452 spiller - Number of stores added 288 spiller - Number of register spills 789 twoaddressinstruction - Number of instructions added 789 twoaddressinstruction - Number of two-address instructions 258 x86-peephole - Number of peephole optimization performed Though I'm not complaining about the drop in the number of intervals. :) llvm-svn: 11820	2004-02-25 07:00:55 +00:00
Chris Lattner	d1ee55d439	* Make the previous patch more efficient by not allocating a temporary MachineInstr to do analysis. * FOLD getelementptr instructions into loads and stores when possible, making use of some of the crazy X86 addressing modes. For example, the following C++ program fragment: struct complex { double re, im; complex(double r, double i) : re(r), im(i) {} }; inline complex operator+(const complex& a, const complex& b) { return complex(a.re+b.re, a.im+b.im); } complex addone(const complex& arg) { return arg + complex(1,0); } Used to be compiled to: _Z6addoneRK7complex: mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, DWORD PTR [%ESP + 8] * mov %EDX, %ECX fld QWORD PTR [%EDX] fld1 faddp %ST(1) * add %ECX, 8 fld QWORD PTR [%ECX] fldz faddp %ST(1) * mov %ECX, %EAX fxch %ST(1) fstp QWORD PTR [%ECX] *** add %EAX, 8 fstp QWORD PTR [%EAX] ret Now it is compiled to: _Z6addoneRK7complex: mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, DWORD PTR [%ESP + 8] fld QWORD PTR [%ECX] fld1 faddp %ST(1) fld QWORD PTR [%ECX + 8] fldz faddp %ST(1) fxch %ST(1) fstp QWORD PTR [%EAX] fstp QWORD PTR [%EAX + 8] ret Other programs should see similar improvements, across the board. Note that in addition to reducing instruction count, this also reduces register pressure a lot, always a good thing on X86. :) llvm-svn: 11819	2004-02-25 06:13:04 +00:00
Chris Lattner	4b3514c173	Add a helper to create an addressing mode given all of the pieces. llvm-svn: 11818	2004-02-25 06:01:07 +00:00
Chris Lattner	d825d30f42	add an inefficient way of folding structure and constant array indexes together into a single LEA instruction. This should improve the code generated for things like X->A.B.C[12].D. The bigger benefit is still coming though. Note that this uses an LEA instruction instead of an add, giving the register allocator more freedom. We should probably never generate ADDri32's. llvm-svn: 11817	2004-02-25 03:45:50 +00:00
Chris Lattner	f85e33cd79	Implement special case for storing an immediate into memory so that we don't need an intermediate register. llvm-svn: 11816	2004-02-25 02:56:58 +00:00
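
Note on commit 5e39cf9fbd: the fix hinges on the difference between default-initializing and value-initializing a member of a built-in type. The sketch below is only an illustration of that C++ rule; NullValueHolder and nullValueOf are hypothetical names and are not taken from the LLVM DenseMap source.

```cpp
// Hypothetical illustration, not the LLVM DenseMap source.
template <typename T>
struct NullValueHolder {
  T nullVal;
  // T() value-initializes the member: 0 for int, null for pointers.
  // Leaving nullVal out of the initializer list would default-initialize it,
  // which for built-in types such as int leaves the value indeterminate.
  NullValueHolder() : nullVal(T()) {}
};

// Convenience accessor used only for this example.
template <typename T>
T nullValueOf() { return NullValueHolder<T>().nullVal; }
```

With the explicit T() in the initializer list, nullValueOf<int>() yields zero regardless of where the holder object lives.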

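Note on commit 309327a4b5: the single LEA in that message computes base + 8*index + 4. The sketch below checks that arithmetic against ordinary C++ indexing. Elem, Inner, scaledAddress and fieldPtr are hypothetical names chosen to mirror the [10 x { int, { int } }] type in the message, and the constants 8 and 4 assume a target with 4-byte int and no extra padding.

```cpp
#include <cstddef>
#include <cstdint>

// Mirrors the { int, { int } } element type from the commit message.
struct Inner { std::int32_t v; };
struct Elem { std::int32_t a; Inner b; };

// Address formed the way one scaled-index LEA forms it:
// base + scale * index + displacement.
inline std::uintptr_t scaledAddress(std::uintptr_t base, std::size_t idx) {
  return base + sizeof(Elem) * idx + offsetof(Elem, b);
}

// The same field reached through ordinary array and member access.
inline std::int32_t *fieldPtr(Elem *x, std::size_t idx) {
  return &x[idx].b.v;
}
```

On such a target sizeof(Elem) is 8 and offsetof(Elem, b) is 4, which matches the [%EAX + 8*%ECX + 4] operand shown in the message.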