llvm-project

Commit Graph

Author	SHA1	Message	Date
Dan Gohman	c0964a571b	Re-apply r84295, with fixes to how the loop "top" and "bottom" blocks are tracked. Instead of trying to manually keep track of these locations while doing complex modifications, just recompute them when they're needed. This fixes a bug in which the TopMBB and BotMBB were not correctly updated, leading to invalid transformations. llvm-svn: 84598	2009-10-20 04:50:37 +00:00
Anton Korobeynikov	8383c3d7f3	Revert r84295, this unbreaks llvm-gcc bootstrap on x86-64/linux llvm-svn: 84516	2009-10-19 18:21:09 +00:00
Dan Gohman	0d3d9ee03e	Enhance CodePlacementOpt's unconditional intra-loop branch elimination logic to be more general and understand more varieties of loops. Teach CodePlacementOpt to reorganize the basic blocks of a loop so that they are contiguous. This also includes a fair amount of logic for preserving fall-through edges while doing so. This fixes a BranchFolding-ism where blocks which can't be made to use a fall-through edge and don't conveniently fit anywhere nearby get tossed out to the end of the function. llvm-svn: 84295	2009-10-17 00:32:43 +00:00
Evan Cheng	255f416470	Clean up spill weight computation. Also some changes to give loop induction variable increment / decrement slighter high priority. This has major impact on some micro-benchmarks. On MultiSource/Applications and spec tests, it's a minor win. It also reduce 256.bzip instruction count by 8%, 55 on 164.gzip on i386 / Darwin. llvm-svn: 82485	2009-09-21 21:12:25 +00:00
Dan Gohman	40503396da	Eliminate more uses of llvm-as and llvm-dis. llvm-svn: 81290	2009-09-08 23:54:48 +00:00
Evan Cheng	d67efaa847	Added a linearscan register allocation optimization. When the register allocator spill an interval with multiple uses in the same basic block, it creates a different virtual register for each of the reloads. e.g. %reg1498<def> = MOV32rm %reg1024, 1, %reg0, 12, %reg0, Mem:LD(4,4) [sunkaddr39 + 0] %reg1506<def> = MOV32rm %reg1024, 1, %reg0, 8, %reg0, Mem:LD(4,4) [sunkaddr42 + 0] %reg1486<def> = MOV32rr %reg1506 %reg1486<def> = XOR32rr %reg1486, %reg1498, %EFLAGS<imp-def,dead> %reg1510<def> = MOV32rm %reg1024, 1, %reg0, 4, %reg0, Mem:LD(4,4) [sunkaddr45 + 0] => %reg1498<def> = MOV32rm %reg2036, 1, %reg0, 12, %reg0, Mem:LD(4,4) [sunkaddr39 + 0] %reg1506<def> = MOV32rm %reg2037, 1, %reg0, 8, %reg0, Mem:LD(4,4) [sunkaddr42 + 0] %reg1486<def> = MOV32rr %reg1506 %reg1486<def> = XOR32rr %reg1486, %reg1498, %EFLAGS<imp-def,dead> %reg1510<def> = MOV32rm %reg2038, 1, %reg0, 4, %reg0, Mem:LD(4,4) [sunkaddr45 + 0] From linearscan's point of view, each of reg2036, 2037, and 2038 are separate registers, each is "killed" after a single use. The reloaded register is available and it's often clobbered right away. e.g. In thise case reg1498 is allocated EAX while reg2036 is allocated RAX. This means we end up with multiple reloads from the same stack slot in the same basic block. Now linearscan recognize there are other reloads from same SS in the same BB. So it'll "downgrade" RAX (and its aliases) after reg2036 is allocated until the next reload (reg2037) is done. This greatly increase the likihood reloads from SS are reused. This speeds up sha1 from OpenSSL by 5.8%. It is also an across the board win for SPEC2000 and 2006. llvm-svn: 69585	2009-04-20 08:01:12 +00:00

6 Commits