Commit Graph

11 Commits

Author SHA1 Message Date
Chandler Carruth 4190b507c5 Flip the new block-placement pass to be on by default.
This is mostly to test the waters. I'd like to get results from FNT
build bots and other bots running on non-x86 platforms.

This feature has been pretty heavily tested over the last few months by
me, and it fixes several of the execution time regressions caused by the
inlining work by preventing inlining decisions from radically impacting
block layout.

I've seen very large improvements in yacr2 and ackermann benchmarks,
along with the expected noise across all of the benchmark suite whenever
code layout changes. I've analyzed all of the regressions and fixed
them, or found them to be impossible to fix. See my email to llvmdev for
more details.

I'd like for this to be in 3.1 as it complements the inliner changes,
but if any failures are showing up or anyone has concerns, it is just
a flag flip and so can be easily turned off.

I'm switching it on tonight to try and get at least one run through
various folks' performance suites in case SPEC or something else has
serious issues with it. I'll watch bots and revert if anything shows up.

llvm-svn: 154816
2012-04-16 13:49:17 +00:00
Evan Cheng d983eba7dc Re-apply r124518 with fix. Watch out for invalidated iterator.
llvm-svn: 124526
2011-01-29 04:46:23 +00:00
Evan Cheng 65b8ccf6ac Revert r124518. It broke Linux self-host.
llvm-svn: 124522
2011-01-29 02:43:04 +00:00
Evan Cheng d4eff31476 Re-commit r124462 with fixes. Tail recursion elim will now dup ret into unconditional predecessor to enable TCE on demand.
llvm-svn: 124518
2011-01-29 01:29:26 +00:00
Evan Cheng aaa9606b2f Revert r124462. There are a few big regressions that I need to fix first.
llvm-svn: 124478
2011-01-28 07:12:38 +00:00
Evan Cheng 417fca86c4 - Stop simplifycfg from duplicating "ret" instructions into unconditional
branches. PR8575, rdar://5134905, rdar://8911460.
- Allow codegen tail duplication to dup small return blocks after register
  allocation is done.

llvm-svn: 124462
2011-01-28 02:19:21 +00:00
Dan Gohman 4fee6f3bdd Start function numbering at 0.
llvm-svn: 101638
2010-04-17 16:29:15 +00:00
Sean Callanan 04d8cb74f3 Instruction fixes, added instructions, and AsmString changes in the
X86 instruction tables.

Also (while I was at it) cleaned up the X86 tables, removing tabs and
80-line violations.

This patch was reviewed by Chris Lattner, but please let me know if
there are any problems.

* X86*.td
	Removed tabs and fixed 80-line violations

* X86Instr64bit.td
	(IRET, POPCNT, BT_, LSL, SWPGS, PUSH_S, POP_S, L_S, SMSW)
		Added
	(CALL, CMOV) Added qualifiers
	(JMP) Added PC-relative jump instruction
	(POPFQ/PUSHFQ) Added qualifiers; renamed PUSHFQ to indicate
		that it is 64-bit only (ambiguous since it has no
		REX prefix)
	(MOV) Added rr form going the other way, which is encoded
		differently
	(MOV) Changed immediates to offsets, which is more correct;
		also fixed MOV64o64a to have to a 64-bit offset
	(MOV) Fixed qualifiers
	(MOV) Added debug-register and condition-register moves
	(MOVZX) Added more forms
	(ADC, SUB, SBB, AND, OR, XOR) Added reverse forms, which
		(as with MOV) are encoded differently
	(ROL) Made REX.W required
	(BT) Uncommented mr form for disassembly only
	(CVT__2__) Added several missing non-intrinsic forms
	(LXADD, XCHG) Reordered operands to make more sense for
		MRMSrcMem
	(XCHG) Added register-to-register forms
	(XADD, CMPXCHG, XCHG) Added non-locked forms
* X86InstrSSE.td
	(CVTSS2SI, COMISS, CVTTPS2DQ, CVTPS2PD, CVTPD2PS, MOVQ)
		Added
* X86InstrFPStack.td
	(COM_FST0, COMP_FST0, COM_FI, COM_FIP, FFREE, FNCLEX, FNOP,
	 FXAM, FLDL2T, FLDL2E, FLDPI, FLDLG2, FLDLN2, F2XM1, FYL2X,
	 FPTAN, FPATAN, FXTRACT, FPREM1, FDECSTP, FINCSTP, FPREM,
	 FYL2XP1, FSINCOS, FRNDINT, FSCALE, FCOMPP, FXSAVE,
	 FXRSTOR)
		Added
	(FCOM, FCOMP) Added qualifiers
	(FSTENV, FSAVE, FSTSW) Fixed opcode names
	(FNSTSW) Added implicit register operand
* X86InstrInfo.td
	(opaque512mem) Added for FXSAVE/FXRSTOR
	(offset8, offset16, offset32, offset64) Added for MOV
	(NOOPW, IRET, POPCNT, IN, BTC, BTR, BTS, LSL, INVLPG, STR,
	 LTR, PUSHFS, PUSHGS, POPFS, POPGS, LDS, LSS, LES, LFS,
	 LGS, VERR, VERW, SGDT, SIDT, SLDT, LGDT, LIDT, LLDT,
	 LODSD, OUTSB, OUTSW, OUTSD, HLT, RSM, FNINIT, CLC, STC,
	 CLI, STI, CLD, STD, CMC, CLTS, XLAT, WRMSR, RDMSR, RDPMC,
	 SMSW, LMSW, CPUID, INVD, WBINVD, INVEPT, INVVPID, VMCALL,
	 VMCLEAR, VMLAUNCH, VMRESUME, VMPTRLD, VMPTRST, VMREAD,
	 VMWRITE, VMXOFF, VMXON) Added
	(NOOPL, POPF, POPFD, PUSHF, PUSHFD) Added qualifier
	(JO, JNO, JB, JAE, JE, JNE, JBE, JA, JS, JNS, JP, JNP, JL,
	 JGE, JLE, JG, JCXZ) Added 32-bit forms
	(MOV) Changed some immediate forms to offset forms
	(MOV) Added reversed reg-reg forms, which are encoded
		differently
	(MOV) Added debug-register and condition-register moves
	(CMOV) Added qualifiers
	(AND, OR, XOR, ADC, SUB, SBB) Added reverse forms, like MOV
	(BT) Uncommented memory-register forms for disassembler
	(MOVSX, MOVZX) Added forms
	(XCHG, LXADD) Made operand order make sense for MRMSrcMem
	(XCHG) Added register-register forms
	(XADD, CMPXCHG) Added unlocked forms
* X86InstrMMX.td
	(MMX_MOVD, MMV_MOVQ) Added forms
* X86InstrInfo.cpp: Changed PUSHFQ to PUSHFQ64 to reflect table
	change

* X86RegisterInfo.td: Added debug and condition register sets
* x86-64-pic-3.ll: Fixed testcase to reflect call qualifier
* peep-test-3.ll: Fixed testcase to reflect test qualifier
* cmov.ll: Fixed testcase to reflect cmov qualifier
* loop-blocks.ll: Fixed testcase to reflect call qualifier
* x86-64-pic-11.ll: Fixed testcase to reflect call qualifier
* 2009-11-04-SubregCoalescingBug.ll: Fixed testcase to reflect call
  qualifier
* x86-64-pic-2.ll: Fixed testcase to reflect call qualifier
* live-out-reg-info.ll: Fixed testcase to reflect test qualifier
* tail-opts.ll: Fixed testcase to reflect call qualifiers
* x86-64-pic-10.ll: Fixed testcase to reflect call qualifier
* bss-pagealigned.ll: Fixed testcase to reflect call qualifier
* x86-64-pic-1.ll: Fixed testcase to reflect call qualifier
* widen_load-1.ll: Fixed testcase to reflect call qualifier

llvm-svn: 91638
2009-12-18 00:01:26 +00:00
Dan Gohman 64b5d0f468 Add support for tail duplication to BranchFolding, and extend
tail merging support to handle more cases.
 - Recognize several cases where tail merging is beneficial even when
   the tail size is smaller than the generic threshold.
 - Make use of MachineInstrDesc::isBarrier to help detect
   non-fallthrough blocks.
 - Check for and avoid disrupting fall-through edges in more cases.

llvm-svn: 86871
2009-11-11 19:48:59 +00:00
Dan Gohman ff97acd8f1 Revert the main portion of r31856. It was causing BranchFolding
to break up CFG diamonds by banishing one of the blocks to the end of
the function, which is bad for code density and branch size.

This does pessimize MultiSource/Benchmarks/Ptrdist/yacr2, the
benchmark cited as the reason for the change, however I've examined
the code and it looks more like a case of gaming a particular
branch than of being generally applicable.

llvm-svn: 84803
2009-10-22 00:03:58 +00:00
Dan Gohman c0964a571b Re-apply r84295, with fixes to how the loop "top" and "bottom" blocks are
tracked. Instead of trying to manually keep track of these locations
while doing complex modifications, just recompute them when they're needed.
This fixes a bug in which the TopMBB and BotMBB were not correctly updated,
leading to invalid transformations.

llvm-svn: 84598
2009-10-20 04:50:37 +00:00