llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	c7b91e65d8	[CGP] avoid crashing from weightlessness It's possible that we have branch weights with 0 values. In that case, don't try to create an impossible BranchProbability. llvm-svn: 268935	2016-05-09 17:31:55 +00:00
Sanjay Patel	d66607bd8c	[CodeGenPrepare] use branch weight metadata to decide if a select should be turned into a branch This is part of solving PR27344: https://llvm.org/bugs/show_bug.cgi?id=27344 CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the IR further with a select in place. For the TLI hook default, >99% taken or not taken is chosen as the default threshold for a highly predictable branch. Even the most limited HW branch predictors will be correct on this branch almost all the time, so even a massive mispredict penalty perf loss would be overcome by the win from all the times the branch was predicted correctly. As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's MispredictPenalty. Or we could just let targets override the default by implementing the hook with that and other target-specific options. Note that trying to statically determine mispredict rates for close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. Ie, 50/50 taken/not-taken might still be 100% predictable. Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable() branch weight default values are 4 and 64. A proposal to change that is in D19435. Differential Revision: http://reviews.llvm.org/D19488 llvm-svn: 267572	2016-04-26 17:11:17 +00:00
Sanjay Patel	43c7af6889	add tests for potential CGP transform (PR27344) llvm-svn: 267426	2016-04-25 16:56:52 +00:00
Sanjay Patel	0fc4137065	[x86] auto-generate checks for cmov tests llvm-svn: 267417	2016-04-25 15:26:57 +00:00
Junmo Park	161dc1c605	[CodeGenPrepare] Remove load-based heuristic Summary: Both the hardware and LLVM have changed since 2012. Now, load-based heuristic don't show big differences any more on OoO cores. There is no notable regressons and improvements on spec2000/2006. (Cortex-A57, Core i5). Reviewers: spatel, zansari Differential Revision: http://reviews.llvm.org/D16836 llvm-svn: 261809	2016-02-25 00:23:27 +00:00
David Blaikie	a79ac14fa6	[opaque pointer type] Add textual IR support for explicit type parameter to load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=\|:\|^)\sload (?:atomic )?(?:volatile )?(.?))(\| addrspace$\d+$ )\($\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794	2015-02-27 21:17:42 +00:00
Stephen Lin	f799e3f944	Convert CodeGen//.ll tests to use the new CHECK-LABEL for easier debugging. No functionality change and all tests pass after conversion. This was done with the following sed invocation to catch label lines demarking function boundaries: sed -i '' "s/^;$ $$[A-Z0-9_]$:$ $test$[A-Za-z0-9_-]$:$ $$/;\1\2-LABEL:\3test\4:\5/g" test/CodeGen//*.ll which was written conservatively to avoid false positives rather than false negatives. I scanned through all the changes and everything looks correct. llvm-svn: 186258	2013-07-13 20:38:47 +00:00
Benjamin Kramer	3d38c17b59	Switch the select to branch transformation on by default. The primitive conservative heuristic seems to give a slight overall improvement while not regressing stuff. Make it available to wider testing. If you notice any speed regressions (or significant code size regressions) let me know! llvm-svn: 156258	2012-05-06 14:25:16 +00:00
Benjamin Kramer	047d7ca0b1	CodeGenPrepare: Add a transform to turn selects into branches in some cases. This came up when a change in block placement formed a cmov and slowed down a hot loop by 50%: ucomisd (%rdi), %xmm0 cmovbel %edx, %esi cmov is a really bad choice in this context because it doesn't get branch prediction. If we emit it as a branch, an out-of-order CPU can do a better job (if the branch is predicted right) and avoid waiting for the slow load+compare instruction to finish. Of course it won't help if the branch is unpredictable, but those are really rare in practice. This patch uses a dumb conservative heuristic, it turns all cmovs that have one use and a direct memory operand into branches. cmovs usually save some code size, so we disable the transform in -Os mode. In-Order architectures are unlikely to benefit as well, those are included in the "predictableSelectIsExpensive" flag. It would be better to reuse branch probability info here, but BPI doesn't support select instructions currently. It would make sense to use the same heuristics as the if-converter pass, which does the opposite direction of this transform. Test suite shows a small improvement here and there on corei7-level machines, but the actual results depend a lot on the used microarchitecture. The transformation is currently disabled by default and available by passing the -enable-cgp-select2branch flag to the code generator. Thanks to Chandler for the initial test case to him and Evan Cheng for providing me with comments and test-suite numbers that were more stable than mine :) llvm-svn: 156234	2012-05-05 12:49:22 +00:00

9 Commits