llvm-project

Commit Graph

Author	SHA1	Message	Date
Matthias Braun	ba7d95d425	PeepholeOptimizer: Do not replace SubregToReg(bitcast like) While we can usually replace bitcast like instructions (MachineInstr::isBitcast()) with a COPY this is not legal if any of the users uses SUBREG_TO_REG to assert the upper bits of the result are zero. Differential Revision: https://reviews.llvm.org/D28474 llvm-svn: 291483	2017-01-09 21:38:17 +00:00
Krzysztof Parzyszek	91b5cf8412	Extract LaneBitmask into a separate type Specifically avoid implicit conversions from/to integral types to avoid potential errors when changing the underlying type. For example, a typical initialization of a "full" mask was "LaneMask = ~0u", which would result in a value of 0x00000000FFFFFFFF if the type was extended to uint64_t. Differential Revision: https://reviews.llvm.org/D27454 llvm-svn: 289820	2016-12-15 14:36:06 +00:00
Philip Reames	1f1bbac8da	[peephole] Enhance folding logic to work for STATEPOINTs The general idea here is to get enough of the existing restrictions out of the way that the already existing folding logic in foldMemoryOperand can kick in for STATEPOINTs and fold references to immutable stack slots. The key changes are: Support for folding multiple operands at once which reference the same load Support for folding multiple loads into a single instruction Walk all the operands of the instruction for varidic instructions (this is a bug fix!) Once this lands, I'll post another patch which refactors the TII interface here. There's nothing actually x86 specific about the x86 code used here. Differential Revision: https://reviews.llvm.org/D24103 llvm-svn: 289510	2016-12-13 01:38:41 +00:00
Eugene Zelenko	1804a77b2a	Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes. Differential revision: https://reviews.llvm.org/D23861 llvm-svn: 279695	2016-08-25 00:45:04 +00:00
Matt Arsenault	44540a3db2	PeepholeOptimizer: Make pass name match DEBUG_TYPE llvm-svn: 274874	2016-07-08 16:29:11 +00:00
Eric Liu	e617adea12	Fixed warning caused by r274402. llvm-svn: 274497	2016-07-04 12:10:08 +00:00
Matt Arsenault	28aaf45c10	PeepholeOptimizer: Relax assert Allow implicit defs llvm-svn: 274402	2016-07-01 23:15:06 +00:00
Duncan P. N. Exon Smith	9cfc75c214	CodeGen: Use MachineInstr& in TargetInstrInfo, NFC This is mostly a mechanical change to make TargetInstrInfo API take MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator) when the argument is expected to be a valid MachineInstr. This is a general API improvement. Although it would be possible to do this one function at a time, that would demand a quadratic amount of churn since many of these functions call each other. Instead I've done everything as a block and just updated what was necessary. This is mostly mechanical fixes: adding and removing `` and `&` operators. The only non-mechanical change is to split ARMBaseInstrInfo::getOperandLatencyImpl out from ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a `MachineInstr` which it updated to the instruction bundle leader; now, the latter calls the former either with the same `MachineInstr&` or the bundle leader. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. Note: I updated WebAssembly, Lanai, and AVR (despite being off-by-default) since it turned out to be easy. I couldn't run tests for AVR since llc doesn't link with it turned on. llvm-svn: 274189	2016-06-30 00:01:54 +00:00
Andrew Kaylor	aa641a5171	Re-commit optimization bisect support (r267022) without new pass manager support. The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling). Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267231	2016-04-22 22:06:11 +00:00
Vedant Kumar	6013f45f92	Revert "Initial implementation of optimization bisect support." This reverts commit r267022, due to an ASan failure: http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549 llvm-svn: 267115	2016-04-22 06:51:37 +00:00
Andrew Kaylor	f0f279291c	Initial implementation of optimization bisect support. This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations. The bisection is enabled using a new command line option (-opt-bisect-limit). Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit. A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used. The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check. Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute. A new function call has been added for module and SCC passes that behaves in a similar way. Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267022	2016-04-21 17:58:54 +00:00
Sanjay Patel	b120ae96d3	fix formatting; NFC llvm-svn: 256574	2015-12-29 19:34:53 +00:00
Sanjay Patel	faeee6f44d	use range-based for-loop; NFCI llvm-svn: 256572	2015-12-29 18:30:09 +00:00
Sanjay Patel	59309cc090	don't repeat function names in comments; NFC llvm-svn: 256569	2015-12-29 18:14:06 +00:00
Dan Gohman	dab313e0ed	PeepholeOptimizer: Ignore dead implicit defs Target-specific instructions may have uninteresting physreg clobbers, for target-specific reasons. The peephole pass doesn't need to concern itself with such defs, as long as they're implicit and marked as dead. llvm-svn: 255182	2015-12-10 00:37:51 +00:00
JF Bastien	1ac69947b6	CodeGen peephole: fold redundant phys reg copies Code generation often exposes redundant physical register copies through virtual registers such as: %vreg = COPY %PHYSREG ... %PHYSREG = COPY %vreg There are cases where no intervening clobber of %PHYSREG occurs, and the later copy could therefore be removed. In some cases this further allows us to remove the initial copy. This patch contains a motivating example which comes from the x86 build of Chrome, specifically cc::ResourceProvider::UnlockForRead uses libstdc++'s implementation of hash_map. That example has two tests live at the same time, and after machine sinking LLVM has confused itself enough and things spilling EFLAGS is a great idea even though it's never restored and the comparison results are both live. Before this patch we have: DEC32m %RIP, 1, %noreg, <ga:@L>, %noreg, %EFLAGS<imp-def> %vreg1<def> = COPY %EFLAGS; GR64:%vreg1 %EFLAGS<def> = COPY %vreg1; GR64:%vreg1 JNE_1 <BB#1>, %EFLAGS<imp-use> Both copies are useless. This patch tries to eliminate the later copy in a generic manner. dec is especially confusing to LLVM when compared with sub. I wrote this patch to treat all physical registers generically, but only remove redundant copies of non-allocatable physical registers because the allocatable ones caused issues (e.g. when calling conventions weren't properly modeled) and should be handled later by the register allocator anyways. The following tests used to failed when the patch also replaced allocatable registers: CodeGen/X86/StackColoring.ll CodeGen/X86/avx512-calling-conv.ll CodeGen/X86/copy-propagation.ll CodeGen/X86/inline-asm-fpstack.ll CodeGen/X86/musttail-varargs.ll CodeGen/X86/pop-stack-cleanup.ll CodeGen/X86/preserve_mostcc64.ll CodeGen/X86/tailcallstack64.ll CodeGen/X86/this-return-64.ll This happens because COPY has other special meaning for e.g. dependency breakage and x87 FP stack. Note that all other backends' tests pass. Reviewers: qcolombet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15157 llvm-svn: 254665	2015-12-03 23:43:56 +00:00
Rafael Espindola	84921b9860	Refactor: Simplify boolean conditional return statements in lib/CodeGen. Patch by Richard. llvm-svn: 251213	2015-10-24 23:11:13 +00:00
Matt Arsenault	10aa807856	PeepholeOptimizer: Remove redundant copies If a virtual register is copied and another copy was already seen, replace with the previous copy. This only handles the simplest cases for now. This pattern shows up from various operand restrictions AMDGPU has which require inserting copies depending on the register class of the operands. llvm-svn: 248611	2015-09-25 20:22:12 +00:00
Matt Arsenault	68d938649e	Introduce target hook for optimizing register copies Allow a target to do something other than search for copies that will avoid cross register bank copies. Implement for SI by only rewriting the most basic copies, so it should look through anything like a subregister extract. I'm not entirely satisified with this because it seems like eliminating a reg_sequence that isn't fully used should work generically for all targets without them having to override something. However, it seems to be tricky to have a simple implementation of this without rewriting to invalid kinds of subregister copies on some targets. I'm not sure if there is currently a generic way to easily check if a subregister index would be valid for the current use. The current set of TargetRegisterInfo::get*Class functions don't quite behave like I would expect (e.g. getSubClassWithSubReg returns the maximal register class rather than the minimal), so I'm not sure how to make the generic test keep searching if SrcRC:SrcSubReg is a valid replacement for DefRC:DefSubReg. Making the default implementation to check for simple copies breaks a variety of ARM and x86 tests by producing illegal subregister uses. The ARM tests are not actually changed since it should still be using the same sharesSameRegisterFile implementation, this just relaxes them to not check for specific registers. llvm-svn: 248478	2015-09-24 08:36:14 +00:00
Matt Arsenault	c7ec46c3aa	Remove dead declaration llvm-svn: 248471	2015-09-24 07:51:12 +00:00
Matt Arsenault	3099156261	Fix typos / grammar llvm-svn: 247109	2015-09-09 00:38:33 +00:00
Benjamin Kramer	fcdb1c14ac	Make helper functions static. NFC. llvm-svn: 245549	2015-08-20 09:57:22 +00:00
Bruno Cardoso Lopes	27fd06922b	[PeepholeOptimizer] Look through PHIs to find additional register sources Reintroduce r245442. Remove an overly conservative assertion introduced in r245442. We could replace the assertion to use `shareSameRegisterFile` instead, but in that point in `insertPHI` we already lost the original Def subreg to check against. So drop the assertion completely. Original commit message: - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returnted by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coaslescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 245479	2015-08-19 18:53:36 +00:00
Bruno Cardoso Lopes	61009142b8	Revert "[PeepholeOptimizer] Look through PHIs to find additional register sources" Revert r245442 while investigating a fix. An assertion hit in http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/11380 llvm-svn: 245446	2015-08-19 15:10:32 +00:00
Bruno Cardoso Lopes	0a1c126684	[PeepholeOptimizer] Look through PHIs to find additional register sources Reapply r243486. - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returnted by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coaslescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 245442	2015-08-19 14:34:41 +00:00
Michael Kuperstein	bc7f99a3ab	[X86] Allow x86 call frame optimization to fold more loads into pushes This abstracts away the test for "when can we fold across a MachineInstruction" into the the MI interface, and changes call-frame optimization use the same test the peephole optimizer users. Differential Revision: http://reviews.llvm.org/D11945 llvm-svn: 244729	2015-08-12 10:14:58 +00:00
Michael Kuperstein	82814f63c0	Allow PeepholeOptimizer to fold a few more cases The condition for clearing the folding candidate list was clamped together with the "uninteresting instruction" condition. This is too conservative, e.g. we don't need to clear the list when encountering an IMPLICIT_DEF. Differential Revision: http://reviews.llvm.org/D11591 llvm-svn: 244577	2015-08-11 08:19:43 +00:00
Benjamin Kramer	df005cbe19	Fix some comment typos. llvm-svn: 244402	2015-08-08 18:27:36 +00:00
Bruno Cardoso Lopes	38c0250679	Revert "[PeepholeOptimizer] Look through PHIs to find additional register sources" Reported to Broke some internal tests: PR24303 This reverts commit r243486. llvm-svn: 243540	2015-07-29 17:46:47 +00:00
Bruno Cardoso Lopes	3c235763e5	[PeepholeOptimizer] Look through PHIs to find additional register sources Reapply 243271 with more fixes; although we are not handling multiple sources with coalescable copies, we were not properly skipping this case. - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returnted by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coaslescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 243486	2015-07-28 21:45:50 +00:00
Bruno Cardoso Lopes	b20841df44	Revert "[PeepholeOptimizer] Look through PHIs to find additional register sources" Still breaks some ARM buildbots. This reverts r243271. llvm-svn: 243318	2015-07-27 20:26:04 +00:00
Bruno Cardoso Lopes	669c921bfd	[PeepholeOptimizer] Look through PHIs to find additional register sources Reapply r242295 with fixes in the implementation. - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returnted by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coaslescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 243271	2015-07-27 14:39:46 +00:00
Bruno Cardoso Lopes	f16ec12654	[PeepholeOptimizer] Refactor optimizeUncoalescable logic Reapply r242294. - Create a new CopyRewriter for Uncoalescable copy-like instructions - Change the ValueTracker to return a ValueTrackerResult This makes optimizeUncoalescable looks more like optimizeCoalescable and use the CopyRewritter infrastructure. This is also the preparation for looking up into PHI nodes in the ValueTracker. rdar://problem/20404526 Differential Revision: http://reviews.llvm.org/D11195 llvm-svn: 242940	2015-07-22 21:30:16 +00:00
Bruno Cardoso Lopes	9b39693a5d	Revert "Refactor optimizeUncoalescable logic" Likely broke compilation on ARM: http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/13054 This reverts commit 0b7824464fbe3d3f386e2d4aef6a431422709e53. llvm-svn: 242311	2015-07-15 18:10:46 +00:00
Bruno Cardoso Lopes	ad61f34293	Revert "Look through PHIs to find additional register sources" Likely broke compilation on ARM: http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/13054 This reverts commit 131ce4a838c081516cbfed039fc986b33e3979d6. llvm-svn: 242310	2015-07-15 18:10:35 +00:00
Bruno Cardoso Lopes	fadd4fef2a	Look through PHIs to find additional register sources - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returnted by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coaslescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 242295	2015-07-15 15:35:23 +00:00
Bruno Cardoso Lopes	bd68a09591	Refactor optimizeUncoalescable logic - Create a new CopyRewriter for Uncoalescable copy-like instructions - Change the ValueTracker to return a ValueTrackerResult This makes optimizeUncoalescable looks more like optimizeCoalescable and use the CopyRewritter infrastructure. This is also the preparation for looking up into PHI nodes in the ValueTracker. Differential Revision: http://reviews.llvm.org/D11195 llvm-svn: 242294	2015-07-15 15:35:09 +00:00
Alexander Kornienko	f00654e31b	Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) Apparently, the style needs to be agreed upon first. llvm-svn: 240390	2015-06-23 09:49:53 +00:00
Alexander Kornienko	70bc5f1398	Fixed/added namespace ending comments using clang-tidy. NFC The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-,llvm-namespace-comment -header-filter='llvm/.\|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137	2015-06-19 15:57:42 +00:00
Benjamin Kramer	799003bf8c	Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used. llvm-svn: 232998	2015-03-23 19:32:43 +00:00
David Blaikie	dc3f01e9cf	Simplify expressions involving boolean constants with clang-tidy Patch by Richard (legalize at xmission dot com). Differential Revision: http://reviews.llvm.org/D8154 llvm-svn: 231617	2015-03-09 01:57:13 +00:00
Benjamin Kramer	4f6ac16292	Replace std::copy with a back inserter with vector append where feasible All of the cases were just appending from random access iterators to a vector. Using insert/append can grow the vector to the perfect size directly and moves the growing out of the loop. No intended functionalty change. llvm-svn: 230845	2015-02-28 10:11:12 +00:00
Mehdi Amini	22e59748ef	Peephole opt needs optimizeSelect() to keep track of newly created MIs Peephole optimizer is scanning a basic block forward. At some point it needs to answer the question "given a pointer to an MI in the current BB, is it located before or after the current instruction". To perform this, it keeps a set of the MIs already seen during the scan, if a MI is not in the set, it is assumed to be after. It means that newly created MIs have to be inserted in the set as well. This commit passes the set as an argument to the target-dependent optimizeSelect() so that it can properly update the set with the (potentially) newly created MIs. llvm-svn: 225772	2015-01-13 07:07:13 +00:00
Eric Christopher	2181fb2ff3	Avoid caching the MachineFunction, we don't use it outside of runOnMachineFunction. llvm-svn: 219847	2014-10-15 21:06:25 +00:00
Gerolf Hoflehner	a4c96d02a2	[AAarch64] Optimize CSINC-branch sequence Peephole optimization that generates a single conditional branch for csinc-branch sequences like in the examples below. This is possible when the csinc sets or clears a register based on a condition code and the branch checks that register. Also the condition code may not be modified between the csinc and the original branch. Examples: 1. Convert csinc w9, wzr, wzr, <CC>;tbnz w9, #0, 0x44 to b.<invCC> 2. Convert csinc w9, wzr, wzr, <CC>; tbz w9, #0, 0x44 to b.<CC> rdar://problem/18506500 llvm-svn: 219742	2014-10-14 23:07:53 +00:00
Eric Christopher	92b4bcbbee	Instead of the TargetMachine cache the MachineFunction and TargetRegisterInfo in the peephole optimizer. This makes it easier to grab subtarget dependent variables off of the MachineFunction rather than the TargetMachine. llvm-svn: 219669	2014-10-14 07:17:20 +00:00
Quentin Colombet	6674b095b8	[PeepholeOptimizer] Enable the advanced copy optimization by default. The advanced copy optimization does not yield any difference on the whole llvm test-suite + SPECs, either in compile time or runtime (binaries are identical), but has a big potential when data go back and forth between register files as demonstrated with test/CodeGen/ARM/adv-copy-opt.ll. Note: This was measured for both Os and O3 for armv7s, arm64, and x86_64. <rdar://problem/12702965> llvm-svn: 216236	2014-08-21 22:23:52 +00:00
Quentin Colombet	6b36337c09	[PeepholeOptimizer] Update the kill flags when extending the live-range of the source of a copy. <rdar://problem/12702965> llvm-svn: 216229	2014-08-21 21:34:06 +00:00
Quentin Colombet	689623009b	[PeepholeOptimizer] Take advantage of the isInsertSubreg property in the advanced copy optimization. This is the final step patch toward transforming: udiv r0, r0, r2 udiv r1, r1, r3 vmov.32 d16[0], r0 vmov.32 d16[1], r1 vmov r0, r1, d16 bx lr into: udiv r0, r0, r2 udiv r1, r1, r3 bx lr Indeed, thanks to this patch, this optimization is able to look through vmov.32 d16[0], r0 vmov.32 d16[1], r1 and is able to rewrite the following sequence: vmov.32 d16[0], r0 vmov.32 d16[1], r1 vmov r0, r1, d16 into simple generic GPR copies that the coalescer managed to remove. <rdar://problem/12702965> llvm-svn: 216144	2014-08-21 00:19:16 +00:00
Quentin Colombet	67639df146	[PeepholeOptimizer] Take advantage of the isExtractSubreg property in the advanced copy optimization. This patch is a step toward transforming: udiv r0, r0, r2 udiv r1, r1, r3 vmov.32 d16[0], r0 vmov.32 d16[1], r1 vmov r0, r1, d16 bx lr into: udiv r0, r0, r2 udiv r1, r1, r3 bx lr Indeed, thanks to this patch, this optimization is able to look through vmov r0, r1, d16 but it does not understand yet vmov.32 d16[0], r0 vmov.32 d16[1], r1 Comming patches will fix that and update the related test case. <rdar://problem/12702965> llvm-svn: 216136	2014-08-20 23:13:02 +00:00

1 2 3

124 Commits