llvm-project

Commit Graph

Author	SHA1	Message	Date
Jessica Paquette	faea2d3130	Revert "Enable MachineOutliner by default under -Oz for AArch64" It failed an Asan test on a bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/21543/steps/check-llvm%20asan/logs/stdio Fixing that before recommitting. llvm-svn: 338136	2018-07-27 17:25:38 +00:00
Jessica Paquette	d4229b985c	Enable MachineOutliner by default under -Oz for AArch64 This patch enables the MachineOutliner by default in AArch64 under -Oz. The MachineOutliner offers around a 4.5% improvement on the current -Oz code size improvements. We have done work into improving the debuggability of outlined code, so that users of -Oz won't be surprised by the optimization. We have also been executing the LLVM test suite and common external tests such as the SPEC suites continuously with no issue. The outliner has a low compile-time overhead of roughly 1%. At this point, the outliner would be a really good addition to the -Oz pass pipeline! llvm-svn: 338133	2018-07-27 16:44:42 +00:00
Jessica Paquette	69f517df27	[MachineOutliner][NFC] Move target frame info into OutlinedFunction Just some gardening here. Similar to how we moved call information into Candidates, this moves outlined frame information into OutlinedFunction. This allows us to remove TargetCostInfo entirely. Anywhere where we returned a TargetCostInfo struct, we now return an OutlinedFunction. This establishes OutlinedFunctions as more of a general repeated sequence, and Candidates as occurrences of those repeated sequences. llvm-svn: 337848	2018-07-24 20:13:10 +00:00
Jessica Paquette	fca55129b1	[MachineOutliner][NFC] Make Candidates own their call information Before this, TCI contained all the call information for each Candidate. This moves that information onto the Candidates. As a result, each Candidate can now supply how it ought to be called. Thus, Candidates will be able to, say, call the same function in cheaper ways when possible. This also removes that information from TCI, since it's no longer used there. A follow-up patch for the AArch64 outliner will demonstrate this. llvm-svn: 337840	2018-07-24 17:42:11 +00:00
Yvan Roux	a96a04558d	[MachineOutliner] Assert that Liveness tracking is accurate (NFC) The checking is done deeper inside MachineBasicBlock, but this will hopefully help to find issues when porting the machine outliner to a target where Liveness tracking is broken (like ARM). Differential Revision: https://reviews.llvm.org/D49023 llvm-svn: 336481	2018-07-07 08:02:19 +00:00
Yvan Roux	eaececf5e0	[MachineOutliner] Fix typo in getOutliningCandidateInfo function name getOutlininingCandidateInfo -> getOutliningCandidateInfo Differential Revision: https://reviews.llvm.org/D48867 llvm-svn: 336285	2018-07-04 15:37:08 +00:00
Jessica Paquette	f472f6159a	[MachineOutliner] Don't outline sequences where x16/x17/nzcv are live across It isn't safe to outline sequences of instructions where x16/x17/nzcv live across the sequence. This teaches the outliner to check whether or not a specific canidate has x16/x17/nzcv live across it and discard the candidate in the case that that is true. https://bugs.llvm.org/show_bug.cgi?id=37573 https://reviews.llvm.org/D47655 llvm-svn: 335758	2018-06-27 17:43:27 +00:00
Jessica Paquette	32de26d432	[MachineOutliner] NFC: Remove insertOutlinerPrologue, rename insertOutlinerEpilogue insertOutlinerPrologue was not used by any target, and prologue-esque code was beginning to appear in insertOutlinerEpilogue. Refactor that into one function, buildOutlinedFrame. This just removes insertOutlinerPrologue and renames insertOutlinerEpilogue. llvm-svn: 335076	2018-06-19 21:14:48 +00:00
Jessica Paquette	aa087327ce	[MachineOutliner] NFC - Move intermediate data structures to MachineOutliner.h This is setting up to fix bug 37573 cleanly. This moves data structures that are technically both used in some way by the target and the general-purpose outlining algorithm into MachineOutliner.h. In particular, the `Candidate` class is of importance. Before, the outliner passed the locations of `Candidates` to the target, which would then make some decisions about the prospective outlined function. This change allows us to just pass `Candidates` along to the target. This will allow the target to discard `Candidates` that would be considered unsafe before cost calculation. Thus, we will be able to remove the unsafe candidates described in the bug without resorting to torching the entire prospective function. Also, as a side-effect, it makes the outliner a bit cleaner. https://bugs.llvm.org/show_bug.cgi?id=37573 llvm-svn: 333952	2018-06-04 21:14:16 +00:00
Eli Friedman	785acce51d	Delete unused variable from r333015. (The assertion suppressed the unused variable warning on Release+Asserts builds, so I didn't notice.) llvm-svn: 333018	2018-05-22 19:38:07 +00:00
Eli Friedman	042dc9e092	[MachineOutliner] Add "thunk" outlining for AArch64. When we're outlining a sequence that ends in a call, we can save up to three instructions in the outlined function by turning the call into a tail-call. I refer to this as thunk outlining because the resulting outlined function looks like a thunk; suggestions welcome for a better name. In addition to making the outlined function shorter, thunk outlining allows outlining calls which would otherwise be illegal to outline: we don't need to save/restore LR, so we don't need to prove anything about the stack access patterns of the callee. To make this work effectively, I also added MachineOutlinerInstrType::LegalTerminator to the generic MachineOutliner code; this allows treating an arbitrary instruction as a terminator in the suffix tree. Differential Revision: https://reviews.llvm.org/D47173 llvm-svn: 333015	2018-05-22 19:11:06 +00:00
Eli Friedman	4081a57af7	[MachineOutliner] Count savings from outlining in bytes. Counting the number of instructions is both unintuitive and inaccurate. On AArch64, this only affects the generated remarks and certain rare pseudo-instructions, but it will have a bigger impact on other targets. Differential Revision: https://reviews.llvm.org/D46921 llvm-svn: 332685	2018-05-18 01:52:16 +00:00
Eli Friedman	ddbf6d6514	[MachineOutliner] Don't outline instructions that modify SP. This breaks the code which saves and restores LR, so we can't outline without doing something more complicated for stack adjustment. Found by inspection; we get lucky in most cases because getMemOpInfo only handles STRWpost, not any other pre/post-increment forms. But it hits a couple of artificial testcases in the tree. Differential Revision: https://reviews.llvm.org/D46920 llvm-svn: 332529	2018-05-16 21:20:16 +00:00
Eli Friedman	02709bcb78	[MachineOutliner] Don't save/restore LR for tail calls. The cost computation assumes we do this correctly, but the actual lowering was wrong. Differential Revision: https://reviews.llvm.org/D46923 llvm-svn: 332514	2018-05-16 19:49:01 +00:00
Shiva Chen	801bf7ebbe	[DebugInfo] Examine all uses of isDebugValue() for debug instructions. Because we create a new kind of debug instruction, DBG_LABEL, we need to check all passes which use isDebugValue() to check MachineInstr is debug instruction or not. When expelling debug instructions, we should expel both DBG_VALUE and DBG_LABEL. So, I create a new function, isDebugInstr(), in MachineInstr to check whether the MachineInstr is debug instruction or not. This patch has no new test case. I have run regression test and there is no difference in regression test. Differential Revision: https://reviews.llvm.org/D45342 Patch by Hsiangkai Wang. llvm-svn: 331844	2018-05-09 02:42:00 +00:00
Adrian Prantl	5f8f34e459	Remove \brief commands from doxygen comments. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272	2018-05-01 15:54:18 +00:00
Jessica Paquette	0b6724917a	[MachineOutliner] Add defs to calls + don't track liveness on outlined functions This commit makes it so that if you outline a def of some register, then the call instruction created by the outliner actually reflects that the register is defined by the call. It also makes it so that outlined functions don't have the TracksLiveness property. Outlined calls shouldn't break liveness assumptions that someone might make. This also un-XFAILs the noredzone test, and updates the calls test. llvm-svn: 331095	2018-04-27 23:36:35 +00:00
Eli Friedman	da018e5687	[MachineOutliner] Don't outline from functions with a section marking. The program might have unusual expectations for functions; for example, the Linux kernel's build system warns if it finds references from .text to .init.data. I'm not sure this is something we actually want to make any guarantees about (there isn't any explicit rule that would disallow outlining in this case), but we might want to be conservative anyway. Differential Revision: https://reviews.llvm.org/D46091 llvm-svn: 331007	2018-04-27 00:21:34 +00:00
Jessica Paquette	4f56428de1	[MachineOutliner] Check for explicit uses of LR/W30 in MI operands Before, the outliner would grab ADRPs that used LR/W30. This patch fixes that by checking for explicit uses of those registers before the special-casing for ADRPs. This also adds a test that ensures that those sorts of ADRPs won't be outlined. llvm-svn: 330783	2018-04-24 22:38:15 +00:00
Jessica Paquette	2e5ada5c81	[MachineOutliner] Change B instruction for tail calls to TCRETURNdi First off, this is more correct than having the B. Second off, this was making a bot upset. This fixes that. Update the test to include -verify-machineinstrs as well to prevent stuff like this slipping by non debug/assert builds in the future. llvm-svn: 330459	2018-04-20 18:03:21 +00:00
Jessica Paquette	642f6c61a3	[MachineOutliner] Keep track of fns that use a redzone in AArch64FunctionInfo This patch adds a hasRedZone() function to AArch64MachineFunctionInfo. It returns true if the function is known to use a redzone, false if it is known to not use a redzone, and no value otherwise. This removes the requirement to pass -mno-red-zone when outlining for AArch64. https://reviews.llvm.org/D45189 llvm-svn: 329120	2018-04-03 21:56:10 +00:00
Jessica Paquette	4aa14dbcc2	[MachineOutliner] Simplify call outlining + require valid callee save info for call outlining This commit simplifies the call outlining logic by removing references to the Function associated with the callee. To do this, it requires that valid callee save info is available to the outliner. llvm-svn: 328719	2018-03-28 17:52:31 +00:00
Jessica Paquette	2519ee7081	[MachineOutliner] AArch64: Don't outline ADRPs with un-outlinable operands If an ADRP appears with, say, a CPI operand, we shouldn't outline it. This moves the check for unsafe operands so that it occurs before the special-case for ADRPs. Also add a test for outlining ADRPs. llvm-svn: 328674	2018-03-27 22:23:48 +00:00
Jessica Paquette	563548d8f3	[MachineOutliner] AArch64: Emit CFI instructions when outlining calls When outlining calls, the outliner needs to update CFI to ensure that, say, exception handling works. This commit adds that functionality and adds a test just for call outlining. Call outlining stuff in machine-outliner.mir should be moved into machine-outliner-calls.mir in a later commit. llvm-svn: 327917	2018-03-19 22:48:40 +00:00
Jessica Paquette	b3e7dc9144	[MachineOutliner] Make KILLs invisible At the point the outliner runs, KILLs don't impact anything, but they're still considered unique instructions. This commit makes them invisible like DebugValues so that they can still be outlined without impacting outlining decisions. llvm-svn: 327760	2018-03-16 22:53:34 +00:00
Evandro Menezes	1515e859c6	[AArch64] Adjust the cost model for Exynos M3 Increase the number of cheap as move cases of register reset. llvm-svn: 327661	2018-03-15 20:31:13 +00:00
Evandro Menezes	b5f12090fc	[AArch64] Refactor stand alone methods (NFC) Make stand alone methods in AArch64InstrInfo static. llvm-svn: 324745	2018-02-09 16:14:41 +00:00
Evandro Menezes	07c78eeeef	[AArch64] Add new target feature to handle cheap as move for Exynos This feature enables special handling of cheap as move in the existing custom handling specifically for Exynos processors. Differential revision: https://reviews.llvm.org/D42387 llvm-svn: 323774	2018-01-30 15:40:22 +00:00
Evandro Menezes	9f9daa1f14	[AArch64] Add pipeline model for Exynos M3 Add the scheduling and cost model for Exynos M3. Differential revision: https://reviews.llvm.org/D42387 llvm-svn: 323773	2018-01-30 15:40:16 +00:00
Oliver Stannard	a9d2e004d2	[AArch64] Generate the CASP instruction for 128-bit cmpxchg The Large System Extension added an atomic compare-and-swap instruction that operates on a pair of 64-bit registers, which we can use to implement a 128-bit cmpxchg. Because i128 is not a legal type for AArch64 we have to do all of the instruction selection in C++, and the instruction requires even/odd register pairs, so we have to wrap it in REG_SEQUENCE and EXTRACT_SUBREG nodes. This is very similar to what we do for 64-bit cmpxchg in the ARM backend. Differential revision: https://reviews.llvm.org/D42104 llvm-svn: 323634	2018-01-29 09:18:37 +00:00
Jessica Paquette	757e120379	[MachineOutliner] Move hasAddressTaken check to MachineOutliner.cpp Mostly NFC. Still updating the test though just for completeness. This moves the hasAddressTaken check to MachineOutliner.cpp and replaces it with a per-basic block test rather than a per-function test. The old test was too conservative and was preventing functions in C programs from being outlined even though they were safe to outline. This was mostly a problem in C sources. llvm-svn: 322425	2018-01-13 00:42:28 +00:00
Jessica Paquette	c191f1097c	[MachineOutliner] Outline ADRPs ADRP instructions weren't being outlined because they're PC-relative and thus fail the LR checks. This patch adds a special case for ADRPs to getOutliningType to make sure that ADRPs can be outlined and updates the MIR test. llvm-svn: 322207	2018-01-10 18:49:57 +00:00
Jessica Paquette	3291e7353e	[MachineOutliner] AArch64: Handle instrs that use SP and will never need fixups This commit does two things. Firstly, it adds a collection of flags which can be passed along to the target to encode information about the MBB that an instruction lives in to the outliner. Second, it adds some of those flags to the AArch64 outliner in order to add more stack instructions to the list of legal instructions that are handled by the outliner. The two flags added check if - There are calls in the MachineBasicBlock containing the instruction - The link register is available in the entire block If the link register is available and there are no calls, then a stack instruction can always be outlined without fixups, regardless of what it is, since in this case, the outliner will never modify the stack to create a call or outlined frame. The motivation for doing this was checking which instructions are most often missed by the outliner. Instructions like, say %sp<def> = ADDXri %sp, 32, 0; flags: FrameDestroy are very common, but cannot be outlined in the case that the outliner might modify the stack. This commit allows us to outline instructions like this. llvm-svn: 322048	2018-01-09 00:26:18 +00:00
Matthew Simpson	9439f54902	[AArch64] Change order of candidate FMLS patterns r319980 added new patterns to the machine combiner for transforming (fsub (fmul x y) z) into (fmla (fneg z) x y). That is, fsub's where the first source operand is an fmul are transformed. We previously only matched the case where the second source operand of an fsub was an fmul, transforming (fsub z (fmul x y)) into (fmls z x y). Now, if we have an fsub where both source operands are fmuls, both of the above patterns are applicable. However, the order in which we add the patterns to the list of candidates determines the transformation that takes place, since only the first pattern that matches will be used. This patch changes the order these two patterns are added to the list of candidates such that we prefer the case where the second source operand is an fmul (the fmls case), rather than the other one (the fmla/fneg case). When both source operands are fmuls, this ordering results in fewer instructions. Differential Revision: https://reviews.llvm.org/D41587 llvm-svn: 321491	2017-12-27 15:25:01 +00:00
Jessica Paquette	8565d3af84	[MachineOutliner][NFC] Gardening: use std::any_of instead of bool + loop River Riddle suggested to use std::any_of instead of the bool + loop thing on r320229. This commit does that. llvm-svn: 321028	2017-12-18 21:44:52 +00:00
Jessica Paquette	02c124d644	[MachineOutliner] Recommit r320229 LR was undefined entering outlined functions that contain calls. This made the machine verifier unhappy when expensive checks were enabled. This fixes that. llvm-svn: 321014	2017-12-18 19:33:21 +00:00
Matthias Braun	f1caa2833f	MachineFunction: Return reference from getFunction(); NFC The Function can never be nullptr so we can return a reference. llvm-svn: 320884	2017-12-15 22:22:58 +00:00
Francis Visoiu Mistrih	0b5bdceabf	[CodeGen] Print stack object references as %(fixed-)stack.0 in both MIR and debug output Work towards the unification of MIR and debug output by printing `%stack.0` instead of `<fi#0>`, and `%fixed-stack.0` instead of `<fi#-4>` (supposing there are 4 fixed stack objects). Only debug syntax is affected. Differential Revision: https://reviews.llvm.org/D41027 llvm-svn: 320827	2017-12-15 16:33:45 +00:00
Galina Kistanova	9dee3f0a97	Reverted r320229. It broke tests on builder llvm-clang-x86_64-expensive-checks-win. llvm-svn: 320588	2017-12-13 15:26:27 +00:00
Jessica Paquette	a249c4f513	[MachineOutliner] Outline calls The outliner previously would never outline calls. Calls are pretty common in files, so it makes sense to outline them. In fact, in the LLVM test suite, if you count the number of instructions that the outliner misses when you outline calls vs when you don't, it turns out that, on average, around 6% of the instructions encountered are calls. So, if we outline calls, we can find more candidates, and thus save some more space. This commit adds that functionality and updates the mir test to reflect that. llvm-svn: 320229	2017-12-09 00:43:49 +00:00
Jessica Paquette	59948666fb	[MachineOutliner] Fix offset overflow check The offset overflow check before was incorrect. It would always give the correct result, but it was comparing the SCALED potential fixed-up offset against an UNSCALED minimum/maximum. As a result, the outliner was missing a bunch of frame setup/destroy instructions that ought to have been safe to outline. This fixes that, and adds an instruction to the .mir test that failed the old test. llvm-svn: 320090	2017-12-07 21:51:43 +00:00
Francis Visoiu Mistrih	a8a83d150f	[CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register. Work towards the unification of MIR and debug output by refactoring the interfaces. For MachineOperand::print, keep a simple version that can be easily called from `dump()`, and a more complex one which will be called from both the MIRPrinter and MachineInstr::print. Add extra checks inside MachineOperand for detached operands (operands with getParent() == nullptr). https://reviews.llvm.org/D40836 * find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/<def>//g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<def[ ],[ ]dead>/dead \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ],[ ]dead>/implicit-def dead \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name "*.s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g' llvm-svn: 320022	2017-12-07 10:40:31 +00:00
Florian Hahn	5d6a4e43ba	[AArch64] Add patterns to replace fsub fmul with fma fneg. Summary: This patch adds MachineCombiner patterns for transforming (fsub (fmul x y) z) into (fma x y (fneg z)). This has a lower latency on micro architectures where fneg is cheap. Patch based on work by George Steed. Reviewers: rengolin, joelkevinjones, joel_k_jones, evandro, efriedma Reviewed By: evandro Subscribers: aemerson, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D40306 llvm-svn: 319980	2017-12-06 22:48:36 +00:00
Francis Visoiu Mistrih	93ef145862	[CodeGen] Print "%vreg0" as "%0" in both MIR and debug output As part of the unification of the debug format and the MIR format, avoid printing "vreg" for virtual registers (which is one of the current MIR possibilities). Basically: * find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g" * grep -nr '%vreg' . and fix if needed * find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g" * grep -nr 'vreg[0-9]\+' . and fix if needed Differential Revision: https://reviews.llvm.org/D40420 llvm-svn: 319427	2017-11-30 12:12:19 +00:00
Francis Visoiu Mistrih	9d7bb0cb40	[CodeGen] Print register names in lowercase in both MIR and debug output As part of the unification of the debug format and the MIR format, always print registers as lowercase. * Only debug printing is affected. It now follows MIR. Differential Revision: https://reviews.llvm.org/D40417 llvm-svn: 319187	2017-11-28 17:15:09 +00:00
David Blaikie	b3bde2ea50	Fix a bunch more layering of CodeGen headers that are in Target All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around). llvm-svn: 318490	2017-11-17 01:07:10 +00:00
Tim Northover	350a87eaf1	AArch64: account for possible frame index operand in compares. If the address of a local is used in a comparison, AArch64 can fold the address-calculation into the comparison via "adds". Unfortunately, a couple of places (both hit in this one test) are not ready to deal with that yet and just assume the first source operand is a register. llvm-svn: 316035	2017-10-17 21:43:52 +00:00
Jessica Paquette	13593843f6	[MachineOutliner] Disable outlining from LinkOnceODRs by default Say you have two identical linkonceodr functions, one in M1 and one in M2. Say that the outliner outlines A,B,C from one function, and D,E,F from another function (where letters are instructions). Now those functions are not identical, and cannot be deduped. Locally to M1 and M2, these outlining choices would be good-- to the whole program, however, this might not be true! To mitigate this, this commit makes it so that the outliner sees linkonceodr functions as unsafe to outline from. It also adds a flag, -enable-linkonceodr-outlining, which allows the user to specify that they want to outline from such functions when they know what they're doing. Changing this handles most code size regressions in the test suite caused by competing with linker dedupe. It also doesn't have a huge impact on the code size improvements from the outliner. There are 6 tests that regress > 5% from outlining WITH linkonceodrs to outlining WITHOUT linkonceodrs. Overall, most tests either improve or are not impacted. Not outlined vs outlined without linkonceodrs: https://hastebin.com/raw/qeguxavuda Not outlined vs outlined with linkonceodrs: https://hastebin.com/raw/edepoqoqic Outlined with linkonceodrs vs outlined without linkonceodrs: https://hastebin.com/raw/awiqifiheb Numbers generated using compare.py with -m size.__text. Tests run for AArch64 with -Oz -mllvm -enable-machine-outliner -mno-red-zone. llvm-svn: 315136	2017-10-07 00:16:34 +00:00
Jessica Paquette	4cf187b5b4	[MachineOutliner] AArch64: Avoid saving + restoring LR if possible This commit allows the outliner to avoid saving and restoring the link register on AArch64 when it is dead within an entire class of candidates. This introduces changes to the way the outliner interfaces with the target. For example, the target now interfaces with the outliner using a MachineOutlinerInfo struct rather than by using getOutliningCallOverhead and getOutliningFrameOverhead. This also improves several comments on the outliner's cost model. https://reviews.llvm.org/D36721 llvm-svn: 314341	2017-09-27 20:47:39 +00:00
Evandro Menezes	91650ef061	[AArch64] Adjust the cost model for Exynos M1 and M2 Refine the model of loads and stores using the register offset addressing modes. llvm-svn: 313554	2017-09-18 19:00:36 +00:00
Evandro Menezes	9cd1bd7a83	[AArch64] Adjust the cost model for Exynos M1 and M2 Fix formatting in the predicate function AArch64InstrInfo::isExynosShiftLeftFast(). llvm-svn: 313553	2017-09-18 19:00:31 +00:00
Stanislav Mekhanoshin	7fe9a5d9b4	Allow target to decide when to cluster loads/stores in misched MachineScheduler when clustering loads or stores checks if base pointers point to the same memory. This check is done through comparison of base registers of two memory instructions. This works fine when instructions have separate offset operand. If they require a full calculated pointer such instructions can never be clustered according to such logic. Changed shouldClusterMemOps to accept base registers as well and let it decide what to do about it. Differential Revision: https://reviews.llvm.org/D37698 llvm-svn: 313208	2017-09-13 22:20:47 +00:00
Evandro Menezes	509516d200	[AArch64] Adjust the cost model for Exynos M1 and M2 Add new predicate to more accurately model the cost of arithmetic and logical operations shifted left. Differential revision: https://reviews.llvm.org/D37151 llvm-svn: 311943	2017-08-28 22:51:32 +00:00
Sjoerd Meijer	b0eb5fb317	[AArch64] Add FMOVH0: materialize 0 using zero register for f16 values Instead of loading 0 from a constant pool, it's of course much better to materialize it using an fmov and the zero register. Thanks to Ahmed Bougacha for the suggestion. Differential Revision: https://reviews.llvm.org/D37102 llvm-svn: 311662	2017-08-24 14:47:06 +00:00
Jessica Paquette	6315d2d21d	[MachineOutliner] Add RegState::Define to LDRXpost in insertOutlinedCall This fixes a MachineVerifier failure in machine-outliner.mir. Not explicitly adding RegState::Define to the LR argument makes it unhappy because an explicit definition is marked as a use. Build failure: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-expensive/7496/testReport/junit/LLVM/CodeGen_AArch64/machine_outliner_mir/ llvm-svn: 310671	2017-08-10 23:11:24 +00:00
Jessica Paquette	d36945bf3a	[MachineOutliner] Ensure AArch64 outliner doesn't mess with W30 or LR Before, the outliner would mark all instructions that read from/modify LR as illegal. This doesn't handle W30, which overlaps with LR. This shouldn't be outlined. This commit fixes that by making modifiesRegister() and readsRegister() look at W30 + take in a TRI argument. This makes sure that modifiesRegister() and readsRegister() won't outline either of W30 and LR. https://reviews.llvm.org/D36435 llvm-svn: 310422	2017-08-08 21:51:26 +00:00
Jessica Paquette	d87f54493d	[MachineOutliner] NFC: Change IsTailCall to a call class + frame class This commit - Removes IsTailCall and replaces it with a target-defined unsigned - Refactors getOutliningCallOverhead and getOutliningFrameOverhead so that they don't use IsTailCall - Adds a call class + frame class classification to OutlinedFunction and Candidate respectively This accomplishes a couple things. Firstly, we don't need the notion of tail call in the general outlining algorithm. Secondly, we now can have different "outlining classes" for each candidate within a set of candidates. This will make it easy to add new ways to outline sequences for certain targets and dynamically choose an appropriate cost model for a sequence depending on the context that that sequence lives in. Ultimately, this should get us closer to being able to do something like, say avoid saving the link register when outlining AArch64 instructions. llvm-svn: 309475	2017-07-29 02:55:46 +00:00
Jessica Paquette	809d708b8a	[MachineOutliner] NFC: Split up getOutliningBenefit This is some more cleanup in preparation for some actual functional changes. This splits getOutliningBenefit into two cost functions: getOutliningCallOverhead and getOutliningFrameOverhead. These functions return the number of instructions that would be required to call a specific function and the number of instructions that would be required to construct a frame for a specific funtion. The actual outlining benefit logic is moved into the outliner, which calls these functions. The goal of refactoring getOutliningBenefit is to: - Get us closer to getting rid of the IsTailCall flag - Further split up "target-specific" things and "general algorithm" things llvm-svn: 309356	2017-07-28 03:21:58 +00:00
Adrian Prantl	8f4b353ee1	Remove unused function from AArch64 backend (NFC) llvm-svn: 309336	2017-07-27 23:52:06 +00:00
Geoff Berry	b1e8714af9	[AArch64][Falkor] Avoid HW prefetcher tag collisions (step 1) Summary: This patch is the first step in reducing HW prefetcher instruction tag collisions in inner loops for Falkor. It adds a pass that annotates IR loads with metadata to indicate that they are known to be strided loads, and adds a target lowering hook that translates this metadata to a target-specific MachineMemOperand flag. A follow on change will use this MachineMemOperand flag to re-write instructions to reduce tag collisions. Reviewers: mcrosier, t.p.northover Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34963 llvm-svn: 308059	2017-07-14 21:44:12 +00:00
Geoff Berry	6748abe24d	[MIR] Add support for printing and parsing target MMO flags Summary: Add target hooks for printing and parsing target MMO flags. Targets may override getSerializableMachineMemOperandTargetFlags() to return a mapping from string to flag value for target MMO values that should be serialized/parsed in MIR output. Add implementation of this hook for AArch64 SuppressPair MMO flag. Reviewers: bogner, hfinkel, qcolombet, MatzeB Subscribers: mcrosier, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D34962 llvm-svn: 307877	2017-07-13 02:28:54 +00:00
Joel Jones	7466ccfc59	Doxygen formatting. NFCI llvm-svn: 307597	2017-07-10 22:11:50 +00:00
Simon Pilgrim	8b4dc53326	[AArch64] Fix -Wimplicit-fallthrough warnings. NFCI. llvm-svn: 307393	2017-07-07 13:03:28 +00:00
Joel Jones	aff09bf052	Doxygen formatting. NFCI llvm-svn: 307263	2017-07-06 14:17:36 +00:00
Chad Rosier	6db9ff64a8	[AArch64] Prefer Bcc to CBZ/CBNZ/TBZ/TBNZ when NZCV flags can be set for "free". This patch contains a pass that transforms CBZ/CBNZ/TBZ/TBNZ instructions into a conditional branch (Bcc), when the NZCV flags can be set for "free". This is preferred on targets that have more flexibility when scheduling Bcc instructions as compared to CBZ/CBNZ/TBZ/TBNZ (assuming all other variables are equal). This can reduce register pressure and is also the default behavior for GCC. A few examples: add w8, w0, w1 -> cmn w0, w1 ; CMN is an alias of ADDS. cbz w8, .LBB_2 -> b.eq .LBB0_2 ; single def/use of w8 removed. add w8, w0, w1 -> adds w8, w0, w1 ; w8 has multiple uses. cbz w8, .LBB1_2 -> b.eq .LBB1_2 sub w8, w0, w1 -> subs w8, w0, w1 ; w8 has multiple uses. tbz w8, #31, .LBB6_2 -> b.ge .LBB6_2 In looking at all current sub-target machine descriptions, this transformation appears to be either positive or neutral. Differential Revision: https://reviews.llvm.org/D34220. llvm-svn: 306144	2017-06-23 19:20:12 +00:00
Chandler Carruth	6bda14b313	Sort the remaining #include lines in include/... and lib/.... I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787	2017-06-06 11:49:48 +00:00
Geoff Berry	d6ac96f953	[AArch64][Falkor] Refine sched details for LSLfast/ASRfast. llvm-svn: 303682	2017-05-23 19:57:45 +00:00
Chad Rosier	8b12a03215	Fix an improperly placed curly bracket. NFC. llvm-svn: 303165	2017-05-16 12:43:23 +00:00
Chad Rosier	aeffffdb44	[AArch64][MachineCombine] Fold FNMUL+FSUB -> FNMADD. Differential Revision: http://reviews.llvm.org/D33101. llvm-svn: 302822	2017-05-11 20:07:24 +00:00
Krzysztof Parzyszek	44e25f37ae	Move size and alignment information of regclass to TargetRegisterInfo 1. RegisterClass::getSize() is split into two functions: - TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const; - TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const; 2. RegisterClass::getAlignment() is replaced by: - TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const; This will allow making those values depend on subtarget features in the future. Differential Revision: https://reviews.llvm.org/D31783 llvm-svn: 301221	2017-04-24 18:55:33 +00:00
Hans Wennborg	9b9a5358dd	Re-commit r301040 "X86: Don't emit zero-byte functions on Windows" In addition to the original commit, tighten the condition for when to pad empty functions to COFF Windows. This avoids running into problems when targeting e.g. Win32 AMDGPU, which caused test failures when this was committed initially. llvm-svn: 301047	2017-04-21 21:48:41 +00:00
Hans Wennborg	04593000d8	Revert r301040 "X86: Don't emit zero-byte functions on Windows" This broke almost all bots. Reverting while fixing. llvm-svn: 301041	2017-04-21 21:10:37 +00:00
Hans Wennborg	cb3e810714	X86: Don't emit zero-byte functions on Windows Empty functions can lead to duplicate entries in the Guard CF Function Table of a binary due to multiple functions sharing the same RVA, causing the kernel to refuse to load that binary. We had a terrific bug due to this in Chromium. It turns out we were already doing this for Mach-O in certain situations. This patch expands the code for that in AsmPrinter::EmitFunctionBody() and renames TargetInstrInfo::getNoopForMachoTarget() to simply getNoop() since it seems it was used for not just Mach-O anyway. Differential Revision: https://reviews.llvm.org/D32330 llvm-svn: 301040	2017-04-21 20:58:12 +00:00
Balaram Makam	b4419f9d30	[AArch64] Refine Falkor Machine Model - Part 3 This concludes the refinements to Falkor Machine Model. It includes SchedPredicates for immediate zero and LSL Fast. Forwarding logic is also modeled for vector multiply and accumulate only. llvm-svn: 299810	2017-04-08 03:30:15 +00:00
Jessica Paquette	eac8633d6d	[Outliner] Revert r298734. When I tested r298734, I thought that red zones were enabled by default like in X86. Since red zones are behind a flag on AArch64 the testing wasn't true. llvm-svn: 298747	2017-03-24 23:00:21 +00:00
Jessica Paquette	167af85ec7	[Outliner] Remove no red zone requirment for AArch64 AArch64 doesn't require -mno-red-zone; stack fixups are sufficient here. This was unnecessarily copied over from the X86 target. (You can now outline with red zones! Yay!) Removing the requirement passes all Single/MultiSource tests. llvm-svn: 298734	2017-03-24 20:47:59 +00:00
Jessica Paquette	02cbfb2926	[Outliner] ACTUALLY remove the errs output I don't know how to type. This fixes the last commit which would have made all of the overflows legal, and kept the screaming. llvm-svn: 298263	2017-03-20 16:25:04 +00:00
Jessica Paquette	5d59a4ee19	[Outliner] Remove output for offset range check Forgot to remove some output before committing last time. (Instruction fixups don't actually overflow anywhere in the test suite so far, so I missed it). To prevent the outliner from screaming "Overflow!" in the event that that does happen, this commit removes that output. llvm-svn: 298260	2017-03-20 15:51:45 +00:00
Jessica Paquette	ea8cc09be0	[Outliner] Add outliner for AArch64 This commit adds the necessary target hooks for outlining in AArch64. It also refactors the switch statement used in `getMemOpBaseRegImmOfsWidth` into a more general function, `getMemOpInfo`. This allows the outliner to share that code without copying and pasting it. The AArch64 outliner can be run using -mllvm -enable-machine-outliner, as with the X86-64 outliner. The test for this pass verifies that the outliner does, in fact outline functions, fixes up the stack accesses properly, and can correctly generate a tail call. In the future, this test should be replaced with a MIR test, so that we can properly test immediate offset overflows in fixed-up instructions. llvm-svn: 298162	2017-03-17 22:26:55 +00:00
Matthias Braun	e959544517	TargetInstrInfo: Provide default implementation of isTailCall(). In fact this default implementation should be the only implementation, keep it virtual for now to accomodate targets that don't model flags correctly. Differential Revision: https://reviews.llvm.org/D30747 llvm-svn: 297980	2017-03-16 20:02:30 +00:00
Azharuddin Mohammed	473b75c3d5	Remove CRC32 instructions from AArch64InstrInfo::hasShiftedReg Summary: A53 scheduler causes an assertion failure on all CRC instructions: include/llvm/CodeGen/MachineInstr.h:280: const llvm::MachineOperand &llvm::MachineInstr::getOperand(unsigned int) const: Assertion `i < getNumOperands() && "getOperand() out of range!"' failed. The case statements corresponding to CRC instructions are incorrect and should be removed. Also adding a testcase while on this. Reviewers: t.p.northover, javed.absar, apazos, rengolin Reviewed By: rengolin Subscribers: evandro, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D30274 llvm-svn: 297582	2017-03-12 14:02:32 +00:00
Evandro Menezes	94edf02923	[CodeGen] Move MacroFusion to the target This patch moves the class for scheduling adjacent instructions, MacroFusion, to the target. In AArch64, it also expands the fusion to all instructions pairs in a scheduling block, beyond just among the predecessors of the branch at the end. Differential revision: https://reviews.llvm.org/D28489 llvm-svn: 293737	2017-02-01 02:54:34 +00:00
Serge Rogatch	bc2d34394d	[XRay][AArch64] More staging for tail call support in XRay on AArch64 - in LLVM Summary: This patch prepares more for tail call support in XRay. Until the logging part supports tail calls, this is just staging, so it seems LLVM part is mostly ready with this patch. Related: https://reviews.llvm.org/D28948 (compiler-rt) Reviewers: dberris, rengolin Reviewed By: dberris Subscribers: llvm-commits, iid_iunknown, aemerson Differential Revision: https://reviews.llvm.org/D28947 llvm-svn: 293080	2017-01-25 20:21:49 +00:00
Evandro Menezes	7784cacd91	[AArch64] Rename 'no-quad-ldst-pairs' to 'slow-paired-128' In order to follow the pattern of the existing 'slow-misaligned-128store' option, rename the option 'no-quad-ldst-pairs' to 'slow-paired-128'. llvm-svn: 292954	2017-01-24 17:34:31 +00:00
Evandro Menezes	7960b2e19a	[AArch64] Generate literals by the little end ARM seems to prefer that long literals be formed from their little end in order to promote the fusion of the instrs pairs MOV/MOVK and MOVK/MOVK on Cortex A57 and others (v. "Cortex A57 Software Optimisation Guide", section 4.14). Differential revision: https://reviews.llvm.org/D28697 llvm-svn: 292422	2017-01-18 18:57:08 +00:00
Diana Picus	116bbab4e4	[CodeGen] Rename MachineInstrBuilder::addOperand. NFC Rename from addOperand to just add, to match the other method that has been added to MachineInstrBuilder for adding more than just 1 operand. See https://reviews.llvm.org/D28057 for the whole discussion. Differential Revision: https://reviews.llvm.org/D28556 llvm-svn: 291891	2017-01-13 09:58:52 +00:00
Eugene Zelenko	049b017538	[AArch64, Lanai] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 291197	2017-01-06 00:30:53 +00:00
Logan Chien	ce542eefe3	Code cleanup: Remove tab indents. llvm-svn: 291193	2017-01-05 23:41:33 +00:00
Geoff Berry	d46b6e8096	[AArch64] Fold some filled/spilled subreg COPYs Summary: Extend AArch64 foldMemoryOperandImpl() to handle folding spills of subreg COPYs with read-undef defs like: %vreg0:sub_32<def,read-undef> = COPY %WZR; GPR64:%vreg0 by widening the spilled physical source reg and generating: STRXui %XZR <fi#0> as well as folding fills of similar COPYs like: %vreg0:sub_32<def,read-undef> = COPY %vreg1; GPR64:%vreg0, GPR32:%vreg1 by generating: %vreg0:sub_32<def,read-undef> = LDRWui <fi#0> Reviewers: MatzeB, qcolombet Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D27425 llvm-svn: 291180	2017-01-05 21:51:42 +00:00
Geoff Berry	7ffce7be0c	[AArch64] Fold more spilled/refilled COPYs. Summary: Make AArch64InstrInfo::foldMemoryOperandImpl more general by folding all full COPYs between register classes of the same size that are either spilled or refilled. Reviewers: MatzeB, qcolombet Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D27271 llvm-svn: 288439	2016-12-01 23:43:55 +00:00
Geoff Berry	7c078fc035	[AArch64] Fold spills of COPY of WZR/XZR Summary: In AArch64InstrInfo::foldMemoryOperandImpl, catch more cases where the COPY being spilled is copying from WZR/XZR, but the source register is not in the COPY destination register's regclass. For example, when spilling: %vreg0 = COPY %XZR ; %vreg0:GPR64common without this change, the code in TargetInstrInfo::foldMemoryOperand() and canFoldCopy() that normally handles cases like this would fail to optimize since %XZR is not in GPR64common. So the spill code generated would be: %vreg0 = COPY %XZR STR %vreg instead of the new code generated: STR %XZR Reviewers: qcolombet, MatzeB Subscribers: mcrosier, aemerson, t.p.northover, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D26976 llvm-svn: 288176	2016-11-29 18:28:32 +00:00
Matthias Braun	115efcd3d1	MachineScheduler: Export function to construct "default" scheduler. This makes the createGenericSchedLive() function that constructs the default scheduler available for the public API. This should help when you want to get a scheduler and the default list of DAG mutations. This also shrinks the list of default DAG mutations: {Load\|Store}ClusterDAGMutation and MacroFusionDAGMutation are no longer added by default. Targets can easily add them if they need them. It also makes it easier for targets to add alternative/custom macrofusion or clustering mutations while staying with the default createGenericSchedLive(). It also saves the callback back and forth in TargetInstrInfo::enableClusterLoads()/enableClusterStores(). Differential Revision: https://reviews.llvm.org/D26986 llvm-svn: 288057	2016-11-28 20:11:54 +00:00
Abderrazek Zaafrani	9f382f53d1	Test commit llvm-svn: 284832	2016-10-21 15:24:08 +00:00
Matt Arsenault	0a3ea89e85	AArch64: Move remaining target specific BranchRelaxation bits to TII llvm-svn: 283458	2016-10-06 15:38:09 +00:00
Matthias Braun	46a5238682	AArch64: Macrofusion: Split features, add missing combinations. AArch64InstrInfo::shouldScheduleAdjacent() determines whether two instruction can benefit from macroop fusion on apple CPUs. The list turned out to be incomplete: - the "rr" variants of the instructions were missing - even the "rs" variants can have shift value == 0 and behave like the "rr" variants This also splits the MacropFusion target feature into ArithmeticBccFusion and ArithmeticCbzFusion. Differential Revision: https://reviews.llvm.org/D25142 llvm-svn: 283243	2016-10-04 19:28:21 +00:00
Evandro Menezes	19b2aed308	[AArch64] Support for FP FMA when -ffp-contract=fast Currently, the machine combiner can proceed matching when -ffast-math is on. It should also match when only -ffp-contract=fast is specified as was the case before when DAGCombiner was doing the job. Patch by: Abderrazek Zaafrani <a.zaafrani@samsung.com>. Differential Revision: https://reviews.llvm.org/D24366 llvm-svn: 281649	2016-09-15 19:55:23 +00:00
Matt Arsenault	1b9fc8ed65	Finish renaming remaining analyzeBranch functions llvm-svn: 281535	2016-09-14 20:43:16 +00:00
Matt Arsenault	e8e0f5cac6	Make analyzeBranch family of instruction names consistent analyzeBranch was renamed to use lowercase first, rename the related set to match. llvm-svn: 281506	2016-09-14 17:24:15 +00:00
Matt Arsenault	a2b036e88b	AArch64: Use TTI branch functions in branch relaxation The main change is to return the code size from InsertBranch/RemoveBranch. Patch mostly by Tim Northover llvm-svn: 281505	2016-09-14 17:23:48 +00:00
Diana Picus	4b97288184	[AArch64] Support stackmap/patchpoint in getInstSizeInBytes We currently return 4 for stackmaps and patchpoints, which is very optimistic and can in rare cases cause the branch relaxation pass to fail to relax certain branches. This patch causes getInstSizeInBytes to return a pessimistic estimate of the size as the number of bytes requested in the stackmap/patchpoint. In the future, we could provide a more accurate estimate by sharing some of the logic in AArch64::LowerSTACKMAP/PATCHPOINT. Fixes part of https://llvm.org/bugs/show_bug.cgi?id=28750 Differential Revision: https://reviews.llvm.org/D24073 llvm-svn: 281301	2016-09-13 07:45:17 +00:00

1 2 3 4 5 ...

297 Commits