llvm-project

Commit Graph

Author	SHA1	Message	Date
Richard Sandiford	a075708abe	[SystemZ] Prefer comparisons with zero Convert >= 1 to > 0, etc. Using comparison with zero isn't a win on its own, but it exposes more opportunities for CC reuse (the next patch). llvm-svn: 187571	2013-08-01 10:29:45 +00:00
Richard Sandiford	791bea4182	[SystemZ] Implement isLegalAddressingMode() The loop optimizers were assuming that scales > 1 were OK. I think this is actually a bug in TargetLoweringBase::isLegalAddressingMode(), since it seems to be trying to reject anything that isn't r+i or r+r, but it has no default case for scales other than 0, 1 or 2. Implementing the hook for z means that z can no longer test any change there though. llvm-svn: 187497	2013-07-31 12:58:26 +00:00
Richard Sandiford	ee8343822e	[SystemZ] Be more careful about inverting CC masks (conditional loads) Extend r187495 to conditional loads. I split this out because the easiest way seemed to be to force a particular operand order in SystemZISelDAGToDAG.cpp. llvm-svn: 187496	2013-07-31 12:38:08 +00:00
Richard Sandiford	3d768e334b	[SystemZ] Be more careful about inverting CC masks System z branches have a mask to select which of the 4 CC values should cause the branch to be taken. We can invert a branch by inverting the mask. However, not all instructions can produce all 4 CC values, so inverting the branch like this can lead to some oddities. For example, integer comparisons only produce a CC of 0 (equal), 1 (less) or 2 (greater). If an integer EQ is reversed to NE before instruction selection, the branch will test for 1 or 2. If instead the branch is reversed after instruction selection (by inverting the mask), it will test for 1, 2 or 3. Both are correct, but the second isn't really canonical. This patch therefore keeps track of which CC values are possible and uses this when inverting a mask. Although this is mostly cosmetic, it fixes undefined behavior for the CIJNLH in branch-08.ll. Another fix would have been to mask out bit 0 when generating the fused compare and branch, but the point of this patch is that we shouldn't need to do that in the first place. The patch also makes it easier to reuse CC results from other instructions. llvm-svn: 187495	2013-07-31 12:30:20 +00:00
Richard Sandiford	8a757bba10	[SystemZ] Move compare-and-branch generation even later r187116 moved compare-and-branch generation from the instruction-selection pass to the peephole optimizer (via optimizeCompare). It turns out that even this is a bit too early. Fused compare-and-branch instructions don't interact well with predication, where a CC result is needed. They also make it harder to reuse the CC side-effects of earlier instructions (not yet implemented, but the subject of a later patch). Another problem was that the AnalyzeBranch family of routines weren't handling compares and branches, so we weren't able to reverse the fused form in cases where we would reverse a separate branch. This could have been fixed by extending AnalyzeBranch, but given the other problems, I've instead moved the fusing to the long-branch pass, which is also responsible for the opposite transformation: splitting out-of-range compares and branches into separate compares and long branches. I've added a test for the AnalyzeBranch problem. A test for the predication problem is included in the next patch, which fixes a bug in the choice of CC mask. llvm-svn: 187494	2013-07-31 12:11:07 +00:00
Richard Sandiford	6a06ba36ba	[SystemZ] Postpone NI->RISBG conversion to convertToThreeAddress() r186399 aggressively used the RISBG instruction for immediate ANDs, both because it can handle some values that AND IMMEDIATE can't, and because it allows the destination register to be different from the source. I realized later while implementing the distinct-ops support that it would be better to leave the choice up to convertToThreeAddress() instead. The AND IMMEDIATE form is shorter and is less likely to be cracked. This is a problem for 32-bit ANDs because we assume that all 32-bit operations will leave the high word untouched, whereas RISBG used in this way will either clear the high word or copy it from the source register. The patch uses the z196 instruction RISBLG for this instead. This means that z10 will be restricted to NILL, NILH and NILF for 32-bit ANDs, but I think that should be OK for now. Although we're using z10 as the base architecture, the optimization work is going to be focused more on z196 and zEC12. llvm-svn: 187492	2013-07-31 11:36:35 +00:00
Richard Sandiford	6cf80b3ec0	[SystemZ] Add RISBLG and RISBHG instruction definitions The next patch will make use of RISBLG for codegen. llvm-svn: 187490	2013-07-31 11:17:35 +00:00
Richard Sandiford	c3f85d73ab	[SystemZ] Rework compare and branch support Before the patch we took advantage of the fact that the compare and branch are glued together in the selection DAG and fused them together (where possible) while emitting them. This seemed to work well in practice. However, fusing the compare so early makes it harder to remove redundant compares in cases where CC already has a suitable value. This patch therefore uses the peephole analyzeCompare/optimizeCompareInstr pair of functions instead. No behavioral change intended, but it paves the way for a later patch. llvm-svn: 187116	2013-07-25 09:34:38 +00:00
Richard Sandiford	f2404164ba	[SystemZ] Add LOCR and LOCGR llvm-svn: 187113	2013-07-25 09:11:15 +00:00
Richard Sandiford	09a8cf3604	[SystemZ] Add LOC and LOCG As with the stores, these instructions can trap when the condition is false, so they are only used for things like (cond ? x : *ptr). llvm-svn: 187112	2013-07-25 09:04:52 +00:00
Richard Sandiford	a68e6f5660	[SystemZ] Add STOC and STOCG These instructions are allowed to trap even if the condition is false, so for now they are only used for "ptr = (cond ? x : ptr)"-style constructs. llvm-svn: 187111	2013-07-25 08:57:02 +00:00
Craig Topper	690d8ea181	Split generated asm mnemonic matching table into a separate table for each asm variant. This removes the need to store the asm variant in each row of the single table that existed before. Shaves ~16K off the size of X86AsmParser.o. llvm-svn: 187026	2013-07-24 07:33:14 +00:00
Richard Sandiford	fac8b10a84	[SystemZ] Add ALRK, AGLRK, SLRK and SGLRK Follows the same lines as r186686, but much more limited, since we only use ADD LOGICAL for multi-i64 additions. llvm-svn: 186689	2013-07-19 16:37:00 +00:00
Richard Sandiford	7d6a453623	[SystemZ] Add AHIK and AGHIK I did these as a separate patch because it uses a slightly different form of RIE layout. llvm-svn: 186687	2013-07-19 16:32:12 +00:00
Richard Sandiford	c575df6dcc	[SystemZ] Add ARK, AGRK, SRK and SGRK The testsuite changes follow the same lines as for r186683. llvm-svn: 186686	2013-07-19 16:26:39 +00:00
Richard Sandiford	c57e586792	[SystemZ] Add NGRK, OGRK and XGRK Like r186683, but for 64 bits. llvm-svn: 186685	2013-07-19 16:24:22 +00:00
Richard Sandiford	0175b4a353	[SystemZ] Add NRK, ORK and XRK The atomic tests assume the two-operand forms, so I've restricted them to z10. Running and-01.ll, or-01.ll and xor-01.ll for z196 as well as z10 shows why using convertToThreeAddress() is better than exposing the three-operand forms first and then converting back to two operands where possible (which is what I'd originally tried). Using the three-operand form first stops us from taking advantage of NG, OG and XG for spills. llvm-svn: 186683	2013-07-19 16:21:55 +00:00
Richard Sandiford	ff6c5a5609	[SystemZ] Use SLLK, SRLK and SRAK for codegen This patch uses the instructions added in r186680 for codegen. llvm-svn: 186681	2013-07-19 16:12:08 +00:00
Richard Sandiford	27d1cfe3d4	[SystemZ] Start adding z196 and zEC12 support This first step just adds definitions for SLLK, SRLK and SRAK. The next patch will actually make use of them during codegen. insn-bad.s tests that some form of error is reported when using these instructions on z10. More work is needed to get the "instruction requires: distinct-ops" that we'd ideally like, so I've stubbed that part out for now. I'll come back and make it mandatory once the necessary changes are in. llvm-svn: 186680	2013-07-19 16:09:03 +00:00
Richard Sandiford	5109321042	[SystemZ] Use RNSBG This should be the last of the R.SBG patches for now. llvm-svn: 186573	2013-07-18 10:40:35 +00:00
Richard Sandiford	297f7d2724	[SystemZ] Generalize RxSBG SRA case The original code only folded SRA into ROTATE ... SELECTED BITS if there was no outer shift. This patch splits out that check and generalises it slightly. The extra cases aren't really that interesting, but this is paving the way for RNSBG support. llvm-svn: 186571	2013-07-18 10:14:55 +00:00
Richard Sandiford	7878b852e6	[SystemZ] Use RXSBG Extend the previous R.SBG patches to handle XORs. llvm-svn: 186570	2013-07-18 10:06:15 +00:00
Richard Sandiford	5cbac96730	[SystemZ] Rename and formatting fixes In hindsight, using "RISBG" for something that can be any type of R.SBG instruction was a bit confusing, so this renames it to RxSBG. That might not be the best choice either, since there is an instruction called RXSBG, but hopefully the lower-case letter stands out enough. While there I fixed a couple of GNUisms that had crept in -- sorry about that! llvm-svn: 186569	2013-07-18 09:45:08 +00:00
Aaron Ballman	fbb104513b	Silencing an MSVC warning about signed vs unsigned comparison mismatches. llvm-svn: 186529	2013-07-17 19:43:13 +00:00
Richard Sandiford	885140c951	[SystemZ] Use ROSBG and non-zero form of RISBG for OR nodes llvm-svn: 186405	2013-07-16 11:55:57 +00:00
Richard Sandiford	35bb463fb1	[SystemZ] Add MC support for R[NOX]SBG CodeGen support will come later. llvm-svn: 186401	2013-07-16 11:28:08 +00:00
Richard Sandiford	82ec87dbdb	[SystemZ] Use RISBG for (shift (and ...)) Another patch in the series to make more use of R.SBG. This one extends r186072 and r186073 to handle cases where the AND is inside the shift. llvm-svn: 186399	2013-07-16 11:02:24 +00:00
Craig Topper	b94011fd28	Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size. llvm-svn: 186274	2013-07-14 04:42:23 +00:00
Richard Sandiford	6d4bd28322	[SystemZ] Optimize sign-extends of vector setccs Normal (sext (setcc ...)) sequences are optimised into (select_cc ..., -1, 0) by DAGCombiner::visitSIGN_EXTEND. However, this is deliberately not done for vectors, and after vector type legalization we have (sext_inreg (setcc ...)) instead. I wondered about trying to extend DAGCombiner to handle this case too, but it seemed to be a loss on some other targets I tried, even those for which SETCC isn't "legal" and SELECT_CC is. llvm-svn: 186149	2013-07-12 09:17:10 +00:00
Richard Sandiford	b820405b59	[SystemZ] Fix parsing of inline asm registers GPR and FPR constraints like "{r2}" and "{f2}" weren't handled correctly because the name-to-regno mapping depends on the value type and (because of that) the internal names in RegStrings are not the same as the AsmName. CC constraints like "{cc}" didn't work either because there was no associated register class. llvm-svn: 186148	2013-07-12 09:08:12 +00:00
Richard Sandiford	3f0edc2903	[SystemZ] Improve spilling of LGDR and LDGR If the source of these instructions is spilled we should load the destination. If the destination is spilled we should store the source. llvm-svn: 186147	2013-07-12 08:37:17 +00:00
Richard Sandiford	ea9b6aa20b	[SystemZ] Use zeroing form of RISBG for shift-and-AND sequences Extend r186072 to handle shifts and ANDs. llvm-svn: 186073	2013-07-11 09:10:09 +00:00
Richard Sandiford	84f54a3bc9	[SystemZ] Use zeroing form of RISBG for some AND sequences RISBG can handle some ANDs for which no AND IMMEDIATE exists. It also acts as a three-operand AND for some cases where an AND IMMEDIATE could be used instead. It might be worth adding a pass to replace RISBG with AND IMMEDIATE in cases where the register operands end up being the same and where AND IMMEDIATE is smaller. llvm-svn: 186072	2013-07-11 08:59:12 +00:00
Richard Sandiford	67ddcd6dd0	[SystemZ] Allow 8-bit operands to RISBG RISBG has three 8-bit operands (I3, I4 and I5). I'd originally restricted all three to 6 bits, since that's the only range we intended to use at the time. However, the top bit of I4 acts as a "zero" flag for RISBG, while the top bit of I3 acts as a "test" flag for RNSBG & co. This patch therefore allows them to have the full 8-bit range. I've left the fifth operand as a 6-bit value for now since the upper 2 bits have no defined meaning. llvm-svn: 186070	2013-07-11 08:37:13 +00:00
Stephen Lin	73de7bf5de	AArch64/PowerPC/SystemZ/X86: This patch fixes the interface, usage, and all in-tree implementations of TargetLoweringBase::isFMAFasterThanMulAndAdd in order to resolve the following issues with fmuladd (i.e. optional FMA) intrinsics: 1. On X86(-64) targets, ISD::FMA nodes are formed when lowering fmuladd intrinsics even if the subtarget does not support FMA instructions, leading to laughably bad code generation in some situations. 2. On AArch64 targets, ISD::FMA nodes are formed for operations on fp128, resulting in a call to a software fp128 FMA implementation. 3. On PowerPC targets, FMAs are not generated from fmuladd intrinsics on types like v2f32, v8f32, v4f64, etc., even though they promote, split, scalarize, etc. to types that support hardware FMAs. The function has also been slightly renamed for consistency and to force a merge/build conflict for any out-of-tree target implementing it. To resolve, see comments and fixed in-tree examples. llvm-svn: 185956	2013-07-09 18:16:56 +00:00
Richard Sandiford	9784649157	[SystemZ] Use MVC for simple load/store pairs Look for patterns of the form (store (load ...), ...) in which the two locations are known not to partially overlap. (Identical locations are OK.) These sequences are better implemented by MVC unless either the load or the store could use RELATIVE LONG instructions. The testcase showed that we weren't using LHRL and LGHRL for extload16, only sextloadi16. The patch fixes that too. llvm-svn: 185919	2013-07-09 09:46:39 +00:00
Richard Sandiford	47660c148c	[SystemZ] Use "STC;MVC" for memset Use "STC;MVC" for memsets that are too big for two STCs or MV...Is yet small enough for a single MVC. As with memcpy, I'm leaving longer cases till later. The number of tests might seem excessive, but f33 & f34 from memset-04.ll failed the first cut because I'd not added the "?:" on the calculation of Size1. llvm-svn: 185918	2013-07-09 09:32:42 +00:00
Richard Sandiford	d6c78e8f9f	[SystemZ] Remove unwanted part from last commit I was originally going to use MVC for memmove too, but that's less of a clear win. Remove some accidental left-overs in the previous commit. llvm-svn: 185804	2013-07-08 09:55:36 +00:00
Richard Sandiford	d131ff8cf8	[SystemZ] Use MVC for memcpy Use MVC for memcpy in cases where a single MVC is enough. Using MVC is a win for longer copies too, but I'll leave that for later. llvm-svn: 185802	2013-07-08 09:35:23 +00:00
Richard Sandiford	c40f27b52d	[SystemZ] Remove no-op MVCs The stack coloring pass has code to delete stores and loads that become trivially dead after coloring. Extend it to cope with single instructions that copy from one frame index to another. The testcase happens to show an example of this kicking in at the moment. It did occur in Real Code too though. llvm-svn: 185705	2013-07-05 14:38:48 +00:00
Richard Sandiford	1ca6deaeb7	[SystemZ] Remove redundant frame MMOs This fixes foldMemoryOperandImpl() so that it doesn't create duplicated frame MMOs. I hadn't realized when writing r185434 that it was the caller's responsibility to add these. No behavioural change intended. llvm-svn: 185704	2013-07-05 14:31:24 +00:00
Richard Sandiford	8976ea72ab	[SystemZ] Enable the use of MVC for frame-to-frame spills ...now that the problem that prompted the restriction has been fixed. The original spill-02.py was a compromise because at the time I couldn't find an example that actually failed without the two scavenging slots. The version included here did. llvm-svn: 185701	2013-07-05 14:02:01 +00:00
Richard Sandiford	23943229f6	[SystemZ] Allocate a second register scavenging slot This is another prerequisite for frame-to-frame MVC copies. I'll commit the patch that makes use of the slot separately. The downside of trying to test many corner cases with each of the available addressing modes is that a fair few tests need to account for the new frame layout. I do still think it's useful to have all these tests though, since it's something that wouldn't get much coverage otherwise. llvm-svn: 185698	2013-07-05 13:11:52 +00:00
Richard Sandiford	5dd52f8c4d	[SystemZ] Clean up register scavenging code SystemZ wants normal register scavenging slots, as close to the stack or frame pointer as possible. The only reason it was using custom code was because PrologEpilogInserter assumed an x86-like layout, where the frame pointer is at the opposite end of the frame from the stack pointer. This meant that when frame pointer elimination was disabled, the slots ended up being as close as possible to the incoming stack pointer, which is the opposite of what we want on SystemZ. This patch adds a new knob to say which layout is used and converts SystemZ to use target-independent scavenging slots. It's one of the pieces needed to support frame-to-frame MVCs, where two slots might be required. The ABI requires us to allocate 160 bytes for calls, so one approach would be to use that area as temporary spill space instead. It would need some surgery to make sure that the slot isn't live across a call though. I stuck to the "isFPCloseToIncomingSP - ..." style comment on the "do what the surrounding code does" principle. The FP case is already covered by several SystemZ/frame-* tests, which fail without the PrologEpilogInserter change, so no new ones are needed. No behavioural change intended. llvm-svn: 185696	2013-07-05 12:55:00 +00:00
Jakob Stoklund Olesen	db429d9483	Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes. These exception-related opcodes are not used any longer. llvm-svn: 185625	2013-07-04 13:54:20 +00:00
Craig Topper	af0dea1347	Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size. llvm-svn: 185606	2013-07-04 01:31:24 +00:00
Jakob Stoklund Olesen	a1f5b901a5	Revert r185595-185596 which broke buildbots. Revert "Simplify landing pad lowering." Revert "Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes." llvm-svn: 185600	2013-07-04 00:26:30 +00:00
Jakob Stoklund Olesen	f33ec531fa	Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes. These exception-related opcodes are not used any longer. llvm-svn: 185596	2013-07-03 23:56:31 +00:00
Richard Sandiford	ed1fab6b5b	[SystemZ] Fold more spills Add a mapping from register-based <INSN>R instructions to the corresponding memory-based <INSN>. Use it to cut down on the number of spill loads. Some instructions extend their operands from smaller fields, so this required a new TSFlags field to say how big the unextended operand is. This optimisation doesn't trigger for C(G)R and CL(G)R because in practice we always combine those instructions with a branch. Adding a test for every other case probably seems excessive, but it did catch a missed optimisation for DSGF (fixed in r185435). llvm-svn: 185529	2013-07-03 10:10:02 +00:00
Richard Sandiford	df313ff697	[SystemZ] Rename mapping table fields Rename Function->DispKey and PairType->DispSize. I'd originally used "Function" because I thought it might be useful for other InstMappings. However, it turns out that having two very similar instructions with the same Function makes it pretty useless for anything other than the displacement size key. Other InstMappings will want the key to be defined for only one instruction in the pair. No behavioural change intended. llvm-svn: 185526	2013-07-03 09:19:58 +00:00
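
The CC-mask inversion described in the 3d768e334b entry (llvm-svn 187495) can be illustrated with a small sketch. This is not the code from the patch: the bit layout (CC 0 in the most significant of four mask bits) and every name below are assumptions made for illustration only. It shows why inverting a mask over all four CC values gives "1, 2 or 3" for an integer EQ, while inverting only within the CC values an integer comparison can produce gives "1 or 2".

```python
# Sketch only: the bit layout and all names below are assumptions for
# illustration, not LLVM's definitions.
CC0, CC1, CC2, CC3 = 8, 4, 2, 1   # one mask bit per CC value
CC_ALL = CC0 | CC1 | CC2 | CC3
ICMP_VALID = CC0 | CC1 | CC2      # integer compares produce CC 0, 1 or 2 only


def invert_all_bits(mask):
    """Invert a branch mask over all four CC values."""
    return mask ^ CC_ALL


def invert_within_valid(mask, valid):
    """Invert a branch mask, keeping only CC values the compare can produce."""
    return (mask ^ valid) & valid


def cc_values(mask):
    """List the CC values (0..3) that a mask selects."""
    return [cc for cc in range(4) if mask & (CC0 >> cc)]


if __name__ == "__main__":
    eq = CC0  # integer EQ branches when CC is 0
    print(cc_values(invert_all_bits(eq)))                  # [1, 2, 3]
    print(cc_values(invert_within_valid(eq, ICMP_VALID)))  # [1, 2]
```

Both inverted masks branch correctly after an integer comparison, since CC 3 never occurs there; the second is simply the canonical NE form that the commit message prefers.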

495 Commits