llvm-project

Commit Graph

Author	SHA1	Message	Date
Puyan Lotfi	43e94b15ea	Followup on Proposal to move MIR physical register namespace to '$' sigil. Discussed here: http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html In preparation for adding support for named vregs we are changing the sigil for physical registers in MIR to '$' from '%'. This will prevent name clashes of named physical register with named vregs. llvm-svn: 323922	2018-01-31 22:04:26 +00:00
Petar Jovanovic	540f4cd10a	[DWARF] Allow duplication of tails with CFI instructions This commit came as a result for revert of patch r317579 (originally committed as r317100). The patch made CFI instructions duplicable, because their existence in the epilogue block was affecting the Tail duplication pass. However, duplicating blocks with CFI instructions was an issue for compact unwind info on Darwin, which is why the patch was reverted. This patch allows duplicating tails with CFI instructions, though they are not duplicable, by copying them 'manually'. Patch by Djordje Kovacevic. Differential Revision: https://reviews.llvm.org/D40979 llvm-svn: 323883	2018-01-31 15:57:57 +00:00
Craig Topper	3913a4dd56	[X86] Fix a crash that can occur in combineExtractVectorElt due to not checking the width of a ConstantSDNode before calling getConstantOperandVal. llvm-svn: 323614	2018-01-28 07:29:35 +00:00
Craig Topper	247016a735	[X86] Use vptestm/vptestnm for comparisons with zero to avoid creating a zero vector. We can use the same input for both operands to get a free compare with zero. We already use this trick in a couple places where we explicitly create PTESTM with the same input twice. This generalizes it. I'm hoping to remove the ISD opcodes and move this to isel patterns like we do for scalar cmp/test. llvm-svn: 323605	2018-01-27 20:19:09 +00:00
Craig Topper	513d3fa674	[X86] Remove X86ISD::PCMPGTM/PCMPEQM and instead just use X86ISD::PCMPM and pattern match the immediate value during isel. Legalization is still biased to turn LT compares in to GT by swapping operands to avoid needing extra isel patterns to commute. I'm hoping to remove TESTM/TESTNM next and this should simplify that by making EQ/NE more similar. llvm-svn: 323604	2018-01-27 20:19:02 +00:00
Craig Topper	2c570eaa00	[TargetLowering] Teach TargetLowering::SimplifySetCC to simplify setcc of vXi1 vectors into logic ops. This transform was already being done for setcc of scalar i1. This extends it to vectors. llvm-svn: 323585	2018-01-27 09:10:58 +00:00
Craig Topper	c58c2b5c9b	[X86] Rewrite vXi1 element insertion by using a vXi1 scalar_to_vector and inserting into a vXi1 vector. The existing code was already doing something very similar to subvector insertion so this allows us to remove the nearly duplicate code. This patch is a little larger than it should be due to differences between the DQI handling between the two today. llvm-svn: 323212	2018-01-23 15:56:36 +00:00
Craig Topper	76adcc86cd	[X86] Legalize v32i1 without BWI via splitting to v16i1 rather than the default of promoting to v32i8. Summary: For the most part its better to keep v32i1 as a mask type of a narrower width than trying to promote it to a ymm register. I had to add some overrides to the methods that get the types for the calling convention so that we still use v32i8 for argument/return purposes. There are still some regressions in here. I definitely saw some around shuffles. I think we probably should move vXi1 shuffle from lowering to a DAG combine where I think the extend and truncate we have to emit would be better combined. I think we also need a DAG combine to remove trunc from (extract_vector_elt (trunc)) Overall this removes something like 13000 CHECK lines from lit tests. Reviewers: zvi, RKSimon, delena, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42031 llvm-svn: 323201	2018-01-23 14:25:39 +00:00
Craig Topper	b2868233b7	[X86] Use ISD::TRUNCATE instead of X86ISD::VTRUNC when input and output types have the same number of elements. llvm-svn: 322455	2018-01-14 08:11:36 +00:00
Craig Topper	e9fc0cd920	[X86] Improve legalization of vXi16/vXi8 selects. Extend vXi1 conditions of vXi8/vXi16 selects even before type legalization gets a chance to split wide vectors. Previously we would only extend 128 and 256 bit vectors. But if we start with a 512 bit vector or wider that needs to be split we wouldn't extend until after the split had taken place. By extending early we improve the results of type legalization. Don't widen condition of 128/256 bit vXi16/vXi8 selects when we have BWI but not VLX. We can still use a mask register by widening the select to 512-bits instead. This is similar to what we do for compares already. llvm-svn: 322450	2018-01-14 02:05:51 +00:00
Craig Topper	d58c165545	[X86] Make v2i1 and v4i1 legal types without VLX Summary: There are few oddities that occur due to v1i1, v8i1, v16i1 being legal without v2i1 and v4i1 being legal when we don't have VLX. Particularly during legalization of v2i32/v4i32/v2i64/v4i64 masked gather/scatter/load/store. We end up promoting the mask argument to these during type legalization and then have to widen the promoted type to v8iX/v16iX and truncate it to get the element size back down to v8i1/v16i1 to use a 512-bit operation. Since need to fill the upper bits of the mask we have to fill with 0s at the promoted type. It would be better if we could just have the v2i1/v4i1 types as legal so they don't undergo any promotion. Then we can just widen with 0s directly in a k register. There are no real v4i1/v2i1 instructions anyway. Everything is done on a larger register anyway. This also fixes an issue that we couldn't implement a masked vextractf32x4 from zmm to xmm properly. We now have to support widening more compares to 512-bit to get a mask result out so new tablegen patterns got added. I had to hack the legalizer for widening the operand of a setcc a bit so it didn't try create a setcc returning v4i32, extract from it, then try to promote it using a sign extend to v2i1. Now we create the setcc with v4i1 if the original setcc's result type is v2i1. Then extract that and don't sign extend it at all. There's definitely room for improvement with some follow up patches. Reviewers: RKSimon, zvi, guyblank Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41560 llvm-svn: 321967	2018-01-07 18:20:37 +00:00
Craig Topper	9befe89367	[X86] Use SIGN_EXTEND to implement ANY_EXTEND from vXi1. llvm-svn: 321334	2017-12-22 02:30:26 +00:00
Craig Topper	600f1ba333	[X86] Don't zero the upper bits of the k-register before extracting a single bit from a vXi1. This doesn't match the semantics of the extract_vector_elt operation. Nothing downstream knows the bits were zeroed so they still get masked or sign extended after the extrat anyway. llvm-svn: 320723	2017-12-14 18:35:25 +00:00
Craig Topper	8cdf7c0e68	[X86] Make ANY_EXTEND from vXi1 Custom for more types. We should be able to support ANY_EXTEND for any types we support ZERO_EXTEND for. llvm-svn: 320675	2017-12-14 08:26:00 +00:00
Craig Topper	eab2d4665f	[SelectionDAG][X86] Improve legalization of v32i1 CONCAT_VECTORS of v16i1 for AVX512F. A v32i1 CONCAT_VECTORS of v16i1 uses promotion to v32i8 to legalize the v32i1. This results in a bunch of extract_vector_elts and a build_vector that ultimately gets scalarized. This patch checks to see if v16i8 is legal and inserts a any_extend to that so that we can concat v16i8 to v32i8 and avoid creating the extracts. llvm-svn: 320674	2017-12-14 08:25:58 +00:00
Craig Topper	323ba39f10	[X86] Handle alls version of vXi1 insert_vector_elt with a constant index without falling back to shuffles. We previously only supported inserting to the LSB or MSB where it was easy to zero to perform an OR to insert. This change effectively extracts the old value and the new value, xors them together and then xors that single bit with the correct location in the original vector. This will cancel out the old value in the first xor leaving the new value in the position. The way I've implemented this uses 3 shifts and two xors and uses an additional register. We can avoid the additional register at the cost of another shift. llvm-svn: 320120	2017-12-08 00:16:09 +00:00
Craig Topper	dfc79c7c33	[X86] Fix InsertBitToMaskVector to only issue KSHIFTS of native size so that upper bits are properly zeroed. There's no v2i1 or v4i1 kshift, and v8i1 is only supported with AVXDQ. Isel has fake patterns to extend these types to native shifts, but makes no guarantees about the value of any bits shifted in when shifting right. This patch promotes the vector to a type that supports a native shift first and only allows inserting into the msb of a native sized shift. I've constructed this in a way that doesn't do the promotion if we're going to fallback to using a xmm/ymm/zmm shuffle. I think I have a plan to remove the shuffle fall back entirely. In which case we this can be simplified, but I wanted to fix the correctness issue first. llvm-svn: 320081	2017-12-07 20:10:04 +00:00
Francis Visoiu Mistrih	a8a83d150f	[CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register. Work towards the unification of MIR and debug output by refactoring the interfaces. For MachineOperand::print, keep a simple version that can be easily called from `dump()`, and a more complex one which will be called from both the MIRPrinter and MachineInstr::print. Add extra checks inside MachineOperand for detached operands (operands with getParent() == nullptr). https://reviews.llvm.org/D40836 * find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/<def>//g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<def[ ],[ ]dead>/dead \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ],[ ]dead>/implicit-def dead \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name "*.s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g' llvm-svn: 320022	2017-12-07 10:40:31 +00:00
Craig Topper	a404ce955a	[X86] Use vector widening to support sign extend from i1 when the dest type is not 512-bits and vlx is not enabled. Previously we used a wider element type and truncated. But its more efficient to keep the element type and drop unused elements. If BWI isn't supported and we have a i16 or i8 type, we'll extend it to be i32 and still use a truncate. llvm-svn: 319740	2017-12-05 06:37:21 +00:00
Francis Visoiu Mistrih	25528d6de7	[CodeGen] Unify MBB reference format in both MIR and debug output As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'. The MIR printer prints the IR name of a MBB only for block definitions. * find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber/" << printMBBReference(\1)/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber/" << printMBBReference(\1)/g' * find . $ -name ".txt" -o -name ".s" -o -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g' * grep -nr 'BB#' and fix Differential Revision: https://reviews.llvm.org/D40422 llvm-svn: 319665	2017-12-04 17:18:51 +00:00
Francis Visoiu Mistrih	9d7bb0cb40	[CodeGen] Print register names in lowercase in both MIR and debug output As part of the unification of the debug format and the MIR format, always print registers as lowercase. * Only debug printing is affected. It now follows MIR. Differential Revision: https://reviews.llvm.org/D40417 llvm-svn: 319187	2017-11-28 17:15:09 +00:00
Reid Kleckner	7adb2fdbba	Revert "Correct dwarf unwind information in function epilogue for X86" This reverts r317579, originally committed as r317100. There is a design issue with marking CFI instructions duplicatable. Not all targets support the CFIInstrInserter pass, and targets like Darwin can't cope with duplicated prologue setup CFI instructions. The compact unwind info emission fails. When the following code is compiled for arm64 on Mac at -O3, the CFI instructions end up getting tail duplicated, which causes compact unwind info emission to fail: int a, c, d, e, f, g, h, i, j, k, l, m; void n(int o, int b) { if (g) f = 0; for (; f < o; f++) { m = a; if (l > j k > i) j = i = k = d; h = b[c] - e; } } We get assembly that looks like this: ; BB#1: ; %if.then Lloh3: adrp x9, _f@GOTPAGE Lloh4: ldr x9, [x9, _f@GOTPAGEOFF] mov w8, wzr Lloh5: str wzr, [x9] stp x20, x19, [sp, #-16]! ; 8-byte Folded Spill .cfi_def_cfa_offset 16 .cfi_offset w19, -8 .cfi_offset w20, -16 cmp w8, w0 b.lt LBB0_3 b LBB0_7 LBB0_2: ; %entry.if.end_crit_edge Lloh6: adrp x8, _f@GOTPAGE Lloh7: ldr x8, [x8, _f@GOTPAGEOFF] Lloh8: ldr w8, [x8] stp x20, x19, [sp, #-16]! ; 8-byte Folded Spill .cfi_def_cfa_offset 16 .cfi_offset w19, -8 .cfi_offset w20, -16 cmp w8, w0 b.ge LBB0_7 LBB0_3: ; %for.body.lr.ph Note the multiple .cfi_def* directives. Compact unwind info emission can't handle that. llvm-svn: 317726	2017-11-08 21:31:14 +00:00
Petar Jovanovic	e2a585dddc	Reland "Correct dwarf unwind information in function epilogue for X86" Reland r317100 with minor fix regarding ComputeCommonTailLength function in BranchFolding.cpp. Skipping top CFI instructions block needs to executed on several more return points in ComputeCommonTailLength(). Original r317100 message: "Correct dwarf unwind information in function epilogue for X86" This patch aims to provide correct dwarf unwind information in function epilogue for X86. It consists of two parts. The first part inserts CFI instructions that set appropriate cfa offset and cfa register in emitEpilogue() in X86FrameLowering. This part is X86 specific. The second part is platform independent and ensures that: - CFI instructions do not affect code generation - Unwind information remains correct when a function is modified by different passes. This is done in a late pass by analyzing information about cfa offset and cfa register in BBs and inserting additional CFI directives where necessary. Changed CFI instructions so that they: - are duplicable - are not counted as instructions when tail duplicating or tail merging - can be compared as equal Added CFIInstrInserter pass: - analyzes each basic block to determine cfa offset and register valid at its entry and exit - verifies that outgoing cfa offset and register of predecessor blocks match incoming values of their successors - inserts additional CFI directives at basic block beginning to correct the rule for calculating CFA Having CFI instructions in function epilogue can cause incorrect CFA calculation rule for some basic blocks. This can happen if, due to basic block reordering, or the existence of multiple epilogue blocks, some of the blocks have wrong cfa offset and register values set by the epilogue block above them. CFIInstrInserter is currently run only on X86, but can be used by any target that implements support for adding CFI instructions in epilogue. Patch by Violeta Vukobrat. llvm-svn: 317579	2017-11-07 14:40:27 +00:00
Petar Jovanovic	bb5c84fb57	Revert "Correct dwarf unwind information in function epilogue for X86" This reverts r317100 as it introduced sanitizer-x86_64-linux-autoconf buildbot failure (build #15606). llvm-svn: 317136	2017-11-01 23:05:52 +00:00
Petar Jovanovic	f2faee92aa	Correct dwarf unwind information in function epilogue for X86 This patch aims to provide correct dwarf unwind information in function epilogue for X86. It consists of two parts. The first part inserts CFI instructions that set appropriate cfa offset and cfa register in emitEpilogue() in X86FrameLowering. This part is X86 specific. The second part is platform independent and ensures that: - CFI instructions do not affect code generation - Unwind information remains correct when a function is modified by different passes. This is done in a late pass by analyzing information about cfa offset and cfa register in BBs and inserting additional CFI directives where necessary. Changed CFI instructions so that they: - are duplicable - are not counted as instructions when tail duplicating or tail merging - can be compared as equal Added CFIInstrInserter pass: - analyzes each basic block to determine cfa offset and register valid at its entry and exit - verifies that outgoing cfa offset and register of predecessor blocks match incoming values of their successors - inserts additional CFI directives at basic block beginning to correct the rule for calculating CFA Having CFI instructions in function epilogue can cause incorrect CFA calculation rule for some basic blocks. This can happen if, due to basic block reordering, or the existence of multiple epilogue blocks, some of the blocks have wrong cfa offset and register values set by the epilogue block above them. CFIInstrInserter is currently run only on X86, but can be used by any target that implements support for adding CFI instructions in epilogue. Patch by Violeta Vukobrat. Differential Revision: https://reviews.llvm.org/D35844 llvm-svn: 317100	2017-11-01 16:04:11 +00:00
Guy Blank	92d5ce3bd4	[X86] Add a pass to convert instruction chains between domains. The pass scans the function to find instruction chains that define registers in the same domain (closures). It then calculates the cost of converting the closure to another domain. If found profitable, the instructions are converted to instructions in the other domain and the register classes are changed accordingly. This commit adds the pass infrastructure and a simple conversion from the GPR domain to the Mask domain. Differential Revision: https://reviews.llvm.org/D37251 Change-Id: Ic2cf1d76598110401168326d411128ae2580a604 llvm-svn: 316288	2017-10-22 11:43:08 +00:00
Craig Topper	a9cd59fb5d	[X86] Lower vselect with constant condition to vector_shuffle even with AVX512 instructions. Summary: It's better to use our shuffle lowering code to handle these than loading an immediate into a k-register. It really feels like this should be a DAG combine optimization rather than a lowering operation, but that's a problem for another day. Reviewers: RKSimon, delena, zvi Reviewed By: delena Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38932 llvm-svn: 315849	2017-10-15 06:39:07 +00:00
Reid Kleckner	ab23dace56	[MC] Suppress .Lcfi labels when emitting textual assembly Summary: This suppresses the generation of .Lcfi labels in our textual assembler. It was annoying that this generated cascading .Lcfi labels: llc foo.ll -o - \| llvm-mc \| llvm-mc After three trips through MCAsmStreamer, we'd have three labels in the output when none are necessary. We should only bother creating the labels and frame data when making a real object file. This supercedes D38605, which moved the entire .seh_ implementation into MCObjectStreamer. This has the advantage that we do more checking when emitting textual assembly, as a minor efficiency cost. Outputting textual assembly is not performance critical, so this shouldn't matter. Reviewers: majnemer, MatzeB Subscribers: qcolombet, nemanjai, javed.absar, eraman, hiraditya, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D38638 llvm-svn: 315259	2017-10-10 00:57:36 +00:00
Geoff Berry	fabedbad11	Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"" This reverts commit r314729. Another bug has been encountered in an out-of-tree target reported by Quentin. llvm-svn: 314814	2017-10-03 16:59:13 +00:00
Geoff Berry	bfc5fb4571	Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding" Issues addressed since original review: - Avoid bug in regalloc greedy/machine verifier when forwarding to use in an instruction that re-defines the same virtual register. - Fixed bug when forwarding to use in EarlyClobber instruction slot. - Fixed incorrect forwarding to register definitions that showed up in explicit_uses() iterator (e.g. in INLINEASM). - Moved removal of dead instructions found by LiveIntervals::shrinkToUses() outside of loop iterating over instructions to avoid instructions being deleted while pointed to by iterator. - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907. - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use. - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use). [MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. llvm-svn: 314729	2017-10-02 22:01:37 +00:00
Gadi Haber	87337a2bb9	[X86][SKX][KNL] Updated regression tests to use -mattr instead of -mcpu flag.NFC. NFC. Updated 8 regression tests to use -mattr instead of -mcpu flag as follows: -mcpu=knl --> -mattr=+avx512f -mcpu=skx --> -mattr=+avx512f,+avx512bw,+avx512vl,+avx512dq The updates are as part of the preparation of a large commit to add all instruction scheduling for the SKX target. Reviewers: delena, zvi, RKSimon Differential Revision: https://reviews.llvm.org/D38222 Change-Id: I2381c9b5bb75ecacfca017243c22d054f6eddd14 llvm-svn: 314306	2017-09-27 14:44:15 +00:00
Sam McCall	f71bb198ed	Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"" This crashes on boringSSL on PPC (will send reduced testcase) This reverts commit r312328. llvm-svn: 312490	2017-09-04 15:47:00 +00:00
Geoff Berry	65528f2991	Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding" Issues addressed since original review: - Moved removal of dead instructions found by LiveIntervals::shrinkToUses() outside of loop iterating over instructions to avoid instructions being deleted while pointed to by iterator. - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907. - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use. - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use). [MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. llvm-svn: 312328	2017-09-01 14:27:20 +00:00
Hans Wennborg	24775a0a6c	Revert r312154 "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"" It caused PR34387: Assertion failed: (RegNo < NumRegs && "Attempting to access record for invalid register number!") > Issues identified by buildbots addressed since original review: > - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907. > - The pass no longer forwards COPYs to physical register uses, since > doing so can break code that implicitly relies on the physical > register number of the use. > - The pass no longer forwards COPYs to undef uses, since doing so > can break the machine verifier by creating LiveRanges that don't > end on a use (since the undef operand is not considered a use). > > [MachineCopyPropagation] Extend pass to do COPY source forwarding > > This change extends MachineCopyPropagation to do COPY source forwarding. > > This change also extends the MachineCopyPropagation pass to be able to > be run during register allocation, after physical registers have been > assigned, but before the virtual registers have been re-written, which > allows it to remove virtual register COPY LiveIntervals that become dead > through the forwarding of all of their uses. llvm-svn: 312178	2017-08-30 22:11:37 +00:00
Geoff Berry	feffb0c8af	Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding" Issues identified by buildbots addressed since original review: - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907. - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use. - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use). [MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. llvm-svn: 312154	2017-08-30 18:41:07 +00:00
Craig Topper	641e2af9e8	[X86] Provide a separate feature bit for macro fusion support instead of basing it on the AVX flag Summary: Currently we determine if macro fusion is supported based on the AVX flag as a proxy for the processor being Sandy Bridge". This is really strange as now AMD supports AVX. It also means if user explicitly disables AVX we disable macro fusion. This patch adds an explicit macro fusion feature. I've also enabled for the generic 64-bit CPU (which doesn't have AVX) This is probably another candidate for being in the MI layer, but for now I at least wanted to correct the overloading of the AVX feature. Reviewers: spatel, chandlerc, RKSimon, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37280 llvm-svn: 312097	2017-08-30 04:34:48 +00:00
Gadi Haber	d76f7b824e	[X86][Haswell] Updating HSW instruction scheduling information This patch completely replaces the instruction scheduling information for the Haswell architecture target by modifying the file X86SchedHaswell.td located under the X86 Target. We used the scheduling information retrieved from the Haswell architects in order to replace and modify the existing scheduling. The patch continues the scheduling replacement effort started with the SNB target in r307529 and r310792. Information includes latency, number of micro-Ops and used ports by each HSW instruction. Please expect some performance fluctuations due to code alignment effects. Reviewers: RKSimon, zvi, aymanmus, craig.topper, m_zuckerman, igorb, dim, chandlerc, aaboud Differential Revision: https://reviews.llvm.org/D36663 llvm-svn: 311879	2017-08-28 10:04:16 +00:00
Geoff Berry	bd47e8a4f7	Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" round 2 This reverts commit r311135. sanitizer-x86_64-linux-android buildbot is timing out with just this patch applied. llvm-svn: 311142	2017-08-18 01:43:11 +00:00
Geoff Berry	51f52c4fca	Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding" Two issues identified by buildbots were addressed: - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use. - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use). [MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. Reviewers: qcolombet, javed.absar, MatzeB, jonpa Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny Differential Revision: https://reviews.llvm.org/D30751 llvm-svn: 311135	2017-08-17 23:06:55 +00:00
Geoff Berry	4e38e02e6f	Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" This reverts commit r311038. Several buildbots are breaking, and at least one appears to be due to the forwarding of physical regs enabled by this change. Reverting while I investigate further. llvm-svn: 311062	2017-08-17 04:04:11 +00:00
Geoff Berry	87f8d25150	[MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. Reviewers: qcolombet, javed.absar, MatzeB, jonpa Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny Differential Revision: https://reviews.llvm.org/D30751 llvm-svn: 311038	2017-08-16 20:50:01 +00:00
Dinar Temirbulatov	a0beedef1c	[X86] SET0 to use XMM registers where possible PR26018 PR32862 Differential Revision: https://reviews.llvm.org/D35965 llvm-svn: 309926	2017-08-03 08:50:18 +00:00
Matthias Braun	6b898beb8e	X86: Do not use llc -march in tests. `llc -march` is problematic because it only switches the target architecture, but leaves the operating system unchanged. This occasionally leads to indeterministic tests because the OS from LLVM_DEFAULT_TARGET_TRIPLE is used. However we can simply always use `llc -mtriple` instead. This changes all the tests to do this to avoid people using -march when they copy and paste parts of tests. See also the discussion in https://reviews.llvm.org/D35287 llvm-svn: 309774	2017-08-02 00:28:10 +00:00
Dinar Temirbulatov	aead31a36f	[X86] SET0 to use XMM registers where possible PR26018 PR32862 Differential Revision: https://reviews.llvm.org/D35839 llvm-svn: 309298	2017-07-27 17:47:01 +00:00
Nirav Dave	df86d2d008	[DAG] Handle missing transform in fold of value extension case. Summary: When pushing an extension of a constant bitwise operator on a load into the load, change other uses of the load value if they exist to prevent the old load from persisting. Reviewers: spatel, RKSimon, efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35030 llvm-svn: 308618	2017-07-20 13:57:32 +00:00
Nirav Dave	07871007aa	[DAG] Avoid deleting nodes before combining them. When replacing a node and it's operand, replacing the operand node may cause the deletion of the original node leading to an assertion failure. Case around these replacements to avoid this without relying on inspecting the DELETED_NODE opcode in various extend dagcombiner cases. Fixes PR32515. Reviewers: dbabokin, RKSimon, davide, chandlerc Subscribers: chandlerc, llvm-commits Differential Revision: https://reviews.llvm.org/D34095 llvm-svn: 308330	2017-07-18 17:39:15 +00:00
Michael Zuckerman	f66840020c	Reverting commit 306414 on behalf of @gadi.haber llvm-svn: 306532	2017-06-28 11:23:31 +00:00
Gadi Haber	13759a7ed6	Updated and extended the information about each instruction in HSW and SNB to include the following data: •static latency •number of uOps from which the instructions consists •all ports used by the instruction Reviewers:  RKSimon zvi aymanmus m_zuckerman Differential Revision: https://reviews.llvm.org/D33897 llvm-svn: 306414	2017-06-27 15:05:13 +00:00
Zvi Rackover	845ca8fba9	[X86] Rerun the update_llc_test_checks tool on test. NFC. llvm-svn: 305897	2017-06-21 11:21:43 +00:00
Guy Blank	548e22a1a7	[X86][AVX512] Make i1 illegal in the CodeGen This patch defines the i1 type as illegal in the X86 backend for AVX512. For DAG operations on <N x i1> types (build vector, extract vector element, ...) i8 is used, and should be truncated/extended. This should produce better scalar code for i1 types since GPRs will be used instead of mask registers. Differential Revision: https://reviews.llvm.org/D32273 llvm-svn: 303421	2017-05-19 12:35:15 +00:00

1 2 3

139 Commits