llvm-project

Commit Graph

Author	SHA1	Message	Date
Sander de Smalen	886510f350	[TableGen][AsmMatcherEmitter] Generate assembler checks for tied operands Summary: This extends TableGen's AsmMatcherEmitter with code that generates a table with tied-operand constraints. The constraints are checked when parsing the instruction. If an operand is not equal to its tied operand, the assembler will give an error. Patch [2/3] in a series to add operand constraint checks for SVE's predicated ADD/SUB. Reviewers: olista01, rengolin, mcrosier, fhahn, craig.topper, evandro, echristo Reviewed By: fhahn Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D41446 llvm-svn: 322166	2018-01-10 10:10:56 +00:00
Jonas Paulsson	1a76f3a2c2	Temporarily revert "[SystemZ] Check for legality before doing LOAD AND TEST transformations." , due to test failures. llvm-svn: 322165	2018-01-10 10:05:55 +00:00
Diana Picus	8f14886630	[ARM GlobalISel] Legalize s32/s64 G_FCONSTANT Legal for hard float. Change to G_CONSTANT for soft float (but preserve the binary representation). llvm-svn: 322164	2018-01-10 10:01:49 +00:00
Diana Picus	734a5e8912	[ARM GlobalISel] Legalize G_CONSTANT for scalars > 32 bits Make G_CONSTANT narrow for any scalars larger than 32 bits. llvm-svn: 322162	2018-01-10 09:32:01 +00:00
Jonas Paulsson	d9dde1ac56	[SystemZ] Check for legality before doing LOAD AND TEST transformations. Since a load and test instruction treats its operands as signed, it can only replace a logical compare for EQ/NE uses. Review: Ulrich Weigand https://bugs.llvm.org/show_bug.cgi?id=35662 llvm-svn: 322161	2018-01-10 09:18:17 +00:00
Stefan Pintilie	1712700842	[PowerPC] Manually schedule the prologue and epilogue This patch makes the following changes to the schedule of instructions in the prologue and epilogue. The stack pointer update is moved down in the prologue so that the callee saves do not have to wait for the update to happen. Saving the lr is moved down in the prologue to hide the latency of the mflr. The stack pointer is moved up in the epilogue so that restoring of the lr can happen sooner. The mtlr is moved up in the epilogue so that it is away from the blr at the end of the epilogue. The latency of the mtlr can now be hidden by the loads of the callee saved registers. This commit is almost identical to this one: r322036 except that two warnings that broke build bots have been fixed. The revision number is D41737 as before. llvm-svn: 322124	2018-01-09 21:57:49 +00:00
Tim Renouf	6eaad1e539	[AMDGPU] Fixed incorrect uniform branch condition Summary: I had a case where multiple nested uniform ifs resulted in code that did v_cmp comparisons, combining the results with s_and_b64, s_or_b64 and s_xor_b64 and using the resulting mask in s_cbranch_vccnz, without first ensuring that bits for inactive lanes were clear. There was already code for inserting an "s_and_b64 vcc, exec, vcc" to clear bits for inactive lanes in the case that the branch is instruction selected as s_cbranch_scc1 and is then changed to s_cbranch_vccnz in SIFixSGPRCopies. I have added the same code into SILowerControlFlow for the case that the branch is instruction selected as s_cbranch_vccnz. This de-optimizes the code in some cases where the s_and is not needed, because vcc is the result of a v_cmp, or multiple v_cmp instructions combined by s_and/s_or. We should add a pass to re-optimize those cases. Reviewers: arsenm, kzhuravl Subscribers: wdng, yaxunl, t-tye, llvm-commits, dstuttard, timcorringham, nhaehnle Differential Revision: https://reviews.llvm.org/D41292 llvm-svn: 322119	2018-01-09 21:34:43 +00:00
Alexey Bataev	771ec9f399	[COST]Fix PR35865: Fix cost model evaluation for shuffle on X86. Summary: If the vector type is transformed to non-vector single type, the compiler may crash trying to get vector information about non-vector type. Reviewers: RKSimon, spatel, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41862 llvm-svn: 322106	2018-01-09 19:08:22 +00:00
Derek Schuff	e9c278ccf1	[WebAssembly] Update libcall signature lists New signatures added in r322087. A fix for this tight coupling is forthcoming. llvm-svn: 322105	2018-01-09 19:05:34 +00:00
Craig Topper	c4d2dd80b6	[X86] Add a DAG combine to combine (sext (setcc)) with VLX Normally target independent DAG combine would do this combine based on getSetCCResultType, but with VLX getSetCCResultType returns a vXi1 type preventing the DAG combining from kicking in. But doing this combine can allow us to remove the explicit sign extend that would otherwise be emitted. This patch adds a target specific DAG combine to combine the sext+setcc when the result type is the same size as the input to the setcc. I've restricted this to FP compares and things that can be represented with PCMPEQ and PCMPGT since we don't have full integer compare support on the older ISAs. Differential Revision: https://reviews.llvm.org/D41850 llvm-svn: 322101	2018-01-09 18:14:22 +00:00
Francis Visoiu Mistrih	7d9bef8f5c	[CodeGen] Don't print "pred:" and "opt:" in -debug output In -debug output we print "pred:" whenever a MachineOperand is a predicate operand in the instruction descriptor, and "opt:" whenever a MachineOperand is an optional def in the instruction descriptor. Differential Revision: https://reviews.llvm.org/D41870 llvm-svn: 322096	2018-01-09 17:31:07 +00:00
Sander de Smalen	906a5deace	Recommit r322073: [AArch64][SVE] Asm: Add predicated ADD/SUB instructions Fixed issue that was found on sanitizer-x86_64-linux-fast. I changed the result type of 'Parser.getTok().getString().lower()' in AArch64AsmParser::tryParseSVEPredicateVector() from 'StringRef' to 'auto', since StringRef::lower() returns a std::string. llvm-svn: 322092	2018-01-09 17:01:27 +00:00
Sander de Smalen	6595603187	Reverted r322073 because of AddressSanitizer failure on sanitizer-x86_64-linux-fast builder. llvm-svn: 322077	2018-01-09 13:51:09 +00:00
Sander de Smalen	1f97363e5f	[AArch64][SVE] Asm: Add predicated ADD/SUB instructions Summary: Add the predicated ADD/SUB instructions and corresponding tests. Patch [3/3] in a series to add predicated ADD/SUB instructions for SVE. Reviewers: rengolin, mcrosier, evandro, fhahn, echristo Reviewed By: fhahn Subscribers: aemerson, javed.absar, tschuett, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D41443 llvm-svn: 322073	2018-01-09 12:43:46 +00:00
Sander de Smalen	7868e74033	[AArch64][SVE] Asm: Add parsing of merging/zeroing suffix for SVE predicate vector operands Summary: Parsing of the '/m' (merging) or '/z' (zeroing) suffix of a predicate operand. Patch [2/3] in a series to add predicated ADD/SUB instructions for SVE. Reviewers: rengolin, mcrosier, evandro, fhahn, echristo, MatzeB, t.p.northover Reviewed By: fhahn Subscribers: t.p.northover, MatzeB, aemerson, javed.absar, tschuett, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D41442 llvm-svn: 322070	2018-01-09 11:17:06 +00:00
Nikolai Bozhenov	eededdade9	[Nios2] Arithmetic instructions for R1 and R2 ISA. Summary: This commit enables some of the arithmetic instructions for Nios2 ISA (for both R1 and R2 revisions), implements facilities required to emit those instructions and provides LIT tests for added instructions. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D41236 Author: belickim <mateusz.belicki@intel.com> llvm-svn: 322069	2018-01-09 11:15:08 +00:00
Oren Ben Simhon	1c6308ecd5	Instrument Control Flow For Indirect Branch Tracking CET (Control-Flow Enforcement Technology) introduces a new mechanism called IBT (Indirect Branch Tracking). According to IBT, each Indirect branch should land on dedicated ENDBR instruction (End Branch). The new pass adds ENDBR instructions for every indirect jmp/call (including jumps using jump tables / switches). For more information, please see the following: https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf Differential Revision: https://reviews.llvm.org/D40482 Change-Id: Icb754489faf483a95248f96982a4e8b1009eb709 llvm-svn: 322062	2018-01-09 08:51:18 +00:00
Craig Topper	def1c30c66	[X86] Allow more cmpps/pd immediate encodings to be commuted during isel. The code that checks the immediate wasn't masking to the lower 3-bits like the code in X86InstrInfo.cpp that's used by the peephole pass does. llvm-svn: 322060	2018-01-09 07:09:34 +00:00
Sean Fertile	33a17762bb	[PowerPC] Can not assume an intrinsic argument is a simple type. The CTRLoop pass performs checks on the argument of certain libcalls/intrinsics, and assumes the arguments must be of a simple type. This isn't always the case though. For example if we unroll and vectorize a loop we may end up with vectors larger than the largest legal type, along with intrinsics that operate on those wider types. This happened in the ffmpeg build, where we unrolled a loop and ended up with a sqrt intrinsic that operated on V16f64, triggering an assertion. Differential Revision: https://reviews.llvm.org/D41758 llvm-svn: 322055	2018-01-09 03:03:41 +00:00
Eric Christopher	c44717774a	Remove unused function HvxSelector::zerous. llvm-svn: 322053	2018-01-09 02:38:17 +00:00
Stefan Pintilie	7e10987b12	Revert "[PowerPC] Manually schedule the prologue and epilogue" [PowerPC] This reverts commit r322036. Failing build bots. Revert the commit now. llvm-svn: 322051	2018-01-09 01:06:21 +00:00
Craig Topper	cc342d465e	[X86] Remove llvm.x86.avx512.cvt2mask. intrinsics and autoupgrade to (icmp slt X, 0) I had to drop fast-isel-abort from a test because we can't fast isel some of the mask stuff. When we used intrinsics we implicitly fell back to SelectionDAG for the intrinsic call without triggering the abort error. But with native IR that doesn't happen the same way. llvm-svn: 322050	2018-01-09 00:50:47 +00:00
Craig Topper	7c2abdd249	[X86] Remove unnecessary isel pattern that is a combination of two other patterns. The pattern was this def : Pat<(i32 (zext (i8 (bitconvert (v8i1 VK8:$src))))), (MOVZX32rr8 (EXTRACT_SUBREG (i32 (COPY_TO_REGCLASS VK8:$src, GR32)), sub_8bit))>, Requires<[NoDQI]>; but if you just let (i32 (zext X)) match by itself you'll get MOVZX32rr8. And if you let (i8 (bitconvert (v8i1 VK8:$src))) match by itself you'll get (EXTRACT_SUBREG (i32 (COPY_TO_REGCLASS VK8:$src, GR32)), sub_8bit). So we can just let isel do the two patterns naturally. llvm-svn: 322049	2018-01-09 00:50:42 +00:00
Jessica Paquette	3291e7353e	[MachineOutliner] AArch64: Handle instrs that use SP and will never need fixups This commit does two things. Firstly, it adds a collection of flags which can be passed along to the target to encode information about the MBB that an instruction lives in to the outliner. Second, it adds some of those flags to the AArch64 outliner in order to add more stack instructions to the list of legal instructions that are handled by the outliner. The two flags added check if - There are calls in the MachineBasicBlock containing the instruction - The link register is available in the entire block If the link register is available and there are no calls, then a stack instruction can always be outlined without fixups, regardless of what it is, since in this case, the outliner will never modify the stack to create a call or outlined frame. The motivation for doing this was checking which instructions are most often missed by the outliner. Instructions like, say %sp<def> = ADDXri %sp, 32, 0; flags: FrameDestroy are very common, but cannot be outlined in the case that the outliner might modify the stack. This commit allows us to outline instructions like this. llvm-svn: 322048	2018-01-09 00:26:18 +00:00
Stefan Pintilie	55bfdd040a	[PowerPC] Manually schedule the prologue and epilogue This patch makes the following changes to the schedule of instructions in the prologue and epilogue. The stack pointer update is moved down in the prologue so that the callee saves do not have to wait for the update to happen. Saving the lr is moved down in the prologue to hide the latency of the mflr. The stack pointer is moved up in the epilogue so that restoring of the lr can happen sooner. The mtlr is moved up in the epilogue so that it is away from the blr at the end of the epilogue. The latency of the mtlr can now be hidden by the loads of the callee saved registers. Differential Revision: https://reviews.llvm.org/D41737 llvm-svn: 322036	2018-01-08 22:23:10 +00:00
Aleksandar Beserminji	a734d409c6	[mips] Remove duplicated R6 EVA instructions This patch removes duplicated EVA instructions in R6. Differential Revision: https://reviews.llvm.org/D41769 llvm-svn: 322007	2018-01-08 16:50:33 +00:00
Momchil Velikov	ac7c5c1d92	[ARM] Fix PR35379 - incorrect unwind information when compiling with -Oz The patch makes the unwind information not mention registers, which were pushed solely for the purpose of saving stack adjustment instructions. Differential revision: https://reviews.llvm.org/D41300 Fixes https://bugs.llvm.org/show_bug.cgi?id=35379 llvm-svn: 321996	2018-01-08 14:47:19 +00:00
Jonas Paulsson	22f208f034	[SystemZ] Comment fix in SystemZElimCompare.cpp NFC Review: Ulrich Weigand llvm-svn: 321990	2018-01-08 12:52:40 +00:00
Momchil Velikov	d17dabca31	[ARM] Fix PR35481 This patch allows `r7` to be used, regardless of its use as a frame pointer, as a temporary register when popping `lr`, and also falls back to using a high temporary register if, for some reason, we weren't able to find a suitable low one. Differential revision: https://reviews.llvm.org/D40961 Fixes https://bugs.llvm.org/show_bug.cgi?id=35481 llvm-svn: 321989	2018-01-08 11:32:37 +00:00
Francis Visoiu Mistrih	d52da12822	[X86] Remove side-effects from determineCalleeSaves (Target)FrameLowering::determineCalleeSaves can be called multiple times. I don't think it should have side-effects as creating stack objects and setting global MachineFunctionInfo state as it is doing today (in other back-ends as well). This moves the creation of stack objects from determineCalleeSaves to assignCalleeSavedSpillSlots. Differential Revision: https://reviews.llvm.org/D41703 llvm-svn: 321987	2018-01-08 10:46:05 +00:00
Craig Topper	f090e8a89a	[X86] Replace CVT2MASK ISD opcode with PCMPGTM compared to zero. CVT2MASK is just checking the sign bit which can be represented with a comparison with zero. llvm-svn: 321985	2018-01-08 06:53:54 +00:00
Craig Topper	a2018e799a	[X86] Add patterns to allow 512-bit BWI compare instructions to be used for 128/256-bit compares when VLX is not available. llvm-svn: 321984	2018-01-08 06:53:52 +00:00
Craig Topper	9f5859e3ee	[X86] Simplify some code in lower1BitVectorShuffle by relying on getNode's ability to constant fold vector SIGN_EXTEND. llvm-svn: 321979	2018-01-07 23:56:37 +00:00
Craig Topper	03d8e516cf	[X86] Add VSHUFF32X4 and similar instructions to load folding tables. llvm-svn: 321978	2018-01-07 23:30:20 +00:00
Craig Topper	e9f44e1b80	[X86] Revert accidental change to CMakeLists.txt in r321952 I had removed the qualifiers around the autogenerated folding table so I could compare with the manual table, but didn't intend to commit the change. llvm-svn: 321971	2018-01-07 21:03:43 +00:00
Craig Topper	c1ec57c3e2	[X86] Remove unneeded code from combineGatherScatter that used to delete SIGN_EXTEND_INREG nodes created during legalization of v2i1/v4i1 masks on KNL. v2i1/v4i1 are now legal on KNL so no sign_extend_inreg is generated. llvm-svn: 321968	2018-01-07 18:34:08 +00:00
Craig Topper	d58c165545	[X86] Make v2i1 and v4i1 legal types without VLX Summary: There are a few oddities that occur due to v1i1, v8i1, v16i1 being legal without v2i1 and v4i1 being legal when we don't have VLX. Particularly during legalization of v2i32/v4i32/v2i64/v4i64 masked gather/scatter/load/store. We end up promoting the mask argument to these during type legalization and then have to widen the promoted type to v8iX/v16iX and truncate it to get the element size back down to v8i1/v16i1 to use a 512-bit operation. Since we need to fill the upper bits of the mask we have to fill with 0s at the promoted type. It would be better if we could just have the v2i1/v4i1 types as legal so they don't undergo any promotion. Then we can just widen with 0s directly in a k register. There are no real v4i1/v2i1 instructions anyway. Everything is done on a larger register anyway. This also fixes an issue where we couldn't implement a masked vextractf32x4 from zmm to xmm properly. We now have to support widening more compares to 512-bit to get a mask result out so new tablegen patterns got added. I had to hack the legalizer for widening the operand of a setcc a bit so it didn't try to create a setcc returning v4i32, extract from it, then try to promote it using a sign extend to v2i1. Now we create the setcc with v4i1 if the original setcc's result type is v2i1. Then extract that and don't sign extend it at all. There's definitely room for improvement with some follow up patches. Reviewers: RKSimon, zvi, guyblank Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41560 llvm-svn: 321967	2018-01-07 18:20:37 +00:00
Craig Topper	d461aefe5f	[PowerPC] Add an ISD::TRUNCATE to the legalization for ppc_is_decremented_ctr_nonzero Summary: I believe legalization is really expecting that ReplaceNodeResults will return something with the same type as the thing that's being legalized. Ultimately, it uses the output to replace the uses in the DAG so the type should match to make that work. There are two relevant cases here. When crbits are enabled, then i1 is a legal type and getSetCCResultType should return i1. In this case, the truncate will be between i1 and i1 and should be removed (SelectionDAG::getNode does this). Otherwise, getSetCCResultType will be i32 and the legalizer will promote the truncate to be i32 -> i32 which will be similarly removed. With this fixed we can remove some code from PromoteIntRes_SETCC that seemed to only exist to deal with the intrinsic being replaced with a larger type without changing the other operand. With the truncate being used for connectivity this doesn't happen anymore. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: nemanjai, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D41654 llvm-svn: 321959	2018-01-07 07:51:36 +00:00
Craig Topper	a21f551109	[X86] Add the 16 and 8-bit CRC32 instructions to the load folding tables. llvm-svn: 321958	2018-01-07 06:48:20 +00:00
Craig Topper	d0859a03b5	[X86] Correct the load folding flags for xmm fp->mmx conversion instructions. The instructions that load 64-bits or an xmm register should be TB_NO_REVERSE to avoid the load being widened during unfold. The instructions that load 128-bits need to ensure 128-bit alignment. llvm-svn: 321956	2018-01-07 06:24:30 +00:00
Craig Topper	aa73941176	[X86] Add TB_NO_REVERSE to some scalar intrinsic instructions in the load folding table. llvm-svn: 321955	2018-01-07 06:24:29 +00:00
Craig Topper	85657d59a9	[X86] Don't put any EVEX_B instructions in the tablegen generated load folding tables. EVEX_B means different things for memory and register forms. The instructions should not be considered equivalent. llvm-svn: 321954	2018-01-07 06:24:28 +00:00
Craig Topper	89293a2a94	[X86] Add 128 and 256-bit VPOPCNTD/Q instructions to load folding tables. llvm-svn: 321953	2018-01-07 06:24:27 +00:00
Craig Topper	a124ab10ef	[X86] Add some 8 and 16-bit instructions to the load folding tables. llvm-svn: 321952	2018-01-07 06:24:25 +00:00
Craig Topper	11aede13db	[X86] Add EVEX vcvtph2ps to the load folding tables. llvm-svn: 321951	2018-01-07 06:24:24 +00:00
Craig Topper	40cc8338f7	[X86] Remove cvtps2ph xmm->xmm from store folding tables. Add the evex versions of cvtps2ph to the store folding tables. The memory form of the xmm->xmm version only writes 64-bits. If we use it in the folding tables and it gets used for a stack spill, only half the slot will be written. Then a reload may read all 128-bits which will pull in garbage. But without the spill the upper bits of the register would have been zero. By not folding we would preserve the zeros. llvm-svn: 321950	2018-01-07 06:24:23 +00:00
Craig Topper	8fa800b834	[X86] Add CMP8ri8 to load folding tables. llvm-svn: 321949	2018-01-07 06:24:21 +00:00
Craig Topper	cf93feb981	[X86] Remove assembler predicates from all AVX512 related feature flags. We don't do fine grained feature control like this on features prior to AVX512. We do still have checks in place in the assembly parser itself that prevents %zmm references or %xmm16-31 from being parsed without at least -mattr=avx512f. Same for rounding control and mask operands. That will prevent the table matcher from matching for any instructions that need those features and that's probably good enough. llvm-svn: 321947	2018-01-06 21:45:30 +00:00
Craig Topper	61d8a60e23	[X86] Remove memory forms of EVEX encoded vcvttss2si/vcvttsd2si from asm matcher table. This is also needed to fix PR35837. llvm-svn: 321946	2018-01-06 21:27:25 +00:00
Craig Topper	0f4ccb7806	[X86] Add load folding pattern to EVEX vcvttss2si/vcvtsd2si. llvm-svn: 321945	2018-01-06 21:02:26 +00:00
Craig Topper	90353a9f42	[X86] Remove an unnecessary VCVTTSD2SIrrb/VCVTSS2SIrrb instruction with no isel pattern that only existed for the assembler. Use VCVTTSD2SIrrb_Int instead. For consistency use the _Int version of VCVTTSD2SIrr_Int and VCVTTSD2SIrm_Int for the assembler as well. llvm-svn: 321944	2018-01-06 21:02:22 +00:00
Craig Topper	a49c354a08	[X86] Remove memory forms of EVEX encoded vcvtsd2si/vcvtss2si from the assembler matcher table We should always prefer the VEX encoded version of these instructions. There is no advantage to the EVEX version. Fixes PR35837. llvm-svn: 321939	2018-01-06 19:20:33 +00:00
Sanjay Patel	5a48aef3f0	[x86, MemCmpExpansion] allow 2 pairs of loads per block (PR33325) This is the last step needed to fix PR33325: https://bugs.llvm.org/show_bug.cgi?id=33325 We're trading branch and compares for loads and logic ops. This makes the code smaller and hopefully faster in most cases. The 24-byte test shows an interesting construct: we load the trailing scalar elements into vector registers and generate the same pcmpeq+movmsk code that we expected for a pair of full vector elements (see the 32- and 64-byte tests). Differential Revision: https://reviews.llvm.org/D41714 llvm-svn: 321934	2018-01-06 16:16:04 +00:00
Craig Topper	b18d6221ba	[X86] Rename the EVEX encoded GFNI instructions to start with a 'V'. NFC This makes the names consistent with the mnemonics like every other instruction. llvm-svn: 321931	2018-01-06 07:18:08 +00:00
Craig Topper	36d8da3358	[X86] When parsing rounding mode operands, provide a proper end location so we don't crash when trying to print an error message using it. llvm-svn: 321930	2018-01-06 06:41:07 +00:00
Craig Topper	8c2ea74e74	[X86] Call lowerShuffleAsRepeatedMaskAndLanePermute from lowerV4I64VectorShuffle. llvm-svn: 321929	2018-01-06 06:08:04 +00:00
Craig Topper	e2659d8383	[X86] Add vcvtsd2sil/vcvtsd2siq etc. InstAliases to the EVEX-encoded instructions. This matches their VEX equivalents. llvm-svn: 321912	2018-01-05 23:13:54 +00:00
Krzysztof Parzyszek	b0b52618c0	[Hexagon] Even simpler patterns for sign- and zero-extending HVX vectors Recommit r321897 with updated testcases. llvm-svn: 321908	2018-01-05 22:31:11 +00:00
Krzysztof Parzyszek	4ed8ef6f8e	Revert r321894: it requires a part of another commit that is not ready yet Commit message: [Hexagon] Add patterns for sext_inreg of HVX vector types llvm-svn: 321904	2018-01-05 21:57:43 +00:00
Craig Topper	29476ab0bd	[X86] Add InstAliases for 'vmovd' with GR64 registers to select EVEX encoded instructions as well. Without this we allow "vmovd %rax, %xmm0", but not "vmovd %rax, %xmm16". This exists due to a silly bug where really old versions of the GNU assembler required movd instead of movq on these instructions. This compatibility hack then crept forward to the avx version too, but we didn't propagate it to avx512. llvm-svn: 321903	2018-01-05 21:57:23 +00:00
Krzysztof Parzyszek	9920dab75e	Revert r321897: affected testcases were not updated Commit message: [Hexagon] Even simpler patterns for sign- and zero-extending HVX vectors llvm-svn: 321902	2018-01-05 21:50:15 +00:00
Craig Topper	004867312e	[X86] Stop printing moves between VR64 and GR64 with 'movd' mnemonic. Use 'movq' instead. This behavior existed to work with an old version of the gnu assembler on MacOS that only accepted this form. Newer versions of GNU assembler and the current LLVM derived version of the assembler on MacOS support movq as well. llvm-svn: 321898	2018-01-05 20:55:12 +00:00
Krzysztof Parzyszek	577d2f2fbd	[Hexagon] Even simpler patterns for sign- and zero-extending HVX vectors llvm-svn: 321897	2018-01-05 20:49:26 +00:00
Krzysztof Parzyszek	f9d01a12d1	[Hexagon] Add patterns for truncating HVX vector types Only non-bool vectors. llvm-svn: 321895	2018-01-05 20:48:03 +00:00
Krzysztof Parzyszek	9d0c6355a0	[Hexagon] Add patterns for sext_inreg of HVX vector types llvm-svn: 321894	2018-01-05 20:46:41 +00:00
Krzysztof Parzyszek	0f5d976aa0	[Hexagon] Add a bitcast to required type in LowerHvxMul llvm-svn: 321893	2018-01-05 20:45:34 +00:00
Krzysztof Parzyszek	66ee123d61	[Hexagon] Add pattern for vsplat to v8i8 llvm-svn: 321892	2018-01-05 20:43:56 +00:00
Krzysztof Parzyszek	b3e50ac1c4	[Hexagon] Set boolean contents in HexagonISelLowering llvm-svn: 321891	2018-01-05 20:41:50 +00:00
Reid Kleckner	5619669a5a	Fix -Wsign-compare warnings on Windows These arise because enums are 'int' by default. llvm-svn: 321887	2018-01-05 19:53:51 +00:00
Momchil Velikov	7efdd090e2	[ARM] Issue an error when non-general-purpose registers are used in address operands Currently the assembler would accept, e.g. `ldr r0, [s0, #12]` and similar. This patch adds checks that only general-purpose registers are used in address operands, shifted registers, and shift amounts. Differential revision: https://reviews.llvm.org/D39910 llvm-svn: 321866	2018-01-05 13:28:10 +00:00
Evandro Menezes	6161a0b3b0	[AArch64] Improve code generation of vector build Instead of using, for example, `dup v0.4s, wzr`, which transfers between register files, use the more efficient `movi v0.4s, #0` instead. Differential revision: https://reviews.llvm.org/D41515 llvm-svn: 321824	2018-01-04 21:43:12 +00:00
Craig Topper	dffb98e03d	[X86] Correct the execution domain for AVX1 VBROADCASTF128 to be FP instead of integer. llvm-svn: 321821	2018-01-04 20:56:21 +00:00
Oliver Stannard	7d9198b296	[ARM] Fix endianness of Thumb .inst.w directive Wide Thumb2 instructions should be emitted into the object file as pairs of 16-bit words of the appropriate endianness, not one 32-bit word. Differential revision: https://reviews.llvm.org/D41185 llvm-svn: 321799	2018-01-04 13:56:40 +00:00
Krzysztof Parzyszek	b1b2960336	[Hexagon] Replace INSERTRP/EXTRACTRP with INSERT/EXTRACT in HexagonISD llvm-svn: 321798	2018-01-04 13:56:04 +00:00
Diana Picus	865f7fecb2	[ARM GlobalISel] Select G_PHI Select G_PHI to PHI and manually constrain the result register. This is very similar to how COPY is handled, so extract and reuse some of that code. llvm-svn: 321797	2018-01-04 13:09:25 +00:00
Diana Picus	c768bbe2e7	[ARM GlobalISel] Legalize scalar G_PHI Mark G_PHI as Legal for s32 and p0, and also for s64 if we have hard float. Widen any smaller types. llvm-svn: 321795	2018-01-04 13:09:14 +00:00
Diana Picus	37ae9f68a4	[ARM GlobalISel] Fix selection of pointer constants We used to handle G_CONSTANT with pointer type by forcing the type of the result register to s32 and then letting TableGen handle it. Unfortunately, setting the type only works for generic virtual registers, that haven't yet been constrained to a register class (e.g. those used only by a COPY later on). If the result register has already been constrained as a use of a previously selected instruction, then setting the type will assert. It would be nice to be able to teach TableGen to select pointer constants the same as integer constants, but since it's such an edge case (at the moment the only pointer constant that we're generally interested in is 0, and that is mostly used for comparisons and selects, which are also not supported by TableGen) it's probably not worth the effort right now. Instead, handle pointer constants with some trivial handwritten code. llvm-svn: 321793	2018-01-04 10:54:57 +00:00
Craig Topper	e6e9c27510	[X86] Remove 'else' after 'return' I forgot to cleanup before committing D41691. llvm-svn: 321755	2018-01-03 19:15:43 +00:00
Matt Arsenault	4ff5e002ea	AMDGPU: Remove dead file llvm-svn: 321752	2018-01-03 18:45:42 +00:00
Craig Topper	8232e88dd5	[X86] Remove useless custom inserter for 64-bit TAILJMP and TCRETURN opcodes This custom inserter was added in r124272 at which time it added a bunch of Defs for Win64. In r150708, those defs were removed leaving only the "return BB". So I think this means the custom inserter is a NOP these days. This patch removes the remaining code and stops tagging the instructions for custom insertion. Differential Revision: https://reviews.llvm.org/D41671 llvm-svn: 321747	2018-01-03 18:20:36 +00:00
Craig Topper	cc6637b707	[X86] Use ANY_EXTEND instead of SIGN_EXTEND in lowerMasksToReg Currently we use SIGN_EXTEND in lowerMasksToReg as part of calling convention setup, but we don't require a specific value for the upper bits. This patch changes it to ANY_EXTEND which will be lowered as SIGN_EXTEND if it ends up sticking around. llvm-svn: 321746	2018-01-03 18:11:01 +00:00
Hans Wennborg	7998549040	Remove left-over debug printout from r321692 Besides the unsightly print-out, it was causing some buildbots to fail, e.g. http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/9311 llvm-svn: 321711	2018-01-03 14:48:19 +00:00
Alex Bradbury	46db78b743	[ARM][NFC] Avoid recreating MCSubtargetInfo in ARMAsmBackend After D41349, we can now directly access MCSubtargetInfo from createARM*AsmBackend. This patch makes use of this, avoiding the need to create a fresh MCSubtargetInfo (which was previously always done with a blank CPU and feature string). Given the total size of the change remains pretty tiny and we're removing the old explicit destructor, I changed the STI field to a reference rather than a pointer. Differential Revision: https://reviews.llvm.org/D41693 llvm-svn: 321707	2018-01-03 13:46:21 +00:00
Sander de Smalen	dc5e081b93	[AArch64][SVE] Asm: Add restricted register classes for SVE predicate vectors. Summary: Add a register class for SVE predicate operands that can only be p0-p7 (as opposed to p0-p15) Patch [1/3] in a series to add predicated ADD/SUB instructions for SVE. Reviewers: rengolin, mcrosier, evandro, fhahn, echristo, olista01, SjoerdMeijer, javed.absar Reviewed By: fhahn Subscribers: aemerson, javed.absar, tschuett, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D41441 llvm-svn: 321699	2018-01-03 10:15:46 +00:00
Alex Bradbury	7c093bf1cf	Fix build of WebAssembly and AVR backends after r321692 As experimental backends, I didn't have them configured to build in my local build config. llvm-svn: 321696	2018-01-03 09:30:39 +00:00
Alex Bradbury	b22f751fa7	Thread MCSubtargetInfo through Target::createMCAsmBackend Currently it's not possible to access MCSubtargetInfo from a TgtMCAsmBackend. D20830 threaded an MCSubtargetInfo reference through MCAsmBackend::relaxInstruction, but this isn't the only function that would benefit from access. This patch removes the Triple and CPUString arguments from createMCAsmBackend and replaces them with MCSubtargetInfo. This patch just changes the interface without making any intentional functional changes. Once in, several cleanups are possible: * Get rid of the awkward MCSubtargetInfo handling in ARMAsmBackend * Support 16-bit instructions when valid in MipsAsmBackend::writeNopData * Get rid of the CPU string parsing in X86AsmBackend and just use a SubtargetFeature for HasNopl * Emit 16-bit nops in RISCVAsmBackend::writeNopData if the compressed instruction set extension is enabled (see D41221) This change initially exposed PR35686, which has since been resolved in r321026. Differential Revision: https://reviews.llvm.org/D41349 llvm-svn: 321692	2018-01-03 08:53:05 +00:00
Andrew Kaylor	e12e08c680	Handle the case of live 16-bit subregisters in X86FixupBWInsts Differential Revision: https://reviews.llvm.org/D40524 Change-Id: Ie3a405b28503ceae999f5f3ba07a68fa733a2400 llvm-svn: 321674	2018-01-02 21:04:38 +00:00
Sanjay Patel	9a80871ffe	[x86] allow pairs of PCMPEQ for vector-sized integer equality comparisons (PR33325) This is an extension of D31156 with the goal that we'll allow memcmp() == 0 expansion for x86 to use 2 pairs of loads per block. The memcmp expansion pass (formerly part of CGP) will generate this kind of pattern with oversized integer compares, so we want to transform these into x86-specific vector nodes before legalization splits things into scalar chunks. See PR33325 for more details: https://bugs.llvm.org/show_bug.cgi?id=33325 Differential Revision: https://reviews.llvm.org/D41618 llvm-svn: 321656	2018-01-02 16:38:29 +00:00
Amara Emerson	854d10d10b	[AArch64][GlobalISel] Enable GlobalISel at -O0 by default Tests updated to explicitly use fast-isel at -O0 instead of implicitly. This change also allows an explicit -fast-isel option to override an implicitly enabled global-isel. Otherwise -fast-isel would have no effect at -O0. Differential Revision: https://reviews.llvm.org/D41362 llvm-svn: 321655	2018-01-02 16:30:47 +00:00
Krzysztof Parzyszek	cfe4a3616f	[Hexagon] Fix generation of vector sign extensions llvm-svn: 321650	2018-01-02 15:28:49 +00:00
Sander de Smalen	c9b3e1cf03	[AArch64][AsmParser] Add isScalarReg() and repurpose isReg() Summary: isReg() in AArch64AsmParser.cpp is a bit of a misnomer, and would be better named 'isScalarReg()' instead. Patch [1/3] in a series to add operand constraint checks for SVE's predicated ADD/SUB. Reviewers: rengolin, mcrosier, evandro, fhahn, echristo Reviewed By: fhahn Subscribers: aemerson, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D41445 llvm-svn: 321646	2018-01-02 13:39:44 +00:00
Simon Pilgrim	39f50e103b	Strip trailing whitespace. NFCI llvm-svn: 321644	2018-01-02 12:41:29 +00:00
Alex Bradbury	8cb894b34b	[RISCV] Add Defs Uses information for c.jal and c.addi4spn Differential Revision: https://reviews.llvm.org/D41339 Patch by Shiva Chen. llvm-svn: 321643	2018-01-02 12:09:29 +00:00
Alex Bradbury	3633d1205f	[RISCV][NFC] Resolve unused variable warning in RISCVISelLowering XLenVT in LowerFormalArguments is used only in an assert. llvm-svn: 321642	2018-01-02 11:54:59 +00:00
Craig Topper	c8898b3640	[X86] Promote vXi1 fp_to_uint/fp_to_sint to vXi32 to avoid scalarization. llvm-svn: 321632	2018-01-01 21:12:18 +00:00
Craig Topper	e5943bb337	[X86] Replace custom lowering of vXi1 SINT_TO_FP/UINT_TO_FP with promotion. The custom lowering was just doing the same thing promotion would do. llvm-svn: 321630	2018-01-01 20:08:43 +00:00
Craig Topper	a4f9997675	[SelectionDAG][X86][AArch64] Require targets to specify the promotion type when using setOperationAction Promote for INT_TO_FP and FP_TO_INT Currently the promotion for these ignores the normal getTypeToPromoteTo and instead just tries to double the element width. This is because the default behavior of getTypeToPromoteTo just adds 1 to the SimpleVT, which has the effect of increasing the element count while keeping the scalar size the same. If multiple steps are required to get to a legal operation type, int_to_fp will be promoted multiple times. And fp_to_int will keep trying wider types in a loop until it finds one that works. getTypeToPromoteTo does have the ability to query a promotion map to get the type and not do the increasing behavior. It seems better to just let the target specify the promotion type in the map explicitly instead of letting the legalizer iterate via widening. FWIW, I think for any other vector operations that need to be promoted, we have to specify the type explicitly because the default behavior of getTypeToPromoteTo isn't useful for vectors. The other types of promotion already require that either the element count is constant or the total vector width is constant, but neither happens by incrementing the SimpleVT enum. Differential Revision: https://reviews.llvm.org/D40664 llvm-svn: 321629	2018-01-01 19:21:35 +00:00
Craig Topper	0d35edda90	[X86] In LowerTruncateVecI1, don't add SHL if the input is known to be all sign bits. If the input is all sign bits then the LSB through MSB are all the same so we don't need to move the LSB to the MSB. llvm-svn: 321617	2018-01-01 04:52:58 +00:00
Craig Topper	694c73adc2	[X86] Add missing NoVLX predicate around some patterns that use zmm registers to implement 128/256-bit operations without VLX. llvm-svn: 321613	2018-01-01 01:11:32 +00:00
Craig Topper	fc3ce4993c	[X86] Add patterns for using zmm registers for v8i32/v8f32 vselect with the false input being zero. We can use zmm move with zero masking for this. We already had patterns for using a masked move, but we didn't check for the zero masking case separately. llvm-svn: 321612	2018-01-01 01:11:29 +00:00
Craig Topper	f78b75fb59	[X86] Use CONCAT_VECTORS instead of INSERT_SUBVECTOR for padding v4i1/v2i1 vector to v8i1 pre-legalize. The CONCAT_VECTORS will be lowered to INSERT_SUBVECTOR later. In the modified cases this seems to be enough to trick a later DAG combine into running in a different order that allows the ANDs to be removed. I'll admit this is a bit of a hack that happens to work, but using CONCAT_VECTORS is more consistent with other legalization code anyway. llvm-svn: 321611	2017-12-31 19:17:52 +00:00
Simon Pilgrim	b000675374	[X86][AVX2] Combine extract(broadcast(scalar_value)) --> scalar_value As it has a scalar source we don't treat it as a target shuffle so needs special handling. llvm-svn: 321610	2017-12-31 18:59:30 +00:00
Simon Pilgrim	f205ec716b	[X86][SSE] Don't vectorize splat buildvector of binops (PR30780) Don't combine buildvector(binop(),binop(),binop(),binop()) -> binop(buildvector(), buildvector()) if it's a splat - keep the binop scalar and just splat the result to avoid large vector constants. llvm-svn: 321607	2017-12-31 17:07:47 +00:00
Craig Topper	f0f6eefb49	[X86] Add a DAG combine to widen (i4 (bitcast (v4i1))) before type legalization sees the i4 and changes to load/store. Same for v2i1 and i2. llvm-svn: 321602	2017-12-31 09:50:38 +00:00
Craig Topper	7f39623533	[X86] Add a DAG combine to fix (v4i1 (bitcast (i4))) before type legalization sees the i4 and changes to load/store. Same for i2 and v2i1. llvm-svn: 321601	2017-12-31 08:25:50 +00:00
Craig Topper	876ec0b558	[X86] Prevent combining (v8i1 (bitconvert (i8 load)))->(v8i1 load) if we don't have DQI. We end up using an i8 load via an isel pattern from v8i1 anyway. This just makes it more explicit. This seems to improve codegen in some cases and I'd like to kill off some of the load patterns. llvm-svn: 321598	2017-12-31 07:38:41 +00:00
Craig Topper	6159f5ebd8	[X86] Remove patterns for load/store of vXi with bitcasts to/from integer. This is better handled by a DAG combine if it's not already being done. No lit tests fail from the removal of these patterns. llvm-svn: 321597	2017-12-31 07:38:36 +00:00
Craig Topper	a362dee774	[X86] Remove AND32ri8 from pattern for v1i1 load. I don't think anything would actually expect the other bits to be zero. llvm-svn: 321596	2017-12-31 07:38:33 +00:00
Craig Topper	7ba1b76854	[X86] Fix a crash when returning a <1 x i1> value llvm-svn: 321595	2017-12-31 07:38:30 +00:00
Craig Topper	1d0e2e82bc	[X86] Cleanup store splitting in LowerTruncatingStore Use getMemBasePlusOffset and calculate proper pointer info and alignment for the second store. llvm-svn: 321594	2017-12-31 07:38:26 +00:00
Benjamin Kramer	c7fc81e659	Use phi ranges to simplify code. No functionality change intended. llvm-svn: 321585	2017-12-30 15:27:33 +00:00
Hiroshi Inoue	ca3cdd7f27	[PowerPC] fix a bug in TCO eligibility check If the callee and caller use different calling conventions, we cannot apply TCO if the callee requires arguments on stack; e.g. C calling convention and Fast CC use the same registers for parameter passing, but the stack offset is not necessarily the same. This patch also recommits r319218 "[PowerPC] Allow tail calls of fastcc functions from C CallingConv functions." by @sfertile since the problem reported in r320106 should be fixed. Differential Revision: https://reviews.llvm.org/D40893 llvm-svn: 321579	2017-12-30 08:09:04 +00:00
Craig Topper	97cc7b0377	[X86] Remove isel patterns for kshifts with types that don't support kshift natively. We should only be creating natively supported kshifts now. llvm-svn: 321577	2017-12-30 06:45:46 +00:00
Craig Topper	c5fd31a802	[X86] Custom legalize vXi1 extract_subvector with KSHIFTR. This allows us to remove some isel patterns. This is mostly NFC, but we now use KSHIFTB instead of KSHIFTW with DQI. llvm-svn: 321576	2017-12-30 06:45:43 +00:00
Simon Atanasyan	d41feef40f	[mips] Provide correct descriptions of asm constraints in the comments. NFC llvm-svn: 321566	2017-12-29 19:18:30 +00:00
Simon Atanasyan	970f686faa	[mips] Replace assert by an error message Previously, applying the `c` constraint to the wrong data type caused LLVM to assert. This commit replaces the assert by an error message. llvm-svn: 321565	2017-12-29 19:18:24 +00:00
Matt Arsenault	e19bc2ee0f	AMDGPU: Use unique PSVs for buffer resources Also fixes using the wrong memory type for some intrinsics when custom lowering them. llvm-svn: 321557	2017-12-29 17:18:21 +00:00
Matt Arsenault	d94b63d765	AMDGPU: Remove mayLoad/hasSideEffects from MIMG stores Atomics still have hasSideEffects set on them because of the mess that is the memory properties. llvm-svn: 321556	2017-12-29 17:18:18 +00:00
Matt Arsenault	905f3518ba	AMDGPU: Implement getTgtMemIntrinsic for images Currently all images are lowered to have a single image PseudoSourceValue. Image stores happen to have overly strict mayLoad/mayStore/hasSideEffects flags set on them, so this happens to work. When these are fixed to be correct, the scheduler breaks this because the identical PSVs are assumed to be the same address. These need to be unique to the image resource value. llvm-svn: 321555	2017-12-29 17:18:14 +00:00
Simon Pilgrim	c701596e86	[X86][SSE] Match PSHUFLW/PSHUFHW + PSHUFD vXi16 shuffle patterns (PR34686) As noted in PR34686, we are relying on a PSHUFD+PSHUFLW+PSHUFHW shuffle chain for most general vXi16 unary shuffles. This patch checks for simpler PSHUFLW+PSHUFD and PSHUFHW+PSHUFD cases beforehand, building on some existing code that just handled splat shuffles. By doing so we also prevent premature use of PSHUFB shuffles which can be slower and require the creation/loading of constant shuffle masks. We now have the 'fast-variable-shuffle' option for hardware that prefers combining 2 or more shuffles to VPSHUFB etc. Differential Revision: https://reviews.llvm.org/D38318 llvm-svn: 321553	2017-12-29 14:41:50 +00:00
Dmitry Preobrazhensky	414e05383f	[AMDGPU][MC] Incorrect parsing of flat/global atomic modifiers See bug 35730: https://bugs.llvm.org/show_bug.cgi?id=35730 Differential Revision: https://reviews.llvm.org/D41598 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 321552	2017-12-29 13:55:11 +00:00
Nemanja Ivanovic	4e1f5e0734	[PowerPC] Fix for PR35688 - handle out-of-range values for r+r to r+i conversion Revision 320791 introduced a pass that transforms reg+reg instructions to reg+imm if they're fed by "load immediate". However, it didn't handle out-of-range shifts correctly as reported in PR35688. This patch fixes that and therefore the PR. Furthermore, there was undefined behaviour in the patch where the RHS of an initialization expression was 32 bits and constant `1` was shifted left 32 bits. This was fixed by ensuring the RHS is 64 bits just like the LHS. Differential Revision: https://reviews.llvm.org/D41369 llvm-svn: 321551	2017-12-29 12:22:27 +00:00
Andrew V. Tischenko	03ddad853d	Fix incorrect operand sizes for some MMX instructions: punpcklwd, punpcklbw and punpckldq. Differential Revision: https://reviews.llvm.org/D41595 llvm-svn: 321549	2017-12-29 08:31:01 +00:00
Craig Topper	55cf880900	[X86] When lowering extending loads from v2i1/v4i1, if we have VLX, use a narrower extend. Previously we used an extend from v8i1 to v8i32/v8i64. Then extracted to the final width. But if we have VLX we should extract first. This way we don't end up with an overly large extend. This allows us to use vcmpeq to make all ones for the sign extend when DQI isn't available. Otherwise we get a VPTERNLOG. If we make v2i1/v4i1 legal like proposed in D41560, we could always do this and rely on the lowering of the extend to widen when necessary. llvm-svn: 321538	2017-12-28 19:46:11 +00:00
Craig Topper	c0b6cb1e47	[X86] Use ISD::CONCAT_VECTORS when splitting 256-bit loads in combineLoad. llvm-svn: 321537	2017-12-28 19:46:06 +00:00
Craig Topper	4b311da3a4	[X86] Fix inconsistencies in different places where we split loads/stores. -Use MinAlign instead of std::min. -Use SelectionDAG::getMemBasePlusOffset. -Apply offset to the pointer info for the second load/store created. llvm-svn: 321536	2017-12-28 19:46:03 +00:00
Craig Topper	05cf1f338f	[X86] Emit ISD::TRUNCATE instead of X86ISD::VTRUNC from LowerZERO_EXTEND_Mask/LowerSIGN_EXTEND_Mask. The truncate will be lowered X86ISD::VTRUNC later. llvm-svn: 321534	2017-12-28 19:45:58 +00:00
Craig Topper	88e26a99f8	[X86] Remove unnecessary patterns for sign extending vXi1 without VLX. The custom lowering already widens the result type to 512-bits if VLX isn't supported. llvm-svn: 321533	2017-12-28 19:45:55 +00:00
Reid Kleckner	a2d119a059	[WinEH] Don't emit state stores or EH thunks for available_externally functions The exception handler thunk needs to reference the LSDA of the parent function, which won't be emitted if it's available_externally. Fixes PR35736. ThinLTO ends up producing available_externally functions that use _CxxFrameHandler3. llvm-svn: 321532	2017-12-28 18:41:31 +00:00
Benjamin Kramer	3a13ed60ba	Avoid int to string conversion in Twine or raw_ostream contexts. Some output changes from uppercase hex to lowercase hex, no other functionality change intended. llvm-svn: 321526	2017-12-28 16:58:54 +00:00
Simon Pilgrim	62411e4d4f	[X86][SSE] Use PMADDWD for v4i32 multiplies with 17 or more leading zeros If there are 17 or more leading zeros to the v4i32 elements, then we can use PMADDWD for the integer multiply when PMULLD is unavailable or slow. The 17 bits need to be zero because PMADDWD performs a v8i16 signed-mul-extend + pairwise-add: the upper 16 bits so that we're adding a zero pair, and the 17th bit so that we don't incorrectly sign extend. Differential Revision: https://reviews.llvm.org/D41484 llvm-svn: 321516	2017-12-28 10:05:49 +00:00
Craig Topper	55cfa89f20	[X86] Add CLWB to icelake. Per Table 1-1 in October 2017 edition of Intel® Architecture Instruction Set Extensions and Future Features llvm-svn: 321501	2017-12-27 22:04:04 +00:00
Craig Topper	72bbbeb2a7	[X86] Reimplement r321437 using custom lowering instead of as a DAG combine. My original implementation ran as a DAG combine post type legalization, but it turns out we don't run that DAG combine step if type legalization didn't change anything. Attempts to make the combine run before type legalization as well hit other issues. So just do it in LowerMUL where we can catch more cases. llvm-svn: 321496	2017-12-27 19:09:40 +00:00
Matthew Simpson	9439f54902	[AArch64] Change order of candidate FMLS patterns r319980 added new patterns to the machine combiner for transforming (fsub (fmul x y) z) into (fmla (fneg z) x y). That is, fsub's where the first source operand is an fmul are transformed. We previously only matched the case where the second source operand of an fsub was an fmul, transforming (fsub z (fmul x y)) into (fmls z x y). Now, if we have an fsub where both source operands are fmuls, both of the above patterns are applicable. However, the order in which we add the patterns to the list of candidates determines the transformation that takes place, since only the first pattern that matches will be used. This patch changes the order these two patterns are added to the list of candidates such that we prefer the case where the second source operand is an fmul (the fmls case), rather than the other one (the fmla/fneg case). When both source operands are fmuls, this ordering results in fewer instructions. Differential Revision: https://reviews.llvm.org/D41587 llvm-svn: 321491	2017-12-27 15:25:01 +00:00
Benjamin Kramer	293f34301e	[X86] Fix vmul combine for AVX1 targets. v8i32 is legal on AVX1, but it doesn't have pmuludq for it. llvm-svn: 321490	2017-12-27 13:31:50 +00:00
Craig Topper	428d87e559	[X86] Return SDValue(N, 0) instead of an SDValue() after a successful combine. Returning SDValue() means nothing changed, SDValue(N,0) means there was a change but the worklist management was taken care of. I don't know if this has a real effect other than making sure the combine counter in the DAG combiner gets updated, but it is the correct thing to do. llvm-svn: 321463	2017-12-26 22:22:58 +00:00
Andrew V. Tischenko	1dd7856af5	It's a fix for Bug 35741 - can't use comments after x86 prefixes. Differential Revision: https://reviews.llvm.org/D41579 llvm-svn: 321459	2017-12-26 18:29:52 +00:00
Craig Topper	162439dcdf	[X86] Pass itins.rr/itins.rm through properly for some instructions. llvm-svn: 321452	2017-12-26 05:43:05 +00:00
Craig Topper	9b800c692e	[X86] Use SSE_INTMUL_ITINS_P for the AVX-512 MUL instructions to match their SSE/AVX counterparts. llvm-svn: 321451	2017-12-26 05:43:04 +00:00
Craig Topper	e0b9b5ef2b	[X86] Fix typo in assert message. llvm-svn: 321450	2017-12-26 05:43:02 +00:00
Craig Topper	705fef3ef3	[X86] Add a DAG combine to turn vXi64 muls into VPMULDQ/VPMULUDQ if the upper bits are all sign bits or zeros. Normally we catch this during lowering, but vXi64 mul is considered legal when we have AVX512DQ. This DAG combine allows us to avoid PMULLQ with AVX512DQ if we can prove it's unnecessary. PMULLQ is 3 uops that take 4 cycles each, while pmuldq/pmuludq is only one 4 cycle uop. llvm-svn: 321437	2017-12-25 06:47:10 +00:00
Craig Topper	fabeb27e36	[X86] Make some helper methods static functions instead. NFC llvm-svn: 321433	2017-12-25 00:54:53 +00:00
Craig Topper	b2cd8485dc	[X86] Use SelectionDAG::getFPExtendOrRound to simplify some code. llvm-svn: 321432	2017-12-25 00:54:51 +00:00
Benjamin Kramer	802e6255b2	Make helpers static. No functionality change. llvm-svn: 321425	2017-12-24 12:46:22 +00:00
Simon Pilgrim	e0434fad16	[X86][X87] Mark pseudo memory fold instructions as load/sideeffects (PR21160, PR34080, PR34454). Match regular x87 memory fold instructions with load/sideeffects tags, to prevent the schedulers from re-ordering them across the fnstcw/fldcw sequences for truncating stores while they are still pseudo during the stack conversion pass. llvm-svn: 321424	2017-12-24 12:20:21 +00:00
Craig Topper	2d1d9a11c1	[X86] Fix (v2f64 (s/uint_to_fp (v2i1))) to avoid scalarization without AVX512DQ. Previously we extended v2i1 to v2f64 and then tried to use cvtuqq2pd/cvtqq2pd, but that only works with avx512dq. So we ended up scalarizing it. Now we widen to v4i1 first and extend to v4i32. llvm-svn: 321420	2017-12-24 06:51:36 +00:00
Craig Topper	2e308a9b2f	[X86] Add assembler predicates to BITALG/VBMI2/VNNI features to be consistent with the other AVX512 ISAs. llvm-svn: 321416	2017-12-24 02:05:17 +00:00
Craig Topper	62fd123731	[X86] Teach WidenMaskArithmetic to handle any constant buildvector on the RHS not just all zeros/ones. llvm-svn: 321415	2017-12-24 01:03:31 +00:00
Craig Topper	06dad14797	[X86] Remove type restrictions from WidenMaskArithmetic. This can help AVX-512 code where mask types are legal allowing us to remove extends and truncates to/from mask types. llvm-svn: 321408	2017-12-23 18:53:05 +00:00
Craig Topper	e79a7a4b2e	[X86] In WidenMaskArithmetic, make sure we check the input type of a truncate on N1. Later in the code we explicitly bypass the truncate so we should be checking its type to make sure that it's safe. llvm-svn: 321407	2017-12-23 18:53:03 +00:00
Craig Topper	dbbbb8532c	[X86] Remove unneeded EVT variable. NFC Immediately after it is created we check if it's equal to another EVT. Then we inconsistently use one or the other variables in the code below. Instead do the equality check directly on the getValueType result and remove the variable. Use the original VT variable throughout the remaining code. llvm-svn: 321406	2017-12-23 18:53:01 +00:00
Simon Pilgrim	fc01bf86d5	[X86][X87] Wrap FpI_ pseudo to use PseudoI. NFCI. llvm-svn: 321405	2017-12-23 17:25:59 +00:00
Simon Pilgrim	730cbc8f8e	[X86] Add default InstrItinClass to PseudoI This will be used to help tidy up existing pseudos that we've added scheduling info to. llvm-svn: 321401	2017-12-23 10:47:21 +00:00
Craig Topper	b8e7ab8231	[X86] Pass the right VT to the getZeroExtendInReg introduced in r321398 Apparently we don't have tests for this which I didn't realize before. I'll try to fix that but wanted to fix the obvious bug. llvm-svn: 321399	2017-12-23 06:52:03 +00:00
Craig Topper	ed4a87f6a8	[X86] Use SelectionDAG::getZeroExtendInReg instead of implementing it manually. llvm-svn: 321398	2017-12-23 02:54:52 +00:00
Craig Topper	d6a8f2e67d	[SelectionDAG][X86] Don't use ->getValueType(0) after a call to getOperand to get the type of the operand. getOperand returns an SDValue that contains the node and the result number. There is no guarantee that the result number is 0. By using the -> operator we are calling SDNode::getValueType rather than SDValue::getValueType. This requires supplying a result number and we shouldn't assume it was 0. I don't have a test case. Just noticed while cleaning up some other code and saw that it occurred in other places. llvm-svn: 321397	2017-12-23 02:54:50 +00:00
Sanjoy Das	26d11ca4b0	(Re-landing) Expose a TargetMachine::getTargetTransformInfo function Re-land r321234. It had to be reverted because it broke the shared library build. The shared library build broke because there was a missing LLVMBuild dependency from lib/Passes (which calls TargetMachine::getTargetIRAnalysis) to lib/Target. As far as I can tell, this problem was always there but was somehow masked before (perhaps because TargetMachine::getTargetIRAnalysis was a virtual function). Original commit message: This makes the TargetMachine interface a bit simpler. We still need the std::function in TargetIRAnalysis to avoid having to add a dependency from Analysis to Target. See discussion: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html I avoided adding all of the backend owners to this review since the change is simple, but let me know if you feel differently about this. Reviewers: echristo, MatzeB, hfinkel Reviewed By: hfinkel Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D41464 llvm-svn: 321375	2017-12-22 18:21:59 +00:00
Dmitry Preobrazhensky	471adf7fdc	[AMDGPU][MC] Corrected handling of negative expressions See bug 35716: https://bugs.llvm.org/show_bug.cgi?id=35716 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D41488 llvm-svn: 321372	2017-12-22 18:03:35 +00:00
Craig Topper	576335f998	[X86] When lowering insert_vector_elt/extract_vector_elt of vXi1 with a non-constant index just use either a 128-bit type or the vXi8 type with the correct number of elements. Despite what the comment said there isn't better codegen for 512-bit vectors. The 128/256/512 bit implementation just stores to memory and loads an element. There's no advantage to doing that with a larger size. In fact in many cases it causes a stack realignment and generates worse code. llvm-svn: 321369	2017-12-22 17:18:11 +00:00
Craig Topper	eff84ed204	[X86] Improve the printing of address mode during isel matching. Fix some inconsistent new line behavior and only print the FrameIndex when the address mode is a FrameIndexBase addressing mode. llvm-svn: 321368	2017-12-22 17:18:10 +00:00
Dmitry Preobrazhensky	c5b0c172f6	[AMDGPU][MC] Corrected parsing of optional operands for ds_swizzle_b32 See bug 35645: https://bugs.llvm.org/show_bug.cgi?id=35645 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D41186 llvm-svn: 321367	2017-12-22 17:13:28 +00:00
Dmitry Preobrazhensky	2713495318	[AMDGPU][MC] Added support of 256- and 512-bit tuples of ttmp registers See bug 35561: https://bugs.llvm.org/show_bug.cgi?id=35561 This patch also affects implementation of SGPR and VGPR registers though changes are cosmetic. Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D41437 llvm-svn: 321359	2017-12-22 15:18:06 +00:00
Diana Picus	28a6d0e639	[ARM GlobalISel] Support G_INTTOPTR and G_PTRTOINT for s32 Mark conversions between pointers and 32-bit scalars as legal, map them to the GPR and select to a simple COPY. llvm-svn: 321356	2017-12-22 13:05:51 +00:00
Diana Picus	68773859c8	[ARM GlobalISel] Support pointer constants Pointer constants are pretty rare, since we usually represent them as integer constants and then cast to pointer. One notable exception is the null pointer constant, which is represented directly as a G_CONSTANT 0 with pointer type. Mark it as legal and make sure it is selected like any other integer constant. llvm-svn: 321354	2017-12-22 11:09:18 +00:00
Craig Topper	e2873a17d7	[X86] Add missing initialization for the HasPREFETCHWT1 subtarget variable. llvm-svn: 321340	2017-12-22 03:53:14 +00:00
Craig Topper	67885f5d58	[X86] Enable PRFCHW feature on KNL/KNM and all CPUs inherited from Broadwell. llvm-svn: 321336	2017-12-22 02:41:12 +00:00
Craig Topper	e268598dd3	[X86] Add prefetchwt1 instruction and overhaul priorities and isel enabling for prefetch instructions. Previously prefetch was only considered legal if sse was enabled, but it should be supported with 3dnow as well. The prfchw flag now implies at least some form of prefetch without the write hint is available, either the sse or 3dnow version. This is true even if 3dnow and sse are explicitly disabled. Similarly prefetchwt1 feature implies availability of prefetchw and the prefetcht0/1/2/nta instructions. This way we can support _MM_HINT_ET0 using prefetchw and _MM_HINT_ET1 with prefetchwt1. And it's assumed that if we have levels for the write hint we would have levels for the non-write hint, which is why we enable the sse prefetch instructions. I believe this behavior is consistent with gcc. I've updated the prefetch.ll to test all of these combinations. llvm-svn: 321335	2017-12-22 02:30:30 +00:00
Craig Topper	9befe89367	[X86] Use SIGN_EXTEND to implement ANY_EXTEND from vXi1. llvm-svn: 321334	2017-12-22 02:30:26 +00:00
Eli Friedman	39ed9a602b	[Inliner] Restrict soft-float inlining penalty. The penalty is currently getting applied in a bunch of places where it doesn't make sense, like bitcasts (which are free) and calls (which were getting the call penalty applied twice). Instead, just apply the penalty to binary operators and floating-point casts. While I'm here, also fix getFPOpCost() to do the right thing in more cases, so we don't have to dig into function attributes. Differential Revision: https://reviews.llvm.org/D41522 llvm-svn: 321332	2017-12-22 02:08:08 +00:00
Craig Topper	8772228963	[X86] Use SIGN_EXTEND rather than ZERO_EXTEND for lowering extract_vector_elt from vXi1 with a non-const index. We have a better range of instructions we can use if we can fill with the i1 value rather than zeroing. llvm-svn: 321315	2017-12-21 22:08:23 +00:00
Craig Topper	742ac98d01	[X86] When lowering truncates to vXi1, don't sign extend i16/i8 types to 512-bit if we have VLX. This should only affect what we do for v8i16. Previously we went to v8i64, but if we have VLX we only need v8i32. This prevents an unnecessary zmm usage. llvm-svn: 321303	2017-12-21 20:45:13 +00:00
Craig Topper	410a289b79	[X86] Promote v8i1 shuffles to v8i32 instead of v8i64 if we have VLX. We should have equally good shuffle options for v8i32 with VLX. This was spotted during my attempts to remove 512-bit vectors from SKX. We still use 512-bits for v16i1, v32i1, and v64i1. I'm less sure we can handle those well with narrower vectors. i32 and i64 element sizes get the best shuffle support. llvm-svn: 321291	2017-12-21 18:44:06 +00:00
Simon Pilgrim	4de5bb093c	[X86][SSE] Split large PAVGB/PAVGW vectors to legal widths Patch to allow detectAVGPattern handle vectors larger than the legal size (128 SSE2, 256 AVX2, 512 AVX512BW), splitting the vectors accordingly. Differential Revision: https://reviews.llvm.org/D41440 llvm-svn: 321288	2017-12-21 18:12:31 +00:00
Tony Jiang	eba757e45c	[PowerPC] Fix parest build failure in SPEC2017. The build failure was caused by an assertion in pre-legalization DAGCombine: Combining: t6: ppcf128 = uint_to_fp t5 ... into: t20: f32 = PPCISD::FCFIDUS t19 which is clearly wrong since ppcf128 is a different type from f32 and we cannot change the node value type when doing DAGCombine. The fix is to not handle ppc_fp128 or i1 conversions in PPCTargetLowering::combineFPToIntToFP and to leave it to downstream code to legalize and expand them to small legal types. Differential Revision: https://reviews.llvm.org/D41411 llvm-svn: 321276	2017-12-21 15:42:50 +00:00
Sam Parker	98727bc261	[ARM] Armv8-R DFB instruction Implement MC support for the Armv8-R 'Data Full Barrier' instruction. Differential Revision: https://reviews.llvm.org/D41430 llvm-svn: 321256	2017-12-21 11:17:49 +00:00
Craig Topper	72c22f4366	[X86] Use PSHUFB for v32i16 shuffles before falling back to VPERMW/VPERMI2W. PSHUFB has the ability to implicitly 0 elements which VPERMI2W can't do. So give a chance to use it first. llvm-svn: 321251	2017-12-21 08:22:51 +00:00
Craig Topper	38af615b4c	[X86] Use VPERMI2B for v16i8 shuffles if we have VBMI+VLX and would have otherwise used two PSHUFBs ORed together. llvm-svn: 321249	2017-12-21 07:31:30 +00:00
Craig Topper	03b2bc4838	[X86] Use VPERMB/VPERMI2B for v32i8 shuffle lowering if VBMI and VLX are supported. llvm-svn: 321248	2017-12-21 05:58:31 +00:00
Sanjoy Das	747d1114d6	Revert "Expose a TargetMachine::getTargetTransformInfo function" This reverts commit r321234. It breaks the -DBUILD_SHARED_LIBS=ON build. llvm-svn: 321243	2017-12-21 02:34:39 +00:00
Sanjoy Das	0c3de350b4	Expose a TargetMachine::getTargetTransformInfo function Summary: This makes the TargetMachine interface a bit simpler. We still need the std::function in TargetIRAnalysis to avoid having to add a dependency from Analysis to Target. See discussion: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html I avoided adding all of the backend owners to this review since the change is simple, but let me know if you feel differently about this. Reviewers: echristo, MatzeB, hfinkel Reviewed By: hfinkel Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D41464 llvm-svn: 321234	2017-12-21 01:06:58 +00:00
Reid Kleckner	82b117f07f	Attempt to pacify 4.8.5 with makeArrayRef llvm-svn: 321233	2017-12-21 00:28:34 +00:00
Joel Galenson	6f4e827e4c	[ARM] Optimize {s,u}{add,sub}.with.overflow. The AArch64 backend contains code to optimize {s,u}{add,sub}.with.overflow during SelectionDAG. This commit ports that code to the ARM backend. Differential revision: https://reviews.llvm.org/D35635 llvm-svn: 321224	2017-12-20 22:25:39 +00:00
Krzysztof Parzyszek	3f84c0f5d8	[Hexagon] Use ArrayRef member functions instead of custom ones llvm-svn: 321221	2017-12-20 20:54:13 +00:00
Krzysztof Parzyszek	e4ce92cabf	[Hexagon] Allow construction of HVX vector predicates Handle BUILD_VECTOR of boolean values. llvm-svn: 321220	2017-12-20 20:49:43 +00:00
Krzysztof Parzyszek	fb0fcacb9d	[Hexagon] Legalize vector elements to i32 in buildVector32/64 llvm-svn: 321218	2017-12-20 20:33:49 +00:00
Yonghong Song	25bf825961	bpf: add support for objdump -print-imm-hex Add support for 'objdump -print-imm-hex' for imm64, operand imm and branch target. If user programs encode immediate values as hex numbers, such an option will make it easy to correlate asm insns with source code. This option also makes it easy to correlate imm values with insn encoding. There is one changed behavior in this patch. In old way, we print the 64bit imm as u64: O << (uint64_t)Op.getImm(); and the new way is: O << formatImm(Op.getImm()); The formatImm is defined in llvm/MC/MCInstPrinter.h as format_object<int64_t> formatImm(int64_t Value) So the new way to print 64bit imm is i64 type. If a 64bit value has the highest bit set, the old way will print the value as a positive value and the new way will print as a negative value. The new way is consistent with x86_64. For the code (see the test program): ... if (a == 0xABCDABCDabcdabcdULL) ... x86_64 objdump, with and without -print-imm-hex, looks like: 48 b8 cd ab cd ab cd ab cd ab movabsq $-6067004223159161907, %rax 48 b8 cd ab cd ab cd ab cd ab movabsq $-0x5432543254325433, %rax Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 321215	2017-12-20 19:39:58 +00:00
Craig Topper	8ec5632521	[X86] Refactor DomainReassignment pass to make the Closure class not store references to the main data structures of the pass itself Multiple Closure objects can be created and stored for a single function. It's not a good idea to devote so many fields of it to storing pointers and references to global data structures of the pass. The closure class should only store the things needed to represent the closure itself. This patch refactors many of the methods of Closure to belong to the pass object and to pass around a reference to the current Closure. The Closure class gains a few simple methods to add instructions and edges, and to return iterators to edges and instructions. Differential Revision: https://reviews.llvm.org/D41327 llvm-svn: 321213	2017-12-20 19:36:43 +00:00
Craig Topper	07820f2fe4	[X86] Remove zext from vXi32 to vXi64 on indices of gather/scatter instructions if we can prove the pre-extended value is positive. Gather/scatter can implicitly sign extend from i32->i64 on indices. So if we know the sign bit of the input to a zext is 0 we can use the implicit extension. llvm-svn: 321209	2017-12-20 19:25:33 +00:00
Stefan Pintilie	4241821848	[PowerPC] Added an assert to make sure that the MBBI iterator is valid. The function createTailCallBranchInstr assumes that the iterator MBBI is valid. However, only one use of MBBI is guarded in the function. Fix this by adding an assert. Differential Revision: https://reviews.llvm.org/D41358 llvm-svn: 321205	2017-12-20 19:07:44 +00:00
Matt Arsenault	f7f59b5292	[AMDGPU, AsmParser] Enable the mnemonic spell corrector. Patch by Dmitry Venikov llvm-svn: 321202	2017-12-20 18:52:57 +00:00
Craig Topper	bc92e00f2e	[X86] Implement the fusing of MUL+SUBADD to FMSUBADD This patch turns shuffles of fadd/fsub with fmul into fmsubadd. Patch by Dmitry Venikov Differential Revision: https://reviews.llvm.org/D40335 llvm-svn: 321200	2017-12-20 18:05:15 +00:00
Krzysztof Parzyszek	8f6b0c850a	[Hexagon] Adjust the value type for BCvt in LowerFormalArguments llvm-svn: 321177	2017-12-20 14:44:05 +00:00
Sander de Smalen	a2f3bed642	Trivial commit to force LLVM to run TableGen for Mips target after a change to the AsmMatcherEmitter, and should fix the buildbot failure on llvm-clang-x86_64-expensive-checks-win. The issue is also described here: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119617.html llvm-svn: 321170	2017-12-20 12:45:40 +00:00
Diana Picus	75ce852abe	[ARM GlobalISel] Fix assertion in RegBankSelect We get an assertion in RegBankSelect for code along the lines of my_32_bit_int = my_64_bit_int, which tends to translate into a 64-bit load, followed by a G_TRUNC, followed by a 32-bit store. This appears in a couple of places in the test-suite. At the moment, the legalizer doesn't distinguish between integer and floating point scalars, so a 64-bit load will be marked as legal for targets with VFP, and so will the rest of the sequence, leading to a slightly bizarre G_TRUNC reaching RegBankSelect. Since the current support for 64-bit integers is rather immature, this patch works around the issue by explicitly handling this case in RegBankSelect and InstructionSelect. In the future, we may want to revisit this decision and make sure 64-bit integer loads are narrowed before reaching RegBankSelect. llvm-svn: 321165	2017-12-20 11:27:10 +00:00
Florian Hahn	c3aa6d83fd	[ARM] Lower unsigned saturation to USAT Summary: Implement lowering of unsigned saturation on an interval [0, k], where k + 1 is a power of two, using the USAT instruction, in a similar way to how [~k, k] is lowered using SSAT on ARM models that support it. Patch by Marten Svanfeldt Reviewers: t.p.northover, pbarrio, eastig, SjoerdMeijer, javed.absar, fhahn Reviewed By: fhahn Subscribers: fhahn, aemerson, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D41348 llvm-svn: 321164	2017-12-20 11:13:57 +00:00
Sander de Smalen	cd6be960ce	[AArch64][SVE] Re-submit patch series for ZIP1/ZIP2 This patch resubmits the SVE ZIP1/ZIP2 patch series consisting of r320992, r320986, r320973, and r320970 by reverting https://reviews.llvm.org/rL321024. The issue that caused r321024 has been addressed in https://reviews.llvm.org/rL321158, so this patch-series should be safe to resubmit. llvm-svn: 321163	2017-12-20 11:02:42 +00:00
Tim Northover	6db5d027c6	AArch64: fix one more place movi.2d could be created. Somehow got missed out of r320965. llvm-svn: 321162	2017-12-20 10:45:39 +00:00
Sander de Smalen	c067c30d9e	[AArch64] Asm: Fix parsing of register aliases that have a name starting with 'z' Summary: This fixes an issue as identified by @rnk in https://reviews.llvm.org/rL321029. Reviewers: rnk, fhahn, rengolin, efriedma, echristo, olista01 Reviewed By: rnk, fhahn Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits, rnk Differential Revision: https://reviews.llvm.org/D41382 llvm-svn: 321158	2017-12-20 09:45:45 +00:00
Sam Parker	daed9de622	[AArch64] CCSIDR2 system register Implement the 'Current Cache Size' register that has been introduced as part of the Armv8.3 architecture. I originally missed this, and (hopefully) should be the final patch for assembler support. Differential Revision: https://reviews.llvm.org/D41396 llvm-svn: 321155	2017-12-20 08:56:41 +00:00
Craig Topper	abed821c36	[X86] Optimize sign extends on index operand to gather/scatter to not sign extend past i32. The gather instruction will implicitly sign extend to the pointer width, we don't need to further extend it. This can prevent unnecessary splitting in some cases. There's still an issue that lowering on non-VLX can introduce another sign extend that doesn't get combined with shifts from a lowered sign_extend_inreg. llvm-svn: 321152	2017-12-20 07:36:59 +00:00

45723 Commits